15 HTML Formatting and Encoding with the bs4 Library


1. Formatting is done mainly with the prettify() method
"""HTML formatting with the bs4 library"""

import requests
from bs4 import BeautifulSoup

# Fetch the demo page and parse it with the built-in html.parser
url = "https://python123.io/ws/demo.html"
r = requests.get(url)
demo = r.text
soup = BeautifulSoup(demo, "html.parser")
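print(demo)              # raw HTML as returned by the server
print(soup.prettify())   # re-indented HTML with line breaks added after tags
# prettify() can also be called on a single tag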
print(soup.a.prettify())
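
The same idea can be tried without a network request. The sketch below is only an illustration: the HTML string and the helper name pretty() are made up for this example and are not part of the demo page above.

```python
from bs4 import BeautifulSoup

# Hypothetical one-line HTML snippet (an assumption for illustration only)
html = ('<html><body><p class="title"><b>Demo</b></p>'
        '<a href="http://example.com/a" id="link1">Basic Python</a>'
        '</body></html>')


def pretty(markup: str) -> str:
    """Parse markup with html.parser and return the re-indented text."""
    return BeautifulSoup(markup, "html.parser").prettify()


# Format the whole document
formatted = pretty(html)
print(formatted)

# Format a single tag only
tag_formatted = BeautifulSoup(html, "html.parser").a.prettify()
print(tag_formatted)
```

prettify() changes whitespace, so it is best used for reading and debugging rather than for producing the final page.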

2. Encoding:
# bs4 converts every parsed document to Unicode and writes UTF-8 by default, which fits Python 3.x, where str is Unicode
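
A small sketch of what this means in practice, assuming a made-up one-tag document with Chinese text: strings taken from the soup are ordinary Python 3 str objects, and bytes appear only when an encoding is requested explicitly.

```python
from bs4 import BeautifulSoup

# Hypothetical document containing non-ASCII text (illustration only)
soup = BeautifulSoup("<p>中文</p>", "html.parser")

# NavigableString is a subclass of str, so it is Unicode text
text = soup.p.string

# Without an encoding argument, prettify() returns str
as_str = soup.p.prettify()

# Passing an encoding returns bytes instead
pretty_bytes = soup.p.prettify(encoding="utf-8")
as_bytes = soup.encode("utf-8")

print(type(text), type(as_str), type(pretty_bytes), type(as_bytes))
print(as_str)
```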
Original article: https://www.cnblogs.com/sruzzg/p/13047378.html