Python 3 Web Scraping: Handling Custom-Font Anti-Scraping

<html>
    <head>
        <title>new font</title>
        <meta charset="utf-8" lang="zh">
        <style>
            @font-face {
                font-family: 'new_font';
                src: url('Font/f1c26632.woff'); /* Google Chrome */
            }
            .new_font { font-family: "new_font"; }
        </style>
    </head>
    <body>
        <div>
            <span class="new_font">新的字体格式:舒&#xf4c7;&#xebcc;</span>
        </div>
    </body>
</html>
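  • The page renders correctly in the browser, but in the developer tools the characters drawn by the custom font appear as unreadable symbols.
  • Download the custom font file that the page uses; Baidu's font editor can open it and show which glyph each code point maps to.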
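To see why the developer tools show unreadable symbols, it can help to decode the entities and look at their code points. The sketch below is illustrative and not part of the original post: the helper name `private_use_code_points` is hypothetical. It uses only the standard library to list the characters that fall in Unicode's private-use category (`Co`). Such characters have no standard glyph, so they only look right when the page's custom font is applied.

```python
import html
import unicodedata


def private_use_code_points(text):
    """Return the code points in text that fall in a Unicode private-use area."""
    decoded = html.unescape(text)  # '&#xf4c7;' becomes the character U+F4C7
    return [ord(ch) for ch in decoded if unicodedata.category(ch) == 'Co']


# Text of the <span> from the sample page.
sample = '新的字体格式:舒&#xf4c7;&#xebcc;'
codes = private_use_code_points(sample)
print([hex(c) for c in codes])  # ['0xf4c7', '0xebcc']
```

These are the code points to look up in the downloaded font file.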
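# Map each glyph's code point (decimal) to the character it draws, as read
# from the font editor; e.g. glyph unie573 is 0xe573, which is 58739.
FONT_DICT = {58739: '1', 58275: '2', 63321: '3', 63537: '4', 58042: '5',
             59755: '6', 59348: '7', 63702: '8', 60197: '9', 61011: '0'}


def get_font_w(content):
    """Replace custom-font entities in raw HTML with their real characters."""
    content = str(content)
    if not content:
        return ''
    # Turn '&#xf4c7;' into '0xf4c7;' so it lines up with the output of hex().
    # Input example: <span class="new_font">新的字体格式:舒&#xf4c7;&#xebcc;</span>
    # Obtaining the raw markup with the entities still intact is the tricky
    # part: a parser that already decoded them leaves nothing to replace here.
    content = content.replace('&#', '0')
    for key, char in FONT_DICT.items():
        # hex() yields lowercase, e.g. '0xe573', so entities must be lowercase.
        content = content.replace(hex(key) + ';', char)
    # Once FONT_DICT has entries for 0xf4c7 and 0xebcc, the result is:
    # <span class="new_font">新的字体格式:舒适型</span>
    print(content)
    return content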
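A regular expression is one alternative to the string replacement above. The following is a minimal sketch under stated assumptions, not the original author's method: the names `glyph_name_to_code_point`, `decode_entities`, and `SAMPLE_FONT_DICT` are hypothetical, and the two sample entries are copied from `FONT_DICT`. It matches hexadecimal entities regardless of letter case and leaves unmapped entities untouched.

```python
import re

# Two entries copied from FONT_DICT, kept here so the sketch is self-contained.
SAMPLE_FONT_DICT = {58739: '1', 58275: '2'}

# Hexadecimal character references such as '&#xe573;' or '&#XE573;'.
ENTITY_RE = re.compile(r'&#[xX]([0-9a-fA-F]+);')


def glyph_name_to_code_point(name):
    """Convert a glyph name such as 'unie573' to its code point (58739)."""
    if not name.startswith('uni'):
        raise ValueError(f'unexpected glyph name: {name!r}')
    return int(name[3:], 16)


def decode_entities(content, font_dict):
    """Replace mapped hex entities; leave everything else unchanged."""
    def replace(match):
        code_point = int(match.group(1), 16)
        # Fall back to the original entity when the code point is unmapped.
        return font_dict.get(code_point, match.group(0))
    return ENTITY_RE.sub(replace, content)


print(glyph_name_to_code_point('unie573'))  # 58739
print(decode_entities('&#xe573;&#xE3A3;&#xf4c7;', SAMPLE_FONT_DICT))  # 12&#xf4c7;
```

Because only matched entities are rewritten, decimal references such as `&#123;` and any entity missing from the dictionary pass through unchanged. That makes gaps in the mapping easy to spot.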
 
 
Original article: https://www.cnblogs.com/My-Sun-Shine/p/13550851.html