Site entry point: http://wise.xmu.edu.cn/people/faculty
Information to scrape: each faculty member's name and homepage URL
Python version: 3.5
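import requests
from bs4 import BeautifulSoup

# Download the faculty listing page.
r = requests.get('http://www.wise.xmu.edu.cn/people/faculty')
html = r.content

# Parse the page as HTML. 'lxml' selects the lxml HTML parser;
# 'xml' would select the XML parser, which is meant for XML documents.
soup = BeautifulSoup(html, 'lxml')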
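# The faculty links sit inside <div class="people_list">.
div_people_list = soup.find('div', attrs={'class': 'people_list'})
# Each faculty entry is an <a> tag with target="_blank".
a_s = div_people_list.find_all('a', attrs={'target': '_blank'})
for a in a_s:
    url = a['href']        # homepage path, relative to the site root
    name = a.get_text()    # link text is the person's name
    print(name, url)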
Output:
敖萌幪 /people/faculty/494d4f1c-0470-4f53-8b7c-d3594241876b.html
Bowers, Roslyn /people/faculty/d01fe119-7980-4238-a3ec-abb9b66ec706.html
Brown, Katherine /people/faculty/36c6b263-2cc2-4682-9975-02b75e6505f7.html
鲍小佳 /people/faculty/bdc3fd77-84de-4020-846d-344e02f110e9.html
Chang, Seong Yeon /people/faculty/0534965d-6393-4e22-a6bb-6ac3b11fe431.html
蔡熙乾 /people/faculty/95d97944-beb6-4a47-af85-a0778e1788b2.html
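
The URLs printed above are site-relative paths, so they cannot be opened directly. A minimal sketch of turning them into full homepage addresses with the standard library's urljoin follows. The names BASE_URL and to_absolute are hypothetical helpers introduced here for illustration, and the sketch assumes the links should be resolved against the page URL that was requested.

```python
from urllib.parse import urljoin

# Assumption: relative links are resolved against the page that was fetched.
BASE_URL = 'http://www.wise.xmu.edu.cn/people/faculty'


def to_absolute(href, base_url=BASE_URL):
    """Resolve a possibly relative href against the page URL."""
    return urljoin(base_url, href)


# One of the relative paths from the output above.
sample = '/people/faculty/494d4f1c-0470-4f53-8b7c-d3594241876b.html'
print(to_absolute(sample))
```

Inside the loop, this would amount to replacing url = a['href'] with url = to_absolute(a['href']). An href that is already absolute passes through urljoin unchanged.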