1. Getting Started with Python Web Crawlers

1. Getting Started with Python Web Crawlers: urllib

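# 2019-11-22
import urllib.request  # Python's built-in library for network requests
import gzip  # library for gzip decompression

# Program entry point
if __name__ == '__main__':
    # url: the address of the page we want to crawl
    url = 'http://www.qq.com/'  # Tencent QQ's page source is compressed and encoded as GBK

    # response: the data returned for that URL; urlopen() returns an object instance
    response = urllib.request.urlopen(url)  # send the request; the server sends back a response

    # 1. response            an instance of <class 'http.client.HTTPResponse'>
    # 2. response.info()     the response headers (str() converts them to a string), type http.client.HTTPMessage; if the headers name no charset, assume UTF-8
    # 3. response.getcode()  the status code (int): 200 on success; 404 means the page was not found (urlopen raises urllib.error.HTTPError for such codes)
    # 4. response.read()     the page source as bytes; decode() turns it into a string
    print(type(response))
    print(response.info())
    print(type(response.info()))
    print(response.getcode())
    # print(response.read())

    temp = response.read()
    # Decompress only when the server reports a gzip-compressed body
    if response.getheader('Content-Encoding') == 'gzip':
        temp = gzip.decompress(temp)  # gzip decompression
    data = temp.decode('gbk')  # decode as GBK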
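The decompress-then-decode step above can be pulled into small reusable functions. The following is only a minimal sketch, not a definitive implementation: the names `decode_body` and `fetch_text`, the `fallback` parameter, and the 10-second timeout are assumptions made for illustration and do not appear in the original script. It decompresses only when the body really looks like gzip, and prefers the charset named in the response headers over a hard-coded GBK.

```python
import gzip
import urllib.request


def decode_body(raw, content_encoding=None, charset=None, fallback="gbk"):
    """Turn raw response bytes into text (hypothetical helper)."""
    # Decompress when the header says gzip, or when the body starts with
    # the gzip magic number (some servers compress without announcing it).
    if content_encoding == "gzip" or raw[:2] == b"\x1f\x8b":
        raw = gzip.decompress(raw)
    # Use the declared charset if there is one, otherwise the fallback.
    # errors="replace" keeps one bad byte from aborting the whole decode.
    return raw.decode(charset or fallback, errors="replace")


def fetch_text(url, fallback="gbk", timeout=10):
    """Download url and return its body as text (hypothetical helper)."""
    # Explicitly tell the server that a gzip-compressed body is acceptable.
    request = urllib.request.Request(url, headers={"Accept-Encoding": "gzip"})
    with urllib.request.urlopen(request, timeout=timeout) as response:
        raw = response.read()
        content_encoding = response.getheader("Content-Encoding")
        # Charset from the Content-Type header, or None if it is not given.
        charset = response.headers.get_content_charset()
    return decode_body(raw, content_encoding, charset, fallback)
```

Calling `fetch_text('http://www.qq.com/')` would then stand in for the last three lines of the script. Whether GBK is the right fallback depends on the site being crawled, so treat that default as an assumption too.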

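This started with a request from a neighbor of mine, who does design work and needs image assets.

Good assets are very hard to find, and some of them cost money, so she came to me for help.

I rarely have time to tinker with this kind of thing, so I took it as a chance to learn!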