A simple web crawler example

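#coding=utf-8
# Note: this script targets Python 2 (urllib.urlopen and the print statement).
import urllib
import re

def getHtml(url):
    # Fetch the page and return its raw HTML.
    page = urllib.urlopen(url)
    html = page.read()
    return html

def getImg(html):
    # Match the src of each .jpg image that is followed by a pic_ext attribute.
    # The dot before "jpg" is escaped so it matches a literal dot only.
    reg = r'src="(.+?\.jpg)" pic_ext'
    imgre = re.compile(reg)
    imglist = re.findall(imgre,html)
    x = 0
    for imgurl in imglist:
        # Save each image as 0.jpg, 1.jpg, ... in the current directory.
        urllib.urlretrieve(imgurl,'%s.jpg' % x)
        x+=1
    return imglist

html = getHtml("http://tieba.baidu.com/p/2460150866")

print getImg(html)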
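
The script above relies on Python 2 APIs (`urllib.urlopen`, `urllib.urlretrieve`, and the `print` statement). As a minimal sketch of how the same steps might look on Python 3, the version below uses `urllib.request` instead. The function names `get_html`, `extract_image_urls`, and `download_images`, the `IMG_PATTERN` constant, and the UTF-8 default encoding are assumptions made for illustration and are not part of the original post. The pattern escapes the dot before `jpg` so that it matches a literal dot.

```python
import re
import urllib.request

# Same idea as the original pattern: capture the src of each .jpg image
# that is followed by a pic_ext attribute. The dot is escaped here.
IMG_PATTERN = re.compile(r'src="(.+?\.jpg)" pic_ext')


def get_html(url, encoding="utf-8"):
    """Fetch a page and decode it to text.

    The encoding default is an assumption; the target page may differ.
    """
    with urllib.request.urlopen(url) as page:
        return page.read().decode(encoding, errors="replace")


def extract_image_urls(html):
    """Return every .jpg URL matched by IMG_PATTERN, in page order."""
    return IMG_PATTERN.findall(html)


def download_images(urls):
    """Save each URL as 0.jpg, 1.jpg, ... in the current directory."""
    for index, img_url in enumerate(urls):
        urllib.request.urlretrieve(img_url, "%d.jpg" % index)
    return urls
```

Calling `download_images(extract_image_urls(get_html("http://tieba.baidu.com/p/2460150866")))` would mirror the original flow. Extraction is kept separate from downloading so the regular expression can be checked against a sample string without touching the network.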

Original article: https://www.cnblogs.com/xiaoxiaoshuaishuai0219/p/6422913.html