Python: on a function recursively calling itself

Crawling the short reviews of Boruto (博人传) on bilibili.

Each page holds 20 short reviews, and there are more than 1000 pages.

The code is as follows:

import requests
import json
import csv
import re

def main(start_url):
    headers = {'User-Agent': 'Mozilla/5.0 (Windows NT 6.1; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/73.0.3683.75 Safari/537.36'}
    res = requests.get(url=start_url, headers=headers).content.decode()
    data = json.loads(res)
    try:
        data = data['result']['list']
    except KeyError:
        print('-----------')
        return
    # The cursor in the response marks where the next page starts
    cursor = re.findall(r'"cursor":"(\d+)"', res)

    for i in data:
        mid = i['author']['mid']
        uname = i['author']['uname']
        content = i['content'].strip()
        try:
            last_index_show = i['user_season']['last_index_show']
        except KeyError:
            last_index_show = None

        print(mid, uname, content, last_index_show)
        print('------------------------')

        with open('borenzhuan_duanping.csv', 'a', newline='', encoding='utf-8') as f:
            writer = csv.writer(f)
            writer.writerow([mid, uname, content, last_index_show])

    if cursor:
        # Follow the cursor to the next page by recursively calling main()
        next_url = 'https://bangumi.bilibili.com/review/web_api/short/list?media_id={}&folded=0&page_size=20&sort=0&cursor='.format(id) + cursor[0]
        main(next_url)
    else:
        print('Crawl finished')

if __name__ == '__main__':
    zhuye_url = 'https://www.bilibili.com/bangumi/media/md5978/'
    # Extract the media_id from the show's home-page URL
    id = re.findall(r'md(\d+)', zhuye_url)[0]
    start_url = 'https://bangumi.bilibili.com/review/web_api/short/list?media_id={}&folded=0&page_size=20&sort=0&cursor='.format(id)

    main(start_url)

While crawling, I found that once the recursion reaches a depth of about 999, an exception is raised:

RecursionError: maximum recursion depth exceeded in comparison

The exception occurs while the function is recursively calling itself, because CPython caps recursion depth at 1000 by default.
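The limit is easy to reproduce in isolation; a minimal sketch (the exact default depth can vary slightly between Python builds, so it is printed rather than assumed):

```python
import sys

def recurse(n):
    # Each call adds a frame to the call stack; once the interpreter's
    # recursion limit is exceeded, Python raises RecursionError.
    return recurse(n + 1)

print(sys.getrecursionlimit())  # typically 1000 on CPython

try:
    recurse(0)
except RecursionError as e:
    print(type(e).__name__)  # RecursionError
```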

You only need to add the following at the top of the program:

import sys
sys.setrecursionlimit(100000)

This raises the interpreter's recursion limit well past the number of pages. Note that the default limit of 1000 exists precisely to stop runaway recursion from exhausting the stack, so raise it with care.
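Raising the limit works, but for cursor-based pagination like this a plain loop avoids deep recursion entirely. A hypothetical sketch of the same "follow the cursor until there is none" logic, with a dict standing in for the remote API:

```python
def crawl(pages, start=0):
    """Follow 'cursor' links iteratively instead of recursively.

    `pages` is a stand-in for the remote API: it maps a cursor to a
    (reviews, next_cursor) pair. The while loop replaces the recursive
    call, so the call stack stays flat no matter how many pages exist.
    """
    results = []
    cursor = start
    while cursor is not None:
        reviews, cursor = pages[cursor]  # fetch one page, get next cursor
        results.extend(reviews)          # in the real crawler: write to CSV
    return results

# Simulated three-page response: each page yields reviews and the next cursor
pages = {0: (['a', 'b'], 1), 1: (['c'], 2), 2: (['d'], None)}
print(crawl(pages))  # ['a', 'b', 'c', 'd']
```

The same change applies to main() above: replace the tail call `main(next_url)` with a `while` loop that updates `start_url`, and no recursion limit is ever hit.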

Original post: https://www.cnblogs.com/zengxm/p/10972537.html