3. Python regex consumes a large amount of memory when nothing matches

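Problem: normally, a regular expression quickly matches the content in the fetched page source. But when something goes wrong and the page contains nothing to match, the regex keeps backtracking, so memory usage spikes and the search appears to loop forever.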

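Solutions: 1. If the target content carries a distinctive marker, check for that marker first and run the regex only when it is present. If the marker is missing, skip the regex and avoid the endless backtracking.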

2. Set up a timeout for the function, so that it is forcibly ended once it has run longer than a set time (an example is given below).
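The first idea, checking for a marker before running the regex, might look like the following minimal sketch. The marker string, the pattern, and the function name `extract_price` are hypothetical and only illustrate the idea; substitute whatever reliably identifies a "good" page in your own data.

```python
import re

# Hypothetical marker and pattern, for illustration only.
MARKER = '<div class="price">'
PRICE_RE = re.compile(r'<div class="price">(\d+(?:\.\d+)?)</div>')

def extract_price(html):
    """Return the price string, or None if the page has no price block."""
    # Cheap substring check first: if the marker is absent, the regex
    # cannot match, so skip it and avoid any backtracking cost.
    if MARKER not in html:
        return None
    match = PRICE_RE.search(html)
    return match.group(1) if match else None
```

A plain `in` check is a linear scan with no backtracking, so error pages and empty responses are rejected before the regex engine is ever involved.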

The example uses threading.Timer in a start -> sleep -> cancel pattern: cancel() stops the pending timer, so the scheduled function is no longer called.

import threading
import time

def fun_timer():
    print('hello timer')
    global timer
    # Re-create the timer so that the function is scheduled again
    timer = threading.Timer(5.8, fun_timer)
    timer.start()

# Schedule the first call
timer = threading.Timer(2, fun_timer)
timer.start()


# Stop the timer after 50 seconds
time.sleep(50)
timer.cancel()
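Note that Timer.cancel() only stops a timer that has not fired yet. It cannot interrupt a regex call that is already running. One way to put a hard time limit on the match itself is to run it in a separate process and kill that process on timeout. The sketch below swaps in that technique using the standard subprocess module. The helper name `search_with_timeout` and the JSON request format are assumptions made for illustration, not something from the referenced articles.

```python
import json
import subprocess
import sys

# Script run in the child process: reads a JSON request from stdin,
# runs re.search, and writes the matched text (or null) as JSON.
_CHILD_SCRIPT = r"""
import json, re, sys
req = json.load(sys.stdin)
m = re.search(req["pattern"], req["text"])
json.dump(m.group(0) if m else None, sys.stdout)
"""

def search_with_timeout(pattern, text, timeout=2.0):
    """Return (matched_text_or_None, timed_out). Hypothetical helper."""
    request = json.dumps({"pattern": pattern, "text": text})
    try:
        proc = subprocess.run(
            [sys.executable, "-c", _CHILD_SCRIPT],
            input=request,
            capture_output=True,
            text=True,
            timeout=timeout,
            check=True,  # raise if the child fails, e.g. on an invalid pattern
        )
    except subprocess.TimeoutExpired:
        # subprocess.run kills the child process before raising this.
        return None, True
    return json.loads(proc.stdout), False
```

Starting a new interpreter for every match is expensive, so this is probably only worth it for pages that pass no cheaper check, or for batches of pages handled by one worker process.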

Reference: https://blog.csdn.net/lxcnn/article/details/4756030

Reference: https://blog.csdn.net/Homewm/article/details/92127567 (three ways to handle function timeouts)