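Theory: https://blog.csdn.net/jiduochou963/article/details/88020415
GIL: global interpreter lock (CPython)
In CPython, each Python thread corresponds to one native thread at the C level.
The GIL allows only one thread at a time to execute bytecode, so multiple threads cannot execute bytecode in parallel on multiple CPUs, and CPU-bound threads get no benefit from a multi-core CPU.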
import dis

def add(a):
    a = a + 1
    return a

# dis.dis() prints the disassembly itself and returns None.
dis.dis(add)
Run:
$ python test1
  4           0 LOAD_FAST                0 (a)
              2 LOAD_CONST               1 (1)
              4 BINARY_ADD
              6 STORE_FAST               0 (a)

  5           8 LOAD_FAST                0 (a)
             10 RETURN_VALUE
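The same tool can hint at why an in-place update of a global variable is not atomic. The sketch below is only an illustration: `counter` and `increment` are hypothetical names, and the exact opcode list differs between Python versions, so it only checks that the read and the write of the global are separate instructions.

```python
import dis

counter = 0  # hypothetical global, used only for this illustration


def increment():
    global counter
    counter += 1


# Collect the opcode names that make up increment().
opnames = [ins.opname for ins in dis.get_instructions(increment)]
print(opnames)

# The global is read by one instruction and written back by another,
# so a thread switch can happen between the two.
load_index = opnames.index('LOAD_GLOBAL')
store_index = opnames.index('STORE_GLOBAL')
print(load_index < store_index)
```

If another thread modifies the global between those two instructions, the later store overwrites that modification.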
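The Python interpreter releases the GIL according to policies such as the number of bytecode instructions executed or a time slice, and a thread gives up the GIL voluntarily when it performs an I/O operation.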
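In Python 3.2 and later, the time-slice policy is exposed through `sys.getswitchinterval()` and `sys.setswitchinterval()`. The following is a minimal sketch of reading and temporarily changing that value; the 0.001 second setting is an arbitrary value chosen for illustration, not a recommendation.

```python
import sys

# How long (in seconds) a thread may hold the GIL before it is asked to release it.
interval = sys.getswitchinterval()
print('switch interval (seconds):', interval)

# The interval can be tuned; restore the original value afterwards.
sys.setswitchinterval(0.001)
changed = sys.getswitchinterval()
sys.setswitchinterval(interval)
print('temporary interval (seconds):', changed)
```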
An example: what happens when the GIL is released in the middle of an update
import threading

total = 0

def add():
    global total
    for i in range(1000000):
        total += 1  # read-modify-write: not atomic

def desc():
    global total
    for i in range(1000000):
        total -= 1  # may run between add()'s read and write

thread_1 = threading.Thread(target=add)
thread_2 = threading.Thread(target=desc)
thread_1.start()
thread_2.start()
thread_1.join()
thread_2.join()
print('total:', total)
Output (the result may differ from run to run):
total: 490944
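One common way to make the result deterministic is to guard every update with a `threading.Lock`. This is a sketch under assumptions, not part of the original example: `total_lock` is a hypothetical name, and the loop count is passed in as a parameter and reduced to keep the run short.

```python
import threading

total = 0
total_lock = threading.Lock()  # hypothetical name; protects `total`


def add(n):
    global total
    for _ in range(n):
        with total_lock:  # only one thread at a time performs the update
            total += 1


def desc(n):
    global total
    for _ in range(n):
        with total_lock:
            total -= 1


thread_1 = threading.Thread(target=add, args=(100000,))
thread_2 = threading.Thread(target=desc, args=(100000,))
thread_1.start()
thread_2.start()
thread_1.join()
thread_2.join()
print('total:', total)  # prints "total: 0"
```

The lock serializes the read-modify-write sequence, so every increment is matched by exactly one decrement, at the cost of extra locking overhead in each iteration.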
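In modern operating systems the smallest unit of scheduling is the thread. Originally it was the process, but because processes consume a lot of system resources, scheduling later moved to threads.
We start with multithreaded programming.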
import threading
from time import sleep, ctime

def get_detail_html():
    print('start get_detail_html at:', ctime())
    sleep(2)  # simulate a 2-second request
    print('get_detail_html done at:', ctime())

def get_detail_url():
    # These messages also say get_detail_html, which is why that label
    # appears twice in the output.
    print('start get_detail_html at:', ctime())
    sleep(3)  # simulate a 3-second request
    print('get_detail_html done at:', ctime())

if __name__ == '__main__':
    print('starting at', ctime())
    thread_1 = threading.Thread(target=get_detail_html)
    thread_2 = threading.Thread(target=get_detail_url)
    thread_1.start()
    thread_2.start()
    # The main thread does not wait for the child threads, so this prints immediately.
    print('all DONE at:', ctime())
One possible run (the threads print concurrently, so their output can interleave on a single line):
starting at Thu Feb 28 23:10:30 2019
start get_detail_html at: Thu Feb 28 23:10:30 2019
start get_detail_html at:all DONE at: Thu Feb 28 23:10:30 2019Thu Feb 28 23:10:30 2019
get_detail_html done at: Thu Feb 28 23:10:32 2019
get_detail_html done at: Thu Feb 28 23:10:33 2019
Requirement: kill the child threads when the main thread exits:
if __name__ == '__main__':
    print('starting at', ctime())
    thread_1 = threading.Thread(target=get_detail_html)
    thread_2 = threading.Thread(target=get_detail_url)
    # Kill the child threads when the main thread exits:
    # mark thread_1 and thread_2 as daemon threads.
    # (The daemon attribute replaces setDaemon(), deprecated since Python 3.10.)
    thread_1.daemon = True
    thread_2.daemon = True
    thread_1.start()
    thread_2.start()
    print('all DONE at:', ctime())
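One possible run (the process exits as soon as the main thread finishes, so neither "done" message is printed):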
starting at Thu Feb 28 23:14:20 2019
start get_detail_html at: Thu Feb 28 23:14:20 2019
start get_detail_html at:all DONE at: Thu Feb 28 23:14:20 2019
Thu Feb 28 23:14:20 2019
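The daemon flag can also be passed to the `Thread` constructor through the `daemon` keyword argument. The sketch below assumes a hypothetical `background_task` function that stands in for slow work, and uses `join(timeout=...)` and `is_alive()` to show that the main thread is not forced to wait for it.

```python
import threading
from time import sleep


def background_task():
    sleep(1.0)  # stands in for slow work


worker = threading.Thread(target=background_task, daemon=True)
worker.start()

print('daemon flag:', worker.daemon)
worker.join(timeout=0.05)          # wait briefly, then give up
still_running = worker.is_alive()  # join() timed out, so the thread is still alive
print('still running:', still_running)
```

When the script ends at this point, the interpreter exits without waiting for `worker`, because it is a daemon thread.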
To verify the effect of daemon threads, change the code as follows:
if __name__ == '__main__':
    print('starting at', ctime())
    thread_1 = threading.Thread(target=get_detail_html)
    thread_2 = threading.Thread(target=get_detail_url)
    # Mark only thread_1 as a daemon thread; thread_2 stays a normal thread.
    thread_1.daemon = True
    # thread_2.daemon = True
    thread_1.start()
    thread_2.start()
    print('all DONE at:', ctime())
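Result (the process stays alive until the non-daemon thread_2 finishes after 3 seconds, which also gives the daemon thread_1 enough time to complete its 2-second sleep):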
starting at Thu Feb 28 23:16:58 2019
start get_detail_html at: Thu Feb 28 23:16:58 2019
start get_detail_html at: Thu Feb 28 23:16:58 2019all DONE at:
Thu Feb 28 23:16:58 2019
get_detail_html done at: Thu Feb 28 23:17:00 2019
get_detail_html done at: Thu Feb 28 23:17:01 2019
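Because the daemon thread here sleeps for less time than the non-daemon one, both "done" messages still appear. A variation that makes the effect visible is to let the daemon thread sleep longer. The sketch below is an assumption-laden illustration: it runs a small child program in a separate interpreter via `subprocess` so that its output at exit can be captured, and `slow_daemon` and `quick_worker` are hypothetical names.

```python
import subprocess
import sys
import textwrap

# Child program: the daemon thread sleeps longer than the non-daemon one.
child_source = textwrap.dedent('''
    import threading
    from time import sleep

    def slow_daemon():
        sleep(2)
        print('daemon done')

    def quick_worker():
        sleep(0.2)
        print('worker done')

    threading.Thread(target=slow_daemon, daemon=True).start()
    threading.Thread(target=quick_worker).start()
    print('main done')
''')

# Run the child program in a fresh interpreter and capture what it prints.
result = subprocess.run(
    [sys.executable, '-c', child_source],
    capture_output=True, text=True, timeout=30,
)
print(result.stdout)
```

The child process exits once `quick_worker` has finished, while `slow_daemon` is still sleeping, so "daemon done" never shows up in the captured output.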
Requirement: wait for the child threads to finish before the main thread continues:
if __name__ == '__main__':
    print('starting at', ctime())
    thread_1 = threading.Thread(target=get_detail_html)
    thread_2 = threading.Thread(target=get_detail_url)
    from time import time
    start_time = time()
    thread_1.start()
    thread_2.start()
    # join() blocks the main thread until the given thread has finished.
    thread_1.join()
    thread_2.join()
    print('all DONE at:', ctime())
    print('elapsed: {}'.format(time() - start_time))
Result (the threads run concurrently, so the total is about 3 seconds rather than 2 + 3 = 5):
starting at Thu Feb 28 23:24:05 2019
start get_detail_html at: Thu Feb 28 23:24:05 2019
start get_detail_html at: Thu Feb 28 23:24:05 2019
get_detail_html done at: Thu Feb 28 23:24:07 2019
get_detail_html done at: Thu Feb 28 23:24:08 2019
all DONE at: Thu Feb 28 23:24:08 2019
elapsed: 3.0013234615325928
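With more than two threads, the same start-then-join pattern is usually written as two loops over a list. This is a minimal sketch; `worker`, `finished`, and the sleep durations are hypothetical and only stand in for real tasks.

```python
import threading
from time import sleep

finished = []  # in CPython, list.append can be called safely from several threads


def worker(name, seconds):
    sleep(seconds)          # simulate work
    finished.append(name)   # record that this task completed


threads = [
    threading.Thread(target=worker, args=('html', 0.2)),
    threading.Thread(target=worker, args=('url', 0.3)),
]
for t in threads:
    t.start()   # start all threads first so they run concurrently
for t in threads:
    t.join()    # then wait for each one to finish

print(sorted(finished))  # prints ['html', 'url']
```

Starting every thread before joining any of them matters: calling `join()` right after each `start()` would run the tasks one after another.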
Implementing multithreading by subclassing Thread
import threading
from time import sleep, ctime

class GetDetailHtml(threading.Thread):
    def __init__(self, name=''):
        super().__init__(name=name)

    # Override the parent class's run() method; start() calls it in the new thread.
    def run(self):
        print('start get_detail_html at:', ctime())
        sleep(2)
        print('get_detail_html done at:', ctime())

class GetDetailUrl(threading.Thread):
    def __init__(self, name=''):
        super().__init__(name=name)

    # Override the parent class's run() method.
    def run(self):
        print('start get_detail_url at:', ctime())
        sleep(3)
        print('get_detail_url done at:', ctime())

if __name__ == '__main__':
    print('starting at', ctime())
    thread_1 = GetDetailHtml("get_detail_html")
    thread_2 = GetDetailUrl("get_detail_url")
    from time import time
    start_time = time()
    thread_1.start()
    thread_2.start()
    thread_1.join()
    thread_2.join()
    print('all DONE at:', ctime())
    print('elapsed: {}'.format(time() - start_time))
Output:
starting at Thu Feb 28 23:48:23 2019
start get_detail_html at: Thu Feb 28 23:48:23 2019
start get_detail_url at: Thu Feb 28 23:48:23 2019
get_detail_html done at: Thu Feb 28 23:48:25 2019
get_detail_url done at: Thu Feb 28 23:48:26 2019
all DONE at: Thu Feb 28 23:48:26 2019
elapsed: 3.0017149448394775
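One practical reason to subclass `Thread` is that the instance can carry inputs and results as attributes. The sketch below is only an illustration of that idea; `SquareWorker`, `value`, and `result` are hypothetical names that do not appear in the example above.

```python
import threading


class SquareWorker(threading.Thread):
    """Hypothetical example class: squares a number in its own thread."""

    def __init__(self, value, name=''):
        super().__init__(name=name)
        self.value = value
        self.result = None  # filled in by run()

    def run(self):
        # Runs in the new thread once start() is called.
        self.result = self.value * self.value


worker = SquareWorker(7, name='square-7')
worker.start()
worker.join()  # after join() returns, run() has completed
print(worker.name, worker.result)  # prints "square-7 49"
```

Reading `result` only after `join()` avoids seeing the attribute before `run()` has set it.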