Python Multithreading in Practice

Theory: https://blog.csdn.net/jiduochou963/article/details/88020415

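GIL: the global interpreter lock (CPython).
A Python thread corresponds to one native (C-level) operating-system thread.
The GIL ensures that at any moment only one thread is executing bytecode, on one CPU. Multiple threads cannot be mapped onto multiple CPUs to run bytecode in parallel, so CPU-bound Python threads gain nothing from a multi-core CPU.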
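To make the multi-core limitation concrete, here is a minimal sketch that times the same CPU-bound work run sequentially and in two threads. The function name count_down and the workload size N are assumptions made for this illustration, and no timings are claimed: on a standard CPython build with the GIL the two numbers are usually close, but they depend on the machine and interpreter version.

```python
import threading
import time


def count_down(n):
    # Pure-Python CPU-bound loop; it never blocks on I/O.
    while n > 0:
        n -= 1


N = 2_000_000  # arbitrary workload size chosen for this sketch

# Run the workload twice, one call after the other.
start = time.perf_counter()
count_down(N)
count_down(N)
sequential = time.perf_counter() - start

# Run the same two workloads in two threads.
threads = [threading.Thread(target=count_down, args=(N,)) for _ in range(2)]
start = time.perf_counter()
for t in threads:
    t.start()
for t in threads:
    t.join()
threaded = time.perf_counter() - start

print('sequential: {:.3f}s'.format(sequential))
print('threaded:   {:.3f}s'.format(threaded))
```

If the threaded figure is not roughly half the sequential one, the two threads were taking turns on the interpreter rather than running in parallel.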

import dis

def add(a):
    a = a+1
    return a

# dis.dis() prints the disassembly itself and returns None, so no print() is needed
dis.dis(add)

Run:

$ python test1
 4           0 LOAD_FAST                0 (a)
              2 LOAD_CONST               1 (1)
              4 BINARY_ADD
              6 STORE_FAST               0 (a)

 5           8 LOAD_FAST                0 (a)
             10 RETURN_VALUE

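The Python interpreter releases the GIL according to policies such as the number of bytecode instructions executed and time slices (CPython 2 used an instruction-count check interval; CPython 3.2 and later use a time-based switch interval, 5 ms by default). The GIL is also released voluntarily when a thread enters a blocking I/O operation.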
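As a small illustration of the time-slice policy, the switch interval can be read at runtime. This sketch assumes CPython 3.2 or later, where sys.getswitchinterval() exists; the variable name interval is only for this example.

```python
import sys

# How long a thread may hold the GIL before the interpreter
# asks it to give other threads a chance (value is in seconds).
interval = sys.getswitchinterval()
print('switch interval: {} s'.format(interval))
```

The value can be changed with sys.setswitchinterval(), although the default is rarely worth tuning.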

An example of the GIL being released in the middle of a computation:

total = 0


def add():
    global total
    for i in range(1000000):
        total += 1


def desc():
    global total
    for i in range(1000000):
        total -= 1


import threading
thread_1 = threading.Thread(target=add)
thread_2 = threading.Thread(target=desc)
thread_1.start()
thread_2.start()
thread_1.join()
thread_2.join()
print('total:', total)

Output (the result may differ from run to run):
total: 490944
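The result is not 0 because total += 1 compiles to several bytecode instructions (load, add, store), and a thread can lose the GIL between them, so some updates are overwritten. One common remedy, sketched here, is to guard the shared variable with a threading.Lock. The name total_lock is an assumption for this example, and the loop count is reduced to keep the run short.

```python
import threading

total = 0
total_lock = threading.Lock()  # hypothetical name, not in the original code


def add():
    global total
    for _ in range(100000):
        # Only one thread at a time may run the read-modify-write.
        with total_lock:
            total += 1


def desc():
    global total
    for _ in range(100000):
        with total_lock:
            total -= 1


thread_1 = threading.Thread(target=add)
thread_2 = threading.Thread(target=desc)
thread_1.start()
thread_2.start()
thread_1.join()
thread_2.join()
print('total:', total)  # prints: total: 0
```

The lock makes every increment and decrement atomic with respect to the other thread, at the cost of extra locking overhead per iteration.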

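In modern operating systems the smallest unit of scheduling is the thread. Originally it was the process, but because processes consume a lot of system resources, the scheduling unit later became the thread.
We start with multithreaded programming.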

import threading
from time import sleep, ctime


def get_detail_html():
    print('start get_detail_html at:', ctime())
    sleep(2)
    print('get_detail_html done at:', ctime())


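def get_detail_url():
    # The messages below reuse the 'get_detail_html' label, so both threads
    # print the same label in the sample runs that follow.
    print('start get_detail_html at:', ctime())
    sleep(3)
    print('get_detail_html done at:', ctime())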


if __name__ == '__main__':
    print('starting at', ctime())
    thread_1 = threading.Thread(target=get_detail_html)
    thread_2 = threading.Thread(target=get_detail_url)
    thread_1.start()
    thread_2.start()
    print('all DONE at:', ctime())

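One possible run (the main thread prints 'all DONE' without waiting, so its output interleaves with that of the child threads):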

starting at Thu Feb 28 23:10:30 2019
start get_detail_html at: Thu Feb 28 23:10:30 2019
start get_detail_html at:all DONE at:  Thu Feb 28 23:10:30 2019Thu Feb 28 23:10:30 2019

get_detail_html done at: Thu Feb 28 23:10:32 2019
get_detail_html done at: Thu Feb 28 23:10:33 2019

Requirement: kill the child threads when the main thread exits:

if __name__ == '__main__':
    print('starting at', ctime())
    thread_1 = threading.Thread(target=get_detail_html)
    thread_2 = threading.Thread(target=get_detail_url)
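    # Kill the child threads when the main thread exits:
    # mark thread_1 and thread_2 as daemon threads
    # (setDaemon() is deprecated since Python 3.10; set the daemon attribute instead)
    thread_1.daemon = True
    thread_2.daemon = True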
    thread_1.start()
    thread_2.start()
    print('all DONE at:', ctime())

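One possible run (the 'done' lines never appear because the daemon threads are killed when the main thread exits):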

starting at Thu Feb 28 23:14:20 2019
start get_detail_html at: Thu Feb 28 23:14:20 2019
start get_detail_html at:all DONE at: Thu Feb 28 23:14:20 2019 
Thu Feb 28 23:14:20 2019
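The daemon flag can also be passed to the Thread constructor (Python 3.3 and later), and it has to be set before start() is called. The following is a minimal sketch; background_task and worker are hypothetical names used only for this illustration.

```python
import threading
from time import sleep


def background_task():
    # Stand-in for real work.
    sleep(0.2)


# daemon=True is equivalent to setting worker.daemon = True before start().
worker = threading.Thread(target=background_task, daemon=True)
print('daemon flag:', worker.daemon)
worker.start()

# Changing the flag after start() is an error.
try:
    worker.daemon = False
    flag_change_rejected = False
except RuntimeError:
    flag_change_rejected = True
print('flag change rejected:', flag_change_rejected)

# A daemon thread can still be joined when its result matters.
worker.join()
print('alive after join:', worker.is_alive())
```

A daemon thread is stopped abruptly at interpreter exit, so it is best kept for work that can safely be abandoned halfway.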

To verify the effect of daemon threads, make the following change:

if __name__ == '__main__':
    print('starting at', ctime())
    thread_1 = threading.Thread(target=get_detail_html)
    thread_2 = threading.Thread(target=get_detail_url)
    # Make thread_1 a daemon thread; leave thread_2 unchanged
    thread_1.daemon = True
    # thread_2.daemon = True
    thread_1.start()
    thread_2.start()
    print('all DONE at:', ctime())

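Result (the interpreter waits 3 seconds for the non-daemon thread_2, which is long enough for the 2-second daemon thread_1 to finish as well):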

starting at Thu Feb 28 23:16:58 2019
start get_detail_html at: Thu Feb 28 23:16:58 2019
start get_detail_html at: Thu Feb 28 23:16:58 2019all DONE at: 
Thu Feb 28 23:16:58 2019
get_detail_html done at: Thu Feb 28 23:17:00 2019
get_detail_html done at: Thu Feb 28 23:17:01 2019

Requirement: wait for the child threads to finish before the main thread continues:

if __name__ == '__main__':
    print('starting at', ctime())
    thread_1 = threading.Thread(target=get_detail_html)
    thread_2 = threading.Thread(target=get_detail_url)
    from time import time
    start_time = time()
    thread_1.start()
    thread_2.start()
    thread_1.join()
    thread_2.join()
    print('all DONE at:', ctime())
    print('elapsed: {}'.format(time()-start_time))

Result:

starting at Thu Feb 28 23:24:05 2019
start get_detail_html at: Thu Feb 28 23:24:05 2019
start get_detail_html at: Thu Feb 28 23:24:05 2019
get_detail_html done at: Thu Feb 28 23:24:07 2019
get_detail_html done at: Thu Feb 28 23:24:08 2019
all DONE at: Thu Feb 28 23:24:08 2019
elapsed: 3.0013234615325928
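The total is about 3 seconds rather than 5 because the two sleeps overlap and join() only blocks until each thread has finished. join() also accepts a timeout; it returns None either way, so is_alive() is the way to tell whether the thread actually finished. A minimal sketch, with slow_task and worker as hypothetical names:

```python
import threading
from time import sleep


def slow_task():
    # Stand-in for a slow operation.
    sleep(0.5)


worker = threading.Thread(target=slow_task)
worker.start()

# Wait at most 0.1 s, then check whether the thread is still running.
worker.join(timeout=0.1)
still_running = worker.is_alive()
print('still running after 0.1 s:', still_running)

# Wait without a timeout until the thread is really done.
worker.join()
print('running after full join:', worker.is_alive())
```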

Implementing multithreading by subclassing Thread

import threading
from time import sleep, ctime


class GetDetailHtml(threading.Thread):
    def __init__(self, name=''):
        super().__init__(name=name)

    # Override the parent class's run method
    def run(self):
        print('start get_detail_html at:', ctime())
        sleep(2)
        print('get_detail_html done at:', ctime())


class GetDetailUrl(threading.Thread):
    def __init__(self, name=''):
        super().__init__(name=name)

    # Override the parent class's run method
    def run(self):
        print('start get_detail_url at:', ctime())
        sleep(3)
        print('get_detail_url done at:', ctime())


if __name__ == '__main__':
    print('starting at', ctime())
    thread_1 = GetDetailHtml("get_detail_html")
    thread_2 = GetDetailUrl("get_detail_url")
    from time import time
    start_time = time()
    thread_1.start()
    thread_2.start()
    thread_1.join()
    thread_2.join()
    print('all DONE at:', ctime())
    print('elapsed: {}'.format(time()-start_time))

Output:

starting at Thu Feb 28 23:48:23 2019
start get_detail_html at: Thu Feb 28 23:48:23 2019
start get_detail_url at: Thu Feb 28 23:48:23 2019
get_detail_html done at: Thu Feb 28 23:48:25 2019
get_detail_url done at: Thu Feb 28 23:48:26 2019
all DONE at: Thu Feb 28 23:48:26 2019
elapsed: 3.0017149448394775
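The two classes above differ only in their label and sleep time, so one possible refinement is a single subclass that takes the delay as a parameter and keeps its result on the instance. This is a sketch rather than the original author's code: DelayedTask, delay and elapsed are hypothetical names, and the delays are shortened to keep the run quick.

```python
import threading
from time import sleep, time


class DelayedTask(threading.Thread):
    # Hypothetical generalization of GetDetailHtml and GetDetailUrl.
    def __init__(self, name, delay):
        super().__init__(name=name)
        self.delay = delay
        self.elapsed = None  # filled in by run()

    # Override the parent class's run method
    def run(self):
        started = time()
        sleep(self.delay)
        self.elapsed = time() - started


tasks = [DelayedTask('get_detail_html', 0.2),
         DelayedTask('get_detail_url', 0.3)]
for task in tasks:
    task.start()
for task in tasks:
    task.join()
for task in tasks:
    print('{} took {:.2f}s'.format(task.name, task.elapsed))
```

Storing the result as an attribute works because run() has no way to return a value to the caller; the main thread reads the attribute only after join().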

Original article: https://www.cnblogs.com/onefine/p/10499331.html