- Code 1:
from multiprocessing import Pool
import os, time, random

def long_time_task(name):
    print('Run task %s (%s)...' % (name, os.getpid()))
    start = time.time()
    time.sleep(1)
    # time.sleep(random.random() * 3)
    end = time.time()
    print('Task %s runs %0.2f seconds.' % (name, (end - start)))

if __name__ == '__main__':
    print('Parent process %s.' % os.getpid())
    p = Pool()
    for i in range(4):
        p.map_async(long_time_task, (i,))
        # p.apply(long_time_task, args=(i,))
        # p.apply_async(long_time_task, args=(i,))
    print('Waiting for all subprocesses done...')
    p.close()
    p.join()
    print('All subprocesses done.')
Result 1:
# time python simple-2.py
Parent process 25144.
Waiting for all subprocesses done...
Run task 0 (25145)...
Run task 1 (25146)...
Run task 2 (25147)...
Run task 3 (25148)...
Task 0 runs 1.00 seconds.
Task 1 runs 1.00 seconds.
Task 2 runs 1.00 seconds.
Task 3 runs 1.00 seconds.
All subprocesses done.
real 0m1.285s
user 0m0.158s
sys 0m0.053s
- Code 2:
Use p.map(long_time_task, (i,)) instead.
Result 2:
# time python simple-2.py
Parent process 25228.
Run task 0 (25229)...
Task 0 runs 1.00 seconds.
Run task 1 (25230)...
Task 1 runs 1.00 seconds.
Run task 2 (25231)...
Task 2 runs 1.00 seconds.
Run task 3 (25232)...
Task 3 runs 1.00 seconds.
Waiting for all subprocesses done...
All subprocesses done.
real 0m4.302s
user 0m0.150s
sys 0m0.078s
Conclusion:
map_async returns immediately, so the submitted tasks run in parallel, while map blocks until its tasks have finished before the program continues; apply_async and apply differ in the same way.
- Code 3:
……
p = Pool()
for i in range(8):
    p.map_async(long_time_task, (i,))
……
Result:
# time python simple-2.py
Parent process 25400.
Waiting for all subprocesses done...
Run task 0 (25401)...
Run task 1 (25402)...
Run task 2 (25403)...
Run task 3 (25404)...
Task 0 runs 1.00 seconds.
Task 2 runs 1.00 seconds.
Task 3 runs 1.00 seconds.
Task 1 runs 1.00 seconds.
Run task 4 (25401)...
Run task 5 (25404)...
Run task 6 (25402)...
Run task 7 (25403)...
Task 4 runs 1.00 seconds.
Task 5 runs 1.00 seconds.
Task 6 runs 1.00 seconds.
Task 7 runs 1.00 seconds.
All subprocesses done.
real 0m2.292s
user 0m0.161s
sys 0m0.060s
Conclusion:
Pool() with no argument creates only 4 processes (one per CPU core on this machine), so only 4 tasks run in parallel; the remaining tasks wait for an earlier worker to finish and then reuse it.
- Code 4:
……
p = Pool(8)
for i in range(8):
    p.map_async(long_time_task, (i,))
……
Result:
# time python simple-2.py
Parent process 26592.
Waiting for all subprocesses done...
Run task 0 (26593)...
Run task 1 (26594)...
Run task 2 (26595)...
Run task 3 (26596)...
Run task 4 (26597)...
Run task 5 (26598)...
Run task 6 (26599)...
Run task 7 (26600)...
Task 0 runs 1.00 seconds.
Task 3 runs 1.00 seconds.
Task 1 runs 1.00 seconds.
Task 2 runs 1.01 seconds.
Task 7 runs 1.01 seconds.
Task 5 runs 1.01 seconds.
Task 6 runs 1.01 seconds.
Task 4 runs 1.02 seconds.
All subprocesses done.
real 0m1.310s
user 0m0.214s
sys 0m0.127s
Conclusion:
With Pool(8) on a 4-core CPU, 8 worker processes are created and all 8 tasks start at once. Because the task body just sleeps (it holds no CPU while waiting), all 8 can wait concurrently, so the total is about 1.3 seconds, clearly faster than Pool(4)'s 2.3 seconds; the ~0.3 s above the 1-second sleep is process-creation and scheduling overhead. Only for truly CPU-bound work would just 4 of the 8 tasks make progress at any moment.
Summary:
- A process does not run tasks by itself; tasks run on the threads inside the process.
- The GIL belongs to the interpreter, and one interpreter can execute only one thread at a time.
- Because of the GIL, Python threads can only take turns executing, each acquiring the lock for a time slice.
- With multiprocessing you can create multiple processes, each with its own interpreter and therefore its own GIL.
- So multiprocessing can run tasks truly in parallel.
- Concurrency vs. parallelism:
  concurrency means tasks exist at the same time, with no guarantee of how many are actually running;
  parallelism means several tasks are actually running at the same moment.
  Example:
  Pool(8) on a 4-core CPU gives 8 concurrent processes, but only 4 of them run in parallel.
About the GIL:
Execution of Python code is controlled by the Python virtual machine (also called the interpreter main loop). Python was designed so that only one thread executes in the main loop at a time, just as in a single-CPU system many programs can reside in memory but only one runs on the CPU at any moment. Likewise, although many threads can "run" inside the Python interpreter, only one of them is actually executing at any given time.
Access to the Python virtual machine is controlled by the Global Interpreter Lock (GIL); it is this lock that guarantees only one thread runs at a time.
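The time-slicing described above is visible in the interpreter itself: sys.getswitchinterval() reports how long a thread may hold the GIL before being asked to yield (5 ms by default). Note that the GIL does not replace your own locks, because it can be released between bytecode operations; a minimal sketch with an explicit threading.Lock:

```python
import sys
import threading

# how long one thread may hold the GIL before the interpreter
# requests a switch (default is 0.005 seconds)
print(sys.getswitchinterval())

lock = threading.Lock()
total = 0

def add(n):
    global total
    for _ in range(n):
        # counter += 1 is a read-modify-write; the GIL alone does not
        # make it atomic, so we guard it with an explicit lock
        with lock:
            total += 1

threads = [threading.Thread(target=add, args=(10000,)) for _ in range(4)]
for t in threads:
    t.start()
for t in threads:
    t.join()
print(total)  # 40000
```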