Python3-进程池与线程池

进程池与线程池

在刚开始学多进程或多线程时，我们迫不及待地基于多进程或多线程实现并发的套接字通信，然而这种实现方式的致命缺陷是：服务的开启的进程数或线程数都会随着并发的客户端数目地增多而增多，这会对服务端主机带来巨大的压力，甚至于不堪重负而瘫痪，于是我们必须对服务端开启的进程数或线程数加以控制，让机器在一个自己可以承受的范围内运行，这就是进程池或线程池的用途，例如进程池，就是用来存放进程的池子，本质还是基于多进程，只不过是对开启进程的数目加上了限制

介绍

官网：https://docs.python.org/dev/library/concurrent.futures.html

concurrent.futures模块提供了高度封装的异步调用接口
ThreadPoolExecutor：线程池，提供异步调用
ProcessPoolExecutor: 进程池，提供异步调用
Both implement the same interface, which is defined by the abstract Executor class.

基本方法

1、submit(fn, *args, **kwargs)
异步提交任务

2、map(func, *iterables, timeout=None, chunksize=1) 
取代for循环submit的操作

3、shutdown(wait=True) 
相当于进程池的pool.close()+pool.join()操作
wait=True，等待池内所有任务执行完毕回收完资源后才继续
wait=False，立即返回，并不会等待池内的任务执行完毕
但不管wait参数为何值，整个程序都会等到所有任务执行完毕
submit和map必须在shutdown之前

4、result(timeout=None)
取得结果

5、add_done_callback(fn)
回调函数

线程池用法

from concurrent.futures import ThreadPoolExecutor,ProcessPoolExecutor

import os,time,random
def task(n):
    print('%s is runing' %os.getpid())
    time.sleep(random.randint(1,3))
    return n**2

if __name__ == '__main__':

    executor=ProcessPoolExecutor(max_workers=3)

    futures=[]
    for i in range(11):
        future=executor.submit(task,i)
        futures.append(future)
    executor.shutdown(True)
    print('+++>')
    for future in futures:
        print(future.result())

进程程池用法

把ThreadPoolExecutor换成ProcessPoolExecutor，其余用法全部相同

回调函数

可以为进程池或线程池内得每个进程或线程绑定一个函数，该函数在进程或线程的任务执行完毕后自动触发，并接受任务的返回值当作参数，该函数成为回调函数。

提交任务的两种方法：

1、同步调用：提交完任务后，就在原地等待任务执行完毕，拿到结果，在执行下一行代码，导致程序是串行

2、异步调用：提交完任务后，不用原地等待任务执行完毕

#提交任务的两种方式
#1、同步调用:提交完任务后，就在原地等待任务执行完毕，拿到结果，再执行下一行代码,导致程序是串行执行
#
# from concurrent.futures import ThreadPoolExecutor
# import time
# import random
#
# def la(name):
#     print('%s is laing' %name)
#     time.sleep(random.randint(3,5))
#     res=random.randint(7,13)*'#'
#     return {'name':name,'res':res}
#
# def weigh(shit):
#     name=shit['name']
#     size=len(shit['res'])
#     print('%s 拉了 《%s》kg' %(name,size))
#
#
# if __name__ == '__main__':
#     pool=ThreadPoolExecutor(13)
#
#     shit1=pool.submit(la,'alex').result()
#     weigh(shit1)
#
#     shit2=pool.submit(la,'wupeiqi').result()
#     weigh(shit2)
#
#     shit3=pool.submit(la,'yuanhao').result()
#     weigh(shit3)


#2、异步调用：提交完任务后，不地等待任务执行完毕，

from concurrent.futures import ThreadPoolExecutor
import time
import random

def la(name):
    print('%s is laing' %name)
    time.sleep(random.randint(3,5))
    res=random.randint(7,13)*'#'
    return {'name':name,'res':res}


def weigh(shit):
    shit=shit.result()
    name=shit['name']
    size=len(shit['res'])
    print('%s 拉了 《%s》kg' %(name,size))


if __name__ == '__main__':
    pool=ThreadPoolExecutor(13)

    pool.submit(la,'alex').add_done_callback(weigh)

    pool.submit(la,'wupeiqi').add_done_callback(weigh)

    pool.submit(la,'yuanhao').add_done_callback(weigh)

View Code

from concurrent.futures import ThreadPoolExecutor
import requests
import time

def get(url):
    print('GET %s' % url)
    response = requests.get(url)
    time.sleep(3)
    return {'url': url, 'content': response.text}

def parse(res):
    res = res.result()
    print('%s parse res is %s' % (res['url'], len(res['content'])))


if __name__ == '__main__':
    urls=[
        'http://www.cnblogs.com/linhaifeng',
        'https://www.python.org',
        'https://www.openstack.org',
    ]

    pool = ThreadPoolExecutor(2)

    for url in urls:
        pool.submit(get, url).add_done_callback(parse)

线程池回调函数简单爬网页