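What on earth are the multithreading and multiprocessing you keep hearing about?

Threads

1. What is a thread

A thread is the smallest unit of execution that the operating system can schedule. It lives inside a process and is the unit that actually does the process's work. A single process can run multiple threads concurrently.

2. Thread syntax

Creating and starting threads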

import threading
import time

def sayhi(num):  # the function each thread will run
    print("running on number:%s" % num)
    time.sleep(3)

if __name__ == '__main__':
    t1 = threading.Thread(target=sayhi, args=(1,))  # create a thread instance
    t2 = threading.Thread(target=sayhi, args=(2,))  # create another thread instance

    t1.start()  # start the thread
    t2.start()  # start the other thread

There is also a subclass-based (inheritance) style:

import threading
import time

class MyThread(threading.Thread):
    def __init__(self, num):
        super().__init__()
        self.num = num

    def run(self):  # the code each thread runs
        print("running on number:%s" % self.num)
        time.sleep(3)

if __name__ == '__main__':
    t1 = MyThread(1)
    t2 = MyThread(2)
    t1.start()
    t2.start()


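Thread objects also come with some built-in methods:

start: marks the thread as ready to run; it then waits for the CPU to schedule it
setName: sets the thread's name (deprecated since Python 3.10 in favor of the name attribute)
getName: returns the thread's name (likewise replaced by the name attribute)
setDaemon: marks the thread as a background (daemon) thread or a foreground thread, the default (deprecated since Python 3.10 in favor of the daemon attribute)
If it is a background thread, it runs alongside the main thread, and once the main thread and all other foreground threads have finished, it is stopped whether or not its work is done
If it is a foreground thread, it runs alongside the main thread, and once the main thread has finished, the program waits for the foreground thread to finish before exiting
join: blocks the calling thread until the joined thread has finished, then carries on. Calling join right after each start runs the threads one at a time, which defeats the purpose of multithreading; start all the threads first and join them afterwards
run: once the thread has been scheduled, the thread object's run method is executed automatically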
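
To make the list above concrete, here is a minimal sketch that uses the attribute spellings (name, daemon) together with start and join. The function work and the thread names "worker-1" and "bg-1" are made up for illustration.

```python
import threading
import time

def work(results, delay):
    # Record which thread ran this function, then simulate some work.
    results.append(threading.current_thread().name)
    time.sleep(delay)

results = []

# name and daemon can be passed to the constructor (or set as attributes
# before start()); they replace setName/getName/setDaemon.
worker = threading.Thread(target=work, args=(results, 0.1), name="worker-1")
background = threading.Thread(target=work, args=(results, 0.1),
                              name="bg-1", daemon=True)

# Start both threads first, then join, so they overlap instead of running in turn.
worker.start()
background.start()

# join() blocks the calling thread until the joined thread has finished.
worker.join()
background.join()

print(sorted(results))    # ['bg-1', 'worker-1']
print(worker.is_alive())  # False
print(background.daemon)  # True
```

Both threads sleep at the same time here, so the two joins together take roughly one delay rather than two.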


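Thread locks (mutexes)

Threads in the same process share data, and the order in which they are scheduled is unpredictable. As a result, when several threads modify the same piece of data, updates can be lost or corrupted (dirty data). A thread lock prevents this: only one thread at a time can hold the lock, so only one thread at a time can run the protected operation.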

import time
import threading

def addNum():
    global num  # every thread works on this global variable
    print('--get num:', num)
    time.sleep(1)
    lock.acquire()  # take the lock before modifying the data
    num -= 1  # decrement the shared variable
    lock.release()  # release the lock after the change

num = 100  # the shared variable
thread_list = []
lock = threading.Lock()  # one global lock
for i in range(100):
    t = threading.Thread(target=addNum)
    t.start()
    thread_list.append(t)

for t in thread_list:  # wait for all threads to finish
    t.join()

print('final num:', num)
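
The acquire/release pair can also be written with a with statement, which releases the lock even if the protected code raises an exception. This is a small sketch of the same idea; the names decrement and run_workers and the numbers are invented for the example.

```python
import threading

counter = 0
lock = threading.Lock()

def decrement(times):
    global counter
    for _ in range(times):
        with lock:  # acquired here, released when the block is left
            counter -= 1

def run_workers(start=1000, workers=10, times=100):
    """Start several threads that all decrement the same counter."""
    global counter
    counter = start
    threads = [threading.Thread(target=decrement, args=(times,))
               for _ in range(workers)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return counter

print(run_workers())  # 0, because 1000 - 10 * 100 = 0
```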

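Recursive locks

Taken literally, a recursive lock is a lock within a lock: a thread that already holds it can acquire it again, as long as it releases it the same number of times. We rarely need this in practice. Nesting locks makes code harder to read and invites logic errors, so a passing familiarity is enough.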

import threading

def run1():
    print("grab the first part data")
    lock.acquire()  # re-acquired by the thread that already holds it in run3
    global num
    num += 1
    lock.release()
    return num

def run2():
    print("grab the second part data")
    lock.acquire()
    global num2
    num2 += 1
    lock.release()
    return num2

def run3():
    lock.acquire()  # outer acquire; run1 and run2 acquire the same lock again
    res = run1()
    print('--------between run1 and run2-----')
    res2 = run2()
    lock.release()
    print(res, res2)

if __name__ == '__main__':
    num, num2 = 0, 0
    lock = threading.RLock()
    threads = []
    for i in range(10):
        t = threading.Thread(target=run3)
        t.start()
        threads.append(t)

    for t in threads:  # wait for every thread instead of polling active_count()
        t.join()
    print('----all threads done---')
    print(num, num2)
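
The difference between a plain Lock and an RLock shows up when the same thread tries to acquire twice. The sketch below asks without blocking, so that the plain lock reports failure instead of hanging; the variable names are only illustrative.

```python
import threading

plain = threading.Lock()
reentrant = threading.RLock()

plain.acquire()
# A second blocking acquire on a plain Lock would wait forever,
# so ask without blocking and look at the return value.
second_plain = plain.acquire(blocking=False)
plain.release()

reentrant.acquire()
# The thread that holds an RLock may acquire it again.
second_reentrant = reentrant.acquire(blocking=False)
reentrant.release()  # one release for each acquire
reentrant.release()

print(second_plain, second_reentrant)  # False True
```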


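Semaphores

A thread lock lets only one thread at a time modify a piece of data. When we want to let up to a fixed number of threads work at the same time, we can use a semaphore.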

import threading
import time

def run(n):
    semaphore.acquire()
    time.sleep(1)
    print("run the thread: %s\n" % n)
    semaphore.release()

if __name__ == '__main__':
    semaphore = threading.BoundedSemaphore(5)  # at most 5 threads run at the same time
    threads = []
    for i in range(20):
        t = threading.Thread(target=run, args=(i,))
        t.start()
        threads.append(t)

    for t in threads:  # wait for all threads without busy-waiting
        t.join()
    print('----all threads done---')
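
One way to check that a semaphore really caps concurrency is to count how many threads are inside the guarded region at once. This is a sketch under assumed names (LIMIT, active, peak); the limit of 3 and the sleep time are arbitrary.

```python
import threading
import time

LIMIT = 3
semaphore = threading.BoundedSemaphore(LIMIT)
state_lock = threading.Lock()  # protects the two counters below
active = 0  # threads currently inside the guarded region
peak = 0    # highest value of active seen so far

def worker():
    global active, peak
    with semaphore:  # at most LIMIT threads get past this line at once
        with state_lock:
            active += 1
            peak = max(peak, active)
        time.sleep(0.05)  # simulated work
        with state_lock:
            active -= 1

threads = [threading.Thread(target=worker) for _ in range(10)]
for t in threads:
    t.start()
for t in threads:
    t.join()

print(peak)  # never more than LIMIT
```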

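Events

An Event lets two or more threads coordinate with each other. It provides three main methods: set, wait, and clear.

How it works: an Event holds an internal flag. While the flag is False, a call to event.wait blocks; once the flag is True, event.wait no longer blocks.

clear: sets the flag to False
set: sets the flag to True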

import threading
import time
import random

def light():
    if not event.is_set():
        event.set()  # wait() no longer blocks: the light is green
    count = 0
    while True:
        if count < 10:
            print('\033[42;1m--green light on---\033[0m')
        elif count < 13:
            print('\033[43;1m--yellow light on---\033[0m')
        elif count < 20:
            if event.is_set():
                event.clear()
            print('\033[41;1m--red light on---\033[0m')
        else:
            count = 0
            event.set()  # switch back to green
        time.sleep(1)
        count += 1

def car(n):
    while True:
        time.sleep(random.randrange(10))
        if event.is_set():  # green light
            print("car [%s] is running.." % n)
        else:
            print("car [%s] is waiting for the red light.." % n)

if __name__ == '__main__':
    event = threading.Event()
    Light = threading.Thread(target=light)
    Light.start()
    for i in range(3):
        t = threading.Thread(target=car, args=(i,))
        t.start()
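
The traffic-light program polls the flag and never ends. For comparison, here is a shorter sketch in which the cars actually block in wait until the flag is set, and which finishes on its own. The names green and passed are invented for this example.

```python
import threading

green = threading.Event()  # the flag starts out False (red light)
passed = []

def car(n):
    green.wait()  # blocks until the flag becomes True
    passed.append(n)

cars = [threading.Thread(target=car, args=(i,)) for i in range(3)]
for c in cars:
    c.start()

timed_out = not green.wait(timeout=0.1)  # wait() returns False on timeout
passed_on_red = list(passed)             # still empty: every car is blocked

green.set()  # turn the light green, releasing every waiting car
for c in cars:
    c.join()

print(timed_out, passed_on_red, sorted(passed))  # True [] [0, 1, 2]
```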

Conditions

A Condition makes threads wait; they are released, n at a time, only when some condition is met.

import threading

def run(n):
    con.acquire()
    con.wait()  # releases the lock and blocks until notified
    print("run the thread: %s" % n)
    con.release()

if __name__ == '__main__':
    con = threading.Condition()
    for i in range(10):
        # daemon=True so threads that are still waiting do not keep the program alive after 'q'
        t = threading.Thread(target=run, args=(i,), daemon=True)
        t.start()

    while True:
        inp = input('>>>')  # number of threads to release, or 'q' to quit
        if inp == 'q':
            break
        con.acquire()
        con.notify(int(inp))
        con.release()
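
The interactive version relies on every thread already waiting before notify is called. A common way around that is wait_for, which re-checks a predicate every time the thread wakes up. The following is a non-interactive sketch of that approach, with a made-up shared list items standing in for "the condition".

```python
import threading

condition = threading.Condition()
items = []     # shared state that the condition protects
received = []

def consumer():
    with condition:
        # wait_for checks the predicate first and again after every wake-up,
        # so a notify sent before this thread started waiting is not lost.
        condition.wait_for(lambda: len(items) > 0)
        received.append(items.pop())

threads = [threading.Thread(target=consumer) for _ in range(3)]
for t in threads:
    t.start()

for value in ("a", "b", "c"):
    with condition:
        items.append(value)
        condition.notify(1)  # wake one waiting thread, if there is one

for t in threads:
    t.join()

print(sorted(received))  # ['a', 'b', 'c']
```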

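Daemon threads

Threads marked as daemon run in the background; threads that are not marked are ordinary foreground threads, like the main thread. When the main thread exits and no other foreground thread is running, the program exits whether or not the daemon threads have finished.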

import time
import threading

def run(n):
    print('[%s]------running----\n' % n)
    time.sleep(2)
    print('--done--')

def main():
    for i in range(5):
        t = threading.Thread(target=run, args=[i,])
        t.start()
        t.join(1)
        print('starting thread', t.name)

m = threading.Thread(target=main, args=[])
# Make m a daemon thread. When the main thread exits, m is stopped too, and so are
# the threads m started (they inherit the daemon flag), finished or not.
m.daemon = True
m.start()
m.join(timeout=2)
print("---main thread done----")
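
The threads started inside the daemon thread above are daemons too because a new thread inherits the daemon flag of the thread that creates it. A small sketch to confirm this; the names parent, child and flags are made up.

```python
import threading

flags = {}

def child():
    pass

def parent():
    # No daemon argument is given, so the new thread inherits the flag
    # of the thread that creates it (this one).
    t = threading.Thread(target=child)
    flags["child"] = t.daemon
    t.start()
    t.join()

p = threading.Thread(target=parent, daemon=True)
flags["parent"] = p.daemon
p.start()
p.join()

print(flags)  # {'parent': True, 'child': True}
```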

Timer

A timer: runs an operation after n seconds.

from threading import Timer

def hello():
    print("hello, world")

t = Timer(1, hello)
t.start()  # after 1 second, "hello, world" will be printed
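
A Timer can also be cancelled before it fires. In the sketch below two timers are started and one is cancelled straight away; the 0.2 second delay and the list fired are arbitrary choices for the example.

```python
import threading

fired = []

kept = threading.Timer(0.2, fired.append, args=("kept",))
cancelled = threading.Timer(0.2, fired.append, args=("cancelled",))

kept.start()
cancelled.start()
cancelled.cancel()  # stops the timer as long as it has not fired yet

# Timer is a Thread subclass, so join() waits for it to finish.
kept.join()
cancelled.join()

print(fired)  # ['kept']
```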

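A small addendum: the GIL (Global Interpreter Lock)

The GIL exists only in the CPython interpreter; other implementations such as Jython do not have it. So what exactly is the GIL, and what effect does it have? No rush, let me walk you through it.

In a standard CPython build, no matter how many threads you start or how many CPUs you have, only one thread executes Python bytecode at any given moment. Multithreaded code still appears to run concurrently because the interpreter switches between threads very quickly, and because threads blocked on I/O or sleep give up the GIL so that others can run.

So what is the GIL for? It guarantees that only one thread at a time is executing inside the interpreter, which protects the interpreter's own internal state.

Does that sound like a thread lock? Not quite. A thread lock is something your own code creates to protect your own data, whereas the GIL is built into the interpreter. The GIL does not make a statement such as num -= 1 atomic: a thread switch can still happen between reading the value and writing it back. In a small test you may never see dirty data without a thread lock, but that is luck rather than a guarantee, so shared data still needs a lock.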
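
One way to see why num -= 1 is not a single step is to look at the bytecode CPython compiles it to. This is only an inspection sketch using the standard dis module; the exact instruction names between the load and the store differ between Python versions, so the code only looks for LOAD_GLOBAL and STORE_GLOBAL.

```python
import dis

num = 100

def decrement():
    global num
    num -= 1

# List the bytecode instructions the statement compiles to.
ops = [ins.opname for ins in dis.get_instructions(decrement)]
print(ops)

# The value is loaded, changed, and stored back in separate instructions;
# another thread can be scheduled anywhere between the load and the store.
load_at = ops.index("LOAD_GLOBAL")
store_at = ops.index("STORE_GLOBAL")
print(store_at - load_at)  # more than 1: other instructions sit in between
```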

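Queues (queue)

Queues are especially useful in threaded programs, where data has to be exchanged safely between threads. This section covers the queue module that ships with Python; a brief look is enough for now, and later study goes deeper.

Why use a queue?

Queues give us two benefits:

Decoupling: the parts of a program become loosely coupled
Higher processing efficiency

Queue syntax

class queue.Queue(maxsize=0) # first in, first out
class queue.LifoQueue(maxsize=0) # last in, first out
class queue.PriorityQueue(maxsize=0) # items are retrieved in priority order, lowest value first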

import queue

q = queue.Queue()

q.put(1)
q.put(2)
q.put(3)

print(q.get())
print(q.get())
print(q.get())

# Output
# 1
# 2
# 3

First in, first out

import queue

q = queue.LifoQueue()

q.put(1)
q.put(2)
q.put(3)

print(q.get())
print(q.get())
print(q.get())

# Output
# 3
# 2
# 1

Last in, first out

import queue

q = queue.PriorityQueue()

q.put((2, 1))  # (priority, data); the lowest priority value comes out first
q.put((3, 2))
q.put((1, 3))

print(q.get())
print(q.get())
print(q.get())

# Output
# (1, 3)
# (2, 1)
# (3, 2)

Output ordered by priority
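
All three classes take a maxsize argument, which the examples above leave at 0 (unbounded). The sketch below shows what happens at the limits of a bounded queue, using the non-blocking and timeout variants so that nothing hangs; the size of 2 is arbitrary.

```python
import queue

q = queue.Queue(maxsize=2)  # holds at most two items
q.put("a")
q.put("b")

try:
    q.put_nowait("c")  # a plain put() would block here until space frees up
    overflowed = False
except queue.Full:
    overflowed = True

drained = [q.get_nowait(), q.get_nowait()]

try:
    q.get(timeout=0.1)  # a plain get() would block until an item arrives
    underflowed = False
except queue.Empty:
    underflowed = True

print(overflowed, drained, underflowed)  # True ['a', 'b'] True
```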

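The producer-consumer model

The producer-consumer pattern solves many common problems in concurrent programming. It raises a program's overall data-processing throughput by balancing the working capacity of producer threads and consumer threads.

Why use the producer-consumer pattern

In the world of threads, a producer is a thread that produces data and a consumer is a thread that consumes it. If the two are coupled directly and the producer is fast while the consumer is slow, the producer has to wait for the consumer to finish before it can produce more. Likewise, if the consumer is faster than the producer, the consumer has to wait for the producer. The producer-consumer pattern was introduced to solve this problem.

What is the producer-consumer pattern

The pattern uses a container to remove the tight coupling between producers and consumers. They do not talk to each other directly but communicate through a blocking queue: after producing data, the producer does not wait for the consumer and simply drops the data into the queue, and the consumer does not ask the producer for data but takes it straight from the queue. The blocking queue acts as a buffer that balances the processing capacity of the two sides.

Below is a very basic example of the producer-consumer model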

import time, random
import queue, threading

q = queue.Queue()

def Producer(name):
    count = 0
    while count < 20:
        time.sleep(random.randrange(3))
        q.put(count)
        print('Producer %s has produced %s baozi..' % (name, count))
        count += 1

def Consumer(name):
    count = 0
    while count < 20:
        time.sleep(random.randrange(4))
        if not q.empty():
            data = q.get()
            print(data)
            print('\033[32;1mConsumer %s has eaten %s baozi...\033[0m' % (name, data))
        else:
            print("-----no baozi anymore----")
        count += 1

p1 = threading.Thread(target=Producer, args=('A',))
c1 = threading.Thread(target=Consumer, args=('B',))
p1.start()
c1.start()
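
The consumer above checks q.empty() and gives up a turn whenever the queue happens to be empty, so it may stop before eating everything. A variant that leans on the blocking behaviour of the queue is sketched here: get waits for data, put waits for space, and a sentinel object (the made-up name STOP) tells the consumer when to quit.

```python
import queue
import threading

STOP = object()            # sentinel meaning "no more items"
q = queue.Queue(maxsize=5)  # small buffer between the two threads
eaten = []

def producer(n):
    for i in range(n):
        q.put(i)  # blocks while the buffer is full
    q.put(STOP)

def consumer():
    while True:
        item = q.get()  # blocks while the buffer is empty
        if item is STOP:
            break
        eaten.append(item)

p = threading.Thread(target=producer, args=(20,))
c = threading.Thread(target=consumer)
p.start()
c.start()
p.join()
c.join()

print(eaten == list(range(20)))  # True
```

With one producer and one consumer, a single sentinel is enough; with several consumers, each would need its own.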

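Multiprocessing

1. What is a process

A process is a running program presented to the operating system as a single unit to manage. It bundles together everything the program uses: its memory, its handles to resources such as files and network connections, and the bookkeeping for them. This collection of resources and their management is called a process.

2. Multiprocessing syntax

Creating and starting a process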

from multiprocessing import Process
import time

def f(name):
    time.sleep(2)
    print('hello', name)

if __name__ == '__main__':
    p = Process(target=f, args=('bob',))
    p.start()
    p.join()

from multiprocessing import Process
import os

def info(title):
    print(title)
    print('module name:', __name__)
    print('parent process:', os.getppid())
    print('process id:', os.getpid())
    print("\n\n")

def f(name):
    info('\033[31;1mfunction f\033[0m')
    print('hello', name)

if __name__ == '__main__':
    info('\033[32;1mmain process line\033[0m')
    p = Process(target=f, args=('bob',))
    p.start()
    p.join()

Getting process IDs

Inter-process communication

Processes do not share memory. To exchange data between two processes, you can use the following:

Queues

from multiprocessing import Process, Queue

def f(q):
    q.put([42, None, 'hello'])

if __name__ == '__main__':
    q = Queue()
    p = Process(target=f, args=(q,))
    p.start()
    print(q.get())    # prints "[42, None, 'hello']"
    p.join()

Pipes

from multiprocessing import Process, Pipe

def f(conn):
    conn.send([42, None, 'hello'])
    conn.close()

if __name__ == '__main__':
    parent_conn, child_conn = Pipe()
    p = Process(target=f, args=(child_conn,))
    p.start()
    print(parent_conn.recv())   # prints "[42, None, 'hello']"
    p.join()

Managers (shared data)

from multiprocessing import Process, Manager

def f(d, l):
    d[1] = '1'
    d['2'] = 2
    d[0.25] = None
    l.append(1)
    print(l)

if __name__ == '__main__':
    with Manager() as manager:
        d = manager.dict()  # dict proxy shared between processes
        l = manager.list(range(5))  # list proxy shared between processes
        p_list = []
        for i in range(10):
            p = Process(target=f, args=(d, l))
            p.start()
            p_list.append(p)
        for res in p_list:
            res.join()

        print(d)
        print(l)

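Process locks

Processes do not share memory, but they can still share outside resources such as the terminal. A Lock is handed to each child process when the process is created (not when it is first used), and only one process at a time can hold it, so output from different processes does not get mixed together.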

from multiprocessing import Process, Lock

def f(l, i):
    l.acquire()
    try:
        print('hello world', i)
    finally:
        l.release()

if __name__ == '__main__':
    lock = Lock()

    for num in range(10):
        Process(target=f, args=(lock, num)).start()


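Process pools

A process pool maintains a set of worker processes. When there is work to do, a process is taken from the pool; if no process in the pool is free, the program waits until one becomes available.

Two commonly used pool methods:

apply: runs the task in a worker and blocks until it returns
apply_async: submits the task and returns immediately; the result is delivered later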

from multiprocessing import Pool
import time

def Foo(i):
    time.sleep(2)
    return i + 100

def Bar(arg):  # callback, runs in the main process with Foo's return value
    print('-->exec done:', arg)

if __name__ == '__main__':  # required where child processes are spawned, e.g. on Windows
    pool = Pool(5)

    for i in range(10):
        pool.apply_async(func=Foo, args=(i,), callback=Bar)
        # pool.apply(func=Foo, args=(i,))

    print('end')
    pool.close()
    pool.join()  # wait for the pool's workers to finish; without this, the program exits right away

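Summary:

1. What processes and threads are:

A process is a collection of resources

A thread is a sequence of instructions being executed

2. To run on the CPU, a process must first create a thread; all threads in the same process share the same memory space

3. Differences between processes and threads:

First: threads share memory; each process has its own independent memory

Second: when a parent process creates two child processes, each child is a clone of the parent, and the two children are completely independent and cannot access each other's data

Two threads of the same process can communicate directly, whereas two processes that want to communicate must go through an intermediary

Third: creating a new thread is simple; creating a new process requires cloning the parent process

Fourth: a thread can control and operate on other threads in the same process, but a process can only operate on its child processes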
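
The first difference in point 3 (threads share memory, processes do not) can be checked with a short sketch. The names value, set_value and compare are invented for the example: a thread and then a child process each overwrite the same module-level variable, and the child reports what it sees through a Queue.

```python
import threading
from multiprocessing import Process, Queue

value = 0  # module-level variable; every process has its own copy

def set_value(new_value, report=None):
    """Overwrite the global and optionally report what this worker now sees."""
    global value
    value = new_value
    if report is not None:
        report.put(value)

def compare():
    """Return (parent's value after a thread wrote 1,
    value seen inside a child process that wrote 2,
    parent's value after that child finished)."""
    global value
    value = 0

    t = threading.Thread(target=set_value, args=(1,))
    t.start()
    t.join()
    after_thread = value  # the thread changed the parent's variable

    report = Queue()
    p = Process(target=set_value, args=(2, report))
    p.start()
    seen_in_child = report.get()
    p.join()
    after_process = value  # the child's write never reaches the parent

    return after_thread, seen_in_child, after_process

if __name__ == '__main__':
    print(compare())  # (1, 2, 1)
```

The thread's write is visible in the parent, while the child process changes only its own copy, which is why the Queue is needed to get anything back.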