- 进程、与线程区别
- cpu运行原理
- python GIL全局解释器锁
- 线程
- 语法
- join
- 线程锁之LockRlock信号量
- 将线程变为守护进程
- Event事件
- queue队列
- 生产者消费者模型
- Queue队列
- 开发一个线程池
- 进程
- 语法
- 进程间通讯
- 进程池
进程与线程
什么是线程(thread)?
线程是操作系统能够进行运算调度的最小单位。它被包含在进程之中,是进程中的实际运作单位。一条线程指的是进程中一个单一顺序的控制流,一个进程中可以并发多个线程,每条线程并行执行不同的任务
A thread is an execution context, which is all the information a CPU needs to execute a stream of instructions.
Suppose you're reading a book, and you want to take a break right now, but you want to be able to come back and resume reading from the exact point where you stopped. One way to achieve that is by jotting down the page number, line number, and word number. So your execution context for reading a book is these 3 numbers.
If you have a roommate, and she's using the same technique, she can take the book while you're not using it, and resume reading from where she stopped. Then you can take it back, and resume it from where you were.
Threads work in the same way. A CPU is giving you the illusion that it's doing multiple computations at the same time. It does that by spending a bit of time on each computation. It can do that because it has an execution context for each computation. Just like you can share a book with your friend, many tasks can share a CPU.
On a more technical level, an execution context (therefore a thread) consists of the values of the CPU's registers.
Last: threads are different from processes. A thread is a context of execution, while a process is a bunch of resources associated with a computation. A process can have one or many threads.
Clarification: the resources associated with a process include memory pages (all the threads in a process have the same view of the memory), file descriptors (e.g., open sockets), and security credentials (e.g., the ID of the user who started the process).
什么是进程(process)?
An executing instance of a program is called a process.
Each process provides the resources needed to execute a program. A process has a virtual address space, executable code, open handles to system objects, a security context, a unique process identifier, environment variables, a priority class, minimum and maximum working set sizes, and at least one thread of execution. Each process is started with a single thread, often called the primary thread, but can create additional threads from any of its threads.
进程与线程的区别?
- Threads share the address space of the process that created it; processes have their own address space.
- Threads have direct access to the data segment of its process; processes have their own copy of the data segment of the parent process.
- Threads can directly communicate with other threads of its process; processes must use interprocess communication to communicate with sibling processes.
- New threads are easily created; new processes require duplication of the parent process.
- Threads can exercise considerable control over threads of the same process; processes can only exercise control over child processes.
- Changes to the main thread (cancellation, priority change, etc.) may affect the behavior of the other threads of the process; changes to the parent process does not affect child processes 。
Python threading模块
线程有2种调用方式,如下:
直接调用
1 import threading 2 import time 3 4 def sayhi(num): #定义每个线程要运行的函数 5 6 print("running on number:%s" %num) 7 8 time.sleep(3) 9 10 if __name__ == '__main__': 11 12 t1 = threading.Thread(target=sayhi,args=(1,)) #生成一个线程实例 13 t2 = threading.Thread(target=sayhi,args=(2,)) #生成另一个线程实例 14 15 t1.start() #启动线程 16 t2.start() #启动另一个线程 17 18 print(t1.getName()) #获取线程名 19 print(t2.getName())
继承式调用
1 import threading 2 import time 3 4 class MyThread(threading.Thread): 5 def __init__(self,num): 6 threading.Thread.__init__(self) 7 self.num = num 8 9 def run(self):#定义每个线程要运行的函数 10 11 print("running on number:%s" %self.num) 12 13 time.sleep(3) 14 15 if __name__ == '__main__': 16 17 t1 = MyThread(1) 18 t2 = MyThread(2) 19 t1.start() 20 t2.start()
Threading用于提供线程相关的操作,线程是应用程序中工作的最小单元。
1 #!/usr/bin/env python 2 # -*- coding:utf-8 -*- 3 import threading 4 import time 5 6 def show(arg): 7 time.sleep(1) 8 print 'thread'+str(arg) 9 10 for i in range(10): 11 t = threading.Thread(target=show, args=(i,)) 12 t.start() 13 14 print 'main thread stop'
上述代码创建了10个“前台”线程,然后控制器就交给了CPU,CPU根据指定算法进行调度,分片执行指令。
更多方法:
- start 线程准备就绪,等待CPU调度
- setName 为线程设置名称
- getName 获取线程名称
- setDaemon 设置为后台线程或前台线程(默认)
如果是后台线程,主线程执行过程中,后台线程也在进行,主线程执行完毕后,后台线程不论成功与否,均停止
如果是前台线程,主线程执行过程中,前台线程也在进行,主线程执行完毕后,等待前台线程也执行完成后,程序停止 - join 逐个执行每个线程,执行完毕后继续往下执行,该方法使得多线程变得无意义
- run 线程被cpu调度后自动执行线程对象的run方法
Join & Daemon
Some threads do background tasks, like sending keepalive packets, or performing periodic garbage collection, or whatever. These are only useful when the main program is running, and it's okay to kill them off once the other, non-daemon, threads have exited.
Without daemon threads, you'd have to keep track of them, and tell them to exit, before your program can completely quit. By setting them as daemon threads, you can let them run and forget about them, and when your program quits, any daemon threads are killed automatically.
1 import time 2 import threading 3 4 def run(n): 5 6 print('[%s]------running---- ' % n) 7 time.sleep(2) 8 print('--done--') 9 10 def main(): 11 for i in range(5): 12 t = threading.Thread(target=run,args=[i,]) 13 #time.sleep(1) 14 t.start() 15 t.join(1) 16 print('starting thread', t.getName()) 17 18 19 m = threading.Thread(target=main,args=[]) 20 m.setDaemon(True) #将主线程设置为Daemon线程,它退出时,其它子线程会同时退出,不管是否执行完任务 21 m.start() 22 #m.join(timeout=2) 23 print("---main thread done----")
Note:Daemon threads are abruptly stopped at shutdown. Their resources (such as open files, database transactions, etc.) may not be released properly. If you want your threads to stop gracefully, make them non-daemonic and use a suitable signalling mechanism such as an Event
.
线程锁(互斥锁Mutex)
一个进程下可以启动多个线程,多个线程共享父进程的内存空间,也就意味着每个线程可以访问同一份数据,此时,如果2个线程同时要修改同一份数据,会出现什么状况?
1 import time 2 import threading 3 4 def addNum(): 5 global num #在每个线程中都获取这个全局变量 6 print('--get num:',num ) 7 time.sleep(1) 8 num -=1 #对此公共变量进行-1操作 9 10 num = 100 #设定一个共享变量 11 thread_list = [] 12 for i in range(100): 13 t = threading.Thread(target=addNum) 14 t.start() 15 thread_list.append(t) 16 17 for t in thread_list: #等待所有线程执行完毕 18 t.join() 19 20 21 print('final num:', num )
正常来讲,这个num结果应该是0, 但在python 2.7上多运行几次,会发现,最后打印出来的num结果不总是0,为什么每次运行的结果不一样呢? 哈,很简单,假设你有A,B两个线程,此时都 要对num 进行减1操作, 由于2个线程是并发同时运行的,所以2个线程很有可能同时拿走了num=100这个初始变量交给cpu去运算,当A线程去处完的结果是99,但此时B线程运算完的结果也是99,两个线程同时CPU运算的结果再赋值给num变量后,结果就都是99。那怎么办呢? 很简单,每个线程在要修改公共数据时,为了避免自己在还没改完的时候别人也来修改此数据,可以给这个数据加一把锁, 这样其它线程想修改此数据时就必须等待你修改完毕并把锁释放掉后才能再访问此数据。
*注:不要在3.x上运行,不知为什么,3.x上的结果总是正确的,可能是自动加了锁
加锁版本
1 import time 2 import threading 3 4 def addNum(): 5 global num #在每个线程中都获取这个全局变量 6 print('--get num:',num ) 7 time.sleep(1) 8 lock.acquire() #修改数据前加锁 9 num -=1 #对此公共变量进行-1操作 10 lock.release() #修改后释放 11 12 num = 100 #设定一个共享变量 13 thread_list = [] 14 lock = threading.Lock() #生成全局锁 15 for i in range(100): 16 t = threading.Thread(target=addNum) 17 t.start() 18 thread_list.append(t) 19 20 for t in thread_list: #等待所有线程执行完毕 21 t.join() 22 23 print('final num:', num )
RLock(递归锁)
说白了就是在一个大锁中还要再包含子锁
1 import threading,time 2 3 def run1(): 4 print("grab the first part data") 5 lock.acquire() 6 global num 7 num +=1 8 lock.release() 9 return num 10 def run2(): 11 print("grab the second part data") 12 lock.acquire() 13 global num2 14 num2+=1 15 lock.release() 16 return num2 17 def run3(): 18 lock.acquire() 19 res = run1() 20 print('--------between run1 and run2-----') 21 res2 = run2() 22 lock.release() 23 print(res,res2) 24 25 26 if __name__ == '__main__': 27 28 num,num2 = 0,0 29 lock = threading.RLock() 30 for i in range(10): 31 t = threading.Thread(target=run3) 32 t.start() 33 34 while threading.active_count() != 1: 35 print(threading.active_count()) 36 else: 37 print('----all threads done---') 38 print(num,num2)
Semaphore(信号量)
互斥锁 同时只允许一个线程更改数据,而Semaphore是同时允许一定数量的线程更改数据 ,比如厕所有3个坑,那最多只允许3个人上厕所,后面的人只能等里面有人出来了才能再进去。
1 import threading,time 2 3 def run(n): 4 semaphore.acquire() 5 time.sleep(1) 6 print("run the thread: %s " %n) 7 semaphore.release() 8 9 if __name__ == '__main__': 10 11 num= 0 12 semaphore = threading.BoundedSemaphore(5) #最多允许5个线程同时运行 13 for i in range(20): 14 t = threading.Thread(target=run,args=(i,)) 15 t.start() 16 17 while threading.active_count() != 1: 18 pass #print threading.active_count() 19 else: 20 print('----all threads done---') 21 print(num)
Events
An event is a simple synchronization object;
the event represents an internal flag, and threads
can wait for the flag to be set, or set or clear the flag themselves.
event = threading.Event()
# a client thread can wait for the flag to be set
event.wait()
# a server thread can set or reset it
event.set()
event.clear()
If the flag is set, the wait method doesn’t do anything.
If the flag is cleared, wait will block until it becomes set again.
Any number of threads may wait for the same event.
python线程的事件用于主线程控制其他线程的执行,事件主要提供了三个方法 set、wait、clear。
事件处理的机制:全局定义了一个“Flag”,如果“Flag”值为 False,那么当程序执行 event.wait 方法时就会阻塞,如果“Flag”值为True,那么event.wait 方法时便不再阻塞。
clear:将“Flag”设置为False
set:将“Flag”设置为True
通过Event来实现两个或多个线程间的交互,下面是一个红绿灯的例子,即起动一个线程做交通指挥灯,生成几个线程做车辆,车辆行驶按红灯停,绿灯行的规则。
1 import threading,time 2 import random 3 def light(): 4 if not event.isSet(): 5 event.set() #wait就不阻塞 #绿灯状态 6 count = 0 7 while True: 8 if count < 10: 9 print('