Using multithreading in Python: the thread module

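I have recently been studying multithreading in Python, with the aim of making a Python web-crawler project more efficient.
Most of what you find online about multithreading in Python uses the threading module, but Python has another module, thread, that can also handle multithreaded tasks.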

Of the two, threading is the higher-level module and is built on top of thread, but thread exists for a reason. This post introduces how thread is used in Python.

This post summarizes the Python documentation: https://docs.python.org/2/library/thread.html?highlight=thread#module-thread

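The thread module provides low-level primitives for working with multiple threads (also called light-weight processes). For synchronization it provides simple locks (also called mutexes or binary semaphores).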

After running import thread in IDLE, help(thread) lists the module's classes and functions. The help text mainly covers the lock type, the error exception, and the thread-related functions.

Main functions. The thread module provides the following functions:

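thread.interrupt_main():
Raises a KeyboardInterrupt exception in the main thread (the exception normally triggered by Ctrl+C or Delete). A child thread can use this function to interrupt the main thread.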

thread.exit():
Raises the SystemExit exception. If the exception is not caught, the calling thread exits silently.

thread.allocate_lock():
Returns a new lock object. The lock's methods are described below. The lock is initially unlocked.

thread.get_ident():
Returns the identifier of the current thread. The return value is a nonzero integer with no direct meaning of its own.

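thread.stack_size([size]):
Returns the thread stack size used when creating new threads. The optional size argument sets the stack size for threads created afterwards.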
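
The functions above can be tried together in a short script. The following is a minimal sketch, not taken from the documentation: the names results, done, and worker are made up for illustration, and the try/except import is an assumption that lets the same code run on Python 3, where the module was renamed to _thread.

```python
try:
    import thread             # Python 2 name, as used in this post
except ImportError:
    import _thread as thread  # Python 3 renamed the module to _thread

results = {}                   # hypothetical dict recording what the worker saw
done = thread.allocate_lock()  # held until the worker has finished
done.acquire()

def worker():
    results["worker_id"] = thread.get_ident()
    try:
        thread.exit()             # raises SystemExit inside this thread
    except SystemExit:
        results["caught"] = True  # catching it keeps the thread running
    finally:
        done.release()            # tell the main thread we are finished

main_id = thread.get_ident()
thread.start_new_thread(worker, ())
done.acquire()                    # block until the worker releases the lock

print(main_id != results["worker_id"])  # the two live threads have different ids
print(results["caught"])                # SystemExit was raised and caught
print(thread.stack_size())              # current stack size setting
```

Because SystemExit is caught here, the worker carries on to the finally clause. Without the try block the thread would simply end at the thread.exit() call.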

Lock objects have the following methods:

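lock.acquire([waitflag]):
Without the optional argument, this method acquires the lock unconditionally. If the lock is already held, the call waits until it is released and then acquires it. If the optional integer argument is given, the behavior depends on its value:
if it is 0, the lock is acquired only if that can be done immediately, without waiting; if it is nonzero, the lock is acquired unconditionally, as before. The return value is True if the lock was acquired and False otherwise.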

lock.release():
Releases the lock. The lock must have been acquired earlier, but not necessarily by the same thread.

lock.locked():
Returns the status of the lock: True if it has been acquired by some thread, False otherwise.
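
The effect of waitflag is easiest to see on a lock that is already held. The following is a minimal single-threaded sketch: the variable names are made up for illustration, and the try/except import is an assumption that lets it run on Python 3 as well, where the module is called _thread.

```python
try:
    import thread             # Python 2 name, as used in this post
except ImportError:
    import _thread as thread  # Python 3 renamed the module to _thread

lock = thread.allocate_lock()

first = lock.acquire()     # no argument: waits if needed; the lock is free, so True
second = lock.acquire(0)   # waitflag 0: lock already held, returns False at once
held = lock.locked()       # True, the first acquire still holds the lock
lock.release()
third = lock.acquire(0)    # the lock is free again, so this succeeds: True
lock.release()
free = lock.locked()       # False, nobody holds the lock now

print("%s %s %s %s %s" % (first, second, held, third, free))
# prints: True False True True False
```

Calling lock.acquire() a second time without the 0 would block forever here, because no other thread exists to release the lock.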

In addition to these methods, lock objects can also be used via the with statement:

import thread  # import the thread module
a_lock = thread.allocate_lock()
with a_lock:
    print "a_lock is locked while this executes"

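The help(thread) output groups the module's functions as follows.

1) Lock-related functions:
Allocate a lock: allocate_lock() -> lock object (allocate() is an obsolete synonym)
Lock methods: acquire(), release(), and locked(), as described above.

2) Thread-related functions:
Create a new thread: start_new_thread(function, args[, kwargs]) (start_new() is an obsolete synonym)
Exit a thread: exit() and exit_thread() (PyThread_exit_thread() is an obsolete synonym)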
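
The signature start_new_thread(function, args[, kwargs]) takes the positional arguments as a tuple and the keyword arguments as an optional dictionary. The following is a minimal sketch of a call that uses both: greet, messages, and done are hypothetical names, and the try/except import is an assumption that lets the code run on Python 3, where the module is called _thread.

```python
try:
    import thread             # Python 2 name, as used in this post
except ImportError:
    import _thread as thread  # Python 3 renamed the module to _thread

messages = []                  # hypothetical list filled in by the new thread
done = thread.allocate_lock()  # held until the new thread has finished
done.acquire()

def greet(name, punctuation="!"):
    messages.append("hello " + name + punctuation)
    done.release()             # signal the main thread

# args must be a tuple, even for one argument; kwargs is an optional dict
thread.start_new_thread(greet, ("world",), {"punctuation": "?"})

done.acquire()                 # wait here until greet() has run
print(messages[0])             # prints: hello world?
```

The lock is needed because start_new_thread returns immediately, so without the final done.acquire() the main thread could reach the print before greet() has run.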

I. Example of the lock-related functions (error omitted):

import thread

def print_status(a_lock):
    if a_lock.locked():
        print "locked"
    else:
        print "not locked"

a_lock = thread.allocate_lock()
print_status(a_lock)
a_lock.acquire()
print_status(a_lock)
a_lock.release()
print_status(a_lock)

II. Example of the thread-related functions:

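import thread

def run(n):
    # Back door: a thread started with n == 4 exits at once and prints nothing.
    if n == 4:
        thread.exit()
    for i in range(n):
        print i

thread.start_new_thread(run,(5,))
raw_input("finished")  # keep the main thread alive so the new thread can run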

III. Solving a synchronization problem

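Consider the following synchronization problem: use two threads to print "Hello" and "World" alternately, five times each, starting with "Hello" and ending with "World".
① The synchronization model for the HelloWorld problem: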

semaphore h = 1, w = 0 
# Semaphore w starts at 0, so in Python the lock that stands in for w must be acquired up front so that it starts out locked.
thread1()
{
    while(true)
    {
        p(h)
        do something;
        v(w)
    }
}
thread2()
{
    while(true)
    {
        p(w)
        do something;
        v(h)
    }
}

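② Implementing the model above in Python. Two solutions follow.
Solution A uses the main thread and one other thread to print alternately.

Solution B uses two threads other than the main thread to print "Hello" and "World" alternately.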

# Solution A: the main thread and one worker thread
import thread

def world():
    for i in range(5):
        w_lock.acquire()    # wait for permission to print "world"
        print "world"
        h_lock.release()    # let the main thread print "hello"
    w_lock.release()

# main thread
print "use two threads to print hello&world"

h_lock = thread.allocate_lock()
w_lock = thread.allocate_lock()

w_lock.acquire()  # start with w_lock held so "world" cannot print first
thread.start_new_thread(world,())
for i in range(5):
    h_lock.acquire()
    print "hello"
    w_lock.release()

h_lock.acquire()  # wait for the last "world" before the main thread exits

# Solution B: two worker threads; the main thread only waits
import thread

def hello():
    for i in range(5):
        h_ok.acquire()
        print "hello"
        w_ok.release()
def world():
    for i in range(5):
        w_ok.acquire()        
        print "world"
        h_ok.release()

# main thread
print "use two threads to print hello&world"
h_ok = thread.allocate_lock()
w_ok = thread.allocate_lock()
w_ok.acquire()
thread.start_new_thread(hello,())
thread.start_new_thread(world,())

raw_input("finished")  # required, so the main thread does not exit before the two threads finish
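
Waiting on raw_input works, but it relies on the user not pressing Enter too early. One alternative is to have the main thread wait on an extra lock that the last thread releases when it is done. The following is a minimal sketch of Solution B along those lines: the done lock and the output list are additions made for illustration, and the try/except import is an assumption that lets the code run on Python 3, where the module is called _thread.

```python
try:
    import thread             # Python 2 name, as used in this post
except ImportError:
    import _thread as thread  # Python 3 renamed the module to _thread

output = []                    # hypothetical list collecting the words in order

h_ok = thread.allocate_lock()  # free: "hello" may go first
w_ok = thread.allocate_lock()
done = thread.allocate_lock()  # hypothetical lock, held until world() finishes
w_ok.acquire()                 # "world" must wait for the first "hello"
done.acquire()

def hello():
    for i in range(5):
        h_ok.acquire()
        output.append("hello")
        w_ok.release()         # allow the next "world"

def world():
    for i in range(5):
        w_ok.acquire()
        output.append("world")
        h_ok.release()         # allow the next "hello"
    done.release()             # the last "world" has been recorded

thread.start_new_thread(hello, ())
thread.start_new_thread(world, ())

done.acquire()                 # block until world() signals completion
print("\n".join(output))
```

Since "World" is always the last word, waiting for world() alone is enough: hello() has already finished its final iteration by the time the last "world" is recorded.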