Understanding the pthread_cond_XXX condition variables for Linux thread synchronization

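In Linux multithreaded programming, the order in which threads run is unpredictable. Sometimes, though, several threads need to start up in a particular order. A crude approach works in simple cases: with three threads A, B and C that must run in the order A --> B --> C, you can create A, sleep, create B, sleep, then create C. That is unreliable and wastes time. And what if the requirement is stricter, for example B and C may only proceed once A is running and has initialized some state? It was only when I read about condition variables in APUE that I found a proper way to do this.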

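A condition variable must be used together with a mutex. Its type is pthread_cond_t. Because it is used across threads and every thread has to be able to see it, it is usually defined as a global variable. The functions that operate on condition variables are: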

Initialization and destruction

SYNOPSIS
       #include <pthread.h>

       int pthread_cond_destroy(pthread_cond_t *cond);
       int pthread_cond_init(pthread_cond_t *restrict cond,
              const pthread_condattr_t *restrict attr);   // dynamic initialization; call destroy to release resources when done
       pthread_cond_t cond = PTHREAD_COND_INITIALIZER;    // static initialization
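
The dynamic form is mostly useful when the condition variable is not a global, for example when it lives in a heap-allocated structure. The following is only a sketch under that assumption: struct gate, gate_create and gate_destroy are names I made up for illustration, not part of pthreads.

```c
#include <assert.h>
#include <pthread.h>
#include <stdlib.h>

/* Hypothetical container: a flag plus the mutex and condition variable guarding it. */
struct gate {
    pthread_mutex_t mtx;
    pthread_cond_t  cond;
    int             open;   /* the actual "condition" */
};

/* Allocate a gate and initialize its members dynamically; returns NULL on failure. */
struct gate *gate_create(void)
{
    struct gate *g = malloc(sizeof *g);

    if (g == NULL)
        return NULL;
    if (pthread_mutex_init(&g->mtx, NULL) != 0) {
        free(g);
        return NULL;
    }
    if (pthread_cond_init(&g->cond, NULL) != 0) {   /* NULL attr means default attributes */
        pthread_mutex_destroy(&g->mtx);
        free(g);
        return NULL;
    }
    g->open = 0;
    return g;
}

/* Release the resources; no thread may still be waiting on g->cond.
 * Returns 0 on success, otherwise the first pthread error code. */
int gate_destroy(struct gate *g)
{
    int rc;

    assert(g != NULL);
    rc = pthread_cond_destroy(&g->cond);
    if (rc == 0)
        rc = pthread_mutex_destroy(&g->mtx);
    free(g);
    return rc;
}
```

A caller would pair gate_create() with gate_destroy() once all threads using the gate have finished. For a global, the static initializer is simpler and needs no destroy call in a short-lived program.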

Waiting on a condition

       int pthread_cond_timedwait(pthread_cond_t *restrict cond,
              pthread_mutex_t *restrict mutex,
              const struct timespec *restrict abstime);
       int pthread_cond_wait(pthread_cond_t *restrict cond,
              pthread_mutex_t *restrict mutex);

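pthread_cond_timedwait is the variant of wait with a timeout: if the deadline passes and the condition variable still has not been signaled, it returns a timeout error (ETIMEDOUT). Note that abstime is an absolute point in time, not an interval. The mutex argument is the mutex used together with the condition variable.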
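
Because the deadline is absolute, a relative timeout has to be added to the current time first. Here is a minimal sketch of that, assuming the default clock (CLOCK_REALTIME) and a hypothetical flag named ready; wait_ready_ms is my own helper name.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <errno.h>
#include <pthread.h>
#include <time.h>

static pthread_mutex_t mtx  = PTHREAD_MUTEX_INITIALIZER;
static pthread_cond_t  cond = PTHREAD_COND_INITIALIZER;
static int ready = 0;   /* hypothetical condition, guarded by mtx */

/* Wait until ready becomes nonzero or timeout_ms milliseconds have passed.
 * Returns 0 if the condition is true, ETIMEDOUT if the deadline expired first. */
int wait_ready_ms(long timeout_ms)
{
    struct timespec abstime;
    int rc = 0;

    assert(timeout_ms >= 0);

    /* Build the absolute deadline: now + timeout. */
    clock_gettime(CLOCK_REALTIME, &abstime);
    abstime.tv_sec  += timeout_ms / 1000;
    abstime.tv_nsec += (timeout_ms % 1000) * 1000000L;
    if (abstime.tv_nsec >= 1000000000L) {   /* keep tv_nsec below one second */
        abstime.tv_sec  += 1;
        abstime.tv_nsec -= 1000000000L;
    }

    pthread_mutex_lock(&mtx);
    while (!ready && rc == 0)
        rc = pthread_cond_timedwait(&cond, &mtx, &abstime);
    if (ready)      /* the condition wins even if it arrived right at the deadline */
        rc = 0;
    pthread_mutex_unlock(&mtx);
    return rc;
}
```

Since the deadline is computed once, going around the loop again after an early wakeup does not extend the total waiting time.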

Signaling a condition

       int pthread_cond_broadcast(pthread_cond_t *cond);
       int pthread_cond_signal(pthread_cond_t *cond);

Usage:

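Assume there are two threads: one waits for a condition to become true, the other changes the condition and sends a notification that it has changed. The waiting thread: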

    while ( 1 )
    {
        pthread_mutex_lock(&mtx);
        while ( condition == FALSE )
        {
            pthread_cond_wait(&cond, &mtx);
        }
        condition = FALSE;    /* consume the condition by resetting it */
        pthread_mutex_unlock(&mtx);
    }

The thread that changes the condition and notifies:

    pthread_mutex_lock(&mtx);
    condition = TRUE;
    pthread_mutex_unlock(&mtx);
    pthread_cond_signal(&cond);

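Note that the 'condition' in this code and the condition variable cond are two different things: cond is only a messenger that tells other threads the 'condition' has changed.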
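
To tie this back to the A/B/C startup problem from the beginning, here is a rough sketch of how the pattern might look there. The names initialized, config, seen and run_startup_demo are my own, and pthread_cond_broadcast is used because two threads are waiting. Unlike the fragment above, the waiters leave the flag set, so that both of them get through. This only guarantees that B and C run after A's initialization; it does not order B relative to C.

```c
#include <assert.h>
#include <pthread.h>

static pthread_mutex_t mtx  = PTHREAD_MUTEX_INITIALIZER;
static pthread_cond_t  cond = PTHREAD_COND_INITIALIZER;
static int initialized = 0;   /* the "condition", guarded by mtx */
static int config = 0;        /* hypothetical state that A prepares */
static int seen[2];           /* what B and C observed */

/* A: prepare the shared state, then announce it. */
static void *thread_a(void *arg)
{
    (void)arg;
    pthread_mutex_lock(&mtx);
    config = 42;
    initialized = 1;
    pthread_mutex_unlock(&mtx);
    pthread_cond_broadcast(&cond);   /* wake both B and C */
    return NULL;
}

/* B and C: do nothing until A has finished initializing. */
static void *thread_bc(void *arg)
{
    int idx = *(int *)arg;

    pthread_mutex_lock(&mtx);
    while (!initialized)
        pthread_cond_wait(&cond, &mtx);
    seen[idx] = config;   /* the flag stays set so the other waiter passes too */
    pthread_mutex_unlock(&mtx);
    return NULL;
}

/* Returns 1 if both B and C saw the state A initialized. */
int run_startup_demo(void)
{
    pthread_t a, b, c;
    int ids[2] = { 0, 1 };
    int rc;

    /* Start B and C first on purpose; they still cannot get ahead of A. */
    rc = pthread_create(&b, NULL, thread_bc, &ids[0]);
    assert(rc == 0);
    rc = pthread_create(&c, NULL, thread_bc, &ids[1]);
    assert(rc == 0);
    rc = pthread_create(&a, NULL, thread_a, NULL);
    assert(rc == 0);

    pthread_join(a, NULL);
    pthread_join(b, NULL);
    pthread_join(c, NULL);
    return seen[0] == 42 && seen[1] == 42;
}
```

Calling run_startup_demo() from main should return 1 no matter how the scheduler interleaves the three threads, with no sleep calls involved.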

First, a note on the flow of the two code fragments: inside pthread_cond_wait, mtx is unlocked on the way in, and locked again before the function returns.

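Consider what happens when the two threads run in an arbitrary order. Call the thread that changes the condition A and the waiting thread B. Suppose A locks first, so B blocks on the mutex. A changes the condition, unlocks, and sends the notification. Because A has released the lock, B wakes up immediately and acquires it; at this point A cannot change the condition, because B holds the lock. If B has to wait, it calls pthread_cond_wait, which releases the lock and blocks; when the notification arrives, B is woken from the condition variable's wait queue and takes the lock again. (I felt that this wait step has to be atomic, otherwise A could grab the lock in the gap between the unlock and the actual wait and its notification would be missed. That is in fact what POSIX specifies: pthread_cond_wait releases the mutex and blocks on the condition variable atomically.)

After B has done its work it unlocks. From then on, whichever of A and B gets the lock next, A always notifies after unlocking, and B always releases the lock before it starts waiting, holds it while processing, and releases it afterwards. In other words, the channel is open whenever a change of the condition needs to be delivered, and closed while the condition is being modified.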
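
The locking behavior of pthread_cond_wait can be observed from user code, at least partly. The sketch below is my own experiment, not something from APUE. It assumes an error-checking mutex (PTHREAD_MUTEX_ERRORCHECK), which reports an attempt to relock a mutex the caller already owns as EDEADLK. The names waiting, relock_result and run_wait_demo are made up for the example.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <errno.h>
#include <pthread.h>
#include <sched.h>

static pthread_mutex_t mtx;   /* error-checking mutex, set up in run_wait_demo() */
static pthread_cond_t  cond = PTHREAD_COND_INITIALIZER;
static int waiting = 0;       /* B has reached the wait loop */
static int condition = 0;     /* the condition that A changes */
static int relock_result = 0; /* what B got when it tried to lock mtx again */

static void *thread_b(void *arg)
{
    (void)arg;
    pthread_mutex_lock(&mtx);
    waiting = 1;
    while (!condition)
        pthread_cond_wait(&cond, &mtx);   /* releases mtx while blocked */
    /* Back from wait, B owns mtx again, so a second lock attempt is refused. */
    relock_result = pthread_mutex_lock(&mtx);
    pthread_mutex_unlock(&mtx);
    return NULL;
}

/* Plays the role of A; returns the result of B's relock attempt. */
int run_wait_demo(void)
{
    pthread_mutexattr_t attr;
    pthread_t b;
    int seen_waiting = 0;
    int rc;

    pthread_mutexattr_init(&attr);
    pthread_mutexattr_settype(&attr, PTHREAD_MUTEX_ERRORCHECK);
    rc = pthread_mutex_init(&mtx, &attr);
    assert(rc == 0);
    pthread_mutexattr_destroy(&attr);

    rc = pthread_create(&b, NULL, thread_b, NULL);
    assert(rc == 0);

    /* Getting the lock while waiting == 1 is only possible if B gave it up
     * inside pthread_cond_wait. */
    while (!seen_waiting) {
        pthread_mutex_lock(&mtx);
        seen_waiting = waiting;
        if (seen_waiting)
            condition = 1;
        pthread_mutex_unlock(&mtx);
        if (!seen_waiting)
            sched_yield();
    }
    pthread_cond_signal(&cond);

    pthread_join(b, NULL);
    pthread_mutex_destroy(&mtx);
    return relock_result;
}
```

If run_wait_demo() returns EDEADLK, B held the mutex when wait returned, and A was able to take the mutex while B was blocked. The experiment does not prove the atomicity of the unlock-and-block step; that guarantee comes from the specification.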

#include <stdlib.h>
#include <stdio.h>
#include <string.h>
#include <pthread.h>
#include <unistd.h>

static pthread_mutex_t mtx = PTHREAD_MUTEX_INITIALIZER;
static pthread_cond_t cond = PTHREAD_COND_INITIALIZER;

/* Singly linked list shared by both threads; protected by mtx. */
struct node
{
    int n_number;
    struct node *n_next;
} *head = NULL;

/* Cancellation cleanup handler; the required signature returns void. */
void cleanup(void *arg)
{
    printf("p:%p\n", arg);
    printf("clean up\n");
}

/* Consumer: waits until the list is non-empty, then removes and prints the head node. */
void *threadfun(void *arg)
{
    struct node *p;

    pthread_cleanup_push(cleanup, NULL);
    while ( 1 )
    {
        pthread_mutex_lock(&mtx);
        while ( head == NULL )
        {
            /* Cancellation point: a thread cancelled here re-acquires mtx before cleanup runs. */
            pthread_cond_wait(&cond, &mtx);
        }
        p = head;
        head = head->n_next;
        pthread_mutex_unlock(&mtx);

        printf("thread node number is:%d\n", p->n_number);
        free(p);
    }
    pthread_cleanup_pop(1);
    return (void *)0;
}

int main(void)
{
    int ret, i;
    pthread_t tid;
    struct node *p;

    /* pthread functions return the error code instead of setting errno. */
    ret = pthread_create(&tid, NULL, threadfun, NULL);
    if ( ret != 0 )
    {
        fprintf(stderr, "pthread_create error: %s\n", strerror(ret));
        return -1;
    }

    /* Producer: push one node per second and notify the consumer. */
    for ( i = 0 ; i < 10 ; i++ )
    {
        p = (struct node *)malloc(sizeof(struct node));
        if ( p == NULL )
        {
            perror("malloc error");
            continue;
        }
        memset(p, 0x0, sizeof(struct node));
        p->n_number = i;
        pthread_mutex_lock(&mtx);
        p->n_next = head;
        head = p;
        pthread_mutex_unlock(&mtx);
        pthread_cond_signal(&cond);
        sleep(1);
    }

    ret = pthread_cancel(tid);
    if ( ret != 0 )
    {
        fprintf(stderr, "pthread_cancel error: %s\n", strerror(ret));
    }

    ret = pthread_join(tid, NULL);
    if ( ret != 0 )
    {
        fprintf(stderr, "pthread_join error: %s\n", strerror(ret));
        return -1;
    }

    return 0;
}

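I still had one doubt, though: if thread A goes through its loop several times, so that the condition is changed and signaled several times, while thread B is slow and has only run once, are the notifications lost? The signals themselves are not queued: a signal sent while no thread is waiting has no effect. What matters is the shared state. In the program above every node stays on the list until B removes it, and B checks head before it waits, so no node is lost. With a plain TRUE/FALSE condition, however, several changes collapse into one.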
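
To convince myself, I tried a small experiment along these lines. It is a sketch with made-up names (pending, done, waits, run_backlog_demo): the producer does all of its work and signaling before the consumer thread even exists, so every signal goes to nobody, and a counter plays the role of the list.

```c
#include <assert.h>
#include <pthread.h>

static pthread_mutex_t mtx  = PTHREAD_MUTEX_INITIALIZER;
static pthread_cond_t  cond = PTHREAD_COND_INITIALIZER;
static int pending = 0;    /* items produced but not yet consumed */
static int done = 0;       /* the producer has finished */
static int consumed = 0;   /* items the consumer handled */
static int waits = 0;      /* how often the consumer actually had to wait */

static void *consumer(void *arg)
{
    (void)arg;
    pthread_mutex_lock(&mtx);
    for (;;) {
        /* The predicate is checked before waiting, so a backlog is noticed
         * even though its signals were sent long ago. */
        while (pending == 0 && !done) {
            waits++;
            pthread_cond_wait(&cond, &mtx);
        }
        if (pending == 0)   /* finished and fully drained */
            break;
        pending--;
        consumed++;
    }
    pthread_mutex_unlock(&mtx);
    return NULL;
}

/* Produce `items` items with no consumer running, then start one.
 * Returns the number of items the consumer handled. */
int run_backlog_demo(int items)
{
    pthread_t tid;
    int i, rc;

    for (i = 0; i < items; i++) {
        pthread_mutex_lock(&mtx);
        pending++;
        pthread_mutex_unlock(&mtx);
        pthread_cond_signal(&cond);   /* nobody is waiting: no effect */
    }
    pthread_mutex_lock(&mtx);
    done = 1;
    pthread_mutex_unlock(&mtx);
    pthread_cond_signal(&cond);

    rc = pthread_create(&tid, NULL, consumer, NULL);
    assert(rc == 0);
    pthread_join(tid, NULL);
    return consumed;
}
```

With run_backlog_demo(5), the consumer handles all five items and never blocks in pthread_cond_wait, because the state it needs is already in pending by the time it first takes the lock.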

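There is a reason for the while ( condition == FALSE ) structure. According to the man page, pthread_cond_signal wakes at least one thread waiting on cond, and pthread_cond_broadcast wakes all threads waiting on cond. Suppose a pthread_cond_signal call wakes several threads. One of them gets to run first: it returns from wait holding the lock, changes the condition, and unlocks. When another woken thread later acquires the lock, it finds that the condition is FALSE again, so instead of acting on it, it goes back to waiting. This ensures that only one of the threads woken by a signal acts on the condition. The same loop also protects against spurious wakeups, where wait returns although nobody signaled.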
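
The effect is easier to see when every waiter is woken on purpose. The following sketch assumes three waiters and a single unit of work; tokens, shutting_down, got and run_broadcast_demo are names invented for the example. pthread_cond_broadcast stands in for a signal that wakes more than one thread.

```c
#include <assert.h>
#include <pthread.h>

#define NWAITERS 3

static pthread_mutex_t mtx  = PTHREAD_MUTEX_INITIALIZER;
static pthread_cond_t  cond = PTHREAD_COND_INITIALIZER;
static int tokens = 0;          /* the "condition": units of work available */
static int shutting_down = 0;   /* tells idle waiters to exit */
static int got[NWAITERS];       /* got[i] == 1 if waiter i took the token */

static void *waiter(void *arg)
{
    int idx = *(int *)arg;

    pthread_mutex_lock(&mtx);
    /* Re-check after every wakeup: another thread may already have taken the token. */
    while (tokens == 0 && !shutting_down)
        pthread_cond_wait(&cond, &mtx);
    if (tokens > 0) {
        tokens--;        /* change the condition back, as in the pattern above */
        got[idx] = 1;
    }
    pthread_mutex_unlock(&mtx);
    return NULL;
}

/* Returns how many waiters ended up consuming the single token. */
int run_broadcast_demo(void)
{
    pthread_t tid[NWAITERS];
    int ids[NWAITERS];
    int i, rc, winners = 0;

    for (i = 0; i < NWAITERS; i++) {
        ids[i] = i;
        rc = pthread_create(&tid[i], NULL, waiter, &ids[i]);
        assert(rc == 0);
    }

    pthread_mutex_lock(&mtx);
    tokens = 1;
    pthread_mutex_unlock(&mtx);
    pthread_cond_broadcast(&cond);   /* deliberately wakes every waiter */

    pthread_mutex_lock(&mtx);
    shutting_down = 1;
    pthread_mutex_unlock(&mtx);
    pthread_cond_broadcast(&cond);   /* lets the threads that lost the race exit */

    for (i = 0; i < NWAITERS; i++) {
        pthread_join(tid[i], NULL);
        winners += got[i];
    }
    return winners;
}
```

However the threads are scheduled, run_broadcast_demo() should return 1: the token is only ever taken under the lock, after the predicate has been checked again.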