[Multithreading] Scattered Notes 2

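I had some free time today and went on skimming the pthreads tutorial (https://computing.llnl.gov/tutorials/pthreads/; this time I skipped C++ and followed the tutorial's C implementation directly), looking at another thread synchronization method that pthread provides: condition variables.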

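Since we already have mutexes, why do we also need condition variables?

The tutorial puts it this way: “While mutexes implement synchronization by controlling thread access to data, condition variables allow threads to synchronize based upon the actual value of data.”

My own understanding is:

1) A mutex only controls whether a thread may access or modify some variable in memory, and achieves synchronization that way; the only decision it offers is: can or cannot.

2) A condition variable goes further than a bare mutex and is used together with one; the synchronization it offers is: if condition then do work.

I arrived at this understanding after implementing a demo, so here is that simple demo.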

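Consider this problem:

1) There is a global counter, int count, that every thread can access.

2) Two child threads increment count.

3) A third, monitoring thread is sensitive to the value of count: when count reaches a threshold, this thread is triggered to do its job.

The code: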

#include <pthread.h>
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>   /* sleep() */

#define NUM_THREADS 3
#define TCOUNT 10
#define COUNT_LIMIT 12

int count = 0;
int thread_id[3] = {0,1,2};

pthread_mutex_t count_mutex;
pthread_cond_t count_threshold_cv;

void *inc_count(void *t)
{
    int i;
    long my_id = (long)t;

    for (i=0; i<TCOUNT; i++){
        pthread_mutex_lock(&count_mutex);
        count++;
        /* Signal the watcher when the threshold is hit; the mutex is still held here. */
        if ( count==COUNT_LIMIT )
        {
            pthread_cond_signal(&count_threshold_cv);
            printf("inc_count(): thread %ld, count = %d Threshold reached\n", my_id, count );
        }
        printf("inc_count() : thread %ld, count = %d, unlocking mutex\n", my_id, count);
        pthread_mutex_unlock(&count_mutex);
        sleep(1);   /* give the other threads a chance to take the mutex */
    }
    pthread_exit((void *) 0);
}

void *watch_count(void *t)
{
    long my_id = (long)t;
    printf("Starting watch_count(): thread %ld\n", my_id);

    pthread_mutex_lock(&count_mutex);
    while (count<COUNT_LIMIT)
    {
        /* Releases count_mutex while blocked and re-locks it before returning. */
        pthread_cond_wait(&count_threshold_cv, &count_mutex);
        printf("watch_count(): thread %ld Condition signal received.\n", my_id);
        /* Runs after every wakeup, so a spurious wakeup would also trigger it. */
        count += 125;
        printf("watch_count(): thread %ld count now = %d.\n", my_id, count);
    }
    pthread_mutex_unlock(&count_mutex);
    pthread_exit((void *) 0);
}

int main(int argc, char *argv[])
{
    int i;
    long t1=1, t2=2, t3=3;
    pthread_t threads[NUM_THREADS];
    pthread_attr_t attr;

    pthread_mutex_init(&count_mutex, NULL);
    pthread_cond_init(&count_threshold_cv, NULL);

    /* Create the threads explicitly joinable so main can wait for them. */
    pthread_attr_init(&attr);
    pthread_attr_setdetachstate(&attr, PTHREAD_CREATE_JOINABLE);
    pthread_create(&threads[0], &attr, watch_count, (void *)t1);
    pthread_create(&threads[1], &attr, inc_count, (void *)t2);
    pthread_create(&threads[2], &attr, inc_count, (void *)t3);

    for (i=0; i<NUM_THREADS; ++i ){
        pthread_join(threads[i], NULL);
    }

    printf("Main(): Waited on %d threads. Done.\n", NUM_THREADS);

    pthread_attr_destroy(&attr);
    pthread_mutex_destroy(&count_mutex);
    pthread_cond_destroy(&count_threshold_cv);
    pthread_exit((void *)0);
}

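First the output, then an explanation of how this code solves the problem above:

There are two kinds of child thread:

1) void *inc_count(void *t) increments the global variable count.

2) void *watch_count(void *t) monitors the value of count and triggers the task.

Below is the order in which I worked through the logic of each thread function:

1) watch_count monitors count: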

 1 void *watch_count(void *t)
 2 {
 3     long my_id = (long)t;
 4     printf("Starting watch_count(): thread %ld\n", my_id);
 5 
 6     pthread_mutex_lock(&count_mutex);
 7     while (count<COUNT_LIMIT)
 8     {
 9         pthread_cond_wait(&count_threshold_cv, &count_mutex);
10         printf("watch_count(): thread %ld Condition signal received.\n", my_id);
11         count += 125;
12         printf("watch_count(): thread %ld count now = %d.\n", my_id, count);
13     }
14     pthread_mutex_unlock(&count_mutex);
15     pthread_exit((void *) 0);
16 }

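a. Why does line 6 lock count_mutex first?

  Because while watch_count is checking the value of count it must have exclusive access to count; if another thread changed count in the middle of the check, the logic could easily go wrong (earlier posts in this series covered this).

b. What does “pthread_cond_wait(&count_threshold_cv, &count_mutex);” do?

  First the parameters: count_threshold_cv, of type pthread_cond_t, is the condition variable this post is about; count_mutex is the lock that guards the global variable count.

  The call does two things:

    b1. On entry, it releases count_mutex (the lock taken on line 6) and blocks the current thread, as one atomic step. While the thread is blocked, other threads can lock count_mutex and modify count.

    b2. When another thread signals count_threshold_cv, the blocked thread is woken:

       b21. Before pthread_cond_wait returns, it locks count_mutex again; if another thread still holds the mutex, the woken thread keeps waiting until it can take it.

       b22. Once the call returns, the thread holds count_mutex and carries on; this is why line 14 has to unlock count_mutex before the thread finishes.

c. After reading this code, two questions came up immediately:

    c1. Who decides whether count_threshold_cv starts out signaled or not?

      “pthread_cond_init(&count_threshold_cv, NULL);” in main() initializes it. A condition variable does not store a signaled/unsignaled state; it starts out with no waiters and no pending signal, which is why the while loop on line 7 checks count itself.

    c2. Who signals count_threshold_cv?

      For that we need to look at the other thread function, inc_count.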
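The behavior described in item b is easier to see in a stripped-down form. The following is a minimal sketch, not part of the demo: the names cw_mutex, cw_cv, cw_ready, cw_payload and run_wait_demo are made up for illustration. It shows the usual shape of a wait: lock, re-check the predicate in a while loop, and let pthread_cond_wait release and re-take the mutex.

```c
#include <pthread.h>
#include <stdbool.h>

/* Hypothetical names for illustration; none of them come from the demo. */
static pthread_mutex_t cw_mutex = PTHREAD_MUTEX_INITIALIZER;
static pthread_cond_t  cw_cv    = PTHREAD_COND_INITIALIZER;
static bool cw_ready   = false;   /* the predicate, protected by cw_mutex */
static int  cw_payload = 0;       /* data published together with it */

static void *cw_waiter(void *arg)
{
    int *out = arg;

    pthread_mutex_lock(&cw_mutex);
    /* Re-check the predicate after every wakeup: the condition variable
       stores no state, and a wakeup may be spurious. */
    while (!cw_ready)
        pthread_cond_wait(&cw_cv, &cw_mutex);  /* unlock + block, re-lock on return */
    *out = cw_payload;                         /* cw_mutex is held again here */
    pthread_mutex_unlock(&cw_mutex);
    return NULL;
}

static void *cw_setter(void *arg)
{
    (void)arg;
    pthread_mutex_lock(&cw_mutex);
    cw_payload = 42;
    cw_ready = true;                 /* change the predicate under the mutex */
    pthread_cond_signal(&cw_cv);     /* wake one waiter, if there is one */
    pthread_mutex_unlock(&cw_mutex);
    return NULL;
}

/* Runs one waiter and one setter; returns the value the waiter observed. */
int run_wait_demo(void)
{
    pthread_t w, s;
    int seen = -1;

    pthread_create(&w, NULL, cw_waiter, &seen);
    pthread_create(&s, NULL, cw_setter, NULL);
    pthread_join(w, NULL);
    pthread_join(s, NULL);
    return seen;
}
```

Called from main and built with -pthread, run_wait_demo() should return 42 whichever thread gets the mutex first, because the waiter looks at cw_ready before deciding to block.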

2) The inc_count threads increment count and signal the condition variable count_threshold_cv

 1 void *inc_count(void *t)
 2 {
 3     int i;
 4     long my_id = (long)t;
 5 
 6     for (i=0; i<TCOUNT; i++){
 7         pthread_mutex_lock(&count_mutex);
 8         count++;
 9         if ( count==COUNT_LIMIT )
10         {
11             pthread_cond_signal(&count_threshold_cv);
12             printf("inc_count(): thread %ld, count = %d Threshold reached\n", my_id, count );
13         }
14         printf("inc_count() : thread %ld, count = %d, unlocking mutex\n", my_id, count);
15         pthread_mutex_unlock(&count_mutex);
16         sleep(1);
17     }
18     pthread_exit((void *) 0);
19 }

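This function is fairly straightforward:

1) Each loop iteration adds 1 to count (and of course count_mutex must be locked while touching count; that is line 7).

2) Then it checks the value of count:

    a. If count is not at the threshold: do some work and unlock count_mutex, then sleep for a moment (to give other threads a chance to take count_mutex).

    b. If count equals the threshold:

      b1. Signal the condition variable count_threshold_cv.

      b2. This wakes a thread that is blocked waiting on count_threshold_cv (in this demo only one thread is waiting; what if several threads were waiting on count_threshold_cv? Worth thinking about later).

  There is one detail in b2: once count_threshold_cv has been signaled, does watch_count run right away, or is it only “really” woken after the signaling thread has finished its critical section? Let the output speak:

  Clearly, even after count_threshold_cv has been signaled, watch_count does not run right away; it runs only after inc_count has unlocked count_mutex. When pthread_cond_wait returns in watch_count, that thread holds count_mutex again.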
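The same observation can be checked without relying on sleep() and console output. Below is a minimal sketch under my own naming (ord_mutex, ord_cv, ord_parked_cv, run_order_demo and the EV_* constants are all hypothetical): the signaling thread records an event after pthread_cond_signal but before unlocking, and the waiter records one as soon as pthread_cond_wait returns. A second condition variable is used only to make sure the waiter is already blocked before the signal is sent.

```c
#include <pthread.h>
#include <stdbool.h>

enum { EV_SIGNALLER_STILL_LOCKED = 1, EV_WAITER_RESUMED = 2 };

/* Hypothetical names for illustration only. */
static pthread_mutex_t ord_mutex     = PTHREAD_MUTEX_INITIALIZER;
static pthread_cond_t  ord_cv        = PTHREAD_COND_INITIALIZER;
static pthread_cond_t  ord_parked_cv = PTHREAD_COND_INITIALIZER;
static bool ord_parked = false;   /* waiter is about to block on ord_cv */
static bool ord_go     = false;   /* predicate the waiter is waiting for */
static int  ord_events[2];        /* event log, written under ord_mutex */
static int  ord_n = 0;

static void *ord_waiter(void *arg)
{
    (void)arg;
    pthread_mutex_lock(&ord_mutex);
    ord_parked = true;
    pthread_cond_signal(&ord_parked_cv);
    while (!ord_go)
        pthread_cond_wait(&ord_cv, &ord_mutex);
    ord_events[ord_n++] = EV_WAITER_RESUMED;   /* mutex re-acquired here */
    pthread_mutex_unlock(&ord_mutex);
    return NULL;
}

/* Copies the two recorded events into out[]; returns how many were logged. */
int run_order_demo(int out[2])
{
    pthread_t t;

    ord_parked = false; ord_go = false; ord_n = 0;
    pthread_create(&t, NULL, ord_waiter, NULL);

    pthread_mutex_lock(&ord_mutex);
    while (!ord_parked)
        pthread_cond_wait(&ord_parked_cv, &ord_mutex);
    /* We hold ord_mutex, so the waiter must be blocked inside
       pthread_cond_wait(&ord_cv, ...): that is the only place it lets go. */
    ord_go = true;
    pthread_cond_signal(&ord_cv);
    ord_events[ord_n++] = EV_SIGNALLER_STILL_LOCKED;   /* waiter cannot run yet */
    pthread_mutex_unlock(&ord_mutex);

    pthread_join(t, NULL);
    out[0] = ord_events[0];
    out[1] = ord_events[1];
    return ord_n;
}
```

The signaling side always logs first, because the woken waiter cannot return from pthread_cond_wait until it gets ord_mutex back.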

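The demo above roughly explains how condition variables are used; a few details deserve a closer look:

Detail 1

  There is one point I am still not completely sure about: as in the figure above, once thread3 has signaled count_threshold_cv and unlocked count_mutex, can thread1 (the watch_count thread) and thread2 (the other inc_count thread) end up competing for count_mutex at the same time? Judging from the output, thread1 got the mutex first in this run. That is not guaranteed, though: POSIX specifies that a woken thread contends for the mutex as if it had called pthread_mutex_lock, so which thread wins depends on the scheduling policy and the implementation.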

================================================

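Update 2015.08.20

On the question in Detail 1, I googled and found a blog post (http://casatwy.com/pthreadde-ge-chong-tong-bu-ji-zhi.html) that looks at a similar question from the angle of where pthread_cond_signal is called.

Here I borrow a figure from that post to illustrate Detail 1:

The figure above analyzes how the position of the signal affects what happens afterwards.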
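As a rough illustration of what “position of the signal” can mean, here is a minimal sketch of two placements: signaling while the mutex is still held, and signaling right after unlocking. This is my own reading of the question, not a reproduction of the figure, and all names (pos_mutex, pos_cv, pos_done, run_position_demo) are hypothetical. POSIX allows pthread_cond_signal to be called with or without the mutex held; what matters for correctness in this sketch is that pos_done is changed under the mutex and that the waiter re-checks it.

```c
#include <pthread.h>
#include <stdbool.h>

/* Hypothetical names for illustration only. */
static pthread_mutex_t pos_mutex = PTHREAD_MUTEX_INITIALIZER;
static pthread_cond_t  pos_cv    = PTHREAD_COND_INITIALIZER;
static bool pos_done = false;     /* predicate, protected by pos_mutex */

static void *pos_waiter(void *arg)
{
    bool *saw_done = arg;

    pthread_mutex_lock(&pos_mutex);
    while (!pos_done)
        pthread_cond_wait(&pos_cv, &pos_mutex);
    *saw_done = pos_done;
    pthread_mutex_unlock(&pos_mutex);
    return NULL;
}

/* Returns true once the waiter has observed pos_done == true. */
bool run_position_demo(bool signal_after_unlock)
{
    pthread_t t;
    bool saw_done = false;

    pos_done = false;                       /* no other thread exists yet */
    pthread_create(&t, NULL, pos_waiter, &saw_done);

    pthread_mutex_lock(&pos_mutex);
    pos_done = true;                        /* always changed under the mutex */
    if (!signal_after_unlock) {
        /* Placement A: signal inside the critical section. A woken waiter
           still has to wait for the unlock on the next line. */
        pthread_cond_signal(&pos_cv);
        pthread_mutex_unlock(&pos_mutex);
    } else {
        /* Placement B: unlock first, then signal. Another thread may take
           the mutex in the gap between the two calls. */
        pthread_mutex_unlock(&pos_mutex);
        pthread_cond_signal(&pos_cv);
    }

    pthread_join(t, NULL);
    return saw_done;
}
```

Both placements let the waiter finish here; how they differ in scheduling behavior depends on the platform, which this sketch does not measure.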

================================================

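Detail 2

  What happens if the thread that calls pthread_cond_wait is not constrained by count_mutex? I tweaked the code a little: I added a global pthread_mutex_t test, initialized it in main, and changed inc_count and watch_count as follows (the modified parts are in red):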

void *inc_count(void *t)
{
    int i;
    long my_id = (long)t;

    for (i=0; i<TCOUNT; i++){
        pthread_mutex_lock(&count_mutex);
        count++;
        if ( count==COUNT_LIMIT )
        {
            pthread_cond_signal(&count_threshold_cv);
            sleep(1);   /* added: keep holding count_mutex for a while after signaling */
            printf("inc_count(): thread %ld, count = %d Threshold reached\n", my_id, count );
        }
        printf("inc_count() : thread %ld, count = %d, unlocking mutex\n", my_id, count);
        pthread_mutex_unlock(&count_mutex);
        sleep(1);
    }
    pthread_exit((void *) 0);
}

void *watch_count(void *t)
{
    long my_id = (long)t;
    printf("Starting watch_count(): thread %ld\n", my_id);

    pthread_mutex_lock(&test);   /* changed: lock test instead of count_mutex */
    int i=0;                     /* changed: loop once instead of checking count */
    while (i<1)
    {
        pthread_cond_wait(&count_threshold_cv, &test);   /* changed: wait with test */
        printf("watch_count(): thread %ld Condition signal received.\n", my_id);
        count += 125;            /* no longer protected by count_mutex */
        printf("watch_count(): thread %ld count now = %d.\n", my_id, count);
        i++;
    }
    pthread_mutex_unlock(&test);
    pthread_exit((void *) 0);
}

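The result of running it:

From this result we can see that right after pthread_cond_signal, with no count_mutex lock holding it back, the watch_count thread runs immediately (the sleep(1) added to inc_count is there on purpose, to see whether watch_count proceeds at once without being blocked). This confirms from the other side that the thread calling pthread_cond_signal merely sends a wakeup to the watch_count thread; if watch_count is constrained by count_mutex, it still has to wait until inc_count unlocks count_mutex before it really resumes. Note that this variant is only an experiment: watch_count now modifies count without holding count_mutex, which is a data race.

The following two links explain pthread_cond_wait() and pthread_cond_signal() in some detail:

https://computing.llnl.gov/tutorials/pthreads/man/pthread_cond_wait.txt

https://computing.llnl.gov/tutorials/pthreads/man/pthread_cond_signal.txt

In real work, though, you still have to test on the target platform to see exactly how its pthread implementation behaves.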

================================================

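One more note, after reading this very good blog post (http://casatwy.com/pthreadde-ge-chong-tong-bu-ji-zhi.html):

When using condition variables, there is one situation to guard against:

If pthread_cond_signal( cv ) runs before pthread_cond_wait( cv ) has started blocking, the signal reaches nobody and is simply lost. That pthread_cond_wait( cv ) then waits forever.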
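A common way to guard against this is to pair the condition variable with a flag that records that the event has happened, and to check the flag before waiting. The sketch below is a minimal illustration with made-up names (lw_mutex, lw_cv, lw_event, run_lost_wakeup_demo): the signal is deliberately sent before the waiter thread even exists, and the waiter still gets through because it looks at the flag first. In the first demo, count and the while (count<COUNT_LIMIT) test play this role.

```c
#include <pthread.h>
#include <stdbool.h>

/* Hypothetical names for illustration only. */
static pthread_mutex_t lw_mutex = PTHREAD_MUTEX_INITIALIZER;
static pthread_cond_t  lw_cv    = PTHREAD_COND_INITIALIZER;
static bool lw_event = false;     /* remembers that the event happened */

static void *lw_waiter(void *arg)
{
    int *times_blocked = arg;

    pthread_mutex_lock(&lw_mutex);
    while (!lw_event) {           /* only block if the event has not happened */
        (*times_blocked)++;
        pthread_cond_wait(&lw_cv, &lw_mutex);
    }
    pthread_mutex_unlock(&lw_mutex);
    return NULL;
}

/* Signals first, starts the waiter afterwards; returns how many times
   the waiter had to call pthread_cond_wait. */
int run_lost_wakeup_demo(void)
{
    pthread_t t;
    int times_blocked = 0;

    /* The signal is sent while nobody is waiting, so the signal itself is lost... */
    pthread_mutex_lock(&lw_mutex);
    lw_event = true;
    pthread_cond_signal(&lw_cv);
    pthread_mutex_unlock(&lw_mutex);

    /* ...but the flag survives, so a waiter that starts late does not hang. */
    pthread_create(&t, NULL, lw_waiter, &times_blocked);
    pthread_join(t, NULL);
    return times_blocked;
}
```

Here run_lost_wakeup_demo() returns 0: the late waiter never calls pthread_cond_wait at all. Without the lw_event check it would block forever, which is exactly the case described above.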

================================================

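Update 2015.08.26

When I wrote these notes on condition variables I skipped a key question: why use a condition variable at all?

I drew on this post (http://casatwy.com/pthreadde-ge-chong-tong-bu-ji-zhi.html).

To answer the question, I see two key points:

(1) Could we do without a condition variable?

  What about a volatile variable V? The other threads poll V and go on once its value meets the requirement?

  That can certainly work and looks simpler, but the drawback is just as clear: polling burns a lot of CPU, and with many waiting threads the cost is hard to ignore. (In C, volatile by itself does not make access from several threads atomic or ordered, so a real implementation would read V under a mutex or use an atomic type.)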
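To make the cost of polling concrete, here is a minimal sketch with made-up names (poll_mutex, poll_flag, run_polling_demo). It is an assumption-laden variant of the idea above: the flag is read under a mutex rather than declared volatile, so that the cross-thread access is well defined. The function counts how many times the flag had to be checked before it became true.

```c
#include <pthread.h>
#include <stdbool.h>

/* Hypothetical names for illustration only. */
static pthread_mutex_t poll_mutex = PTHREAD_MUTEX_INITIALIZER;
static bool poll_flag = false;          /* plays the role of V */

static void *poll_setter(void *arg)
{
    (void)arg;
    pthread_mutex_lock(&poll_mutex);
    poll_flag = true;                   /* the condition becomes true */
    pthread_mutex_unlock(&poll_mutex);
    return NULL;
}

/* Spins until the flag is set; returns how many times it had to look. */
long run_polling_demo(void)
{
    pthread_t t;
    long checks = 0;
    bool seen = false;

    poll_flag = false;                  /* no other thread exists yet */
    pthread_create(&t, NULL, poll_setter, NULL);

    while (!seen) {
        pthread_mutex_lock(&poll_mutex);
        seen = poll_flag;               /* read V under the mutex */
        pthread_mutex_unlock(&poll_mutex);
        checks++;                       /* every pass costs CPU time */
    }

    pthread_join(t, NULL);
    return checks;
}
```

The returned count varies from run to run; the longer the setter takes, the more passes the polling thread burns, and every additional polling thread multiplies that cost.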

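(2) What are the benefits of a condition variable?

  When several threads wait for some condition, a condition variable is a very good fit.

  Each thread checks the condition while holding the mutex: if the condition holds, it carries on; if not, it releases the mutex and sleeps until it is signaled, checking again only when woken.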
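For the case of several threads waiting on one condition, pthread_cond_broadcast wakes all of them, whereas pthread_cond_signal is only guaranteed to wake at least one. The demo in this post does not use broadcast; the following is a minimal sketch with hypothetical names (bc_mutex, bc_cv, bc_open, bc_passed, BC_WAITERS, run_broadcast_demo) in which four threads wait for the same flag.

```c
#include <pthread.h>
#include <stdbool.h>

#define BC_WAITERS 4   /* hypothetical number of waiting threads */

/* Hypothetical names for illustration only. */
static pthread_mutex_t bc_mutex = PTHREAD_MUTEX_INITIALIZER;
static pthread_cond_t  bc_cv    = PTHREAD_COND_INITIALIZER;
static bool bc_open   = false;    /* the condition every waiter is waiting for */
static int  bc_passed = 0;        /* how many waiters got through */

static void *bc_waiter(void *arg)
{
    (void)arg;
    pthread_mutex_lock(&bc_mutex);
    while (!bc_open)                       /* sleep instead of polling */
        pthread_cond_wait(&bc_cv, &bc_mutex);
    bc_passed++;                           /* one waiter at a time holds the mutex */
    pthread_mutex_unlock(&bc_mutex);
    return NULL;
}

/* Starts the waiters, opens the gate once, and returns how many passed. */
int run_broadcast_demo(void)
{
    pthread_t t[BC_WAITERS];
    int i;

    bc_open = false;
    bc_passed = 0;
    for (i = 0; i < BC_WAITERS; i++)
        pthread_create(&t[i], NULL, bc_waiter, NULL);

    pthread_mutex_lock(&bc_mutex);
    bc_open = true;
    pthread_cond_broadcast(&bc_cv);        /* wake every thread blocked on bc_cv */
    pthread_mutex_unlock(&bc_mutex);

    for (i = 0; i < BC_WAITERS; i++)
        pthread_join(t[i], NULL);
    return bc_passed;
}
```

run_broadcast_demo() returns 4: waiters that were already blocked are woken by the broadcast, and any that start late see bc_open set and never block. The woken threads still pass through the critical section one by one, since each has to re-acquire bc_mutex.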

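I was not very familiar with volatile, so I searched around; one blog post and the Baidu Baike entry are both good:

http://blog.csdn.net/tigerjibo/article/details/7427366

http://baike.baidu.com/link?url=QlZodHln5RcwU4GaTtklUhNDykw12h-SHTyqwzJfV6DLmtdWnzC83CAGRpt6-Nk7GBZI6Z0xozECO-KfG9YX8_

What volatile does:

  (1) It stops the compiler from optimizing the code incorrectly by considering only what the current thread does with the variable (for example, “cleverly” assuming that the current thread never modifies the variable and merging some code).

  (2) It tells the compiler to fetch the variable's value from memory every time instead of reusing a copy held in a register.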

Original post: https://www.cnblogs.com/xbf9xbf/p/4743701.html