[APUE] Chapter 17: Advanced IPC, sign extension, and struct memory alignment

17.1 Introduction

This chapter covers UNIX domain sockets as an interprocess communication mechanism and works through several concrete examples.

17.2 UNIX Domain Sockets

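A UNIX domain socket is a special socket type used for IPC between processes on the same host. Its main advantage is efficiency: it only copies data and skips the protocol work a network socket has to do (no network headers, checksums, sequence numbers, or acknowledgements).

int socketpair(int domain, int type, int protocol, int sockfd[2]) creates a pair of connected UNIX domain sockets. Unlike the descriptors returned by pipe, both descriptors created here are full-duplex.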
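The full-duplex property can be sketched as follows, assuming a POSIX system. The function names send_one_way and full_duplex_demo are made up for this illustration and do not come from the book; the sketch writes and reads in both directions over a single socketpair:

```c
#include <string.h>
#include <sys/socket.h>
#include <unistd.h>

/*
 * Write msg into sv[from], read it back from the other end into out.
 * Returns the number of bytes received, or -1 on error.
 */
static ssize_t send_one_way(int sv[2], int from, const char *msg,
                            char *out, size_t outsz)
{
    size_t len = strlen(msg);
    ssize_t n;

    if (write(sv[from], msg, len) != (ssize_t)len)
        return -1;
    n = read(sv[1 - from], out, outsz - 1);
    if (n >= 0)
        out[n] = '\0';
    return n;
}

/* Returns 0 if data flowed in both directions over one socketpair. */
int full_duplex_demo(void)
{
    int sv[2];
    char buf[32];
    int ok;

    if (socketpair(AF_UNIX, SOCK_STREAM, 0, sv) < 0)
        return -1;
    ok = send_one_way(sv, 0, "ping", buf, sizeof(buf)) == 4 &&
         strcmp(buf, "ping") == 0 &&
         send_one_way(sv, 1, "pong", buf, sizeof(buf)) == 4 &&
         strcmp(buf, "pong") == 0;
    close(sv[0]);
    close(sv[1]);
    return ok ? 0 : -1;
}
```

Calling full_duplex_demo() returns 0 when both directions work. On systems where pipes are half-duplex, the second direction would need a second pipe.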

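The following example combines poll, message queues, and multiple threads.

Why this example? Because poll cannot be used directly to manage multiple message queues.

An XSI message queue is named in two ways: 1. a system-wide key; 2. the identifier (queue ID) that a process gets back from msgget.

poll, however, can only watch file descriptors, so UNIX domain sockets serve as the bridge between the two.

The example has two parts: a receiver and a sender.

The receiver sets up several message queues and starts one thread per queue; each queue thread is connected to the main thread through a pair of UNIX domain sockets. The code is as follows: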

 1 #include "../apue.3e/include/apue.h"
 2 #include <poll.h>
 3 #include <pthread.h>
 4 #include <sys/msg.h>
 5 #include <sys/socket.h>
 6 
 7 #define NQ 3
 8 #define MAXMSZ 512
 9 #define KEY 0x123
10 
11 struct threadinfo{
12     int qid;
13     int fd;
14 };
15 
16 struct mymesg{
17     long mtype;
18     char mtext[MAXMSZ];
19 };
20 
21 void * helper(void *arg)
22 {
23     int n;
24     struct mymesg m;
25     struct threadinfo *tip = arg;
26 
27     for(; ;)
28     {
29         memset(&m, 0, sizeof(m));
30         if ((n = msgrcv(tip->qid, &m, MAXMSZ, 0, MSG_NOERROR))<0) {
31             err_sys("msgrcv error");
32         }
33         /* forward the message from this queue to the thread's own file descriptor */
34         if (write(tip->fd, m.mtext, n)<0) {
35             err_sys("write error");
36         }
37     }
38 }
39 
40 int main(int argc, char *argv[])
41 {
42     int i, n, err;
43     int fd[2];
44     int qid[NQ]; /* message queue identifiers returned by msgget */
45     struct pollfd pfd[NQ];
46     struct threadinfo ti[NQ];
47     pthread_t tid[NQ];
48     char buf[MAXMSZ+1]; /* +1: room for the terminating null byte */
49 
50     /* set up one handler thread per message queue */
51     for (i=0; i<NQ; i++) {
52         /* msgget returns the queue identifier, used much like a file descriptor */
53         if ((qid[i] = msgget((KEY+i), IPC_CREAT|0666))<0) {
54             err_sys("msgget error");
55         }
56         printf("queue ID %d is %d\n", i, qid[i]);
57         /* create a pair of UNIX domain sockets */
58         if (socketpair(AF_UNIX, SOCK_DGRAM, 0, fd)<0) {
59             err_sys("socketpair error");
60         }
61         pfd[i].fd = fd[0]; /* the main thread holds the fd[0] end */
62         pfd[i].events = POLLIN; /* wake up when there is data to read */
63         /* qid[i] refers to the same message queue anywhere in this process */
64         ti[i].qid = qid[i]; /* tell the thread which queue it serves */
65         ti[i].fd = fd[1]; /* the queue's thread holds the fd[1] end */
66         /* create the handler thread and pass it the matching threadinfo */
67         if ((err = pthread_create(&tid[i], NULL, helper, &ti[i]))!=0) {
68             err_exit(err, "pthread_create error");
69         }
70     }
71 
72     for (;;) {
73         /* block in poll until at least one queue has data */
74         if (poll(pfd, NQ, -1)<0) {
75             err_sys("poll error");
76         }
77         /* getting here means some queue is ready; find every ready one */
78         for (i=0; i<NQ; i++) {
79             if (pfd[i].revents & POLLIN) { /* pick out the fds that satisfy POLLIN */
80                 if ((n=read(pfd[i].fd, buf, sizeof(buf)-1))<0) {
81                     err_sys("read error");
82                 }
83                 buf[n] = 0; /* the terminating '\0' is required because buf is printed with %s */
84                 printf("queue id %d, message %s\n",qid[i],buf);
85             }
86         }
87     }
88     exit(0);
89 }

The sender takes the external key of the message queue and the text to write as command-line arguments, then puts the message on the queue. The code is as follows:

#include "../apue.3e/include/apue.h"
#include <sys/msg.h>

#define MAXMSZ 512

struct mymesg{
    long mtype;
    char mtext[MAXMSZ];
};

int main(int argc, char *argv[])
{
    key_t key;
    long qid;
    size_t nbytes;
    struct mymesg m;
    if (argc != 3) {
        fprintf(stderr, "usage: sendmsg KEY message\n");
        exit(1);
    }
    key = strtol(argv[1], NULL, 0);
    if ((qid = msgget(key,0))<0) {
        err_sys("can't open queue key %s", argv[1]);
    }
    memset(&m, 0, sizeof(m));
    strncpy(m.mtext, argv[2], MAXMSZ-1);
    nbytes = strlen(m.mtext);
    m.mtype = 1;
    if (msgsnd(qid, &m, nbytes, 0)<0) {
        err_sys("can't send message");
    }
    exit(0);
}

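Running the two programs gives the following result (screenshot not included).

Analysis:

(1) The main benefit of UNIX domain sockets in this code is that several message queues can be managed from a single poll loop.

(2) That convenience costs the receiver two extra operations per message: the write at line 34, which copies the message into the UNIX domain socket, and the read at line 80, which copies it back out. If that overhead is acceptable, UNIX domain sockets are a reasonable choice here.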

17.2.1 Naming UNIX Domain Sockets

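Sockets created with socketpair, as above, have no name: the only handles are the returned fds, so they can only be used between related processes.

For unrelated processes to communicate over a UNIX domain socket, the socket must be something an outside process can find.

struct sockaddr_un {
    sa_family_t sun_family;    /* AF_UNIX */
    char        sun_path[108]; /* pathname (the array size varies by platform) */
};

This structure holds the address of a UNIX domain socket that outside processes can find; it plays a role similar to "IP + port".

Three steps are needed:

(1) fd = socket(AF_UNIX, SOCK_STREAM, 0) // create the UNIX domain socket

(2) un.sun_family = AF_UNIX; strcpy(un.sun_path, pathname) // fill in the address

(3) bind(fd, (struct sockaddr *)&un, size) // bind the address to the fd

Note that pathname must be a unique file name. In everything that follows, reading sockaddr_un as the counterpart of IP + port makes the material easy to follow.

With the file name in sun_path, the UNIX domain socket has a unique identity, and other processes can find it through that name.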

#include "../apue.3e/include/apue.h"
#include <stddef.h>     /* offsetof */
#include <string.h>
#include <sys/socket.h>
#include <sys/un.h>

int main(int argc, char *argv[])
{
    int fd, size;
    struct sockaddr_un un;

    memset(&un, 0, sizeof(un));
    un.sun_family = AF_UNIX;
    strcpy(un.sun_path, "foo.socket");

    if ((fd = socket(AF_UNIX, SOCK_STREAM, 0))<0) {
        err_sys("socket failed");
    }
    /* address length = offset of sun_path + length of the name (no null byte) */
    size = offsetof(struct sockaddr_un, sun_path) + strlen(un.sun_path);
    if (bind(fd, (struct sockaddr *)&un, size)<0) {
        err_sys("bind failed");
    }
    printf("UNIX domain socket bound\n");
    exit(0);
}

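Here "foo.socket" does not need to exist beforehand (bind creates it); it only needs to be a unique name.

The results are as follows (screenshots not included).

Before the program runs, there is no foo.socket file in the current directory.

After running the program above:

(1) foo.socket has been created automatically, and its file type is socket (the s in srwxrwxr-x).

(2) Once foo.socket exists, no other UNIX domain socket can be bound to that name; a second bind fails.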
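Point (2) can be checked with a short sketch. The helper names bind_unix and second_bind_errno, and the path used in the usage note, are hypothetical; the sketch assumes a system (such as Linux) where binding to an existing socket file fails with EADDRINUSE:

```c
#include <errno.h>
#include <stddef.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/un.h>
#include <unistd.h>

/* Create a UNIX domain socket and bind it to path; returns the fd or -1. */
static int bind_unix(const char *path)
{
    struct sockaddr_un un;
    socklen_t len;
    int fd, err;

    if (strlen(path) >= sizeof(un.sun_path)) {
        errno = ENAMETOOLONG;
        return -1;
    }
    if ((fd = socket(AF_UNIX, SOCK_STREAM, 0)) < 0)
        return -1;
    memset(&un, 0, sizeof(un));
    un.sun_family = AF_UNIX;
    strcpy(un.sun_path, path);
    len = offsetof(struct sockaddr_un, sun_path) + strlen(path);
    if (bind(fd, (struct sockaddr *)&un, len) < 0) {
        err = errno;            /* keep errno across close() */
        close(fd);
        errno = err;
        return -1;
    }
    return fd;
}

/* Bind path twice; returns the errno of the second bind (0 if it succeeded). */
int second_bind_errno(const char *path)
{
    int first, second, err;

    unlink(path);               /* start from a clean slate */
    if ((first = bind_unix(path)) < 0)
        return -1;
    second = bind_unix(path);
    err = (second < 0) ? errno : 0;
    if (second >= 0)
        close(second);
    close(first);
    unlink(path);               /* remove the socket file again */
    return err;
}
```

For example, second_bind_errno("/tmp/demo_bind.socket") is expected to return EADDRINUSE. This is also why serv_listen and cli_conn later call unlink on the name before bind.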

17.3 Unique Connections

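With the naming technique from 17.2.1, the usual network-socket operations (listen, accept, connect, and so on) can be applied to UNIX domain sockets, which gives IPC between processes on the same host.

The book illustrates this with a diagram (not included here).

APUE provides UNIX domain socket versions of listen, accept, and connect:

int serv_listen(const char *name);

int serv_accept(int listenfd, uid_t *uidptr);

int cli_conn(const char *name);

The implementations are as follows:

serv_listen (returns a UNIX domain socket dedicated to listening for client connection requests)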

#include "../apue.3e/include/apue.h"
#include <sys/socket.h>
#include <sys/un.h>
#include <errno.h>

#define QLEN 10

/* given a well-known name, return an fd that listens on it */
int serv_listen(const char *name)
{
    int fd;
    int len;
    int err;
    int rval;
    struct sockaddr_un un;

    /* the name must fit in sun_path */
    if (strlen(name) >= sizeof(un.sun_path)) {
        errno = ENAMETOOLONG;
        return -1;
    }
    /* the socket is created as SOCK_STREAM */
    if ((fd = socket(AF_UNIX, SOCK_STREAM, 0))<0) {
        return -2;
    }
    /* in case the name already exists: remove it so that bind can succeed */
    unlink(name);
    /* initialize the socket address structure (the whole struct, not just sun_path) */
    memset(&un, 0, sizeof(un));
    un.sun_family = AF_UNIX;
    strcpy(un.sun_path, name);
    len = offsetof(struct sockaddr_un, sun_path) + strlen(name);
    /* bind the name to the descriptor */
    if (bind(fd, (struct sockaddr *)&un, len)<0) {
        rval = -3;
        goto errout;
    }
    /* start listening and set the length of the pending-connection queue */
    if (listen(fd, QLEN)<0) {
        rval = -4;
        goto errout;
    }
    return fd;
errout:
    err = errno;
    close(fd);
    errno = err;
    return rval;
}

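serv_accept (one thing I did not fully understand here is why the client's name is subject to a 30-second limit; my guess is that it guards against an old, leftover socket file being passed off as the name of a freshly started client)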

#include "../apue.3e/include/apue.h"
#include <sys/socket.h>
#include <sys/stat.h>
#include <sys/un.h>
#include <time.h>
#include <errno.h>

#define STALE 30 /* client's name can't be older than this (sec) */

int serv_accept(int listenfd, uid_t *uidptr)
{
    int clifd;
    int err;
    int rval;
    socklen_t len;
    time_t staletime;
    struct sockaddr_un un;
    struct stat statbuf;
    char *name; /* holds the address (pathname) of the connecting client */

    /* sun_path need not be null terminated, so allocate room for the
     * longest possible name plus a terminating null byte */
    if ((name = malloc(sizeof(un.sun_path) + 1))==NULL) {
        return -1;
    }
    len = sizeof(un);
    /* block here until a client sends a connection request */
    if ((clifd = accept(listenfd, (struct sockaddr *)&un, &len))<0) {
        free(name);
        return -2;
    }
    /* reduce len to the actual length of the pathname and copy it into name */
    len -= offsetof(struct sockaddr_un, sun_path);
    memcpy(name, un.sun_path, len);
    name[len] = 0; /* add the terminating null byte */
    if (stat(name, &statbuf)<0) { /* get the status of the client's socket file */
        rval = -3;
        goto errout;
    }

    /* 1. verify that the client's file really is a socket */
#ifdef S_ISSOCK
    if (S_ISSOCK(statbuf.st_mode)==0) {
        rval = -4;
        goto errout;
    }
#endif
    /* 2. verify the permissions of the client's file */
    /* G = group, O = other, U = user (owner) */
    /* the file must have exactly user-read, user-write and user-execute */
    /* note: != binds tighter than & (and than ||), hence the parentheses */
    if ((statbuf.st_mode & (S_IRWXG | S_IRWXO)) ||
            (statbuf.st_mode & S_IRWXU) != S_IRWXU) {
        rval = -5;
        goto errout;
    }
    /* 3. verify that the client's file was created recently */
    staletime = time(NULL) - STALE;
    if (statbuf.st_atime < staletime ||
            statbuf.st_ctime < staletime ||
            statbuf.st_mtime < staletime) {
        rval = -6;
        goto errout;
    }
    if (uidptr != NULL) {
        *uidptr = statbuf.st_uid;
    }
    unlink(name);
    free(name);
    return clifd;

errout:
    err = errno;
    close(clifd);
    free(name);
    errno = err;
    return rval;
}

cli_conn (creates the client's socket, binds it to a per-process name, and connects to the server)

#include "../apue.3e/include/apue.h"
#include <sys/socket.h>
#include <sys/stat.h>
#include <sys/un.h>
#include <errno.h>

#define CLI_PATH "/var/tmp/" /* prefix of the client's socket name */
#define CLI_PERM  S_IRWXU    /* permissions: rwx for user only */

int cli_conn(const char *name)
{
    int fd;
    int len;
    int err;
    int rval;
    struct sockaddr_un un, sun; /* un is the client's address, sun the server's */
    int do_unlink = 0;
    /* 1. validate the name passed in
     *    name is the server's name, so check its length first */
    if (strlen(name) >= sizeof(un.sun_path)) {
        errno = ENAMETOOLONG;
        return -1;
    }
    /* 2. create the client's fd
     *    this is the fd the client uses to send requests */
    if ((fd = socket(AF_UNIX, SOCK_STREAM, 0))<0) {
        return -1;
    }
    /* 3. build the client's address */
    /*    prefix + process ID go into un.sun_path; this fixes the format of the path */
    memset(&un, 0, sizeof(un));
    un.sun_family = AF_UNIX;
    sprintf(un.sun_path, "%s%05ld", CLI_PATH, (long)getpid());
    printf("file is %s\n", un.sun_path);
    len = offsetof(struct sockaddr_un, sun_path) + strlen(un.sun_path);
    /* 4. bind the fd to the client's address */
    unlink(un.sun_path); /* in case the name CLI_PATH+pid is already in use */
    if (bind(fd, (struct sockaddr *)&un, len)<0) {
        rval = -2;
        goto errout;
    }
    /* why bind first and chmod afterwards? the socket file only exists after
     * bind, and if bind fails there is nothing to change permissions on */
    if (chmod(un.sun_path, CLI_PERM)<0) {
        rval = -3;
        do_unlink = 1;
        goto errout;
    }
    /* 5. have the client look up the server by name */
    /*    name is the key used to connect to the server process */
    memset(&sun, 0 ,sizeof(sun));
    sun.sun_family = AF_UNIX;
    strcpy(sun.sun_path, name);
    len = offsetof(struct sockaddr_un, sun_path) + strlen(name);
    if (connect(fd, (struct sockaddr *)&sun, len)<0) {
        rval = -4;
        do_unlink = 1;
        goto errout;
    }
    return fd;
errout:
    err = errno;
    close(fd);
    if (do_unlink) {
        unlink(un.sun_path);
    }
    errno = err;
    return rval;
}

17.4 Passing File Descriptors

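Passing file descriptors between processes is another powerful feature of UNIX domain sockets. All the details of opening the file stay hidden in the server.

So far APUE has shown three ways in which file descriptors in different processes can refer to the same file:

(1) Figure 3.8: different processes each open the same file. Each process's fd has its own file table entry, so the two fds are essentially unrelated.

(2) Figure 8.2: a parent creates a child with fork, and the parent's whole memory layout is copied to the child. The two fds live in different address spaces, but they have the same value and share the same file table entry.

(3) Section 17.4: an fd is passed over a UNIX domain socket. The two fds live in different address spaces and have nothing in common except that they share the same file table entry.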
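The difference between cases (1) and (2) shows up in the file offset, which is stored in the file table entry. The following sketch is illustrative only; the function names are made up, and path is assumed to name an existing readable file:

```c
#include <fcntl.h>
#include <stdlib.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/*
 * Case (1): two independent open() calls create two file table entries.
 * Moving the offset through one fd leaves the other at 0.
 * Returns the offset seen through the second fd, or -1 on error.
 */
off_t offset_after_separate_open(const char *path)
{
    int a = open(path, O_RDONLY);
    int b = open(path, O_RDONLY);
    off_t pos = -1;

    if (a >= 0 && b >= 0) {
        lseek(a, 3, SEEK_SET);          /* move only a's offset */
        pos = lseek(b, 0, SEEK_CUR);    /* b's offset is unaffected */
    }
    if (a >= 0)
        close(a);
    if (b >= 0)
        close(b);
    return pos;
}

/*
 * Case (2): after fork, parent and child share one file table entry.
 * The child moves the offset; the parent sees the new position.
 * Returns the offset seen by the parent, or -1 on error.
 */
off_t offset_after_fork(const char *path)
{
    int fd = open(path, O_RDONLY);
    pid_t pid;
    off_t pos;

    if (fd < 0)
        return -1;
    if ((pid = fork()) < 0) {
        close(fd);
        return -1;
    }
    if (pid == 0) {                     /* child */
        lseek(fd, 3, SEEK_SET);
        _exit(0);
    }
    waitpid(pid, NULL, 0);              /* parent waits for the child */
    pos = lseek(fd, 0, SEEK_CUR);
    close(fd);
    return pos;
}
```

With any readable file, offset_after_separate_open returns 0 while offset_after_fork returns 3. A descriptor received over a UNIX domain socket, case (3), behaves like the fork case in this respect.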

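This section also covers several related structures. Those details are only needed to read the code; the key point is to remember the three cases above.

APUE defines its own small protocol for passing an fd over a UNIX domain socket. The details of that protocol do not matter much; the important part is how to tell the kernel that what is being sent is an fd.

The code for sending and receiving an fd over a UNIX domain socket is as follows:

send_fd (how is the kernel told that an fd is being sent? First fill in struct cmsghdr cmptr at lines 43~45, then assign cmptr to msg.msg_control in struct msghdr msg; the kernel then knows that a descriptor is being sent)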

 1 #include "../apue.3e/include/apue.h"
 2 #include <sys/uio.h>          /* struct iovec */
 3 #include <sys/socket.h>
 4 
 5 /* cmsghdr layouts differ between systems; the CMSG_LEN macro computes how
 6  * much memory a cmsghdr plus one int needs, so we know how much to malloc */
 7 #define CONTROLLEN CMSG_LEN(sizeof(int))
 8 
 9 static struct cmsghdr *cmptr = NULL;
10 
11 int send_fd(int fd, int fd_to_send)
12 {
13     struct iovec iov[1];
14     struct msghdr msg;
15     char buf[2]; /* the two bytes of the send_fd()/recv_fd() protocol */
16     /* scatter read / gather write, see section 14.6
17      * The case here is simple: iov has one element, so this amounts to a single write.
18      * sendmsg, however, requires the data to be described by struct iovec */
19     iov[0].iov_base = buf;
20     iov[0].iov_len = 2;
21     msg.msg_iov = iov;
22     msg.msg_iovlen = 1;
23     msg.msg_name = NULL;
24     msg.msg_namelen = 0;
25     /* send_fd is called in two situations:
26      * 1. normally, to pass an fd: fd_to_send is >= 0
27      * 2. from send_err: fd_to_send is a negative error code */
28     if (fd_to_send<0) {
29         msg.msg_control = NULL;
30         msg.msg_controllen = 0;
31         buf[1] = -fd_to_send; /* error codes are negative; send the positive status */
32         if (buf[1] == 0) { /* the protocol is not perfect: -256 has no nonzero
33         one-byte counterpart, so -1 and -256 are both sent as status 1 */
34             buf[1] = 1;
35         }
36     }
37     else {
38         /* the amount of memory for cmptr is computed by CMSG_LEN */
39         if (cmptr == NULL && (cmptr = malloc(CONTROLLEN)) == NULL ) {
40             return -1;
41         }
42         /* these settings mark the message as carrying an fd */
43         cmptr->cmsg_level = SOL_SOCKET;
44         cmptr->cmsg_type = SCM_RIGHTS;
45         cmptr->cmsg_len = CONTROLLEN;
46         /* attach cmptr to the message being sent */
47         msg.msg_control = cmptr;
48         msg.msg_controllen = CONTROLLEN;
49         /* the layout of struct cmsghdr matters here:
50          * struct cmsghdr{
51          *      socklen_t cmsg_len;
52          *      int cmsg_level;
53          *      int cmsg_type;
54          * }
55          * // followed by the actual control message data
56          * CMSG_DATA yields the address right after the cmsghdr, where fd_to_send is stored
57          * (see the implementation of this macro in <sys/socket.h> on Ubuntu:
58          * it is the struct cmsghdr pointer + 1, i.e. the first byte past the header) */
59         *(int *)CMSG_DATA(cmptr) = fd_to_send;
60         buf[1] = 0;
61     }
62     buf[0] = 0; /* the null byte flag that recv_fd() looks for */
63     /* sendmsg's return value is checked against 2, the size of char buf[2].
64      * Only the data described by msg_iov counts toward the return value;
65      * msg_control carries ancillary data, which is sent along with
66      * the message but is not included in the count returned by sendmsg */
67     if (sendmsg(fd, &msg, 0)!=2) {
68         return -1;
69     }
70     return 0;
71 }

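The receiving side, recv_fd, is shown below. (The code is not hard to follow. One pitfall: line 56 is the version corrected in the APUE errata, and the code misbehaves without that fix. Errata: http://www.apuebook.com/errata3e.html)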

 1 #include "open_fd.h"
 2 #include <sys/socket.h>        /* struct msghdr */
 3 
 4 /* size of control buffer to send/recv one file descriptor */
 5 #define    CONTROLLEN    CMSG_LEN(sizeof(int))
 6 
 7 static struct cmsghdr    *cmptr = NULL;        /* malloc'ed first time */
 8 
 9 /*
10  * Receive a file descriptor from a server process.  Also, any data
11  * received is passed to (*userfunc)(STDERR_FILENO, buf, nbytes).
12  * We have a 2-byte protocol for receiving the fd from send_fd().
13  */
14 int
15 recv_fd(int fd, ssize_t (*userfunc)(int, const void *, size_t))
16 {
17     int                newfd, nr, status;
18     char            *ptr;
19     char            buf[MAXLINE];
20     struct iovec    iov[1];
21     struct msghdr    msg;
22 
23     status = -1;
24     for ( ; ; ) {
25         iov[0].iov_base = buf;
26         iov[0].iov_len  = sizeof(buf);
27         msg.msg_iov     = iov;
28         msg.msg_iovlen  = 1;
29         msg.msg_name    = NULL;
30         msg.msg_namelen = 0;
31         if (cmptr == NULL && (cmptr = malloc(CONTROLLEN)) == NULL)
32             return(-1);
33         msg.msg_control    = cmptr;
34         msg.msg_controllen = CONTROLLEN;
35         if ((nr = recvmsg(fd, &msg, 0)) < 0) {
36             err_ret("recvmsg error");
37             return(-1);
38         } else if (nr == 0) {
39             err_ret("connection closed by server");
40             return(-1);
41         }
42 
43         /*
44          * See if this is the final data with null & status.  Null
45          * is next to last byte of buffer; status byte is last byte.
46          * Zero status means there is a file descriptor to receive.
47          */
48         for (ptr = buf; ptr < &buf[nr]; ) {
49             if (*ptr++ == 0) {
50                 if (ptr != &buf[nr-1])
51                     err_dump("message format error");
52                  status = *ptr & 0xFF;    /* prevent sign extension */
53                  if (status == 0) {
54                     printf("msg.msg_controllen:%zu\n", (size_t)msg.msg_controllen);
55                     printf("CONTROLLEN:%zu\n", (size_t)CONTROLLEN);
56                     if (msg.msg_controllen < CONTROLLEN)
57                         err_dump("status = 0 but no fd");
58                     newfd = *(int *)CMSG_DATA(cmptr);
59                 } else {
60                     newfd = -status;
61                 }
62                 nr -= 2;
63             }
64         }
65         if (nr > 0 && (*userfunc)(STDERR_FILENO, buf, nr) != nr)
66             return(-1);
67         if (status >= 0)    /* final data has arrived */
68             return(newfd);    /* descriptor, or -status */
69     }
70 }

17.5 An Open Server, Version 1

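This section uses the descriptor-passing technique from 17.4 to build an "open" server:

The server accepts requests from a client (which file to open, and how), opens the file on the server side, and then passes the file descriptor back over a UNIX domain socket.

Concretely, the client starts up, runs opend (the server program) via fork + execl, and communicates with it over a socketpair.

I reorganized the book's code a little (main.c is the client, maind.c is the server, the lib directory holds the helper functions, and the .h files in the include directory declare the shared library functions).

main.c: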

 1 #include "open_fd.h"
 2 #include <fcntl.h>
 3 #include <sys/uio.h>
 4 
 5 #define BUFFSIZE 8192
 6 #define CL_OPEN "open" // client's request for server
 7 
 8 int csopen(char *name, int oflag)
 9 {
10     pid_t pid;
11     int len;
12     char buf[10];
13     struct iovec iov[3];
14     static int fd[2] = {-1, -1};
15     /* the first call has to set up the link between child and parent */
16     if (fd[0] < 0) {
17         printf("first time build up fd_pipe\n");
18         /* create a full-duplex pipe */
19         if (fd_pipe(fd) < 0) {
20             err_ret("fd_pipe error");
21             return -1;
22         }
23         printf("fd[0]:%d,fd[1]:%d\n",fd[0],fd[1]);
24         if((pid = fork())<0){
25             err_ret("fork error");
26             return -1;
27         }
28         else if (pid ==0) { /*child*/
29             close(fd[0]);
30             /* note: a full-duplex fd can serve as both stdin and stdout; an earlier attempt hooked up only stdin and failed */
31             /* connect the child's stdin to fd[1] */
32             if (fd[1] != STDIN_FILENO && dup2(fd[1],STDIN_FILENO)!=STDIN_FILENO) {
33                 err_sys("dup2 error to stdin");
34             }
35             /* connect the child's stdout to fd[1] */
36             if (fd[1] != STDOUT_FILENO && dup2(fd[1],STDOUT_FILENO)!=STDOUT_FILENO) {
37                 err_sys("dup2 error to stdout");
38             }
39             /* exec opend; its stdin and stdout now refer to fd[1], so child and parent are connected through the pipe */
40             if (execl("./opend", "opend", (char *)0)<0) {
41                 err_sys("execl error");
42             }
43         }
44         close(fd[1]); /*parent*/
45     }
46 
47     /* the three iov buffers are gathered into one request string, separated by spaces */
48     sprintf(buf, " %d", oflag);
49     iov[0].iov_base = CL_OPEN " ";        /* string concatenation */
50     iov[0].iov_len  = strlen(CL_OPEN) + 1;
51     iov[1].iov_base = name; /* the file name goes in the middle buffer */
52     iov[1].iov_len  = strlen(name);
53     iov[2].iov_base = buf;
54     iov[2].iov_len  = strlen(buf) + 1;    /* +1 for null at end of buf */
55     len = iov[0].iov_len + iov[1].iov_len + iov[2].iov_len;
56     /* the client sends the request to the server through the fd[0]/fd[1] channel */
57     /* writev gathers the buffers in order and sends them as one write */
58     if (writev(fd[0], &iov[0], 3) != len) {
59         err_ret("writev error");
60         return(-1);
61     }
62     /* read descriptor, returned errors handled by write() */
63     return recv_fd(fd[0], write);
64 }
65 
66 /* this is the client program */
67 int main(int argc, char *argv[])
68 {
69     int n, fd;
70     char buf[BUFFSIZE], line[MAXLINE];
71     /* read one file name per line from stdin */
72     while (fgets(line, MAXLINE, stdin)!=NULL) {
73         /* replace the trailing newline with a null byte */
74         if (line[strlen(line)-1] == '\n') {
75             line[strlen(line)-1] = 0;
76         }
77         /* ask the server to open the file */
78         if ((fd = csopen(line, O_RDONLY))<0) {
79             continue;
80         }
81         /* read the file through fd and copy it to stdout until EOF */
82         printf("fd obtained from other process : %d\n",fd);
83         while ((n = read(fd, buf, BUFFSIZE))>0) {
84             if (write(STDOUT_FILENO, buf, n)!= n) {
85                 err_sys("write error");
86             }
87         }
88         if (n<0) {
89             err_sys("read error");
90         }
91         close(fd);
92     }
93     exit(0);
94 }

maind.c:

#include <errno.h>
#include <fcntl.h>
#include <stdio.h>
#include <string.h>
#include "open_fd.h"

#define CL_OPEN "open"
#define MAXARGC 50
#define WHITE " \t\n"

char errmsg[MAXLINE];
int oflag;
char *pathname;

/* cli_args and buf_args parse the request buffer that was read in;
 * knowing roughly what they do is enough, no need to read them closely */

int cli_args(int argc, char **argv)
{
    if (argc != 3 || strcmp(argv[0], CL_OPEN) != 0) {
        strcpy(errmsg, "usage: <pathname> <oflag>\n");
        return(-1);
    }
    pathname = argv[1];        /* save ptr to pathname to open */
    oflag = atoi(argv[2]);
    return(0);
}

int buf_args(char *buf, int (*optfunc)(int, char **))
{
    char    *ptr, *argv[MAXARGC];
    int        argc;

    if (strtok(buf, WHITE) == NULL)        /* an argv[0] is required */
        return(-1);
    argv[argc = 0] = buf;
    while ((ptr = strtok(NULL, WHITE)) != NULL) {
        if (++argc >= MAXARGC-1)    /* -1 for room for NULL at end */
            return(-1);
        argv[argc] = ptr;
    }
    argv[++argc] = NULL;

    /*
     * Since argv[] pointers point into the user's buf[],
     * user's function can just copy the pointers, even
     * though argv[] array will disappear on return.
     */
    return((*optfunc)(argc, argv));
}

void handle_request(char *buf, int nread, int fd)
{
    int        newfd;
    if (buf[nread-1] != 0) {
        /* fill in errmsg before sending it, as the book's version does */
        snprintf(errmsg, MAXLINE-1,
          "request not null terminated: %*.*s\n", nread, nread, buf);
        send_err(fd, -1, errmsg);
        return;
    }
    if (buf_args(buf, cli_args) < 0) {    /* parse args & set options */
        send_err(fd, -1, errmsg);
        return;
    }
    if ((newfd = open(pathname, oflag)) < 0) {
        snprintf(errmsg, MAXLINE-1, "can't open %s: %s\n", pathname,
          strerror(errno));
        send_err(fd, -1, errmsg);
        return;
    }
    if (send_fd(fd, newfd) < 0)        /* send the descriptor */
        err_sys("send_fd error");
    close(newfd);        /* we're done with descriptor */
}

/* the server */
int main(void)
{
    int nread;
    char buf[MAXLINE];
    for (; ; ){
        /* block until a request arrives on stdin */
        if ((nread = read(STDIN_FILENO, buf, MAXLINE))<0) {
            err_sys("read error on stream pipe");
        }
        else if (nread == 0) {
            break;
        }
        handle_request(buf, nread, STDOUT_FILENO);
    }
    exit(0);
}

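The remaining code in lib and include comes partly from this chapter of the book and partly from the library shipped with the APUE source; it is not repeated here.

Now the result of a run (screenshot not included). A text file named xbf was placed in the current directory; the client sends a request to open it read-only, the server opens it, and then returns the fd.

First, note that msg.msg_controllen and CONTROLLEN are not equal in the output. This is the bug addressed by the book's errata, and it is why the check at line 56 of recv_fd uses < rather than an equality test.

The fd of the xbf file opened by the server travels in the control (ancillary) data attached to msg, right after the cmsghdr header.

If line 91 of main.c (the close(fd)) is commented out, the result is as follows (screenshot not included).

As the output shows, the fd value the client receives has nothing to do with the fd value on the server side. The kernel simply takes the lowest fd number available in the client and makes it refer to the xbf file the server opened.

To summarize, the flow is:

(1) The server opens the xbf file →

(2) The server places the fd for xbf in the data area that follows the cmsghdr →

(3) The server sends the msghdr to the client over the UNIX domain socket created by fd_pipe →

(4) During the transfer, what the kernel records is presumably the file table entry that the fd refers to →

(5) When the client receives the message, the kernel allocates the lowest fd available in the client for that file table entry →

(6) The client now holds an fd that already refers to the open xbf file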
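The six steps can be condensed into a minimal sketch without the book's 2-byte protocol. This is not the book's send_fd/recv_fd: the names send_one_fd and recv_one_fd are made up, and the control buffer is sized with CMSG_SPACE instead of a malloc'ed CMSG_LEN block. It assumes a system (such as Linux) where CMSG_SPACE is a compile-time constant:

```c
#include <string.h>
#include <sys/socket.h>
#include <sys/uio.h>
#include <unistd.h>

/* Control buffer for one int; the union forces cmsghdr alignment. */
union fd_ctl {
    char buf[CMSG_SPACE(sizeof(int))];
    struct cmsghdr align;
};

/* Send fd_to_send over the UNIX domain socket sock; 0 on success. */
int send_one_fd(int sock, int fd_to_send)
{
    char byte = 0;                      /* at least one data byte must be sent */
    struct iovec iov = { .iov_base = &byte, .iov_len = 1 };
    union fd_ctl ctl;
    struct msghdr msg;
    struct cmsghdr *cmsg;

    memset(&msg, 0, sizeof(msg));
    memset(&ctl, 0, sizeof(ctl));
    msg.msg_iov = &iov;
    msg.msg_iovlen = 1;
    msg.msg_control = ctl.buf;
    msg.msg_controllen = sizeof(ctl.buf);

    cmsg = CMSG_FIRSTHDR(&msg);
    cmsg->cmsg_level = SOL_SOCKET;
    cmsg->cmsg_type = SCM_RIGHTS;       /* "this message carries descriptors" */
    cmsg->cmsg_len = CMSG_LEN(sizeof(int));
    memcpy(CMSG_DATA(cmsg), &fd_to_send, sizeof(int));

    return sendmsg(sock, &msg, 0) == 1 ? 0 : -1;
}

/* Receive one descriptor from sock; returns the new fd or -1. */
int recv_one_fd(int sock)
{
    char byte;
    struct iovec iov = { .iov_base = &byte, .iov_len = 1 };
    union fd_ctl ctl;
    struct msghdr msg;
    struct cmsghdr *cmsg;
    int fd;

    memset(&msg, 0, sizeof(msg));
    msg.msg_iov = &iov;
    msg.msg_iovlen = 1;
    msg.msg_control = ctl.buf;
    msg.msg_controllen = sizeof(ctl.buf);

    if (recvmsg(sock, &msg, 0) <= 0)
        return -1;
    cmsg = CMSG_FIRSTHDR(&msg);
    if (cmsg == NULL || cmsg->cmsg_level != SOL_SOCKET ||
        cmsg->cmsg_type != SCM_RIGHTS ||
        cmsg->cmsg_len != CMSG_LEN(sizeof(int)))
        return -1;
    memcpy(&fd, CMSG_DATA(cmsg), sizeof(int));  /* number chosen by the kernel */
    return fd;
}
```

Sending the write end of a pipe through a socketpair and writing to the received descriptor makes the data show up on the pipe's read end, even though the received fd number differs from the one that was sent.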

The rest of the protocol does not need close reading; a few technical details are recorded separately further down.

17.6 An Open Server Version 2

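This version relies on named UNIX domain sockets so that file descriptors can be passed between unrelated processes.

The key to this part is the two loop functions in Figures 17.29 and 17.30 of the book: one uses select, the other uses poll. (It also helps to be familiar with daemons and with the usual pattern for parsing command-line arguments.)

To connect the pieces quickly, it is worth reviewing select and poll first. Both take value-result arguments (the fd_set arguments of select, the revents field of each struct pollfd for poll), so make sure the arguments are understood before reading the code.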
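The value-result behavior can be seen in a small sketch. The function names are made up for illustration; ready_fd is assumed to be a descriptor with data waiting and idle_fd one without:

```c
#include <poll.h>
#include <sys/select.h>
#include <unistd.h>

/*
 * select(): the fd_set is overwritten on return. Bit 0 of the result is
 * set if ready_fd is still in the set afterwards, bit 1 if idle_fd is.
 */
int select_result_mask(int ready_fd, int idle_fd)
{
    fd_set rset;
    struct timeval tv = { 0, 0 };       /* do not block */
    int maxfd = ready_fd > idle_fd ? ready_fd : idle_fd;

    FD_ZERO(&rset);
    FD_SET(ready_fd, &rset);
    FD_SET(idle_fd, &rset);
    if (select(maxfd + 1, &rset, NULL, NULL, &tv) < 0)
        return -1;
    return (FD_ISSET(ready_fd, &rset) ? 1 : 0) |
           (FD_ISSET(idle_fd, &rset) ? 2 : 0);
}

/*
 * poll(): events is input only and revents is output only. Returns 1 if
 * both events fields are still POLLIN and only ready_fd reports POLLIN.
 */
int poll_keeps_events(int ready_fd, int idle_fd)
{
    struct pollfd pfd[2] = {
        { .fd = ready_fd, .events = POLLIN, .revents = 0 },
        { .fd = idle_fd,  .events = POLLIN, .revents = 0 },
    };

    if (poll(pfd, 2, 0) < 0)
        return -1;
    return pfd[0].events == POLLIN && pfd[1].events == POLLIN &&
           (pfd[0].revents & POLLIN) && !(pfd[1].revents & POLLIN);
}
```

With two pipes of which only the first has data, select_result_mask returns 1: the idle descriptor has been removed from the set. That is why a select loop has to rebuild its set from a saved copy on every iteration, whereas a pollfd array can be reused as it is.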

loop.select.c:

#include    "opend.h"
#include    <sys/select.h>

void
loop(void)
{
    int        i, n, maxfd, maxi, listenfd, clifd, nread;
    char    buf[MAXLINE];
    uid_t    uid;
    fd_set    rset, allset;

    /* unlike poll, select is not given an array length;
     * maxfd + 1 tells it how many descriptors to examine */
    FD_ZERO(&allset);
    /* obtain fd to listen for client requests on */
    if ((listenfd = serv_listen(CS_OPEN)) < 0)
        log_sys("serv_listen error");
    /* add the server's listening fd to the set */
    FD_SET(listenfd, &allset);
    /* so far the largest fd to watch is the listenfd just allocated */
    maxfd = listenfd;
    maxi = -1;

    for ( ; ; ) {
        rset = allset;    /* rset gets modified each time around */
        /* on return, rset contains only the fds that are ready */
        if ((n = select(maxfd + 1, &rset, NULL, NULL, NULL)) < 0)
            log_sys("select error");
        /* case 1: a client has sent a connection request */
        if (FD_ISSET(listenfd, &rset)) {
            /* accept new client request */
            if ((clifd = serv_accept(listenfd, &uid)) < 0)
                log_sys("serv_accept error: %d", clifd);
            i = client_add(clifd, uid);
            FD_SET(clifd, &allset); /* A: add the new fd to the watched set */
            if (clifd > maxfd) /* update the largest fd that select watches */
                maxfd = clifd;    /* max fd for select() */
            if (i > maxi) /* update the highest index in use in the client array */
                maxi = i;    /* max index in client[] array */
            log_msg("new connection: uid %d, fd %d", uid, clifd);
            continue;
        }
        /* case 2: no new client; serve the clients in the array that are ready */
        for (i = 0; i <= maxi; i++) {    /* go through client[] array */
            if ((clifd = client[i].fd) < 0) /* unused slot */
                continue;
            if (FD_ISSET(clifd, &rset)) { /* this client is ready */
                /* read argument buffer from client */
                if ((nread = read(clifd, buf, MAXLINE)) < 0) {
                    log_sys("read error on fd %d", clifd);
                } else if (nread == 0) { /* nread == 0: the client has closed */
                    log_msg("closed: uid %d, fd %d",
                      client[i].uid, clifd);
                    client_del(clifd);    /* client has closed cxn */
                    FD_CLR(clifd, &allset); /* B: remove the fd from the watched set */
                    close(clifd);
                } else {    /* process client's request */
                    handle_request(buf, nread, clifd, client[i].uid);
                }
            }
        }
    }
}

loop.poll.c:

#include    "opend.h"
#include    <poll.h>

#define NALLOC    10    /* # pollfd structs to alloc/realloc */

static struct pollfd *
grow_pollfd(struct pollfd *pfd, int *maxfd)
{
    int                i;
    int                oldmax = *maxfd;
    int                newmax = oldmax + NALLOC;

    if ((pfd = realloc(pfd, newmax * sizeof(struct pollfd))) == NULL)
        err_sys("realloc error");
    for (i = oldmax; i < newmax; i++) {
        pfd[i].fd = -1;
        pfd[i].events = POLLIN;
        pfd[i].revents = 0;
    }
    *maxfd = newmax;
    return(pfd);
}

void
loop(void)
{
    int                i, listenfd, clifd, nread;
    char            buf[MAXLINE];
    uid_t            uid;
    struct pollfd    *pollfd;
    int                numfd = 1;
    int                maxfd = NALLOC;

    /* start with room for NALLOC (10) pollfd slots */
    if ((pollfd = malloc(NALLOC * sizeof(struct pollfd))) == NULL)
        err_sys("malloc error");
    for (i = 0; i < NALLOC; i++) {
        pollfd[i].fd = -1;
        pollfd[i].events = POLLIN; /*read*/
        pollfd[i].revents = 0;
    }

    /* obtain fd to listen for client requests on */
    if ((listenfd = serv_listen(CS_OPEN)) < 0)
        log_sys("serv_listen error");
    client_add(listenfd, 0);    /* we use [0] for listenfd */
    pollfd[0].fd = listenfd;

    for ( ; ; ) {
        /* poll is given numfd (slots in use), not maxfd (slots allocated) */
        if (poll(pollfd, numfd, -1) < 0)
            log_sys("poll error");
        /* 1. first check for a new client connection request */
        if (pollfd[0].revents & POLLIN) {
            /* accept new client request */
            if ((clifd = serv_accept(listenfd, &uid)) < 0)
                log_sys("serv_accept error: %d", clifd);
            client_add(clifd, uid);
            /* possibly increase the size of the pollfd array */
            /* if every allocated pollfd slot is in use, grow the array with realloc */
            if (numfd == maxfd)
                pollfd = grow_pollfd(pollfd, &maxfd);
            pollfd[numfd].fd = clifd;
            pollfd[numfd].events = POLLIN;
            pollfd[numfd].revents = 0;
            numfd++;
            log_msg("new connection: uid %d, fd %d", uid, clifd);
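            /* Unlike the select version, there is no continue here; execution
             * simply falls through to the client loop below.
             * Why is that fine here but not with select?
             * Because poll describes each fd with its own struct pollfd, which holds
             * a. the events we are interested in (events)
             * b. the events that actually occurred (revents)
             * so handling one fd does not disturb the state of any other fd */
        }
        /* 2. then check which clients are ready */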
        for (i = 1; i < numfd; i++) {
            if (pollfd[i].revents & POLLHUP) {
                goto hungup;
            } else if (pollfd[i].revents & POLLIN) {
                /* read argument buffer from client */
                if ((nread = read(pollfd[i].fd, buf, MAXLINE)) < 0) {
                    log_sys("read error on fd %d", pollfd[i].fd);
                } else if (nread == 0) {
hungup:
                    /* the client closed the connection */
                    log_msg("closed: uid %d, fd %d",
                      client[i].uid, pollfd[i].fd);
                    client_del(pollfd[i].fd);
                    close(pollfd[i].fd);
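                    if (i < (numfd-1)) { /* nothing to pack if this was the last entry */
                        /* to keep the array compact, move the last
                         * entry (fd and related fields) into slot i */
                        /* pack the array */
                        pollfd[i].fd = pollfd[numfd-1].fd;
                        pollfd[i].events = pollfd[numfd-1].events;
                        pollfd[i].revents = pollfd[numfd-1].revents;
                        /* the entry moved into slot i has not been examined yet,
                         * so this slot must be checked again */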
                        i--;    /* recheck this entry */
                    }
                    numfd--;
                } else {        /* process client's request */
                    handle_request(buf, nread, pollfd[i].fd,
                      client[i].uid);
                }
            }
        }
    }
}

===================================Divider===================================

Notes on a few technical details I ran into

1. Sign extension

Line 52 of recv_fd above does something that is not very intuitive:

int status;

char *ptr;

status = *ptr & 0xFF;

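ptr points to a char, whose 8 bits are used here to carry a status value from 0 to 255. For example, a *ptr value of 128 is 10000000 in binary.

status is an int (4 bytes, 32 bits here). A plain status = *ptr widens the char to an int, which raises the question: is the top bit of the char treated as a sign bit or as a value bit?

(1) First, whether plain char is signed or unsigned depends on the compiler and the target; see this article (http://descent-incoming.blogspot.jp/2013/02/c-char-signed-unsigned.html)

(2) The constant 0xFF has type int with value 255: only the low 8 bits are set, and all higher bits are 0

So whether or not char is signed, *ptr is first promoted to int (sign-extended if char is signed), and the AND with 0xFF then clears everything above the low 8 bits. The top bit of the char therefore ends up being treated as a value bit, which avoids the sign extension problem described above.

To help remember this, here is a small example that shows the effect of sign extension: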

#include <stdint.h>
#include <stdio.h>
#include <stdlib.h>

int main(int argc, char *argv[])
{
    int status = -1;
    /* 254 is an int constant; converting it to char keeps only the low 8 bits
     * (where char is signed the result is implementation-defined; gcc gives -2) */
    char c1 = 254;
    unsigned char c2 = 254;
    /* with gcc on x86, plain char is signed, so char -> int fills
     * the high bits with copies of the char's sign bit */
    status  = c1;
    printf("status converted from c1 : %d\n", status);
    /* unsigned char has no sign bit, so unsigned char -> int fills the high bits with 0 */
    status = c2;
    printf("status converted from c2 : %d\n", status);
    /* check that the constant 0xFF is an int (4 bytes, 32 bits here) */
    printf("size of default int : %zu\n", sizeof(0xFF));
    status = c1 & 0xFF;
    printf("status converted from c1 & 0xFF : %d\n", status);
    /* a 1-byte (8-bit) signed integer: 0xFF becomes -1, which is promoted
     * back to an int with all bits set, so the mask no longer clears anything */
    int8_t i8 = 0xFF;
    status  = c1 & i8;
    printf("status converted from c1 & int8_t i8 : %d\n", status);
    return 0;
}

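The output is as follows (screenshot not included).

The example above should cover the vast majority of cases.

A good reference I read at the time: http://www.cs.umd.edu/class/sum2003/cmsc311/Notes/Data/signExt.html

2. sizeof versus strlen:

http://www.cnblogs.com/carekee/articles/1630789.html

3. Struct memory alignment:

Both send_fd and recv_fd use the macro CMSG_LEN. Looking up its definition in socket.h leads to CMSG_ALIGN, a macro that performs memory alignment.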
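The usual way such an alignment macro works is to round a length up to the next multiple of a power of two. The sketch below shows that generic technique next to the two standard macros; align_up and the two wrapper functions are illustrative names, and the exact definition of CMSG_ALIGN is system-specific:

```c
#include <stddef.h>
#include <sys/socket.h>

/* Round n up to the next multiple of align (align must be a power of two). */
size_t align_up(size_t n, size_t align)
{
    return (n + align - 1) & ~(align - 1);
}

/* Value for cmsg_len with one int of data: header plus data, no trailing pad. */
size_t one_fd_cmsg_len(void)
{
    return CMSG_LEN(sizeof(int));
}

/* Bytes to reserve in a control buffer: like CMSG_LEN plus trailing padding. */
size_t one_fd_cmsg_space(void)
{
    return CMSG_SPACE(sizeof(int));
}
```

For example, align_up(5, 8) is 8 and align_up(17, 4) is 20. On systems where CMSG_SPACE pads the data area, one_fd_cmsg_space() is larger than one_fd_cmsg_len(), which may be one reason a received msg_controllen does not have to equal CMSG_LEN(sizeof(int)).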

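(1) For a review of how CMSG_ALIGN achieves alignment, see this blog: http://blog.csdn.net/duanlove/article/details/9948947

(2) For why memory alignment is needed at all, see this blog: http://www.cppblog.com/snailcong/archive/2009/03/16/76705.html

(3) For a hands-on guide to computing the layout of a struct, see this blog: http://blog.csdn.net/hairetz/article/details/4084088

Putting the above together gives the following rules for struct memory alignment:

1. Member offsets start at 0. If member A comes before member B, padding is inserted after A so that B's offset is a multiple of B's alignment requirement (for basic types on common platforms this equals B's size).

2. The total size of the struct must be a multiple of the largest alignment requirement among its members (for basic types, the size of its largest member).

These two rules are enough to work out the padded size of a struct in the usual cases.

Finally, a small example of my own to make this concrete: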

#include <stdio.h>
#include <stdlib.h>

struct str1{
    char a;
    char b;
    short c;
    long d;
};

struct str2{
    char a;
};

int main(int argc, char *argv[])
{
    struct str2 s2;
    struct str1 s1;
    char *p;
    char c;
    short s;
    long l;

    /* sizeof yields a size_t, so print it with %zu; %p expects a void * */
    printf("size of str2 : %zu\n", sizeof(struct str2));
    printf("addr of str2.a : %p\n", (void *)&s2.a);
    printf("size of str1 : %zu\n", sizeof(struct str1));
    printf("addr of str1.a : %p\n", (void *)&s1.a);
    printf("addr of str1.b : %p\n", (void *)&s1.b);
    printf("addr of str1.c : %p\n", (void *)&s1.c);
    printf("addr of str1.d : %p\n", (void *)&s1.d);
    printf("addr of char pointer p : %p\n", (void *)&p);
    printf("addr of char c : %p\n", (void *)&c);
    printf("addr of long l : %p\n", (void *)&l);
    printf("addr of short s : %p\n", (void *)&s);
    return 0;
}

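The output is as follows (screenshot not included).

Analysis:

(1) The struct layout follows the rules described above.

(2) The other (local) variables are not laid out strictly in declaration order. My understanding is that they are ordered by size, with larger variables at higher addresses (the stack grows from high addresses to low), which helps save memory by reducing the padding that alignment would otherwise require. The C standard does not specify how local variables are laid out, so this is compiler-specific behavior.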
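The two struct rules can also be checked without printing addresses, using offsetof and the C11 alignof operator. This is a sketch that assumes a C11 compiler; the two checking functions are illustrative names:

```c
#include <stdalign.h>
#include <stddef.h>

struct str1 {
    char  a;
    char  b;
    short c;
    long  d;
};

/* Rule 1: every member offset is a multiple of that member's alignment. */
int members_are_aligned(void)
{
    return offsetof(struct str1, b) % alignof(char)  == 0 &&
           offsetof(struct str1, c) % alignof(short) == 0 &&
           offsetof(struct str1, d) % alignof(long)  == 0;
}

/* Rule 2: the struct size is a multiple of the struct's own alignment. */
int size_is_multiple_of_alignment(void)
{
    return sizeof(struct str1) % alignof(struct str1) == 0;
}
```

Both functions return 1 on a conforming compiler. On a typical LP64 system, where long is 8 bytes, the offsets of a, b, c, and d are 0, 1, 2, and 8: four bytes of padding sit between c and d, and the struct is 16 bytes long.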

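In addition, I took a closer look at malloc, and it does indeed take memory alignment into account.

(1) man malloc describes the alignment of the returned pointer (screenshot not included).

(2) This blog is devoted to how malloc's allocation scheme accounts for alignment: http://blog.csdn.net/elpmis/article/details/4500917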
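A quick way to observe this is to test the returned address against the strictest fundamental alignment, alignof(max_align_t). The sketch assumes a C11 compiler and library; malloc_is_max_aligned is an illustrative name:

```c
#include <stdalign.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/*
 * Returns 1 if a fresh malloc(size) block is aligned for max_align_t,
 * 0 if it is not, and -1 if the allocation failed.
 */
int malloc_is_max_aligned(size_t size)
{
    void *p = malloc(size);
    int ok;

    if (p == NULL)
        return -1;
    ok = ((uintptr_t)p % alignof(max_align_t)) == 0;
    free(p);
    return ok;
}
```

With glibc on x86-64, malloc_is_max_aligned(64) is expected to return 1. Very small requests may be aligned less strictly on some allocators, so the sketch is best tried with a size at least as large as max_align_t.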

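4. char c = 0 versus char c = '\0'

The two are the same thing written differently: the character '\0' has the value 0.

http://stackoverflow.com/questions/16955936/string-termination-char-c-0-vs-char-c-0

===================================Divider===================================

That is about it for my pass through APUE. The last two chapters are not very current, so I am skipping them for now.

Having read the book once, I feel that is far from enough; it should stay close at hand for frequent reference.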