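The Bit-Field Concept in C

I. Introduction to Bit-Fields

Anyone who has worked with the Linux kernel's network protocol stack has probably seen bit-fields. The following definition of the TCP header is taken from the Linux kernel source (include/linux/tcp.h):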

struct tcphdr {
    __be16    source;
    __be16    dest;
    __be32    seq;
    __be32    ack_seq;
#if defined(__LITTLE_ENDIAN_BITFIELD)
    __u16    res1:4,
        doff:4,
        fin:1,
        syn:1,
        rst:1,
        psh:1,
        ack:1,
        urg:1,
        ece:1,
        cwr:1;
#elif defined(__BIG_ENDIAN_BITFIELD)
    __u16    doff:4,
        res1:4,
        cwr:1,
        ece:1,
        urg:1,
        ack:1,
        psh:1,
        rst:1,
        syn:1,
        fin:1;
#else
#error    "Adjust your <asm/byteorder.h> defines"
#endif
    __be16    window;
    __sum16    check;
    __be16    urg_ptr;
};

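A bit-field is declared as member-name:width, where the width is a number of bits. As the tcphdr definition above shows, bit-fields are implementation-dependent: the members have to be listed in a different order for each of the two bit-field byte orders. The following is an example about bit-fields from the C99 standard (ISO/IEC 9899:1999):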

EXAMPLE 3 The following obscure constructions

    typedef signed int t;
    typedef int plain;
    struct tag {
        unsigned t:4;
        const t:5;
        plain r:5;
    };

declare a typedef name t with type signed int, a typedef name plain with type int, and a structure
with three bit-field members, one named t that contains values in the range [0, 15], an unnamed
const-qualified bit-field which (if it could be accessed) would contain values in either the range
[−15, +15] or [−16, +15], and one named r that contains values in one of the ranges [0, 31],
[−15, +15], or [−16, +15]. (The choice of range is implementation-defined.) The first two bit-field
declarations differ in that unsigned is a type specifier (which forces t to be the name of a structure
member), while const is a type qualifier (which modifies t which is still visible as a typedef name).

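The standard's example contains an unnamed bit-field member, and, as its text explains, the range of values a bit-field can hold is implementation-defined. I was curious how much memory a struct made of bit-fields occupies, so I tested it with sizeof, as follows: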

#include <stdio.h>
#include <string.h>
/*
 * Sample code by virHappy
 */

typedef signed int t;
typedef int plain;

/* contains an unnamed bit-field member (const t:5) */
struct tag {
        unsigned t:4;
        const t:5;
        plain r:5;
};

/* members declared as unsigned char (a bit-field type GCC accepts as an extension) */
struct rec {
        unsigned char a:1;
        unsigned char b:1;
        unsigned char c:1;
        unsigned char d:1;
};

/* members declared as unsigned int */
struct rec_int {
        unsigned int a:1;
        unsigned int b:1;
        unsigned int c:1;
        unsigned int d:1;
};

/* Report a bit that is already set; otherwise set it. */
#define TEST_AND_SET_BIT(x) \
        do { \
                if ((x)) { \
                        printf("bit already set.\n"); \
                } else { \
                        (x) = 1; \
                } \
        } while (0)

int main(void)
{
        struct tag st;
        struct rec sr;

        /* sizeof yields a size_t, which is printed with %zu */
        printf("size of tag is: %zu\n", sizeof(st));
        printf("size of rec is: %zu\n", sizeof(sr));
        printf("size of rec_int is: %zu\n", sizeof(struct rec_int));
        printf("size of int is: %zu\n", sizeof(int));

        memset(&sr, 0, sizeof(struct rec));
        /* first pass: every bit is clear, so each one is set silently */
        TEST_AND_SET_BIT(sr.a);
        TEST_AND_SET_BIT(sr.b);
        TEST_AND_SET_BIT(sr.c);
        TEST_AND_SET_BIT(sr.d);

        /* second pass: every bit is already set, so each one is reported */
        TEST_AND_SET_BIT(sr.a);
        TEST_AND_SET_BIT(sr.b);
        TEST_AND_SET_BIT(sr.c);
        TEST_AND_SET_BIT(sr.d);
        return 0;
}

Output:

[root@host]# gcc -Wall bitfield.c -o bf
[root@host]# ./bf
size of tag is: 4
size of rec is: 1
size of rec_int is: 4
size of int is: 4
bit already set.
bit already set.
bit already set.
bit already set.

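The results show that the struct whose bit-field members are declared with the int type specifier occupies 4 bytes, and the struct whose members are declared with the char type specifier occupies 1 byte, even though only 4 bits' worth of members are actually declared. These sizes are what GCC produced on this machine; the size of the storage unit that holds bit-fields is implementation-defined.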

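II. What Bit-Fields Are Used For

1. In RFC documents, the sections describing packet formats often assign a specific function to an individual bit. The kernel's network protocol stack implements the corresponding fields with bit-fields.

2. When parsing configuration files, you sometimes need to compare a new configuration against the existing one and mark what differs. Bit-fields are handy for these marks, and their advantage is low memory use.

3. Others?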