散列表

　　散列表（Hash table，也叫哈希表），是根据关键字（Key value）而直接访问在内存存储位置的数据结构。也就是说，把键值通过一个函数的计算，映射到表中一个位置来访问记录，这加快了查找速度。这个映射函数称做散列函数，存放记录的数组称做散列表。类似于数学中的映射关系。

　　现假设关键字的全域为U = {k | 0 <= k < n}

1、直接寻址表

　　如果全域U比较小，即n的数值比较小,且没有重复元素，那么可以用数组T[0, m-1](m = n)做为直接寻址表，每个槽位T[k] = tk，tk = NIL表示不存在关键字为k的元素，否则表示有关键字为k的元素。

SEARCH(T, x)
    return T[x]

INSERT(T, x)
    T[x] = x

DELETE(T, x)
    T[x] = NIL

直接寻址法限制较多，不常用。

2、散列表

当全域U很大时，可以用函数h把U中元素k映射到表T[1, m]中的某个位置，即：

h: U{k} --> T[i], (k IN [0, n-1], i IN [0, m-1])

这样我们处理的值就从n降到了m。这里的函数h称为散列函数。

好的散列函数应该满足如下性质：每个关键字都等可能的映射到T[0, m-1]中的m个槽位中，且与其他关键字散列到哪个槽位无关

但这个性质很难满足. 散列函数假定关键字为自然数N的子集，如果关键字不是自然数，则将其变换为自然数。

一般有如下几种散列函数：

2.1、除法散列

h(k) = k mod m  // m表示槽位数，k表示关键字

一般讲m选择为与2的整数幂不太接近的质数。

2.2、乘法散列

h(k) = floor(m(kA mod 1)) // 0 < A < 1

其中 kA mod 1 表示取 kA的小数部分，floor表示向下取整。

A可以取黄金分割数 (sqrt(5) - 1) / 2

2.3 解决冲突

从全域U --> 表T, 可能存在k1和k2同时映射到T[i]的情况,发生了冲突，可以通过链接法来解决。

链接法实际是使用数组T[0, m-1], 数组的任意元素T[i]为一个链表的表头来实现的。

3、散列表简单实现

采用除法散列，并用双链表解决散列冲突的简单散列表如下：

  1 static const int hash_size = 331;
  2 #define hash(x) ((unsigned int)x % hash_size)
  3 
  4 struct listNode
  5 {
  6     int key;
  7     int value;
  8     struct listNode *next;
  9     struct listNode *prev;
 10     listNode(int x, int y): key(x), value(y), next(NULL), prev(NULL) {}
 11     listNode(): key(0), value(0), next(NULL), prev(NULL) {}
 12 };
 13 
 14 class hash_table
 15 {
 16 public:
 17     hash_table()
 18     {
 19         for (int i = 0; i < hash_size; i++)
 20         {
 21             hash_map[i].next = &hash_map[i]; 
 22             hash_map[i].prev = &hash_map[i]; 
 23         }
 24         elem_size = 0;
 25     }
 26     struct listNode *find(int key);
 27     void insert(int key, int value);
 28     void erase(int key);
 29     bool empty();
 30     int size();
 31 private:
 32     struct listNode hash_map[hash_size];
 33     int elem_size;
 34 };
 35 
 36 void __list_add(struct listNode *prev, struct listNode *next, struct listNode *node)
 37 {
 38     prev->next = node;
 39     node->next = next;
 40     next->prev = node;
 41     node->prev = prev;
 42 }
 43 
 44 void list_add(struct listNode *head, struct listNode *node)
 45 {
 46     __list_add(head, head->next, node);
 47 }
 48 
 49 void __list_del(struct listNode *prev, struct listNode *next)
 50 {
 51     prev->next = next;
 52     next->prev = prev;
 53 }
 54 
 55 void list_del(struct listNode *node)
 56 {
 57     __list_del(node->prev, node->next);
 58 }
 59 
 60 bool list_empty(struct listNode *node)
 61 {
 62     return (node->next == node);
 63 }
 64 
 65 bool list_tail(struct listNode *head, struct listNode *node)
 66 {
 67     return (node->next == head);
 68 }
 69 
 70 bool hash_table::empty()
 71 {
 72     return (elem_size == 0);
 73 }
 74 
 75 int hash_table::size()
 76 {
 77     return elem_size;
 78 }
 79 
 80 struct listNode *hash_table::find(int key)
 81 {
 82     struct listNode *head = &hash_map[hash(key)], *visit = head;
 83     while (!list_tail(head, visit))
 84     {
 85         visit = visit->next;
 86         if (visit->key == key)
 87             return visit;
 88     }
 89     return NULL;
 90 }
 91 
 92 void hash_table::insert(int key, int value)
 93 {
 94     struct listNode *visit = NULL;
 95     
 96     visit = find(key);
 97     if (visit != NULL)
 98     {
 99         visit->value = value;
100     }
101     else
102     {
103         visit = new listNode(key, value);
104         list_add(&hash_map[hash(key)], visit);
105         elem_size++;
106     }
107 }
108 
109 void hash_table::erase(int key)
110 {
111     struct listNode *visit;
112 
113     visit = find(key);
114     if (visit != NULL)
115     {
116         list_del(visit);
117         delete visit;
118         elem_size--;
119     }
120 }

为了简单起见，没有实现rehash方法，仅仅作为一个散列表的简单实现.