B树

B-tree is a tree data structure that keeps data sorted and allows searches, sequential access, insertions, and deletions in logarithmic time. The B-tree is a generalization of a binary search tree in that a node can have more than two children (Comer 1979, p. 123). Unlike self-balancing binary search trees, the B-tree is optimized for systems that read and write large blocks of data. It is commonly used in databases and filesystems.

    Average Worst case
Space   O(n)   O(n)
Search O(log n) O(log n)
Insert O(log n) O(log n)
Delete O(log n) O(log n)

The term B-tree may refer to a specific design or it may refer to a general class of designs. In the narrow sense, a B-tree stores keys in its internal nodes but need not store those keys in the records at the leaves. The general class includes variations such as the B+ tree and the B*.

没有b减树一说,b-tree就是b树。。B+tree这是B加树。

In the B+ tree, copies of the keys are stored in the internal nodes; the keys and records are stored in leaves; in addition, a leaf node may include a pointer to the next leaf node to speed sequential access

In particular, a B-tree:

  • keeps keys in sorted order for sequential traversing
  • uses a hierarchical index to minimize the number of disk reads
  • uses partially full blocks to speed insertions and deletions
  • keeps the index balanced with an elegant recursive algorithm
  • In addition, a B-tree minimizes waste by making sure the interior nodes are at least half full. A B-tree can handle an arbitrary number of insertions and deletions.

Internal nodes are all nodes except for leaf nodes and the root node. They are usually represented as an ordered set of elements and child pointers. Every internal node contains a maximum of U children and a minimum of L children. Thus, the number of elements is always 1 less than the number of child pointers (the number of elements is between L−1 and U−1). U must be either 2L or 2L−1; therefore each internal node is at least half full. The relationship between U and L implies that two half-full nodes can be joined to make a legal node, and one full node can be split into two legal nodes (if there’s room to push one element up into the parent). These properties make it possible to delete and insert new values into a B-tree and adjust the tree to preserve the B-tree properties.

The root node’s number of children has the same upper limit as internal nodes, but has no lower limit. For example, when there are fewer than L−1 elements in the entire tree, the root will be the only node in the tree, with no children at all.

A B-tree of depth n+1 can hold about U times as many items as a B-tree of depth n, but the cost of search, insert, and delete operations grows with the depth of the tree. As with any balanced tree, the cost grows much more slowly than the number of elements.

B树:

  1. 定义任意非叶子结点最多只有M个儿子;且M>2;
  2. 根结点的儿子数为[2, M];
  3. 除根结点以外的非叶子结点的儿子数为[M/2, M];
  4. 每个结点存放至少M/2-1(取上整)和至多M-1个关键字;(至少2个关键字)
  5. 非叶子结点的关键字个数=指向儿子的指针个数-1;
  6. 非叶子结点的关键字:K[1], K[2], …, K[M-1];且K[i] < K[i+1];
  7. 非叶子结点的指针:P[1], P[2], …, P[M];其中P[1]指向关键字小于K[1]的子树,P[M]指向关键字大于K[M-1]的子树,其它P[i]指向关键字属于(K[i-1], K[i])的子树;
  8. 所有叶子结点位于同一层;
  9. 关键字集合分布在整颗树中;
  10. 任何一个关键字出现且只出现在一个结点中;
  11. 搜索有可能在非叶子结点结束;
  12. 其搜索性能等价于在关键字全集内做一次二分查找;

 B树的搜索,从根结点开始,对结点内的关键字(有序)序列进行二分查找,如果命中则结束,否则进入查询关键字所属范围的儿子结点;重复,直到所对应的儿子指针为空,或已经是叶子结点;

B+树:

  1. 其定义基本与B-树同,除了:
  2. 非叶子结点的子树指针与关键字个数相同;
  3. 非叶子结点的子树指针P[i],指向关键字值属于[K[i], K[i+1])的子树(B-树是开区间);
  4. 为所有叶子结点增加一个链指针;
  5. 所有关键字都在叶子结点出现;
  6. 所有关键字都出现在叶子结点的链表中(稠密索引),且链表中的关键字恰好是有序的;

  7. 不可能在非叶子结点命中;

  8. 非叶子结点相当于是叶子结点的索引(稀疏索引),叶子结点相当于是存储(关键字)数据的数据层;

  9. 更适合文件索引系统。

B*树

  1. 是B+树的变体,在B+树的非根和非叶子结点再增加指向兄弟的指针;
  2. B*树定义了非叶子结点关键字个数至少为(2/3)*M,即块的最低使用率为2/3(代替B+树的1/2);
  3. B*树分配新结点的概率比B+树要低,空间使用率更高;

以上。

部分节选自:http://www.cnblogs.com/oldhorse/archive/2009/11/16/1604009.html

原文地址:https://www.cnblogs.com/linyx/p/3991784.html