High Performance MySQL Notes - Chapter 5 Indexing for High Performance - 002 Hash indexes

I.

1. What is a hash index

A hash index is built on a hash table and is useful only for exact lookups that use every column in the index.

For each row, the storage engine computes a hash code of the indexed columns, which is a small value that will probably differ from the hash codes computed for other rows with different key values. It stores the hash codes in the index and stores a pointer to each row in a hash table.

In MySQL, only the Memory storage engine supports explicit hash indexes. They are the default index type for Memory tables, though Memory tables can have B-Tree indexes, too. The Memory engine supports nonunique hash indexes, which is unusual in the database world. If multiple values have the same hash code, the index will store their row pointers in the same hash table entry, using a linked list.
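
As a small illustration of the point that Memory tables default to hash indexes but also accept B-Tree indexes, the sketch below declares one of each. The table and index names (memidx_demo, idx_code, idx_price) are hypothetical and do not come from the book.

```sql
-- Hypothetical table, used only for illustration.
CREATE TABLE memidx_demo (
    code  CHAR(10) NOT NULL,
    price INT      NOT NULL,
    KEY idx_code (code),               -- no index type given: Memory defaults to HASH
    KEY idx_price USING BTREE (price)  -- explicit B-Tree index on a Memory table
) ENGINE=MEMORY;

-- The Index_type column of the result should report HASH for idx_code
-- and BTREE for idx_price.
SHOW INDEX FROM memidx_demo;
```

Both indexes here are nonunique, which the Memory engine allows for hash indexes as described above.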

2. Hash index data structure

-- Data structure of a hash index
CREATE TABLE testhash (
    fname VARCHAR(50) NOT NULL,
    lname VARCHAR(50) NOT NULL,
    KEY USING HASH(fname)
) ENGINE=MEMORY;
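
To see the index at work, one could load a few rows into testhash and run an exact-match lookup on fname. The rows below are illustrative sample data, and the comments describe the lookup steps as the text above explains them rather than anything observed from the server.

```sql
-- Illustrative rows; any values would do.
INSERT INTO testhash (fname, lname) VALUES
    ('Arjen', 'Lentz'),
    ('Baron', 'Schwartz'),
    ('Peter', 'Zaitsev'),
    ('Vadim', 'Tkachenko');

-- Exact lookup on the full indexed column: the engine hashes 'Peter',
-- finds that hash code in the index, follows the stored row pointer,
-- and then compares the row's fname with 'Peter' to confirm the match.
SELECT lname FROM testhash WHERE fname = 'Peter';
```

The final comparison against the row matters because two different fname values may share one hash code.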

3. Disadvantages of hash indexes

Because the indexes themselves store only short hash values, hash indexes are very compact. As a result, lookups are usually lightning fast. However, hash indexes have some limitations:
• Because the index contains only hash codes and row pointers rather than the values themselves, MySQL can’t use the values in the index to avoid reading the rows. Fortunately, accessing the in-memory rows is very fast, so this doesn’t usually degrade performance.

• MySQL can’t use hash indexes for sorting because they don’t store rows in sorted order.
• Hash indexes don’t support partial key matching, because they compute the hash from the entire indexed value. That is, if you have an index on (A,B) and your query’s WHERE clause refers only to A, the index won’t help.
• Hash indexes support only equality comparisons that use the =, IN(), and <=> operators (note that <> and <=> are not the same operator: <> means "not equal", whereas <=> is the NULL-safe equality comparison). They can’t speed up range queries, such as WHERE price > 100.
• Accessing data in a hash index is very quick, unless there are many collisions (multiple values with the same hash). When there are collisions, the storage engine must follow each row pointer in the linked list and compare their values to the lookup value to find the right row(s).
• Some index maintenance operations can be slow if there are many hash collisions. For example, if you create a hash index on a column with a very low selectivity (many hash collisions) and then delete a row from the table, finding the pointer from the index to that row might be expensive. The storage engine will have to examine each row in that hash key’s linked list to find and remove the reference to the one row you deleted.
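
The limitations listed above can be tried out with a small sketch. It assumes a hypothetical Memory table named hashlimits with a composite hash index on (a, b) and a single-column hash index on price; none of these names come from the book. The comments state what the list above predicts, not captured EXPLAIN output, so the actual plans should be checked on your own server.

```sql
-- Hypothetical table, used only for illustration.
CREATE TABLE hashlimits (
    a     INT NOT NULL,
    b     INT NOT NULL,
    price INT NOT NULL,
    KEY idx_ab USING HASH (a, b),
    KEY idx_price USING HASH (price)
) ENGINE=MEMORY;

-- A few made-up rows so the optimizer has something to plan against.
INSERT INTO hashlimits (a, b, price) VALUES
    (1, 2, 100), (1, 3, 150), (2, 2, 200), (3, 4, 250);

-- Equality on every indexed column: idx_ab is usable.
EXPLAIN SELECT * FROM hashlimits WHERE a = 1 AND b = 2;

-- IN() is a set of equality comparisons: idx_price is usable.
EXPLAIN SELECT * FROM hashlimits WHERE price IN (100, 200);

-- Partial key: only a is given, so idx_ab cannot help.
EXPLAIN SELECT * FROM hashlimits WHERE a = 1;

-- Range condition: a hash index cannot speed this up.
EXPLAIN SELECT * FROM hashlimits WHERE price > 100;

-- Sorting: hash entries are not stored in sorted order, so expect a sort step.
EXPLAIN SELECT * FROM hashlimits ORDER BY price;
```

Declaring idx_price with USING BTREE instead would be one way to make the range and ORDER BY queries indexable on a Memory table.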

The InnoDB storage engine has a special feature called adaptive hash indexes. When InnoDB notices that some index values are being accessed very frequently, it builds a hash index for them in memory on top of B-Tree indexes. This gives its B-Tree indexes some properties of hash indexes, such as very fast hashed lookups. This process is completely automatic, and you can’t control or configure it, although you can disable the adaptive hash index altogether.
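
The on/off switch mentioned above is exposed as the server variable innodb_adaptive_hash_index. A minimal sketch, assuming a session with the privileges needed to change global variables:

```sql
-- Check whether the adaptive hash index is currently enabled.
SHOW VARIABLES LIKE 'innodb_adaptive_hash_index';

-- Disable it server-wide (requires the privilege to set global variables).
SET GLOBAL innodb_adaptive_hash_index = OFF;

-- The "INSERT BUFFER AND ADAPTIVE HASH INDEX" section of this report
-- shows hash search activity.
SHOW ENGINE INNODB STATUS;
```

Whether disabling it helps depends on the workload, so it is worth measuring before and after the change.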