因为HBase 数据储存按照 row key 排序,如果HBase表的 row key 是单调递增的,则HBase 容易有RegionServer 的局部热点问题。加盐可以缓解这个问题。
create table H3 (id varchar not null primary key, cf1.a varchar, cf2.b varchar) SALT_BUCKETS=20;
只能在创建表格时候加,创建后不可更改。
alter table h1 set salt_buckets=10;
Error: ERROR 1024 (42Y83): Salt bucket number may only be specified when creating a table. tableName=H1
加盐后的注意事项:
a、sequential scan 返回的结果可能不是自然排序的,如果sequential scan使用了LIMIT语句,将与不加盐的情况不一样。
b、 Spit point:If no split points are specified for the table, the salted table would be pre-split on salt bytes boundaries to ensure load distribution among region servers even during the initial phase of the table. If users are to provide split points manually, users need to include a salt byte in the split points they provide.
实际上是改写了Row Key,添加了一个prefix
new_row_key = (++index % BUCKETS_NUMBER) + original_key