HBase data copy

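The production data set is very large, and querying it directly could affect normal access, so part of the data needs to be copied into a temporary table.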

bin/hbase org.apache.hadoop.hbase.mapreduce.CopyTable [--starttime=X] [--endtime=Y] [--new.name=NEW] [--peer.adr=ADR] tablename
bin/hbase org.apache.hadoop.hbase.mapreduce.CopyTable --starttime=1265875194289 --endtime=1265878794289 --peer.adr=server1,server2,server3:2181:/hbase --families=myOldCf:myNewCf,cf2,cf3 TestTable
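
The --starttime and --endtime values are Unix timestamps in milliseconds. A minimal sketch of deriving them from readable dates follows; it assumes GNU date, the helper name to_epoch_ms and the two dates are made up for illustration, and the command is only printed, not executed.

```shell
#!/usr/bin/env bash
# Assumes GNU date (the -d option); BSD/macOS date uses different flags.

# Hypothetical helper: convert a UTC date string to epoch milliseconds.
to_epoch_ms() {
  local seconds
  seconds=$(date -u -d "$1" +%s) || return 1
  echo $((seconds * 1000))
}

# Illustrative window: one full UTC day.
START_MS=$(to_epoch_ms '2014-10-29 00:00:00')
END_MS=$(to_epoch_ms '2014-10-30 00:00:00')

# Print the command instead of running it, so the values can be checked first.
echo "bin/hbase org.apache.hadoop.hbase.mapreduce.CopyTable --starttime=$START_MS --endtime=$END_MS --new.name=test2 test"
```

Printing the command first makes it easy to spot a value given in seconds (10 digits) where milliseconds (13 digits) were intended.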

Complete example:
bin/hbase org.apache.hadoop.hbase.mapreduce.CopyTable --starttime=1414550495961 --endtime=1414551715976 --new.name=test2 --families=cf --peer.adr=10.0.0.26,10.0.0.27,10.0.0.37,10.0.0.25,10.0.0.35,10.0.0.30,10.0.0.58:2181:/hbase test

hbase(main):039:0> scan 'test'
ROW COLUMN+CELL
key1 column=cf:, timestamp=1414550495961, value=a:1
key2 column=cf:, timestamp=1414550523026, value=a:2
key3 column=cf:, timestamp=1414551715976, value=a:3

hbase(main):040:0> scan 'test2'
ROW COLUMN+CELL
key1 column=cf:, timestamp=1414550495961, value=a:1
key2 column=cf:, timestamp=1414550523026, value=a:2
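Note that the time range is closed at the start and open at the end (starttime <= timestamp < endtime), so only two rows were copied above: key3 has a timestamp equal to endtime and is therefore excluded.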
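
The scan output above is consistent with a half-open window: the row whose timestamp equals starttime is copied, and the row whose timestamp equals endtime is not. The sketch below replays that check on the three timestamps from the 'test' table. The function in_window is a hypothetical helper written for this illustration, not part of HBase.

```shell
#!/usr/bin/env bash
# Hypothetical helper: succeeds when a cell timestamp lies in [start, end).
in_window() {
  local ts=$1 start=$2 end=$3
  [ "$ts" -ge "$start" ] && [ "$ts" -lt "$end" ]
}

START=1414550495961
END=1414551715976

# Timestamps of key1, key2 and key3 from the scan of 'test'.
for ts in 1414550495961 1414550523026 1414551715976; do
  if in_window "$ts" "$START" "$END"; then
    echo "$ts copied"
  else
    echo "$ts skipped"
  fi
done

# To include the last cell as well, move the end bound one millisecond later.
END_INCLUSIVE=$((END + 1))
echo "--starttime=$START --endtime=$END_INCLUSIVE"
```

Passing the larger end bound as --endtime should also pick up key3, assuming no other cells share that extra millisecond.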

bin/hbase org.apache.hadoop.hbase.mapreduce.CopyTable --starttime=1407600000 --endtime=1407600002 --new.name=logtest --families=cf --peer.adr=10.0.0.26,10.0.0.27,10.0.0.37,10.0.0.25,10.0.0.35,10.0.0.30,10.0.0.58:2181:/hbase logtable


org.apache.hadoop.util.DiskChecker$DiskErrorException: Could not find any valid local directory for taskTracker/jobcache/job_201108311619_0703/attempt_201108311619_0703_m_00007

Original article: https://www.cnblogs.com/chengxin1982/p/4062205.html