Notes on using HBase

Background

A table in HBase has its rowkey defined as timestamp + string.

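Requirement

Export one day's worth of data to Excel, selecting rows by the timestamp in the rowkey and keeping only those where a given column in the column family has the value "abc".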
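The one-day window has to become a start rowkey and a stop rowkey before it can be scanned. The post does not say how the timestamp is encoded, so the following is only a sketch under an assumption: the rowkey begins with epoch milliseconds written as a 13-digit decimal string. The class name DayRowkeyRange and its methods are hypothetical helpers, not part of HBase.

```java
import java.time.LocalDate;
import java.time.ZoneId;
import java.time.ZoneOffset;
import java.util.Locale;

class DayRowkeyRange {
    // Assumption: rowkey = 13-digit epoch milliseconds + arbitrary string.
    // Fixed-width digits make lexicographic order match chronological order.
    static String keyFor(LocalDate day, ZoneId zone) {
        long millis = day.atStartOfDay(zone).toInstant().toEpochMilli();
        return String.format(Locale.ROOT, "%013d", millis);
    }

    // First possible rowkey prefix of the day (a scan start row is inclusive).
    static String startKey(LocalDate day, ZoneId zone) {
        return keyFor(day, zone);
    }

    // First rowkey prefix of the following day (a scan stop row is exclusive).
    static String stopKey(LocalDate day, ZoneId zone) {
        return keyFor(day.plusDays(1), zone);
    }

    public static void main(String[] args) {
        LocalDate day = LocalDate.of(2018, 1, 17);
        System.out.println(startKey(day, ZoneOffset.UTC)); // 1516147200000
        System.out.println(stopKey(day, ZoneOffset.UTC));  // 1516233600000
    }
}
```

Using the next day's first key as the stop row avoids having to guess the largest possible suffix for the last millisecond of the day.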

Using FilterList

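        FilterList filterList = new FilterList(FilterList.Operator.MUST_PASS_ALL);
        // Keep only rows whose info:supplier column equals "abc"
        SingleColumnValueFilter filter = new SingleColumnValueFilter(Bytes.toBytes("info"), Bytes.toBytes("supplier"), CompareFilter.CompareOp.EQUAL, Bytes.toBytes("abc"));
        // Drop rows that do not have the column at all
        filter.setFilterIfMissing(true);
        filterList.addFilter(filter);

        List<String> list = new ArrayList<String>();
        List<ResultDTO> listSpider = new ArrayList<ResultDTO>();
        Scan scan = new Scan();
        // Rowkey range: start row is inclusive, stop row is exclusive
        scan.setStartRow(Bytes.toBytes(startKey));
        scan.setStopRow(Bytes.toBytes(endtKey));
        scan.setFilter(filterList);

        Connection conn = null;
        Table table = null;
        ResultScanner rs = null;
        try {
            conn = getConnection();
            // Use the Table interface rather than casting to HTable
            table = conn.getTable(TableName.valueOf(tableName));
            rs = table.getScanner(scan);
            for (Result r : rs) {
                // map each Result to a ResultDTO and add it to listSpider
            }
        } finally {
            // Release the scanner and the table once the scan is done
            if (rs != null) rs.close();
            if (table != null) table.close();
        }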

1. Rowkey range: set the start row and stop row of the scan.

2. Column value filtering: use SingleColumnValueFilter.

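By default, a row in which the column is missing (has no value) is still included in the results.

filter.setFilterIfMissing(true); // exclude rows where the column is missing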

Official documentation: To prevent the entire row from being emitted if the column is not found on a row, use setFilterIfMissing(boolean). Otherwise, if the column is found, the entire row will be emitted only if the value passes. If the value fails, the row will be filtered out.
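The rule quoted above can be illustrated without a cluster. The following is a plain-Java model of that documented behavior, not HBase code: rows are simple maps, and the class FilterIfMissingModel and its methods are hypothetical names chosen for this sketch.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

class FilterIfMissingModel {
    // Models the documented rule for an EQUAL comparison on one column.
    static boolean rowIsEmitted(Map<String, String> row, String column,
                                String expected, boolean filterIfMissing) {
        if (!row.containsKey(column)) {
            // Column not found: emitted by default, dropped when filterIfMissing is true.
            return !filterIfMissing;
        }
        // Column found: emitted only if the value passes the comparison.
        return expected.equals(row.get(column));
    }

    // Returns the rowkeys that would be emitted, in table order.
    static List<String> scan(Map<String, Map<String, String>> table, String column,
                             String expected, boolean filterIfMissing) {
        List<String> emitted = new ArrayList<>();
        for (Map.Entry<String, Map<String, String>> e : table.entrySet()) {
            if (rowIsEmitted(e.getValue(), column, expected, filterIfMissing)) {
                emitted.add(e.getKey());
            }
        }
        return emitted;
    }

    static Map<String, Map<String, String>> sampleTable() {
        Map<String, Map<String, String>> table = new LinkedHashMap<>();
        Map<String, String> r1 = new LinkedHashMap<>();
        r1.put("info:supplier", "abc");
        Map<String, String> r2 = new LinkedHashMap<>();
        r2.put("info:supplier", "xyz");
        Map<String, String> r3 = new LinkedHashMap<>();
        r3.put("info:name", "no supplier column");
        table.put("r1", r1);
        table.put("r2", r2);
        table.put("r3", r3);
        return table;
    }

    public static void main(String[] args) {
        Map<String, Map<String, String>> table = sampleTable();
        System.out.println(scan(table, "info:supplier", "abc", false)); // [r1, r3]
        System.out.println(scan(table, "info:supplier", "abc", true));  // [r1]
    }
}
```

With the default setting, row r3 slips through even though it has no supplier at all, which is why the export above calls setFilterIfMissing(true).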
Original article: https://www.cnblogs.com/davidwang456/p/8303152.html