Hbase+Phoenix环境部署和使用

功能: Mysql数据迁移至Hbase, 通过Phoenix框架作为操作Hbase的CRUD层, 平滑过渡sql语法使用, 解决Mysql单表数据量过大导致查询效率低的问题.

Hbase的性能自然不用多说, PB级别的数据处理起来也没什么问题. 看一下Phoenix官方给出的性能介绍: http://phoenix.apache.org/faq.html#How_fast_is_Phoenix_Why_is_it_so_fast

1. Hbase Phoenix 环境安装

Hbase环境安装属于大数据领域的知识, Java开发程序员一般不太熟悉, 尤其还要涉及到Hadoop等框架的知识, 安装过程特别繁琐, 容易砸键盘.
所以最好采用docker方式安装, 部署方便, 开发测试的话也够用了. 上生产就交给大牛和运维吧! 你要是大牛, 你必须上 ^_^

docker安装和docker-compose使用, 自行了解.

环境部署:【 原文地址: https://gitee.com/assad/docker-hbase

  1. 下载仓库
    git clone https://gitee.com/assad/docker-hbase.git

  2. docker-compose 构建基础镜像。
    make build

  3. 运行 Standalone 版本
    Standalone 版本依赖结构与 quickstart HBase documentation 一致,只不过使用了 Hadoop 的 HDFS 版本作为底层储存。主要用于开发测试环境的快速搭建。
    docker-compose -f docker-compose-standalone.yml up -d

  4. 运行 Local Distributed 伪分布式版本
    运行该版本会使用一个独立的 Zookeeper,HMaster 节点和 HResgionserver 节点也会分开为独立容器。
    docker-compose -f docker-compose-distributed-local.yml up -d

查看docker运行状态: docker ps --format "table {{.ID}} {{.Names}} {{.Ports}} {{.Status}}"

CONTAINER ID        NAMES               PORTS                                                                                                                                                                            STATUS
82523e8c14a7        nodemanager         8042/tcp               Up 5 hours (healthy)
bf81f61f2df9        datanode            50075/tcp              Up 5 hours (healthy)
1a9e2f1439e2        namenode            50070/tcp              Up 5 hours (healthy)
1473a0330fc6        historyserver       8188/tcp               Up 5 hours (healthy)
65c4e55c4c02        resourcemanager     8088/tcp               Up 5 hours (healthy)
977b7ad4f4ef        hbase               0.0.0.0:2181->2181/tcp, 0.0.0.0:2888->2888/tcp, 0.0.0.0:3888->3888/tcp, 0.0.0.0:16000->16000/tcp, 0.0.0.0:16010->16010/tcp, 0.0.0.0:16020->16020/tcp, 0.0.0.0:16030->16030/tcp Up 5 hours

查看hbase容器日志: docker logs -f hbase

进入hbase容器base: docker exec -it hbase /bin/bash

服务器,添加host映射 vi /etc/profile

服务器ip  hbase

访问hbase管理页面: http://服务器ip:16010/master-status

2. Hbase shell 使用

进入hbase容器base: docker exec -it hbase /bin/bash
启动hbase shell客户端: hbase shell

[root@hbase ~]# docker exec -it hbase /bin/bash
root@hbase:/# hbase shell
HBase Shell
Use "help" to get list of supported commands.
Use "exit" to quit this interactive shell.
Version 2.0.0, r7483b111e4da77adbfc8062b3b22cbe7c2cb91c1, Sun Apr 22 20:26:55 PDT 2018
Took 0.0031 seconds                                                                                                                                                                                               
hbase(main):001:0> 

基本操作:

  1. 创建表 create <tableName> , <colFamily> [ ,<colFamily1> ,... ,<colFamilyN> ]
  2. 添加数据 put <tableName>, <rowKey> , < colFamily : col >,value [ , stampTime ]
  3. 查询数据 get <tableName>, <rowKey> , < colFamily : col >
  4. 查询所有数据 scan <tableName>
hbase(main):002:0> create 'user', 'info'
Created table user
Took 2.6171 seconds                                                                                                                                                                                               
=> Hbase::Table - user
hbase(main):003:0> put 'user', '001', 'info:name', 'linyufeng01'
Took 0.2252 seconds                                                                                                                                                                                               
hbase(main):004:0> get 'user', '001'
COLUMN                                                CELL                                                                                                                                                        
 info:name                                            timestamp=1623618903615, value=linyufeng01                                                                                                                  
1 row(s)
Took 0.0576 seconds                                                                                                                                                                                               
hbase(main):005:0> put 'user', '001', 'info:age', '12'
Took 0.0079 seconds                                                                                                                                                                                               
hbase(main):006:0> get 'user', '001'
COLUMN                                                CELL                                                                                                                                                        
 info:age                                             timestamp=1623618951968, value=12                                                                                                                           
 info:name                                            timestamp=1623618903615, value=linyufeng01                                                                                                                  
1 row(s)
Took 0.0102 seconds                                                                                                                                                                                               
hbase(main):007:0> get 'user', '001', 'info:name'
COLUMN                                                CELL                                                                                                                                                        
 info:name                                            timestamp=1623618903615, value=linyufeng01                                                                                                                  
1 row(s)
Took 0.0084 seconds                                                                                                                                                                                               
hbase(main):008:0> scan 'user'
ROW                                                   COLUMN+CELL                                                                                                                                                 
 001                                                  column=info:age, timestamp=1623618951968, value=12                                                                                                          
 001                                                  column=info:name, timestamp=1623618903615, value=linyufeng01                                                                                                
1 row(s)
Took 0.0138 seconds                                                                                                                                                                                               
hbase(main):009:0> 

4. Phoenix shell 使用

下载 http://www.apache.org/dyn/closer.lua/phoenix/apache-phoenix-5.0.0-HBase-2.0/bin/apache-phoenix-5.0.0-HBase-2.0-bin.tar.gz
解压 tar -zxvf apache-phoenix-5.0.0-HBase-2.0-bin.tar.gz, 进入bin目录
启动 Phoenix shell 客户端 ./sqlline.py localhost

[root@hbase bin]# ./sqlline.py localhost
0: jdbc:phoenix:localhost> !tables
+------------+--------------+--------------------------+---------------+----------+------------+----------------------------+-----------------+--------------+-----------------+---------------+---------------+-+
| TABLE_CAT  | TABLE_SCHEM  |        TABLE_NAME        |  TABLE_TYPE   | REMARKS  | TYPE_NAME  | SELF_REFERENCING_COL_NAME  | REF_GENERATION  | INDEX_STATE  | IMMUTABLE_ROWS  | SALT_BUCKETS  | MULTI_TENANT  | |
+------------+--------------+--------------------------+---------------+----------+------------+----------------------------+-----------------+--------------+-----------------+---------------+---------------+-+
|            |              | IDX_T_LOG                | INDEX         |          |            |                            |                 | ACTIVE       | false           | 16            | false         | |
|            |              | IDX_T_LOG_UID            | INDEX         |          |            |                            |                 | ACTIVE       | false           | 16            | false         | |
|            | SYSTEM       | CATALOG                  | SYSTEM TABLE  |          |            |                            |                 |              | false           | null          | false         | |
|            | SYSTEM       | FUNCTION                 | SYSTEM TABLE  |          |            |                            |                 |              | false           | null          | false         | |
|            | SYSTEM       | LOG                      | SYSTEM TABLE  |          |            |                            |                 |              | true            | 32            | false         | |
|            | SYSTEM       | SEQUENCE                 | SYSTEM TABLE  |          |            |                            |                 |              | false           | null          | false         | |
|            | SYSTEM       | STATS                    | SYSTEM TABLE  |          |            |                            |                 |              | false           | null          | false         | |
+------------+--------------+--------------------------+---------------+----------+------------+----------------------------+-----------------+--------------+-----------------+---------------+---------------+-+
0: jdbc:phoenix:localhost> 

基本sql语法

CREATE TABLE IF NOT EXISTS t_user (
      name VARCHAR(20) NOT NULL,
      age integer(3) NOT NULL,
      address VARCHAR,
      updatetime timestamp,
      inserttime timestamp
      CONSTRAINT pk_name_age PRIMARY KEY (name, age)// 联合主键
);

// 添加和更新数据都用 upsert, 没有 insert 和 update
UPSERT INTO t_user (name, age, address) values ('test01',12,'上海市松江区');
UPSERT INTO t_user (name, age, address) values ('test02',13,'上海市金山区');

SELECT * FROM t_user;
SELECT COUNT(*) FROM t_user;
0: jdbc:phoenix:localhost> SELECT * FROM t_user;
+---------+------+----------+-------------+-------------+
|  NAME   | AGE  | ADDRESS  | UPDATETIME  | INSERTTIME  |
+---------+------+----------+-------------+-------------+
| test01  | 12   | 上海市松江区   |             |             |
| test02  | 13   | 上海市金山区   |             |             |
+---------+------+----------+-------------+-------------+

4. Hbase Phoenix 数据库连接工具

可视化数据库连接工具势必可少的, 就像Navicat一样. 这里可以使用 squirrel sqlDbeaver
推荐Dbeaver V6.1.5, 高版本需要Jdk11才能运行

1. 下载安装

下载页面: https://github.com/dbeaver/dbeaver/releases?after=6.2.0
windows 64位下载地址: https://github.com/dbeaver/dbeaver/releases/download/6.1.5/dbeaver-ce-6.1.5-x86_64-setup.exe

安装完成后, 如果启动报错, 在安装目录找到 dbeaver.ini 文件, 在首行添加本机jdk安装目录:

-vm
C:Program FilesJavajdk1.8.0_201in // 以本机为准

复制 phoenix-5.0.0-HBase-2.0-client.jar 到根目录, 后边需要加载驱动

2. 配置连接

新建连接

配置驱动

选择本地jar包驱动, 不要使用自动下载的Hadoop Phoenix驱动

连接测试

5. SpringBoot集成

乐意用MyBatis就用, 用MyBatisPlus也行

依赖:

<dependency>
    <groupId>mysql</groupId>
    <artifactId>mysql-connector-java</artifactId>
</dependency>

<dependency>
    <groupId>com.baomidou</groupId>
    <artifactId>mybatis-plus-boot-starter</artifactId>
    <version>3.1.0</version>
</dependency>

<!--phoenix core-->
<dependency>
    <groupId>org.apache.phoenix</groupId>
    <artifactId>phoenix-core</artifactId>
    <version>5.0.0-HBase-2.0</version>
</dependency>

<dependency>
    <groupId>org.apache.hadoop</groupId>
    <artifactId>hadoop-common</artifactId>
    <version>2.9.2</version>
</dependency>

配置:

spring.datasource.driver-class-name=org.apache.phoenix.jdbc.PhoenixDriver
spring.datasource.url=jdbc:phoenix:服务器ip:2181
spring.datasource.username=
spring.datasource.password=

官方说,不需要数据库连接池 http://phoenix.apache.org/faq.html#Should_I_pool_Phoenix_JDBC_Connections

其他配置跟普通项目一样, bean, dao, service 等等该怎么建就怎么建, 都一样.
唯一的区别就是, 在Phoenix中, 把 insertupdate 语句合并成了 upsert语句. 不要再写 insert into , 用 upsert into来代替

Bean

@Data
@ToString
@TableName("t_user")
public class User {
    private String  name;
    private Integer age;
    private String  address;
    private Date    inserttime;
    private Date    updatetime;
}

dao

@Mapper
public interface UserMapper {

    @Select("select * from t_user")
    List<User> list();

    @Insert("UPSERT INTO t_user (name, age, address, inserttime, updatetime) " +
            "values (#{name}, #{age}, #{address}, #{inserttime}, #{updatetime})")
    void insert(User user);
}

配置Hadoop环境

参考: 【 https://blog.csdn.net/weixin_41122339/article/details/81141913

也可以采用非Hadoop方案
在项目根目录添加bin目录, 拷贝winutils.exe到此目录, 修改 SpringBootApplication

@SpringBootApplication
public class HbasephoenixApplication {

    public static void main(String[] args) {
        // 或设置 HADOOP_HOME
        System.setProperty("hadoop.home.dir", System.getProperty("user.dir"));
        SpringApplication.run(HbasephoenixApplication.class, args);
    }

}

单元测试

@SpringBootTest
@RunWith(SpringJUnit4ClassRunner.class)
class HbasephoenixApplicationTests {
    
    @Autowired
    private UserMapper userMapper;
    
    @Test
    void list() {
        List<User> users = userMapper.list();
        users.forEach(System.out::println);
    }
    
    @Test
    void add() {
        User user = new User();
        user.setName("test03");
        user.setAge(14);
        user.setAddress("SHANGHAI SONGJIANG");
        user.setInserttime(new Date());
        user.setUpdatetime(new Date());
        userMapper.insert(user);
        list();
    }
    
}

Bug

1. nullinwinutils.exe

https://blog.csdn.net/weixin_41122339/article/details/81141913

2. certificate has expired or is not yet valid

https://blog.csdn.net/qq_39562468/article/details/84592139

3. spark-phoenix 一直出现一个 bug?

http://www.hainiubl.com/topics/36024

参考:

作者:林宇风
版权所有,侵权必究!标明出处,欢迎转载。
原文地址:https://www.cnblogs.com/linyufeng/p/14892451.html