The Simplest Way to Install Hive

1. Installation Modes

    The Hive website describes three installation modes, each suited to a different scenario.

    a. Embedded mode (metadata is stored in the embedded Derby database; only a single session can connect, and attempting additional concurrent sessions raises an error)

    b. Local mode (MySQL installed locally replaces Derby as the metadata store)

    c. Remote mode (MySQL installed on a remote host replaces Derby as the metadata store)

2. Prerequisites

Prerequisites: a Java JDK (1.7 or later) is installed, Hadoop is running, and MySQL is available.

3. Download and Extract

Download link: https://mirrors.tuna.tsinghua.edu.cn/apache/hive/
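For example, assuming the mirror keeps the release under hive-2.3.6/ (the exact layout may change over time), you can fetch the 2.3.6 tarball used below directly into /home:

wget -P /home https://mirrors.tuna.tsinghua.edu.cn/apache/hive/hive-2.3.6/apache-hive-2.3.6-bin.tar.gz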

cd /home
tar zxvf apache-hive-2.3.6-bin.tar.gz
mv apache-hive-2.3.6-bin hive

4. Configure Environment Variables

[root@192-168-22-220 hive]# vi /etc/profile
export JAVA_HOME=/usr/java/jdk1.8.0_231-amd64
export CLASSPATH=.:$JAVA_HOME/lib/dt.jar:$JAVA_HOME/lib/tools.jar
export HADOOP_HOME=/home/hadoop-2.10.0
export HADOOP_CONF_DIR=/home/hadoop-2.10.0/etc/hadoop
export HIVE_HOME=/home/hive
export HIVE_CONF_DIR=/home/hive/conf
export PATH=$JAVA_HOME/bin:$PATH:$HADOOP_HOME/bin:$HADOOP_HOME/sbin:$HIVE_HOME/bin
[root@192-168-22-220 hive]# source /etc/profile
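A quick sanity check that the variables took effect (hive --version only needs the PATH entries above and does not touch the metastore):

echo $HIVE_HOME
hive --version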

5. Edit the Hive Configuration

[root@192-168-22-220 conf]# cp hive-default.xml.template hive-site.xml
[root@192-168-22-220 conf]# mkdir /home/hive/warehouse
[root@192-168-22-220 conf]# vi hive-site.xml
# Delete the file's contents, keeping only the <configuration></configuration> element
# How to delete:
# Put the cursor on the line right after <configuration>, then in : (command) mode type .,$-1d and press Enter; this deletes from the current line through the second-to-last line
<property>
    <name>hive.security.authorization.enabled</name>
    <value>true</value>
</property>
<property>
    <name>hive.server2.enable.doAs</name>
    <value>false</value>
</property>
<property>
    <name>hive.security.authorization.createtable.owner.grants</name>
    <value>ALL</value>
</property>
<property>
    <name>hive.users.in.admin.role</name>
    <value>root</value>
</property>
<property>
    <name>hive.security.authorization.manager</name>
    <value>org.apache.hadoop.hive.ql.security.authorization.plugin.sqlstd.SQLStdHiveAuthorizerFactory</value>
</property>
<property>
    <name>hive.security.authenticator.manager</name>
    <value>org.apache.hadoop.hive.ql.security.SessionStateUserAuthenticator</value>
</property>
<property>
    <name>javax.jdo.option.ConnectionURL</name>
    <value>jdbc:mysql://192.168.22.220:3306/hive?createDatabaseIfNotExist=true&amp;characterEncoding=UTF-8&amp;useSSL=false</value>
</property>
<property>
    <name>javax.jdo.option.ConnectionDriverName</name>
    <value>com.mysql.jdbc.Driver</value>
</property>
<property>
    <name>javax.jdo.option.ConnectionUserName</name>
    <value>your database username</value>
</property>
<property>
    <name>javax.jdo.option.ConnectionPassword</name>
    <value>your database password</value>
</property>
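Note: the HDFS listings later in this post show the warehouse under /home/hive/warehouse rather than Hive's default /user/hive/warehouse, so the warehouse location has presumably also been set to match the directory created above. A sketch of that property (adjust the path to your environment):

<property>
    <name>hive.metastore.warehouse.dir</name>
    <value>/home/hive/warehouse</value>
</property>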

6. MySQL JDBC Driver

Download: https://blog.csdn.net/weixin_41804049/article/details/87719574

Put mysql-connector-java-5.1.46.jar into Hive's lib directory.
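For example, assuming the jar was downloaded to the current directory:

cp mysql-connector-java-5.1.46.jar /home/hive/lib/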

7. Initialize the Metastore Database

schematool -dbType mysql -initSchema    # if this succeeds, a new hive database appears in MySQL

If you hit errors, see http://www.lzhpo.com/article/98.
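To double-check that the schema was created, you can list the metastore tables straight from MySQL (substitute your own credentials):

mysql -uroot -p -e "show tables from hive;"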

8. Start Hive

Before running Hive, make sure the Hadoop cluster and the MySQL service are both up and running.

There are three common ways to start Hive:

Local CLI startup

# Since the environment variables are configured, you can start the local CLI by simply typing hive on the command line
[root@192-168-22-220 ~]# hive       
which: no hbase in (/usr/java/jdk1.8.0_231-amd64/bin:/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/home/hadoop-2.10.0/bin:/home/hadoop-2.10.0/sbin:/home/hive/bin:/root/bin)
SLF4J: Class path contains multiple SLF4J bindings.
SLF4J: Found binding in [jar:file:/home/hive/lib/log4j-slf4j-impl-2.6.2.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: Found binding in [jar:file:/home/hadoop-2.10.0/share/hadoop/common/lib/slf4j-log4j12-1.7.25.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an explanation.
SLF4J: Actual binding is of type [org.apache.logging.slf4j.Log4jLoggerFactory]

Logging initialized using configuration in jar:file:/home/hive/lib/hive-common-2.3.6.jar!/hive-log4j2.properties Async: true
Hive-on-MR is deprecated in Hive 2 and may not be available in the future versions. Consider using a different execution engine (i.e. spark, tez) or using Hive 1.X releases.
hive>
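The CLI can also run a single statement and exit via the -e flag, for example:

hive -e "show databases;"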

HiveServer2 / beeline client startup

When Hive is started with the plain hive CLI, it can only be used on the node where Hive is installed; other nodes in the Hadoop cluster cannot reach it. To work with Hive from other nodes, use the HiveServer2 / beeline approach.

First, modify the Hadoop cluster's hdfs-site.xml:
# add one property to enable webhdfs (optional)
<property>
    <name>dfs.webhdfs.enabled</name>
    <value>true</value>
</property>
# whether HiveServer2 requires authentication; NONE disables authentication for now (optional).
# Note: hive.server2.authentication is a Hive property, so it belongs in hive-site.xml, not hdfs-site.xml.
<property>
    <name>hive.server2.authentication</name>
    <value>NONE</value>
</property>
Second, modify the Hadoop cluster's core-site.xml:
# add two properties that configure Hadoop's proxy user
<property>
    <name>hadoop.proxyuser.root.hosts</name>
    <value>*</value>
</property>
<property>
    <name>hadoop.proxyuser.root.groups</name>
    <value>*</value>
</property>
Restart Hadoop: stop-all.sh, then start-all.sh.
# Explanation: setting hadoop.proxyuser.root.hosts to * lets the proxy user root access the HDFS cluster from any node;
# hadoop.proxyuser.root.groups specifies which groups the proxied users may belong to (* allows all).
Step 1: start the hiveserver2 service.

Start in the foreground: hiveserver2

Start in the background: nohup hiveserver2 1>/dev/null 2>/dev/null &
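To confirm HiveServer2 is up, you can check that it is listening on its default port 10000 (a sketch, assuming the ss utility is available):

ss -lntp | grep 10000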

Step 2: connect with the beeline client:

beeline -u jdbc:hive2://127.0.0.1:10000 -n hadoop    # -u: the HiveServer2 JDBC connection URL; -n: the username to connect as (use -p if a password is required)

Alternatively, run beeline first, then type !connect jdbc:hive2://127.0.0.1:10000 and press Enter; when prompted, enter the username, which is the user the Hadoop cluster was installed under.
The default HiveServer2 port is 10000.
Once connected over JDBC with beeline, you can work with Hive just as you would in the local client.
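Beeline can likewise execute a single statement and exit with -e, for example (using the root user, as elsewhere in this setup):

beeline -u jdbc:hive2://127.0.0.1:10000 -n root -e "show databases;"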

hiveserver2 also starts a web UI, on port 10002 by default, reachable at http://localhost:10002/.
The page shows Session, Query, Software and other information. (It is view-only; you cannot operate on the Hive warehouse through it.)

Starting the Hive Web Interface (hwi) to access Hive from a browser

Hive used to provide a web GUI for accessing the Hive warehouse;
hwi could be started with the following command and listens on port 9999 by default:

$HIVE_HOME/bin/hive --service hwi
However, hwi is no longer supported as of Hive 2.2.0, so it is not covered further here.

9. Inspect HDFS

[root@192-168-22-220 ~]# hadoop dfs -lsr /
DEPRECATED: Use of this script to execute hdfs command is deprecated.
Instead use the hdfs command for it.

lsr: DEPRECATED: Please use 'ls -R' instead.
drwx-wx-wx   - root supergroup          0 2019-12-12 18:33 /tmp
drwx-wx-wx   - root supergroup          0 2019-12-12 18:33 /tmp/hive
drwx------   - root supergroup          0 2019-12-13 09:54 /tmp/hive/root
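As the warnings above indicate, the non-deprecated equivalent of the command is:

hdfs dfs -ls -R /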

10. Test: Creating a Database in Hive

0: jdbc:hive2://127.0.0.1:10000> create database hive1;   # create a database
No rows affected (0.582 seconds)
List the databases:
0: jdbc:hive2://127.0.0.1:10000> show databases;
+----------------+
| database_name  |
+----------------+
| default        |
| hive1          |
+----------------+
2 rows selected (0.232 seconds)

The database was created successfully!

11. Check Again

[root@192-168-22-220 ~]# hadoop dfs -lsr /
DEPRECATED: Use of this script to execute hdfs command is deprecated.
Instead use the hdfs command for it.

lsr: DEPRECATED: Please use 'ls -R' instead.
drwxr-xr-x   - root supergroup          0 2019-12-13 13:17 /home
drwxr-xr-x   - root supergroup          0 2019-12-13 13:17 /home/hive
drwxr-xr-x   - root supergroup          0 2019-12-13 13:17 /home/hive/warehouse
drwxr-xr-x   - root supergroup          0 2019-12-13 13:17 /home/hive/warehouse/hive1.db
drwx-wx-wx   - root supergroup          0 2019-12-12 18:33 /tmp
drwx-wx-wx   - root supergroup          0 2019-12-12 18:33 /tmp/hive
drwx------   - root supergroup          0 2019-12-13 11:19 /tmp/hive/root
[root@192-168-22-220 ~]# mysql -uroot -p
Enter password:
Welcome to the MySQL monitor.  Commands end with ; or \g.
Your MySQL connection id is 33
Server version: 5.7.28 MySQL Community Server (GPL)

Copyright (c) 2000, 2019, Oracle and/or its affiliates. All rights reserved.

Oracle is a registered trademark of Oracle Corporation and/or its
affiliates. Other names may be trademarks of their respective
owners.

Type 'help;' or '\h' for help. Type '\c' to clear the current input statement.

mysql> show databases;
+--------------------+
| Database           |
+--------------------+
| information_schema |
| hive               |
| mysql              |
| performance_schema |
| sys                |
+--------------------+
5 rows in set (0.00 sec)

mysql> select * from hive.DBS;
+-------+-----------------------+---------------------------------------------------------+---------+------------+------------+
| DB_ID | DESC                  | DB_LOCATION_URI                                         | NAME    | OWNER_NAME | OWNER_TYPE |
+-------+-----------------------+---------------------------------------------------------+---------+------------+------------+
|     1 | Default Hive database | hdfs://192.168.22.220:9000/home/hive/warehouse          | default | public     | ROLE       |
|     2 | NULL                  | hdfs://192.168.22.220:9000/home/hive/warehouse/hive1.db | hive1   | root       | USER       |
+-------+-----------------------+---------------------------------------------------------+---------+------------+------------+
2 rows in set (0.00 sec)

12. Test: Creating a Table in Hive

0: jdbc:hive2://127.0.0.1:10000> use hive1;
No rows affected (0.277 seconds)
0: jdbc:hive2://127.0.0.1:10000> create table hive_table (id int,name string);
No rows affected (1.409 seconds)
0: jdbc:hive2://127.0.0.1:10000> show tables;
+-------------+
|  tab_name   |
+-------------+
| hive_table  |
+-------------+
1 row selected (0.263 seconds)
0: jdbc:hive2://127.0.0.1:10000> insert into hive_table (id,name) values (1,'cgc');
WARNING: Hive-on-MR is deprecated in Hive 2 and may not be available in the future versions. Consider using a different execution engine (i.e. spark, tez) or using Hive 1.X releases.
No rows affected (38.199 seconds)
0: jdbc:hive2://127.0.0.1:10000> select * from hive_table;
+----------------+------------------+
| hive_table.id  | hive_table.name  |
+----------------+------------------+
| 1              | cgc              |
+----------------+------------------+
1 row selected (0.575 seconds)

13. Final Verification

mysql> select * from hive.TBLS;
+--------+-------------+-------+------------------+-------+-----------+-------+------------+---------------+--------------------+--------------------+--------------------+
| TBL_ID | CREATE_TIME | DB_ID | LAST_ACCESS_TIME | OWNER | RETENTION | SD_ID | TBL_NAME   | TBL_TYPE      | VIEW_EXPANDED_TEXT | VIEW_ORIGINAL_TEXT | IS_REWRITE_ENABLED |
+--------+-------------+-------+------------------+-------+-----------+-------+------------+---------------+--------------------+--------------------+--------------------+
|      1 |  1576215017 |     2 |                0 | root  |         0 |     1 | hive_table | MANAGED_TABLE | NULL               | NULL               |                    |
+--------+-------------+-------+------------------+-------+-----------+-------+------------+---------------+--------------------+--------------------+--------------------+
1 row in set (0.00 sec)
[root@192-168-22-220 ~]# hadoop dfs -lsr /
DEPRECATED: Use of this script to execute hdfs command is deprecated.
Instead use the hdfs command for it.

lsr: DEPRECATED: Please use 'ls -R' instead.
drwxr-xr-x   - root supergroup          0 2019-12-13 13:17 /home
drwxr-xr-x   - root supergroup          0 2019-12-13 13:17 /home/hive
drwxr-xr-x   - root supergroup          0 2019-12-13 13:17 /home/hive/warehouse
drwxr-xr-x   - root supergroup          0 2019-12-13 13:30 /home/hive/warehouse/hive1.db
drwxr-xr-x   - root supergroup          0 2019-12-13 13:30 /home/hive/warehouse/hive1.db/hive_table
drwx-wx-wx   - root supergroup          0 2019-12-12 18:33 /tmp
drwx-wx-wx   - root supergroup          0 2019-12-12 18:33 /tmp/hive
drwx------   - root supergroup          0 2019-12-13 11:19 /tmp/hive/root

Author: 陈耿聪 (夕狱)

Source: https://www.cnblogs.com/CGCong/

This article is jointly copyrighted by the author and cnblogs. Reposting is welcome, but unless the author agrees otherwise, this notice must be retained and a link to the original given in a prominent place on the page; the author reserves the right to pursue legal liability otherwise.

Original post: https://www.cnblogs.com/CGCong/p/12033482.html