How to Connect to Hive

Client CLI commands

By default, Hive 3 still runs queries on MapReduce and warns that this engine is deprecated:

Hive-on-MR is deprecated in Hive 2 and may not be available in the future versions. Consider using a different execution engine (i.e. spark, tez) or using Hive 1.X releases.
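
If Tez or Spark is installed and configured on the cluster, the execution engine can be switched per session instead (a hedged example; the setup described in this article stays on MapReduce):

set hive.execution.engine=tez;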

Startup option 1: ./hive

./hive

Startup option 2: beeline

  • beeline

  1. Start hiveserver2 first (a minimal startup sketch follows this list)
  2. Start beeline and connect
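
  A minimal way to start hiveserver2 in the background from Hive's bin directory (a sketch; the log file name is only illustrative):

    [root@bigdata01 bin]# nohup ./hiveserver2 > hiveserver2.log 2>&1 &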

    [root@bigdata01 bin]# ./beeline -u jdbc:hive2://localhost:10000
    SLF4J: Class path contains multiple SLF4J bindings.
    SLF4J: Found binding in [jar:file:/data/soft/hive-3.12/lib/log4j-slf4j-impl-2.10.0.jar!/org/slf4j/impl/StaticLoggerBinder.class]
    SLF4J: Found binding in [jar:file:/data/soft/hadoop-3.2.0/share/hadoop/common/lib/slf4j-log4j12-1.7.25.jar!/org/slf4j/impl/StaticLoggerBinder.class]
    SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an explanation.
    SLF4J: Actual binding is of type [org.apache.logging.slf4j.Log4jLoggerFactory]
    Connecting to jdbc:hive2://localhost:10000
    Connected to: Apache Hive (version 3.1.2)
    Driver: Hive JDBC (version 3.1.2)
    Transaction isolation: TRANSACTION_REPEATABLE_READ
    Beeline version 3.1.2 by Apache Hive
    0: jdbc:hive2://localhost:10000> 
    

    After creating the table t1(id int, name string), inserting a row fails with an error:
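
    Reconstructed from the query log below, the statements run in beeline were roughly:

    create table t1(id int, name string);
    insert into t1(id,name) values(1,"zs");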

    INFO  : Completed compiling command(queryId=root_20210403203526_5e970572-12f9-40d4-85cb-730568016c34); Time taken: 5.082 seconds
    INFO  : Concurrency mode is disabled, not creating a lock manager
    INFO  : Executing command(queryId=root_20210403203526_5e970572-12f9-40d4-85cb-730568016c34): insert into t1(id,name) values(1,"zs")
    WARN  : Hive-on-MR is deprecated in Hive 2 and may not be available in the future versions. Consider using a different execution engine (i.e. spark, tez) or using Hive 1.X releases.
    INFO  : Query ID = root_20210403203526_5e970572-12f9-40d4-85cb-730568016c34
    INFO  : Total jobs = 3
    INFO  : Launching Job 1 out of 3
    INFO  : Starting task [Stage-1:MAPRED] in serial mode
    INFO  : Number of reduce tasks determined at compile time: 1
    INFO  : In order to change the average load for a reducer (in bytes):
    INFO  :   set hive.exec.reducers.bytes.per.reducer=<number>
    INFO  : In order to limit the maximum number of reducers:
    INFO  :   set hive.exec.reducers.max=<number>
    INFO  : In order to set a constant number of reducers:
    INFO  :   set mapreduce.job.reduces=<number>
    INFO  : Cleaning up the staging area /tmp/hadoop-yarn/staging/anonymous/.staging/job_1617447250360_0002
    ERROR : Job Submission failed with exception 'org.apache.hadoop.security.AccessControlException(Permission denied: user=anonymous, access=EXECUTE, inode="/tmp":root:supergroup:drwx------
        at org.apache.hadoop.hdfs.server.namenode.FSPermissionChecker.check(FSPermissionChecker.java:399)
    

    The job submission fails because beeline connected as the anonymous user, which has no access to /tmp in HDFS. Specify the user root when starting beeline instead:

    ./beeline -u jdbc:hive2://localhost:10000 -n root
    
    INFO  : Concurrency mode is disabled, not creating a lock manager
    INFO  : Semantic Analysis Completed (retrial = false)
    INFO  : Returning Hive schema: Schema(fieldSchemas:[FieldSchema(name:t1.id, type:int, comment:null), FieldSchema(name:t1.name, type:string, comment:null)], properties:null)
    INFO  : Completed compiling command(queryId=root_20210403204130_09ab2086-01f8-4466-9f43-70c145407ff6); Time taken: 0.422 seconds
    INFO  : Concurrency mode is disabled, not creating a lock manager
    INFO  : Executing command(queryId=root_20210403204130_09ab2086-01f8-4466-9f43-70c145407ff6): select * from t1
    INFO  : Completed executing command(queryId=root_20210403204130_09ab2086-01f8-4466-9f43-70c145407ff6); Time taken: 0.0 seconds
    INFO  : OK
    INFO  : Concurrency mode is disabled, not creating a lock manager
    +--------+----------+
    | t1.id  | t1.name  |
    +--------+----------+
    | 1      | zs       |
    +--------+----------+
    1 row selected (0.652 seconds)
    

Connecting via JDBC

A JDBC connection also goes through the hiveserver2 service.

The Maven dependency is as follows:

    <dependency>
        <groupId>org.apache.hive</groupId>
        <artifactId>hive-jdbc</artifactId>
        <version>3.1.2</version>
    </dependency>
    

The test code is as follows:


import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.ResultSet;
import java.sql.SQLException;
import java.sql.Statement;

/**
 * Hive JDBC connection demo.
 * Note: the hiveserver2 service must be running first.
 * @author zhangshao
 * @date 2021/5/24 9:22 am
 */
public class HiveJdbcDemo {
    public static void main(String[] args) throws SQLException {
        // JDBC URL of the hiveserver2 instance
        String jdbcUrl = "jdbc:hive2://192.168.21.101:10000";
        // Open the JDBC connection
        Connection conn = DriverManager.getConnection(jdbcUrl);
        // Create a Statement
        Statement stmt = conn.createStatement();
        // The SQL to run
        String sql = "select * from t1";
        // Execute the query
        ResultSet rs = stmt.executeQuery(sql);
        // Iterate over the result set and print each row
        while (rs.next()) {
            System.out.println(rs.getInt("id") + "\t" + rs.getString("name"));
        }
        // Release resources
        rs.close();
        stmt.close();
        conn.close();
    }
}

Running it produces the output below. The SLF4J warnings appear because two log4j bindings are on the classpath, and the ERROR StatusLogger line means no log4j2 configuration file was found. (The SQLException at the end was caused by a misplaced parenthesis in the original print statement, which concatenated the column name with the value inside getInt(); the code above already uses the corrected call.)

SLF4J: Class path contains multiple SLF4J bindings.
SLF4J: Found binding in [jar:file:/Users/zhangshao/java_tools/local_repository/org/apache/logging/log4j/log4j-slf4j-impl/2.10.0/log4j-slf4j-impl-2.10.0.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: Found binding in [jar:file:/Users/zhangshao/java_tools/local_repository/org/slf4j/slf4j-log4j12/1.6.1/slf4j-log4j12-1.6.1.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an explanation.
SLF4J: Actual binding is of type [org.apache.logging.slf4j.Log4jLoggerFactory]
ERROR StatusLogger No log4j2 configuration file found. Using default configuration: logging only errors to the console. Set system property 'log4j2.debug' to show Log4j2 internal initialization logging.
Exception in thread "main" java.sql.SQLException: Could not find id zs in [t1.id, t1.name]
    at org.apache.hive.jdbc.HiveBaseResultSet.findColumn(HiveBaseResultSet.java:104)
    at org.apache.hive.jdbc.HiveBaseResultSet.getInt(HiveBaseResultSet.java:359)
    at com.imooc.hive.HiveJdbcDemo.main(HiveJdbcDemo.java:25)

Modify the Maven dependency to exclude the old log4j binding that hive-jdbc pulls in:

                <exclusion>
                    <groupId>org.slf4j</groupId>
                    <artifactId>slf4j-log4j12</artifactId>
                </exclusion>
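
For context, the <exclusion> element has to sit inside an <exclusions> block within the hive-jdbc dependency; the complete dependency would look roughly like this (a sketch assembled from the fragments above):

<dependency>
    <groupId>org.apache.hive</groupId>
    <artifactId>hive-jdbc</artifactId>
    <version>3.1.2</version>
    <exclusions>
        <exclusion>
            <groupId>org.slf4j</groupId>
            <artifactId>slf4j-log4j12</artifactId>
        </exclusion>
    </exclusions>
</dependency>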

Create log4j2.xml under the resources directory:

<?xml version="1.0" encoding="UTF-8"?>
<Configuration status="INFO">
    <Appenders>
        <Console name="Console" target="SYSTEM_OUT">
            <PatternLayout pattern="%d{yyyy-MM-dd HH:mm:ss} [%t] %-5p %c{1}:%L - %m%n"/>
        </Console>
    </Appenders>
    <Loggers>
        <Root level="info">
            <AppenderRef ref="Console" />
        </Root>
    </Loggers>
</Configuration>

Run it again; the output is now:

1   zs
2   lisi
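
One more note on the JDBC demo: as with beeline, statements submitted over JDBC run as the connecting user, so the anonymous connection above can hit the same HDFS permission error seen earlier when writing data. A variant that passes the user explicitly (the class name HiveJdbcAsUserDemo is only illustrative, and the empty password assumes hiveserver2 uses its default no-password authentication):

import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.ResultSet;
import java.sql.SQLException;
import java.sql.Statement;

public class HiveJdbcAsUserDemo {
    public static void main(String[] args) throws SQLException {
        String jdbcUrl = "jdbc:hive2://192.168.21.101:10000";
        // Pass the user explicitly (equivalent to beeline's -n root); the empty
        // password assumes the default NONE authentication on hiveserver2.
        try (Connection conn = DriverManager.getConnection(jdbcUrl, "root", "");
             Statement stmt = conn.createStatement();
             ResultSet rs = stmt.executeQuery("select * from t1")) {
            while (rs.next()) {
                System.out.println(rs.getInt("id") + "\t" + rs.getString("name"));
            }
        }
    }
}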

Using the set command

In Hive, the set command can be used to set parameter values temporarily, that is, to temporarily override the values configured in hive-site.xml.

  1. Parameters set with the set command are only effective in the current session; once you exit and reopen the CLI they are gone.
  2. To make a setting stick for the current user on the current machine, put the set command in the ~/.hiverc file (an example follows below).

hive-site.xml has a parameter named hive.cli.print.current.db that makes the CLI prompt show the current database name; it defaults to false. Enter the CLI with ./hive and enable it:

hive> set hive.cli.print.current.db=true; 
hive (default)> 

Use hive.cli.print.header = true to make query results include the column names:

hive (default)> set hive.cli.print.header = true;
hive (default)> select * from t1;
OK
t1.id   t1.name
1       zs
2       lisi
Time taken: 3.353 seconds, Fetched: 2 row(s)

Persist these settings for the current user on this machine by adding them to ~/.hiverc:

[root@bigdata04 apache-hive-3.1.2-bin]# vi ~/.hiverc
set hive.cli.print.current.db = true;
set hive.cli.print.header = true;

Restart ./hive; the settings from .hiverc now take effect:

hive (default)> 

To review the commands previously run in Hive, look at the .hivehistory file in the current user's home directory.

[root@bigdata01 ~]# cat ~/.hivehistory
clear
;
show tables;
create table t1(id int ,name string);
show tables;
insert into t1(id,name) values(1,'zs');
select * from t1;
drop table t1;
show tables;
exit;
show databases;
create databses mydb1;
create database mydb1;
create database mydb2 location use '/user/hive/mydb2';
create database mydb2 location user '/user/hive/mydb2';
 create database mydb2 location '/user/hive/mydb2';
drop database mydb2;
;
show databases;
create table t2(id int);
show tables;
show create table t2;
alter table t2 rename to t2_bak;
;
clear
;
set hive.cli.print.current.db=true; 
set hive.cli.print.header = true;
select * from t1;
Original article: https://www.cnblogs.com/shine-rainbow/p/hive-de-lian-jie-fang-shi.html