hive learing_1

1、什么是hive

hive是基于Hadoop的一种数据库仓库工具,将结构化数据映射为一张数据表,并提供SQL查询,分析等功能

2、hive的特点

  1. 将数据结构(shema)存储在数据库中,数据存储在HDFS中
  2. 适用于联机分析处理(OLAP:on-line analytic process)
  3. 支持HQL查询

3、hive的缺点

  1. 不支持关系数据库
  2. 不支持联机事务处理(OLTP:on-line transaction process):如银行系统
  3. 不支持实时查询和行级更新

4、hive的安装配置

  • 下载安装包,解压到需要安装的目录
  • 配置环境变量 vi /etc/profile,并立即生效 source /etc/profile

  • 修改配置文件

  • 修改hive-site.xml

</property>

<property>
 <name>datanucleus.autoCreateTables</name>
 <value>True</value>
 </property>


  <property>
    <name>javax.jdo.option.ConnectionPassword</name>
    <value>root</value>
    <description>password to use against metastore database</description>
  </property>
  <property>
    <name>javax.jdo.option.ConnectionUserName</name>
    <value>root</value>
    <description>Username to use against metastore database</description>
  </property>
 <property>
    <name>javax.jdo.option.ConnectionURL</name>
    <value>jdbc:mysql://192.168.109.150:3306/hive_metastore?createDatabaseIfNotExist=true</value>
    <description>
      JDBC connect string for a JDBC metastore.
      To use SSL to encrypt/authenticate the connection, provide database-specific SSL flag in the connection URL.
      For example, jdbc:postgresql://myhost/db?ssl=true for postgres database.
    </description>
  </property>
 <property>
    <name>javax.jdo.option.ConnectionDriverName</name>
    <value>com.mysql.jdbc.Driver</value>
    <description>Driver class name for a JDBC metastore</description>
  </property>
  <property>
    <name>hive.metastore.ds.connection.url.hook</name>
    <value/>
    <description>Name of the hook to use for retrieving the JDO connection URL. If empty, the value in javax.jdo.option.ConnectionURL is used</description>
  </property>

 

 

  • 将${system的地方改成具体的地址

  • 将MySQL的驱动包mysql-connector-java-5.1.46.jar复制到/soft/hive/lib下面

  • 启动hive,初始化元数据

[xiaoqiu@s150 /soft/hive/conf]$ schematool -initSchema -dbType mysql
SLF4J: Class path contains multiple SLF4J bindings.
SLF4J: Found binding in [jar:file:/soft/apache-hive-2.3.3-bin/lib/log4j-slf4j-impl-2.6.2.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: Found binding in [jar:file:/soft/hadoop-2.7.5/share/hadoop/common/lib/slf4j-log4j12-1.7.10.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an explanation.
SLF4J: Actual binding is of type [org.apache.logging.slf4j.Log4jLoggerFactory]
Metastore connection URL:        jdbc:mysql://192.168.109.150:3306/hive_metastore?createDatabaseIfNotExist=true
Metastore Connection Driver :    com.mysql.jdbc.Driver
Metastore connection User:       root
Starting metastore schema initialization to 2.3.0
Initialization script hive-schema-2.3.0.mysql.sql
Initialization script completed
schemaTool completed
  • 创建数据库

hive> create database incubator;
OK
Time taken: 0.338 seconds
hive> show databases;
OK
default
incubator
mydb
Time taken: 0.014 seconds, Fetched: 3 row(s)
  • 查看HDFS(可见hive创建的数据库直接同步于HDFS的hive数据仓库中了)

[xiaoqiu@s150 /soft/hive/conf]$ hadoop fs -lsr /
lsr: DEPRECATED: Please use 'ls -R' instead.
drwx-wx-wx   - xiaoqiu supergroup          0 2018-07-02 02:21 /tmp
drwx-wx-wx   - xiaoqiu supergroup          0 2018-07-02 10:20 /tmp/hive
drwx------   - root    supergroup          0 2018-07-02 10:41 /tmp/hive/root
drwx------   - root    supergroup          0 2018-07-02 10:41 /tmp/hive/root/f5b7d613-fb3b-4125-b2d7-ea643b199c2f
drwx------   - root    supergroup          0 2018-07-02 10:41 /tmp/hive/root/f5b7d613-fb3b-4125-b2d7-ea643b199c2f/_tmp_space.db
drwx------   - xiaoqiu supergroup          0 2018-07-04 10:58 /tmp/hive/xiaoqiu
drwx------   - xiaoqiu supergroup          0 2018-07-04 10:58 /tmp/hive/xiaoqiu/9f4397b7-db00-4ad5-b5de-cc3fd0d72eb6
drwx------   - xiaoqiu supergroup          0 2018-07-04 10:58 /tmp/hive/xiaoqiu/9f4397b7-db00-4ad5-b5de-cc3fd0d72eb6/_tmp_space.db
drwx------   - xiaoqiu supergroup          0 2018-07-02 17:19 /tmp/hive/xiaoqiu/ae6a90bd-558d-482c-b408-dc0273bce93e
drwx------   - xiaoqiu supergroup          0 2018-07-02 17:19 /tmp/hive/xiaoqiu/ae6a90bd-558d-482c-b408-dc0273bce93e/_tmp_space.db
drwxr-xr-x   - xiaoqiu supergroup          0 2018-07-04 11:32 /user
drwxr-xr-x   - xiaoqiu supergroup          0 2018-07-04 11:32 /user/hive
drwxr-xr-x   - xiaoqiu supergroup          0 2018-07-04 11:34 /user/hive/warehouse
drwxr-xr-x   - xiaoqiu supergroup          0 2018-07-04 11:34 /user/hive/warehouse/incubator.db
drwxr-xr-x   - xiaoqiu supergroup          0 2018-07-04 11:32 /user/hive/warehouse/mydb.db
  • 创建table

hive> create table t(id int,name string,age int);
OK
Time taken: 1.641 seconds
hive> show tables;
OK
t
Time taken: 1.312 seconds, Fetched: 1 row(s)
  • 在MySQL中查看(其中TBL_TYPE为MANAGED_TABLE即托管表的意思,也就是删除表的时候,表的数据也会被删除掉,还有一种类型的表为EXTERNAL即外部表,删除表的时候不会删除数据)

MariaDB [hive_metastore]> select * from TBLS;
+--------+-------------+-------+------------------+---------+-----------+-------+----------+---------------+--------------------+--------------------+--------------------+
| TBL_ID | CREATE_TIME | DB_ID | LAST_ACCESS_TIME | OWNER   | RETENTION | SD_ID | TBL_NAME | TBL_TYPE      | VIEW_EXPANDED_TEXT | VIEW_ORIGINAL_TEXT | IS_REWRITE_ENABLED |
+--------+-------------+-------+------------------+---------+-----------+-------+----------+---------------+--------------------+--------------------+--------------------+
|      1 |  1530721536 |     3 |                0 | xiaoqiu |         0 |     1 | t        | MANAGED_TABLE | NULL               | NULL               |                    |
+--------+-------------+-------+------------------+---------+-----------+-------+----------+---------------+--------------------+--------------------+--------------------+
1 row in set (0.00 sec)
  • 这个时候还可以在HDFS查看表

[xiaoqiu@s150 /soft/hive/conf]$ hadoop fs -lsr /
lsr: DEPRECATED: Please use 'ls -R' instead.
drwx-wx-wx   - xiaoqiu supergroup          0 2018-07-02 02:21 /tmp
drwx-wx-wx   - xiaoqiu supergroup          0 2018-07-02 10:20 /tmp/hive
drwx------   - root    supergroup          0 2018-07-02 10:41 /tmp/hive/root
drwx------   - root    supergroup          0 2018-07-02 10:41 /tmp/hive/root/f5b7d613-fb3b-4125-b2d7-ea643b199c2f
drwx------   - root    supergroup          0 2018-07-02 10:41 /tmp/hive/root/f5b7d613-fb3b-4125-b2d7-ea643b199c2f/_tmp_space.db
drwx------   - xiaoqiu supergroup          0 2018-07-04 10:58 /tmp/hive/xiaoqiu
drwx------   - xiaoqiu supergroup          0 2018-07-04 10:58 /tmp/hive/xiaoqiu/9f4397b7-db00-4ad5-b5de-cc3fd0d72eb6
drwx------   - xiaoqiu supergroup          0 2018-07-04 10:58 /tmp/hive/xiaoqiu/9f4397b7-db00-4ad5-b5de-cc3fd0d72eb6/_tmp_space.db
drwx------   - xiaoqiu supergroup          0 2018-07-02 17:19 /tmp/hive/xiaoqiu/ae6a90bd-558d-482c-b408-dc0273bce93e
drwx------   - xiaoqiu supergroup          0 2018-07-02 17:19 /tmp/hive/xiaoqiu/ae6a90bd-558d-482c-b408-dc0273bce93e/_tmp_space.db
drwxr-xr-x   - xiaoqiu supergroup          0 2018-07-04 11:32 /user
drwxr-xr-x   - xiaoqiu supergroup          0 2018-07-04 11:32 /user/hive
drwxr-xr-x   - xiaoqiu supergroup          0 2018-07-04 11:34 /user/hive/warehouse
drwxr-xr-x   - xiaoqiu supergroup          0 2018-07-04 12:25 /user/hive/warehouse/incubator.db
drwxr-xr-x   - xiaoqiu supergroup          0 2018-07-04 12:25 /user/hive/warehouse/incubator.db/t
drwxr-xr-x   - xiaoqiu supergroup          0 2018-07-04 11:32 /user/hive/warehouse/mydb.db
  • 向表t中插入数据(直接在hive中使用insert into语句进行插入,会先调用mapreduce)

 

 

欢迎关注我的公众号:小秋的博客 CSDN博客:https://blog.csdn.net/xiaoqiu_cr github:https://github.com/crr121 联系邮箱:rongchen633@gmail.com 有什么问题可以给我留言噢~
原文地址:https://www.cnblogs.com/flyingcr/p/10326882.html