Hive internal and external tables: creation, loading data, and differences

Creating tables

Creating an internal (managed) table

create table customer(
    customerId int,
    firstName string,
    lastName STRING,
    birstDay timestamp
) row format delimited fields  terminated by ',';

Creating an external table

CREATE EXTERNAL table salaries(
    gender string,
    age int ,
    salary DOUBLE,
    zip int 
)row format delimited fields  terminated by ',' LOCATION '/user/train/salaries/';
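
To confirm which kind of table each statement produced, the table metadata can be inspected. A minimal sketch, assuming both tables above were created in the current database:

```sql
-- Look for the "Table Type:" and "Location:" rows in the output.
-- A table created without EXTERNAL reports MANAGED_TABLE.
DESCRIBE FORMATTED customer;
-- A table created with EXTERNAL reports EXTERNAL_TABLE.
DESCRIBE FORMATTED salaries;
```

The "Location:" row shows where the table's files live on HDFS, which is the path that matters when the table is dropped.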

Loading data

load DATA LOCAL inpath '/root/user/customer.txt' overwrite into table customer;
load DATA LOCAL inpath '/root/user/salaries.txt' overwrite into table salaries;
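
The two keywords in these statements change what the load does. The sketch below shows the variants side by side; the HDFS path `/tmp/staging/salaries.txt` is a hypothetical example, not a path from this setup.

```sql
-- LOCAL: the path is on the machine running the Hive client,
-- and the file is copied into the table's directory.
-- Without OVERWRITE, the new file is added to the data already in the table.
LOAD DATA LOCAL INPATH '/root/user/salaries.txt' INTO TABLE salaries;

-- Without LOCAL: the path is on HDFS and the file is moved, not copied.
-- '/tmp/staging/salaries.txt' is a hypothetical HDFS path.
-- With OVERWRITE, the existing contents of the table are replaced.
LOAD DATA INPATH '/tmp/staging/salaries.txt' OVERWRITE INTO TABLE salaries;
```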

Inspecting the text files

[root@centos172 user]# cat /root/user/customer.txt
1,f,jack,,
2,f,luccy,,
[root@centos172 user]# cat /root/user/salaries.txt
male,21,10000,1
female,22,12000,2

Inspecting the data in Hive

hive> desc customer;
OK
customerid              int
firstname               string
lastname                string
birstday                timestamp
Time taken: 0.053 seconds, Fetched: 4 row(s)
hive> desc salaries;
OK
gender                  string
age                     int
salary                  double
zip                     int
Time taken: 0.041 seconds, Fetched: 4 row(s)
hive> select * from customer;
OK
1       f       jack    NULL
2       f       luccy   NULL
Time taken: 0.067 seconds, Fetched: 2 row(s)
hive> select * from salaries;
OK
male    21      10000.0 1
female  22      12000.0 2
Time taken: 0.066 seconds, Fetched: 2 row(s)
hive>
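
The birstday column shows NULL because the fourth field in customer.txt is empty, and an empty string cannot be read as a timestamp. An analogous check with explicit casts, as a sketch assuming a Hive version that allows SELECT without a FROM clause (0.13 or later):

```sql
-- An empty string is not a valid timestamp, so this should yield NULL.
SELECT CAST('' AS TIMESTAMP);
-- A value in yyyy-MM-dd HH:mm:ss form converts to a timestamp.
SELECT CAST('2020-02-20 10:30:00' AS TIMESTAMP);
```

A row such as `3,f,tom,2020-02-20 10:30:00` in the text file would therefore be expected to load with a non-NULL birstday.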

Differences

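I have only just started learning Hive, so I only cover part of the differences.
1. An internal table's data is stored under Hive's default warehouse directory on HDFS. An external table stores its data at the path given by LOCATION.
2. Dropping an internal table also deletes its data, so re-creating an identical internal table gives an empty table.
Dropping an external table removes only the table definition and leaves the files at LOCATION in place, so re-creating an identical external table makes the data available again without reloading.
The session below drops and re-creates the external table: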

hive> drop table salaries;
OK
Time taken: 0.092 seconds
hive> select * from salaries;
FAILED: SemanticException [Error 10001]: Line 1:14 Table not found 'salaries'
hive> show tables;
OK
customer
Time taken: 0.035 seconds, Fetched: 1 row(s)
hive> CREATE EXTERNAL table salaries(
    >     gender string,
    >     age int ,
    >     salary DOUBLE,
    >     zip int
    > )row format delimited fields  terminated by ',' LOCATION '/user/train/salaries/';
OK
Time taken: 0.058 seconds
hive> show tables;
OK
customer
salaries
Time taken: 0.025 seconds, Fetched: 2 row(s)
hive> select * from salaries;
OK
male    21      10000.0 1
female  22      12000.0 2
Time taken: 0.058 seconds, Fetched: 2 row(s)
hive>
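
For comparison, the same experiment on the internal table can be sketched as follows. This assumes default settings, where dropping a managed table removes its files (or moves them to the HDFS trash if trash is enabled).

```sql
-- Dropping a managed table removes both the metadata and the table's files.
DROP TABLE customer;

-- Re-create the table with the same definition.
CREATE TABLE customer(
    customerId int,
    firstName string,
    lastName string,
    birstDay timestamp
) ROW FORMAT DELIMITED FIELDS TERMINATED BY ',';

-- The new table has no files behind it, so this returns no rows.
SELECT * FROM customer;

-- The data only comes back after loading the file again.
LOAD DATA LOCAL INPATH '/root/user/customer.txt' OVERWRITE INTO TABLE customer;
```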

Regarding difference 1:
Default path of the internal table (figure)

Path specified for the external table (figure)
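
Both paths can also be checked from the Hive prompt. A minimal sketch, assuming the warehouse directory is left at its default of `/user/hive/warehouse` and the customer table lives in the default database:

```sql
-- Print the configured warehouse root for managed tables.
SET hive.metastore.warehouse.dir;
-- List the files behind the internal table (path assumes the default root).
dfs -ls /user/hive/warehouse/customer;
-- List the files behind the external table, at the LOCATION given in its DDL.
dfs -ls /user/train/salaries/;
```

If the warehouse root printed by the first command differs, the second path needs to be adjusted to match.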

What is Hive

From what I have seen so far:
1. It maps files on HDFS to tables.
2. It is used to query them with SQL-like statements.
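
As an illustration of the second point, the comma-delimited file behind the salaries table can be aggregated with an ordinary SQL-style query. A small sketch using the table defined earlier:

```sql
-- Average salary per gender, computed over the text file behind the table.
SELECT gender, AVG(salary) AS avg_salary
FROM salaries
GROUP BY gender;
```

With the two sample rows, each group contains a single row, so the averages are 10000.0 for male and 12000.0 for female.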

Original article: https://www.cnblogs.com/JuncaiF/p/12336563.html