Big Data Study Notes: The Various Joins in Hive

Prepare the data (the first block is a.txt, the second is b.txt):
2,b
3,c
4,d
7,y
8,u

2,bb
3,cc
7,yy
9,pp
Create the tables:
create table a(id int,name string)
row format delimited fields terminated by ',';

create table b(id int,name string)
row format delimited fields terminated by ',';
Load the data:
load data local inpath '/root/hivedata/a.txt' into table a;
load data local inpath '/root/hivedata/b.txt' into table b;

inner join returns only the rows that match in both tables; rows without a match are not output

 select * from a inner join b on a.id =b.id;
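
To see which rows the inner join keeps on the sample data, here is a minimal sketch that runs the same query against an in-memory SQLite database. SQLite is only a stand-in for Hive here (an assumption for illustration), and the `order by` is added because neither engine guarantees row order without it.

```python
import sqlite3

# In-memory SQLite stands in for Hive (assumption for illustration only).
conn = sqlite3.connect(":memory:")
conn.execute("create table a(id int, name text)")
conn.execute("create table b(id int, name text)")
conn.executemany("insert into a values (?, ?)",
                 [(2, "b"), (3, "c"), (4, "d"), (7, "y"), (8, "u")])
conn.executemany("insert into b values (?, ?)",
                 [(2, "bb"), (3, "cc"), (7, "yy"), (9, "pp")])

# Same join as in the text; ids 4 and 8 (only in a) and 9 (only in b) drop out.
inner_rows = conn.execute(
    "select * from a inner join b on a.id = b.id order by a.id"
).fetchall()

for row in inner_rows:
    print(row)
# (2, 'b', 2, 'bb')
# (3, 'c', 3, 'cc')
# (7, 'y', 7, 'yy')
```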

left join keeps every row of a; the columns from b are NULL where there is no match

 select * from a left join b on a.id=b.id;
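
A small sketch of the left join on the sample data, again using in-memory SQLite as a stand-in for Hive (an assumption for illustration). SQL NULL shows up as `None` in Python.

```python
import sqlite3

# In-memory SQLite stands in for Hive (assumption for illustration only).
conn = sqlite3.connect(":memory:")
conn.execute("create table a(id int, name text)")
conn.execute("create table b(id int, name text)")
conn.executemany("insert into a values (?, ?)",
                 [(2, "b"), (3, "c"), (4, "d"), (7, "y"), (8, "u")])
conn.executemany("insert into b values (?, ?)",
                 [(2, "bb"), (3, "cc"), (7, "yy"), (9, "pp")])

# Every row of a survives; ids 4 and 8 have no partner in b, so b's columns are NULL.
left_rows = conn.execute(
    "select * from a left join b on a.id = b.id order by a.id"
).fetchall()

for row in left_rows:
    print(row)
# (2, 'b', 2, 'bb')
# (3, 'c', 3, 'cc')
# (4, 'd', None, None)
# (7, 'y', 7, 'yy')
# (8, 'u', None, None)
```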

right join keeps every row of b; the columns from a are NULL where there is no match

select * from a right join b on a.id=b.id;
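
A right join is a left join with the two tables swapped. The sketch below uses that rewrite so it runs on any SQLite version (older SQLite releases have no `right join`); SQLite is an assumed stand-in for Hive, and the explicit column list keeps a's columns first, as `a right join b` would.

```python
import sqlite3

# In-memory SQLite stands in for Hive (assumption for illustration only).
conn = sqlite3.connect(":memory:")
conn.execute("create table a(id int, name text)")
conn.execute("create table b(id int, name text)")
conn.executemany("insert into a values (?, ?)",
                 [(2, "b"), (3, "c"), (4, "d"), (7, "y"), (8, "u")])
conn.executemany("insert into b values (?, ?)",
                 [(2, "bb"), (3, "cc"), (7, "yy"), (9, "pp")])

# "a right join b" rewritten as "b left join a": every row of b survives,
# and id 9 has no partner in a, so a's columns are NULL.
right_rows = conn.execute(
    "select a.id, a.name, b.id, b.name "
    "from b left join a on a.id = b.id order by b.id"
).fetchall()

for row in right_rows:
    print(row)
# (2, 'b', 2, 'bb')
# (3, 'c', 3, 'cc')
# (7, 'y', 7, 'yy')
# (None, None, 9, 'pp')
```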

full outer join keeps every row of both tables; the columns of the side without a match are NULL

select * from a full outer join b on a.id=b.id;
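
One way to picture the full outer join is as the left join plus the rows of b that found no partner. The sketch below builds it that way in in-memory SQLite, an assumed stand-in for Hive; the `union all` rewrite is used only because older SQLite versions lack `full outer join`, and Hive itself accepts the query above directly.

```python
import sqlite3

# In-memory SQLite stands in for Hive (assumption for illustration only).
conn = sqlite3.connect(":memory:")
conn.execute("create table a(id int, name text)")
conn.execute("create table b(id int, name text)")
conn.executemany("insert into a values (?, ?)",
                 [(2, "b"), (3, "c"), (4, "d"), (7, "y"), (8, "u")])
conn.executemany("insert into b values (?, ?)",
                 [(2, "bb"), (3, "cc"), (7, "yy"), (9, "pp")])

# Left join (all rows of a) plus the rows of b with no match in a.
full_rows = conn.execute("""
    select a.id, a.name, b.id, b.name from a left join b on a.id = b.id
    union all
    select a.id, a.name, b.id, b.name from b left join a on a.id = b.id
    where a.id is null
""").fetchall()

# Sort by whichever id is present so the output is deterministic.
full_rows.sort(key=lambda r: r[0] if r[0] is not None else r[2])

for row in full_rows:
    print(row)
# (2, 'b', 2, 'bb')
# (3, 'c', 3, 'cc')
# (4, 'd', None, None)
# (7, 'y', 7, 'yy')
# (8, 'u', None, None)
# (None, None, 9, 'pp')
```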

left outer join (the same as left join; the outer keyword is optional)

left semi join returns only the rows of a that have a match in b, and only the columns of a

select * from a  left semi join b on a.id=b.id;

This is equivalent to:
 select * from a where a.id in (select b.id from b);
The subquery form is very inefficient in Hive.
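
SQLite has no `left semi join`, so the sketch below runs the subquery forms of the same filter against an in-memory SQLite database (an assumed stand-in for Hive). It illustrates what a semi join returns on the sample data: only the rows of a that have a match in b, and only a's columns.

```python
import sqlite3

# In-memory SQLite stands in for Hive (assumption for illustration only).
conn = sqlite3.connect(":memory:")
conn.execute("create table a(id int, name text)")
conn.execute("create table b(id int, name text)")
conn.executemany("insert into a values (?, ?)",
                 [(2, "b"), (3, "c"), (4, "d"), (7, "y"), (8, "u")])
conn.executemany("insert into b values (?, ?)",
                 [(2, "bb"), (3, "cc"), (7, "yy"), (9, "pp")])

# IN form: keep rows of a whose id appears in b.
in_rows = conn.execute(
    "select * from a where a.id in (select b.id from b) order by a.id"
).fetchall()

# EXISTS form: the same filter written as a correlated subquery.
exists_rows = conn.execute(
    "select * from a where exists (select 1 from b where b.id = a.id) "
    "order by a.id"
).fetchall()

for row in in_rows:
    print(row)
# (2, 'b')
# (3, 'c')
# (7, 'y')
```

Unlike the inner join, the result carries no columns from b.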

Original article: https://www.cnblogs.com/feifeicui/p/10284854.html