SAS 中的 sql 语句的使用------多表使用

SAS 中的多表使用一般为多表之间的连接查询

连接查询分为:

(1)表内连接

(2)表与表之间的连接

这里我们只讨论表与表之间的连接,而表与表之间的连接分为:

(1)全连接

  这里的全连接就是基本的笛卡尔积,笛卡尔积通俗一点说就是每条数据的交叉相乘,使用full join

 1 data temp;
 2 input visit $ visit_dat $ age type $;
 3 cards;
 4 v1 20190201 18 a
 5 v2 20200304 21 f
 6 v3 20190825 34 e
 7 v1 20180431 58 c
 8 v2 20170902 23 d
 9 v4 20160826 25 r
10 ;
11 run;
12 
13 data temp2;
14 input visit $ visit_dat $ age type $;
15 cards;
16 v3 20190725 25 d
17 v4 20200431 35 e
18 v5 20190921 38 g
19 ;
20 run;
21 
22 /* 创建表,含条件语句 */
23 proc sql;
24 create table join_visit as
25 select
26     table1.visit as visit1,
27     table1.visit_dat as visit_dat1,
28     table1.age as age1,
29     table1.type as type1,
30     table2.visit as visit1,
31     table2.visit_dat as visit_dat2,
32     table2.age as age2,
33     table2.type as type2
34 from
35     work.temp as table1
36 full join
37     work.temp2 as table2
38 on 
39     table1.visit = table2.visit;
40 run;
41 
42 /* 打印创建的表 */
43 proc print data=join_visit;
44 run;

结果为:

(2)左外连接

  保留左边的所有数据,对于右边的,只保留符合条件的数据,使用 left join

 1 data temp;
 2 input visit $ visit_dat $ age type $;
 3 cards;
 4 v1 20190201 18 a
 5 v2 20200304 21 f
 6 v3 20190825 34 e
 7 v1 20180431 58 c
 8 v2 20170902 23 d
 9 v4 20160826 25 r
10 ;
11 run;
12 
13 data temp2;
14 input visit $ visit_dat $ age type $;
15 cards;
16 v3 20190725 25 d
17 v4 20200431 35 e
18 v5 20190921 38 g
19 ;
20 run;
21 
22 /* 创建表,含条件语句 */
23 proc sql;
24 create table join_visit as
25 select
26     table1.visit as visit1,
27     table1.visit_dat as visit_dat1,
28     table1.age as age1,
29     table1.type as type1,
30     table2.visit as visit1,
31     table2.visit_dat as visit_dat2,
32     table2.age as age2,
33     table2.type as type2
34 from
35     work.temp as table1
36 left join
37     work.temp2 as table2
38 on 
39     table1.visit = table2.visit;
40 run;
41 
42 /* 打印创建的表 */
43 proc print data=join_visit;
44 run;

(3)右外连接

  保留右边的所有数据,对于左边的,只保留符合条件的数据,使用 right join

data temp;
input visit $ visit_dat $ age type $;
cards;
v1 20190201 18 a
v2 20200304 21 f
v3 20190825 34 e
v1 20180431 58 c
v2 20170902 23 d
v4 20160826 25 r
;
run;

data temp2;
input visit $ visit_dat $ age type $;
cards;
v3 20190725 25 d
v4 20200431 35 e
v5 20190921 38 g
;
run;

/* 创建表,含条件语句 */
proc sql;
create table join_visit as
select
    table1.visit as visit1,
    table1.visit_dat as visit_dat1,
    table1.age as age1,
    table1.type as type1,
    table2.visit as visit1,
    table2.visit_dat as visit_dat2,
    table2.age as age2,
    table2.type as type2
from
    work.temp as table1
right join
    work.temp2 as table2
on 
    table1.visit = table2.visit;
run;

/* 打印创建的表 */
proc print data=join_visit;
run;

结果为:

 注意

要注意的是,多表连接使用的时候,重要的不是语句的使用,而是选择哪一个连接,才是最重要的。

原文地址:https://www.cnblogs.com/elkan/p/11121714.html