[Oracle 10g]Partition Outer Join

Partition Outer Join是Oracle 10g新增加的,用来方便“补足”在outer join的时候“漏掉”(NULL)的数据。

当两个用来做Join的表中的数据不“相当”的时候,如果我们希望在最终结果中其中一张表的所有records都不落下,一般会用到outer join。另外一个表中“缺失”的那部分数据自然都是以NULL来补足。但是如果我们希望缺失的那部分数据(NULL)能正常"补足'’ 出来,这个时候用Partiton Outer Join就会很方便,而且性能很好。

举个例子,来源这里

首先建立一个测试表customer_orders,

SQL>
  1  CREATE TABLE customer_orders(name, dt, amt)
  2  AS
  3     SELECT *
  4     FROM (
  5            SELECT owner,
  6                   TRUNC(created) + MOD(ROWNUM, 6),
  7                   TRUNC(object_id/ROWNUM)
  8            FROM   all_objects
  9            WHERE  owner IN ('ORDSYS', 'WKSYS')
 10            ORDER BY
 11                   DBMS_RANDOM.RANDOM
 12           )
 13*    WHERE ROWNUM <= 1000
SQL> / 
 
Table created. 

 

  生成一个报表,显示每个customer在2007年每个月的花销.

SQL> SELECT name,
  2         TO_CHAR(dt, 'YYYYMM') as year_month,
  3         SUM(amt) as total_amt
  4  FROM  customer_orders
  5  GROUP BY
  6         name,
  7         TO_CHAR(dt, 'YYYYMM')
  8  ORDER BY 2, 1;


 

NAME                           YEAR_M  TOTAL_AMT
------------------------------ ------ ----------
ORDSYS                         200710       224973
WKSYS                          200710       3830

SQL>

这条简单的SQL只会返回每个customer实际的花销,并不能把07年其他月份的数据显示出来(尽管其他月份都没有任何花销)。

那么如何把其他月份的数据(0)也都显示出来呢,最容易想到做法就是用外连接!

SQL> 
  1  WITH year_month AS
  2  (  SELECT TO_CHAR(
  3                ADD_MONTHS(
  4                     DATE '2007-01-01', ROWNUM-1), 'YYYYMM') as year_month
  5     FROM dual
  6     CONNECT BY ROWNUM <= 12
  7  )
  8  SELECT co.name,
  9         ym.year_month,
 10         NVL(SUM(co.amt), 0) AS total_amount
 11  FROM year_month ym
 12  LEFT OUTER JOIN  customer_orders co
 13  ON   (TO_CHAR(co.dt, 'YYYYMM') = ym.year_month)
 14  GROUP BY
 15         co.name,
 16         ym.year_month
 17  ORDER BY
 18         co.name,
 19*        ym.year_month
 

NAME                           YEAR_M TOTAL_AMOUNT
------------------------------ ------ ------------
ORDSYS                         200710           224973
WKSYS                          200710            3830
                                     200701            0
                                     200702            0
                                     200703            0
                                     200704            0
                                     200705            0
                                     200706            0
                                     200707            0
                                     200708            0
                                     200709            0
                                     200711            0

                                      200712            0

12 rows selected.

SQL>

虽然把12个月份都显示出来了,但是customer的名字除了10月份之外都没有显示在记录里面,这个显然不是那么友好。

那么该如何让customer的名字也都列出来呢,先来看看不用partiton outer join如何来实现,

SQL>  
  1  WITH year_months AS
  2  (  SELECT TO_CHAR(
  3                ADD_MONTHS(
  4                     DATE '2007-01-01', ROWNUM-1), 'YYYYMM') as year_month
  5     FROM dual
  6     CONNECT BY ROWNUM <= 12
  7  ),
  8  customer_names AS
  9  (
 10     SELECT DISTINCT name
 11     FROM custormer_orders
 12  )
 13  SELECT sq.name,
 14         sq.year_month,
 15         NVL(co.total_amount, 0) AS total_amount
 16  FROM
 17  (
 18     SELECT name,
 19            TO_CHAR(dt, 'YYYYMM') as year_month,
 20            SUM(amt)  as total_amount
 21     FROM  custormer_orders
 22     GROUP BY
 23            name,
 24            TO_CHAR(dt, 'YYYYMM')
 25  )co,
 26  (
 27     SELECT *
 28     FROM customer_names,
 29          year_months
 30  ) sq
 31  WHERE
 32      sq.year_month = co.year_month(+)
 33  AND sq.name = co.name(+)
 34  ORDER BY
 35      sq.name,
 36*     sq.year_month
 

 

NAME                           YEAR_M TOTAL_AMOUNT
------------------------------ ------ ------------
ORDSYS                         200701            0
ORDSYS                         200702            0
ORDSYS                         200703            0
ORDSYS                         200704            0
ORDSYS                         200705            0
ORDSYS                         200706            0
ORDSYS                         200707            0
ORDSYS                         200708            0
ORDSYS                         200709            0
ORDSYS                         200710       224973
ORDSYS                         200711            0

ORDSYS                         200712            0


WKSYS                          200701            0
WKSYS                          200702            0
WKSYS                          200703            0
WKSYS                          200704            0
WKSYS                          200705            0
WKSYS                          200706            0
WKSYS                          200707            0
WKSYS                          200708            0
WKSYS                          200709            0
WKSYS                          200710         3830
WKSYS                          200711            0

WKSYS                          200712            0

24 rows selected.

SQL>

 

这种方式用了两个临时表,然后这两个临时表进行了笛卡尔乘积,然后再同custormer_orders表做外连接。 注意,为了得到所有的customer,customer_orders表被访问了两次,显然会对性能造成影响。 通过查看dbms_xplan.display_cursor来看看这条SQL的执行计划及代价, 

 

SQL> SELECT plan_table_output
  2  FROM  TABLE(dbms_xplan.display_cursor);
 
PLAN_TABLE_OUTPUT
------------------------------------------------------------------------------------------------------------------------------
------------------------------------------------------------------------------------------------------------------------------
 
SQL_ID  2tn2b3sdyhxn4, child number 0
-------------------------------------
    WITH year_months AS     (  SELECT TO_CHAR(
ADD_MONTHS(                        DATE '2007-01-01', ROWNUM-1),
'YYYYMM') as year_month        FROM dual        CONNECT BY ROWNUM <= 12
    ),     customer_names AS     (       SELECT DISTINCT name
FROM custormer_orders    )    SELECT sq.name,           sq.year_month,
        NVL(co.total_amount, 0) AS total_amount    FROM    (
SELECT name,              TO_CHAR(dt, 'YYYYMM') as year_month,
    SUM(amt)  as total_amount       FROM  custormer_orders       GROUP
BY              name,              TO_CHAR(dt, 'YYYYMM')    )co,    (
    SELECT *       FROM customer_names,            year_months    ) sq
  WHERE        sq.year_month = co.year_month(+)    AND sq.name =
co.name(+)    ORDER BY        sq.name,       sq.year_month
 
Plan hash value: 1191063733
 
---------------------------------------------------------------------------------------------------------------
| Id  | Operation                          | Name             | Rows  | Bytes |TempSpc| Cost (%CPU)| Time     |
---------------------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT                   |                  |       |       |       |   154 (100)|       |
|   1 |  SORT ORDER BY                     |                  | 10000 |   556K|  1352K|   154   (3)| 00:00:02 |
|*  2 |   HASH JOIN OUTER                  |                  | 10000 |   556K|       |    13  (24)| 00:00:01 |
|   3 |    VIEW                            |                  |  1000 | 22000 |       |     7  (15)| 00:00:01 |
|   4 |     MERGE JOIN CARTESIAN           |                  |  1000 | 22000 |       |     7  (15)| 00:00:01 |
|   5 |      VIEW                          |                  |     1 |     5 |       |     2   (0)| 00:00:01 |
|   6 |       COUNT                        |                  |       |       |       |            |       |
|   7 |        CONNECT BY WITHOUT FILTERING|                  |       |       |       |            |       |
|   8 |         FAST DUAL                  |                  |     1 |       |       |     2   (0)| 00:00:01 |
|   9 |      BUFFER SORT                   |                  |  1000 | 17000 |       |     7  (15)| 00:00:01 |
|  10 |       VIEW                         |                  |  1000 | 17000 |       |     5  (20)| 00:00:01 |
|  11 |        HASH UNIQUE                 |                  |  1000 | 17000 |       |     5  (20)| 00:00:01 |
|  12 |         TABLE ACCESS FULL          | CUSTORMER_ORDERS |  1000 | 17000 |       |     4   (0)| 00:00:01 |
|  13 |    VIEW                            |                  |  1000 | 35000 |       |     5  (20)| 00:00:01 |
|  14 |     HASH GROUP BY                  |                  |  1000 | 39000 |       |     5  (20)| 00:00:01 |
|  15 |      TABLE ACCESS FULL             | CUSTORMER_ORDERS |  1000 | 39000 |       |     4   (0)| 00:00:01 |
---------------------------------------------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
   2 - access("SQ"."YEAR_MONTH"="CO"."YEAR_MONTH" AND "SQ"."NAME"="CO"."NAME")
 
Note
-----
   - dynamic sampling used for this statement
 
 
47 rows selected.
 
SQL>
 
 
从执行计划看,这个SQL的cost是154!

接下来看看用Partition Outer Join如何实现,

SQL>
1  WITH year_months AS
2  (
3      SELECT to_char
4              ( ADD_MONTHS(
5                    DATE'2007-01-01', ROWNUM - 1), 'YYYYMM') AS year_month
6      FROM dual
7      CONNECT BY ROWNUM <= 12
8  )
9  SELECT co.name,
0         ym.year_month,
1         NVL(SUM(co.amt), 0) as total_amount
2  FROM  year_months ym
3  LEFT OUTER JOIN
4        custormer_orders co
5  PARTITION BY (co.name)
6  ON   (TO_CHAR(co.dt, 'YYYYMM') = ym.year_month)
7  GROUP BY
8        co.name,
9        ym.year_month
0  ORDER BY
1        co.name,
2*       ym.year_month

NAME                           YEAR_M TOTAL_AMOUNT
------------------------------ ------ ------------
ORDSYS                         200701            0
ORDSYS                         200702            0
ORDSYS                         200703            0
ORDSYS                         200704            0
ORDSYS                         200705            0
ORDSYS                         200706            0
ORDSYS                         200707            0
ORDSYS                         200708            0
ORDSYS                         200709            0
ORDSYS                         200710       224973
ORDSYS                         200711            0

ORDSYS                         200712            0


WKSYS                          200701            0
WKSYS                          200702            0
WKSYS                          200703            0
WKSYS                          200704            0
WKSYS                          200705            0
WKSYS                          200706            0
WKSYS                          200707            0
WKSYS                          200708            0
WKSYS                          200709            0
WKSYS                          200710         3830
WKSYS                          200711            0

WKSYS                          200712            0

24 rows selected.

SQL>

 

注意PARTITION OUTER JOIN 的用法,

(xx: table 1) LEFT OUTER JOIN (xxx: table 2) PARTITION BY (xxx: column name) ON (xxx: join condition)

相应的执行计划如下,

SQL> select plan_table_output
  2  from table(dbms_xplan.display_cursor); 
 
PLAN_TABLE_OUTPUT
------------------------------------------------------------------------------------------------------------
------------------------------------------------------------------------------------------------------------ 
 
SQL_ID  4np7tnbh0pqa2, child number 0
-------------------------------------
WITH year_months AS (     SELECT to_char             ( ADD_MONTHS(
             DATE'2007-01-01', ROWNUM - 1), 'YYYYMM') AS year_month
FROM dual     CONNECT BY ROWNUM < 12 ) SELECT co.name,
ym.year_month,        NVL(SUM(co.amt), 0) as total_amount FROM
year_months ym LEFT OUTER JOIN       custormer_orders co  PARTITION BY
(co.name) ON   (TO_CHAR(co.dt, 'YYYYMM') = ym.year_month) GROUP BY
 co.name,       ym.year_month ORDER BY        co.name,
ym.year_month 
 
Plan hash value: 4042059898 
 
-------------------------------------------------------------------------------------------------------
| Id  | Operation                          | Name             | Rows  | Bytes | Cost (%CPU)| Time     |
-------------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT                   |                  |       |       |    10 (100)|          |
|   1 |  SORT GROUP BY                     |                  |     2 |   138 |    10  (20)| 00:00:01 |
|   2 |   VIEW                             |                  |    10 |   690 |     9  (12)| 00:00:01 |
|   3 |    NESTED LOOPS PARTITION OUTER    |                  |    10 |   230 |     9  (12)| 00:00:01 |
|   4 |     BUFFER SORT                    |                  |       |       |            |          |
|   5 |      VIEW                          |                  |     1 |     5 |     2   (0)| 00:00:01 |
|   6 |       COUNT                        |                  |       |       |            |          |
|   7 |        CONNECT BY WITHOUT FILTERING|                  |       |       |            |          |
|   8 |         FAST DUAL                  |                  |     1 |       |     2   (0)| 00:00:01 |
|*  9 |     FILTER                         |                  |       |       |            |          |
|  10 |      SORT PARTITION JOIN           |                  |    10 |   180 |     5  (20)| 00:00:01 |
|  11 |       TABLE ACCESS FULL            | CUSTORMER_ORDERS |    10 |   180 |     4   (0)| 00:00:01 |
------------------------------------------------------------------------------------------------------- 
 
Predicate Information (identified by operation id):
--------------------------------------------------- 
 
   9 - filter("YM"."YEAR_MONTH"=TO_CHAR(INTERNAL_FUNCTION("CO"."DT"),'YYYYMM')) 
 
35 rows selected. 
 
SQL>
 
 
 
 
 
 
 

从执行计划上看,应用Partition Outer Join之后, cost 从154降到了10!性能得到大大提升!

原文地址:https://www.cnblogs.com/fangwenyu/p/1639230.html