X 开窗函数应用 first_value ,last_value ,lead,lag ,ignore nulls ,nulls last

oracle通过分析函数获取当前行之前的不为空的值
使用last_value，但是要注明ignore nulls 
比如有数据如下：
RN      ADDRESS     ARRIVAL_TIME         USERID    
 ------  ----------  -------------------  --------- 
 1       A1          2012-7-9 下午12:03:21  1         
 (null)  A2          2012-7-9 下午12:04:21  2         
 (null)  A3          2012-7-9 下午12:05:21  3         
 2       A1          2012-7-9 下午12:08:21  4         
 (null)  A2          2012-7-9 下午12:09:21  5         
 (null)  A3          2012-7-9 下午12:10:21  6         
 3       A1          2012-7-9 下午12:13:21  7         
 (null)  A3          2012-7-9 下午12:15:21  8         
 4       A1          2012-7-9 下午12:18:23  9         
 5       A1          2012-7-9 下午12:19:21  10        
 (null)  A2          2012-7-9 下午12:20:21  11        
 (null)  A3          2012-7-9 下午12:21:21  12        
 6       A1          2012-7-9 下午12:23:23  13        
 (null)  A2          2012-7-9 下午12:24:21  14 
 
 select rn,address,arrival_time,userid,last_value(rn ignore nulls) over(order by userid) from test
 查询结果如下：
  RN      ADDRESS     ARRIVAL_TIME         USERID     GROUP_T    
 ------  ----------  -------------------  ---------  ---------- 
 1       A1          2012-7-9 下午12:03:21  1          1          
 (null)  A2          2012-7-9 下午12:04:21  2          1          
 (null)  A3          2012-7-9 下午12:05:21  3          1          
 2       A1          2012-7-9 下午12:08:21  4          2          
 (null)  A2          2012-7-9 下午12:09:21  5          2          
 (null)  A3          2012-7-9 下午12:10:21  6          2          
 3       A1          2012-7-9 下午12:13:21  7          3          
 (null)  A3          2012-7-9 下午12:15:21  8          3          
 4       A1          2012-7-9 下午12:18:23  9          4          
 5       A1          2012-7-9 下午12:19:21  10         5          
 (null)  A2          2012-7-9 下午12:20:21  11         5          
 (null)  A3          2012-7-9 下午12:21:21  12         5          
 6       A1          2012-7-9 下午12:23:23  13         6          
 (null)  A2          2012-7-9 下午12:24:21  14         6 

 除了last_value(rn ignore nulls)，到了11g r2，oracle还支持了lag(rn ignore nulls)和lead(rn ignore nulls)

需求，取最新的数据行，但是如果最新的数据行某列中没有数据，那么取次新行某列中的数据：

案例一：

SQL> 
SQL> select * from tab_1;

        ID COL2                 COL_TIME    SHUJU_ID
---------- -------------------- --------- ----------
         1 tijian               02-MAR-21         75
         2 xunjian              02-MAR-21         75
         3 chaxun               02-MAR-21         75
           dba                  02-MAR-21         75
           abc                  02-MAR-21         75
         4 def                  03-MAR-21         75

6 rows selected.
           
           

SQL> select t.*,
       to_char(col_time,'yyyy-mm-dd hh24:mi:ss') time_,
       id,
       coalesce(id,lead(id ignore nulls) over(partition by shuju_id order by col_time desc)) as shiji_id,     ---coalesce，lead 配合 ignore nulls 
       row_number() over(partition by shuju_id order by col_time desc) as rn
  from tab_1 t;   



        ID COL2                 COL_TIME    SHUJU_ID TIME_                       ID   SHIJI_ID         RN
---------- -------------------- --------- ---------- ------------------- ---------- ---------- ----------
         4 def                  03-MAR-21         75 2021-03-03 05:46:21          4          4          1    ----此条数据满足
           abc                  02-MAR-21         75 2021-03-02 14:50:25                     3          2
           dba                  02-MAR-21         75 2021-03-02 14:43:37                     3          3
         3 chaxun               02-MAR-21         75 2021-03-02 14:43:20          3          3          4
         2 xunjian              02-MAR-21         75 2021-03-02 14:43:05          2          2          5
         1 tijian               02-MAR-21         75 2021-03-02 14:42:49          1          1          6

6 rows selected.

方法二：

SQL> select t.*,
       to_char(col_time,'yyyy-mm-dd hh24:mi:ss') time_,
       id id_o,
       last_value(id ignore nulls)over(partition by shuju_id order by col_time ) as shiji_id_1,   --用 last_value的方法写
       lead(id ignore nulls) over(partition by shuju_id order by col_time desc) as shiji_id_2,    --用 lead的方法写
       row_number() over(partition by shuju_id order by col_time desc) as rn
  from tab_1 t ;

        ID COL2                 COL_TIME    SHUJU_ID TIME_                     ID_O SHIJI_ID_1 SHIJI_ID_2         RN
---------- -------------------- --------- ---------- ------------------- ---------- ---------- ---------- ----------
         1 tijian               02-MAR-21         75 2021-03-02 14:42:49          1          1                     5
         2 xunjian              02-MAR-21         75 2021-03-02 14:43:05          2          2          1          4
         3 chaxun               02-MAR-21         75 2021-03-02 14:43:20          3          3          2          3
           dba                  02-MAR-21         75 2021-03-02 14:43:37                     3          3          2
           abc                  02-MAR-21         75 2021-03-02 14:50:25                     3          3          1

SQL>

LAG 窗口函数 说明：



LAG 窗口函数返回位于分区中当前行的上方（之前）的某个给定偏移量位置的行的值。

Syntax

LAG (value_expr [, offset ])
[ IGNORE NULLS | RESPECT NULLS ]
OVER ( [ PARTITION BY window_partition ] ORDER BY window_ordering )
Arguments
value_expr
对其执行函数的目标列或表达式。

offset
一个可选参数，该参数指定要返回其值的当前行前面的行数。偏移量可以是常量整数或计算结果为整数的表达式。如果您未指定偏移量，则 Amazon Redshift 使用 1 作为默认值。偏移量为 0 表示当前行。

IGNORE NULLS
一个可选规范，该规范指示 Amazon Redshift 应跳过 null 值以确定要使用的行。如果未列出 IGNORE NULLS，则包含 Null 值。

注意
您可以使用 NVL 或 COALESCE 表达式将 null 值替换为另一个值。有关更多信息，请参阅NVL 表达式.

RESPECT NULLS
指示 Amazon Redshift 应包含 null 值以确定要使用的行。如果您未指定 IGNORE NULLS，则默认情况下不支持 RESPECT NULLS。

OVER
指定窗口分区和排序。OVER 子句不能包含窗口框架规范。

PARTITION BY window_partition
一个可选参数，该参数设置 OVER 子句中每个组的记录范围。

ORDER BY window_ordering
对每个分区中的行进行排序。

LAG 窗口函数支持使用任何 Amazon Redshift 数据类型的表达式。返回类型与 value_expr. 的类型相同。




Examples
以下示例显示已售给买家 ID 为 3 的买家的票数以及买家 3 的购票时间。要将每个销售与买家 3 的上一销售进行比较，查询要返回每个销售的上一销量。
由于 1/16/2008 之前未进行购买，则第一个上一销量值为 null：





select buyerid, saletime, qtysold,
lag(qtysold,1) over (order by buyerid, saletime) as prev_qtysold
from sales where buyerid = 3 order by buyerid, saletime;

buyerid |      saletime       | qtysold | prev_qtysold
---------+---------------------+---------+--------------
3 | 2008-01-16 01:06:09 |       1 |
3 | 2008-01-28 02:10:01 |       1 |            1
3 | 2008-03-12 10:39:53 |       1 |            1
3 | 2008-03-13 02:56:07 |       1 |            1
3 | 2008-03-29 08:21:39 |       2 |            1
3 | 2008-04-27 02:39:01 |       1 |            2
3 | 2008-08-16 07:04:37 |       2 |            1
3 | 2008-08-22 11:45:26 |       2 |            2
3 | 2008-09-12 09:11:25 |       1 |            2
3 | 2008-10-01 06:22:37 |       1 |            1
3 | 2008-10-20 01:55:51 |       2 |            1
3 | 2008-10-28 01:30:40 |       1 |            2
(12 rows)

错误案例 first_value :

SQL> select * from tab_1;

        ID COL2                 COL_TIME    SHUJU_ID
---------- -------------------- --------- ----------
         1 tijian               02-MAR-21         75
         2 xunjian              02-MAR-21         75
         3 chaxun               02-MAR-21         75
           dba                  02-MAR-21         75
           abc                  02-MAR-21         75
         4 def                  03-MAR-21         75
           efg                  03-MAR-21         75
           111                  26-FEB-21         75

8 rows selected.



SQL> select t.*,
       to_char(col_time,'yyyy-mm-dd hh24:mi:ss') time_,
       id id_o,
       last_value(id ignore nulls)over(partition by shuju_id order by col_time ) as shiji_id_1,
       coalesce(id,lead(id ignore nulls) over(partition by shuju_id order by  col_time desc)) as shiji_id_2,
       first_value(id )over(partition by shuju_id order by col_time desc) as shiji_id_31,
       first_value(id ignore nulls)over(partition by shuju_id order by col_time desc) as shiji_id_32,
       row_number() over(partition by shuju_id order by col_time desc) as rn
  from tab_1 t   ;

        ID COL2                 COL_TIME    SHUJU_ID TIME_                     ID_O SHIJI_ID_1 SHIJI_ID_2 SHIJI_ID_31 SHIJI_ID_32         RN
---------- -------------------- --------- ---------- ------------------- ---------- ---------- ----------  ----------  ---------- ----------
           111                  26-FEB-21         75 2021-02-26 05:59:47                                                      4          8
         1 tijian               02-MAR-21         75 2021-03-02 14:42:49          1          1          1                     4          7
         2 xunjian              02-MAR-21         75 2021-03-02 14:43:05          2          2          2                     4          6
         3 chaxun               02-MAR-21         75 2021-03-02 14:43:20          3          3          3                     4          5
           dba                  02-MAR-21         75 2021-03-02 14:43:37                     3          3                     4          4
           abc                  02-MAR-21         75 2021-03-02 14:50:25                     3          3                     4          3
         4 def                  03-MAR-21         75 2021-03-03 05:46:21          4          4          4                     4          2
           efg                  03-MAR-21         75 2021-03-03 05:52:53                     4          4                                1

8 rows selected.

这个问题不能用 first_value 来解决，因为 对于 null 值在排序中默认是最大值，排在最后面，如果用倒序的话就排在最上面。
这里如果用first_value 那么对应的每行都取第一个值，那么都是null值，即使用
ignore nulls 那么返回的也是null值。


ignore nulls 表示排除null值，但是所有值都是null，那么只能返回null