oracle按日期分组 (写一个定时任务, 每天晚上运行, 统计某渠道近30天里每天的数据)

 

1. 问题描述:

 

 有一张表叫domains_report, 有 渠道号, 日期,登陆数,注册数, 交易笔数, 交易金额, 退款笔数,退款金额 这几个字段, 字段是分散在4张表中(每个表都有日期字段. 可以限定统计区间). 需求是 将统计的数据查询出来, 然后插入到domains_report表里面去.

2. 业务思路:

1.先在数据库编写生成统计数据的查询sql.

2.将写好的查询sql放到java程序里面

3.查询sql , 得到list(近30天的数据) 结果.  先delete表domains_report表里近30天的数据, 遍历list, 把数据 insert到domains_report表里去. 

4.设置java程序,每天凌晨启动. 

补充:

  • 对于第3点, 为什么要先删除domains_report近30天的数据, 再插入domains_report近30天的数据?

          因为, 不删除直接插入的话, 会报主键唯一错误. 因为日期report_date是唯一键.

          并且, delete和 insert 要放在一个事物里面. 要么都一起成功,要么一起失败. 

  • 在把查询sql放到数据库里面去的时候, 最好把一些参数, 都写在配置文件里面 . 以及定时任务设置运行的时间, 也放在配置文件里面

3. 在统计查询的sql过程中, 很快就发现2大问题:

1. 发现4个表一起查询, 会有重复数据
2. 从8月1号, 到8月30号, 如果没有数据, 那天什么都不显示. 如这位网友遇到的问题一样: https://blog.csdn.net/jie11447416/article/details/50887888

为了解决这2个问题, 参考网友的做法,可以解决.  建立一张日历表, 查询的时候, 关联日历表即可.  但是网友是mysql数据库, 我是orcal, 于是函数和方法找了替代.

a . 建立一张日期表 calendar

create table CALENDAR
(
  datelist VARCHAR2(1024) not null
)
;
alter table CALENDAR
  add constraint PK_CALENDAR_ITEM primary key (DATELIST);

b. 生成日期表的数据, 把数据导入到 calendar日期表里面去

SELECT A.DATAS
  FROM (SELECT TO_CHAR(TO_DATE('20190101', 'yyyyMMdd') + ROWNUM - 1,
                       'yyyyMMdd') AS DATAS
          FROM DUAL
        CONNECT BY ROWNUM <=
                   TRUNC(TO_DATE('20500902', 'yyyyMMdd') -
                         TO_DATE('20190101', 'yyyyMMdd')) + 1) A

c. 在查询的语句中, 要关联 calendar日期表,就解决这个问题啦. 下面这个是查询数据的完整sql

select a1.domains_id,
       a1.report_date,
       a3.login_count,
       a2.regin_count,
       a1.trans_count,
       a1.trans_sum,
       a4.refund_count,
       a4.refund_sum
  from (select t2.report_date,
               nvl(t1.trans_count, 0) trans_count,
               nvl(t1.trans_sum, 0) trans_sum,
               nvl(t1.domains_id, '04103000000001000') domains_id
          from (select to_char(to_date(t.trans_time, 'yyyy-MM-dd hh24:mi:ss'),
                               'yyyyMMdd') trans_time,
                       count(t.trans_nbr) trans_count,
                       sum(t.trans_amount) trans_sum,
                       t.domains_id
                  from trans_item t
                 where t.domains_id = '04103000000001000'
                   and t.stats in (8, 9)
                 group by to_char(to_date(t.trans_time,
                                          'yyyy-MM-dd hh24:mi:ss'),
                                  'yyyyMMdd'),
                          domains_id) t1
         right join (SELECT datelist as report_date, null, null, null
                      FROM calendar t
                     where SYSDATE - 30 <=
                           trunc(to_date(t.datelist, 'yyyyMMdd'))
                       and trunc(to_date(t.datelist, 'yyyyMMdd')) <= sysdate) t2
            on t1.trans_time = t2.report_date) a1,
       (select t2.dday, nvl(t1.regin_count, 0) regin_count
          from (select count(1) regin_count,
                       to_char(to_date(t.create_time, 'yyyy-MM-dd hh24:mi:ss'),
                               'yyyyMMdd') create_time
                  from domains_users t
                 where t.domains_id = '04103000000001000'
                 group by to_char(to_date(t.create_time,
                                          'yyyy-MM-dd hh24:mi:ss'),
                                  'yyyyMMdd')) t1
         right join (SELECT datelist as dday, null, null
                      FROM calendar t
                     where SYSDATE - 30 <=
                           trunc(to_date(t.datelist, 'yyyyMMdd'))
                       and trunc(to_date(t.datelist, 'yyyyMMdd')) <= sysdate) t2
            on t1.create_time = t2.dday) a2,
       (select t2.dday, nvl(t1.login_count, 0) login_count
          from (select count(1) login_count,
                       to_char(to_date(t.login_time, 'yyyy-MM-dd hh24:mi:ss'),
                               'yyyyMMdd') login_time
                  from login_log t
                 where t.domains_id = '04103000000001000'
                 group by to_char(to_date(t.login_time,
                                          'yyyy-MM-dd hh24:mi:ss'),
                                  'yyyyMMdd')) t1
         right join (SELECT datelist as dday, null
                      FROM calendar t
                     where SYSDATE - 30 <=
                           trunc(to_date(t.datelist, 'yyyyMMdd'))
                       and trunc(to_date(t.datelist, 'yyyyMMdd')) <= sysdate) t2
            on t1.login_time = t2.dday) a3,
       (select t2.dday,
               nvl(t1.refund_count, 0) refund_count,
               nvl(t1.refund_sum, 0) refund_sum
          from (select nvl(count(1), '0') refund_count,
                       sum(ti.refund_amount) refund_sum,
                       to_char(to_date(ti.refund_time,
                                       'yyyy-MM-dd hh24:mi:ss'),
                               'yyyyMMdd') refund_time
                  from refund_item ti, trans_item tt
                 where ti.trans_nbr = tt.trans_nbr
                   and tt.domains_id = '04103000000001000'
                 group by to_char(to_date(ti.refund_time,
                                          'yyyy-MM-dd hh24:mi:ss'),
                                  'yyyyMMdd')) t1
         right join (SELECT datelist as dday, null
                      FROM calendar t
                     where SYSDATE - 30 <=
                           trunc(to_date(t.datelist, 'yyyyMMdd'))
                       and trunc(to_date(t.datelist, 'yyyyMMdd')) <= sysdate) t2
            on t1.refund_time = t2.dday) a4
 where a1.report_date = a2.dday
   and a2.dday = a3.dday
   and a3.dday = a4.dday

运行出来的效果图如下:

补充:

删除近30天统计数据的sql , 这个时间 30代表30天,  也要放在配置文件里面.

delete from domains_report t where t.report_date in (  SELECT *   FROM calendar t  where SYSDATE - 30 <= trunc(to_date(t.datelist, 'yyyyMMdd')) and trunc(to_date(t.datelist, 'yyyyMMdd')) <= sysdate)

.

~~~~~~~~~~~~~~~~~静下心来想想, 在写这个定时程序中, 细节怎么实现. 技术怎么实现.   多试试. 一定可以的. 加油!~~~~~~~~~~~~~~~~~

原文地址:https://www.cnblogs.com/PinkPink/p/11466848.html