分析函数计算起始,结束日期.

You Asked

I have a table from a 3rd party application that is used to track
an order through the various manufacturing operations. A subset of
the information looks like this:

   ORDER OPN STATION CLOSE_DATE
   ----- --- ------- ----------
   12345   10 RECV     07/01/2003
   12345   20 MACH1    07/02/2003
   12345   25 MACH1    07/05/2003
   12345   30 MACH1    07/11/2003
   12345   36 INSP1    07/12/2003
   12345   50 MACH1    07/16/2003
   12345   90 MACH2    07/30/2003
   12345 990 STOCK    08/01/2003

Where each row is a process that the order had to go through,
with OPN being the order of the processes.

What I would like to receive is the output grouped by consecutive
STATION values and include the start and close dates for each
STATION group. The start date is defined as the date the prior
station closed. So the output expected from the above data subset
would be:

   ORDER STATION START_DATE CLOSE_DATE
   ----- ------- ---------- ----------
   12345 RECV                 07/01/2003
   12345 MACH1    07/01/2003 07/11/2003
   12345 INSP1    07/11/2003 07/12/2003
   12345 MACH1    07/12/2003 07/16/2003
   12345 MACH2    07/16/2003 07/30/2003
   12345 STOCK    07/30/2003 08/01/2003

Is this possible? I've tried using analytics, but I can't seem to
get what I want. I can use the LAG function to get the start and
close dates, grouped by STATION, but it will group all the different
STATION values together (i.e. all MACH1 STATIONS will be grouped
together), not just the consecutive STATION values. I could use
procedural code to get this answer, but I was wanting to see if
it could be done in 1 statement.

I'm sure it will be something easy, but I've been racking my tiny
brain over this for the last few days and can't come up with a
solution. Can you help?

Many thanks,

Michael T.

and we said...

So, we want to keep rows that are:

a) the first row in the partition "where lag_station is null"
b) the last row in the partition "where lead_station is null"
c) the first of a possible pair "where lag_station <> station"
d) the second of a possible pair "where lead_station <> station"

This query does that:

ops$tkyte@ORA920> select order#,
2         station,
3         lag_close_date,
4         close_date,
5         decode( lead_station, station, 1, 0 ) first_of_pair,
6         decode( lag_station, station, 1, 0 ) second_of_pair
7    from (
8 select order#,
9         lag(station) over (partition by order# order by close_date)
10                                                              lag_station,
11         lead(station) over (partition by order# order by close_date)
12                                                              lead_station,
13             station,
14             close_date,
15         lag(close_date) over (partition by order# order by close_date)
16                                                             lag_close_date,
17         lead(close_date) over (partition by order# order by close_date)
18                                                             lead_close_date
19    from t
20         )
21   where lag_station is null
22          or lead_station is null
23          or lead_station <> station
24          or lag_station <> station
25 /

ORDER# STATION LAG_CLOSE_ CLOSE_DATE FIRST_OF_PAIR SECOND_OF_PAIR
------ ------- ---------- ---------- ------------- --------------
12345 RECV               07/01/2003             0              0
12345 MACH1   07/01/2003 07/02/2003             1              0
12345 MACH1   07/05/2003 07/11/2003             0              1
12345 INSP1   07/11/2003 07/12/2003             0              0
12345 MACH1   07/12/2003 07/16/2003             0              0
12345 MACH2   07/16/2003 07/30/2003             0              0
12345 STOCK   07/30/2003 08/01/2003             0              0

7 rows selected.

we can see with the 1's the first/second of a pair in there. All we need to do now is "reach
forward" for the first of a pair and grab the close date from the next record:

ops$tkyte@ORA920> select order#,
2         station,
3         lag_close_date,
4         close_date
5    from (
6 select order#,
7         station,
8         lag_close_date,
9         decode( lead_station,
10                 station,
11                 lead(close_date) over (partition by order# order by close_date),
12                 close_date ) close_date,
13         decode( lead_station, station, 1, 0 ) first_of_pair,
14         decode( lag_station, station, 1, 0 ) second_of_pair
15    from (
16 select order#,
17         lag(station) over (partition by order# order by close_date)
18                                                              lag_station,
19         lead(station) over (partition by order# order by close_date)
20                                                              lead_station,
21             station,
22             close_date,
23         lag(close_date) over (partition by order# order by close_date)
24                                                             lag_close_date,
25         lead(close_date) over (partition by order# order by close_date)
26                                                             lead_close_date
27    from t
28         )
29   where lag_station is null
30          or lead_station is null
31          or lead_station <> station
32          or lag_station <> station
33         )
34   where second_of_pair <> 1
35 /

ORDER# STATION LAG_CLOSE_ CLOSE_DATE
------ ------- ---------- ----------
12345 RECV               07/01/2003
12345 MACH1   07/01/2003 07/11/2003
12345 INSP1   07/11/2003 07/12/2003
12345 MACH1   07/12/2003 07/16/2003
12345 MACH2   07/16/2003 07/30/2003
12345 STOCK   07/30/2003 08/01/2003

6 rows selected.

and discard the second of pairs row

That is another way to do it (and an insight into how I develop analytic queries -- adding extra
columns like that just to see visually what I want to do)

魔兽就是毒瘤，大家千万不要玩。