转Asktom:Pipelined Functions

Pipelined Functions

Can you illustrate the usage of pipelined functions with a simple (EMP, DEPT) example? Under what circumstances can using a pipelined function be effective?

Pipelined functions are simply code that you can pretend is a database table. Pipelined functions give you the (amazing, to me) ability to use SELECT * FROM <PLSQL_FUNCTION>;.

Anytime you think you have the ability to use SELECT * from a function instead of a table, it might be useful. Consider the following extract/ transform/load (ETL) process, whereby a flat file is fed into a PL/SQL function that transforms it and the transformed data is then used to update existing table data. It demonstrates quite a few database features, including external tables, pipelined functions, and MERGE.

To create and use an external table, I need to use a directory object. I'll start by mapping a directory object to the temp directory:

SQL> create or replace 
  2  directory data_dir as '/tmp/'
  3  /
Directory created.

Now, I'll create the external table. Part of its definition looks like a control file, because part of creating an external table is, in effect, creating a control file:

SQL> create table external_table
  2  (EMPNO NUMBER(4) ,
  3   ENAME VARCHAR2(10),
  4   JOB VARCHAR2(9),
  5   MGR NUMBER(4),
  6   HIREDATE DATE,
  7   SAL NUMBER(7, 2),
  8   COMM NUMBER(7, 2),
  9   DEPTNO NUMBER(2)
 10  )
 11  ORGANIZATION EXTERNAL
 12  (type oracle_loader
 13    default directory data_dir
 14    access parameters
 15    (fields terminated by ',')
 16    location ('emp.dat')
 17  )
 18  /
Table created.

Now I'll use the flat utility to create a flat file from my EMP table data. You can find the flat utility at asktom.oracle.com/~tkyte/flat.

SQL> host flat scott/tiger - 
 > emp > /tmp/emp.dat

Now I am ready to test the external table; the flat file I created now works just like a database table:

SQL> select empno, ename, hiredate 
  2   from external_table
  3  where ename like '%A%'
  4  /
 
     EMPNO   ENAME     HIREDATE
 ----------   ------    ---------
      7499   ALLEN     20-FEB-81
      7521   WARD      22-FEB-81
      7654   MARTIN    28-SEP-81
      7698   BLAKE     01-MAY-81
      7782   CLARK     09-JUN-81
      7876   ADAMS     12-JAN-83
      7900   JAMES     03-DEC-81
7 rows selected.

I'll set up a PL/SQL ETL routine to ingest the flat file and output live data to be merged or inserted. A pipelined function needs to return a collection type, and I want to return a collection that looks like the EMP table itself, so I create the scalar object type and then I create a table of that type:

SQL> create or replace type 
  2  emp_scalar_type as object
  3  (EMPNO NUMBER(4) ,
  4   ENAME VARCHAR2(10),
  5   JOB VARCHAR2(9),
  6   MGR NUMBER(4),
  7   HIREDATE DATE,
  8   SAL NUMBER(7, 2),
  9   COMM NUMBER(7, 2),
 10   DEPTNO NUMBER(2)
 11  )
 12  /
Type created.
SQL> create or replace type 
  2  emp_table_type as table 
  3  of emp_scalar_type
  4  /
Type created.

Now I am ready to create the pipelined function itself. Note that the ETL function below is very simplistic; it involves modifying the ename column, but you can include any complex logic you want, including the ability to log error records and the like:

create or replace function emp_etl
(p_cursor in sys_refcursor)
return emp_table_type
PIPELINED
as
   l_rec  external_table%rowtype;
begin
   loop
      fetch p_cursor into l_rec;
      exit when (p_cursor%notfound);

         -- validation routine
      -- log bad rows elsewhere
      -- lookup some value
      -- perform conversion
      pipe row( 
      emp_scalar_type(l_rec.empno,
         LOWER(l_rec.ename),
         l_rec.job, 
         l_rec.mgr, 
         l_rec.hiredate, 
         l_rec.sal,
         l_rec.comm, 
         l_rec.deptno) );
        end loop;
        return;
end;
/
Function created.

The emp_etl pipelined function works just like a table. The following query selects columns (empno, ename) from the function, and the function selects all columns from the external table:

SQL> select empno, ename
  2    from TABLE(emp_etl(
  3     cursor(select * 
  4               from external_table 
  5               ) ) )
  6   where ename like '%a%';
     EMPNO  ENAME
 ----------  ------
      7499  allen
      7521  ward
      7654  martin
      7698  blake
      7782  clark
      7876  adams
      7900  james
7 rows selected.

Note the use of the keyword PIPELINED in the definition of this function; the keyword is mandatory in the making of a pipelined function. Also note the use of the pipe row directive in PL/SQL—that is the magic that makes a pipelined function really interesting. The pipe row directive returns data to the client immediately, meaning that I am getting output from this function in my client routine before the function generates the last row of data. If the cursor I send to this pipelined function returns 1,000,000 rows, I will not have to wait for PL/SQL to process all 1,000,000 rows to get the first row; data will start coming back as soon as it is ready. That is why these are called pipelined functions: Data streams—as if in a big pipe—from the cursor to the PL/SQL function to the caller.

Now, to finish the job, I'll create a table of data I would like to refresh from the source system, which sends me the flat file I produced above. The logic is as follows: If the record already exists in my database, UPDATE the ename and the sal columns; if the record does not exist, INSERT it. I'll start with some of the data from the EMP table:

SQL> create table emp as 
  2 select * from scott.emp 
  3 where mod(empno,2) = 0;
Table created.

And here is the MERGE, which manages data from the flat file, through ETL, straight to the table, without hitting the disk with staging files:

SQL> merge into EMP e1
  2  using (select *
  3            from TABLE
  4           (emp_etl(
  5           cursor(select * 
  6           from external_table)) 
  7        )
  8  ) e2
  9  on (e2.empno = e1.empno)
 10  when matched then
 11   update set e1.sal = e2.sal, 
 12              e1.ename = e2.ename
 13  when not matched then
 14   insert (empno, ename, job, mgr,
 15      hiredate, sal, comm, deptno)
 16   values (e2.empno, e2.ename, 
 17          e2.job, e2.mgr,
 18           e2.hiredate, e2.sal, 
 19           e2.comm, e2.deptno)
 20  /
14 rows merged.

魔兽就是毒瘤，大家千万不要玩。