Oracle实现线性回归

线性回归的具体算法说明如下:

1. 用所给样本求出两个相关变量的(算术)平均值:X_=(x1+x2+x3+...+xn)/n;Y_=(y1+y2+y3+...+yn)/n    

2. 分别计算分子和分母,分子=(x1y1+x2y2+x3y3+...+xnyn)-nX_Y_;分母=(x1^2+x2^2+x3^2+...+xn^2)-n*X_^2

3. 计算b = 分子/分母

具体oracle实现过程如下

 1 CREATE OR REPLACE PROCEDURE SP_BILINEAR_REGRESSION(NEW_X IN VARCHAR2,
 2                                                    NEW_Y IN VARCHAR2,
 3                                                    TABLENAME  IN VARCHAR2,
 4                                                    COLNAME_Y1 IN VARCHAR2,
 5                                                    COLNAME_Y2 IN VARCHAR2,
 6                                                    COLNAME_X1 IN VARCHAR2,
 7                                                    COLNAME_X2 IN VARCHAR2)
 8 IS     V_AVER_X      NUMBER(23,9);
 9        V_AVER_Y      NUMBER(23,9);
10        CNT           PLS_INTEGER;
11        B_FORECAST    NUMBER(23,9);
12        A_FORECAST    NUMBER(23,9);
13        V_TABLENAME   VARCHAR2(50);
14        V_SQL         VARCHAR2(2000);
15 BEGIN
16       V_TABLENAME := TABLENAME;
17       V_SQL := 'SELECT COUNT(1) FROM '||V_TABLENAME;
18       EXECUTE IMMEDIATE V_SQL INTO CNT;
19       IF CNT <> 0 THEN --ADD COLUMN TYPE FILTER
20         V_SQL := 'SELECT 1/2*AVG('||COLNAME_X1||'+'||COLNAME_X2||') FROM '||V_TABLENAME;EXECUTE IMMEDIATE V_SQL INTO V_AVER_X;
21         V_SQL := 'SELECT 1/2*AVG('||COLNAME_Y1||'+'||COLNAME_Y2||') FROM '||V_TABLENAME;EXECUTE IMMEDIATE V_SQL INTO V_AVER_Y;
22         V_SQL := 'SELECT (SUM('||COLNAME_X1||' * '||COLNAME_Y1||'+'||COLNAME_X2||' * '||COLNAME_Y2||') - '||CNT||' * 2 * '||V_AVER_X||' * '||V_AVER_Y||')/
23                          (SUM('||COLNAME_X1||' * '||COLNAME_X1||'+'||COLNAME_X2||' * '||COLNAME_X2||') - '||CNT||' * 2 * '||V_AVER_X||' * '||V_AVER_X||')
24                   FROM '||V_TABLENAME;
25         EXECUTE IMMEDIATE V_SQL INTO B_FORECAST;
26         V_SQL := 'SELECT '||V_AVER_Y||' - '||B_FORECAST||' * '||V_AVER_X||' FROM DUAL';
27         EXECUTE IMMEDIATE V_SQL INTO A_FORECAST;
28       END IF;
29       V_SQL := 'UPDATE '||V_TABLENAME||' T SET T.'||NEW_Y||' = T.'||NEW_X||' * '||B_FORECAST||' + '||A_FORECAST;
30       EXECUTE IMMEDIATE V_SQL;
31       COMMIT;
32 END SP_BILINEAR_REGRESSION;

调用如下:

1 CALL SP_BILINEAR_REGRESSION(NEW_X => 'MATURITYDAYS',
2                             NEW_Y => 'DISCOUNTFACTOR',
3                             TABLENAME => 'TMP_DATA_COLLECT_IRS',
4                             COLNAME_Y1 => 'DISCOUNT1',
5                             COLNAME_Y2 => 'DISCOUNT2',
6                             COLNAME_X1 => 'DAYS1',
7                             COLNAME_X2 => 'DAYS2');

其中,第一个参数是线性函数的x值所在字段,第二个是基于x和线性方程计算出来的y值所需要更新的字段,另外四个是传入表和字段名称。

原文地址:https://www.cnblogs.com/aiden-liu/p/10149285.html