当硬件成为瓶颈时怎么提高数据仓库的加载?

   很多公司流行使用数据仓库进行数据分析,一般从线上数据源备库(mirror,logshipping,slave等)抽取到ods 层

在从ods层到dw再到dm.特别在ods层到dw时,数据的清洗装载需要一定的时间和硬件资源.

但是当硬件成为瓶颈时,怎么能快速完成清洗转载,及时的提供数据分析?

下面提供一种方法使用Ssis 加载到 ods层后,直接通过分区表把数据加载到 dw 

1 准备

1 /*create filegroup*/
2  ALTER DATABASE [testxwj] ADD FILEGROUP [account_1]
3  go
4  ALTER DATABASE [testxwj] ADD FILEGROUP [account_2]
5  go
6  ALTER DATABASE [testxwj] ADD FILEGROUP [account_3]
7
8  /*create file to filegroup*/
9
10  ALTER DATABASE [testxwj] ADD FILE ( NAME = N'account_1', FILENAME = N'E:\account_1.ndf' , SIZE = 409600KB , FILEGROWTH = 20480KB ) TO FILEGROUP [account_1]
11  GO
12  ALTER DATABASE [testxwj] ADD FILE ( NAME = N'account_2', FILENAME = N'E:\account_2.ndf' , SIZE = 409600KB , FILEGROWTH = 20480KB ) TO FILEGROUP [account_2]
13  GO
14 ALTER DATABASE [testxwj] ADD FILE ( NAME = N'account_3', FILENAME = N'E:\account_3.ndf' , SIZE = 409600KB , FILEGROWTH = 20480KB ) TO FILEGROUP [account_3]
15 GO
16

2 使用ssis copy table

1 sp_spaceused accountdetail;

1
2 /* delete EarnTime is not null*/
3
4
5 /*23 sec*/
6 delete from accountdetail where EarnTime is null
7 /*26 sec*/
8 delete from accountdetail where isnull(CommitStatus,0)<1
9 /*12 sec*/
10 delete from accountdetail where isnull(EarnStatus,0) =0

对传输过来的表进行分区

1
2 /*create partition function*/
3 declare @bdate char(8),@edate varchar(8),@sql varchar(500)
4 select
5 @bdate=convert(char(8),GETDATE()-1 ,112)
6 ,@edate=convert(char(8),GETDATE() ,112)
7 select @bdate,@edate;
8 set @sql='
9 CREATE PARTITION FUNCTION ac_EarnTime (datetime)
10 AS
11 RANGE RIGHT FOR VALUES ( '''+@bdate+''' ,'''+@edate+''')'
12 execute(@sql)
13 /*create partition schema*/
14 CREATE PARTITION SCHEME ac_schema_ac_EarnTime
15 AS PARTITION ac_EarnTime TO (account_1,account_2,account_3);
16
17 /*create partition table */
18
19 alter table accountdetail
20 alter column EarnTime datetime not null;
21
22 alter TABLE accountdetail
23 add CONSTRAINT [PK_PARTITIONmis] PRIMARY KEY
24 ( id,EarnTime
25 )ON ac_schema_ac_EarnTime(EarnTime)
26

把分区partition 2指向给 dw 值得注意的是 accountdetail_dw 必须跟partition 2 分区所在同一个文件组

1
2 /*switch accountdetail to accountdetail_dwl*/
3
4 ALTER TABLE accountdetail SWITCH PARTITION 2 TO accountdetail_dw ;
5
6 /**/

整个过程在 5分钟内.数据仓库最重要的还在当初的设计和选型.

原文地址:https://www.cnblogs.com/xwj1985/p/1803272.html