Slowly Changing Dimensions - more details...

Let's look at code samples for the Type 2 SCD

In a previous blog entry, I discussed the concept of the Slowly Changing Dimension, or SCD. A good example is how to handle the analysis of sales by customer region when a customer moves from one region to another. The way to address this situation is the Type 2 SCD design. Let's take a look...

The Type 2 SCD design requires that an attribute be identified as a historical attribute, so that the system tracks changes to that attribute and analysis remains accurate across multiple dimensions, including the time dimension. This is accomplished through a "surrogate key" generated in the dimension table. Consider the design of the DimCustomer dimension table in the data warehouse.

    CREATE TABLE [dbo].[DimCustomer](
        [CustomerSK] [int] IDENTITY(1,1) NOT NULL,  -- surrogate key
        [CustomerKey] [int] NOT NULL,               -- business key from the OLTP source
        [FirstName] [nvarchar](50) NULL,
        [LastName] [nvarchar](50) NULL,
        [GeographyKey] [int] NOT NULL,              -- historical attribute
        [StartDate] [datetime] NULL,
        [EndDate] [datetime] NULL,                  -- NULL marks the current row
        CONSTRAINT [PK_DimCustomer] PRIMARY KEY CLUSTERED ([CustomerSK] ASC)
            WITH (PAD_INDEX = OFF, IGNORE_DUP_KEY = OFF) ON [PRIMARY]
    ) ON [PRIMARY]

In this case, GeographyKey is the "historical attribute", because we want to track any change to its value for a particular customer. CustomerKey is the "business key": it is the primary key in the operational OLTP database, but it is now an ordinary attribute in the dimension table. CustomerSK is the surrogate key, which is generated by the system using the IDENTITY property and is the primary key of the dimension table. The actual value of the surrogate key is of little significance, except that it is unique and is used to form the relationship with the fact table.

    CREATE TABLE [dbo].[FactInternetSales](
        [OrderQuantity] [smallint] NOT NULL,
        [SalesAmount] [money] NOT NULL,
        [DiscountPercentage] [float] NOT NULL,
        [DiscountAmount] [float] NOT NULL,
        [ProductCost] [money] NOT NULL,
        [ProductKey] [int] NOT NULL,
        [CustomerSK] [int] NOT NULL,
        [OrderDateTimeKey] [int] NOT NULL
    ) ON [PRIMARY]
    GO
    ALTER TABLE [dbo].[FactInternetSales] WITH CHECK
        ADD CONSTRAINT [FK_FactInternetSales_DimCustomer]
        FOREIGN KEY([CustomerSK]) REFERENCES [dbo].[DimCustomer] ([CustomerSK])
    GO
    ALTER TABLE [dbo].[FactInternetSales]
        CHECK CONSTRAINT [FK_FactInternetSales_DimCustomer]
    GO

To populate the DimCustomer dimension table, we could use the following query to access data from the fully normalized operational OLTP database:

    SELECT Person.Contact.FirstName,
           Person.Contact.LastName,
           Sales.Customer.CustomerID,
           Sales.Customer.TerritoryID
    FROM Person.Contact
        INNER JOIN Sales.Individual
            ON Person.Contact.ContactID = Sales.Individual.ContactID
        RIGHT OUTER JOIN Sales.Customer
            ON Sales.Individual.CustomerID = Sales.Customer.CustomerID

Note that the CustomerSK surrogate key is system generated and increments by one for each new row added to the dimension table. Note also that the StartDate and EndDate attributes are populated using the Slowly Changing Dimension transform, or similar logic, within the SSIS ETL incremental load process. For each customer move, the EndDate of the existing row is updated to the date on which the customer ceased to live in the old region (GeographyKey). A new row is then inserted for the same customer with the new region code and the EndDate set to NULL to indicate the current region. The CustomerSK surrogate key guarantees a unique primary key even though the CustomerKey value is now duplicated.

When the fact table is loaded, the fact table load step within the SSIS ETL incremental load process can use a sub-select to look up the CustomerSK surrogate key of the customer's current row, which gives the correct foreign key value in the fact table. If the incremental load is performed this way, each sale stays attached to the region the customer lived in at the time, so subsequent analysis is accurate and end users do not have to change the way they query the data.

Here is the inner query of the fact table load query that populates the CustomerSK foreign key in the fact table:

    (SELECT CustomerSK
     FROM AdvWorksDW1.dbo.DimCustomer AS dc
     WHERE (dc.CustomerKey = Sales.SalesOrderHeader.CustomerID)
       AND (dc.EndDate IS NULL))

I developed and tested these code samples by adapting the AdventureWorks BI Sample available at http://sqlserversamples.codeplex.com/

This is the "magic" of the Type 2 SCD.

Cheers,
Brian.
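For readers who want to try the expire-and-insert step and the current-row lookup described above without a SQL Server instance, here is a minimal sketch in Python. It is not the SSIS Slowly Changing Dimension transform: it uses the standard-library sqlite3 module with an in-memory SQLite database standing in for the warehouse. The tables are trimmed to the columns that matter, and the function names (apply_customer_region, load_sale, sales_by_region), the customer number and the dates are all assumptions made for illustration.

```python
import sqlite3


def create_schema(conn):
    # Trimmed-down stand-ins for DimCustomer and FactInternetSales (SQLite, not SQL Server).
    conn.executescript("""
        CREATE TABLE DimCustomer (
            CustomerSK   INTEGER PRIMARY KEY AUTOINCREMENT,  -- surrogate key
            CustomerKey  INTEGER NOT NULL,                   -- business key
            GeographyKey INTEGER NOT NULL,                   -- historical attribute
            StartDate    TEXT,
            EndDate      TEXT                                -- NULL marks the current row
        );
        CREATE TABLE FactInternetSales (
            SalesAmount REAL NOT NULL,
            CustomerSK  INTEGER NOT NULL REFERENCES DimCustomer (CustomerSK)
        );
    """)


def apply_customer_region(conn, customer_key, geography_key, change_date):
    """Hypothetical Type 2 step: expire the current row, then insert a new one."""
    current = conn.execute(
        "SELECT CustomerSK, GeographyKey FROM DimCustomer "
        "WHERE CustomerKey = ? AND EndDate IS NULL",
        (customer_key,),
    ).fetchone()
    if current is not None:
        if current[1] == geography_key:
            return current[0]  # region unchanged, keep the current row
        conn.execute(
            "UPDATE DimCustomer SET EndDate = ? WHERE CustomerSK = ?",
            (change_date, current[0]),
        )
    cursor = conn.execute(
        "INSERT INTO DimCustomer (CustomerKey, GeographyKey, StartDate, EndDate) "
        "VALUES (?, ?, ?, NULL)",
        (customer_key, geography_key, change_date),
    )
    return cursor.lastrowid  # the newly generated surrogate key


def load_sale(conn, customer_key, sales_amount):
    # Same idea as the inner query in the post: pick the customer's current row.
    conn.execute(
        "INSERT INTO FactInternetSales (SalesAmount, CustomerSK) VALUES (?, "
        "(SELECT CustomerSK FROM DimCustomer "
        " WHERE CustomerKey = ? AND EndDate IS NULL))",
        (sales_amount, customer_key),
    )


def sales_by_region(conn):
    return conn.execute(
        "SELECT dc.GeographyKey, SUM(f.SalesAmount) "
        "FROM FactInternetSales AS f "
        "JOIN DimCustomer AS dc ON dc.CustomerSK = f.CustomerSK "
        "GROUP BY dc.GeographyKey ORDER BY dc.GeographyKey"
    ).fetchall()


def demo():
    conn = sqlite3.connect(":memory:")
    create_schema(conn)
    apply_customer_region(conn, 11000, 1, "2011-01-01")  # customer starts in region 1
    load_sale(conn, 11000, 100.0)
    apply_customer_region(conn, 11000, 4, "2011-06-01")  # customer moves to region 4
    load_sale(conn, 11000, 250.0)
    return sales_by_region(conn)


if __name__ == "__main__":
    print(demo())
```

In this sketch the sale made before the move stays attached to region 1 and the later sale lands in region 4, because each fact row points at the surrogate key that was current when it was loaded. Note that the lookup always picks the current row, so it only suits loads that run before the next move is applied; late-arriving facts would need a lookup on the StartDate and EndDate range instead.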

Copyright © 2011 Raybet2