MSBI : SQL Server Integration Service (SSIS) - Slowly Changing Dimension Transformation

As a Data warehouse Administrator or ETL developer You will often come a cross to manage the Slowly Changing Dimensions (SCD).

In this postI am going to provide you the steps and guidance needed to manage Slowly Changing Dimension with Slowly Changing Dimension Transformation in data flow task with an example.
Slowly Changing Dimensions (SCD)
Dimension is a term in data management and data warehousing. It's the logical groupings of data such as Designation, geographical location, customer or product information. With Slowly Changing Dimensions (SCDs), data changes slowly rather than changing on a time-based, regular schedule.
There are different type of SCD Types. Few are
SCD Type 1
SCD Type 2
SCD Type 3
SCD Type 1 :- Change the data in the dimension table.
-Easy to implement
-but the history is lost
e.g. Employee Designation Changes

In Type 1 Slowly Changing Dimension, the new information simply overwrites the original information. In other words, no history is kept.
Consider the following Customer Dimension table:



After Christina moved from Illinois to California, the new information replaces the new record, and we have the following table:





SCD Type 2 - Add new records/ rows to the dimension table that contain the new data.

-leaves history as it is.
-but this causes the table to grow fast

In Type 2 Slowly Changing Dimension, a new record is added to the table to represent the new information. Therefore, both the original and the new record will be present. The new record gets its own primary key.

Consider the following Customer Dimension table:





After Christina moved from Illinois to California on 15-JAN-2003, we add the new information as a new row into the table:





SCD Type 3 Add new fields/ columns to the dimension

-helps keep part of the history
-the size of the table is not affected
-used when changes occur rarely

In Type 3 Slowly Changing Dimension, there will be two columns to indicate the particular attribute of interest, one indicating the original value, and one indicating the current value.

Consider the following Customer dimension:




After Christina moved from Illinois to California on 15th, the original information gets updated.




If Christina moved back to Illinois from California on 25th:





Using Slowly Changing Dimension Transformation

Let’s first create a Employee table, which will have source data and add some data to it; as you can notice below, I have added the EmployeeCode column, which is a primary key at source and will work as a business key.

CREATE TABLE [dbo].[EMPLOYEE_SRC](
       [EmployeeCode] CHAR(8) PRIMARY KEY,
       [Designation] [varchar](50) NULL,
       [Location] [varchar](50) NULL,
)

Insert below two rows to the table.

INSERT INTO [dbo].[EMPLOYEE_SRC] ([EmployeeCode], [Designation], [Location])
VALUES
('E0000001', 'Engineer', 'USA'),
('E0000002', 'Engineer', 'USA')


Now creates the Dimension table to capture the historical data.

CREATE TABLE [dbo].[DimEmployee](
       [EmployeeId] [int] IDENTITY(1,1) NOT NULL,
       [EmployeeCode] CHAR(8),
       [Designation] [varchar](50) NULL,
       [Location] [varchar](50) NULL,
       [EffectiveDate] [date] NULL,
       [ExpirationDate] [date] NULL,
       [CurrentFlag] [char](1) NULL,
       CONSTRAINT [PK_DimEmployee] PRIMARY KEY CLUSTERED ([EmployeeId] ASC)
)

Then you can create the SSIS project using BIDS. Then Drag and drop the Data Flow task to Control Flow tab. Then go to the Data Flow task tab and Create OLE DB Source  and connect with [EMPLOYEE_SRC] table. Refer the below screenshot.














Then Drag and Drop the Slowly Changing Dimension component to Data Flow Tab. Then Connect with OLE DB Source.

















Double click On the Slowly Changing Dimension component and click on the Next Button.


















Click Next.Set EmployeeCode as Business key.


















Give the Slowly Dimension Columns and Click Next as Below screen.


















Set the Fixed and Changing Attribute. Click Next.


















In the Historical Attribute Option Select the 'Use start and end dates to identify current and expired records'. Give relevant fields as below. Click Next.


















Unchecked the Enable inferred member support check box.Click Next


















Click the Finish button on the Slowly Changing Dimension Wizard.


















Here is what you will see in the data flow task:












Then Execute the Package and Inserted two rows will be available in the Dimension Table.

Only Two rows available in the Database.












Database Output.














Then Execute the Below Update scrip to updated the Employee (E0000001) in the Above table.

UPDATE [dbo].[EMPLOYEE_SRC]
SET [Designation] = 'Senior Engineer'
WHERE [EmployeeCode] = 'E0000001' 

Again Execute the Package and output will be as below.only one row will be transfor from Employee table to Employee Dimension.











Database Output.













So it has captured history data and present data also.

But, this mechanism has some limitation. Main limitation is 
It can be very slow especially for larger dimensions as there is no caching of the lookup data.

In my Next Post i'll Explain how to over come this limitation.
MSBI : SQL Server Integration Service (SSIS) - Slowly Changing Dimension Transformation MSBI : SQL Server Integration Service (SSIS) - Slowly Changing Dimension Transformation Reviewed by Pubudu Dewagama on 11:26:00 PM Rating: 5

2 comments:


  1. Through this post, I know that your good knowledge in playing with all the pieces was very helpful. I notify that this is the first place where I find issues I've been searching for. You have a clever yet attractive way of writing on Msbi online training
    Msbi online training Hyderabad
    Msbi online training India
    Msbi online course
    Msbi certification training

    ReplyDelete

  2. Interesting blog which attracted me more.Spend a worthful time.keep updating more.
    This is important thing because when we are running a business, first we have to make our focusing area and what we are ready for serve among the people.
    Thank you and i am looking forward your more information here
    Visit:Best power bi training in Hyderabad



    ReplyDelete