Business Intelligence


Star Schema vs Snowflake Schema vs 

Fact Constellation Schema


Star Schema


The Star Schema is the simplest data warehouse scheme. When we data warehousing with the star schema, database diagram resembles a star design. That is why this schema called as star schema. One fact table surrounded by dimension tables and if all the dimension tables are parent table then it's star schema data warehouse model. In this schema, a dimension cannot be a child table to a parent table. That means, dimension tables are in denormalized form and all the attributes which are defining the dimension will be in that particular dimension table. There is only one join statement between fact table and the dimension table. The data redundancy is high in this schema. So we need more storage capacity but simple queries are enough to retrieve data and we can get good query performance through this data warehouse model.

Ex: Star Schema



Snowflake Schema


Snowflake schema is a more complex data warehouse model than a star schema. In this schema, there should be at least one dimension table will have hierarchy pattern. That means dimension tables have child tables. Dimension tables are in normalized form in this schema. It reduces data redundancy and need of storage capacity. But query performance is less than star schema and need complex queries to retrieve data.

Ex: Snowflake Schema




Fact Constellation Schema


The Fact Constellation schema may required for sophisticated applications like multiple fact tables share dimension tables. This schema is similar to a collection of stars. Fact constellation schema also called as galaxy schema.

Ex: Fact Constellation Schema


Summary: 

Star schema is the simplest schema and fact constellation schema is the complicated schema. Star schema is most efficient in query performance but needs more storage capacity. Snowflake schema is poor in query performance but less in data redundancy. Every schema has good and bad but the choice will be made on business domain and the requirements.



Published: 09.07.2017




  

Different Ways Of Generating Data For
Date Dimension Table


Time is the most precious thing in the world. In business domain, time is money and everything. When we are doing data warehouse, we need date dimension or using it to track business trends, analyze data and figure out changes in the business pattern over the time.  There are many ways and methods for generating data to populate date dimension. Generate date with SQL Statement and generate date in Excel and migrate it to database table are two methods from that many.

GENERATE DATE WITH SQL STATEMENT

We can get the data what we need for populating date dimension through a SQL Statement as follow:

01. Create a table with CREATE statement.

            CREATE TABLE Table_Name (
                       [DateID] int NOT NULL IDENTITY(1, 1),
                       [Date] datetime NOT NULL,
                       [Year] int NOT NULL,
                       [Month] int NOT NULL,
                       [Day] int NOT NULL,
                       [Quarter] int NOT NULL,
                       CONSTRAINT Constraint_Name PRIMARY KEY CLUSTERED (DateID)
            )

02. Generate data and insert it into the table

            -- declare variables to define range of the data
            DECLARE @StartPoint datetime
            DECLARE @EndPoint datetime
         
            /*
             *assign values of the first and the last day of the range
             *into the StartPoint and the EndPoint respectively
           */
           SET @StartPoint = '01/01/2017'
           SET @EndPoint = '12/31/2017'

           /*
            *using a while loop increment from the first date
            *to the last date
           */
           DECLARE @LoopRunner datetime
           SET @LoopRunner = @StartPoint

           WHILE @LoopRunner <= @EndPoint
                   BEGIN
                        -- add each record one by one into the table for this date
                        INSERT INTO Table_Name VALUES (
                                 @LoopRunner,
                                 Year(@LoopRunner),
                                 Month(@LoopRunner),
                                 Day(@LoopRunner),
                                 CASE 
                                             WHEN Month(@LoopRunner) IN (1, 2, 3) THEN 1
                                             WHEN Month(@LoopRunner) IN (4, 5, 6) THEN 2
                                             WHEN Month(@LoopRunner) IN (7, 8, 9) THEN 3
                                             WHEN Month(@LoopRunner) IN (10, 11, 12) THEN 4
                                 END
                        )

                      /*
                       *increment the LoopRunner by 1 to get the next date
                       *then start the loop again
                      */
                      SET @LoopRunner = DateAdd(d, 1, @LoopRunner)
                 END


GENERATE DATE THROUGH EXCEL AND TRANSFER IT INTO DATABASE TABLE


We create a table in excel with the records of the date dimension table and then create a table in database using a create statement. After that import the excel records into the table through import wizard.

Summary:

There are many ways out there to generate data for populate the date dimension. Above we have discussed two ways, one is using SQL Statement and another one is using excel to generate data and migrate it into the database table through import wizard. SQL Statement is hard for the people who are not in technology domain and Excel method will be easy for them. But in SQL Statement method we can do everything straight away in the database. So there will not be any compatible error or other exceptions.  




Published: 17.07.2017




No comments:

Post a Comment