Monday 12 January 2009

Star Vs Snowflake Schema

In continuation to my last post we will continue with dimensional modeling in detail

In dimension modeling there are mainly two types of schemas

1.Star Schema
2.Snowflake Schema

Star Schema :
Star schema is simplest data warehouse schema .It is called star schema because ER diagram of this schema looks like star with points originating from center. Center of star schema consists of large fact table and points of star are dimensional table.

Star schema is identified by one or more large fact table at center that contain primary information in data warehouse and lot of small dimensional tables each of which contain information about particular attribute of fact tables.

Advantage of Star Schema :

1.Provide a direct mapping between the business entities and the schema design.
2.Provide highly optimized performance for star queries.
3.It is widely supported by a lot of business intelligence tools.

Disadvantage of Star Schema:
There are some requirement which can not be meet by star schema like relationship between customer and bank account can not represented purely as star schema as relationship between them is many to many.


Snow Flake Schema:
Snowflake is bit more complex than star schema. It is called snow flake schema because diagram of snowflake schema resembles snowflake.

In snowflake schema tables are normalized to remove redundancy. In snowflake dimension tables are broken into multiple dimension tables, for example product table is broken into tables product and sub product.
Snowflake schema is designed for flexible querying across more complex dimensions and relationship. It is suitable for many to many and one to many relationship between dimension levels.

Advantage of Snowflake Schema:
1.It provides greater flexibility in interrelationship between dimension levels and components.
2.No redundancy so it is easier to maintain.

Disadvantage of Snowflake Schema :

1.There are More complex queries and hence difficult to understand
2.More tables more joins so more query execution time.





Below table summarizes all differences


 

Star Schema

Snowflake Schema

Ease of maintenance / change

Has redundant data and hence difficult to maintain/change

No redundancy, so snowflake schemas are easier to maintain and change.

Ease of Use

Less query complexity and easy to understand

More complex queries and hence less easy to understand

Query Performance

Less number of foreign keys and hence shorter query execution time (faster)

More foreign keys and hence longer query execution time (slower)

Type of Datawarehouse

Good for DataMart’s with simple relationships (1:1 or 1:many)

Good to use for datawarehouse with complex relationships (many:many)

Joins

Less Joins

Higher number of Joins

Dimension table

A star schema contains only single dimension table for each dimension.

A snowflake schema may have more than one dimension table for each dimension.

When to use

When dimension table contains less number of rows, we can choose Star schema.

When dimension table is relatively big in size, snow flaking is better as it reduces space.

Normalization/ De-Normalization

Both Dimension and Fact Tables are in De-Normalized form

Dimension Tables are in Normalized form but Fact Table is in De-Normalized form

Data model

Top down approach

Bottom up approach



My Profile

No comments:

Post a Comment