Oracle Openworld Directory Index
Optimizing Dimensional Data Warehouse Design
Michael D. Schmitz
High Performance Data Warehousing
www.databaseperformance.com
2910 NW Meldrum Ct.
Bend, Oregon 97701
(541) 322-0566
mike.schmitz@databaseperformance.com
Session Abstract
Session Agenda
Recommended Reading
Professional Background
The Data Warehouse Design Challenge
While meeting
In Other Words Data Warehouse Schemas Must
Dimensional Data Warehouse Definition
Dimensional Design
Sample Dimensional Schema
Large Dimension Tables
Do most of the business questions have to go through a large dimension table as in schemas with large customer or patient dimensions?
Analytical Queries through a Large Dimension Table
Build Analytical Dimensions
Analytical Dimension Benefits
Data Locality by new dimensions
New dimensions will be smaller making them easier to index
Analytical queries won’t have to use the customer table unless they need to bring back individual customer detail
Automatic history retention
Too Many Analytical Dimensions
Consolidated Dimensions
Handling Dimension Table Attribute History
Six Considerations
Six Methods
Overwrite
New Row
New Column
New Dimension
History Fact Table
Identity Resolution Outrigger
Many Combinations
Dimension Attributes
Change
Slowly
Quickly
In mass
Lack of history can cause
Incorrect analyses
Erroneous decisions
Tracking history should not
Affect analytical performance
Formal Methodology
Business Requirements
Documented
Business Sign off
Data Authority/Steward
Six Basic Considerations
The reason the change occurred, is it a correction to a wrong value?
Dimension table size
The number of changes that need tracking
The frequency of the changes
Is the attribute analytical or detail?
Is the requirement analytical or audit based?
Assess Requirements and Change Profile for Each Dimension Attribute
Dimension Attribute History Tracking Methods
Type 1 - overwrite value
Type 2 - create new row
Type 3 - add new column
Type 4 - use analytical dimension
Type 5 - use dimension history table
Type 6 - identity resolution outrigger
No History Required
Type 1 - Overwrite old value with new value
Always used for corrections
Type 1 - Example
Update customerset street address to ‘2323 Jennings St.’
Update customerset phone number to ‘258-1913’
Type 1
If attribute history is not of analytical or historical importance use method 1
If the update is a correction use method 1
All changes need tracking, but the changes are infrequent
Reasonable table size
Use Type 2
Type 2 Example
Sales History is tracked by package design class but a new SKU is not assigned when the package design class changes
Type 2 - Create Another Dimension Row with New Warehouse Key
Type 2
All changes need tracked
The dimension is not too large
The changes are not too frequent
All changes need tracking, and the changes are frequent
Large table
Use type 4
Type 4 Example
Customer dimension
Household and individual demographics
Buying pattern - scoring
Customer Dimension
Type 4 - Create Analytical Dimensions
Automatic History
The fact table has keys to the customer attributes which were in effect when the fact table rows was produced
Type 4
All history needs tracked
The table is large
The changes are frequent
Only a limited number of changes need tracking
Current versus Last
Current and Planned
Use Type 3
Add Attribute History Column
Sample Queries
Show me last month sales totals by new regions
Select sales_region, sum(sales_amount)from sales_fact where month = ‘LAST’
Show me last month sales totals by the old regions
Select previous_sales_region, sum(sales_amount)from sales_fact where month = ‘LAST’
Type 3
Only a limited number of changes need tracked
Inelegant
But very effective and high performing
Non analytical attribute history tracking
Need street address history
Use Type 5
Type 5 - Dimension History Table
Don’t use Type 4 AloneType 4 and 5 Combined
Type 2, 4 and 5 Combined
Type 5
Traps detail attribute history
Should only be used to analyze dimension only history
Should not be used to query transaction or event history
Use other methods for that
Shouldn’t be used alone
An Identifying Attribute Changes
Customer Name changes from Global Enterprises to World Wide Enterprises
You want to provide name transparency
If someone queries by the old name you want to see all history, not just the history associated with the old name
Use Type 6
Type 6 Example
Type 6 Sample Data
Type 6
Used to provide identifier transparency when an identifying attribute changes
Summary
Type 0 - don’t update
Type 1 - overwrite value
Type 2 - create new instance
Type 3 - add new column
Type 4 - use analytical dimension
Type 5 - use dimension history table
Type 6 – use identity resolution outrigger table
Q&A