Oracle Openworld Directory Index
This session will focus on
CASE STUDY:Multi-Dimensional Data Warehouse Automation Integrity Framework
Work with Oracle Objects to construct a simple case to automate multi-dimensional data warehouse
Implement metadata to manage independent/external validation & fact/dimension relationship
Implement dynamic/centralized dimension tables
Implement views to automatically de-centralize dimension across (any) defined star-schema model
WHAT FOR ?
Reach higher levels of Data Mart Integrity:
“ Protect consistency on data and ensure consolidation on Star-model dimensions and variables during any cross-tab aggregations and/or joins operations between fact and dimension tables”
Agenda
Scenario for Data Warehouse
DW process model: “main points of control”
Business Information maturity model / Impact analysis
Scenario for case study & star-schema population
Case Study
Dimensional integrity: “just in time SM approach”
Implementation: dynamic PL/SQL, Objects & triggers
Benefits, challenges and future enhancements
Questions and answers
Turning data into knowledge
Tips for DW model construction
Do not loose any data going to business tables
Do not loose any fact/dim relationship on star-schema
Business tables as “intermediate tables” do exists after ETL processing and before star-schema population
Do not set referential constraints on business tables. Leave data flows to pseudo-fact tables and manage integrity later
Manage custom applications (PL/SQL) and use Oracle events & triggers to produce “fact-first” approach on star model construction
“Design to get information from business tables and knowledge from star-schema model”
Tips for ETL process: Oracle Warehouse Builder
Use Oracle E-business Intelligence tools to ensure proper data transportation & loading (ETL)
Oracle Warehouse Builder gets more efficient and reliable than any other custom process
OWB supports metadata to maintain the whole ETL process on database with high performance levels
Data transportation to business tables produce the first integrity level on data warehouse. OWB needs to be present to ensure proper combination of process over host data files and database tables
OWB support many data file format, including text files, flat files either on ASCII, EBCDIC etc
Business information maturity model: impact analysis
Data Warehouse Start-up
Born after “N” rounds in the business information maturity model / Legacy system construction
OLTP systems as a DW data source
Business rules are typically set on system by program logic, not by database/data file structure
Many data sources from different business sites
Data comes to business tables at different times.
Legacy systems: Not Relational model implementation
Very different system purposes
Impact scenario diagram
Why/How DW gets impacted ?
OLTP are “discrete-data event” oriented while DW are “continuous-data information model” oriented
We can’t set up full referential integrity on SM because if one ref fails, entire row is rejected thus affecting entire model
Custom applications gets complex because lack of constraints and additional consistency controls
Independent dimension table management may cause undesired query results
Legacy applications reflects the “last” status of the information at any point in time. Some states are lost.
So, the Big question is:
Must we leave data goes to business/fact tables anyway ?
The answer is: yes “Assuming sale_channel dimension is OK: ”
But we’ve still got a problem !
Tips for star-schema population
Multi-dimensional integrity
“Just in time” approach for
star-schema management
Context diagram
Model features
Star model: Fact table to be populated via custom process
Trigger “before INSERT/UPDATE” on FACT table
Fires an object method to populate a dimension table based on “:new.values” dynamic array initialization
Oracle integrity-object: data structures & methods
Metadata table entry for each FACT table column to be included on the integrity model
1 .. N target dimension tables that support centralized dimension values for any star-schema
Dynamic process to create dimension distribution trough oracle database views
Methodological approach
Dimension values will flow from Fact to Dimensions
Use of metadata allows system-centric based maintenance
Encapsulation of structures and program logic make design of application that populate fact tables easier
Use Oracle Objects, PL/SQL and and database events to automate dimensional integration across the model
Design custom applications to populate Fact tables (first) and let the integrity model do the rest
Use Oracle E-business intelligence to standardize internal process & model MIS construction (Discoverer)
Metadata_attribute (table) input data to integrity model
COLUMN DATA TYPE
----------------------------------------- -------- ----------------------------
DIMENSION_NAME NOT NULL VARCHAR2(25)
VALIDATION_TYPE NOT NULL VARCHAR2(20)
TARGET_DIM_SCHEMA NOT NULL VARCHAR2(30)
TARGET_DIM _TABLE NOT NULL VARCHAR2(30)
TARGET_DIM _COLUMN NOT NULL VARCHAR2(30)
DIMENSION_COLUMN NOT NULL VARCHAR2(30)
DEFAULT_COLUMNS VARCHAR2(300)
DEFAULT_VALUES VARCHAR2(100)
DEFAULT_NULL VARCHAR2(20)
ERROR_TABLE VARCHAR2(50)
SOURCE_SCHEMA NOT NULL VARCHAR2(30)
SOURCE_TABLE NOT NULL VARCHAR2(30)
SOURCE_COLUMN NOT NULL VARCHAR2(30)
PROCEDURE_NAME VARCHAR2(100)
“Dimension_integrity” Oracle object type implementation
Conceptualizing data warehouse
Integrity object structure design
PL/SQL object types components
create or replace type OWN_TBL as object (owner_def varchar2(30),
table_def varchar2(30));
create or replace type VALIDATION_LIST as table of varchar2(30);
create or replace type ATTRIBUTES as object (
dimension_name varchar2(25),
validation_type varchar2(20),
target_dim_sch varchar2(30),
target_dim_tab varchar2(30),
target_dim_col varchar2(30),
dimension_col varchar2(30),
default_columns varchar2(300), continue
PL/SQL object type (cont)
. . .
default_values varchar2(100),
default_null varchar2(20),
error_table varchar2(50),
source_sch varchar2(30),
source_tab varchar2(30),
source_col varchar2(30),
procedure_name varchar2(100));
create or replace ATTRIBUTE_LIST as table of ATTRIBUTES;
continue
Dimension_integrity Object final definition
Create or replace type DIMENSION_INTEGRITY
as object (
object_name OWN_TBL, -- user initialization
attr_list ATTRIBUTE_LIST, -- SELF initialization
value_list VALIDATION_LIST, -- user initialization
member procedure read_dimension_attributes,
member procedure automate_dimension,
member procedure error_management( idx in binary_integer,
msg in vachar2),
member function match_array_data return boolean
);
Object setup rules
If metadata table is populated from dba_trigger_cols dynamic view, then “where column_usage != ‘NONE’ “ must be present in the process to capture only triggered values
Elements of both attribute_list & value list (:new values in trigger body) must respect the same order of array index
Owner and table must be initialized once the object is instantiated or before the call to automate_dimension
value_list array (values to be inserted on fact table) must be dynamically initialized during the iteration process of call to automate_dimension procedure
Object type body members pseudo-code example
Create or replace type body DIMENSION_INTEGRTY as
member procedure read_dimension_attributes is
begin
bulk select attributes(attribute_table_columns)
into SELF.attr_list from metadata_attributes_table;
end;
member function match_data return boolean is
begin
if value_list.COUNT != attr_list.COUNT then raise ...
Return true/false
end;
continue
Object type body (Cont) members pseudo-code example
member procedure error_management (idx in binary_integer,
msg in varchar2) is
stmt varchar2(400);
begin
err_columns := ‘implement error columns here’;
err_values := msg||','||''''||SELF.value_list(idx)||'''';
stmt := ‘insert into || attr_list(idx).error_table ||
‘(‘ || err_columns || ‘ ) values ( ‘ || err_values || ‘)’;
execute immediate stmt;
end;
continue
Automation procedure (main)
member procedure automate_dimension is
stmt varchar2(400); -- and other declaratives
begin
if ( SELF.object_name is null) then
raise_application_error(-nnnn,’no fact table to manage’);
end if;
read_dimension_attributes -- fills attribute_list reading metadata
if (not match_array_data ) then -- check user values & attributes array index
raise_application_error(-nnnn,’arrays-mismatch’);
end if;
continue
Procedure (cont)
-- for all triggered values on fact
For index in SELF.value_list.FIRST .. Value_list.LAST LOOP
dim_columns := attr_list(index).target_dim_column;
dim_values := value_list(index);
stmt := ‘insert into || attr_list(index).target_dim_schema ||
‘.’ || attr_list(index).target_dim_table || ‘(‘ ||
dim_columns || ‘ ) values ( ‘ || dim_values || ‘)’;
begin execute immediate stmt;
exception
when dup_val_on_index then
null; -- value already exists for that dimension
when others then error_management(index,sqlcode||sqlerrm);end;
end loop; end; END; -- type body
Execution model using trigger
Before insert / update on sales
Pragma: commit is not allowed inside a trigger body
Oracle object instantiation and initialization
initialization of incoming values fired by the trigger to match with attribute list on metadata
Call to object method to automate dimension values based on metadata definition for each column dimensions
Trigger body PL/SQL block
declare
pragma autonomous_transaction;
obj dimension_integrity :=
dimension_integrity(own_tbl
(‘STAR_MODEL’,’SALES’),
null, validation_list (
:new.branch_id,
:new.product_id,
:new.sc_id ));
begin
obj.automate_dimension;
end;
Dimension table output example
Dimension-views distribution Dynamic PL/SQL block example
DECLARE
stmt varchar2(400);
BEGIN
FOR views in ( select distinct dim_name from metadata_attribute ) LOOP
stmt :=‘create view ‘|| views.source_schema||’.’||views.dim_name ||
‘ as select dim_value CODE , dim_description DESCRIPTION
from dimension
where dim_name = :1‘;
execute immediate stmt using views.dim_name;
END LOOP;
END;
Star-schema gets complete & up to date forever
Benefits, challenges & future enhancements
Benefits
Challenges & future enhancements
Join attribute table with dba_trigger_cols to read_dimension_attributes impacts performance.
Avoid reading attributes from every loop iteration. This can be done if process is implemented from external application but modularity can be lost
Implement call to database store procedures using metadata “procedure_name”, present in metadata attributes but not implemented in this presentation
Implement Oracle events to control changes on triggered columns on the fly
Summary
Custom process to populate Fact tables
Metadata table describe central dimension tables and error management across the model
Dimension tables support dimension values and descriptions on the fly because of dynamic views implementation to distribute real dimensions
Oracle core object, PL/SQL and events maintain the model continuously up to date
Integration on star-schema are always in place and can be managed externally without re-coding. Oracle Discoverer models allows to maintain MIS cubes to follow up
Thank you