Oracle Openworld Directory Index
Overview
The NAWQA Program
Business Problem
Solution
Oracle Discoverer
Other Applications
Advanced 9i Features Used
USGS NAWQA Program
National Water Quality Assessment
Started in 1991 to collect and interpret water quality data on the water resources of the Nation – stemming from Clean Water Act
A nationally consistent approach to assess conditions of water quality, causes of those conditions and trends so that water managers have a sound scientific basis on which to act
USGS NAWQA Program (cont’d)
NAWQA Data Warehouse
http://water.usgs.gov/NAWQA/data
Developed in 1999 to give USGS researchers a single integrated system of multiple legacy datasets on which to perform ad hoc queries
5M+ water and biological results
$70M to collect + lab analysis
Largest consistently collected set of water-quality results in the Nation
Perhaps, World
NAWQA Data
Environmental systems have many dimensions and pathways determining status
Data consists of water quality, groundwater levels, river flow, and biological community data with thousands of GIS derived land use attributes
Business Problem
Integrate legacy data sources
Make ad hoc querying available with out-of-the-box solution
Deliver Data to the Public
Design Philosophy
Minimize custom coding
Minimize maintenance costs
Public Delivery
Minimize burden upon NAWQA scientists in providing linked datasets to public
Make data self-serviceable on the Web
Provide parameterized query forms with minimal custom coding
Legacy Data Sources
Corporate database (NWIS) on Computer Associates Ingres
Biological database on Microsoft SQL Server
GIS data in text files
Enhanced Capabilities (Wish List)
Provide spatial interface with linkage to dataset
Data Mining
Analytical Processing
Graphing capabilities
Solution
Data Warehouse
Oracle Consulting for initial development
Iterative design
Prototyping
Technology Stack to minimize custom coding
Technology Stack
Oracle 9i RDBMS (9.0.1.1.1)
Informatica Power Mart (5.1)
Oracle 9iAS Discoverer (4.1.40.02.00)
MapInfo MapXtreme Java (4.0)
Oracle 9iAS Portal (3.0.9)
Data Loading – Informatica Power Mart
Access data in various formats on multiple platforms
Repository to store plans and metadata on Oracle RDBMS
GUI interface for rapid development
Server based to reduce processing burden from developer workstations
Oracle Discoverer
Ad Hoc Query Software – Oracle Discoverer
Robust, easy to use front end
Capable of complex data retrievals and analysis
Three-tier Java Applet deployment
Reduce burden of software distribution and installation
Deploy on multiple platforms
Scalability
End user layer shared by all application versions, Viewer, Plus and Desktop
Provides internal and public user access through Java Applet and thin HTML client
Oracle Discoverer (continued)
Presentation and security customization for multiple audiences through multiple end user layers
Custom items in end user layers offer powerful features to end users
Aggregation items - user defines aggregation level and custom item “rolls up”
Hyperlink items - links to other web applications customized based on NAWQA values passed in URL
Meets Core Requirements
Ad hoc query
Export to various formats
Very little custom-coding
Dynamic reporting publishable to web
Value Added Features
Rollups
Drilling
Metadata support
Sharing with other Discoverer users or the Public
Aggregation Example
Oracle Discoverer Hierarchies
Hierarchies defined to accommodate data drilling
Based on common business logic pathways
Allows both Plus and Viewer users to drill on aggregates of interest
Data Drilling Example
Oracle Discoverer Metadata Support
Gateway to Oracle Designer
Easy integration of metadata stored in Oracle Repository with the End User Layer
Provides Folder (Table) and Item (Column) level metadata to assist users in choosing from hundreds of fields available in star schema
Oracle Discoverer Query Sharing
Ability to share with colleagues
Parameterized queries for non-Discoverer trained colleagues, accessible with thin HTML client
Oracle Discoverer for Public Presentation of NAWQA Data
Discoverer workbooks developed in Plus immediately available in Viewer
Full featured but thin HTML interface
Parameterized Reports give public users filter capabilities
Provisional data hidden with mandatory filters in public end user layer
Discoverer Viewer Example
Discoverer Viewer Example (2)
Oracle 9i Advanced Features
Automatic Program Global Area Memory Management
Many NAWQA queries entail extensive sorting
In 8i running dedicated server mode, memory-limited by sort_area_size init.ora parameter – swap with temporary tablespace
In 9i, PGA memory management has reduced disk-sorts to near zero
Marked improvement in query performance
See Paper #822, Performance and Scalability in DSS for more details
Oracle 9i Advanced Features (cont’d)
Materialized Views
Discoverer Automated Summary Management
Query re-write for improved performance
Custom materialized views for common reports
Oracle Spatial
MapInfo MapXtreme Java Mapping interface – query within radius; dynamic layer presentation
Spatial Interface
Oracle 9i Advanced Features (cont’d)
Bitmap Join Indexes
Pre-joins Dimension tables with Fact Table
Dramatically improves multi-table query time response
Oracle 9i Advanced Features (cont’d)
Bench Test Results
Challenges
Management Sponsorship
Training
Corporate Network
Delivering Large Retrievals to Public
Conclusion
New paradigm in water-quality analysis
On Budget and On Time
Visit us at:
http://water.usgs.gov/NAWQA/data
We want to know what you think!
Title: Mining USGS Water-Quality Data
Author: Nate Booth
Visit us at:
http://water.usgs.gov/NAWQA/data