Oracle Openworld Directory Index




Advanced SQL Tips for Developers and DBAs

Session 809
Tim Quinlan
TLQ Consulting Inc.

Introduction
We will cover advanced SQL concepts, real-world problems and solutions and the performance impact of advanced SQL coding decisions. Ansi-compliant & proprietary Oracle functions and features as well as some creative solutions to SQL problems will be discussed. This will be a fast-paced look at a fun-topic. Some of the things we will look at are:
Dealing with sub-queries
Limiting rows and Displaying Data
Comparing Join Techniques
Rankings, Medians, Intervals and Best-Fit
View Tricks
and more…

Dealing With Subqueries
Correlated Subquery: What is it?
A subquery is Correlated when it is joined to the outer query within the Subquery. E.g.
Select * From Cust Where cust.city = ‘Chicago’ and
Exists (Select cust_id From Sales s where s.ttl_sales >
10000 and sales.cust_id = Cust.cust_id);
the last line in the above query is a join of the outer Cust table and inner Sales tables. The outer query is read and for each Cust in Chicago, the outer row is joined to the Subquery. i.e., the inner query is executed once for every row read in the outer query.
efficient where a small number of rows are processed - not when a large number of rows are read.

Dealing with Subqueries
Non-correlated Subquery: What is it?
A subquery is said to be uncorrelated when the two tables are not joined together in the inner query. The inner (sub) query is processed 1st and the result set is joined to the outer query. E.g.  
Select last_name, first_name
From Customer Where customer_id IN
(Select customer_id From Sales where
sales.total_sales_amt > 10000);
Sales table is processed first and all entries with a total_sales_amt > 10000 will be joined to the Customer table.
Efficient where a large number of rows is being processed.

Turn Subqueries into Joins
When possible, use joins rather than subqueries
The query on the previous slide becomes:
Select cust.last_name, cust.first_name
From Customer cust, Sales
Where cust.customer_id = sales.customer_id
and sales.total_sales_amt > 10000;
Gives the optimizer more choices when deciding on query plan
optimizer can choose between nested loop, merge scan, hash and star joins when a Join is used.
The options are limited when the compiler and optimizer are presented with a Subquery.
In and Exists: The In Clause
Subqueries can be written with ‘IN’ and ‘EXISTS’.
In general, the IN clause is used in the case of a non-correlated subquery where the inner query is processed first and the temporary result set table that is created is joined to the outer table.
The optimizer is more likely to transform In into a join.
efficient for queries that process a large number of rows. An example of the use of the IN clause is shown below:
Select last_name, first_name
From Customer Where customer_id IN
(Select customer_id From Sales
where sales.total_sales_amt > 10000);

In and Exists: The Exists Clause
EXISTS is used as a correlated subquery where the outer query is processed 1st, and as each row from the outer query is retrieved, it is joined to the inner query.
The inner query is performed once for each result row in the outer query (as opposed to the ‘IN’ query shown above where the inner query is performed only one time). e.g.
Select last_name, first_name From Customer
Where EXISTS
(Select customer_id From Sales
Where customer.customer_id = sales.customer_id
and sales.total_sales_amt > 10000);
You must understand the number of rows to be processed

In vs. Exists
use a join where possible
use IN over EXISTS (non-correlated vs. correlated subquery).
IN is more likely to translate into a Join than EXISTS
IN executes subquery once. Exists executes 1/outer row
IN is like merge-scan. Exists is like nested-loop join.
Sometimes EXISTS outperforms IN, but in more cases IN will dramatically out-perform EXISTS. In general, IN is better than EXISTS.
EXISTS tries to satisfy the subquery as quickly as possible and returns ‘true’ if the subquery returns 1 or more rows – it should be indexed. Optimize execution of the subquery.

Not In vs. Not Exists
Subqueries may use NOT IN and NOT EXISTS
NOT EXISTS is sometimes more efficient since the database only needs to verify non-existence.
With NOT EXISTS, a value in the outer query that has a NULL value in the inner will be returned.
With NOT IN the entire result set must be materialized.
if the subquery returns NULLS, the results will not be returned.
Not In performs well as an anti-join with cost-based optimizer
Not In with /*+ HASH_AJ */ hint is very fast
Outer Joins are also very fast
Comparing In and Exists
Traces of queries that join 2 tables of 25000 and 20000 rows each using IN and EXISTS
Joins using indexes were similar (in=0.18 cpu vs. exists 0.35 cpu)
Results of IN without indexes are:
call count cpu elapsed disk query current rows
----- -------- ----- ---------- ------ ------- --------- ------
total 5 0.45 0.76 97 713 16 1
 Results of EXISTS without indexes are:
call count cpu elapsed disk query current rows
----- ------- ------- ---------- ------ ---------- --------- ------
total 4 270.01 277.04 0 5287981 115946 1

Comparing In and Exists
‘IN’ Access Plan
Rows Execution Plan
------- ---------------------------------------------------
0 select statement goal: choose
1 sort (aggregate)
20000 merge join
20001 sort (join)
20000 view of 'VW_NSO_1'
20000 sort (unique)
20000 table access goal: analyzed full of ‘cust_addr_test’
20000 sort(join)
25000 table access goal: analyzed (full) of 'CUST_TEST'

Comparing In and Exists
Exists access plan
Rows Execution Plan
------- -------------------------------------------------
0 select statement goal: choose
1 sort (aggregate)
20000 filter
25001 table access goal: analyzed full of ‘cust_test’
25000 table access goal: analyzed full of
‘cust_addr_test’  

Comparing Not In vs. Not Exists
Comparison with indexes: Using NOT IN
SELECT count(*) FROM cust_test ct WHERE ct.cust_no NOT IN (select cat.cust_no from cust_addr_test cat)

call count cpu elapsed disk query current rows
total 4 291.24 298.83 0 5287981 115946 1

ROWS EXECUTION PLAN
0 select statement goal: choose
1 sort (aggregate)
5000 filter
25001 table access goal: analyzed full of ‘cust_test’
25000 table access goal: analyzed full of ‘cust_addr_test’
Comparing Not In vs. Not Exists
Comparison with indexes: Using NOT EXISTS
SELECT count(*) FROM cust_test ct WHERE
NOT EXISTS (select 1 from cust_addr_test cat where cat.cust_no = ct.cust_no)
call count cpu elapsed disk query current rows
total 4 0.36 0.36 0 50359 5 1

ROWS EXECUTION PLAN
0 select statement goal: choose
1 sort (aggregate)
5000 filter
25001 table access goal: analyzed full of ‘cust_test’
index goal: analyzed (unique scan) of
‘cust_addr_test_a01’ (unique)


Comparing Not In vs. Not Exists
Comparison with indexes
/*+ HASH_AJ */ hint results the same as “Not In” = 290.92 cpu and elapsed time
select /*+ HASH_AJ */ count(*) TEST5_with_indexes from cust_test ct where ct.cust_no not in
(select cat.cust_no from cust_addr_test cat)
Outer join is fastest at 0.29 cpu and elapsed
select count(*) TEST6_with_indexes
from cust_test ct, cust_addr_test cat
where ct.cust_no = cat.cust_no (+) and cat.cust_no is null

Comparing Not In vs. Not Exists
Comparison without indexes
Not In = 295.22 cpu and 303.27 elapsed
Not Exists = 271.44 cpu and 278.45 elapsed
Not in and /* HASH_AJ */ hint = 295.83 cpu and 303.47elapsed
Outer Join = 0.22 cpu and 0.51 elapsed
NOTE: can use Minus to perform anti-join But:
Like a Union, number of columns and types must match. It is limiting
Limiting Rows with Rownum
For each row returned by a query, the ROWNUM pseudo-column returns a number indicating the order in which Oracle selects the row from a table or set of joined rows. This list however is not ordered. An example of this is:
select username,rownum from dba_users order by username; 
USERNAME ROWNUM
------------------------------------------ ---------------
OUTLN 3
SYS 1
SYSTEM 2

ORDER BY may not solve the problem since rownum is applied to the row before they are sorted.  
Limiting Rows with Rownum
can be corrected by retrieving the rows and sorting them in a subquery and then applying the rownum in the outer query. 
greater-than sign used with rownum and a positive integer will never return a row
SELECT username,rownum FROM
(SELECT username FROM dba_users
ORDER BY username)
WHERE ROWNUM < 4;


Limiting Rows with Rownum

Greater-than sign used with rownum and a positive integer will not return a row
To show the last 3 rows we therefore cannot use >, but must instead use < and must order the rows descending:
SELECT username,rownum FROM (SELECT username FROM dba_users ORDER BY username desc) WHERE ROWNUM < 4
USERNAME ROWNUM
------------------ ----------------
TESTUSER 1
SYSTEM 2
SYS 3  
Limiting Rows with Row_Number
Row_Number function not related to Rownum.
an analytic function that assigns a unique number in the sequence field defined by ORDER BY to each row in a partition. e.g.

SELECT sales_rep, territory, total_sales, row_number() OVER (PARTITION BY territory ORDER BY total_sales DESC) as row_number FROM sales;

Sales_Rep Territory Total_Sales Row_Number
Simpson 1 990 1
Lee 1 800 2
Blake 5 2850 1
Allen 5 1600 2
Correlated Updates
Be careful when updating 1 table from another.
TABLE1 TABLE2
T1_ID T1_C1 T1_C2 T2_ID T2_C1 T2_C2
1 A AAA 2 W WWW
2 B BBB 4 X XXX

UPDATE table1 a SET a.t1_c1 = ( b.t2_c1 from
table2 b WHERE b.t2_id = a.t1_id);

T1_ID T1_C1 T1_C2
1 AAA
2 W BBB
This is not what we wanted!
Correlated Updates
Add a 2nd correlation to make sure that only those rows we want updated actually are:
Update table1 a
Set a.t1_c1 = (select b.t2_c1 from table2 b
where b.t2_id = a.t1_id)
Where a.t1_id = (select t2_id from table2
where t2_id = a.t1_id);
T1_ID T1_C1 T1_C2
1 A AAA
2 W BBB
This is the result we wanted – always check before committing!

Comparing Joins: Nested Loop
Along with merge-scan, the most common type. e.g.
Select * From Table1 T1, Table2 T2 Where T1.Table1_Id = T2.Table1_id;
for each row in the outer table (Table1), the inner table (Table2) will be accessed with an index to retrieve the matching rows. The next row on Table1 is then retrieved and matched to Table2.
efficient index access is needed on the inner table
Commonly used in OLTP apps.
Useful for a small number of rows & first_rows parm.
Cluster Joins are a special case of Nested-Loop join
Have many drawbacks and are rarely used
Comparing Joins: Merge Scan
aka. sort-merge. Useful for:
processing a large number of rows.
inefficient index access and sorted data
Batch processing and all_rows goal
Fast because of:
database multi-block fetch (helped by init.ora parm db_file_multiblock_read_count) capabilities
The fact that each table is accessed once
Steps performed for Merge-Scan are:
Comparing Joins: Merge-Scan
1) Pick an inner and outer table
2) Access the inner table, choose the rows that match the predicates in the Where clause of the SQL statement
3) Sort the rows retrieved from the inner table by join columns, store these as a Temp table. This step is not performed if data is ordered by the keys and efficient index access exists.
4) outer table may also be sorted by the join columns so both tables to be joined are sorted the same way. This step is optional and dependent on whether the outer table is well ordered by the keys and whether efficient index access can be used.
5) Read outer & inner (likely sorted temp) tables, get rows that match the join criteria. This is quick due to sorted data.
6) Optionally sort the data if a Sort was performed (e.g. 'Order By') using different columns than used to perform the join.
Comparing Joins: Hash Join
very efficient join when used in the right situation: when 1 of the 2 tables is small.
The larger of the 2 tables is chosen as the Outer table
Outer and inner are broken into sections and the inner Tables join columns are stored in memory (if hash_area_size is large enough) and 'hashed'.
hashing provides an algorithmic pointer that makes data access very efficient.
Oracle attempts to keep the inner table in memory since it will be 'scanned' many times.
Outer rows that match the query predicates are then selected and for each Outer table row chosen, hashing is performed on the key and the hash value is used to quickly find the matching row in the Inner Table.
Comparing Joins: Hash-Join
This join can often outperform a Sort Merge join, particularly when 1 table is much larger than another.
No sorting is performed and index access can be avoided since the hash algorithm is used to locate the block where the inner row is stored.
Hash-joins are also only used for equi-joins.
Other important init.ora parms are: hash_join_enabled, sort_area_size and hash_multiblock_io_count.
Comparing Joins: Star-Joins
A join common to Data Marts and Data Warehouses.
a join of a large "Fact" table with 2 or more smaller tables commonly called "Dimensions". Fact tables have transactional properties. The Dimensional tables are used to describe the Fact table (customer, product).
Star queries get their name because there is a central Fact table surrounded by smaller dimensional tables that are directly related to the Fact table
Consider the case of the central Fact table that is being joined to 3 smaller Dimensional table. They are transformed from the writted query on the left below to the transformed one on the right.
Comparing Joins: Star Joins
ORIGINAL QUERY
SELECT *
FROM Fact, Dim1, Dim2,
Dim3
WHERE
Fact.dim1_id = Dim1.id and
Fact.dim2_id = Dim2.id and
Fact.dim3_id = Dim3.id and
Dim1.name like :in_var1 and
Dim2.desc between :in_var2
and :in_var3
and Dim3.Text < :in_var4;
Comparing Joins: Star-Joins
The subselects are performed first. Bitmap indexes on Fact join columns, are merged (in this case ANDed) & Fact rows can be accessed using the resulting index values. The Fact rows retrieved are then joined to the Dimensions to complete the query.
Using this approach, a Cartesian product is not required.  
To implement star_query transformation:
Set init.ora parm star_transformation_enabled
Create bitmap indexes for all of the foreign-key columns on the fact table
Implement R.I. Only between the fact table and the dimension tables.
Full Outer Joins – 9i
Oracle9i has “many” SQL99 features
Old Proprietary Outer join:
select t1.t1col01, t2.t2col01
from t1, t2
where t1col01 = t2col01 (+)
UNION
select t1.t1col01, t2.t2col01
from t1 , t2
where t1.t1col01 (+) = t2.t2col01;
Full Outer Joins – 9i
THE SAME RESULT AS

Select t1.t1col01, t2.t2col01
From t1 FULL OUTER JOIN t2
ON t1col01 = t2col01
order by t1.t1col01;

Ansi SQL99 compliant syntax for full, left, right outer joins.
Above only processes T1 and T2 one time.
Bitmap Join Index- 9i
1 index spans multiple tables
Consider 2 tables: Vendor ? Product
Create Bitmap Index Vendor_Product_BJI
On Product(Vendor.vendor_name)
From Vendor, Product
Where Vendor.vendor_id = Product.vendor_id;
SQL below will use the bitmap index
Select product.product_name
From Product, Vendor
Where Product.vendor_id = Vendor.vendor_id
And vendor.name = ‘ORACLE’;


Bitmap Join Index – 9i
Sample bitmap join index values
Prod Row Vendor=oracle vendor=other
1 1 0
2 0 1
3 1 0
Can perform access to fact table thru bitmap join index without accessing the dimension table
Can create for more than 2 tables; more than 1 column; as a snowflake schema.
Useful for star queries
Bitmap indexes store null values.
Bitmap Join Index Restrictions
Parallel DML only on the fact table
Only 1 table can be updated concurrently by different transactions.
A table can only appear 1 time in a join
Cannot create on IOT or Temporary table
Index columns must be on dimension tables
Join columns must have primary/unique constraints
All composite primary key columns must be included
Ranking: The Rank Function
This is an analytic function which allows us to compare a row to a window of rows. Consider a Sales table:
sales_rep territory total_sales
Jones 1 345
Smith 1 345
Lee 1 200
Simpson 1 990
SELECT sales_rep, territory, total_sales, RANK() OVER (PARTITION BY territory ORDER BY total_sales DESC) as rank FROM sales;
sales_rep territory total_sales rank
Simpson 1 990 1
Jones 1 345 2
Smith 1 345 2
Lee 1 200 4
Ranking and Aggregates
sales_rep terr prod total_sales
Jones 1 8 200
Smith 1 8 400
Lee 1 9 800
Simpson 1 9 990
Blake 5 9 1600
Allen 5 8 1500
Ward 5 9 1250

Rankings With Different Boundaries
Rank can be used for different groups. Here are 2 ranks: One for a products total sales in a territory and the second for a product (in a territory) across all territories.
Rankings
Rankings are extremely flexible and provide the following:
Ranking per-cube and rollup-group
Dense Rank vs. Rank
Handles ties by going to the next value
Cume-Dist Ranking
Dist computes a fraction of a value relative to its position in its partition. It returns the result as a decimal between 0 (not including 0) and 1.
Terr Prod Amt Cume_Dist
1 5 800 .666667
1 7 300 .333333
1 8 1300 1
Rankings
Percent Rank Function: like cume_dist but uses row counts as a numerator and returns values between 0 and 1.
Ntile Function: perform calculations and statistics for tertiles, quartiles, deciles and other summary stats:
SELECT sales_rep, total_sales, NTILE(4)
OVER (ORDER BY total_sales DESC NULLS FIRST)
AS quartile FROM sales;
Sales_Rep Total_Sales Quartile
Jones 2000 1
Smith 1000 2
Blake 700 3
Ward 400 4
First/Last Functions – 9i
Analytic, aggregate functions that operate on a set of values from a set of rows
When you need the lowest or highest value from a sorted set to compare to another value from a function such as min, max, sum,avg, count. Use the Last and First analytic functions
Find the max salary of employees with the highest bonus and the lowest salary of employees with the lowest bonus

First/Last Functions – 9i
SELECT deptid,
min(salary) keep (dense_rank FIRST order by bonus) "low",
max(salary) keep (dense_rank LAST order by bonus) "high",
FROM emp_salary group by deptid;

Group By vs. Order By
With Group By, grouping is done on the tables in the From clause (on the input)
ordering is done on the columns in the select list (on the output). This leads to the following:
‘order by’ columns need to be in the select list while ‘group by’ columns do not.
‘order by’ can use an integer and expression while ‘group by’ can not.
You may need to Group By 1 way and Order By another
Group By does not guarantee that rows will be returned in that order.
Group By and Order By
Example:
Select postal_code, customer_id, count(*)
From Sales
Group by postal_code, customer_id
Order by customer_id, postal_code;
The above query groups the data on Input to perform the count(*)
then orders the data differently on the output to display the data by customer_id and postal_code.
Having clause filters data after the Group By
Where clause filters data before the Group By

Group By with Rollup
Group by can perform a function on a grouping
Select region, territory, sum(sales_dollars) TOTAL_SALES
From sales group by region, territory;
Results in:
REGION TERRITORY TOTAL_SALES
EAST 1 1500.00
EAST 2 2000.00
WEST 1 3000.00
WEST 2 500.00
ROLLUP extends this and can also summarize at the Region level
Group By with Rollup
Select nvl(region,’Total Company’) REGION,
nvl(territory, ‘Total Region’) TERRITORY,
sum(sales_dollars) TOTAL_SALES
FROM sales GROUP BY ROLLUP(region, territory);
Results in (note substitution of literals from nvl):
REGION TERRITORY TOTAL_SALES
EAST 1 1500.00
EAST 2 2000.00
EAST Total Region 3500.00
WEST 1 3000.00
WEST 2 500.00
WEST Total Region 3500.00
Total Company 7000.00
Group By with Cube
Cube can give totals for each Territory regardless of Region.
Cube gives us totals for all combinations of Columns chosen in the Group By clause for OLAP Services
Select decode(grouping(region),1,’Total Company’) region,
decode(grouping(territory),1, ‘Total Region’) territory,
sum(sales_dollars) Total_Sales
FROM sales
GROUP BY CUBE (region, territory);
Decode is a translation that changes the grouping indicator of ‘1’ to another value of ‘Total Company’ or ‘Total Region’.
Rollup and Cube return a value of 1 if NULL results from CUBE or ROLLUP and returns 0 if it is a natural result.
Group By with Cube
Cube result From the previous query
REGION TERRITORY TOTAL_SALES
EAST 1 1500.00
EAST 2 2000.00
EAST Total Region 3500.00
WEST 1 3000.00
WEST 2 500.00
WEST Total Region 3500.00
Total Company 1 4500.00
Total Company 2 2500.00
Total Company Total Region 7000.00

Grouping Sets – 9i
Enhances groupings with Cube and Rollup
Can specify the exact level of aggregation.
Aggregations across 3 different groupings.
Cube needs many groupings
Union All uses 3 queries



Lead Analytic Function
Get the next value in a list without a self-join or sub-query
E.g. a table employee with columns ename and hiredate. Develop a query where each row has the employees name, their hiredate and the next employees hire date.
SELECT ename, hiredate, LEAD(hiredate, 1)
OVER (ORDER BY hiredate) AS next_hire_date
FROM employee;
ENAME HIREDATE NEXT_HIRE_DATE
COHEN 1991-APR-01 1991-OCT-31
KING 1991-OCT-31 1992-JAN-10
LEE 1992-JAN-10
Offset of 1 (default) tells the function to get the next row.
Lag Analytic Function
To get the date of the employee hired before the employee on a row, use the LAG analytic function:
SELECT ename, hiredate, LAG(hiredate, 1)
OVER (ORDER BY hiredate) AS prev_hire_date
FROM employee;
ENAME HIREDATE PREV_HIRE_DATE
COHEN 1991-APR-01
KING 1991-OCT-31 1991-APR-01
LEE 1992-JAN-10 1992-OCT-31
Great for determining effective and expiry dates on a row where only 1 date exists.
Conclusion
Become familiar with Oracle’s SQL functions.
Try out functions as well as your own solutions to problems. This can help you improve your SQL skills and can build a repertoire that will help your most experienced developers.
SQL is not always straight-forward.
In fact, it is becoming more complicated.
We can now do things in native SQL that used to only be possible in advanced query packages.
SQL can be fun! (or at least interesting)

Bonus Section
A few extra slides we may not have time for
DDL and DML in a transaction
Medians
Histograms
Instead of Triggers
Theta-Joins
Displaying Intervals Flexibly
DDL and DML in a Transaction
Be careful where you place the DDL statements
DDL performs an implicit commit
Begin Transaction
Insert into …
Create Table table_name …
Update …
Rollback Transaction
The Rollback statement will only undo the last Update statement.
Medians in SQL
Not supported by standard SQL: e.g. from Celko and Date. Look at an example of 4 Salaries and find the median of $80,000.
First, split the table in 2 and get the lowest value of the top half rows
Medians in SQL
Medians in SQL
Select avg(E.salary) AS median From Emp_Salary E Where E.salary in
(Select MIN(e.salary) FROM emp_salary e
Where e.salary in
(Select E2.salary FROM Emp_Salary E1, Emp_Salary E2 WHERE E2.salary <= E1.salary
GROUP BY E2.salary HAVING count(*) <=
(Select CEIL(count(*) /2) FROM Emp_Salary))
UNION
Select MAX(e3.salary) FROM emp_salary e3 Where e3.salary in
(Select E4.salary FROM Emp_Salary E5, Emp_Salary E4
WHERE E4.salary >= E5.salary GROUP BY E4.salary
HAVING count(*) <=
(Select CEIL(count(*) /2) FROM Emp_Salary)));

Medians in 9i
Can use the new inverse percentile function.
an inverse distribution function that assumes a discrete distribution model
An expression evaluates a value to a distribution between 0 and 1 as with cume_dist.
The inverse percentile can then find the value at the 0.5 level.
In the example on the next slide this is 100,000
Using 0.51 would have given 60,000

Medians in 9i
SELECT salary, deptid, CUME_DIST() OVER
(PARTITION BY deptid ORDER BY salary DESC) cume_dist,
PERCENTILE_DISC(0.5) WITHIN GROUP (ORDER BY salary DESC) OVER (PARTITION BY deptid) percentile_disc
FROM emp_salary ;


Histograms in Oracle
Useful to the optimizer when data is skewed and frequently queried
Not useful if predicates use bind variables, if data is evenly distributed or data is unique and queried with “=“ predicate.
Created with DBMS_STATS package or analyze statement
Oracle uses height-based histograms
If we specify 10 buckets, then Oracle divides the table into 10 equal parts and places the value of the column at that place in the table. For a skewed column, the value may look like this:
|-----|-----|------|------|-------|-------|--------|------|------|
1 3 3 1000 3000 3010 3010 7000 8000 40000
10% of the values column are > 8000 and 90% <= 8000.
Histograms in Oracle
You can view histograms in the following tables:
ALL|DBA|USER_histograms, _part_histograms, _subpart_histograms, _tab_columns, _tab_col_statistics.
SELECT endpoint_number, endpoint_value
FROM dba_histograms WHERE table_name = 'TESTTQ‘
and column_name = 'COL03';
endpoint_number endpoint_value
0 0
1 10
2 14
… …
10 220
Histogram Function in 9i
Not Optimizer Histograms: Height-based place the same number of values into each range
Width-based function: each column value is put into a corresponding bucket
For each row, returns the number of the histogram bucket for the data
(expr,min_value,max_value,num_buckets)
Histogram Function in 9i
Instead of Triggers
allows Views that violate normal Update rules for Views
can take an Insert, Update or Delete statement and rewrite it so that each Table in the View is updated individually. If for example, the following View is created:
Create View View1 as
Select a.*, b.*
From Table1 a, Table2 b
Where a.id = b.id;
The following SQL Insert statement is invalid because it attempts to insert into a joined View:
Insert into View1 Values (:var_1, :var_2, ... :var_n);
Instead of Triggers
This rule can be bypassed and the Insert successfully completed in Oracle if the following 'Instead of' Trigger is created on View1:
Create Trigger testtrigger Instead Of Insert on View1
For Each Row
Begin
Insert into Table 1 Values (:new.value_list...);
Insert into Table2 Values (:new.value_list...);
End
Fired for each row; Check clause on View not enforced; Before and After clause cannot be used.
 
Theta-Joins: Best Fit Join
Developed by Codd fit one list into another where they do not match exactly.
Example (from Celko) fitting classes into rooms:
CLASSES ROOMS
class_key class_size room_number room_size
C4 50 R4 45
C5 40 R5 55
R6 60
HOPE TO BUILD THE FOLLOWING RESULT :
Class_key class_size room_number room_size
C5 40 R4 45
C4 50 R5 55

Theta Joins: Best-Fit Joins
Step 1 create temporary table joining Class and Room where the class_size is <= the room_size and that chooses the smallest room_size.
create table temp_table as (Select class_key, class_size, min(room_size) room_size From Rooms R, Classes C Where c.class_size <= r.room_size Group By class_key, class_size);
Step 2) join temp_table created above with the rooms table.
Select c.class_key, c.class_size, r.room_number, r.room_size
From Temp_Table C, Rooms R
Where c.room_size = r.room_size;
Above can be performed in 1 step using an inline view.
Displaying Intervals Flexibly
A approach for displaying intervals is to generate ranges without hard-coding them is shown below.
Select 100*TRUNC (items/100) + 1 || ' TO ' || 100*(TRUNC (items/100) + 1) "Interval", count(*) "Customers“ From Customers Group by TRUNC(items/100); 
Interval Customers
1 TO 100 22567
101 TO 200 10952
201 TO 300 6390
301 TO 400 605
Anyone issuing queries on x$kcbrbh to tune buffer pools will recognize this syntax.