Function of OPTIMIZE FOR n ROWS in DB2

OPTIMIZE FOR n ROWS:

 Ø       This method enables the programmers to specify the estimated maximum number of rows

that will be retrieved.

Ø       It provides more information to DB2’s optimization process, thereby giving DB2 the opportunity to establish a better access path.

Ø       This type of tuning is preferable to both updating the DB2 catalog statistics and tweaking the SQL statements.

Ø       Trick here is choosing the number ‘n’ (i.e., estimated number of rows).

An accurate estimate ‘n’ gives DB2 the best opportunity to achieve optimum performance

for the SQL statement.

Ø       For Online transactions, where only 25 rows are to be retrieved and shown on the screen use OPTIMIZE FOR 25 rows.

 

Example:

Assume there are 50,000 records in the TMS941.SHIPMENT table.

Consider the following query 

SELECT S.*

FROM TMS941.SHIPMENT S

ORDER BY SHIPMENT_NBR

FETCH FIRST 5 ROWS ONLY;

 

Here we use “FETCH FIRST 5 ROWS ONLY” clause which stops the query at the 5th row.

This query first does a tablespace scan, then sorted all 50,000 matching rows and at last fetched first five resulting in a poor performance.

 EXPLAIN RESULTS: Fetch first 5 rows - 2.837 elapsed seconds

 

Tuned Query:

SELECT S.*

FROM TMS941.SHIPMENT S

ORDER BY SHIPMENT_NBR

FETCH FIRST 5 ROWS ONLY

OPTIMIZE FOR 5 ROWS;

 

EXPLAIN RESULTS: Fetch first 5 rows - 0.003 elapsed seconds

 

Reason:

 This query is essentially the same as the previous query, but this time we told DB2 to optimize the query for fetching five rows (estimated no of rows).

Unfortunately in the first query DB2 optimizer doesn’t know how to optimize the query and hence did a tablespace scan resulting in poor performance. Whereas in the tuned query it did know how to optimize (since we influenced the optimizer to follow better access path) and thus did an

Index scan with index on SHIPMENT_NBR for retrieving 5 rows resulting in optimum performance.

 

Note:

FETCH FIRST ‘n’ ROWS ONLY will limit the result set to ‘n’ rows.

But OPTIMIZE FOR ‘n’ ROWS will not limit the result set. It is just an OPTIMIZER Directive to chose the optimized path and the limit of result set lies with the WHERE clause.

comments powered by Disqus