Saturday, June 20, 2009

Tuning by Batch processing in PL/sqL (BULK COLLECT, FORALL and Associative Arrays)

Tuning Batch processing in PL/sqL
(BULK COLLECT, FORALL and Associative Arrays)

BULK COLLECT (for Batch Query processing)
==================================
BULK COLLECT reduces FETCH operations by fetching many rows at a time into a collection.It should always be used in conjunction with the LIMIT clause where there is a risk of fetching too many rows, as the resultant collection would exceed memory limitations and start paging to disk.Anywhere from 100 to 1000 is a practical LIMIT; it is not important to find the "perfect" number as there is a law of diminishing returns.

BULK COLLECT with the LIMIT clause is only available with explicit FETCH statements, which is annoying because it precludes the tidy and intuitive Implicit Cursor FOR loops.

1. Implicit Cursor FOR Loop: tidy, intuitive, and very, very, slow
------------------------------------------------------------------
FOR empl IN (
SELECT ename
FROM emp
) LOOP
<... some statements ...>
END LOOP;

2. Implicit Cursor BULK COLLECT: efficient, but not scalable
---------------------------------------------------------
SELECT ename
BULK COLLECT INTO empl_tab
FROM emp;

FOR i IN empl_tab.FIRST .. empl_tab.LAST LOOP
<... some statements ...>
END LOOP;

3. Explicit Cursor BULK COLLECT with LIMIT: fast and scalable, but wordy and complex
----------------------------------------------------------------------------

DECLARE
CURSOR empl_cur IS
SELECT ename
FROM emp;

TYPE empl_tab_typ IS TABLE OF EMPL_CUR%ROWTYPE;
empl_tab EMPL_TAB_TYPE;
BEGIN
OPEN empl_cur;
LOOP
FETCH empl_cur BULK COLLECT INTO empl_tab LIMIT 1000;
EXIT WHEN empl_tab.COUNT = 0;

FOR i IN empl_tab.FIRST .. empl_tab.LAST LOOP
<... some statements ...>
END LOOP;

EXIT WHEN empl_cur%NOTFOUND;
END LOOP;
CLOSE empl_cur;
END;

Another Example :
-----------------------------

CREATE OR REPLACE FUNCTION stlbas.dfn_month_bal_pf_anl (
p_compcode IN VARCHAR2,
p_corcod IN VARCHAR2,
p_fromdate IN DATE,
p_todate IN DATE,
p_deb_cre IN VARCHAR2
)
RETURN NUMBER
IS
CURSOR my_cur
IS
SELECT SUM (NVL (debit, 0)) - SUM (NVL (credit, 0)) bal_d,
SUM (NVL (credit, 0)) - SUM (NVL (debit, 0)) bal_c
FROM mv_sttrailb
WHERE compcode = DECODE (p_compcode, 'All', compcode, p_compcode)
AND glcode = p_corcod
AND doctdate BETWEEN p_fromdate AND p_todate;

TYPE my_cur_type IS TABLE OF my_cur%ROWTYPE;

my_rec my_cur_type;
BEGIN
OPEN my_cur;

LOOP
FETCH my_cur
BULK COLLECT INTO my_rec LIMIT 1000;

FOR i IN 1 .. my_rec.COUNT
LOOP
IF p_deb_cre = 'D'
THEN
RETURN (my_rec (i).bal_d);
ELSIF p_deb_cre = 'C'
THEN
RETURN (my_rec (i).bal_c);
END IF;
END LOOP;

EXIT WHEN my_cur%NOTFOUND;
END LOOP;
END;
/

The FORALL Statement ( for INSERT, UPDATE, and DELETE) in pl/sql
=====================================================

Unlike the FOR loop, FORALL is more truly a statement than a loop; it can only repeat a single DML statement. Every INSERT, UPDATE, and DELETE statement in a PL/SQL program performs a context switch from the PL/SQL engine to the SQL engine. But by loading all of the rows to be processed into a collection and performing the DML inside a FORALL statement, the DML and the collection are passed to the SQL engine and run with a single context switch. Exceptions can be captured using the SAVE EXCEPTIONS clause and are returned in the SQL%BULK_EXCEPTIONS collection.

DECLARE
empl_tab_type IS TABLE OF emp_cur%ROWTYPE; -- collection of records
empno_tab_type IS TABLE OF emp.empno%TYPE; -- collection of scalars

empl_tab EMPL_TAB_TYPE;
empno_tab EMPNO_TAB_TYPE;

-- create an exception handler for ORA-24381
dml_errors EXCEPTION;
PRAGMA EXCEPTION_INIT(dml_errors, -24381);

BEGIN
: :
FORALL i IN empl_tab.FIRST .. empl_tab.LAST SAVE EXCEPTIONS
UPDATE emp
SET ROW = empl_tab(i)
WHERE empno = empno_tab(i);
: :

EXCEPTION
WHEN dml_errors THEN
FOR i IN 1 .. SQL%BULK_EXCEPTIONS.COUNT LOOP
<... handle the errors ...>
END LOOP;
END;

As demonstrated in this example, it is currently a limitation of PL/SQL that SQL statements can reference only entire rows of a collection, not individual data items within a record. Entire rows are useful when updating with SET ROW (as above) or record-wise inserting, but you must maintain separate scalar collections to reference individual values in the WHERE clause. Many find it easier to use only scalar collections and ignore the attractions of record-wise inserts and updates.

Associative Arrays
=============================================

Single row lookups in PL/SQL should never be performed as SQL statements (or cursors) inside a cursor loop. Instead, they should be merged into the main cursor using a table join; use an outer-join if there is a chance the row may not exist in the lookup table. This is still an efficient method even if the conditional PL/SQL logic does not need the joined
data for every row returned.

Some expert systems may have complex or iterative branching logic that makes the number of lookup permutations impractical. In these situations, the lookup tables can be cached in PL/SQL using an associative array (Index-By Table) so that lookups do not need a context switch.

DECLARE
TYPE currency_lkp_type IS TABLE OF currency%ROWTYPE
INDEX BY CURRENCY.CURRENCY_CODE%TYPE;
currency_lkp CURRENCY_LKP_TYPE;
BEGIN
FOR curr_rec IN (SELECT * FROM currency) LOOP
currency_lkp(curr.currency_code) := curr_rec;
END LOOP;

: :
FOR i IN txn_tab.FIRST .. txn_rec.LAST LOOP
: :
-- Lookup the currency table without using SQL

txn_tab.currency_desc := currency_lkp( txn_tab(i).currency_code );
: :
END LOOP;
END;

Note that this example breaks a cardinal rule by loading the associative array using an implicit-cursor FOR loop. It would be possible to BULK COLLECT the rows into a nested table, and then transfer the nested table row-by-row into the associative array, but it seems like overkill for downloading a small table just once for the entire job.

1 comment:

Anonymous said...

good article!