Congratulations on buying this book. PL/SQL is a great tool to have in your toolbox, but it is not suitable for every scenario. This chapter will teach you when to code your application in PL/SQL, how to write scalable code, and, more importantly, when not to code programs in PL/SQL. Abuse of some PL/SQL constructs leads to unscalable code. In this chapter, I will review various cases in which the misuse of PL/SQL led to unscalable applications.
PL/SQL and SQL
In a typical row-by-row processing program, code opens a cursor, loops through the rows retrieved from the cursor, and processes those rows. This type of loop-based processing construct is highly discouraged as it leads to unscalable code. Listing 1-1 shows an example of a program using the construct.
Listing 1-1. Row-by-Row Processing
DECLARE
CURSOR c1 IS
SELECT prod_id, cust_id, time_id, amount_sold
FROM sales
WHERE amount_sold > 100;
l_cust_first_name customers.cust_first_name%TYPE;
l_cust_last_name customers.cust_last_name%TYPE;
BEGIN
FOR c1_rec IN c1
LOOP
-- Query customer details
SELECT cust_first_name, cust_last_name
INTO l_cust_first_name, l_cust_last_name
FROM customers
WHERE cust_id=c1_rec.cust_id;
--
-- Insert in to target table
--
INSERT INTO top_sales_customers (
prod_id, cust_id, time_id, cust_first_name, cust_last_name,amount_sold
)
VALUES
(
c1_rec.prod_id,
c1_rec.cust_id,
c1_rec.time_id,
l_cust_first_name,
l_cust_last_name,
c1_rec.amount_sold
);
END LOOP;
COMMIT;
END;
/
PL/SQL procedure successfully completed.
Elapsed: 00:00:10.93
In Listing 1-1, the program declares a cursor c1 and opens the cursor implicitly using cursor FOR loop syntax. For each row retrieved from cursor c1, the program queries the customers table to populate the first-name and last-name variables. A row is subsequently inserted into the top_sales_customers table.
There is a problem with the coding practice exemplified in Listing 1-1. Even if the SQL statements called in the loop are highly optimized, program execution can consume a huge amount of time. Imagine that the SQL statement querying the customers table consumes an elapsed time of 0.1 seconds, and that the INSERT statement also consumes 0.1 seconds, giving a total elapsed time of 0.2 seconds per loop iteration. If cursor c1 retrieves 100,000 rows, then the total elapsed time for this program will be 100,000 multiplied by 0.2 seconds: 20,000 seconds, or approximately 5.5 hours. Optimizing this program construct is not easy. Tom Kyte termed this type of processing "slow-by-slow" processing, for obvious reasons.
Note Examples in this chapter use the SH schema, one of the example schemas supplied by Oracle Corporation. To install the example schemas, use the Oracle-provided examples software; for the 11gR2 Solaris platform, you can download it from http://download.oracle.com/otn/solaris/oracle11g/R2/solaris.sparc64_11gR2_examples.zip. Refer to the Readme document in the unzipped software directories for installation instructions. Zip files for other platforms and versions are also available from Oracle's web site.
There is another inherent issue with the code in Listing 1-1: the SQL statements are called from PL/SQL in a loop, so execution switches back and forth between the PL/SQL engine and the SQL engine. Each switch between the two environments is known as a context switch. Context switches increase the elapsed time of your programs and introduce unnecessary CPU overhead. You should reduce the number of context switches by eliminating or reducing the switching between these two environments.
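When the logic genuinely must stay in PL/SQL, you can still cut down context switches with bulk processing. The following is a minimal sketch of Listing 1-1 reworked with BULK COLLECT and FORALL, with the customer lookup folded into the cursor as a join; the LIMIT of 1000 is an arbitrary batch size. Each fetch and each insert now costs one context switch per batch instead of one per row.

```sql
DECLARE
  CURSOR c1 IS
    SELECT s.prod_id, s.cust_id, s.time_id,
           c.cust_first_name, c.cust_last_name, s.amount_sold
      FROM sales s, customers c
     WHERE s.cust_id = c.cust_id
       AND s.amount_sold > 100;
  TYPE c1_tab_type IS TABLE OF c1%ROWTYPE INDEX BY PLS_INTEGER;
  l_rows c1_tab_type;
BEGIN
  OPEN c1;
  LOOP
    -- One context switch fetches up to 1,000 rows
    FETCH c1 BULK COLLECT INTO l_rows LIMIT 1000;
    EXIT WHEN l_rows.COUNT = 0;
    -- One context switch inserts the whole batch; this record-based
    -- insert assumes the cursor's select list matches the column
    -- order of top_sales_customers, as it does in Listing 1-1
    FORALL i IN 1 .. l_rows.COUNT
      INSERT INTO top_sales_customers VALUES l_rows(i);
  END LOOP;
  CLOSE c1;
  COMMIT;
END;
/
```

Bulk processing does not beat the pure SQL rewrite in Listing 1-2, but it is a reasonable middle ground when per-row procedural logic cannot be expressed in SQL.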
You should generally avoid row-by-row processing. Better coding practice would be to convert the program from Listing 1-1 into a SQL statement. Listing 1-2 rewrites the code, avoiding PL/SQL entirely.
Listing 1-2. Row-by-Row Processing Rewritten
--
-- Insert in to target table
--
INSERT
INTO top_sales_customers
(
prod_id,
cust_id,
time_id,
cust_first_name,
cust_last_name,
amount_sold
)
SELECT s.prod_id,
s.cust_id,
s.time_id,
c.cust_first_name,
c.cust_last_name,
s.amount_sold
FROM sales s,
customers c
WHERE s.cust_id = c.cust_id and
s.amount_sold> 100;
135669 rows created.
Elapsed: 00:00:00.26
The code in Listing 1-2, in addition to resolving the shortcomings of row-by-row processing, has a few more advantages. Parallel execution can be used to tune the rewritten SQL statement; with multiple parallel execution processes, you can decrease the elapsed time sharply. Furthermore, the code becomes concise and readable.
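For instance, assuming the instance is licensed and configured for parallel execution, the rewritten insert could be parallelized as below; the degree of 4 is purely illustrative.

```sql
-- Parallel DML must be enabled at the session level first
ALTER SESSION ENABLE PARALLEL DML;

INSERT /*+ APPEND PARALLEL(tsc, 4) */
  INTO top_sales_customers tsc
  (prod_id, cust_id, time_id, cust_first_name, cust_last_name, amount_sold)
SELECT /*+ PARALLEL(s, 4) PARALLEL(c, 4) */
       s.prod_id, s.cust_id, s.time_id,
       c.cust_first_name, c.cust_last_name, s.amount_sold
  FROM sales s, customers c
 WHERE s.cust_id = c.cust_id
   AND s.amount_sold > 100;

COMMIT;
```

Weigh the parallel degree against the resources available on your server; over-parallelizing a small statement can cost more than it saves.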
Note If you rewrite the PL/SQL loop code as a join, you need to consider duplicates. If there are duplicate rows in the customers table for the same cust_id value, then the rewritten SQL statement will retrieve more rows than intended. In this specific example, however, there is a primary key on the cust_id column in the customers table, so there is no danger of duplicates with an equality predicate on the cust_id column.
You can nest cursors in PL/SQL. It is a common coding practice to retrieve values from one cursor, feed those values to another cursor, feed the values from the second-level cursor to a third-level cursor, and so on. But the performance issues of loop-based code are compounded when cursors are deeply nested. The number of SQL executions increases sharply due to the nesting of cursors, leading to a longer program runtime.
In Listing 1-3, cursors c1, c2, and c3 are nested. Cursor c1 is the top-level cursor and retrieves rows from table t1; cursor c2 is opened, passing the values from cursor c1; cursor c3 is opened, passing the values from cursor c2. An UPDATE statement is executed for every row retrieved from cursor c3. Even if the UPDATE statement is optimized to execute in 0.01 seconds, performance of the program suffers due to the deeply nested cursors. Say that cursors c1, c2, and c3 retrieve 20, 50, and 100 rows, respectively. The code then loops through 100,000 rows (20 x 50 x 100), and the total elapsed time of the program exceeds 1,000 seconds. Tuning this type of program usually leads to a complete rewrite.
Listing 1-3. Row-by-Row Processing with Nested Cursors
DECLARE
CURSOR c1 IS
SELECT n1 FROM t1;
CURSOR c2 (p_n1 t1.n1%TYPE) IS
SELECT n1, n2 FROM t2 WHERE n1=p_n1;
CURSOR c3 (p_n1 t2.n1%TYPE, p_n2 t2.n2%TYPE) IS
SELECT text FROM t3 WHERE n1=p_n1 AND n2=p_n2;
BEGIN
FOR c1_rec IN c1
LOOP
FOR c2_rec IN c2 (c1_rec.n1)
LOOP
FOR c3_rec IN c3(c2_rec.n1, c2_rec.n2)
LOOP
-- Execute some SQL here. The UPDATE/INSERT pair needs its
-- own nested block so that the exception handler is legal:
BEGIN
UPDATE … SET … WHERE n1=c3_rec.n1 AND n2=c3_rec.n2;
EXCEPTION
WHEN no_data_found THEN
INSERT INTO …;
END;
END LOOP;
END LOOP;
END LOOP;
COMMIT;
END;
/
Another problem with the code in Listing 1-3 is its update-or-insert logic: an UPDATE statement is executed and, if that UPDATE results in a no_data_found exception, an INSERT statement is executed instead. It is possible to offload this type of processing from PL/SQL to the SQL engine using a MERGE statement.
Conceptually, the three loops in Listing 1-3 represent an equi-join between the tables t1, t2, and t3. In Listing 1-4, the logic is rewritten as a SQL statement, with the join aliased as t. The combination of UPDATE and INSERT logic is replaced by a MERGE statement. MERGE syntax provides the ability to update a row if it exists and insert a row if it does not exist.
Listing 1-4. Row-by-Row Processing Rewritten Using MERGE Statement
MERGE INTO fact1 USING
(SELECT DISTINCT c3.n1,c3.n2
FROM t1, t2, t3
WHERE t1.n1 = t2.n1
AND t2.n1 = t3.n1
AND t2.n2 = t3.n2
) t
ON (fact1.n1=t.n1 AND fact1.n2=t.n2)
WHEN MATCHED THEN
UPDATE SET ..
WHEN NOT MATCHED THEN
INSERT .. ;
COMMIT;
Do not write deeply nested cursors in PL/SQL. Review such code to see if you can write it in SQL instead.
Lookup queries are generally used to populate some variable or to perform data validation. Executing lookup queries in a loop causes performance issues.
In Listing 1-5, the highlighted query retrieves the country_name using a lookup query. For every row from cursor c1, a query to fetch the country_name is executed. As the number of rows retrieved from cursor c1 increases, the number of executions of the lookup query also increases, leading to poorly performing code.
Listing 1-5. Lookup Queries, a Modified Copy of Listing 1-1
DECLARE
CURSOR c1 IS
SELECT prod_id, cust_id, time_id, amount_sold
FROM sales
WHERE amount_sold > 100;
l_cust_first_name customers.cust_first_name%TYPE;
l_cust_last_name customers.cust_last_name%TYPE;
l_Country_id countries.country_id%TYPE;
l_country_name countries.country_name%TYPE;
BEGIN
FOR c1_rec IN c1
LOOP
-- Query customer details
SELECT cust_first_name, cust_last_name, country_id
INTO l_cust_first_name, l_cust_last_name, l_country_id
FROM customers
WHERE cust_id=c1_rec.cust_id;
-- Query to get country_name
SELECT country_name
INTO l_country_name
FROM countries WHERE country_id=l_country_id;
--
-- Insert in to target table
--
INSERT
INTO top_sales_customers
(
prod_id, cust_id, time_id, cust_first_name,
cust_last_name, amount_sold, country_name
)
VALUES
(
c1_rec.prod_id, c1_rec.cust_id, c1_rec.time_id, l_cust_first_name,
l_cust_last_name, c1_rec.amount_sold, l_country_name
);
END LOOP;
COMMIT;
END;
/
PL/SQL procedure successfully completed.
Elapsed: 00:00:16.18
The example in Listing 1-5 is simplistic. The lookup query for the country_name can be rewritten as a join in the main cursor c1 itself, and as a first step you should modify the lookup query into a join. In a real-world application, though, this type of rewrite is not always possible.
If you can't rewrite the code to reduce the executions of a lookup query, then you have another option. You can define an associative array to cache the results of the lookup query and reuse the array in later executions, thus effectively reducing the executions of the lookup query.
Listing 1-6 illustrates the array-caching technique. Instead of executing the query to retrieve the country_name for every row from cursor c1, a key-value pair ((country_id, country_name) in this example) is stored in an associative array named l_country_names. An associative array is similar to an index in that any given value can be accessed using a key.
Before executing the lookup query, an existence test is performed for an element matching the country_id key value using the EXISTS method. If the element exists in the array, then the country_name is retrieved from the array without executing the lookup query. If not, then the lookup query is executed and a new element is added to the array.
You should also understand that this technique is suitable only for lookups with few distinct values for the key. In this example, the number of executions of the lookup query will probably be much lower because the number of unique values in the country_id column is low. Using the example schema, the lookup query will execute at most 23 times, as there are only 23 distinct values for the country_id column.
Listing 1-6. Lookup Queries with Associative Arrays
DECLARE
CURSOR c1
IS
SELECT prod_id, cust_id, time_id, amount_sold
FROM sales WHERE amount_sold > 100;
l_country_id countries.country_id%TYPE;
l_country_name countries.country_name%TYPE;
l_cust_first_name customers.cust_first_name%TYPE;
l_cust_last_name customers.cust_last_name%TYPE;
TYPE country_names_type IS
TABLE OF VARCHAR2(40) INDEX BY pls_integer;
l_country_names country_names_type;
BEGIN
FOR c1_rec IN c1 LOOP
-- Query customer details
SELECT cust_first_name, cust_last_name, country_id
INTO l_cust_first_name, l_cust_last_name, l_country_id
FROM customers
WHERE cust_id=c1_rec.cust_id;
-- Check array first before executing a SQL statement
IF ( l_country_names.EXISTS(l_country_id)) THEN
l_country_name := l_country_names(l_country_id);
ELSE
SELECT country_name INTO l_country_name
FROM countries
WHERE country_id = l_country_id;
-- Store in the array for further reuse
l_country_names(l_country_id) := l_country_name;
END IF;
--
-- Insert in to target table
--
INSERT
INTO top_sales_customers
(
prod_id, cust_id, time_id, cust_first_name,
cust_last_name, amount_sold, country_name
)
VALUES
(
c1_rec.prod_id, c1_rec.cust_id, c1_rec.time_id, l_cust_first_name,
l_cust_last_name, c1_rec.amount_sold, l_country_name
);
END LOOP;
COMMIT;
END;
/
PL/SQL procedure successfully completed.
Elapsed: 00:00:10.89
Note Associative arrays are allocated in the Program Global Area (PGA) of the dedicated server process in the database server. If there are thousands of connections caching intermediate results in arrays, then there will be a noticeable increase in memory usage. You should measure the memory usage increase per process and size the database server to accommodate it.
Array-based techniques can be used to eliminate unnecessary work in other scenarios, too. For example, executions of costly function calls can be reduced via this technique by storing the function results in an associative array. (The “Excessive Function Calls” section later in this chapter discusses another technique to reduce the number of executions.)
Note Storing the function results in an associative array will work only if the function is a deterministic function, meaning that for a given set of input(s), the function will always return the same output.
It is not uncommon for code to access the DUAL table excessively. You should avoid overusing DUAL table access. Accessing DUAL from PL/SQL causes context switching, which hurts performance. This section reviews some common reasons for accessing DUAL excessively and discusses mitigation plans.
There is no reason to access the DUAL table to perform arithmetic operations or DATE manipulations, as most operations can be performed using PL/SQL language constructs. Even SYSDATE can be accessed directly in PL/SQL without visiting the SQL engine. In Listing 1-7, the highlighted SQL statement calculates the UNIX epoch time (defined as the number of seconds elapsed since midnight on January 1, 1970) using SELECT .. FROM DUAL syntax. While access to the DUAL table is fast, execution of the statement still results in a context switch between the SQL and PL/SQL engines.
Listing 1-7. Excessive Access to DUAL—Arithmetics
DECLARE
l_epoch INTEGER;
BEGIN
SELECT ((SYSDATE-TO_DATE('01-JAN-1970 00:00:00', 'DD-MON-YYYY HH24:MI:SS'))
* 24 *60 *60 )
INTO l_epoch
FROM DUAL;
dbms_output.put_line(l_epoch);
END;
/
You can avoid the unnecessary access to DUAL table by performing the arithmetic calculation using PL/SQL constructs like this:
l_epoch := (SYSDATE- TO_DATE('01-JAN-1970 00:00:00', 'DD-MON-YYYY HH24:MI:SS'))
* 24 *60 *60 ;
There's simply no need to invoke SQL to execute numeric operations. Simply do the work from PL/SQL. Your code will be easier to read, and it will perform better.
Excessive querying of the current date or timestamp is another reason for increased access to the DUAL table. Consider coding a call to SYSDATE in the SQL statement directly instead of selecting SYSDATE into a variable and then passing that value back to the SQL engine. If you need to access a column value after inserting a row, then use the RETURNING clause to fetch it. If you need the current date in PL/SQL itself, assign SYSDATE directly to a variable.
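A short sketch of both points, reusing the log_table and message_id_seq objects created in Listing 1-9 later in this chapter:

```sql
DECLARE
  l_run_start DATE;
  l_new_seq   NUMBER;
BEGIN
  -- Assign SYSDATE directly; no SELECT ... FROM DUAL needed
  l_run_start := SYSDATE;
  -- Reference SYSDATE inside the DML, and fetch the generated
  -- key back with RETURNING instead of a separate query
  INSERT INTO log_table (message_seq, message)
  VALUES (message_id_seq.nextval,
          'started at ' || TO_CHAR(SYSDATE, 'HH24:MI:SS'))
  RETURNING message_seq INTO l_new_seq;
END;
/
```

Neither statement here touches DUAL, so the block costs two context switches fewer than the variable-based equivalent.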
Another common reason for unnecessary access to the DUAL table is to retrieve the next value from a sequence. Listing 1-8 shows a code fragment selecting the next value from cust_hist_id_seq into a variable and then inserting into the customers_hist table using that variable.
Listing 1-8. Excessive Access to DUAL— Sequences
DECLARE
l_cust_id NUMBER;
BEGIN
FOR c1 in (SELECT cust_first_name, cust_last_name FROM customers
WHERE cust_marital_status!='married')
LOOP
SELECT cust_hist_id_seq.nextval INTO l_cust_id FROM dual;
INSERT INTO customers_hist
(cust_hist_id, first_name, last_name )
VALUES
(l_cust_id, c1.cust_first_name, c1.cust_last_name)
;
END LOOP;
END;
/
PL/SQL procedure successfully completed.
Elapsed: 00:00:01.89
A better approach is to avoid retrieving the value into a variable and instead reference the sequence directly in the INSERT statement itself. The following code fragment illustrates an INSERT statement inserting rows into customers using a sequence-generated value. With this coding practice, you avoid accessing the DUAL table, and thus avoid context switches between the engines.
INSERT INTO customers (cust_id, ...)
VALUES (cust_id_seq.nextval, ...);
Better yet, rewrite the PL/SQL block as a SQL statement. For example, the following rewritten statement completes in 0.2 seconds compared to a run time of 1.89 seconds with PL/SQL loop-based processing:
INSERT INTO customers_hist
SELECT
cust_hist_id_seq.nextval, cust_first_name, cust_last_name
FROM customers
WHERE cust_marital_status!='married';
23819 rows created.
Elapsed: 00:00:00.20
Another common reason for excessive access to DUAL table is to insert rows into tables involved in a master-detail relationship. Typically, in this coding practice, the primary key value for the master table is fetched from the sequence into a local variable. Then that local variable is used while inserting into master and detail tables. The reason this approach developed is that the primary key value of the master table is needed while inserting into the detail table(s).
A SQL feature introduced in Oracle Database version 9i provides a better solution by allowing you to return values from an inserted row. You can retrieve the key value from a newly inserted master row by using the DML RETURNING clause, and then use that key value while inserting into the detail table. For example:
INSERT INTO customers (cust_id, ...)
VALUES (cust_id_seq.nextval,...)
RETURNING cust_id into l_cust_id;
...
INSERT INTO customer_transactions (cust_id, ...)
VALUES (l_cust_id,...)
...
It is important to recognize that well-designed applications use functions, procedures, and packages. This section is not a discussion of well-designed programs that use modular coding practices. Rather, it is specifically directed at the practice of calling functions unnecessarily.
Executing a function call usually means that a different part of the instruction stream must be loaded into the CPU. Execution jumps from one section of code to another, which adds to performance issues because it entails flushing and refilling the instruction pipeline. The result is additional CPU usage.
By avoiding unnecessary function execution, you avoid unneeded flushing and refilling of the instruction pipeline, thus minimizing demands upon your CPU. Again, I am not arguing against modular coding practices. I argue only against excessive and unnecessary execution of function calls. I can best explain by example.
In Listing 1-9, log_entry is a debug function and is called for every validation. The function itself checks v_debug, and messages are inserted only if the debug flag is set to TRUE. Imagine a program with hundreds of such complex business validations performed in a loop. Essentially, the log_entry function will be called millions of times unnecessarily, even when the v_debug flag is FALSE.
Listing 1-9. Unnecessary Function Calls
create table log_table ( message_seq number, message varchar2(512));
create sequence message_id_seq;
DECLARE
l_debug BOOLEAN := FALSE;
r1 integer;
FUNCTION log_entry( v_message IN VARCHAR2, v_debug in boolean)
RETURN number
IS
BEGIN
IF(v_debug) THEN
INSERT INTO log_table
(message_seq, MESSAGE)
VALUES
(message_id_seq.nextval, v_message
);
END IF;
return 0;
END;
BEGIN
FOR c1 IN (
SELECT s.prod_id, s.cust_id,s.time_id,
c.cust_first_name, c.cust_last_name,
s.amount_sold
FROM sales s,
customers c
WHERE s.cust_id = c.cust_id and
s.amount_sold> 100)
LOOP
IF c1.cust_first_name IS NOT NULL THEN
r1 := log_entry ('first_name is not null ', l_debug );
END IF;
IF c1.cust_last_name IS NOT NULL THEN
r1 := log_entry ('Last_name is not null ', l_debug);
END IF;
END LOOP;
END;
/
PL/SQL procedure successfully completed.
Elapsed: 00:00:00.54
The code in Listing 1-9 can be rewritten to call the log_entry function only if the l_debug flag is set to TRUE. This rewrite reduces unnecessary executions of the log_entry function; the rewritten program completes in 0.43 seconds. The performance improvement becomes more noticeable with a higher number of executions.
...
IF first_name IS NULL AND l_debug=TRUE THEN
r1 := log_entry('first_name is null ', l_debug);
END IF;
...
/
PL/SQL procedure successfully completed.
Elapsed: 00:00:00.43
For an even better approach, consider conditional compilation constructs to avoid executing this code fragment completely. In Listing 1-10, the highlighted code uses the $IF-$THEN construct with the conditional compilation flag $$debug_on. If $$debug_on is TRUE, then the code block is compiled and executed. In a production environment, the debug_on flag will be FALSE, eliminating the function execution entirely. Note that the elapsed time of the program drops further, to 0.34 seconds.
Listing 1-10. Avoiding Unnecessary Function Calls with Conditional Compilation
DECLARE
l_debug BOOLEAN := FALSE;
r1 integer;
FUNCTION log_entry( v_message IN VARCHAR2, v_debug in boolean)
RETURN number
IS
BEGIN
IF(v_debug) THEN
INSERT INTO log_table
(message_seq, MESSAGE)
VALUES
(message_id_seq.nextval, v_message
);
END IF;
return 0;
END;
BEGIN
FOR c1 IN (
SELECT s.prod_id, s.cust_id,s.time_id,
c.cust_first_name, c.cust_last_name,
s.amount_sold
FROM sales s,
customers c
WHERE s.cust_id = c.cust_id and
s.amount_sold> 100)
LOOP
$IF $$debug_on $THEN
IF c1.cust_first_name IS NOT NULL THEN
r1 := log_entry ('first_name is not null ', l_debug );
END IF;
IF c1.cust_last_name IS NOT NULL THEN
r1 := log_entry ('Last_name is not null ', l_debug);
END IF;
$END
null;
END LOOP;
END;
/
PL/SQL procedure successfully completed.
Elapsed: 00:00:00.34
The problem of invoking functions unnecessarily tends to occur frequently in programs copied from another template program and then modified. Watch for this problem. If a function doesn't need to be called, avoid calling it.
INTERPRETED VS. NATIVE COMPILATION
If the execution of a function consumes a few seconds of elapsed time, then calling that function in a loop will result in poorly performing code. You should optimize frequently executed functions to run as efficiently as possible.
In Listing 1-11, imagine that the function calculate_epoch is called in a loop millions of times. Even if a single execution of that function consumes just 0.01 seconds, one million executions will result in an elapsed time of roughly 2.7 hours. One option for resolving this performance issue is to optimize the function to execute in a few milliseconds, but that much optimization is not always possible.
Listing 1-11. Costly Function Calls
CREATE OR REPLACE FUNCTION calculate_epoch (d in date)
RETURN NUMBER DETERMINISTIC IS
l_epoch number;
BEGIN
l_epoch := (d - TO_DATE('01-JAN-1970 00:00:00', 'DD-MON-YYYY HH24:MI:SS'))
* 24 *60 *60 ;
RETURN l_epoch;
END calculate_epoch;
/
SELECT /*+ cardinality (10) */ max( calculate_epoch (s.time_id)) epoch
FROM sales s
WHERE s.amount_sold> 100 and
calculate_epoch (s.time_id) between 1000000000 and 1100000000;
EPOCH
----------
1009756800
Elapsed: 00:00:01.39
Another option is to pre-store the results of the function execution, thus avoiding function execution at query time. You can do this by employing a function-based index. In Listing 1-12, a function-based index on the function calculate_epoch is created. Performance of the SQL statement improves from 1.39 seconds to 0.06 seconds.
Listing 1-12. Costly Function Call with Function-Based Index
CREATE INDEX compute_epoch_fbi ON sales
(calculate_epoch(time_id))
Parallel (degree 4);
SELECT /*+ cardinality (10) */ max( calculate_epoch (s.time_id)) epoch
FROM sales s
WHERE s.amount_sold> 100 and
calculate_epoch (s.time_id) between 1000000000 and 1100000000;
EPOCH
----------
1009756800
Elapsed: 00:00:00.06
You should also understand that function-based indexes have a cost. INSERT statements, and UPDATE statements that modify the time_id column, will incur the cost of calling the function and maintaining the index. Carefully weigh the cost of function execution in DML operations against the cost of function execution in SELECT statements to choose the cheaper option.
Note From Oracle Database version 11g onward, you can create a virtual column and then create an index on that virtual column. The effect of an indexed virtual column is the same as that of a function-based index. An advantage of virtual columns over function-based indexes is that you can partition the table using a virtual column, which is not possible with a function-based index alone.
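A hedged sketch of an indexed virtual column for the epoch computation; the column and index names are invented, and the expression is written inline rather than via the PL/SQL function so that the column remains a plain SQL expression:

```sql
-- Virtual column holding the epoch value, computed from time_id
ALTER TABLE sales ADD (
  epoch_time NUMBER
    GENERATED ALWAYS AS ((time_id - DATE '1970-01-01') * 24 * 60 * 60) VIRTUAL
);

-- Indexing the virtual column has the same effect as the
-- function-based index in Listing 1-12
CREATE INDEX sales_epoch_vc ON sales (epoch_time);
```

Queries can then filter on epoch_time directly, and the optimizer can use the index just as it would a function-based index.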
The function result cache, available from Oracle Database version 11g, is another option for tuning the execution of costly PL/SQL functions. Results from function executions are remembered in the result cache allocated in the System Global Area (SGA) of the instance. A repeated execution of the function with the same parameters fetches the result from the cache without re-executing the function. Listing 1-13 shows a function declared with RESULT_CACHE to improve performance: the SQL statement completes in 0.81 seconds.
Listing 1-13. Functions with Result_cache
DROP INDEX compute_epoch_fbi;
CREATE OR REPLACE FUNCTION calculate_epoch (d in date)
RETURN NUMBER DETERMINISTIC RESULT_CACHE IS
l_epoch number;
BEGIN
l_epoch := (d - TO_DATE('01-JAN-1970 00:00:00', 'DD-MON-YYYY HH24:MI:SS'))
* 24 *60 *60 ;
RETURN l_epoch;
END calculate_epoch;
/
SELECT /*+ cardinality (10) */ max( calculate_epoch (s.time_id)) epoch
FROM sales s
WHERE s.amount_sold> 100 and
calculate_epoch (s.time_id) between 1000000000 and 1100000000;
EPOCH
----------
1009756800
Elapsed: 00:00:00.81
In summary, excessive function execution leads to performance issues. If you can't reduce or eliminate the function executions, you may be able to employ function-based indexes or the result cache as a short-term fix to minimize the impact of function invocation.
Excessive database link-based calls can affect application performance. Accessing a remote table or modifying a remote table over a database link within a loop is not a scalable approach. For each access to a remote table, several SQL*Net packets are exchanged between the databases involved in the database link. If the databases are located in geographically separated datacenters or, worse, across the globe, then the waits for SQL*Net traffic will result in program performance issues.
In Listing 1-14, for every row returned from the cursor, the customers table in the remote database is accessed. Let's assume that a round-trip network call takes 100ms; 1 million round-trip calls will then take approximately 27 hours to complete. A response time of 100ms between databases located in different parts of the country is not uncommon.
Listing 1-14. Excessive Database Link Calls
DECLARE
V_customer_name VARCHAR2(32);
BEGIN
...
FOR c1 IN (SELECT …)
LOOP
...
SELECT customer_name
INTO v_customer_name
FROM customers@remotedb
WHERE account_id = c1.account_id;
...
END LOOP;
END;
Judicious use of materialized views can reduce the number of network round trips during program execution. In the case of Listing 1-14, the customers table can be created as a materialized view: refresh the materialized view before program execution, and access the materialized view in the program. Materializing the table locally reduces the number of SQL*Net round-trip calls. Of course, as an application designer, you need to compare the cost of materializing the whole table against the cost of accessing a remote table in a loop, and choose the optimal solution.
Rewriting the program as a SQL statement with a join to a remote table is another option. The query optimizer in Oracle Database can optimize such statements so as to reduce the SQL*Net trip overhead. For this technique to work, you should rewrite the program so that SQL statement is executed once and probably not in a loop.
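A sketch of such a rewrite, joining to the remote table in a single statement; accounts and account_summaries are invented local names, and remotedb is the database link from Listing 1-14:

```sql
-- One statement, one (optimized) set of SQL*Net round trips,
-- instead of one remote query per loop iteration
INSERT INTO account_summaries (account_id, customer_name)
SELECT a.account_id, c.customer_name
  FROM accounts a,
       customers@remotedb c
 WHERE a.account_id = c.account_id;
```

The optimizer can choose to ship the whole join to one site, which is precisely the behavior a row-at-a-time loop prevents.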
Materializing the data locally or rewriting the code as a SQL statement with a remote join are the first steps in tuning the program in Listing 1-14. If you are unable to do either, there is a workaround: as an interim measure, you can convert the program to a multi-process architecture. For example, process #1 handles customers in the range 1 to 100,000, process #2 handles customers in the range 100,001 to 200,000, and so on. Applying this logic to the example program with 10 processes reduces the total run time to approximately 2.7 hours. The DBMS_PARALLEL_EXECUTE package is another option to consider for splitting the work into parallel chunks.
Triggers are usually written in PL/SQL, although you can write trigger code in Java as well. Excessive use of triggers is not ideal for performance reasons. Row changes are performed in the SQL engine while triggers are executed in the PL/SQL engine, so once again you encounter the dreaded context-switch problem.
In some cases, triggers are unavoidable. Complex business validation, for example, sometimes cannot be performed anywhere else, and in those scenarios writing the validation in trigger PL/SQL code is reasonable. You should, however, avoid overusing triggers for simple validation. For example, use a check constraint rather than a trigger to enforce a list of valid values for a column.
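For example, a valid-values check that might otherwise be coded as a trigger can be declared instead; the constraint name and value list here are illustrative, not from the SH schema:

```sql
-- Declarative validation: enforced by the SQL engine,
-- no trigger and no context switch
ALTER TABLE customers ADD CONSTRAINT cust_marital_status_ck
  CHECK (cust_marital_status IN ('single', 'married', 'divorced', 'widowed'));
```

The constraint is enforced for every DML path, including direct SQL that would bypass application-side checks.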
Further, avoid using multiple triggers for the same trigger action. Instead of writing two different triggers for the same action, you should combine them into one so as to minimize the number of context switches.
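To sketch the idea of combining triggers (the trigger name and body are invented; the point is one trigger per timing and action rather than several):

```sql
-- Instead of two separate BEFORE INSERT OR UPDATE row triggers
-- on customers, one trigger performs both actions:
CREATE OR REPLACE TRIGGER customers_biu
BEFORE INSERT OR UPDATE ON customers
FOR EACH ROW
BEGIN
  -- action formerly in trigger #1
  :new.cust_first_name := INITCAP(:new.cust_first_name);
  -- action formerly in trigger #2
  :new.cust_last_name := INITCAP(:new.cust_last_name);
END;
/
```

One trigger firing once per row costs one context switch where two triggers would cost two.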
It is not uncommon to see commits after every row inserted, modified, or deleted in a PL/SQL loop. The practice of committing after every row leads to slower program execution. Frequent commits generate more redo, require the log writer to flush the contents of the log buffer to the log file frequently, can lead to data integrity issues, and consume more resources. The PL/SQL engine is optimized to reduce the effect of frequent commits, but there is no substitute for well-written code when it comes to reducing commits.
You should commit only at the completion of a business transaction; if you commit earlier than the business transaction boundary, you can encounter data integrity issues. If you must commit to improve restartability, consider batch commits. For example, rather than committing after each row, commit every 1,000 or 5,000 rows (the choice of batch size depends upon your application). Fewer commits reduce the elapsed time of the program, and fewer commits from the application also improve the performance of the database.
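A minimal sketch of batch commits applied to the loop from Listing 1-8; the 1,000-row interval is arbitrary, and the sketch assumes the loop can safely restart from where it left off:

```sql
DECLARE
  l_rows_processed PLS_INTEGER := 0;
BEGIN
  FOR c1 IN (SELECT cust_first_name, cust_last_name
               FROM customers
              WHERE cust_marital_status != 'married')
  LOOP
    INSERT INTO customers_hist (cust_hist_id, first_name, last_name)
    VALUES (cust_hist_id_seq.nextval, c1.cust_first_name, c1.cust_last_name);
    l_rows_processed := l_rows_processed + 1;
    -- Commit every 1,000 rows instead of every row
    IF MOD(l_rows_processed, 1000) = 0 THEN
      COMMIT;
    END IF;
  END LOOP;
  COMMIT;  -- final commit for the remaining rows
END;
/
```

Note that committing inside a loop over an open cursor can expose you to ORA-01555 (snapshot too old) on long runs, which is another reason to keep batches reasonably large and the loop short.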
Don't use dynamic SQL statements in a PL/SQL loop, as doing so induces excessive parsing. Instead, reduce the amount of hard parsing through the use of bind variables.
In Listing 1-15, the customers table is accessed to retrieve customer details, passing cust_id from cursor c1. A SQL statement with literal values is constructed and then executed using the native dynamic SQL EXECUTE IMMEDIATE construct. The problem is that for every row retrieved from cursor c1, a new SQL statement text is constructed and sent to the SQL engine for execution.
Statements that don't exist in the shared pool when you execute them will incur a hard parse. Excessive hard parsing stresses the library cache, thereby reducing the application's scalability and concurrency. As the number of rows returned from cursor c1 increases, the number of hard parses will increase linearly. This program might work in a development database with a small number of rows to process, but the approach could very well become a problem in a production environment.
Listing 1-15. Excessive Parsing
DECLARE
...
BEGIN
FOR c1_rec IN c1
LOOP
-- Query customer details
EXECUTE IMMEDIATE
'SELECT cust_first_name, cust_last_name, country_id
FROM customers
WHERE cust_id= ' || c1_rec.cust_id INTO l_cust_first_name, l_cust_last_name,
l_country_id;
...
END LOOP;
COMMIT;
END;
/
You should reduce hard parses as much as possible. Dynamic SQL statements in a loop amplify the cost of hard parsing, and the effect worsens as the concurrency of execution increases.
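When dynamic SQL is genuinely unavoidable, bind variables with the USING clause keep the statement text constant, so it is hard parsed once and soft parsed on subsequent executions. A sketch of the statement from Listing 1-15 reworked this way (in this particular example, of course, plain static SQL would be better still):

```sql
-- Statement text never changes; only the bind value does
EXECUTE IMMEDIATE
  'SELECT cust_first_name, cust_last_name, country_id
     FROM customers
    WHERE cust_id = :b_cust_id'
  INTO l_cust_first_name, l_cust_last_name, l_country_id
  USING c1_rec.cust_id;
```

Every iteration now reuses the single cached cursor in the shared pool instead of adding a new literal-laden statement to it.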
This chapter reviewed various scenarios in which certain PL/SQL constructs were used inappropriately. Keep in mind that SQL is a set language and PL/SQL is a procedural language, and treat the scenarios in this chapter as design guidelines. By all means use PL/SQL for your data and business processing, and use Java or another language for presentation logic and user validation. You can write highly scalable PL/SQL programs using the techniques outlined in this chapter.