CHAPTER 6

Bulk SQL Operations

by Connor McDonald

This chapter is about the bulk SQL operations available in PL/SQL. Bulk operations in PL/SQL enable you to manipulate and process many rows at once, rather than one row at a time. In this chapter, you’ll learn about bulk fetching and bulk binding. Bulk fetching refers to the ability to retrieve a set of rows from the database into PL/SQL structures with a single call, rather than making a call to the database for each row to be retrieved. Bulk binding lets you perform the converse: to take those sets of rows stored in PL/SQL structures and save them to the database in an efficient manner.

You’ll see the tremendous performance benefits possible with bulk SQL operations. To achieve these benefits requires a little extra complexity in your code, especially in the area of error handling, but I’ll show you how to manage that complexity so that you will achieve performance benefits without compromising the manageability of your code.

The Hardware Store

My journey toward the benefit of bulk operations began at the hardware store—not the computer hardware store, but the DIY-style store where you purchase tools, fixings, paint and the like to perform handyman projects and maintenance on the family home.

I love the hardware store. Of course, being an Information Technology professional, my knowledge about anything that remotely resembles manual labor is pretty much zero. So every time I arrive at the cash register to pay for a new tool or a new tin of paint, within about 5 seconds I am back in the aisles to pick up an accompanying item; for example, a new tin of paint is not much use without a paint brush! By the time I actually have all of the tools required to perform the job at hand, I’ve probably been back and forth to the cash register six or seven times, each trip accompanied by an increasing level of profanity from the person at the till. There’s a term of endearment my wife came up with when she observed me doing this—picking up a single item at a time— and that term is idiot. She’s right; I should just grab a trolley, collect all of the items I need, and pay for them all at once!

And that’s the great hypocrisy. For some strange reason, the same IT professionals who apply simple common sense to tackling “challenges” like going to the hardware store struggle to apply that same common sense when it comes to working with the database. Much like at the hardware store, if you are going to pick up a number of items (of data), it makes a lot more sense to use the coding equivalent of a trolley to collect that data.

Throughout this chapter, I’ll often refer back to and expand upon this metaphor of purchasing items from the hardware store, not only because it’s useful for building examples, but also to reinforce the fact that everything discussed in this chapter is merely a reflection of the common sense principles we use in our everyday lives away from technology.

Setting for the Examples in this Chapter

All of the examples in this chapter are downloadable from the book’s catalog page on the Apress website, and each example is annotated with the appropriate file name. For each example, you will see an initial call to a script, like so:

@@bulk_setup.sql [populate | empty]

This creates a table representing the “hardware store” that all of the examples will use.

SQL> desc HARDWARE
 Name                          Null?    Type
 ----------------------------- -------- -------------
 AISLE                                  NUMBER(5)
 ITEM                                   NUMBER(12)
 DESCR                                  CHAR(50)
 STOCKED                                DATE

The table is created empty unless the ‘populate’ keyword is passed, in which case 1,000,000 rows are added to the table via the following SQL:

SQL> insert /*+ APPEND */ into  HARDWARE
  2  select trunc(rownum/1000)+1 aisle,
  3         rownum item,
  4         'Description '||rownum descr,
  5         sysdate stocked
  6  from
  7    ( select 1 from dual connect by level <= 1000),
  8    ( select 1 from dual connect by level <= 1000);

1000000 rows created.

The script drops and recreates the table, so the examples should run on your database with results similar to those you see in this chapter. Obviously, with so many variants of hardware platform and software version, your mileage may vary, but the examples have all been run on the most common defaults, namely an 8KB block size within an ASSM tablespace.

Tip The trick of using DUAL to generate an arbitrary number of rows dates back to a post by Mikito Harakiri on the asktom.oracle.com site many years ago. Since then, it has almost become the de facto standard in demos on newsgroups, books, and the like for quickly synthesizing data. When pushed to extremes, care must be taken in order not to excessively consume memory. Refer to http://blog.tanelpoder.com/2008/06/08/generating-lots-of-rows-using-connect-by-safely/ for details.

Bulk Operations in PL/SQL

Bulk processing in Oracle pre-dates PL/SQL, just under a different name, namely array processing. For as long as I have worked with Oracle (circa version 6 in the early 1990s), the concept of array processing has been available. Rather than fetch a single row of data from the database, a set of rows is fetched into a buffer, and that buffer is passed back to the client as an array. Similarly, rather than modify or create a single row of data in the database, a buffer with an array of rows is populated and passed to the database. Application programmers could do array processing natively via OCI, or many of the popular Oracle tools of the time (such as Forms) would transparently take care of the task. You will soon see the performance and scalability benefits of such a strategy.

Presumably the PL/SQL team at Oracle faced a nomenclature problem in version 8.1 when they introduced native array processing. By that stage, the term “array” was already entrenched in PL/SQL, referring to the INDEX BY array data structure, and perhaps from this conflict, the term “bulk” arose. In fact, array processing in PL/SQL pre-dates even version 8.1; it was available via the DBMS_SQL package way back in Oracle 7. An example of PL/SQL array processing from version 7 is presented next (I will not step through the code in detail, since I’ll be covering simpler mechanisms shortly).

The following example (bulk_dbms_sql_array.sql) shows how to perform array processing in version 8.0 and below. It fetches 500 rows at a time from the HARDWARE table.

SQL> set serverout on
SQL> declare
  2    l_cursor   int  := dbms_sql.open_cursor;
  3    l_num_row  dbms_sql.number_table;
  4    l_exec     int;
  5    l_fetched_rows    int;
  6  begin
  7   dbms_sql.parse(
  8           l_cursor,
  9           'select item from hardware where item <= 1200',
 10           dbms_sql.native);
 11   dbms_sql.define_array(l_cursor,1,l_num_row,500,1);
 12   l_exec := dbms_sql.execute(l_cursor);
 13   loop
 14       l_fetched_rows := dbms_sql.fetch_rows(l_cursor);
 15       dbms_sql.column_value(l_cursor, 1, l_num_row);
 16   dbms_output.put_line('Fetched '||l_fetched_rows||' rows');
 17       exit when l_fetched_rows < 500;
 18    end loop;
 19    dbms_sql.close_cursor(l_cursor);
 20  end;
 21  /
Fetched 500 rows
Fetched 500 rows
Fetched 200 rows

PL/SQL procedure successfully completed.

Sadly, even a decade after they were introduced, the bulk operations in PL/SQL are still an underused feature in modern PL/SQL-centric applications. Many a production PL/SQL program still processes data from the database in a row-by-row fashion. In particular, because most of my work is in the tuning arena, my primary motivation for the use of collections is that it encourages the developer to think more in terms of sets, rather than rows. While there is no functional reason that should prohibit developers from processing resultsets one row at a time, from an efficiency and performance perspective, it is generally bad news.

Similarly, criticism is often aimed at PL/SQL in terms of performance, but the most common cause of poor performance is row-at-a-time processing rather than anything inherent in the PL/SQL engine. PL/SQL is not alone in attracting misplaced criticism of this kind. Developers not taking advantage of host-based arrays in Pro*C led Oracle to add a PREFETCH compiler option, which converts the runtime Pro*C code into array fetching even though the developer has coded it in a conventional row-at-a-time manner. Similarly, ODP.NET defaults to array processing for most queries via its FetchSize parameter.

Getting Started with BULK Fetch

One of the great things about transitioning your code to take advantage of bulk operations in PL/SQL is that it’s easy to do and has a direct mapping to your existing code. Before bulk operations arrived in version 8.1, you could use one of the following three constructs to retrieve a row of data in PL/SQL:

1. Implicit Cursor

A standard SQL query (SELECT-INTO) is used to retrieve a single row of data, or columns from that single row, from a table into target variables. If no rows are retrieved, or more than a single row is retrieved, an exception is raised. Here’s an example (bulk_implicit_1.sql):

SQL> declare
  2    l_descr hardware.descr%type;
  3  begin
  4    select descr
  5    into   l_descr
  6    from   hardware
  7    where  aisle = 1
  8    and    item = 1;
  9  end;
 10  /

PL/SQL procedure successfully completed.
2. Explicit Fetch Calls

A cursor is explicitly defined within the PL/SQL declaration section. The cursor is then opened and fetched from, typically within a loop until the available rows are exhausted, at which point the cursor is closed. Here’s an example (bulk_explicit_1.sql):

SQL> declare
  2    cursor c_tool_list is
  3      select descr
  4      from   hardware
  5      where  aisle = 1
  6      and    item between 1 and 500;
  7
  8    l_descr hardware.descr%type;
  9  begin
 10    open c_tool_list;
 11    loop
 12      fetch c_tool_list into l_descr;
 13      exit when c_tool_list%notfound;
 14    end loop;
 15    close c_tool_list;
 16  end;
 17  /

PL/SQL procedure successfully completed.
3. Implicit Fetch Calls

The third type is a hybrid of the two previous approaches. A FOR loop takes care of the cursor management, the cursor being either explicitly defined in advance or coded directly within the FOR loop itself. Here’s an example (bulk_implicit_fetch_1.sql):

SQL> begin
  2    for i in (
  3      select descr
  4      from   hardware
  5      where  aisle = 1
  6      and    item between 1 and 500 )
  7    loop
  8       null;  -- processing code for each row
  9    end loop;
 10  end;
 11  /

PL/SQL procedure successfully completed.

Converting each of those constructs to a bulk collection model is easy and straightforward. The following are the three bulk processing constructs that are the analogs of the non-bulk constructs just shown.

1. Implicit Cursor BULK Mode

A standard SQL query (SELECT-INTO) can now be used to retrieve multiple rows of data into a collection type simply by adding the BULK COLLECT keywords. Here’s an example (bulk_implicit_2.sql):

SQL> declare
  2    type t_descr_list is table of hardware.descr%type;
  3    l_descr_list t_descr_list;
  4  begin
  5    select descr
  6    bulk collect
  7    into   l_descr_list
  8    from   hardware
  9    where  aisle = 1
 10    and    item between 1 and 100;
 11  end;
 12  /

PL/SQL procedure successfully completed.
2. Explicit Fetch Calls BULK Mode

The only changes required are to define a collection type to hold the results and to add the BULK COLLECT clause to the FETCH command. All of the rows in the cursor resultset will be fetched into the collection type variable in a single call. Here’s an example (bulk_explicit_2.sql):

SQL> declare
  2    cursor c_tool_list is
  3      select descr
  4      from   hardware
  5      where  aisle = 1
  6      and    item between 1 and 500;
  7
  8    type t_descr_list is table of c_tool_list%rowtype;
  9    l_descr_list t_descr_list;
 10
 11  begin
 12    open c_tool_list;
 13    fetch c_tool_list bulk collect into l_descr_list;
 14    close c_tool_list;
 15  end;
 16  /

PL/SQL procedure successfully completed.
3. Implicit Fetch Calls BULK mode

Things get even easier when converting the hybrid approach to bulk collect because there are no code changes to make. One of the best features to arrive in Oracle 10g was the “automatic bulk collect” enhancement for FOR loops. Because a FOR loop on a cursor will, by definition, fetch all of the rows in the cursor (unless an explicit EXIT command is present within the loop), the PL/SQL compiler can safely employ a bulk collect optimization to retrieve those rows as efficiently as possible. You automatically get this optimization if you are on at least version 10 of Oracle and the database parameter plsql_optimize_level is set to at least 2 (the default). The following example (bulk_implicit_fetch_2.sql) shows the database version and optimization level, followed by a FOR loop that is automatically implemented for you using the bulk processing features:

SQL> select banner from v$version where rownum = 1;

BANNER
----------------------------------------------------------------------------
Oracle Database 11g Enterprise Edition Release 11.2.0.2.0 - 64bit Production

SQL> select value from v$parameter where name = 'plsql_optimize_level';

VALUE
----------------------------------------------------------------------------
2

SQL> begin
  2    for i in (
  3      select descr
  4      from   hardware
  5      where  aisle = 1
  6      and    item between 1 and 500 )
  7    loop
  8       null;
  9    end loop;
 10  end;
 11  /

PL/SQL procedure successfully completed.

It’s not immediately apparent that bulk collection is taking place in the previous code. You’ll soon see how you can verify that the code is indeed fetching multiple rows with a single call.

Three Collection-Style Datatypes

The final three examples in the preceding section retrieve their rows into a nested table datatype. Since Oracle 8.1, there are three collection-style datatypes into which rows can be fetched in bulk fashion.

  • Varray
  • Nested table
  • Associative array

Deciding which collection type to use will typically come down to which best suits your application. The types are not mutually exclusive. As well as being passed between PL/SQL programs, these collections may also be passed back and forth between PL/SQL and your 3GL-based applications residing away from the database server. Many of the popular 3GL technologies used to communicate with Oracle databases have simple and effective interfaces to PL/SQL routines that take associative arrays as parameters, but not necessarily varrays or nested tables.

For example (a sketch of the PL/SQL side of such an interface follows this list):

  • Within JDBC, the methods setPlsqlIndexTable and registerIndexTableOutParameter are available for passing data to and from the database via PL/SQL associative arrays.
  • Within Pro*C, host arrays in C can be directly mapped to associative arrays in PL/SQL parameters.
  • Within ODP.NET, Oracle parameters can be directly mapped to associative arrays, namely, Param1.CollectionType = OracleCollectionType.PLSQLAssociativeArray;
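
All three of those mechanisms ultimately bind to an ordinary PL/SQL parameter declared as an associative array. The following is a minimal sketch of what the PL/SQL side of such an interface might look like (the package and procedure names are illustrative only; this is not one of the chapter’s download scripts):

create or replace package hardware_api as
  -- Associative array (index-by) types are what the 3GL
  -- index-table bindings listed above map onto.
  type t_item_list is table of hardware.item%type
    index by pls_integer;

  procedure receive_items(p_items in t_item_list);
end hardware_api;
/

A JDBC caller would bind a Java array to p_items via setPlsqlIndexTable, while Pro*C and ODP.NET callers would map their host arrays to the same parameter.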

By comparison, if you head down the “object” path by creating user-defined types comprising varrays and/or nested tables, then interacting with PL/SQL via these same 3GLs can start to get more complicated. Translation utilities are required, such as the Object Type Translator for Pro*C or custom metadata mapping for ODP.NET. For this reason, associative arrays may well be the best choice if you have a strong interaction between your 3GL application code and PL/SQL. Alternatively, given the industry’s current obsession with Service Oriented Architecture (SOA), where the data exchange medium is XML, it may be easier for 3GL applications to pass XML into PL/SQL and use the facilities of the XMLTYPE datatype to cast the incoming XML into a PL/SQL object type. Here is an example of such a translation (bulk_xml_to_obj.sql):

SQL> create or replace
  2  type COMING_FROM_XML as object
  3     ( COL1 int,
  4       COL2 int)
  5  /

Type created.

SQL> declare
  2    source_xml xmltype;
  3    target_obj coming_from_xml;
  4  begin
  5    source_xml :=
  6       xmltype('<DEMO>
  7                  <COL1>10</COL1>
  8                  <COL2>20</COL2>
  9                </DEMO>');
 10
 11    source_xml.toObject(target_obj);
 12  end;
 13  /

PL/SQL procedure successfully completed.

Conversely, if your application is PL/SQL centric (Oracle’s own Application Express is a perfect example of this), then nested table and varray types are perhaps a better option, due to the expansive range of set operators available on variables of these datatypes. See Chapter 5 of the Object-Relational Developer’s Guide for details on the various set operations you can perform on nested table types.
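
To give a flavor of those set operators, here is a minimal sketch (it is not one of the chapter’s download scripts) that combines two nested tables of item numbers with MULTISET UNION and then tests membership with MEMBER OF, all without writing any loop code:

declare
  type t_item_list is table of number;

  l_in_stock t_item_list := t_item_list(1,2,3);
  l_on_order t_item_list := t_item_list(3,4);
  l_combined t_item_list;
begin
  -- Combine the two collections, removing duplicates.
  l_combined := l_in_stock multiset union distinct l_on_order;
  dbms_output.put_line('Combined count: '||l_combined.count);

  -- Test membership directly, without a loop.
  if 4 member of l_combined then
    dbms_output.put_line('Item 4 is in stock or on order');
  end if;
end;
/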

Of course, you are free to mix and match. 3GLs can pass data to your PL/SQL via associative arrays, and you can then copy those arrays into more flexible object types within PL/SQL for ongoing processing.

Why should I bother?

As you have just seen, converting row-by-row fetch calls into code that uses bulk collection requires just a few extra lines of code. So why should you bother with those extra few lines? The hardware store metaphor suggests that processing rows in bulk is all about efficiency, so let’s compare the results of making many trips to the hardware store (the HARDWARE table) to pick up every item with those of making just a single trip with a suitably large trolley (the nested table). The following code example (bulk_collect_perf_test_1.sql) compares the response time of row-by-row code and its bulk collect equivalent:

SQL> declare
  2    cursor c_tool_list is
  3      select descr d1
  4      from   hardware;
  5
  6    l_descr hardware.descr%type;
  7  begin
  8    open c_tool_list;
  9    loop
 10      fetch c_tool_list into l_descr;
 11      exit when c_tool_list%notfound;
 12    end loop;
 13    close c_tool_list;
 14  end;
 15  /

PL/SQL procedure successfully completed.

Elapsed: 00:00:21.39

SQL> declare
  2    cursor c_tool_list is
  3      select descr d2
  4      from   hardware;
  5
  6    type t_descr_list is table of c_tool_list%rowtype;
  7    l_descr_list t_descr_list;
  8
  9  begin
 10    open c_tool_list;
 11    fetch c_tool_list bulk collect into l_descr_list;
 12    close c_tool_list;
 13  end;
 14  /

PL/SQL procedure successfully completed.

Elapsed: 00:00:02.20

It’s 10 times faster, just by reducing the number of trips made to the database for the data. Moreover, if the previous example is repeated with session tracing enabled, the trace data yields some interesting results. Let’s enable session tracing and re-run the demo:

SQL> alter session set sql_trace = true;
SQL> [repeat demo]
SQL> alter session set sql_trace = false;

Within the formatted trace file, if you search for column alias “D1”, you will locate the SQL query that was subject to single row fetch calls.

SELECT DESCR D1
FROM   HARDWARE


call      count       cpu    elapsed       disk      query    current        rows
-------  ------  -------- ---------- ---------- ---------- ----------  ----------
Parse         1      0.00       0.00          0          1          0           0
Execute       1      0.00       0.00          0          0          0           0
Fetch   1000001      8.17       8.15          0    1000010          0     1000000
------- -------  -------- ---------- ---------- ---------- ----------  ----------
total   1000003      8.17       8.15          0    1000011          0     1000000

And further along in the trace file, you will find the query containing the D2 alias, which was fetched from using bulk fetch:

SELECT DESCR D2
FROM   HARDWARE


call     count       cpu    elapsed       disk      query    current        rows
------- ------  -------- ---------- ---------- ---------- ----------  ----------
Parse        1      0.00       0.00          0          1          0           0
Execute      1      0.00       0.00          0          0          0           0
Fetch        1      1.85       2.08          0       9001          0     1000000
------- ------  -------- ---------- ---------- ---------- ----------  ----------
total        3      1.85       2.09          0       9002          0     1000000

Notice that along with the reduction in elapsed time, there has been a marked reduction in CPU consumption as well. With bulk collect, not only are you getting increased performance in your applications, but their CPU footprint will shrink as well, which makes them more scalable.

Tip See Chapter 21 of the Performance Tuning Guide in the Oracle 11.2 database documentation for details on how to generate session trace files and format them with tkprof.
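
As a quick sketch of that workflow on 11g (the tracefile_identifier value and report file name are just illustrative), you can tag your trace file, run the code of interest, locate the file, and then format it with tkprof at the operating system prompt:

alter session set tracefile_identifier = 'bulk_demo';
alter session set sql_trace = true;

-- run the PL/SQL block you want to trace, then...
alter session set sql_trace = false;

-- 11g exposes the trace file location directly
select value
from   v$diag_info
where  name = 'Default Trace File';

-- then, at the operating system prompt:
--   tkprof <trace file from above> bulk_demo_report.txt sys=no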

Returning briefly to the question of which collection type to use, all appear to be equivalent in terms of bulk collection performance. The following demo (bulk_which_collection_type.sql) ran virtually identically for each of the three collection types:

SQL> set timing on
SQL> declare
  2    type t_va is varray(1000) of number;
  3    type t_nt is table of number;
  4    type t_aa is table of number index by pls_integer;
  5
  6    va t_va;
  7    nt t_nt;
  8    aa t_aa;
  9  begin
 10    for i in 1 .. 10000 loop
 11
 12      select rownum
 13      --
 14      -- Comment in the collection type you want to test
 15      --
 16      bulk collect into va
 17      --bulk collect into nt
 18      --bulk collect into aa
 19
 20      from dual
 21      connect by level <= 1000 ;
 22    end loop;
 23  end;
 24  /

PL/SQL procedure successfully completed.

Monitoring Bulk Collect Overheads

Before you leap into converting all of your PL/SQL cursors to using bulk collect operations, it’s important to be aware of one significant impact that bulk collect will have on your database session. As mentioned at the start of the chapter, array processing is about retrieving data into a buffer. That buffer has to be held somewhere, and that somewhere is in your session’s memory. In Oracle parlance, your session’s memory is the session’s User Global Area (UGA). Depending on how your database has been configured, the session UGA will be held either privately for the connected session in the Process Global Area (PGA) or it will be sharable across processes in the System Global Area (SGA). A discussion on the pros and cons of Oracle’s dedicated versus shared server architecture is beyond the scope of this book; just know that PL/SQL collections will consume session memory, which can easily become an important consideration if you either have large collections or large numbers of concurrent sessions in your database.

For the examples that follow, the most common configuration will be assumed, namely dedicated server connections, and thus the memory consumption focus will be on the PGA. Let’s explore the memory consumption as larger and larger collection sizes are used for bulk collect. I’ll also introduce the LIMIT clause, which can be used on the BULK COLLECT statement.

Note The example that follows intersperses text with code. You can find the entirety of the code in the file named bulk_collect_memory.sql.

First, I’ll reconnect to reset the session level statistics (including PGA consumption).

SQL> connect
Enter user-name: *****
Enter password:  *****

SQL> set serverout on
SQL> declare
  2    type t_row_list is table of hardware.descr%type;
  3    l_rows t_row_list;
  4
  5    l_pga_ceiling  number(10);
  6

Next, I’ll define an array of different fetch sizes that will be used to retrieve a set of rows from the HARDWARE table. So in the first iteration, 5 rows will be fetched, then 10 rows, then 50 rows, and so forth.

  7    type t_fetch_size is table of pls_integer;
  8    l_fetch_sizes t_fetch_size := t_fetch_size(5,10,50,100,500,1000,10000,100000,1000000);
  9
 10    rc      sys_refcursor;
 11  begin
 12    select value
 13    into   l_pga_ceiling
 14    from   v$mystat m, v$statname s
 15    where  s.statistic# = m.statistic#
 16    and    s.name = 'session pga memory max';
 17
 18    dbms_output.put_line('Initial PGA: '||l_pga_ceiling);
 19
 20    for i in 1 .. l_fetch_sizes.count
 21    loop

For each of the fetch sizes, I’ll use the LIMIT clause to fetch a set of the rows from the table.

 22      open rc for select descr from hardware;
 23      loop
 24         fetch rc bulk collect into l_rows limit l_fetch_sizes(i);
 25         exit when rc%notfound;
 26      end loop;
 27      close rc;
 28

Then, having performed the fetch, I’ll capture the session level PGA statistics to see if the fetch has had an impact on the memory being consumed by the session on the database server. Here’s the code to do that:

 29      select value
 30      into   l_pga_ceiling
 31      from   v$mystat m, v$statname s
 32      where  s.statistic# = m.statistic#
 33      and    s.name = 'session pga memory max';
 34
 35      dbms_output.put_line('Fetch size: '||l_fetch_sizes(i));
 36      dbms_output.put_line('- PGA Max: '||l_pga_ceiling);
 37
 38    end loop;
 39
 40  end;
 41  /
Initial PGA: 3175904
Fetch size: 5
- PGA Max: 3175904
Fetch size: 10
- PGA Max: 3175904
Fetch size: 50
- PGA Max: 3241440
Fetch size: 100
- PGA Max: 3306976
Fetch size: 500
- PGA Max: 3306976
Fetch size: 1000
- PGA Max: 3372512
Fetch size: 10000
- PGA Max: 4224480
Fetch size: 100000
- PGA Max: 12482016
Fetch size: 1000000
- PGA Max: 95122912

PL/SQL procedure successfully completed.

Once the fetch sizes exceed 1,000 rows, the PGA consumption grows from 3MB up to 95MB when bulk collecting the entire set of 1,000,000 rows in a single fetch call. If you have hundreds of sessions all consuming hundreds of megabytes of memory, then that certainly is a scalability threat. Unless you have stock options in a memory chip company, massive bulk collect sizes are probably a bad idea. For this reason, it is recommended you never issue a FETCH BULK COLLECT on a resultset without a LIMIT clause when you do not know ahead of time the size (or approximate size) of the resultset.

Taken to extremes, a runaway collection could exhaust all of the memory on your server and possibly crash it. For example, the demo below attempts to bulk collect one billion rows into a PL/SQL collection; it hung my laptop in the process (which is why you will not find a script for this demo in the download catalog for this book!)

SQL> declare
  2    type t_huge_set is table of number;
  3    l_the_server_slaminator t_huge_set;
  4  begin
  5    select rownum
  6    bulk collect into l_the_server_slaminator
  7    from
  8      ( select level from dual connect by level <= 1000 ),
  9      ( select level from dual connect by level <= 1000 ),
 10      ( select level from dual connect by level <= 1000 );
 11  end;
 12  /

ERROR at line 1:
ORA-04030: out of process memory when trying to allocate 16396 bytes

Tip You may be thinking that the database initialization parameter pga_aggregate_target will insulate your system from such problems. This is not correct. This parameter only applies to memory allocations that the database can adjust internally as required, such as memory for sorting or hashing. If you ask for 100GB of PGA for PL/SQL collection memory, the database will try to honor that request, no matter how much trouble that may cause.

So on the surface, it would appear that a compromise between performance and memory consumption is required when using bulk operations. Let’s explore that issue by taking advantage of the automatic bulk collect optimization with implicit cursor loops, available when plsql_optimize_level is set to 2 or higher (which is the default from 10g onwards anyway). I’ll reconnect to reset the session level statistics, and then fetch all 1,000,000 rows from the HARDWARE table, which is equivalent to the last iteration of the previous example that consumed 95 megabytes of PGA. The following example (bulk_collect_perf_test_2.sql) demonstrates the fetch of 1,000,000 rows using the automatic bulk collect optimization:

SQL> connect
Enter user-name: *****
Enter password:  *****
SQL> alter session set plsql_optimize_level = 2;

Session altered.

SQL> begin
  2    for i in (
  3      select descr d3
  4      from   hardware )
  5    loop
  6      null;
  7    end loop;
  8  end;
  9  /

PL/SQL procedure successfully completed.

Elapsed: 00:00:01.78

Notice that performance is just as good as in the full bulk collect example. By examining a session trace, it’s possible to determine the fetch size used by the automatic bulk collect. The following is the trace output from the previous PL/SQL block:

SELECT DESCR D3
FROM   HARDWARE


call     count       cpu    elapsed       disk      query    current        rows
------- ------  -------- ---------- ---------- ---------- ----------  ----------
Parse        1      0.00       0.00          0          1          0           0
Execute      1      0.00       0.00          0          0          0           0
Fetch    10001      1.03       1.13          0      18907          0     1000000
------- ------  -------- ---------- ---------- ---------- ----------  ----------
total    10003      1.03       1.14          0      18908          0     1000000

1,000,000 rows fetched via 10,001 fetch calls suggests a fetch size of 100 rows. Although 10,000 fetch calls have been issued instead of 1, the performance is about the same. There are diminishing returns once your fetch array sizes get above the number of rows in a database block, so it’s common for a fetch size of around 100 to be the sweet spot. Rather than carefully benchmarking every piece of code to determine the optimal bulk collect size, simply refactoring cursor loops to take advantage of the automatic bulk collect optimization in the compiler is going to be close to optimal anyway.

Refactoring Code to Use Bulk Collect

So far, converting your code to use bulk collect almost sounds too good to be true. Very simple code changes (or even none, if you are already using implicit fetch cursor loops) lead to tremendous performance improvements. However, there are a few pitfalls you need to guard against when refactoring your code.

No Exception Raised

A conventional implicit cursor will raise the no_data_found exception if there are no rows in the resultset, and your application may depend upon this fact to take reparative actions. In the following example (bulk_ndf_1.sql), no_data_found is used to indicate that the query predicates are not valid:

SQL> set serverout on
SQL> declare
  2    l_descr hardware.descr%type;
  3  begin
  4    select descr
  5    into   l_descr
  6    from   hardware
  7    where  aisle = 0
  8    and    item = 0;
  9    dbms_output.put_line('Item was found');
 10  exception
 11    when no_data_found then
 12      dbms_output.put_line('Invalid item specified');
 13  end;
 14  /
Invalid item specified

PL/SQL procedure successfully completed.

However, when this example is converted to use bulk collect (without appropriate care), a bug is introduced into the code. Execute the following version (bulk_ndf_2.sql) and check the results on your own system:

SQL> set serverout on
SQL> declare
  2    type t_descr_list is table of hardware.descr%type;
  3    l_descr_list t_descr_list;
  4  begin
  5    select descr
  6    bulk collect
  7    into   l_descr_list
  8    from   hardware
  9    where  aisle = 0
 10    and    item = 0;
 11    dbms_output.put_line('Item was found');
 12  exception
 13    when no_data_found then
 14      dbms_output.put_line('Invalid item specified');
 15  end;
 16  /
Item was found  <==== wrong !

PL/SQL procedure successfully completed.

The procedure runs successfully but returns the wrong result! This is because a bulk collect fetch call will not raise the no_data_found exception. You can interpret the lack of an exception as PL/SQL saying that it has successfully populated the target array with all of the available rows (in this case, none). The target array is indeed initialized even though it contains no data. This end result is subtly different from the target array not being initialized at all.

The following example (bulk_ndf_3.sql) shows that there is an important difference between a bulk collect returning no rows into a collection and a collection not being used within a bulk collect call at all:

SQL> set serverout on
SQL> declare
  2    type t_descr_list is table of hardware.descr%type;
  3    l_descr_list t_descr_list;
  4  begin
  5    select descr
  6    bulk collect
  7    into   l_descr_list
  8    from   hardware
  9    where  aisle = 0
 10    and    item = 0;
 11    dbms_output.put_line(l_descr_list.count||' rows found');
 12  end;
 13  /
0 rows found

PL/SQL procedure successfully completed.

Comment out the SELECT statement and execute the same code again. The bulk collect never runs. Thus, the target array is never initialized. That lack of initialization leads to an error. For example, bulk_ndf_4.sql now crashes when attempting to reference the content of the collection.

SQL> declare
  2    type t_descr_list is table of hardware.descr%type;
  3    l_descr_list t_descr_list;
  4  begin
  5  --  select descr
  6  --  bulk collect
  7  --  into   l_descr_list
  8  --  from   hardware
  9  --  where  aisle = 0
 10  --  and    item = 0;
 11    dbms_output.put_line(l_descr_list.count||' rows found');
 12  end;
 13  /
declare
*
ERROR at line 1:
ORA-06531: Reference to uninitialized collection
ORA-06512: at line 11

Knowing that bulk collect operations initialize target arrays even when no rows are returned allows you to refactor existing code without too much difficulty. Simply check the number of records in the array and explicitly raise the no_data_found exception to keep the rest of the code working as previously. For example, the following code (bulk_ndf_5.sql) adds a conditional check to raise no_data_found when the collection is empty:

SQL> set serverout on
SQL> declare
  2    type t_descr_list is table of hardware.descr%type;
  3    l_descr_list t_descr_list;
  4  begin
  5    select descr
  6    bulk collect
  7    into   l_descr_list
  8    from   hardware
  9    where  aisle = 0
 10    and    item = 0;
 11
 12    if l_descr_list.count = 0 then
 13       raise no_data_found;
 14    end if;
 15
 16    dbms_output.put_line('Item was found');
 17  exception
 18    when no_data_found then
 19      dbms_output.put_line('Invalid item specified');
 20  end;
 21  /
Invalid item specified

PL/SQL procedure successfully completed.
Exiting a Loop

When fetching within a cursor loop using the conventional row-at-a-time mechanism, it is obvious when there are no more rows left in the result set. You issue a fetch call; if it fails, you’re done. Things are slightly more complicated when you fetch in bulk: a fetch call may retrieve sufficient rows to fill the array, or it may retrieve no rows, or it may retrieve some rows but not enough to completely fill the array. This leads to a common programming mistake when using the LIMIT clause.

First, here again is the row-at-a-time code (bulk_limit_exit_1.sql) as presented earlier in the chapter:

SQL> set serverout on
SQL> declare
  2    cursor c_tool_list is
  3      select descr
  4      from   hardware
  5      where  aisle = 1
  6      and    item between 1 and 25;
  7
  8    l_descr hardware.descr%type;
  9  begin
 10    open c_tool_list;
 11    loop
 12      fetch c_tool_list into l_descr;
 13      exit when c_tool_list%notfound;
 14      dbms_output.put_line('Fetched '||l_descr);
 15    end loop;
 16    close c_tool_list;
 17  end;
 18  /
Fetched Description 1
Fetched Description 2
Fetched Description 3
[snip]
Fetched Description 23
Fetched Description 24
Fetched Description 25

PL/SQL procedure successfully completed.

Note that 25 rows were fetched and displayed on the screen. Now the code is converted to bulk collect with the LIMIT clause in what appears to be the intuitive way, but as it turns out, the wrong way. Here is the incorrect solution (bulk_limit_exit_2.sql) and its output:

SQL> set serverout on
SQL> declare
  2    cursor c_tool_list is
  3      select descr
  4      from   hardware
  5      where  aisle = 1
  6      and    item between 1 and 25;
  7
  8    type t_descr_list is table of c_tool_list%rowtype;
  9    l_descr_list t_descr_list;
 10
 11  begin
 12    open c_tool_list;
 13    loop
 14      fetch c_tool_list
 15      bulk collect into l_descr_list limit 10;
 16      exit when c_tool_list%notfound;
 17
 18      for i in 1 .. l_descr_list.count loop
 19        dbms_output.put_line('Fetched '||l_descr_list(i).descr);
 20      end loop;
 21    end loop;
 22
 23    close c_tool_list;
 24  end;
 25  /
Fetched Description 1
Fetched Description 2
Fetched Description 3
[snip]
Fetched Description 18
Fetched Description 19
Fetched Description 20

PL/SQL procedure successfully completed.

Notice that the cursor loop has been exited prematurely: 25 rows (exactly like the row-at-a-time example) should have been output, but the code stopped after 20. The problem lies with line 16: exit when c_tool_list%notfound. Ten rows per fetch call are being requested with the LIMIT clause, so on the third pass through the loop there are only 5 rows left to be fetched. Because all of the rows have now been fetched, the %NOTFOUND attribute becomes true and the loop exits without processing those remaining 5 rows in the array.

The solution is simple: either defer the checking of the cursor attribute until after processing any rows found in the fetch array, or change the exit condition to <array>.count = 0 (which also means one extra iteration through the fetch loop). Examples of both approaches follow.

The following is an example of the first solution in which the code defers checking of the cursor attribute until after processing any rows in the array (bulk_limit_exit_3.sql):

SQL> set serverout on
SQL> declare
  2    cursor c_tool_list is
  3      select descr
  4      from   hardware
  5      where  aisle = 1
  6      and    item between 1 and 25;
  7
  8    type t_descr_list is table of c_tool_list%rowtype;
  9    l_descr_list t_descr_list;
 10
 11  begin
 12    open c_tool_list;
 13    loop
 14      fetch c_tool_list
 15      bulk collect into l_descr_list limit 10;
 16

Note that because it is possible that there are no rows fetched, you cannot make any assumptions that there are rows in the array. All processing must be mindful of that fact and must honor the value from the array’s count attribute, or you may get errors due to illegal references to array indexes. In the following code, the use of l_descr_list.count ensures that the loop will not even be entered if there are no entries in the array:

 17      for i in 1 .. l_descr_list.count loop
 18        dbms_output.put_line('Fetched '||l_descr_list(i).descr);
 19      end loop;
 20      exit when c_tool_list%notfound;
 21    end loop;
 22
 23    close c_tool_list;
 24  end;
 25  /
Fetched Description 1
Fetched Description 2
Fetched Description 3
[snip]
Fetched Description 23
Fetched Description 24
Fetched Description 25

PL/SQL procedure successfully completed.
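
The following is a sketch of the second solution (there is no download script for this variant): the exit test is based on the collection count rather than the cursor attribute, so the loop body runs only when rows were actually fetched, at the cost of one final, empty fetch call.

declare
  cursor c_tool_list is
    select descr
    from   hardware
    where  aisle = 1
    and    item between 1 and 25;

  type t_descr_list is table of c_tool_list%rowtype;
  l_descr_list t_descr_list;
begin
  open c_tool_list;
  loop
    fetch c_tool_list
    bulk collect into l_descr_list limit 10;

    -- Exit based on the collection, not the cursor attribute.
    exit when l_descr_list.count = 0;

    for i in 1 .. l_descr_list.count loop
      dbms_output.put_line('Fetched '||l_descr_list(i).descr);
    end loop;
  end loop;
  close c_tool_list;
end;
/
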
Over Bulking

It’s rare, but sometimes bulking up is not the appropriate way to retrieve data from the database. Returning to the hardware store metaphor, consider the case where I drive down to the store with only a few dollars in my pocket. I’ll only be able to buy things until I run out of cash, which, with only a few dollars, will not be much. Let’s say I blindly follow the advice of always getting a trolley (a.k.a. bulk collect) to hold my items until I get to the cash register. What will be the result?

  • It will take time to initially get a shopping trolley from the front of the store.
  • My trolley gets loaded with just a couple of items and then I have to stop—I’ve gone over my spending limit.
  • I pay for my items.
  • It takes more time to unload the items from the trolley into a bag, and then return the trolley back to the front of the store.

It is the same in database terms. If you know in advance that there may be some circumstance in which you are not going to process an entire array of fetched rows, then going to the effort of fetching them all might actually hurt your performance rather than improve it. A very common cause of this is processing a resultset that may need to exit prematurely. Consider the following example (bulk_early_exit_1.sql). First, I define a cursor to fetch some rows from the HARDWARE table, like so:

SQL> set timing on
SQL> declare
  2    cursor c_might_exit_early is
  3      select aisle, item
  4      from   hardware
  5      where  item between 400000 and 400050
  6        or   descr like '%10000%';
  7  begin

With the data in the HARDWARE table, this cursor will return 72 rows. The implicit fetch construct has been used in order to automatically get the bulk collect fetch size of 100 rows. However, the cursor is called c_might_exit_early for a reason.

  8    for i in c_might_exit_early
  9    loop
 10      if c_might_exit_early%rowcount = 40 then
 11         exit;
 12      end if;

Once 40 rows have been fetched, the cursor loop will be exited. Of course, a rownum <= 40 predicate could have been added to the SQL defining the cursor, but in a real world scenario, the condition is likely to be more complicated. For example, with each row being fetched, it could be passed to a web service that may return a flag indicating that no more rows should be sent to it. The key thing here is that there is a likelihood that the cursor loop may be prematurely exited with a condition that can’t be easily folded back into the cursor SQL statement.

 13    end loop;
 14  end;
 15  /

PL/SQL procedure successfully completed.

Elapsed: 00:00:00.38

The routine ran reasonably quickly, but 0.38 seconds seems sluggish for only 40 rows, and this is due to the automatic bulk collect. The code actually did the work of retrieving all of the 72 rows, because the first fetch call tried to retrieve 100 rows. The PL/SQL compiler did not know that the cursor processing may end prematurely. A session trace confirms that all 72 rows were retrieved.

SELECT AISLE, ITEM
FROM
 HARDWARE WHERE ITEM BETWEEN 400000 AND 400050 OR DESCR LIKE '%10000%'


call     count       cpu    elapsed       disk      query    current        rows
------- ------  -------- ---------- ---------- ---------- ----------  ----------
Parse        1      0.00       0.00          0          0          0           0
Execute      1      0.00       0.00          0          0          0           0
Fetch        1      0.17       0.17          0      10100          0          72
------- ------  -------- ---------- ---------- ---------- ----------  ----------
total        3      0.17       0.17          0      10100          0          72

It is the responsibility of the developer to recognize these types of situations and code accordingly; the PL/SQL compiler is not a mind reader. In the previous example, it turns out to be more efficient not to use bulk collect at all. The following example (bulk_early_exit_2.sql) is the same routine coded in a more traditional row-by-row manner.

SQL> set timing on
SQL> declare
  2    cursor c_might_exit_early is
  3      select aisle, item
  4      from   hardware
  5      where  item between 400000 and 400050
  6        or   descr like '%10000%';
  7    l_row c_might_exit_early%rowtype;
  8  begin
  9    open c_might_exit_early;
 10    loop
 11      fetch c_might_exit_early
 12      into l_row;
 13      exit when c_might_exit_early%notfound;
 14
 15      if c_might_exit_early%rowcount = 40 then
 16         exit;
 17      end if;
 18
 19    end loop;
 20    close c_might_exit_early;
 21  end;
 22  /

PL/SQL procedure successfully completed.

Elapsed: 00:00:00.17

Even with the overhead of 40 individual fetch calls, the performance is twice as good. The reason for this is that the SQL in this example has been crafted so that the last few rows are hard to find; that is, they are likely to be near the high water mark of the table. So while this is a slightly artificial example, it does demonstrate that you can’t just assume that bulk collect will always yield benefits. For this particular example, savvy developers will realize that with some extra coding, you can have the best of both worlds by explicitly nominating the fetch size rather than leaving it to the PL/SQL compiler.

The following example comes from bulk_early_exit_3.sql; it explicitly nominates a fetch size as a compromise between row-by-row fetches and the automatic bulk collect fetch size of 100:

SQL> set timing on
SQL> declare
  2    cursor c_might_exit_early is
  3      select aisle, item
  4      from   hardware
  5      where  item between 400000 and 400050
  6        or   descr like '%10000%';
  7    type t_rows is table of c_might_exit_early%rowtype;
  8    l_rows t_rows;
  9
 10    l_row_cnt pls_integer := 0;
 11
 12  begin
 13    open c_might_exit_early;
 14    <<cursor_loop>>
 15    loop
 16      fetch c_might_exit_early

Obviously, for this demo, it’s known that processing will stop after 40 rows, and thus a fetch size of 40 would be optimal. However, I’ve chosen a fetch size of 20 to reflect a real-world scenario where it’s not known precisely when the processing would prematurely terminate, so a developer would opt for a compromise between getting good value from each fetch call and not fetching an excessive amount of rows that may not ultimately be required.

 17      bulk collect into l_rows limit 20;
 18
 19      for i in 1 .. l_rows.count
 20      loop
 21         l_row_cnt := l_row_cnt + 1;
 22         if l_row_cnt = 40 then
 23            exit cursor_loop;
 24         end if;
 25      end loop;
 26
 27      exit when c_might_exit_early%notfound;
 28    end loop;
 29    close c_might_exit_early;
 30  end;
 31  /

PL/SQL procedure successfully completed.

Elapsed: 00:00:00.09

For this example, the turnaround time has been reduced from 0.38 seconds down to 0.09 seconds just by applying some of my own intelligence to the code rather than relying on the PL/SQL compiler’s.

Bulk Binding

Fetching data in bulk using BULK COLLECT completes half the picture of optimizing the interaction between PL/SQL and the database. The other half is the ability to dispatch data from PL/SQL to database tables as efficiently as possible. Every time a PL/SQL program needs to insert, update, or delete data in the database, it is, in most circumstances, best to achieve that in a single trip to the database. For some application requirements (for example, updating attributes for a known primary key), the modification will be for a single row. However, when the requirement is to touch multiple rows in the database, PL/SQL can still perform that job in a single trip. This is known as “bulk binding,” and just like bulk collect, it is all about moving from a metaphor of row processing to set processing (via a PL/SQL collection).

Note My wife generously offered to extend the metaphor by observing: “Is bulk bind the part of the chapter where you take all the junk you didn’t need and should not have bought back to the hardware store?” If that helps your understanding of bulk bind, then so be it.

It’s a common misconception that if the application requirement is modification of a single row, then bulk binding must not be appropriate. However, part of the art of writing efficient PL/SQL is the ability to recognize when what appears to be a single-row process is perhaps not. For example, consider a web page that displays a tabular list of employees, where each employee’s details can be updated by the operator. While it’s true that from the operator’s viewpoint each employee is being updated in isolation from all of the others, this is not a justification for mapping this to a series of single-row update SQL statements, called either directly or within a PL/SQL block.

The set of altered records can be stored in an array and passed down to a PL/SQL program in a single call and the appropriate bulk binding performed. Always be on the lookout for parts of your application that might benefit from such an approach.
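
As a sketch of that idea (the EMP_MAINT package and the EMPLOYEES table used here are hypothetical, not part of the chapter’s download scripts), the front end collects the changed employee IDs and salaries into two associative arrays, and a single PL/SQL call applies them all with one bulk bind:

create or replace package emp_maint as
  type t_id_list  is table of employees.employee_id%type index by pls_integer;
  type t_sal_list is table of employees.salary%type      index by pls_integer;

  procedure update_salaries(p_ids  in t_id_list,
                            p_sals in t_sal_list);
end emp_maint;
/

create or replace package body emp_maint as
  procedure update_salaries(p_ids  in t_id_list,
                            p_sals in t_sal_list) is
  begin
    -- One execution against the SQL engine, regardless of how
    -- many rows the operator edited on the screen.
    forall i in 1 .. p_ids.count
      update employees
      set    salary      = p_sals(i)
      where  employee_id = p_ids(i);
  end update_salaries;
end emp_maint;
/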

Getting Started with Bulk Bind

Like bulk collect, converting existing conventional DML code into its bulk bind DML equivalent is easy and straightforward. In conventional DML, it’s common to have a variable populated with values, and that variable is used within a DML statement. The following example (bulk_bind_1.sql) demonstrates a simple insert with PL/SQL variables repeated 100 times with a FOR loop:

SQL> declare
  2    l_row hardware%rowtype;
  3  begin
  4    for i in 1 .. 100 loop
  5      l_row.aisle := 1;
  6      l_row.item  := i;
  7      insert into hardware values l_row;
  8    end loop;
  9  end;
 10  /

PL/SQL procedure successfully completed.

In bulk bind DML, there is still a variable populated with values, but the variable is now an array, and the DML is deferred until the array has been filled with the values that will be used for the bulk bind. The FORALL keyword indicates that this is bulk bind DML. The following example (bulk_bind_2.sql) is equivalent to the row-at-a-time insert example above, but inserts the 100 rows with a single call rather than 100 separate calls:

SQL> declare
  2    type t_row_list is table of hardware%rowtype;
  3    l_row t_row_list := t_row_list();
  4  begin
  5    for i in 1 .. 100 loop
  6      l_row.extend;
  7      l_row(i).aisle := 1;
  8      l_row(i).item  := i;
  9    end loop;
 10
 11    forall i in 1 .. 100
 12      insert into hardware values l_row(i);
 13  end;
 14  /

PL/SQL procedure successfully completed.

Note Although FORALL seems to suggest some sort of loop processing, it is a single call to the database to perform the DML. This is easily confirmed with a session level trace.

INSERT INTO HARDWARE
VALUES  (:B1 ,:B2 ,:B3 ,:B4 )


call     count       cpu    elapsed       disk      query    current        rows
------- ------  -------- ---------- ---------- ---------- ----------  ----------
Parse        1      0.00       0.00          0          0          0           0
Execute      1      0.00       0.00          0          1          6         100
Fetch        0      0.00       0.00          0          0          0           0
------- ------  -------- ---------- ---------- ---------- ----------  ----------
total        2      0.00       0.00          0          1          6         100

Measuring Bulk Binding Performance

As with bulk collect, the motivation for bulk binding is performance, which can be quantified with simple benchmarks. I will start with a simple PL/SQL program that inserts a large number of rows into a table via single row inserts. The following example (bulk_insert_conventional.sql) adds 100,000 rows to the HARDWARE table, one row at a time:

SQL> set timing on
SQL> declare
  2    l_now date := sysdate;
  3  begin
  4   for i in 1 .. 100000 loop
  5     insert into HARDWARE
  6     values (i/1000, i, to_char(i), l_now);
  7   end loop;
  8  end;
  9  /

PL/SQL procedure successfully completed.

Elapsed: 00:00:04.52

Before moving on to the bulk binding version, it’s worth noting that Oracle insertion performance, even a single row at a time, is exceptional. But the database can do better. Let’s move on to the bulk binding version (bulk_insert_bind.sql) and insert those 100,000 rows with a single call.

SQL> set timing on
SQL> declare
  2    l_now date := sysdate;
  3

A nested table type is defined to hold the array of records to be inserted.

  4    type t_rows is table of hardware%rowtype;
  5
  6    l_rows t_rows := t_rows();
  7  begin

And now instead of inserting the rows directly, the array is filled with the row values to be used.

  8   l_rows.extend(100000);
  9   for i in 1 .. 100000 loop
 10     l_rows(i).aisle := i/1000;
 11     l_rows(i).item := i;
 12     l_rows(i).descr := to_char(i);
 13     l_rows(i).stocked := l_now;
 14   end loop;

Finally, the FORALL syntax is used to load the records into the database in a single call.

 15
 16   forall i in 1 .. 100000
 17     insert into hardware values l_rows(i);
 18
 19  end;
 20  /

PL/SQL procedure successfully completed.

Elapsed: 00:00:00.31

From 4.52 seconds down to just 0.31 seconds! Just like bulk collect, migrating your code to a bulk bind approach for modifying sets of data gives dramatic performance improvements. But there are other, not-so-obvious benefits occurring here as well. Every time you commence a new DML statement, undo structures must be allocated by the database to ensure that the DML statement can be rolled back if required, either due to error or explicit request. Using bulk binding issues fewer DML calls, which means less stress on your undo infrastructure. The following example (bulk_bind_undo.sql) examines the session level statistics to compare the undo required for a row-by-row insert versus a bulk bind insert:

SQL> set serverout on
SQL> declare
  2    type t_row_list is table of hardware.descr%type
  3       index by pls_integer;
  4    l_rows t_row_list;
  5
  6    l_stat1 int;
  7    l_stat2 int;
  8    l_stat3 int;
  9
 10  begin
 11    select value
 12    into   l_stat1
 13    from   v$mystat m, v$statname s
 14    where  s.statistic# = m.statistic#
 15    and    s.name = 'undo change vector size';
 16
 17    for i in 1 .. 1000 loop
 18       l_rows(i) := rpad('x',50);
 19       insert into hardware ( descr )  values ( l_rows(i) );
 20    end loop;
 21
 22    select value
 23    into   l_stat2
 24    from   v$mystat m, v$statname s
 25    where  s.statistic# = m.statistic#
 26    and    s.name = 'undo change vector size';
 27
 28    forall i in 1 .. l_rows.count
 29       insert into hardware ( descr )  values ( l_rows(i) );
 30
 31    select value
 32    into   l_stat3
 33    from   v$mystat m, v$statname s
 34    where  s.statistic# = m.statistic#
 35    and    s.name = 'undo change vector size';
 36
 37    dbms_output.put_line('Row at a time: '||(l_stat2-l_stat1));
 38    dbms_output.put_line('Bulk bind:     '||(l_stat3-l_stat2));
 39
 40  end;
 41  /
Row at a time: 64556
Bulk bind:     3296

PL/SQL procedure successfully completed.

So bulk bind generates a lot less undo. Similarly, those undo structures need to be protected by redo log entries so that the database instance is recoverable. If the previous example is repeated, but collects the redo size statistic instead of the undo change vector size (bulk_bind_redo.sql), then the output also shows a reduction in redo.

Row at a time: 295448
Bulk bind:     69552

PL/SQL procedure successfully completed.

Tip When producing examples for benchmarking, you need to ensure that you are isolating your code to purely examine the test at hand. When I was first performing the benchmark above, I explicitly referenced sysdate for each of the 100,000 rows being processed, namely:

for i in 1 .. 100000 loop
  insert into DEMO values (i, sysdate, to_char(i));
end loop;

for i in 1 .. 100000 loop
   l_rows(i).x := i;
   l_rows(i).y := sysdate;
   l_rows(i).z := to_char(i);
end loop;

Note Both tests ran very slowly, and the bulk binding results did not outperform the single row test by the order of magnitude I had expected. Some more careful experimenting showed that it was in fact the 100,000 calls to sysdate that were the dominant factor in the test. In fact, had I performed the test on earlier versions of Oracle, where references to sysdate would silently issue a ‘select sysdate from dual’, the tests would have suggested bulk binding was of no benefit at all. Always examine your tests carefully before concluding what the results might mean.

Monitoring Memory Usage

Just like bulk collect, if you have a large number of rows that you need to bulk bind, this does not necessarily mean that you should pre-store all of them in an array before passing them to the database. There are diminishing returns on performance as the number of rows you bind goes up, at an ever-increasing cost in PGA memory. You can bulk bind in batches to ensure that you do not exhaust session memory. Here is an example (bulk_bind_pga.sql), similar to the bulk collect memory demo, demonstrating the PGA consumption with increasing bulk bind sizes:

SQL> set serverout on
SQL> declare
  2    type t_row_list is table of hardware.descr%type;
  3    l_rows t_row_list;
  4
  5    l_start_of_run timestamp;
  6
  7    l_pga_ceiling  number(10);
  8
  9    type t_bulk_sizes is table of pls_integer;
 10    l_bulk_sizes t_bulk_sizes := t_bulk_sizes(10,100,1000,10000,100000,1000000);
 11
 12    tot_rows pls_integer := 10000000;
 13
 14  begin
 15    select value
 16    into   l_pga_ceiling
 17    from   v$mystat m, v$statname s
 18    where  s.statistic# = m.statistic#
 19    and    s.name = 'session pga memory max';
 20
 21    dbms_output.put_line('Initial PGA: '||l_pga_ceiling);
 22
 23    for i in 1 .. l_bulk_sizes.count
 24    loop
 25
 26      execute immediate 'truncate table hardware';
 27
 28      l_start_of_run := systimestamp;
 29
 30      l_rows := t_row_list();
 31      l_rows.extend(l_bulk_sizes(i));
 32      for j in 1 .. l_bulk_sizes(i)
 33      loop
 34         l_rows(j) := rpad('x',50);
 35      end loop;
 36
 37      for iter in 1 .. tot_rows / l_bulk_sizes(i) loop
 38        forall j in 1 .. l_bulk_sizes(i)
 39          insert into hardware ( descr )  values (l_rows(j));
 40      end loop;
 41
 42      select value
 43      into   l_pga_ceiling
 44      from   v$mystat m, v$statname s
 45      where  s.statistic# = m.statistic#
 46      and    s.name = 'session pga memory max';
 47
 48      dbms_output.put_line('Bulk size: '||l_bulk_sizes(i));
 49      dbms_output.put_line('- Elapsed: '||( systimestamp - l_start_of_run));
 50      dbms_output.put_line('- PGA Max: '||l_pga_ceiling);
 51
 52    end loop;
 53
 54  end;
 55  /
Initial PGA: 3470120
Bulk size: 10
- Elapsed: +000000000 00:00:53.478000000
- PGA Max: 4042888
Bulk size: 100
- Elapsed: +000000000 00:00:14.760000000
- PGA Max: 4042888
Bulk size: 1000
- Elapsed: +000000000 00:00:10.588000000
- PGA Max: 4042888
Bulk size: 10000
- Elapsed: +000000000 00:00:11.872000000
- PGA Max: 4108424
Bulk size: 100000
- Elapsed: +000000000 00:00:16.431000000
- PGA Max: 12890248
Bulk size: 1000000
- Elapsed: +000000000 00:00:17.507000000
- PGA Max: 99266696

PL/SQL procedure successfully completed.

Similar to bulk collect, once you are binding more than about 1,000 rows per array, the additional benefits are negligible, and the increase in memory consumption will certainly become a scalability threat if your application consists of hundreds or thousands of sessions all consuming large amounts of memory on the server. In this example, performance actually got worse once the bulk bind size grew beyond 1,000 rows.

Improvements in 11g

The sample code in this chapter has been written assuming version 11 of the database. If you are running the bulk bind samples on an earlier version, then you may see an error like this:

SQL> create table DEMO ( x int, y int);

Table created.

SQL> declare
  2    type t_rows is
  3      table of demo%rowtype
  4      index by pls_integer;
  5    l_rows t_rows;
  6  begin
  7    l_rows(1).x := 1;
  8    l_rows(1).y := 1;
  9
 10    l_rows(2).x := 2;
 11    l_rows(2).y := 2;
 12
 13    forall i in 1 .. l_rows.count
 14       insert into DEMO
 15       values ( l_rows(i).x, l_rows(i).y );
 16  end;
 17  /

 values ( l_rows(i).x, l_rows(i).y );
              *
ERROR at line 15:
PLS-00436: implementation restriction: cannot reference
              fields of BULK In-BIND table of records

In versions of Oracle prior to 11g, bulk bind operations were not able to access the individual elements of a record or object type within an associative array. However, all is not lost—it just takes a little more coding. You can still bulk bind arrays of simple datatypes, thus you can use an associative array for each attribute that you need to bulk bind. The previous example can be recast into a version that will work on earlier versions of Oracle.

SQL>  declare
  2    type t_x_list is table of demo.x%type
  3      index by pls_integer;
  4
  5    type t_y_list is table of demo.y%type
  6      index by pls_integer;
  7
  8    l_x_rows t_x_list;
  9    l_y_rows t_y_list;
 10  begin
 11    l_x_rows(1) := 1;
 12    l_x_rows(2) := 2;
 13
 14    l_y_rows(1) := 1;
 15    l_y_rows(2) := 2;
 16
 17    forall i in 1 .. l_x_rows.count
 18       insert into DEMO
 19       values ( l_x_rows(i), l_y_rows(i) );
 20  end;
 21  /

PL/SQL procedure successfully completed.

Tip For other techniques to avoid this restriction on earlier versions of Oracle, see www.oracle-developer.net/display.php?id=410

Error Handling with Bulk Bind

There are large benefits to be had from bulk binding, but one area where extra care is required is error handling. In a development model where rows are modified on a row-at-a-time basis, when a SQL statement fails with an error, the erroneous row in question is implicit—it’s the row you are working with. For example, in the following simple PL/SQL block (bulk_error_1.sql), containing a series of insert statements, it is obvious which insert statement is the problem one:

SQL> alter table hardware
  2    add constraint
  3    hardware_chk check ( item > 0 );

Table altered.

SQL> begin
  2    insert into hardware ( item )  values (1);
  3    insert into hardware ( item ) values (-1);
  4    insert into hardware ( item ) values (2);
  5    insert into hardware ( item ) values (3);
  6    insert into hardware ( item ) values (4);
  7    insert into hardware ( item ) values (-2);
  8  end;
  9  /
begin
*
ERROR at line 1:
ORA-02290: check constraint (MCDONAC.HARDWARE_CHK) violated
ORA-06512: at line 3

When you make the transition to modifying rows in bulk, things get a little more complicated, so more care is needed. A single FORALL statement might be canvassing hundreds of rows. When the previous example is repeated in bulk mode (bulk_error_2.sql), it’s not immediately obvious which row caused the error.

SQL> declare
  2    type t_list is table of hardware.item%type;
  3    l_rows t_list := t_list(1,-1,2,3,4,-2);
  4  begin
  5    forall i in 1 .. l_rows.count
  6      insert into hardware ( item ) values (l_rows(i));
  7  end;
  8  /
declare
*
ERROR at line 1:
ORA-02290: check constraint (MCDONAC.HARDWARE_CHK) violated
ORA-06512: at line 5

As with normal statement-level operations, the default is for the entire bulk bind operation to be rolled back, so even though most of the rows in the array were valid, there will be no rows present in the target table.

SQL> select count(*) from HARDWARE;

  COUNT(*)
----------
         0

Note that rollback of changes is not a property of the bulk bind per se; it’s a standard part of PL/SQL transaction management. A common misconception with PL/SQL is that it is just a wrapper around a series of independent SQL statements. Thus, in the first example, where the second insert statement failed

begin
  insert into DEMO values (1);
  insert into DEMO values (-1);
end;

developers then feel the need to take reparative action to undo the first insert. It is common to see exception handling code like the following in PL/SQL modules:

begin
  insert into DEMO values (1);
  insert into DEMO values (-1);
exception
  when others then
    rollback;
    raise;
end;

There is no requirement to perform such a rollback. Moreover, such a rollback will typically result in data corruption within your application, because it undoes not just the failed block’s changes but any other uncommitted work in the transaction. Before a PL/SQL block executes, an implicit savepoint is created. Thus, no matter where an error occurs in a PL/SQL block, all changes made by the block are automatically rolled back to that savepoint, as if the PL/SQL routine had never been called. This behavior is one of the truly great features of PL/SQL. Very few other languages make transaction management so easy.
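
To make that behavior concrete, here is a small sketch (not one of the chapter’s downloadable scripts) using the HARDWARE_CHK constraint created earlier; the comments describe the expected outcome:

insert into hardware ( item ) values (100);   -- earlier, uncommitted work in this transaction

begin
  insert into hardware ( item ) values (200);
  insert into hardware ( item ) values (-200); -- violates HARDWARE_CHK; the exception escapes the block
end;
/

select item from hardware;
-- the failed block's insert of 200 has been rolled back automatically to the
-- implicit savepoint, while the insert of 100 performed before the block is
-- untouched and still awaits a commit or rollback from the caller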

Now returning to the bulk bind example, simple code inspection shows that the second array entry, with a value of -1, is a problem row. However, that is only because the example is so trivial. There is no information in the actual error message that reveals which array entries were the cause—just that one or more of the entries violated the constraint. In Oracle 9i, this was addressed by extending the bulk bind syntax with the SAVE EXCEPTIONS clause. The error still occurs, but additional information is made available to allow diagnosis of which array entries are in error. Let’s amend the example as follows (bulk_error_3.sql) to demonstrate how to use SAVE EXCEPTIONS:

SQL> declare
  2    type t_list is table of hardware.item%type;
  3    l_rows t_list := t_list(1,-1,2,3,4,-2);
  4  begin
  5    forall i in 1 .. l_rows.count save exceptions
  6      insert into hardware ( item ) values (l_rows(i));
  7  end;
  8  /
ERROR:
ORA-24381: error(s) in array DML
ORA-06512: at line 5

At first glance, it appears not much has been achieved, but this example demonstrates that a new exception is raised (ORA-24381) rather than the previous constraint violation (ORA-02290). By handling this particular bulk bind exception, I gain access to a number of special attributes that allow drilling down into the errors. For example, the following code (bulk_error_4.sql) introduces more code into the exception handler to reveal the true cause of the error:

SQL> set serverout on
SQL> declare
  2    type t_list is table of hardware.item%type;
  3    l_rows t_list := t_list(1,-1,2,3,4,-2);
  4
  5    bulk_bind_error exception;
  6    pragma exception_init(bulk_bind_error,-24381);
  7
  8  begin
  9    forall i in 1 .. l_rows.count save exceptions
 10      insert into hardware ( item ) values (l_rows(i));
 11
 12  exception
 13    when bulk_bind_error then
 14      dbms_output.put_line(
 15         'There were '||sql%bulk_exceptions.count||' errors in total');
 16      for i in 1 .. sql%bulk_exceptions.count loop
 17         dbms_output.put_line(
 18            'Error '||i||' occurred at array index:'||
 19             sql%bulk_exceptions(i).error_index);
 20         dbms_output.put_line('- error code:'||
 21             sql%bulk_exceptions(i).error_code);
 22         dbms_output.put_line('- error text:'||
 23             sqlerrm(-sql%bulk_exceptions(i).error_code));
 24      end loop;
 25
 26  end;
 27  /
There were 2 errors in total
Error 1 occurred at array index:2
- error code:2290
- error text:ORA-02290: check constraint (.) violated
Error 2 occurred at array index:6
- error code:2290
- error text:ORA-02290: check constraint (.) violated

PL/SQL procedure successfully completed.

When the SAVE EXCEPTIONS syntax is used within a FORALL, any errors become available within a new collection named SQL%BULK_EXCEPTIONS that contains a row for each error. Each row in that new collection contains the following:

  • error_index: The index into the collection used in the FORALL for the element that caused the error
  • error_code: The Oracle error code; note that the error code is positive (unlike the SQLCODE built-in function in PL/SQL).

Tip Take care with older versions of the Oracle documentation. Some code examples position the SQL%BULK_EXCEPTIONS references inline within the code, directly under the FORALL statement, thus suggesting that an exception will not be raised. As seen from the previous demo, you must code the references to the SQL%BULK_EXCEPTIONS collection within your exception handler section.

Also, since the SQL%BULK_EXCEPTIONS attribute is a collection, multiple errors can be caught and handled. In the previous example, because the exception handler did not re-raise the error back to the calling environment, the successfully inserted rows are still retained within the table.

SQL> select item from HARDWARE;

      ITEM
----------
         1
         2
         3
         4

SAVE EXCEPTIONS with Batches

As described earlier, when bulk binding a large number of rows, you will be processing the rows in smaller chunks to avoid consuming excessive session PGA memory. But if you are bulk binding in batches, then each FORALL call will re-initialize the SQL%BULK_EXCEPTIONS structure, so in this circumstance the structure can’t be used to house the entire set of rejected rows across a series of bulk bind calls. One possible workaround is to catch any errors out of each bulk bind call, save them to a table, and then process the next batch of 1,000 rows. An example (bulk_error_5.sql) of this approach is presented below:

SQL> create table ERRS
  2    ( error_index   number(6),
  3      error_code    number(6),
  4      item          number );

Table created.

SQL> set serverout on
SQL> declare
  2    type t_list is table of hardware.item%type;
  3    l_rows t_list := t_list(1,-1,2,3,4,-2);
  4
  5    bulk_bind_error exception;
  6    pragma exception_init(bulk_bind_error,-24381);
  7
  8    type t_err_list is table of ERRS%rowtype index by pls_integer;
  9    l_err_list t_err_list;
 10
 11  begin
 12    forall i in 1 .. l_rows.count save exceptions
 13      insert into hardware ( item ) values (l_rows(i));
 14
 15  exception
 16    when bulk_bind_error then

If there are rows in error, then a new structure (l_err_list) is populated with all of the information from the sql%bulk_exceptions collection.

 17      for i in 1 .. sql%bulk_exceptions.count loop
 18         l_err_list(i).error_index := sql%bulk_exceptions(i).error_index;
 19         l_err_list(i).error_code := sql%bulk_exceptions(i).error_code;
 20         l_err_list(i).item := l_rows(sql%bulk_exceptions(i).error_index);
 21      end loop;

Then these will be saved into the ERRS table, using bulk bind, of course!

 22      forall i in 1 .. l_err_list.count
 23         insert into ERRS values l_err_list(i);
 24
 25  end;
 26  /

PL/SQL procedure successfully completed.

SQL> select * from errs;

ERROR_INDEX ERROR_CODE       ITEM
----------- ---------- ----------
          2       2290         -1
          6       2290         -2
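
For completeness, here is a hedged sketch of how the same idea extends to genuine batches: each iteration binds at most 1,000 rows, and any rejects from that batch are appended to ERRS before the next batch is processed. The source cursor (over a hypothetical SRC_ITEMS table) is illustrative only, not part of the chapter’s scripts.

declare
  cursor c_src is select item from SRC_ITEMS;   -- hypothetical source of items

  type t_item_list is table of hardware.item%type index by pls_integer;
  l_rows t_item_list;

  bulk_bind_error exception;
  pragma exception_init(bulk_bind_error, -24381);

  type t_err_list is table of ERRS%rowtype index by pls_integer;
  l_err_list t_err_list;
begin
  open c_src;
  loop
    fetch c_src bulk collect into l_rows limit 1000;   -- one batch at a time
    exit when l_rows.count = 0;

    begin
      forall i in 1 .. l_rows.count save exceptions
        insert into hardware ( item ) values ( l_rows(i) );
    exception
      when bulk_bind_error then
        -- capture this batch's rejects before SQL%BULK_EXCEPTIONS is reset
        l_err_list.delete;
        for i in 1 .. sql%bulk_exceptions.count loop
          l_err_list(i).error_index := sql%bulk_exceptions(i).error_index;
          l_err_list(i).error_code  := sql%bulk_exceptions(i).error_code;
          l_err_list(i).item        := l_rows(sql%bulk_exceptions(i).error_index);
        end loop;
        forall i in 1 .. l_err_list.count
          insert into ERRS values l_err_list(i);
    end;
  end loop;
  close c_src;
end;
/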

LOG ERRORS Clause

Alternatively, you can achieve something similar by converting your code to a pure SQL approach and taking advantage of the LOG ERRORS clause. From 10g onwards, if the DML operation you were planning to use bulk bind for can be expressed natively in SQL, you can catch errors in a similar fashion to the SAVE EXCEPTIONS clause. The DDL and code below are contained in bulk_log_errors.sql.

First, create a table to catch any errors using the supplied DBMS_ERRLOG package.

SQL> exec DBMS_ERRLOG.CREATE_ERROR_LOG('HARDWARE')

PL/SQL procedure successfully completed.

This execution creates a table with the columns from the HARDWARE table, plus additional columns to indicate what kind of error has occurred. Here’s what the resulting table will look like:

SQL> desc err$_HARDWARE
 Name                       Null?    Type
 -------------------------- -------- ----------------
 ORA_ERR_NUMBER$                     NUMBER
 ORA_ERR_MESG$                       VARCHAR2(2000)
 ORA_ERR_ROWID$                      ROWID
 ORA_ERR_OPTYP$                      VARCHAR2(2)
 ORA_ERR_TAG$                        VARCHAR2(2000)
 AISLE                               VARCHAR2(4000)
 ITEM                                VARCHAR2(4000)
 DESCR                               VARCHAR2(4000)
 STOCKED                             VARCHAR2(4000)

Then execute the SQL statement with the additional log error clause to capture errors, like so:

SQL> insert
  2  into HARDWARE ( item )
  3  with SRC_ROWS as
  4    ( select -3 + rownum     x from dual
  5      connect by level <= 6 )
  6  select x
  7  from SRC_ROWS
  8  log errors reject limit unlimited;

3 rows created.

You’ll find that, similar to the bulk bind example (bulk_error_5.sql), the erroneous rows have been captured along with the reason for the error.

SQL> select
  2   item
  3  ,ora_err_number$
  4  ,ora_err_mesg$
  5  from err$_HARDWARE;

ITEM  ORA_ERR_NUMBER$ ORA_ERR_MESG$
----- --------------- ------------------------------------------------------------
-2               2290 ORA-02290: check constraint (MCDONAC.HARDWARE_CHK) violated
-1               2290 ORA-02290: check constraint (MCDONAC.HARDWARE_CHK) violated
0                2290 ORA-02290: check constraint (MCDONAC.HARDWARE_CHK) violated

A complete coverage of the LOG ERRORS extension is beyond the scope of this book, but the previous example demonstrates how it can be used in a very similar fashion to the SAVE EXCEPTIONS feature in PL/SQL, with the added benefit of having the erroneous data readily available within a table for further processing.
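
One option worth knowing about: the clause accepts an optional tag expression and an explicit error table name, which makes it easy to tie rejected rows back to a particular load. A minimal sketch (the tag text is arbitrary, chosen here purely for illustration):

insert into HARDWARE ( item )
with SRC_ROWS as
  ( select -3 + rownum x from dual
    connect by level <= 6 )
select x
from SRC_ROWS
log errors into err$_HARDWARE ('batch 42') reject limit unlimited;

-- later, review just that load's rejects via the tag column
select item, ora_err_mesg$
from   err$_HARDWARE
where  ora_err_tag$ = 'batch 42';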

Robust Bulk Bind

When you bulk collect into a collection, the array is populated from index 1. Almost all of the examples in the standard documentation make this assumption, the code typically being along the lines of:

begin
  select ...
  bulk collect into <array>
  from ...

  for i in 1 .. <array>.count loop
    <processing>
  end loop;
end;

It is very unlikely that Oracle will ever change this behavior, so it’s reasonably safe to assume that any collection populated by a bulk collect will always start at an array index of 1. However, collections don’t fall solely under the domain of the bulk collect/bulk bind features in PL/SQL. In fact, collections pre-date bulk operations, going all the way back to Oracle 7. Once data has been fetched into a collection, the developer is free to do whatever they like with its contents. So what happens to a bulk bind operation if the collection no longer contains a set of elements starting from index 1? Let’s explore some scenarios.

Scenario 1: Elements Do Not Start at 1

As long as an array’s indices are contiguous, you can use the attributes of the array itself to continue to use bulk bind. Consider the following example (bulk_bind_scenario1.sql) where the array starts from 10. The array attributes FIRST and LAST can be used to define the extrema of the array for bulk binding.

SQL> declare
  2    type t_num_list is table of hardware.item%type index by pls_integer;
  3
  4    val t_num_list;
  5  begin
  6
  7    val(10) := 10;
  8    val(11) := 20;
  9    val(12) := 20;
 10
 11    FORALL i IN val.first .. val.last
 12       insert into hardware ( item ) values (val(i));
 13
 14  end;
 15  /

PL/SQL procedure successfully completed.

In fact, it’s probably reasonable to adopt a standard that .FIRST and .LAST attributes should be used in preference to 1 and .COUNT. However, the next scenario demonstrates that doing so does not provide a total safeguard.

Scenario 2: Elements Are Not Contiguous

Let’s repeat the preceding example with one slight alteration: a missing entry in the array. The following is the example and the resulting error caused by the missing entry (bulk_bind_scenario2a.sql):

SQL> declare
  2    type t_num_list is table of hardware.item%type index by pls_integer;
  3
  4    val t_num_list;
  5  begin
  6
  7    val(10) := 10;
  8  --  val(11) := 20;
  9    val(12) := 20;
 10
 11    FORALL i IN val.first .. val.last
 12       insert into hardware ( item ) values (val(i));
 13
 14  end;
 15  /
declare
*
ERROR at line 1:
ORA-22160: element at index [11] does not exist
ORA-06512: at line 11

Once collections become sparse, bulk bind will not automatically work using low and high boundary index values. However, from version 10.2 onwards, the FORALL syntax has been extended to include the INDICES OF and VALUES OF specification. The use of INDICES OF fixes the ORA-22160 issue encountered above as demonstrated here (bulk_bind_scenario2b.sql):

SQL> declare
  2    type t_num_list is table of hardware.item%type index by pls_integer;
  3
  4    val t_num_list;
  5  begin
  6
  7    val(10) := 10;
  8  --  val(11) := 20;
  9    val(12) := 20;
 10
 11    FORALL i IN INDICES OF val
 12       insert into hardware ( item ) values (val(i));
 13
 14  end;
 15  /

PL/SQL procedure successfully completed.

I really like this syntax. There is no dependency on the array attributes, and the code is independent of how the data is spread throughout the array. I recommend adopting a standard of using the INDICES OF clause whenever you want to process an entire collection, and that use of .FIRST, .LAST, and .COUNT should be deprecated in your PL/SQL code. Sadly, the INDICES OF extension can only be used in a FORALL statement, not in a standard FOR loop.
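
If you do need to walk a sparse collection in ordinary procedural code, where INDICES OF is not available, the .FIRST and .NEXT attributes give a safe equivalent; the same idiom is used again later in this chapter when densifying a collection. A minimal sketch:

declare
  type t_num_list is table of hardware.item%type index by pls_integer;
  val t_num_list;
  idx pls_integer;
begin
  val(10) := 10;
  val(12) := 20;

  idx := val.first;
  while idx is not null loop
     dbms_output.put_line('val('||idx||') = '||val(idx));
     idx := val.next(idx);       -- skips the missing index 11
  end loop;
end;
/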

If, however, the bulk binding you need to perform is more along the lines of cutting and slicing an existing collection, then this is where the VALUES OF syntax can be useful. The VALUES OF clause allows a level of indirection, somewhat similar to pointers, to allow you to bulk bind selected subsets of a larger collection.

In this next example (bulk_bind_values_of.sql), a collection containing a STATUS attribute will be populated: rows flagged with a status of NEW will be inserted into the HARDWARE table, and rows flagged with a status of UPD will update their matching rows in the HARDWARE table. The example thus emulates a MERGE statement. The collection will be examined to determine which indices should be used for update and which should be used for insert, and the VALUES OF clause will then be used to process the rows.

First, I’ll create some structures representing the input data. The variable src will be the input data, containing 100 rows each with a status of NEW or UPD.

SQL> set serverout on
SQL> declare
  2    type t_input_row is record (
  3       item   hardware.item%type,
  4       descr  hardware.descr%type,
  5       status varchar2(3)
  6       );
  7
  8    type t_input_list is
  9       table of t_input_row
 10       index by pls_integer;
 11
 12    src t_input_list;

Now two variables (IND_NEW and IND_UPD) are defined that will hold the index values of the appropriate rows in SRC. In this way, rather than copying the source data into separate collections (one for status = NEW and one for status = UPD), a record of just the index entries is retained. Avoiding the need to copy is of particular importance if the source data is large.

 13
 14    type t_target_indices is
 15       table of pls_integer
 16       index by pls_integer;
 17
 18    ind_new  t_target_indices;
 19    ind_upd  t_target_indices;
 20
 21  begin

Now the source data is seeded with some fictitious values (80 rows of status = UPD, and 20 rows of status = NEW). In a real world example, the data would probably be initialized elsewhere and passed into the application for processing.

 22    for i in 1 .. 100 loop
 23       src(i).item   := i;
 24       src(i).descr  := 'Item '||i;
 25       src(i).status :=  case when mod(i,5) = 0 then 'NEW' else 'UPD' end;
 26    end loop;
 27

Now the data is scanned and the indices arrays are populated with the indices that correspond to the two different status values.

 28
 29    for i in 1 .. 100 loop
 30       if src(i).status = 'NEW' then
 31          ind_new(ind_new.count) := i;
 32       else
 33          ind_upd(ind_upd.count) := i;
 34       end if;
 35    end loop;

And finally, the VALUES OF syntax is used to transfer the changes into the HARDWARE table. Even though the rows relevant to each statement are sparsely distributed throughout the src collection, the VALUES OF syntax gives direct access to them for bulk binding.

 36
 37    forall i in values of ind_new
 38       insert into hardware ( aisle, item)
 39       values (1, src(i).item);
 40    dbms_output.put_line(sql%rowcount||' rows inserted');
 41
 42    forall i in values of ind_upd
 43       update hardware
 44       set descr = src(i).descr
 45       where aisle = 1
 46       and item = src(i).item;
 47    dbms_output.put_line(sql%rowcount||' rows updated');
 48
 49  end;
 50  /
20 rows inserted
80 rows updated

The VALUES OF and INDICES OF syntax completes the bulk bind implementation. Any permutation of entries within an array can be manipulated and bound into a database table.

Earlier Versions of Oracle

If you are running Oracle 10.1 or below, then your version of Oracle does not yet support the VALUES OF or INDICES OF extensions, but all is not lost. If a collection is possibly sparse, then transferring that collection to one that is dense will solve the problem. This is not something to be done lightly, because if the collection is large, you will be holding two copies of it in memory while you densify the data. I’ll return to the earlier example in this section where the input array did not contain contiguous entries, and solve the problem without using the INDICES OF or VALUES OF syntax (bulk_bind_scenario_oldver.sql).

First, here is the code to declare the first collection val, which will be the sparse collection:

SQL> declare
  2    type t_num_list is table of hardware.item%type
  3      index by pls_integer;
  4
  5    val t_num_list;

Now a second array is defined, which will contain the entries from the sparse collection val:

  6
  7    dense_val t_num_list;
  8    idx       pls_integer;
  9
 10  begin
 11
 12    val(10) := 10;
 13    val(12) := 20;
 14

Population of the dense_val array is performed by walking along the entries of the val collection, using the collection attributes .FIRST and .NEXT. The dense_val collection will start at index 0 (because dense_val.count is initially zero) and grow up to the number of elements in val.

 15    idx := val.first;
 16    while idx is not null loop
 17       dense_val(dense_val.count) := val(idx);
 18       idx := val.next(idx);
 19    end loop;

The dense_val collection is then used rather than sparse val to perform the bulk bind:

 20
 21    FORALL i IN dense_val.first .. dense_val.last
 22       insert into hardware ( item )
 23       values (dense_val(i));
 24
 25  end;
 26  /

PL/SQL procedure successfully completed.

A Justification for Massive Collections

As seen throughout this chapter, once you start working with collections, you need to be aware of the implications with regard to memory consumption. However, sometimes your performance requirements have their motivation in other parts of the database infrastructure.

Many of the examples you have seen involve inserting large numbers of rows into a table. While bulk binding is dramatically faster than single-row inserts, it will still consume large amounts of redo because the inserts are still DML issued via the conventional path, so performance may still be compromised to a slight degree—for example, by freelist or segment space bitmap management, or by advancing the high watermark of the table.

Oracle Database 11.2 introduces the APPEND_VALUES hint, which will convert a conventional path insert statement into a direct path load. Here’s an example of the hint in use (bulk_bind_append_values_1.sql):

SQL> insert /*+ APPEND_VALUES */
  2  into HARDWARE ( item ) values (1);

1 row created.

SQL> select item from HARDWARE;
select item from HARDWARE
                 *
ERROR at line 1:
ORA-12838: cannot read/modify an object after modifying it in parallel

SQL> commit;

Commit complete.

SQL> select item from HARDWARE;

      ITEM
----------
         1

When the APPEND_VALUES hint was announced, its usefulness was questioned. After all, who would want to lock a table, advance its high watermark, and be forced to end the transaction immediately, all to insert just a single row? However, when combined with bulk bind, the usefulness of the feature becomes more apparent. In the next example (bulk_bind_append_values_2.sql), session-level statistics are measured to compare the difference in redo consumption between a conventional insert and a direct path load insert via a bulk bind:

SQL> declare
  2    type t_list is table of hardware.descr%type;
  3    l_rows t_list := t_list();
  4    l_now timestamp;
  5    l_redo1 int;
  6    l_redo2 int;
  7    l_redo3 int;
  8  begin
  9
 10    select value
 11    into   l_redo1
 12    from   v$mystat m, v$statname s
 13    where  s.statistic# = m.statistic#
 14    and    s.name = 'redo size';
 15

An array of 1,000,000 rows to be inserted via a standard bulk bind is prepared, and the before and after redo consumption by this session is captured.

 16    for i in 1 .. 1000000 loop
 17       l_rows.extend;
 18       l_rows(i) := i;
 19    end loop;
 20
 21    l_now := systimestamp;
 22    forall i in 1 .. l_rows.count
 23      insert into hardware ( descr )  values (l_rows(i));
 24    dbms_output.put_line('Elapsed = '||(systimestamp-l_now));
 25
 26    select value
 27    into   l_redo2
 28    from   v$mystat m, v$statname s
 29    where  s.statistic# = m.statistic#
 30    and    s.name = 'redo size';
 31

Then the table is truncated, and the load is re-performed using APPEND_VALUES for direct path load.

 32    execute immediate 'truncate table hardware';
 33
 34    dbms_output.put_line('Redo conventional = '||(l_redo2-l_redo1));
 35
 36    l_now := systimestamp;
 37    forall i in 1 .. l_rows.count
 38      insert /*+ APPEND_VALUES */ into hardware ( descr ) values (l_rows(i));
 39    dbms_output.put_line('Elapsed = '||(systimestamp-l_now));
 40
 41    select value
 42    into   l_redo3
 43    from   v$mystat m, v$statname s
 44    where  s.statistic# = m.statistic#
 45    and    s.name = 'redo size';
 46
 47    dbms_output.put_line('Redo direct load = '||(l_redo3-l_redo2));
 48
 49  end;
 50  /
Elapsed = +000000000 00:00:03.057000000
Redo conventional = 67599300
Elapsed = +000000000 00:00:02.668000000
Redo direct load = 146912

PL/SQL procedure successfully completed.

As expected, the direct path load operation was faster and used much less redo. So while it’s always important to be aware of the memory consumption with collections, when it comes to using bulk bind for direct load inserts, you may find yourself using larger-than-normal collection sizes if the memory on your database server permits.

The Real Benefit: Client Bulk Processing

As I mentioned at the start of the chapter, my repeated meanderings within the hardware store are hurting my efficiency. However, sometimes things are much, much worse: I pay for an item, drive home, and then realize I need to get back in the car, drive back to the hardware store, and purchase something else. (For the sake of sensitive readers, I won’t include the term of “endearment” my wife uses when this happens!)

This chapter has described the efficiency of using bulk operations in PL/SQL to access data, but that is equivalent to already being at the hardware store. The cost of coming from an external client application to the database (a network trip) and processing data on a row-at-a-time basis is equivalent to driving back and forth to the store in a car. The cost of this can be quantified with some simple demos. First, I’ll build (using PL/SQL, of course!) a replica of what many client applications implement in 3GL code, namely a routine to open a cursor on the table and a routine to fetch a single row. Here is the code, which you’ll find in bulk_network_1.sql:

SQL> create or replace
  2  package PKG1 is
  3
  4   procedure open_cur(rc in out sys_refcursor);
  5
  6   procedure fetch_cur(rc in out sys_refcursor, p_row out hardware.item%type);
  7
  8  end;
  9  /

Package created.

SQL> create or replace
  2  package body PKG1 is
  3
  4   procedure open_cur(rc in out sys_refcursor) is
  5   begin
  6     open rc for select item from hardware;
  7   end;
  8
  9   procedure fetch_cur(rc in out sys_refcursor, p_row out hardware.item%type) is
 10   begin
 11     fetch rc into p_row;
 12   end;
 13
 14  end;
 15  /

Package body created.

SQL Plus will suffice as the client application calling this package. SQL Plus is, of course, a genuine SQL client in its own right, and the number of rows it will fetch per fetch call is a preference you have explicit control over, so the demo could be written simply as

set arraysize n  (n=1 for single row fetch, n=1000 for multirow fetch)
select item from HARDWARE;

but I want to mimic what a 3GL application will do, namely, contain explicit calls to open the cursor, fetch from it repeatedly, and then close it. To open the cursor, and then repeatedly fetch rows until the resultset is exhausted, I can use the script bulk_single_fetch_100000.sql, which performs the following:

variable rc refcursor
exec pkg1.open_cur(:rc)
variable n number
exec     pkg1.fetch_cur(:rc,:n);
exec     pkg1.fetch_cur(:rc,:n);
[repeat 100000 times]

In order to view the elapsed time for the demo, without scrolling through 100,000 lines of output, I switch off terminal output and record before and after timestamps.

SQL> variable rc refcursor
SQL> exec pkg1.open_cur(:rc)
SQL> select to_char(systimestamp,'HH24:MI:SS.FF') started from dual;

STARTED
------------------
12:11:18.779000

SQL> set termout off
SQL> @bulk_single_fetch_1000.sql   -- contains 1000 fetch calls
[repeated 100 times]

SQL> set termout on
SQL> select to_char(systimestamp,'HH24:MI:SS.FF') ended from dual;

ENDED
------------------
12:12:47.270000

You can see that 100,000 trips to the database took approximately 90 seconds. I repeated that demo with a session trace enabled and examined the resultant tkprof-formatted file. Here are the results:

SELECT ITEM
FROM   HARDWARE

call     count       cpu    elapsed       disk      query    current        rows
------- ------  -------- ---------- ---------- ---------- ----------  ----------
Parse        1      0.00       0.05          0          1          0           0
Execute      1      0.00       0.00          0          0          0           0
Fetch   100000      1.17       0.94          0     100003          0      100000
------- ------  -------- ---------- ---------- ---------- ----------  ----------
total   100002      1.17       1.00          0     100004          0      100000

You can see that out of this 90 seconds, only one second was spent actually doing work in the database. The other 89 seconds were spent jumping back and forth across the network. The database is doing nothing, but from the perspective of the client application, it is waiting on the database. In motoring parlance, I’m spinning my wheels but going nowhere. At my current workplace, whenever we encounter 3GL programs on the middle tier server exhibiting this behavior, we call it “middleware fail.” It is amazing how labeling poor quality code in this way sharpens the focus of the development team!

Now, let’s apply this newfound knowledge of bulk processing to fetch the data from the database in bulk and pass that data back to the client in bulk. The following code (bulk_network_2.sql) is the new implementation and the performance will be much better:

SQL> create or replace
  2  package PKG2 is
  3
  4   type t_num_list is table of hardware.item%type index by pls_integer;
  5
  6   procedure open_cur(rc in out sys_refcursor);
  7
  8   procedure fetch_cur(rc in out sys_refcursor, p_rows out t_num_list);
  9
 10  end;
 11  /

Package created.

SQL> create or replace
  2  package body PKG2 is
  3
  4   procedure open_cur(rc in out sys_refcursor) is
  5   begin
  6     open rc for select item from hardware;
  7   end;
  8
  9   procedure fetch_cur(rc in out sys_refcursor, p_rows out t_num_list) is
 10   begin
 11     fetch rc bulk collect into p_rows limit 1000;
 12   end;
 13
 14  end;
 15  /

Package body created.

Rerun the demo. Each time, 1,000 rows will be bulk collected from the database and passed back to the client. Because SQL Plus does not natively understand the array that comes back, the data will simply be appended to a large VARCHAR2 variable (the_data) to simulate the client receiving the array data. Here is an example (bulk_multi_fetch_in_bulk.sql):

SQL> variable rc refcursor
SQL> exec pkg2.open_cur(:rc)

PL/SQL procedure successfully completed.

SQL> select to_char(systimestamp,'HH24:MI:SS.FF') started from dual;

STARTED
------------------
12:39:09.704000

SQL> variable the_data varchar2(4000);
SQL> set termout off

SQL> declare
  2    n pkg2.t_num_list;
  3  begin
  4    :the_data := null;
  5    pkg2.fetch_cur(:rc,n);
  6    for i in 1 .. n.count loop
  7      :the_data := :the_data||n(i);
  8    end loop;
  9  end;
[repeated 100 times]

SQL> select to_char(systimestamp,'HH24:MI:SS.FF') ended from dual;

ENDED
------------------
12:39:09.837000

The difference is astounding: turnaround time is down from 90 seconds to 0.13 seconds. Re-running the demo with trace enabled reveals the reduction in database trips.

SELECT ITEM
FROM HARDWARE

call     count       cpu    elapsed       disk      query    current        rows
------- ------  -------- ---------- ---------- ---------- ----------  ----------
Parse        1      0.00       0.00          0          0          0           0
Execute      1      0.00       0.00          0          0          0           0
Fetch      100      0.04       0.04          0        256          0      100000
------- ------  -------- ---------- ---------- ---------- ----------  ----------
total      102      0.04       0.04          0        256          0      100000

100,000 rows were still fetched (far right column), but only 100 fetch calls were made. I cannot overstate the benefit that has been realized here, even with just this simple demo. It’s not uncommon for customers of Oracle to spend large sums of money performing optimization exercises in order to glean perhaps an extra 10-15 percent performance out of their system through such methods as caching, index creation, SQL tuning, etc. Sometimes, customers even purchase more hardware along with associated license cost increases. But consider the performance benefits gained here in this demo:

From 90 seconds down to 0.13 seconds, an almost 700-fold improvement.

As an application developer, if you’re making modifications to code that make it hundreds of times faster, that’s going to make your applications that much more successful and you very popular! Reducing network trips in a client application’s interaction with the database comes down largely to the tools at the disposal of the developer in the language of their client application, the design of the application to take advantage of passing data to the database in an intelligent fashion, and the diligence of the developer. But as a database developer, if you can ensure that your PL/SQL interfaces to the database are equipped to allow client applications to pass and receive data in bulk, then you are that much closer to successful application performance.
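
As a sketch of what such an interface might look like (the package and procedure names are illustrative, not from the chapter’s scripts), a package can expose collection parameters so that a client can hand over an entire batch of rows in a single call:

create or replace
package PKG_HW_API is

  type t_item_list  is table of hardware.item%type  index by pls_integer;
  type t_descr_list is table of hardware.descr%type index by pls_integer;

  -- accept a whole batch of rows from the client in one call
  procedure add_items(p_items in t_item_list, p_descrs in t_descr_list);

end;
/

create or replace
package body PKG_HW_API is

  procedure add_items(p_items in t_item_list, p_descrs in t_descr_list) is
  begin
    -- assumes both arrays are dense and of equal size, indexed from 1
    forall i in 1 .. p_items.count
      insert into hardware ( item, descr )
      values ( p_items(i), p_descrs(i) );
  end;

end;
/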

Summary

One of the reasons I enjoy writing, presenting, or talking about the bulk operations in PL/SQL is that using them is virtually a guaranteed success story. Many Oracle features, new or old, apply only to a particular niche of customers or address a specific technical issue, and moreover require a large amount of careful testing to ensure that using the features does not result in negative impacts elsewhere within the database or its applications. On the other hand, adopting bulk collection and bulk binding in PL/SQL benefits applications in the overwhelming majority of cases. The effort required to move from row-centric fetching of data to set-centric bulk fetching of data is small or perhaps even zero. Your choices are pretty simple.

  • If you are already using FOR cursor_variable in (QUERY or CURSOR) syntax, then getting bulk collection is simply a matter of being on a recent version of Oracle and ensuring the PL/SQL compilation settings are at their default.
  • If you are not that fortunate, then it’s still just a matter of some simple re-coding to change the FETCH cursor INTO style of code to FETCH cursor BULK COLLECT INTO, as sketched below. Just a couple of keywords and some PL/SQL type definitions and you’re done!
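
As a reminder of how small that change really is, here is a minimal before-and-after sketch of the fetch recoding described in the second bullet (the cursor and variable names are illustrative only):

declare
  cursor c_items is select item from hardware;

  -- before: one scalar variable for row-at-a-time fetching
  l_item  hardware.item%type;

  -- after: a collection type for set-at-a-time fetching
  type t_item_list is table of hardware.item%type index by pls_integer;
  l_items t_item_list;
begin
  open c_items;

  -- old style: one round trip per row
  -- fetch c_items into l_item;

  -- new style: up to 1,000 rows per fetch call
  fetch c_items bulk collect into l_items limit 1000;

  close c_items;
end;
/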

The effort required to move from row-centric modification of data to set-centric bulk bind modification of data is equally small.

  • If you are already issuing DML (insert, update, delete) in a loop, simply add some appropriate type definitions, recast the FOR loop into a FORALL, and the job is done.

It is that easy, and whereas performance improvement in applications is often described in terms of a few-percentage-point gain here and there, you’ve seen in this chapter that moving to bulk operations can give orders of magnitude improvements in performance. In past releases of Oracle, some elements of PL/SQL did not support bulk operations (for example, native dynamic SQL and dynamic ref cursors). However, all of these restrictions have been lifted in the recent releases of the database, thus there’s never a reason why bulk operations can’t be considered.

So the effort is small, the risks are low, and the payback is huge. The moment your approach within PL/SQL becomes set-centric, you’ll be amazed at how quickly your newfound, set-centric thinking will spread to other areas of your Oracle skillset. You will find yourself achieving more in SQL rather than in procedural logic, and you will also be taking advantage of facilities that process data en masse, such as pipelined functions. The same skills you learn with bulk operations will become the motivation for a set-centric mindset throughout all of your database development.

I hope you share my excitement about bulk operations, and that you will see the rewards at your own workplace. As for me, I’m off to the hardware store—this time with a trolley.
