18. Designing and Monitoring Indexes for Optimal Performance

There is an old saying among carpenters: “If the only tool you have is a hammer, everything starts to look like a nail.” Likewise, a properly chosen and deployed index can be one of the most flexible and sharpest tools in an Oracle DBA’s toolbox, as long as the DBA realizes that an index is not the only tool that may be appropriate for solving query performance issues.

This chapter discusses the different types of indexes that are available for Oracle databases, as well as each index type’s strengths and weaknesses. This chapter also looks at issues that can typically arise when indexes are misused or overused in an Oracle database and how to detect when this is the case. New features in Oracle Database Releases 11g and 12c that may alleviate the dependency on indexes for improved query performance are also covered in this chapter.

Types of Indexes

Indexes are very much like a double-edged razor blade: they are often excellent at cutting through the Gordian knot of query performance tuning when the only other alternative is to retrieve all of the data in a table via a full table scan. However, like any balanced, dual-edged blade, an index tends to cut both ways: it can become quite expensive to maintain, especially when implemented against a table that is updated frequently via an OLTP application workload. And it can become especially deleterious to application performance when deletions are permitted against the index’s underlying data segments.

The Oracle database offers two types of indexes—B-tree indexes and bitmap indexes—and each index type has distinct advantages and drawbacks.

B-Tree Indexes

A B-tree index (balanced tree index) is typically the most common type of index found in Oracle databases. A B-tree index is extremely efficient at organizing information about the values stored in one or more columns of a table. Moreover, it generally remains in a balanced state even after its underlying data has been modified via data manipulation language (DML) statements—hence its name—and does not require any maintenance to retain that balance except in the most extreme of circumstances.

Figure 18.1 illustrates the basic concepts behind a B-tree index storage structure. As data values are added to the first block (also sometimes called a page) of a B-tree index, it continues to fill up until it’s about 50 percent full, at which point it splits into a root page and two leaf pages. Each leaf contains roughly one-half of the total entries in the index, with the split based on the median values of those present in the index. The root page retains just two entries about where to find the data in the two leaf pages. As the number of entries in the index continues to grow, this page splitting will continue, and the appropriate entries will be stored initially in the root page until it fills up and subsequently in a new branch page that’s created when the initial root page’s space has been exhausted.

Image

Figure 18.1 Concepts behind the B-tree index

This storage structure usually enables the index to be traversed extremely quickly to locate the particular values being sought after, as is generally the purpose of indexing columns in a table. Subsequently, the ROWIDs of the desired values are captured to access the required data from the table.

B-tree indexes are, of course, useful for other purposes as well:

Image They are excellent structures for guaranteeing uniqueness of a primary key or unique key constraint.

Image If the columns of two row sets being joined are indexed, then it may be much faster to use the indexes themselves as the row sources to look for common members in each set.

Image If indexed values are sorted in the same order as the ORDER BY or GROUP BY clause in a query, the index can return the data in sorted order, thereby eliminating the expensive sorting process.

Image If all the columns needed to answer a query can be found in the index itself, retrieval from the underlying table may be completely bypassed.

Consider a simple table, AP.RANDOMIZED_SORTED, with just four columns that have been populated with 10 million rows using DBMS_RANDOM to generate anonymous data, as shown in Listing 18.1.

Listing 18.1 Constructing a Sample Table for Index Demonstrations


DROP TABLE ap.randomized_sorted PURGE;
CREATE TABLE ap.randomized_sorted (
    key_id      NUMBER(8)
   ,key_date    DATE
   ,key_desc    VARCHAR2(32)
   ,key_sts     NUMBER(2) NOT NULL
)
    TABLESPACE ado_cold_data
    NOLOGGING;

TRUNCATE TABLE ap.randomized_sorted;

SET TIMING ON
DECLARE
    TYPE tnb_KeyID IS
        TABLE OF ap.randomized_sorted.key_id%TYPE
        INDEX BY PLS_INTEGER;
    TYPE tdt_KeyDate IS
        TABLE OF ap.randomized_sorted.key_date%TYPE
        INDEX BY PLS_INTEGER;
    TYPE tvc_KeyDesc IS
        TABLE OF ap.randomized_sorted.key_desc%TYPE
        INDEX BY PLS_INTEGER;
    TYPE tnb_KeySts IS
        TABLE OF ap.randomized_sorted.key_sts%TYPE
        INDEX BY PLS_INTEGER;

    tb_KeyId    tnb_KeyID;
    tb_KeyDate  tdt_KeyDate;
    tb_KeyDesc  tvc_KeyDesc;
    tb_KeySts   tnb_KeySts;
    ctr         PLS_INTEGER;
    nMaxIterations  CONSTANT PLS_INTEGER := 10000000;

BEGIN
    -- Populate collections
    FOR ctr IN 1..nMaxIterations
        LOOP
            tb_KeyID(ctr)   := ctr;
            tb_KeyDate(ctr) :=
            (TO_DATE('12/31/2014','mm/dd/yyyy') - DBMS_RANDOM.VALUE(1,3650));
            tb_KeyDesc(ctr) := LPAD(' ',DBMS_RANDOM.VALUE(1,32),
SUBSTR('abcdefghijklmnopqrstuvwxyz',DBMS_RANDOM.VALUE(1,26), 1));
            CASE MOD(ROUND(DBMS_RANDOM.VALUE(1,nMaxIterations),0),50)
                                    WHEN  0 THEN tb_KeySts(ctr) := 30;
                                    WHEN  1 THEN tb_KeySts(ctr) :=40;
                                    WHEN  2 THEN tb_KeySts(ctr) :=40;
                                    WHEN  3 THEN tb_KeySts(ctr) :=40;
                                    WHEN  4 THEN tb_KeySts(ctr) :=40;
                                    WHEN  5 THEN tb_KeySts(ctr) :=20;
                                    WHEN  6 THEN tb_KeySts(ctr) :=40;
                                    WHEN  7 THEN tb_KeySts(ctr) :=40;
                                    WHEN  8 THEN tb_KeySts(ctr) :=40;
                                    WHEN  9 THEN tb_KeySts(ctr) :=30;
                                    WHEN 10 THEN tb_KeySts(ctr) :=40;
                                    WHEN 11 THEN tb_KeySts(ctr) :=20;
                                    WHEN 12 THEN tb_KeySts(ctr) :=10;
                                    WHEN 13 THEN tb_KeySts(ctr) :=15;
                                    WHEN 14 THEN tb_KeySts(ctr) :=30;
                                    ELSE tb_KeySts(ctr) :=50;
                                END CASE;
        END LOOP;

    -- Load table from collection items
    FORALL ctr IN 1..nMaxIterations
        INSERT INTO ap.randomized_sorted(key_id, key_date, key_desc, key_sts)
        VALUES (tb_KeyID(ctr), tb_KeyDate(ctr), tb_KeyDesc(ctr), tb_KeySts(ctr));

    COMMIT;

EXCEPTION
    WHEN OTHERS THEN DBMS_OUTPUT.PUT_LINE('Fatal error detected!')
END;
/


To enforce a primary key constraint for this table’s primary key column (KEY_ID), a B-tree index, as shown in Listing 18.2, is an excellent choice because each row has a unique value in the column KEY_ID.

Listing 18.2 Building a Unique Index for Demonstration Purposes


ALTER TABLE ap.randomized_sorted
  ADD CONSTRAINT randomized_sorted_pk
  PRIMARY KEY (key_id)
  USING INDEX (
    CREATE UNIQUE INDEX ap.randomized_sorted_pk
       ON ap.randomized_sorted(key_id)
       TABLESPACE ado_cold_idx
       NOLOGGING
       PARALLEL 4);

ALTER INDEX ap.randomized_sorted_pk LOGGING;


Note the use of parallelism as well as the NOLOGGING clause to avoid writing out any online redo entries during its creation so that this index can be built as quickly as possible. Don’t forget to re-enable online redo log generation once the index has been created successfully.

Bitmap Indexes

Though they are typically used less frequently than B-tree indexes, the unique features of bitmap indexes are especially valuable for data warehousing application workloads. The easiest way to understand how a bitmap index works is to imagine all the ROWIDs of the corresponding table, laid on their side, along with a series of bitmaps—one for each unique value in the indexed column—arrayed below those ROWIDs. Each bit in a bitmap maps to a ROWID. In each bitmap, the bits mapping to those ROWIDs will be set, which have the value corresponding to the bitmap in the indexed column, as shown in Figure 18.2.

Image

Figure 18.2 Concepts behind the bitmap index

Listing 18.3 shows an example of a bitmap index created on the KEY_STS column in table AP.RANDOMIZED_SORTED.

Listing 18.3 Constructing a Bitmap Index for Demonstration Purposes


CREATE BITMAP INDEX ap.randomized_sorted_sts_bix
   ON ap.randomized_sorted (key_sts)
   TABLESPACE ado_cold_idx
   NOLOGGING PARALLEL(4);

ALTER INDEX ap.randomized_sorted_sts_bix LOGGING;


Because they essentially reference multiple rows simultaneously, bitmap indexes tend to be especially useful for a database that’s servicing “read-mostly” data primarily for data warehouse application workloads:

Image A bitmap index works best for a column that comprises a small set of relatively static column values because filtering the rows that match selection criteria against that column requires searching the bitmaps only for the corresponding value.

Image A table with a very large number of rows—say, 500,000 or more—tends to benefit from a bitmap index because multiple rows can be searched for matching column values with significantly fewer reads of index blocks than with a B-tree index.

Image When individual bitmap indexes are present on two or more columns, selection criteria against those columns can leverage bitmap XAND and XOR operations to quickly filter matching column values.

Image Unlike B-tree indexes, a bitmap index can record the presence of a NULL value in its corresponding column; therefore, the IS NOT NULL operator can leverage a bitmap index without having to read all values in the underlying table.

Image Bitmap indexes can take advantage of parallel DML during data loading, so they are perfect for data warehouses that are typically loaded during longer-running batch operations.

However, the same features that make the bitmap index so powerful for data warehouses are also its Achilles heel, especially for heavily updated data as well as online transaction processing (OLTP) application workloads:

Image First and foremost, because they reference multiple rows simultaneously, bitmap index segments must be updated simultaneously too. Unfortunately, this means that the bitmap segments as well as the table rows they index must be locked while the index is reconstructed.

Image Bitmap indexes are not optimal for enforcing a PRIMARY KEY or UNIQUE KEY referential integrity constraint.

Image Likewise, a column that contains constantly changing values or whose domain grows constantly would be a poor choice for a bitmap index.

Image A table with a relatively small number of rows—say, less than 500,000—will not likely benefit enough from a bitmap index to justify the maintenance costs, which are much higher compared to that of a B-tree index.

Image Eventually, every bitmap entry must be reconverted into its corresponding ROWID. Although this takes virtually no time to complete, it obviously impacts performance when millions of entries need to undergo conversion.

Partitioned Indexes

When thinking about partitioned tables in an Oracle database, it’s also important to understand the difference between local partitioned indexes and global partitioned indexes. Although these aren’t really different index types—they can only be either B-tree or bitmap indexes—each flavor of partitioned index offers distinct advantages and drawbacks.

Local Partitioned Indexes

As its name suggests, a local partitioned index exists only within the scope of a single table partition and indexes only the column values within that partition. Since the data values to be retrieved exist only within a few partitions, local indexes can be used to filter, sort, and retrieve data only within those partitions whenever the Oracle Optimizer chooses to leverage partition pruning. Using a local partitioned index usually results in much fewer physical reads to locate data.

Global Partitioned Indexes

Unlike a local partitioned index, a global partitioned index spans all existing partitions of a partitioned table. It is therefore useful in situations when the Oracle Optimizer determines that a majority of the table partitions must be scanned to retrieve pertinent data values, and thus the global index is a better choice than searching all the local partitioned indexes to locate data.

Listing 18.4 shows the creation of table AP.RANDOMIZED_PARTED, which comprises four partitions.

Listing 18.4 Creating an Example Partitioned Table


DROP TABLE ap.randomized_parted PURGE;
CREATE TABLE ap.randomized_parted (
    key_id      NUMBER(8)
   ,key_date    DATE
   ,key_desc    VARCHAR2(32)
   ,key_sts     NUMBER(2) NOT NULL
)
    PARTITION BY RANGE(key_date) (
         PARTITION p1_frigid
             VALUES LESS THAN (TO_DATE('2010-01-01','yyyy-mm-dd'))
             TABLESPACE ado_cold_data
        ,PARTITION p2_cool
             VALUES LESS THAN (TO_DATE('2013-01-01','yyyy-mm-dd'))
             TABLESPACE ado_cool_data
        ,PARTITION p3_warm
             VALUES LESS THAN (TO_DATE('2014-01-01','yyyy-mm-dd'))
             TABLESPACE ado_warm_data
        ,PARTITION p4_hot
             VALUES LESS THAN (TO_DATE('2014-07-01','yyyy-mm-dd'))
             TABLESPACE ado_hot_data
        ,PARTITION p5_radiant
             VALUES LESS THAN (MAXVALUE)
             TABLESPACE ap_data
    )
    NOLOGGING
    PARALLEL 4
;

INSERT /*+ APPEND NOLOGGING PARALLEL(4) */ INTO ap.randomized_parted
SELECT * FROM ap.randomized_sorted;

COMMIT;

ALTER TABLE ap.randomized_parted
    DROP CONSTRAINT randomized_parted_pk;

DROP INDEX ap.randomized_parted_pk;

ALTER TABLE ap.randomized_parted
    ADD CONSTRAINT randomized_parted_pk
    PRIMARY KEY (key_id)
    USING INDEX (
      CREATE UNIQUE INDEX ap.randomized_parted_pk
         ON ap.randomized_parted(key_id)
         TABLESPACE ap_idx
         NOLOGGING
         PARALLEL 4
         );

Table created.


Listing 18.5 illustrates how to create a series of local indexes on column KEY_STS as well as a single global partitioned index on column KEY_DATE.

Listing 18.5 Example of Local and Global Partitioned Indexes


-----
-- Create local partitioned indexes on KEY_STS
-----
DROP INDEX ap.randomized_parted_loc_sts;
CREATE INDEX ap.randomized_parted_loc_sts
    ON ap.randomized_parted (key_sts)
    STORAGE (INITIAL 10M)
    LOCAL (
         PARTITION lx1_frigid    TABLESPACE ado_cold_idx
        ,PARTITION lx2_cool      TABLESPACE ado_cool_idx
        ,PARTITION lx3_warm      TABLESPACE ado_warm_idx
        ,PARTITION lx4_hot       TABLESPACE ado_hot_idx
        ,PARTITION lx5_radiant   TABLESPACE ap_idx
        );

-----
-- Create a global partitioned index on KEY_DESC
-----
CREATE INDEX ap.randomized_parted_glb_desc
    ON ap.randomized_parted (key_desc)
    STORAGE (INITIAL 10M)
    GLOBAL;


Partial Indexes

Starting with Oracle Database Release 12.1.0.1, it’s now possible to create indexes for only some of the partitions of a partitioned table. For example, consider table AP.RANDOMIZED_PARTED, shown earlier in Listing 18.5. There may be very little advantage to creating local indexes on column KEY_STS in the oldest partitions because queries hardly ever reference these partitions; additionally, even when the oldest data is referenced, it’s usually going to be accessed via a full table partition scan anyway because the majority of the data is likely to be in an active state. The new INDEXING {ON|OFF} clause of the CREATE TABLE and ALTER TABLE statements makes this possible, as shown in the reconstructed data definition language (DDL) for this table in Listing 18.6.

Listing 18.6 Example of Partial Indexing Schemes in Oracle 12c Release 1 (12.1.0.1)


CREATE TABLE ap.randomized_parted (
    key_id      NUMBER(8)
   ,key_date    DATE
   ,key_desc    VARCHAR2(32)
   ,key_sts     NUMBER(2) NOT NULL
)
    INDEXING OFF
    PARTITION BY RANGE(key_date) (
         PARTITION p1_frigid
             VALUES LESS THAN (TO_DATE('2010-01-01','yyyy-mm-dd'))
             TABLESPACE ado_cold_data
        ,PARTITION p2_cool
             VALUES LESS THAN (TO_DATE('2013-01-01','yyyy-mm-dd'))
             TABLESPACE ado_cool_data
        ,PARTITION p3_warm
             VALUES LESS THAN (TO_DATE('2014-01-01','yyyy-mm-dd'))
             TABLESPACE ado_warm_data
             INDEXING ON
        ,PARTITION p4_hot
             VALUES LESS THAN (TO_DATE('2014-07-01','yyyy-mm-dd'))
             TABLESPACE ado_hot_data
             INDEXING ON
        ,PARTITION p5_radiant
             VALUES LESS THAN (MAXVALUE)
             TABLESPACE ap_data
             INDEXING ON
    )
    NOLOGGING
    PARALLEL 4;


In this case, local indexes would not be created by default for any table partitions at all unless the INDEXING attribute is set to ON for each individual partition. In addition, if a global index were to be created on the KEY_STS column for this table, by default that index would not include any values from the first two partitions.

Other Index Types

There are a few other index types and subvariants that are beyond the bounds of this discussion, but they definitely deserve mention, as you will almost certainly encounter them during your Oracle DBA career.

Index Organized Tables

Index-organized tables (IOTs) are truly indexes, but they have many of the same attributes of a standard heap-organized table. When a majority of a table’s columns need to be indexed to enforce a PRIMARY KEY constraint, an IOT fulfills this need quite nicely.

Another advantage to an IOT is that it eliminates the required retrieval of rows from the table it would normally index, so it’s not uncommon to see the physical I/O required to retrieve data dropping by a significant margin; IOTs are therefore not uncommon in OLTP applications. Data within IOTs can be indexed with secondary indexes as well.

Reverse-Key Indexes

When a table’s PRIMARY KEY constraint is based on a single column that contains a monotonically increasing value—for example, ORDER_ID in the OE.ORDERS table (part of the standard Oracle sample schemas)—index entries for similar values tend to be located within just a few index blocks. While this arrangement does help the clustering factor of the index, it can have an extremely deleterious effect on OLTP application performance, especially in an Oracle Real Application Cluster (RAC) database.

A reverse-key index simply stores the values in reverse byte order—for example, 1001 is stored as 1001, but 1002 is stored as 2001, 1003 is stored as 3001, and so forth—so this tends to spread out the values among multiple index blocks. The end result is that two different user sessions will not contend with each other when attempting to add a new entry into the same index simultaneously during heavy OLTP application activity, thus alleviating a TX wait enqueue for that index.

Composite Indexes

If an index contains more than one column, it’s termed a composite or concatenated index. If a query’s selection criteria includes all or leading columns of the columns in a composite index, it’s likely to speed the execution of that query. However, it’s also possible to construct a composite index so that the columns with the lowest number of unique values are listed first in the index—for example, COUNTRY, STATE, and CITY—so that a query that specifies only values for STATE and CITY in its selection criteria could still take advantage of that index via a skip-scan operation, which essentially bypasses the leading column in the index (COUNTRY) and accesses the secondary columns.

Compressed Indexes

The child table in a parent–child relationship typically requires a multicolumn PRIMARY KEY constraint to enforce referential integrity. For example, the OE.ORDER_ITEMS table that’s part of the standard Oracle sample schemas has a two-column primary key for columns ORDER_ID and LINE_ITEM_ID. It’s not uncommon for the child table to have a multiplicity of LINE_ITEM_ID values per ORDER_ID; therefore, to save space in an index for this constraint, the index can be compressed so that the value for ORDER_ID is stored just once as a prefix within the index page for all LINE_ITEM_ID values, as shown in Listing 18.7.

Listing 18.7 Compressing an Index


ALTER TABLE oe.order_items DROP CONSTRAINT order_items_pk;
ALTER TABLE oe.order_items
  ADD CONSTRAINT order_items_pk
  PRIMARY KEY (order_id, line_item_id)
  USING INDEX (
    CREATE UNIQUE INDEX oe.order_items_pk_idx
       ON oe.order_items (order_id, line_item_id)
       TABLESPACE example
       COMPRESS);


Bitmap Join Indexes

The bitmap join index is a special subtype of bitmap index that is useful when the columns that are going to be used for joins within a subset of tables—for example, a fact table and several of its corresponding dimension tables—are already known. A bitmap join index essentially stores the ROWIDs of all the values that already matched between the fact and dimension tables (see Listing 18.8). The advantage of this flavor of bitmap index is that it’s extremely quick to search for matched sets of column values, as it essentially offloads the processing of the joined values to the index itself.

Listing 18.8 Storing ROWIDs in a Bitmap Join Index


CREATE BITMAP INDEX oe.order_item_qty_bjx
   ON oe.order_items (quantity)
     FROM oe.order_items OI, oe.product_information P
    WHERE OI.product_id = P.product_id
   TABLESPACE example;


Multiple Indexes on Identical Columns

Prior to Oracle Database 12c Release 1, only one index could be created on the same column or combination of columns. But starting with Release 12.1.0.1, it’s now possible to create multiple indexes on the same column or set of columns, so that the statements that create the new bitmap index shown in Listing 18.3 would actually execute correctly without having to drop the existing B-tree index first. Using multiple indexes on identical columns offers the following possibilities:

Image A unique index as well as a nonunique index can be created for the same set of columns.

Image A B-tree and a bitmap index can be created on the same columns.

Image Indexes with different partition schemes (for example, a local partitioned index and a global partitioned index) can co-exist for the same columns.

Image Partitioned indexes are also permitted on the same columns, but with different partitioning methods (for example, one that is hash-partitioned and another that is range-partitioned).

Note that there are some restrictions on this new feature, however:

Image A B-tree non-clustered index and a B-tree clustered index cannot be created on the same column.

Image A B-tree index and an IOT cannot co-exist on the same column.

Image When multiple indexes are created on the same column, only one of the indexes can be visible at a time, and the other indexes must have a status of INVISIBLE. However, it’s possible to inform the optimizer that it is permitted to leverage the non-default index by setting the OPTIMIZER_USE_INVISIBLE_INDEX initialization parameter to TRUE at either the session or system level.

Index Performance Issues

Even when an Oracle DBA has done her homework and has chosen the proper type of index for the job, indexes can present some unique challenges for tuning application performance.

Index Statistics

The Oracle query optimizer leverages index statistics when it makes its decision about whether to use an index instead of a full table scan. If there are multiple indexes to choose from, the optimizer also uses these statistics to determine which index is most efficient. A simple query against the DBA_INDEXES data dictionary view reveals the most crucial statistics, as Listing 18.9 shows.

Listing 18.9 Crucial Index Statistics


SET LINESIZE 170
SET PAGESIZE 20000
COL owner           FORMAT A08          HEADING "Owner"
COL index_owner     FORMAT A08          HEADING "Owner"
COL table_name      FORMAT A24          HEADING "Table Name"
COL index_name      FORMAT A30          HEADING "Index Name"
COL partition_name  FORMAT A12          HEADING "Partition Name"
COL tsp_name        FORMAT A12          HEADING "Tablespace|Name"
COL status          FORMAT A08          HEADING "Status"
COL num_rows        FORMAT 999,999,999  HEADING "Row|Count"
COL distinct_keys   FORMAT 999,999,999  HEADING "Distinct|Keys"
COL clusfctr        FORMAT 99,999,999   HEADING "Clust|Factor"
COL blevel          FORMAT 99999        HEADING "Index|Depth"
COL leaf_blocks     FORMAT 9,999,999    HEADING "Leaf|Blocks"
COL avg_lbpk        FORMAT 999,999      HEADING "Avg Leaf|Blks/Key"
COL avg_dbpk        FORMAT 999,999      HEADING "Avg Data|Blks/Key"
TTITLE "Index Performance Metadata|(from DBA_INDEXES)"
SELECT
     owner
    ,table_name
    ,index_name
    ,tablespace_name tsp_name
    ,status
    ,num_rows
    ,distinct_keys
    ,clustering_factor clusfctr
    ,blevel
    ,leaf_blocks
    ,avg_leaf_blocks_per_key avg_lbpk
    ,avg_data_blocks_per_key avg_dbpk
  FROM dba_indexes
 WHERE owner = 'AP'
 ORDER BY 1,2,3
;
TTITLE OFF
Index Performance Metadata
                                                                            (from DBA_INDEXES)

                                               Tablespace                     Row     Distinct       Clust  Index       Leaf Avg Leaf Avg Data
Owner Table Name          Index Name              Name         Status        Count         Keys      Factor  Depth     Blocks Blks/Key Blks/Key
----- ------------------- ----------------------- ------------ ------ ------------ ------------ ----------- ------ ---------- -------- --------
AP    EST_SALES           EST_SALES_CITYST_IDX    ADO_COLD_IDX VALID       680,352       29,768     526,867      2      1,124        1       17
AP    EST_SALES           EST_SALES_CITY_IDX      ADO_COLD_IDX VALID       680,352       18,792     526,845      2      1,948        1       28
AP    EST_SALES           EST_SALES_PK_IDX        ADO_COLD_IDX VALID       680,352      680,352      74,988      2      1,519        1        1
AP    EST_SALES           EST_SALES_STATE_IDX     ADO_COLD_IDX VALID       680,352           62      89,808      2      1,047       16    1,448
AP    EST_SALES           EST_SALES_TIMEZONE_IDX  ADO_COLD_IDX VALID       649,680           26      82,138      2      1,001       38    3,159
AP    EST_SALES           EST_SALES_ZIPCITYST_IDX ADO_COLD_IDX VALID       680,352       42,522     680,352      2      1,193        1       16
AP    EST_SALES           EST_SALES_ZIPCODE_IDX   ADO_COLD_IDX VALID       680,352       42,264     680,352      2      1,118        1       16
AP    INVOICES            INVOICES_CUST_IDX       AP_IDX       VALID           500            1           3      0          1        1        3
AP    INVOICES            INVOICES_PK_IDX         AP_IDX       VALID           500          500           3      0          1        1        1
AP    INVOICE_ITEMS       INVOICE_ITEMS_PK_IDX    AP_IDX       VALID        47,500       47,500         191      1        118        1        1
AP    INVOICE_ITEMS       INVOICE_ITEMS_PROD_IDX  AP_IDX       VALID        47,500            5         955      1        140       28      191
AP    RANDOMIZED_PARTED   RANDOMIZED_PARTED_PK    ADO_COLD_IDX VALID    10,365,139   10,365,139   6,722,752      2     28,733        1        1
AP    RANDOMIZED_SORTED   RANDOMIZED_SORTED_PK    ADO_COLD_IDX VALID    10,194,228   10,194,228      76,739      2     22,556        1        1
AP    RANDOMIZED_UNSORTED RANDOMIZED_UNSORTED_PK  ADO_COLD_IDX VALID    10,395,465   10,395,465  10,379,934      2     23,004        1        1
AP    VENDORS             VENDORS_PK_IDX          AP_IDX       VALID           164          164           3      0          1        1        1


Here’s a breakout of what five of the most crucial statistics reveal:

Image Depth: An index’s depth refers to the number of root versus node versus leaf pages it comprises. A value of zero (0) means there is only a root page, while any number greater than zero indicates how many levels have been constructed to accommodate all the values.

Image Leaf blocks: Since this count reflects the total number of leaf blocks that make up the index, the optimizer uses this number to determine if a full table scan is cheaper than a full index scan.

Image Distinct keys: This column reflects the number of unique values stored in the index; for an index that’s enforcing either a PRIMARY KEY or UNIQUE constraint, the number of distinct keys will equal the number of rows in the table.

Image Average leaf blocks per key: The average number of leaf blocks per distinct key value indicates the average number of leaf blocks in which each distinct value in the index appears. For an index that supports either a PRIMARY KEY or UNIQUE constraint, this value will be equal to one (1); for a nonunique index, a larger value could be there in case the number of entries per distinct key value is more than what can fit into one leaf block.

Image Average data blocks per key: Likewise, this column records the average number of data blocks per distinct value in the index; it gives an approximate measure of how many rows can be retrieved from the corresponding table for one key value.

Image Clustering factor: This is probably the most crucial index statistic to understand because the optimizer uses it to determine how efficient the index is for retrieving all the necessary ROWIDs from a B-tree index in the fewest number of index blocks.

The Impact of a Low Clustering Factor

When data has been loaded into a table and the rows in the table are not sorted by the indexed column, then the index on the column will have a low clustering factor. For example, consider the table AP.RANDOMIZED_UNSORTED, which has been loaded using data from table AP.RANDOMIZED_SORTED, as shown in Listing 18.10. Note that the ORDER BY clause in the statement that loads data into AP.RANDOMIZED_SORTED forces the data to be sorted in descending order based on the values for the KEY_DESC column, whereas the index has been created on the KEY_ID column.

Listing 18.10 Example of Low Index Clustering Factor


DROP TABLE ap.randomized_unsorted PURGE;
CREATE TABLE ap.randomized_unsorted (
    key_id      NUMBER(8)
   ,key_date    DATE
   ,key_desc    VARCHAR2(32)
   ,key_sts     NUMBER(2) NOT NULL
)
    TABLESPACE ado_cold_data
    NOLOGGING
    PCTFREE 0;

TRUNCATE TABLE ap.randomized_unsorted;

INSERT /*+ APPEND NOLOGGING PARALLEL(4) */ INTO ap.randomized_unsorted
SELECT /*+ PARALLEL(4) */ * FROM ap.randomized_sorted
 ORDER BY key_desc DESC;

COMMIT;

ALTER TABLE ap.randomized_unsorted
    DROP CONSTRAINT randomized_unsorted_pk;

DROP INDEX ap.randomized_unsorted_pk;

ALTER TABLE ap.randomized_unsorted
  ADD CONSTRAINT randomized_unsorted_pk
  PRIMARY KEY (key_id)
  USING INDEX (
    CREATE UNIQUE INDEX ap.randomized_unsorted_pk
       ON ap.randomized_unsorted(key_id)
       TABLESPACE ado_cold_idx
       NOLOGGING
       PARALLEL 4
       );
ALTER INDEX ap.randomized_unsorted_pk LOGGING;


When index AP.RANDOMIZED_UNSORTED_PK_IDX is created on column KEY_ID, the index’s clustering factor is dramatically different than that of AP.RANDOMIZED_SORTED_PK_IDX, as the query and output in Listing 18.9 shows. As a result, whenever a query filters data using a range scan against KEY_ID for this table, it will use dramatically larger amounts of physical I/O because the data values are not tightly clustered within a few index pages.


Note

An index’s clustering factor is only meaningful to the query optimizer for B-tree indexes; it is meaningless for bitmap indexes because of their structure and how they are used.


Operational Considerations for Indexes

Sometimes, an unexpectedly poor performance from an index might have nothing to do with the index at all. Its ill effect on a decision support system (DSS) query’s execution speed or an OLTP application workload’s execution rate per transaction could occur because the Oracle DBA employed an inappropriate index type; he might have ignored best practices, or he could simply be encountering a performance “headwind” because of the application’s inability to scale up as expected.

Impact of Choosing the Wrong Index Type

As we’ve already discussed, when implementing a new index, it’s important to remember which index type is the most appropriate:

Image Generally speaking, remember that bitmap indexes are most effective when they are used to index an extremely large number of rows for a column that has relatively few distinct values that very rarely change.

Image Conversely, B-tree indexes will generally outperform a bitmap index when the column being indexed has an extremely large number of frequently updated distinct values, and especially when the domain of unique values is constantly expanding over time.

Impact of Forcing the Use of an Index via a Hint

Unfortunately, forcing an index via a hint happens more frequently than it should. For example, suppose an application’s user complains about the performance of a query. An overzealous developer or DBA reviews the query’s execution plan to determine the cause of the perceived poor performance. When she discovers that the query is not using the supposedly “appropriate” index, she might choose to force the use of the appropriate index by modifying the application code to incorporate a +INDEX hint.

That strategy may solve the query’s performance issue (for now), but in due course of time when that selected index ceases to be the most appropriate choice—say, after the table has grown to such a size that a table scan with several degrees of parallelism is actually a more efficient access path to data—the optimizer will have no choice but to continue to use the now less-than-desirable index.

Impact of Outmoded Initialization Parameter Settings

It was not uncommon in earlier releases of Oracle to override the default values for the two initialization parameters discussed in this section in an attempt to influence the cost-based optimizer to favor indexes over table scans. We’ve often seen that these settings continue to be propagated forward into future releases of the Oracle database long after they might have outlived their usefulness.

The OPTIMIZER_INDEX_CACHING initialization parameter controls the costing of an index probe in conjunction with a nested loop or an INLIST iterator. The range of values 0 to 100 for this parameter indicates percentage of index blocks available in the buffer cache, which modifies the optimizer’s estimate of the cost of an index probe for nested loops and INLIST iterators. A value of 100 infers that 100 percent of the index blocks are likely to be found in the buffer cache, and the optimizer adjusts the cost of an index probe or nested loop accordingly.

For good reason, the default value of this parameter is zero (0), or in other words, index blocks are no more likely than table blocks to be found in the database buffer cache, which results in the default optimizer behavior. Unless you know precisely how often this is likely to occur, Oracle suggests using caution when setting this parameter to any value other than zero.

The OPTIMIZER_INDEX_COST_ADJ initialization parameter tweaks optimizer behavior to be either less or more inclined to choose an index versus a table scan to retrieve data. The minimum value of one (1) is the lowest setting, indicating the optimizer should consider indexes about 100 times more costly than table scans; conversely, the maximum setting of 10000 forces the optimizer to consider that index scans are 100 times less costly than table scans.

The default for this parameter is 100, at which the optimizer evaluates index access paths at the regular cost. Any other value makes the optimizer evaluate the access path at that percentage of the regular cost. Again, unless you know your application workload intimately and can accurately predict the implications of this setting after extensive testing, Oracle suggests leaving this parameter at its default value.

Impact of Deleting (Almost) All Index Entries in a Block

If an application permits deletion of rows from its tables, after a mass deletion has occurred, it is possible that many index pages may have only a few entries that correspond to active, non-deleted rows. Unfortunately, Oracle cannot reclaim space in an index page until all entries have been deleted; as a result, it is possible that many more index pages are being scanned than necessary to retrieve index row pieces during indexed lookups. It’s therefore recommended in this state to either rebuild an index or, if the database release is at least Oracle 10g Release 2, to compact the index segment with the ALTER INDEX <index name> COMPACT; command.

Impacts of Index Overuse

Finally, when it comes to tuning application performance, a prudent Oracle DBA should use indexes like a gourmet chef uses saffron or some other rare and costly spice: with great care, and not in every dish! Put simply, not every poorly performing query will suddenly start performing faster just because a new index has been added, and every index adds the specter of additional physical I/O whenever DML affecting the indexed column is performed on the underlying table.

Hiding Unselective Indexes

If your current Oracle Database release is at least 11g Release 1, then you are fortunate to have a powerful tool to battle unnecessary indexes on your tool belt: the ability to hide an index from the Oracle Optimizer for consideration as a possible access path during statement parsing.

An existing index can be made invisible by simply altering its state to INVISIBLE, and a new index can even be created in an initially INVISIBLE state, as shown in Listing 18.11. Even though the optimizer will not be allowed to use this index as an access path, all DML operations against this index will continue to be processed.

Listing 18.11 Making an Existing Index Invisible to the Optimizer


ALTER INDEX sh.zm_customers_marital_bix INVISIBLE;

DROP INDEX sh.zm_customers_marital_bix;
CREATE BITMAP INDEX sh.zm_customers_marital_bix
    ON sh.zm_customers (cust_marital_status)
    TABLESPACE example
    INVISIBLE;


Note that even if the index is specified directly in a +INDEX hint, the optimizer will ignore it as long as it has been marked INVISIBLE ... unless, that is, the OPTIMIZER_USE_INVISIBLE_INDEX initialization parameter is set to TRUE at the session level, as Listing 18.12 shows.

Listing 18.12 Making an Invisible Index Temporarily Visible to the Optimizer


SELECT /*+ INDEX(sh.zm_customers_marital_bix) */
  cust_state_province, COUNT(*)
  FROM sh.zm_customers
WHERE cust_marital_status = 'widow'
GROUP BY cust_state_province
ORDER BY cust_state_province;

Plan hash value: 2056234527
-----------------------------------------------------------------------------------
| Id  | Operation          | Name         | Rows  | Bytes | Cost (%CPU)| Time     |
-----------------------------------------------------------------------------------
|   0 | SELECT STATEMENT   |              |   145 |  2465 |   424   (1)| 00:00:01 |
|   1 |  SORT GROUP BY     |              |   145 |  2465 |   424   (1)| 00:00:01 |
|*  2 |   TABLE ACCESS FULL| ZM_CUSTOMERS |  3461 | 58837 |   423   (1)| 00:00:01 |
-----------------------------------------------------------------------------------

Predicate Information (identified by operation id):
   2 - filter("CUST_MARITAL_STATUS"='widow')

ALTER SESSION SET optimizer_use_invisible_index = TRUE;

SELECT /* +INDEX(sh.zm_customers_marital_idx) */
    cust_city, COUNT(*)
  FROM sh.zm_customers
 WHERE marital_status = 'S';

Plan hash value: 2120557835

-----------------------------------------------------------------------------------------------------------------
| Id  | Operation                            | Name                     | Rows  | Bytes | Cost (%CPU)| Time     |
-----------------------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT                     |                          |    58 |   986 |    19   (6)| 00:00:01 |
|   1 |  SORT GROUP BY                       |                          |    58 |   986 |    19   (6)| 00:00:01 |
|   2 |   TABLE ACCESS BY INDEX ROWID BATCHED| ZM_CUSTOMERS             |    75 |  1275 |    18   (0)| 00:00:01 |
|   3 |    BITMAP CONVERSION TO ROWIDS       |                          |       |       |            |          |
|*  4 |     BITMAP INDEX SINGLE VALUE        | ZM_CUSTOMERS_MARITAL_BIX |       |       |            |          |
-----------------------------------------------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
   4 - access("CUST_MARITAL_STATUS"='widow')


If the Oracle DBA determines that the index should indeed remain visible, it can simply be rendered visible to the optimizer again by issuing the ALTER INDEX <index name> VISIBLE; command.

Index Performance Issues in RAC Databases

RAC databases present their own unique challenges in regard to index performance, as discussed in the following sections.

Impact of Nonselective Indexes

Indexes that are nonselective—that is, that don’t dramatically improve query performance—can have an especially deleterious effect on DML in RAC because they may be subject to interinstance contention. For example, consider a table with five indexes: one primary key index and four other indexes that were created in anticipation of particular query workloads. If those four additional indexes provide very little benefit for speeding query execution times, they may actually be hindering the RAC database’s ability to quickly update these nonselective indexes. Even worse, if these indexes need to be accessed by queries while OLTP activity is occurring, it is likely that this concurrent query and OLTP activity will cause severe cache transfer contention between RAC database instances because both a current image and at least one CR image must be maintained for each of the index pages.

Impact of Monotonically Increasing Indexes

Indexes that grow monotonically because of ever-increasing values (for example, OE.ORDERS.ORDER_ID) often become hot spots in a RAC application workload. Consider a typical primary key strategy that simply increments the next key value by +1 each time a new entry is requested. In this situation, the corresponding primary key index will tend to cluster all of the like values in a very few index blocks; this clustering will result in frequent leaf block splits, the index will end up with an index tree structure that is relatively low in depth, and the index’s root block will tend to be accessed much more frequently than other leaf pages as compared to the case when the index was not monotonic. In a RAC database, this strategy is likely to become a serious performance bottleneck for OLTP applications, especially if multiple instances are hosting the OLTP application: the instances will essentially play a constant game of tug-of-war as they exchange holder status of the current primary index block.

One approach to alleviating this issue is to implement global index hash partitioning, a new feature in Oracle 10g. Other approaches include using a natural key based on existing database column values (that is, instead of a “dumb-numbering” scheme), or even using a reverse-key index as described previously. (Remember, reverse-key indexes cannot be used for a range scan.)

Oracle Sequences and Index Contention

If sequences are used to obtain a table’s next value for its primary key column, it’s important to verify that the sequence’s CACHE value is set much higher than the default of 20—high enough to ensure that two or more instances will never access the same index page at the same time—and also to specify the NOORDER directive. (Recall that the NOORDER clause means that if Session 1 requests a NEXTVAL from a sequence before Session 2 does, it doesn’t matter that Session 2’s value is higher than Session 1’s.)

This means that each instance in the cluster will get a much higher range of sequence values to choose from, eliminating index page contention as a factor; however, this approach might result in large gaps between sequenced values because each instance has reserved a widely disjoint set of values.

OLTP and Read-Mostly Application Workload Collisions

Finally, sometimes problems related to index difficulties are simply the symptom of a set of application operations colliding unexpectedly at the same time. Consider this situation: an OLTP application has recently added new rows to an indexed table on one RAC node, but the changes haven’t been committed yet. There is also a read-mostly workload (for example, batch reporting) executing on another RAC node that is actively attempting to read from the same table and indexes. Because the changes haven’t been committed on the first node, the second node has no choice but to request consistent read (CR) images be generated from undo data on the first node for both the tables and the indexes. If we now imagine that both nodes are servicing both the OLTP and the batch reporting workload, it’s easy to understand how quickly the RAC database’s interconnect will tend to spend most of its time processing requests for CR images.

If the Oracle DBA cannot get the application developers to commit changes more frequently, then it’s very likely that the CR images of the data and index blocks will take up most of the RAC database’s interconnect bandwidth. Likewise, if the indexes are significantly unselective, all of this extra work to maintain CR images for the index blocks will turn out to be essentially counterproductive.

Summary

An index may not be the panacea to cure a query’s performance ills, so don’t be afraid to strenuously question the need for a new index if the intent is merely to speed a query’s execution. Remember that an index may actually contribute significantly to the execution time of DML against the index’s base table or partition, so adding indexes, especially to often-updated data segments, may actually cause more trouble than they are worth.

Invest time to choose the right index type. A bitmap index may be excellent for a column with a small domain of indexed values—until a single brand-new value needs to be added to that column’s domain, forcing a complete rebuild of all bitmap index segments to incorporate that value.

Remember that the B in B-tree stands for “balanced”—and that means that except for the most extreme circumstances, there is little need to rebuild a B-tree index because it remains balanced in almost all situations.

If deletions are performed against a table with large numbers of indexed columns, remember that space allocated to an index block cannot be reclaimed until all values within that block are marked as deleted, and it’s possible that poor application performance may result because many more “almost empty” index blocks may have to be accessed.

The onset of engineered systems such as Exadata and the innovation of In-Memory Column Store (IMCS) in Oracle Database 12.1.0.2 often completely negate the need for indexes for analytical queries. So when in doubt of an index’s benefit, simply consider making that index invisible—and telling no one. If there are no significant complaints about application performance with an invisible index, you have probably found an opportunity to reduce the demands of keeping that index updated by removing it entirely.

If you’re fortunate enough to have made the transition to Oracle Database 12c Release 1 for your data warehouse, consider leveraging the new PARTIAL index feature to limit the creation of local and/or global index partitions against the table partitions that can least benefit from them while retaining index partitions for the table partitions that could most benefit from speedy data access.

Real Application Cluster (RAC) databases are often persnickety about indexes. Remember that applications with too many unselective indexes, improperly sequenced monotonically increasing indexes, or colliding OLTP versus read-mostly workloads can often trace their unexpectedly poor performance or lack of scalability directly to indexes.

..................Content has been hidden....................

You can't read the all page of ebook, please click here login for view all page.
Reset