The SELECT statement is the key to getting data out of an Oracle database. It’s also very likely the most commonly executed SQL statement from SQL*Plus.
The basic form of the SELECT statement looks like this:
SELECT column_list FROM table_list WHERE conditions GROUP BY column_list HAVING conditions ORDER BY column_list;
The lists in this syntax are comma-delimited. The column list, for example, is a comma-delimited list of column names or expressions identifying the data that you want the query to return.
To retrieve columns from a table, list the columns you want following the SELECT keyword, place the table name after the FROM keyword, and execute your statement. The following query returns a list of tables that you own together with the names of their assigned tablespaces:
SELECT table_name, tablespace_name FROM user_tables;
You can use the ORDER BY clause to sort the results of a query. The following example sorts the results by table name:
SELECT table_name, tablespace_name FROM user_tables ORDER BY table_name;
The default is to sort in ascending order. You can specify descending order using the DESC keyword. For example:
ORDER BY table_name DESC;
While it’s redundant, ASC may be used to specify ascending order. The following example sorts the table list first by tablespace name in descending order and then within that by table name in ascending order:
SELECT table_name, tablespace_name FROM user_tables ORDER BY tablespace_name DESC, table_name ASC;
If you want the sort to be case-insensitive, you can use Oracle’s built-in UPPER function. For example:
SELECT table_name, tablespace_name FROM user_tables ORDER BY UPPER(table_name);
For symmetry, Oracle also has a built-in LOWER function. LOWER converts a string to lowercase; UPPER converts to uppercase.
Use the WHERE clause to restrict the rows returned by a query to those that you need to see. The following example returns a list of any invalid objects that you own:
SELECT object_name, object_type FROM user_objects WHERE status = 'INVALID' ORDER BY object_type, object_name;
The expression following the WHERE clause may be any valid Boolean expression. Oracle supports all the typical operators that you would expect: +, -, /, *, <, >, <>, <=, >=, AND, OR, NOT, ||, IS NULL, LIKE, BETWEEN, and IN. Parentheses are also supported and may be used to clarify the order of evaluation.
Null values are pernicious, especially in the WHERE clause of a query. With only a few exceptions, any expression containing a null value will return a null as the result. Since nulls are considered neither true nor false, this can have unexpected ramifications in how a WHERE clause is evaluated. Consider the following query that attempts to retrieve a list of NUMBER columns with a scale other than 2:
SELECT table_name, column_name FROM user_tab_columns WHERE data_type = 'NUMBER' AND data_scale <> 2;
This query is an utter failure because it misses all the floating-point NUMBER columns that have no scale defined at all. Avoid this problem by explicitly considering nulls when you write your WHERE clause. Use either the IS NULL or the IS NOT NULL operator. For example:
SELECT table_name, column_name FROM user_tab_columns WHERE data_type = 'NUMBER' AND (data_scale <> 2 OR data_scale IS NULL);
When sorting data, null values are treated as greater than all other values. When a standard ascending sort is being done, null values will sort to the bottom of the list. A descending sort will cause null values to rise to the top. You can use the built-in NVL function to modify this behavior.
If you are returning results from a query that might be null, or you are sorting on results that might be null, you can use Oracle’s built-in NVL function to replace null values with a selected non-null value. For example, the NUM_ROWS column in the USER_TABLES view will have a value only for tables that have been analyzed. Here, the NVL function is used to convert null values to zeros:
SELECT table_name, NVL(num_rows,0) FROM user_tables ORDER BY NVL(num_rows,0);
Be cautious about using NVL in a WHERE clause. Using NVL, or any other function, on an indexed column in a WHERE clause may prevent Oracle from using any index on that column.
It’s very common to combine data from two or more tables in order to return related information. Such a combination of two tables is referred to as a join.
You join two tables by listing them in the FROM clause, separated by commas. For example:
SELECT user_constraints.constraint_name, user_constraints.constraint_type, user_cons_columns.column_name FROM user_constraints, user_cons_columns;
This query returns the Cartesian product—all possible combinations of all rows from both tables. Conceptually, this is where all joins start. In practice, you almost always put some conditions in the WHERE clause so that only related rows are combined. The following, more useful, query returns a list of constraint names together with the columns involved in each constraint:
SELECT user_constraints.constraint_name, user_constraints.constraint_type, user_cons_columns.column_name FROM user_constraints, user_cons_columns WHERE user_constraints.constraint_name = user_cons_columns.constraint_name;
Because both tables contain columns with matching names, the column references must be qualified with the table name. You can see that this quickly gets cumbersome. The solution is to provide a shorter alias for each table and use that alias to qualify the column names. For example:
SELECT uc.constraint_name, uc.constraint_type, ucc.column_name FROM user_constraints uc, user_cons_columns ucc WHERE uc.constraint_name = ucc.constraint_name;
Here, the alias uc is used for the USER_CONSTRAINTS table, while UCC is used for USER_CONS_COLUMNS. The resulting query is much easier to read because you aren’t overwhelmed with long table names.
The joins that you’ve seen so far are inner joins. An inner join is one that returns data only when both tables have a row that matches the join conditions. For example, the following query returns only tables that have constraints defined on them:
SELECT ut.table_name, uc.constraint_name FROM user_tables ut, user_constraints uc WHERE ut.table_name = uc.table_name;
An outer join returns rows for one table, even
when there are no matching rows in the other. You specify an outer
join in Oracle by placing a plus sign (+
) in
parentheses following the column names from the optional table in
your WHERE clause. For example:
SELECT ut.table_name, uc.constraint_name FROM user_tables ut, user_constraints uc WHERE ut.table_name = uc.table_name(+);
The (+
) following uc.table_name makes the
user_constraint table optional. The query will return all tables, and
where there are no corresponding constraint records, Oracle will
supply a null in the constraint name column.
The GROUP BY and HAVING clauses, together with Oracle’s built-in aggregate functions, allow you to summarize the data returned by a query.
Aggregate functions take data from multiple rows as input and return one summarized value. For example, the following query uses the COUNT function to return the number of tables that you own:
SELECT COUNT(*) FROM user_tables;
Oracle supports several different aggregate functions, all of which are listed in Table 1.1.
Table 1-1. Aggregate Functions
Function |
Description |
---|---|
AVG |
Averages the values in each group. |
COUNT |
Counts the non-null values in each group. COUNT(*) is a special case and counts all rows. |
MAX |
Returns the maximum value in a group. |
MIN |
Returns the minimum value in a group. |
STDDEV |
Returns the standard deviation of all values in a group. |
SUM |
Returns the sum of all values in a group. |
VARIANCE |
Returns the variance (related to standard deviation) of all values in a group. |
In addition to summarizing the entire results of a query, you can summarize the data for each distinct value in a column. For example, the following query returns the number of columns in each table you own:
SELECT ut.table_name, COUNT(utc.column_name) FROM user_tables ut, user_tab_columns utc WHERE ut.table_name = utc.table_name GROUP BY ut.table_name ORDER BY ut.table_name;
The following query extends the previous query and displays the number of columns in each table to which you have access. This time the grouping results in one row for each distinct owner and table name combination:
SELECT at.owner, at.table_name, COUNT(atc.column_name) FROM all_tables at, all_tab_columns atc WHERE at.table_name = atc.table_name GROUP BY at.owner, at.table_name ORDER BY at.owner, at.table_name;
If you want the results of a GROUP BY query returned in any particular order, you must include an ORDER BY clause. However, an ORDER BY clause is not required. At times it may appear that Oracle automatically sorts GROUP BY queries. It does, but only to a point. If you want the results sorted, you must include an ORDER BY clause.
Columns in the select list of a GROUP BY query must be either listed in the GROUP BY clause or enclosed by one of the aggregate functions listed earlier in Table 1.1.
You can use the HAVING clause to restrict the rows returned by a summary query to only the rows of interest. The HAVING clause functions just like the WHERE clause, except that the HAVING conditions are applied to the summarized results. For example, the following query returns a list of all tables for which you have not defined any indexes:
SELECT ut.table_name, COUNT(ui.index_name) FROM user_tables ut, user_indexes ui WHERE ut.table_name = ui.table_name(+) GROUP BY ut.table_name HAVING COUNT(ui.index_name) = 0;
This query works by first counting up the number of indexes on each table and then eliminating those tables with nonzero counts.
Avoid placing conditions in the HAVING clause that do not test summarized values. Consider, for example, these two queries:
SELECT at.owner, at.table_name, COUNT(atc.column_name) FROM all_tables at, all_tab_columns atc WHERE at.table_name = atc.table_name GROUP BY at.owner, at.table_name HAVING at.owner <> 'SYS' AND at.owner <> 'SYSTEM' ORDER BY at.owner, at.table_name; SELECT at.owner, at.table_name, COUNT(atc.column_name) FROM all_tables at, all_tab_columns atc WHERE at.table_name = atc.table_name AND at.owner <> 'SYS' AND at.owner <> 'SYSTEM' GROUP BY at.owner, at.table_name ORDER BY at.owner, at.table_name;
Both queries return the same result—a count of rows in each table except for those tables owned by SYS or SYSTEM. The second query, however, will execute more efficiently because tables owned by SYS and SYSTEM are eliminated by the WHERE clause before the data is summarized.
The aggregate functions listed in Table 1.1 ignore null values. By default, they also exclude duplicate values. You can use the ALL and DISTINCT keywords to modify this behavior. For example:
SELECT COUNT (DISTINCT table_name) FROM user_tab_columns; SELECT COUNT (ALL table_name) FROM user_tab_columns;
The first query uses the DISTINCT keyword to count up the number of tables. The second query uses the ALL keyword to count up the total number of columns defined for all those tables.
SQL supports four union operators that allow you to take the results of two queries and combine them into one. These are listed in Table 1.2.
Table 1-2. SQL’s Union Operators
Function |
Description |
---|---|
UNION |
Combines the results of two queries and then eliminates duplicate rows. |
UNION ALL |
Combines the results of two queries without eliminating duplicate rows. |
MINUS |
Takes the rows returned by one query and eliminates those that are also returned by another. |
INTERSECT |
Takes the results from two queries and returns only rows that appear in both. |
The following example of a union query uses the MINUS operator to return a list of all tables for which you have not yet defined any indexes:
SELECT table_name FROM user_tables MINUS SELECT DISTINCT table_name FROM user_indexes WHERE table_owner = USER ORDER BY table_name;
The first query returns a list of all tables you own. The second query returns a list of all tables that are indexed. The MINUS union operation removes those indexed tables from the first list, leaving only the unindexed tables.