6.1. Subqueries

A subquery places one query inside another one. The second query resides somewhere within the WHERE clause of a SELECT statement. One or more values returned by the subquery are used by the main query to return the results to the user.

subquery

A query that is embedded in a main, or parent, query and used to assist in filtering the result set from a query.

The types of operators allowed in the WHERE clause depend on whether the subquery returns one row or more than one row. If only a single row is returned from a query, the comparison operators =, !=, <, >, >=, <=, and so forth are valid. If more than one row is returned from a subquery, operators such as IN, NOT IN, ANY, and ALL are valid.

6.1.1. Single-Row Subqueries

The boss, King, wants to do his quarterly salary analysis. He would like to see which employees in the IT department are earning more than the average salary across all employees. Janice, the database analyst and DBA, realizes that this could be written as two queries and decides to take that approach first before using a subquery. The average salary for an employee in the company is retrieved by a query you've seen in previous chapters:

select avg(salary) from employees;

AVG(SALARY)
-----------
 6461.68224

1 row selected.

Using this information as a starting point, Janice writes a second query to see which employees in the IT department (department 60) have a higher salary than the average. She must cut and paste the number returned from the previous query into this new query:

select employee_id, last_name, first_name, salary
from employees
where salary > 6461.68224
and department_id = 60;

EMPLOYEE_ID LAST_NAME     FIRST_NAME            SALARY
----------- ------------- ----------------- ----------
        103 Hunold        Alexander               9000

1 row selected.

The only employee in the IT department making more than the company average salary is Alexander Hunold, who happens to be the manager of that department.

Janice wants to streamline this reporting function for King. She realizes that this can easily be written as a single-row subquery. She will embed the query she used to calculate the average into the second query, replacing the constant value as follows:

single-row subquery

A subquery that returns a single row and is compared to a single value in the parent query.

select employee_id, last_name, first_name, salary
from employees
where salary > (select avg(salary) from employees)
and department_id = 60;

EMPLOYEE_ID LAST_NAME     FIRST_NAME                 SALARY
----------- ------------- ---------------------- ----------
        103 Hunold        Alexander                    9000

1 row selected.

Not only is the query more readable and easier to maintain than the version with two queries, but the Oracle server also will process it much more efficiently.

As a general rule, a query, enclosed in parentheses, can take the place of a table name in the FROM clause or a column name in the SELECT or WHERE clause of a query.


King is starting to realize that the IT department may need some pay increases in the next fiscal year.

6.1.2. Multiple-Row Subqueries

Sometimes, you want to compare a column in a table to a list of results from a subquery, not just a single result. This is where a multiple-row subquery comes in handy. For example, King is following up on his analysis of employee salaries in the IT department, and he wants to see who else in the company is making the same salary as anyone in the IT department.

multiple-row subquery

A subquery that can return more than one row for comparison to the main, or parent, query using operators such as IN.

Janice starts out with the subquery to make sure that she starts with the right set of results to use for the main query. She wants to get the salaries for the employees in the IT department (department 60):

select salary
from employees
where department_id = 60;

SALARY
----------
      9000
      6000
      4800
      4800
      4200

5 rows selected.

So far, so good. She takes this query and makes it a subquery in the query that compares the salaries of all employees to this list by using the IN clause:

select employee_id, last_name, first_name, salary
from employees
where salary in (select salary from employees
                 where department_id = 60);

EMPLOYEE_ID LAST_NAME     FIRST_NAME             SALARY
----------- ------------- ------------------ ----------
        158 McEwen        Allan                    9000
        152 Hall          Peter                    9000
        109 Faviet        Daniel                   9000
        103 Hunold        Alexander                9000
        202 Fay           Pat                      6000
        104 Ernst         Janice                   6000
        106 Pataballa     Valli                    4800
        105 Austin        David                    4800
        184 Sarchand      Nandita                  4200
        107 Lorentz       Diana                    4200

10 rows selected.

But wait, something is not quite right here. King did not want to see the IT employees in this list; he wanted to include everyone but the IT employees. So Janice makes a slight change as follows, removing employees whose job title is not an IT job title:

select employee_id, last_name, first_name, salary
from employees
where salary in (select salary from employees
                 where department_id = 60)
      and job_id not like 'IT_%';

EMPLOYEE_ID LAST_NAME     FIRST_NAME             SALARY
----------- ------------- ------------------ ----------
        158 McEwen        Allan                    9000
        152 Hall          Peter                    9000
        109 Faviet        Daniel                   9000
        202 Fay           Pat                      6000
        184 Sarchand      Nandita                  4200

5 rows selected.

Note that Janice also could have checked for a department ID other than 60, as you have seen in previous queries.

6.1.3. Correlated Subqueries

A correlated subquery looks very much like a garden-variety subquery, with one important difference: The correlated subquery references a column in the main query as part of the qualification process to see if a given row will be returned by the query. For each row in the parent query, the subquery is evaluated to see if the row will be returned. In Janice's situation, the salary of each individual employee is compared to the average salary for that employee's department. The check-marked rows in the parent query are returned.

correlated subquery

A subquery that contains a reference to a column in the main, or parent, query.



Janice knows that King will be asking for more queries regarding salaries, so she comes up with a fairly generic query that will identify employees who are making more than the average salary for their department. As a first step, she builds the subquery that retrieves the average salary for a department:

select avg(salary) from employees
    where department_id = 60;

AVG(SALARY)
-----------
       5760

1 row selected.

That query returns the average salary for department 60. In the correlated subquery, she will need to generalize it so that it will correlate with any department in the parent query. Next, she builds the parent query that compares a given employee's salary to the average she just calculated:

select employee_id, last_name, salary
  from employees
  where department_id = 60 and
    salary > 5760;

EMPLOYEE_ID LAST_NAME              SALARY
----------- ------------------ ----------
        103 Hunold                   9000
        104 Ernst                    6000

2 rows selected.

Notice that there are two queries that can now be linked together into a correlated subquery to return all employees who earn more than the average for their department across all departments. If you're not sure how to link these two queries, the hint is in the column names. Janice joins the two queries using the DEPARTMENT_ID column:

select employee_id, last_name, department_id, salary
  from employees emp
  where
    salary > (select avg(salary) from employees
              where department_id = emp.department_id);

EMPLOYEE_ID LAST_NAME          DEPARTMENT_ID     SALARY
----------- ------------------ ------------- ----------
        100 King                          90      24000

103 Hunold                        60       9000
        104 Ernst                         60       6000
        108 Greenberg                    100      12000
        109 Faviet                       100       9000
...
        193 Everett                       50       3900
        201 Hartstein                     20      13000
        205 Higgins                      110      12000

38 rows selected.

As Janice expected, this query still shows that Hunold and Ernst make more than the average salary for department 60.

6.1.4. Multiple-Column Subqueries

There are times when you need to use a subquery that compares more than just one column between the parent query and the subquery. This is known as a multiple-column subquery. Typically, the IN clause is used to compare the outer query's columns to the columns of the subquery.

NOTE

Multiple-column subqueries can be rewritten as a compound WHERE clause with multiple logical operators. However, this approach is not as readable or maintainable as a multiple-column subquery.

The boss, King, wants to be able to identify employees who make the same salaries as other employees with the same job. He wants to specify an employee number and have the query return the other employees who have the same job title and make the same salary. Janice immediately realizes that this could be written as a multiple-column subquery. She decides to try out the query on one of the stock clerks, Hazel Philtanker, who has an employee number of 136:

multiple-column subquery

A subquery in which more than one column is selected for comparison to the main query using the same number of columns.

select employee_id, last_name, job_id, salary
  from employees
  where (job_id, salary) in
        (select job_id, salary from employees
         where employee_id = 136);

EMPLOYEE_ID LAST_NAME       JOB_ID         SALARY
----------- --------------- ---------- ----------
        128 Markle          ST_CLERK         2200
        136 Philtanker      ST_CLERK         2200

2 rows selected.

The query looks good, except that Hazel is included in the results. If King decides he doesn't want to see the selected employee in the results, Janice can modify the query slightly and change it into a correlated multiple-column subquery:

select employee_id, last_name, job_id, salary
  from employees emp
  where (job_id, salary) in
        (select job_id, salary from employees
         where employee_id = 136
         and employee_id != emp.employee_id);

EMPLOYEE_ID LAST_NAME       JOB_ID         SALARY
----------- --------------- ---------- ----------
        128 Markle          ST_CLERK         2200

1 row selected.

..................Content has been hidden....................

You can't read the all page of ebook, please click here login for view all page.
Reset