A lot happens in the WHERE clause. This is the place where the rows (with columns both actual and derived) from the list specified in the SELECT clause are trimmed down to only the results you need to see. Starting with the syntax described in Chapter 2, "SQL*Plus and iSQL*Plus Basics," we can expand the SELECT statement syntax as follows:
SELECT * | {[DISTINCT] column | expression [alias], ...} FROM tablename [WHERE condition ... ];
The WHERE clause may have one or more conditions, separated by AND and OR and optionally grouped in parentheses to override the default precedence.
From the perspective of the table, the SELECT clause slices a table vertically, and the WHERE clause slices it horizontally.
A WHERE clause will often compare one column's value to a constant or compare two of the columns to each other in some way. Table 4.1 lists the comparison operators that are valid within a WHERE clause.
Comparison Operator | Definition |
---|---|
= | Equal to |
> | Greater than |
>= | Greater than or equal to |
< | Less than |
<= | Less than or equal to |
<>, !=, ^= | Not equal to |
In Chapter 3, "Oracle Database Functions," you learned about operator precedence. The comparison operators are lower in precedence only to the arithmetic operators *, /, +, and − and the concatenation operator ||. This makes a lot of sense when you consider how expressions are typically used in WHERE clauses: Some kind of arithmetic operation is performed on one or more columns or constants, and that result is compared to another constant, column, or arithmetic operation on one or more columns or constants. For instance, consider this WHERE clause:
where salary * 1.10 > 24000
This example will evaluate SALARY * 1.10 first and then do the comparison to 24000.
In Scott's widget company, another corporate shakeup has occurred, and King is once again the president of the company. Janice, in her analyst role, is running some reports against the EMPLOYEES table for King, whose first task is to do a thorough salary review for all employees who have salaries that are within $10,000 of his salary. Janice knows that King's salary is $24,000, so she will specify this numeric literal in the query, along with the $10,000 for the difference in salary:
select employee_id "Emp ID", last_name "Last Name", salary "Salary" from employees where salary + 10000 > 24000;
Emp ID Last Name Salary ---------- ------------------------- ---------- 100 King 24000 101 Kochhar 17000 102 De Haan 17000 3 rows selected.
A few things come to mind right away. First, King himself is in the list. You will learn how to remove his name in the next section. Janice could have also written the WHERE clause the other way around:
where 24000 < salary + 10000;
and the results of the query would be the same. Janice could have also saved a bit of processing time by calculating the salary cutoff number before writing the query:
where salary > 14000;
How you write your WHERE clause may be about style, readability, and documentation more than it is about processing speed, which is why the first version of the WHERE clause might be the best choice.
NOTE
Column aliases are not allowed in the WHERE clause. The actual column names must be used.
The WHERE clause using comparison operators is really powerful, but in reality, you usually have more than one condition for selecting rows. Sometimes you need all of the conditions to be true, sometimes you need only one of the conditions to be true, and sometimes you want to specify what you don't need. You can accomplish this by using AND, OR, and NOT in your WHERE clauses.
Using an AND between two comparison conditions will give you rows from the table that satisfy both conditions. In one of the queries above, Janice noticed that King's name was returned in the query that was looking for other employees who had salaries close to King's. There is no need to include King in this query. Since Janice knows King's employee ID, she can remove him from the results of those queries by adding an AND condition, as follows:
select employee_id "Emp ID", last_name "Last Name", salary "Salary" from employees where salary + 10000 > 24000 and employee_id != 100;
Emp ID Last Name Salary ---------- ------------------------- ---------- 101 Kochhar 17000 102 De Haan 17000 2 rows selected.
The rules of precedence tell us that AND is very low on the list, and therefore the AND operation is performed last in the WHERE clause. However, for clarity, it doesn't hurt to add parentheses to make the conditional expressions more obvious:
where (salary + 10000 > 24000) and (employee_id != 100);
There are other ways to remove King from the query. We'll discuss some of these methods in Chapter 6, "Advanced SQL Queries."
Now King decides that he wants to include anyone who works in the IT department, in addition to those whose salaries are close to his. Janice recognizes that this is a job for the OR operator. She modifies the query to include those employees who are in the IT department, using the JOB_ID column:
select employee_id "Emp ID", last_name "Last Name", salary "Salary" from employees where (salary + 10000 > 24000) and (employee_id != 100) or job_id = 'IT_PROG'; Emp ID Last Name Salary ---------- ------------------------- ---------- 101 Kochhar 17000 102 De Haan 17000 103 Hunold 9000 104 Ernst 6000 105 Austin 4800 106 Pataballa 4800 107 Lorentz 4200 7 rows selected.
Since the AND has a higher priority than the OR, the salary and employee ID comparisons are evaluated to see if they are both true; if so, the row is returned. If either one or the other is not true, the row might still be returned if the employee is in the IT department. Janice can make this WHERE clause more readable by putting in the parentheses, even if they're not needed:
where ((salary + 10000 > 24000) and (employee_id != 100)) or (job_id = 'IT_PROG'),
|
Janice expects that the other shoe will drop in a month or two, when King will ask for a report that has everyone else in it. This is a good place to use NOT. Janice can use this operator to negate the entire set of conditions that gave the first set of rows, thus returning the rest of the rows:
select employee_id "Emp ID", last_name "Last Name", salary "Salary" from employees where not ( (salary + 10000 > 24000) and (employee_id != 100) or job_id = 'IT_PROG' ) ; Emp ID Last Name Salary ---------- ------------------------- ---------- 100 King 24000 108 Greenberg 12000 109 Faviet 9000 110 Chen 8200 ... 203 Mavris 6500 204 Baer 10000 205 Higgins 12000 206 Gietz 8300 100 rows selected.
Note how Janice merely put the entire previous WHERE clause into parentheses and added a NOT in the front. One query returns a given set of rows, and a second query returns everything but the given set of rows. So, between the two queries, she has covered the entire table. Janice will have this report ready for King when he asks for it.
The BETWEEN, IN, and LIKE operators provide more ways to trim down the number of rows returned from a query. BETWEEN gives you an easy way to check for a value that falls within a certain range. The IN operator can help you find values in a list. LIKE can help you find character strings that match a certain pattern. Adding NOT to these will give you just the opposite set of rows.
BETWEEN a Rock and a Hard Place
The BETWEEN operator in a WHERE clause will limit the rows to a range that is specified by a beginning value and an ending value; the range is inclusive. The values can be dates, numbers, or character strings. The column values to be compared will be converted to the datatypes of the values in the BETWEEN operator as needed.
Each quarter at Scott's widget company, employees are recognized for years of service to the company. Janice is in charge of generating the report that lists the employees who have their anniversary within the next three months. Her query will use one of the functions mentioned in the previous chapter, EXTRACT, which returns one of the individual components of a DATE datatype.
select employee_id "Emp ID", department_id "Dept ID", hire_date "Hire Date", last_name || ', ' || first_name "Name" from employees /* Oct to Dec */ where extract(month from hire_date) between 10 and 12; Emp ID Dept ID Hire Date Name ---------- ---------- --------- ---------------------- 113 100 07-DEC-99 Popp, Luis 114 30 07-DEC-94 Raphaely, Den 116 30 24-DEC-97 Baida, Shelli 118 30 15-NOV-98 Himuro, Guy 123 50 10-OCT-97 Jasper, Susan Abigail 124 50 16-NOV-99 Mourgos, Kevin 130 50 30-OCT-97 Atkinson, Mozhe 135 50 12-DEC-99 Gee, Ki 138 50 26-OCT-97 Stiles, Stephen 141 50 17-OCT-95 Rajs, Trenna 145 80 01-OCT-96 Russell, John 148 80 15-OCT-99 Cambrault, Gerald 154 80 09-DEC-98 Cambrault, Nanette 155 80 23-NOV-99 Tuvault, Oliver 160 80 15-DEC-97 Doran, Louise
161 80 03-NOV-98 Sewall, Sarath 162 80 11-NOV-97 Vishney, Clara 191 50 19-DEC-99 Perkins, Randall 18 rows selected.
There is a lot going on in this query. First, notice that the columns are all aliased to make the output much more readable.
Janice also used the concatenation operator || to make the output more readable. She could have used the CONCAT function here, although she would need to use it twice to get the same results.
There is also something else new in this example: the /* and */. These characters denote a comment in Oracle SQL. A comment is used to help document the SQL code that you're writing. Documenting your SQL code is good not only for other developers who may need to modify your code in the future, but also for you when, months from now, you can't quite remember why you used a particular table or function!
comment
Documentation for SQL statements. Comments are specified by using the pair /* and */ or by using --.
Alternatively, you can use -- to specify a comment, like this:
select * from employees -- All columns needed for finance
The main difference between using /* */ and -- is that the latter form treats everything to the end of the line as a comment, whereas the former treats everything as a comment until the closing */ is reached, which may be on the same line or several lines later.
NOTE
Although both /* */ and -- can be used almost interchangeably, the /* */ form must be used after the SELECT keyword when specifying optimizer hints. See Chapter 12, "Making Things Run Fast (Enough)," for details on how to specify hints to the optimizer.
Finally, the query has the BETWEEN operator. The EXTRACT function will return a value from 1 to 12 because the function is called with MONTH, and if this value falls in the range of 10 to 12, then the row is returned from the query.
What happens if you change the BETWEEN operator slightly and reverse the order of the months?
where extract(MONTH from HIRE_DATE) between 12 and 10;
Your intuition might tell you that this form of the WHERE clause would work, since 11 would still be between 12 and 10, just as 11 is between 10 and 12. But it doesn't work. This is because of how Oracle's SQL engine translates the arguments of the BETWEEN operator. When processing the query, Oracle changes BETWEEN to a pair of comparisons joined with an AND, as follows:
where extract(MONTH from HIRE_DATE) >= 12 and extract(MONTH from HIRE_DATE) <= 10;
Since no number can be at the same time greater than or equal to 12 and less than or equal to 10, no rows will be returned from a query with this WHERE clause.
IN the Thick of Things
The IN operator makes it easy to specify a list of values to search for in a WHERE clause. The IN clause contains a list of one or more values, separated by commas and enclosed in parentheses:
IN (value1, value2, ...)
It is ideal for situations where the values to be selected aren't in a range that the BETWEEN operator (or a pair of comparisons with an AND) can easily handle.
At Scott's widget company, one of the vice presidents, one of the store managers, and one of the purchasing managers will be temporarily moving to Chicago to open a new branch office. The employees who report to them will also move. The manager IDs for these positions are 102, 114, and 121. Janice writes a query to identify the people who are moving along with their managers:
select employee_id "Emp ID", manager_id "Mgr ID", last_name || ', ' || first_name "Name" from employees where manager_id in (102, 114, 121); Emp ID Mgr ID Name ---------- ---------- ---------------------------- 103 102 Hunold, Alexander 115 114 Khoo, Alexander 116 114 Baida, Shelli 117 114 Tobias, Sigal 118 114 Himuro, Guy 119 114 Colmenares, Karen 129 121 Bissot, Laura 130 121 Atkinson, Mozhe 131 121 Marlow, James 132 121 Olson, TJ 184 121 Sarchand, Nandita 185 121 Bull, Alexis 186 121 Dellinger, Julia 187 121 Cabrio, Anthony 14 rows selected.
The IN operator could be rewritten with a series of OR conditions, but once you need to use more than two or three values, the advantages of using IN become apparent.
As you might expect, NOT IN is also valid. If the query you want to write sounds something like, "I want all the values except for these two or three...", then NOT IN is probably a good choice.
What's Not to Like about LIKE?
The LIKE operator lets you do pattern matching in a query. You know how to search for exact strings and numbers, but in some cases, you know only a few digits of the number or a portion of the string you need to find.
pattern matching
Comparing a string in a database column to a string containing wildcard characters. These wildcard characters can represent zero, one, or more characters in the database column string.
The LIKE operator can be used interchangeably with an equal sign, except that the string specified with LIKE can contain wildcard characters. The wildcard characters allowed in LIKE are %, which represents zero or more characters, and _, which represents exactly one character.
For example, the pattern 'Sm_th%' will match 'Smith' and 'Smythe' but not 'Smooth'. The pattern '%o%o%' will match any string that contains at least two lowercase o characters.
Janice is writing an ad-hoc query for Employee Services that will retrieve the job titles that have the word "Manager" somewhere in the title. She uses the LIKE operator:
select job_id, job_title from jobs where job_title like '%Manager%'; JOB_ID JOB_TITLE ---------- ----------------------------------- FI_MGR Finance Manager AC_MGR Accounting Manager SA_MAN Sales Manager PU_MAN Purchasing Manager ST_MAN Stock Manager MK_MAN Marketing Manager 6 rows selected.
NOTE
When numbers or dates are used with the LIKE operator, they are converted to character strings using the default conversion rules before comparing to the LIKE string.
What happens when you want to search for the _ or % characters themselves? The job IDs in Scott's corporate database use underscores, so Janice would get erroneous results if she specified 'sT_' in the LIKE string to find store-related jobs. This would also return jobs that had 'ASSISTANT' or 'COSTMGR' in the job ID. To solve this problem, she uses the ESCAPE option of the LIKE clause. The ESCAPE option lets you define a special character—one that you don't expect to find in your strings—to use before _ or % to indicate that you're actually looking for a _ or % character. To find all the job descriptions for jobs that are store-related, and therefore begin with 'sT_', Janice uses the following query:
select job_id, job_title from jobs where job_id like 'sT\_%' escape ''; JOB_ID JOB_TITLE ---------- ----------------------------------- ST_MAN Stock Manager ST_CLERK Stock Clerk 2 rows selected.
The ESCAPE option is used only with LIKE, and it tells the SQL engine to treat the character that follows literally instead of as a wildcard character. Notice in the above example that the underscore is "escaped," but the % acts as it normally does and specifies that zero or more characters follow.
|
As mentioned in previous chapters, NULLs can be very useful in the database for saving disk space and for identifying values that are unknown, as opposed to being blank or zero. The key to understanding NULLs is to know that they are not equal to anything. Therefore, NULLs won't work with the standard comparison operators, such as +, /, >, =, and so forth. Janice learned this the hard way when she wanted to identify employees who made a commission of less than 15 percent or no commission at all. Here is the query she used:
select employee_id "Emp ID", last_name "Name", commission_pct "Comm%" from employees where commission_pct < 0.15;
Emp ID Name Comm% ---------- ------------------------- ---------- 164 Marvins .1 165 Lee .1 166 Ande .1 167 Banda .1 173 Kumar .1 179 Johnson .1 6 rows selected.
This list appears to be way too short. That is because the rows in the EMPLOYEES table with NULL values for the commission do not pass the criteria of being less than 0.15; they don't compare to any value because they are unknown.
This is where the IS NULL and IS NOT NULL operators come to the rescue. These two operators are the only ones that can do a direct comparison to values that are NULL in a database row. For Janice to fix her query, she needs to add an IS NULL condition to her WHERE clause:
select employee_id "Emp ID", last_name "Name", commission_pct "Comm%" from employees where commission_pct < 0.15 or commission_pct is null; Emp ID Name Comm% ---------- ------------------------- ---------- 100 King 101 Kochhar 102 De Haan ... 164 Marvins .1 165 Lee .1 166 Ande .1 167 Banda .1 173 Kumar .1 179 Johnson .1 180 Taylor ... 205 Higgins 206 Gietz 78 rows selected.
|
You'll see in the section on GROUP BY how multirow functions handle NULL values in a reasonable and expected way.