Chapter 24
Sub-query Functions
“ A little manu often casts a long shadow.”
Table of Contents Chapter 24 – Sub-query Functions
– An IN List is much like a Subquery
– An IN List Never has Duplicates – Just like a Subquery
– An IN List Ignores Duplicates
– The Final Answer Set from the Subquery
– Quiz- Answer the Difficult Question
– Answer to Quiz- Answer the Difficult Question
– Should you use a Subquery of a Join?
– Answer to Quiz- Write the Subquery
– Quiz- Write the More Difficult Subquery
– Answer to Quiz- Write the More Difficult Subquery
– Quiz- Write the Subquery with an Aggregate
– Answer to Quiz- Write the Subquery with an Aggregate
– Quiz- Write the Correlated Subquery
– Answer to Quiz- Write the Correlated Subquery
– The Basics of a Correlated Subquery
– The Top Query always runs first in a Correlated Subquery
– The Bottom Query runs Last in a Correlated Subquery
– Quiz- Who is coming back in the Final Answer Set?
– Correlated Subquery Example vs. a Join with a Derived Table
– Correlated Subquery that Finds Duplicates
– Quiz- Write the NOT Subquery
– Answer to Quiz- Write the NOT Subquery
– Quiz- Write the Subquery using a WHERE Clause
– Answer to Quiz- Write the Subquery using a WHERE Clause
– Quiz- Write the Subquery with Two Parameters
– Answer to Quiz- Write the Subquery with Two Parameters
– How the Double Parameter Subquery Works
– More on how the Double Parameter Subquery Works
– Quiz – Write the Triple Subquery
– Answer to Quiz – Write the Triple Subquery
– Quiz – How many rows return on a NOT IN with a NULL?
– Answer – How many rows return on a NOT IN with a NULL?
– How to handle a NOT IN with Potential NULL Values
– How a Correlated Exists matches up
– The Correlated NOT Exists Answer Set
– Quiz – How many rows come back from this NOT Exists?
– Answer – How many rows come back from this NOT Exists?
An IN List is much like a Subquery
SELECT *
FROM Employee_Table
WHERE Dept_No IN (100, 200)
This query is very simple and easy to understand. It uses an IN List to find all Employees who are in Dept_No 100 or Dept_No 200.
An IN List Never has Duplicates – Just like a Subquery
What is going on with this IN List? Why in the world are their duplicates in there? Will this query even work? What will the result set look like? Turn the page!
An IN List Ignores Duplicates
Duplicate values are ignored here. We got the same rows back as before and it is as if the system ignored the duplicate values in the IN List. That is exactly what happened.
The Subquery
There is a Top Query and a Bottom Query!
SELECT *
FROM Employee_Table
WHERE Dept_No IN (
SELECT Dept_No
FROM Department_Table) ;
The query above is a Subquery, which means there are multiple queries in the same SQL. The bottom query runs first and its purpose in life is to build a distinct list of values that it passes to the top query. The top query then returns the result set. This query solves the problem: Show all Employees in Valid Departments!
How a Basic Subquery Works
The Final Answer Set from the Subquery
Quiz- Answer the Difficult Question
How are Subqueries similar to Joins between two tables?
A great question was asked above. Do you know the key to answering? Turn the page!
Answer to Quiz- Answer the Difficult Question
How are Subqueries similar to Joins between two tables?
A Subquery between two tables or a Join between two tables will each need a common key that represents the relationship.
Just like Dept_No and Dept_No!
A Subquery will use a common key linking the two tables together very similar to a join! When subquerying between two tables look for the common link between the two tables. Most of the time they both have a column with the same name, but not always.
Should you use a Subquery of a Join?
When do I Subquery?
SELECT *
FROM Employee_Table
WHERE Dept_No IN (
SELECT Dept_No
FROM Department_Table) ;
When do I perform a Join?
SELECT E.*
FROM Employee_Table as E
Inner Join
Department_Table as D
ON E.Dept_No = D.Dept_No;
Both queries above return the same data. If you only want to see a report where the final result set has only columns from one table try a Subquery. Obviously, if you need columns on the report where the final result set has columns from both tables you have to do a Join.
Quiz- Write the Subquery
Write the Subquery
Select all columns in the Customer_Table if the customer has placed an order!
Here is your opportunity to show how smart you are. Write a Subquery that will bring back everything from the Customer_Table if the customer has placed an order in the Order_Table. Good luck! Advice: Look for the common key among both tables!
Answer to Quiz- Write the Subquery
Write the Subquery
Select all columns in the Customer_Table if the customer has placed an order!
SELECT *
FROM Customer_Table
WHERE Customer_Number IN (
SELECT Customer Number
FROM Order_Table) ;
Customer_Number | Customer_Name |
31323134 | ACE Consulting |
57896883 | XYZ Plumbing |
11111111 | Billy's Best Choice |
87323456 | Databases N-U |
The common key among both tables is Customer_Number. The bottom query runs first and delivers a distinct list of Customer_Numbers, which the top query uses in the IN List!
Quiz- Write the More Difficult Subquery
Write the Subquery
Select all columns in the Customer_Table if the customer has placed an order over $10,000.00 Dollars!
Here is your opportunity to show how smart you are. Write a Subquery that will bring back everything from the Customer_Table if the customer has placed an order in the Order_Table that is greater than $10,000.00.
Answer to Quiz- Write the More Difficult Subquery
Write the Subquery
Select all columns in the Customer_Table if the customer has placed an order over $10,000.00 Dollars!
SELECT *
FROM Customer_Table
WHERE Customer_Number IN (
SELECT Customer Number
FROM Order_Table
WHERE Order_Total > 10000.00) ;
Customer_Number | Customer_Name |
11111111 | Billy's Best Choice |
57896883 | XYZ Plumbing |
87323456 | Databases N-U |
Here is your answer!
Quiz- Write the Subquery with an Aggregate
Write the Subquery
Select all columns in the Employee_Table if the employee makes a greater Salary than the AVERAGE Salary.
Another opportunity knocking! Would someone please answer the query door!
Answer to Quiz- Write the Subquery with an Aggregate
Write the Subquery
Select all columns in the Employee_Table if the employee makes a greater Salary than the AVERAGE Salary.
SELECT * FROM Employee Table
WHERE Salary > (
SELECT AVG(Salary)
FROM Employee_Table) ;
Quiz- Write the Correlated Subquery
Write the Correlated Subquery
Select all columns in the Employee_Table if the employee makes a greater Salary than the AVERAGE Salary (Within their own Department).
Another opportunity knocking! This is a tough one and only the best get this written right.
Answer to Quiz- Write the Correlated Subquery
Write the Correlated Subquery
Select all columns in the Employee_Table if the employee makes a greater Salary than the AVERAGE Salary (Within their own Department).
SELECT * FROM Employee_Table as EE
WHERE Salary > (
SELECT AVG(Salary)
FROM Employee_Table as EEEE
WHERE EE.Dept_No = EEEE.Dept_No) ;
The Basics of a Correlated Subquery
The Top Query is Co-Related (Correlated) with the Bottom Query. The same table is used twice, but given a different alias both times. The bottom WHERE clause co-relates Dept_No from Top and Bottom.
SELECT *
FROM Employee_Table as EE
WHERE Salary > (
SELECT AVG(Salary)
FROM Employee_Table as EEEE
WHERE EE.Dept_No = EEEE.Dept_No) ;
Does the Top or Bottom Query run first?
The Top Query always runs first in a Correlated Subquery
SELECT *
FROM Employee_Table as EE
WHERE Salary > (
SELECT AVG(Salary)
FROM Employee_Table as EEEE
WHERE EE.Dept_No = EEEE.Dept_No) ;
The Top Query always runs first in a Correlated Subquery?
The Bottom Query runs Last in a Correlated Subquery
SELECT * FROM Employee_Table as EE
WHERE Salary > (
SELECT AVG(Salary)
FROM Employee_Table as EEEE
WHERE EE.Dept_No = EEEE.Dept_No) ;
The Bottom Query run 2nd to get the Average Salary once for each distinct Dept_No!
Quiz- Who is coming back in the Final Answer Set?
SELECT * FROM Employee_Table as EE
WHERE Salary > (
SELECT AVG(Salary)
FROM Employee_Table as EEEE
WHERE EE.Dept_No = EEEE.Dept_No) ;
Which Employees will be in the Final Answer Set?
Look at the results from the TOP Query on Left and then look at how the Bottom Query is run once per Dept_No and figure out in your head who is coming back.
SELECT * FROM Employee_Table as EE
WHERE Salary > (
SELECT AVG(Salary)
FROM Employee_Table as EEEE
WHERE EE.Dept_No = EEEE.Dept_No) ;
Which Employees will be in the Final Answer Set?
Correlated Subquery Example vs. a Join with a Derived Table
SELECT Last_Name, Dept_No, Salary
FROM Employee_Table as EE
WHERE Salary > (
SELECT AVG(Salary)
FROM Employee_Table as EEEE
WHERE EE.Dept_No = EEEE.Dept_No) ;
SELECT E.*, AVGSAL
FROM Employee_Table as E
INNER JOIN
(SELECT Dept_No, AVG(Salary)
FROM Employee_Table
GROUP BY Dept_No)
as TeraTom (Depty, AVGSAL)
ON Dept_No = Depty
AND Salary > AVGSAL ;
Correlated Subquery
Last_Name | Dept_No | Salary |
Smith | 200 | 48000.00 |
Harrison | 400 | 54500.00 |
Strickling | 400 | 54500.00 |
Join with a Derived Table
Both queries above will bring back all employees making a salary that is greater than the average salary in their department. The biggest difference is that the Join with the Derived Table also shows the Average Salary in the result set.
Correlated Subquery that Finds Duplicates
There is another challenge faced by many people today and it relates to improper controls being used in the front-end systems and human error. Specifically, duplicate records can be a problem in the data as well as in the functioning of the organization. For instance, a physician may bill the healthcare insurance company twice and they might even pay it twice. Either way, this activity should exist in the data warehouse. The following query can be used to find this erroneous type of data or occurrence:
Quiz- Write the NOT Subquery
Write the Subquery
Select all columns in the Customer_Table if the Customer has NOT placed an order.
Another opportunity knocking! Write the above query!
Answer to Quiz- Write the NOT Subquery
Write the Subquery
Select all columns in the Customer_Table if the Customer has NOT placed an order.
SELECT *
FROM Customer_Table
WHERE Customer_Number
NOT IN
(SELECT Customer_Number
FROM Order_Table
WHERE Customer_Number
IS NOT NULL) ;
SELECT *
FROM Customer_Table
WHERE Customer_Number
NOT = ALL
(SELECT Customer_Number
FROM Order_Table
WHERE Customer_Number
IS NOT NULL) ;
Wow! You can see that both queries are the same with just a few different techniques.
Quiz- Write the Subquery using a WHERE Clause
Write the Subquery
Select all columns in the Order_Table that were placed by a customer with ‘Bill’ anywhere in their name.
Another opportunity to show your brilliance is ready for you to make it happen.
Answer to Quiz- Write the Subquery using a WHERE Clause
Write the Subquery
Select all columns in the Order_Table that were placed by a customer with ‘Bill’ anywhere in their name.
SELECT * FROM Order_Table
WHERE Customer_Number IN
(SELECT Customer_Number FROM Customer_Table
WHERE Customer_Name LIKE ‘%Bill%’) ;
Great job on writing your query just like the above.
Quiz- Write the Subquery with Two Parameters
Write the Subquery
What is the highest dollar order for each Customer? This Subquery will involve two parameters!
Get ready to be amazed at either yourself or the Answer on the next page!
Answer to Quiz- Write the Subquery with Two Parameters
Write the Subquery
What is the highest dollar order for each Customer? This Subquery will involve two parameters!
SELECT Customer_Number, Order_Number, Order_Total
FROM Order_Table
WHERE (Customer_Number, Order_Total) IN
(SELECT Customer_Number, MAX(Order_Total)
FROM Order_Table GROUP BY 1) ;
This is how you utilize multiple parameters in a Subquery! Turn the page for more.
How the Double Parameter Subquery Works
SELECT Customer_Number, Order_Number, Order_Total
FROM Order_Table
WHERE (Customer_Number, Order_Total) IN
(SELECT Customer_Number, MAX(Order_Total)
FROM Order_Table GROUP BY 1) ;
The bottom query runs first returning two columns. Next page for more info!
More on how the Double Parameter Subquery Works
The IN list is built and the top query can now process for the final Answer Set.
Quiz – Write the Triple Subquery
Write the Subquery
What is the Customer_Name who has the highest dollar order among all customers? This query will have multiple Subqueries!
Good luck in writing this. Remember that this will involve multiple Subqueries.
Answer to Quiz – Write the Triple Subquery
Write the Subquery
What is the Customer_Name who has the highest dollar order among all customers? This query will have multiple Subqueries!
SELECT Customer_Name
FROM Customer_Table
WHERE Customer_Number IN
(SELECT Customer_Number FROM Order_Table
WHERE Order_Total IN
(SELECT Max(Order_Total) FROM Order_Table)) ;
The query is above and of course the answer is XYZ Plumbing.
Quiz – How many rows return on a NOT IN with a NULL?
How many rows return from the query conceptually?
We really didn't place a new row inside the Order_Table with a NULL Customer_Number, but in theory if we had how many rows would return?
Answer – How many rows return on a NOT IN with a NULL?
How many rows return from the query conceptually? ZERO
The answer is no rows. This is because when you have a NULL value in a NOT IN list the system doesn't know the value of NULL so it returns nothing.
How to handle a NOT IN with Potential NULL Values
How many rows return NOW from the query? 1 Acme Products
You can utilize a WHERE clause that tests to make sure Customer_Number IS NOT NULL. This should be used when a NOT IN could encounter a NULL.
IN is equivalent to =ANY
Instead of an IN you can use the = ANY
SELECT Customer_Number
,Customer_Name
FROM Customer_Table
WHERE Customer_Number IN
(SELECT Customer_Number
FROM Order_Table ) ;
SELECT Customer_Number
,Customer_Name
FROM Customer_Table
WHERE Customer_Number = ANY
(SELECT Customer_Number
FROM Order_Table ) ;
Instead of using the IN, you can use the = ANY command. These queries work the SAME. The above queries will produce the same result set.
Using a Correlated Exists
Use EXISTS to find which Customers have placed an Order?
SELECT Customer_Number, Customer_Name
FROM Customer_Table as Top1
WHERE EXISTS
(SELECT * FROM Order_Table as Bot1
Where Top1.Customer_Number = Bot1.Customer_Number ) ;
The EXISTS command will determine via a Boolean if something is True or False. If a customer placed and order it EXISTS and using the Correlated Exists statement only customers who have placed an order will return in the answer set. EXISTS is different than IN as it is less restrictive as you will soon understand.
How a Correlated Exists matches up
SELECT Customer_Number, Customer_Name
FROM Customer_Table as Top1
WHERE EXISTS
(SELECT * FROM Order_Table as Bot1
Where Top1.Customer_Number = Bot1.Customer_Number ) ;
Only customers who placed an order return with the above Correlated EXISTS.
The Correlated NOT Exists
Use NOT EXISTS to find which Customers have NOT placed an Order?
SELECT Customer_Number, Customer_Name
FROM Customer_Table as Top1
WHERE NOT EXISTS
(SELECT * FROM Order_Table as Bot1
Where Top1.Customer_Number = Bot1.Customer_Number ) ;
The EXISTS command will determine via a Boolean if something is True or False. If a customer placed and order it EXISTS and using the Correlated Exists statement only customers who have placed an order will return in the answer set. EXISTS is different than IN as it is less restrictive as you will soon understand.
The Correlated NOT Exists Answer Set
Use NOT EXISTS to find which Customers have NOT placed an Order?
SELECT Customer_Number, Customer_Name
FROM Customer_Table as Top1
WHERE NOT EXISTS
(SELECT * FROM Order_Table as Bot1
Where Top1.Customer_Number = Bot1.Customer_Number ) ;
The only customer who did NOT place an order was Acme Products.
Quiz – How many rows come back from this NOT Exists?
We added a Null Value to the Order_Table
SELECT Customer_Number, Customer_Name
FROM Customer_Table as Top1
WHERE NOT EXISTS
(SELECT * FROM Order_Table as Bot1
Where Top1.Customer_Number = Bot1.Customer_Number ) ;
How many rows return from the query conceptually?
A NULL value in a list for queries with NOT IN returned nothing, but you must now decide if that is also true for the NOT EXISTS. How many rows will return?
Answer – How many rows come back from this NOT Exists?
We added a Null Value to the Order_Table
SELECT Customer_Number, Customer_Name
FROM Customer_Table as Top1
WHERE NOT EXISTS
(SELECT * FROM Order_Table as Bot1
Where Top1.Customer_Number = Bot1.Customer_Number ) ;
How many rows return from the query conceptually? 1
NOT EXISTS is unaffected by a NULL in the list, that's why it is more flexible!