“An invasion of Armies can be resisted, but not an idea whose time has come.”
- Victor Hugo
This query is easy to understand. It uses an IN List to find all Employees who are in Dept_No 100 or Dept_No 200.
What is going on with this IN List? Why in the world are their duplicates in there? Will this query even work? What will the result set look like? Duplicate values are ignored here. We got the same rows back as before, and it is as if the system ignored the duplicate values in the IN List. That is exactly what happened.
The query above is a Subquery which means there are multiple queries in the same SQL. The bottom query runs first, and its purpose in life is to build a distinct list of values that it passes to the top query. The top query then returns the result set. This query solves the problem: Show all Employees in Valid Departments!
The bottom query runs first and builds a distinct IN list. Then the top query runs using the list.
Both queries above are the same. Query 2 has values in an IN list. Query 1 runs a subquery to build the values in the IN list.
How are Subqueries similar to Joins between two tables?
A great question was asked above. Do you know the key to answering? Turn the page!
How are Subqueries similar to Joins between two tables?
A Subquery between two tables or a Join between two tables will each need a common key
that represents the relationship. This is called a Primary Key/Foreign Key relationship.
A Subquery will use a common key linking the two tables together very similar to a join! When subquerying between two tables, look for the common link between the two tables. Most of the time they both have a column with the same name but not always.
If you only want to see a report where the final result set has only columns from one table, use a Subquery. Obviously, if you need columns on the report where the final result set has columns from both tables, you have to do a Join.
Write the Subquery
Select all columns in the Customer_Table if the customer has placed an order!
Here is your opportunity to show how smart you are. Write a Subquery that will bring back everything from the Customer_Table if the customer has placed an order in the Order_Table. Good luck! Advice: Look for the common key among both tables!
The common key among both tables is Customer_Number. The bottom query runs first and delivers a distinct list of Customer_Number values which the top query uses in the IN List!
Write the Subquery
Select all columns in the Customer_Table if the customer
has placed an order over $10,000.00 Dollars!
Here is your opportunity to show how smart you are. Write a Subquery that will bring back everything from the Customer_Table if the customer has placed an order in the Order_Table that is greater than $10,000.00.
Here is your answer!
Write SQL that will bring back an answer set that selects all columns from
the Student_Table if that student is taking a course that has four (4) credits.
Use a subquery to get the answer set requested above. The answer is on the next page.
Above is something to enjoy and learn from.
Write the Subquery
Select all columns in the Employee_Table if the employee makes
a greater Salary than the AVERAGE Salary.
Another opportunity knocking! Would someone please answer the query door?
Notice that we are no longer using an IN clause, but instead a greater than sign.
Write the Correlated Subquery
Select all columns in the Employee_Table if the employee makes a
greater Salary than the AVERAGE Salary (within their own Department).
Another opportunity knocking! This is a tough one, and only the best get this written correctly.
A Correlated subquery uses a column from the top query in the WHERE clause on the bottom query. This co-relates the top and bottom queries, thus the name correlated subquery. Since we wanted to see all salaries greater than the average salary within their own Dept_No the correlating column is Dept_No. Both tables are aliased so the WHERE clause is correlated.
The Top Query is Co-Related (Correlated) with the Bottom Query.
The table name from the top query and the table name from the bottom query are given a different alias.
The bottom query WHERE clause co-relates Dept_No from Top and Bottom.
The top query is run first.
The bottom query is run one time for each distinct value delivered from the top query.
SELECT *
FROMEmployee_Table as EE
WHERESalary > (
SELECTAVG(Salary)
FROMEmployee_Table as EEEE
WHEREEE.Dept_No = EEEE.Dept_No) ;
A correlated subquery breaks all the rules. It is the top query that runs first. Then, the bottom query is run one time for each distinct column in the bottom WHERE clause. In our example, this is the column Dept_No. This is because in our example, the WHERE clause is comparing the column Dept_No. After the top query runs and brings back its rows, the bottom query will run one time for each distinct Dept_No. If this is confusing, it is not you. These take a little time to understand, but I have a plan to make you an expert. Keep reading!
The top query runs first and then the bottom query is only run once per distinct Dept_No.
Both queries above will bring back all employees making a salary that is greater than the average salary in their department. The biggest difference is that the Join with the Derived Table also shows the Average Salary in the result set.
Select all columns in the Sales_Table if the Daily_Sales column is
greater than the Average Daily_Sales within its own Product_ID.
Another opportunity knocking! This is your second chance. I will even give you a third chance.
Select all columns in the Sales_Table if the Daily_Sales column is
greater than the Average Daily_Sales within its own Product_ID.
SELECT * FROM Sales_Table as TopS
WHEREDaily_Sales > (
SELECTAVG(Daily_Sales)
FROMSales_Table as BotS
WHERETopS.Product_ID = BotS.Product_ID)
ORDER BY Product_ID, Sale_Date ;
Notice that it is the Product_Id in the bottom WHERE clause.
Select all columns in the Sales_Table if the Daily_Sales column is
greater than the Average Daily_Sales within its own Sale_Date.
Another opportunity knocking! There is just one minor adjustment and you are home free.
Select all columns in the Sales_Table if the Daily_Sales column is
greater than the Average Daily_Sales within its own Sale_Date.
SELECT * FROM Sales_Table as TopS
WHEREDaily_Sales > (
SELECTAVG(Daily_Sales)
FROMSales_Table as BotS
WHERETopS.Sale_Date = BotS.Sale_Date)
ORDER BY Sale_Date ;
Notice that it is the Sale_Date in the bottom WHERE clause. Plus, we threw in an ORDER BY that is outside of the subquery.
Select all columns in the Student_Table if the Grade_Pt column
is greater than the Average Grade_Pt within its own Class_Code.
Another opportunity knocking! There is just one minor adjustment and you are home free.
Select all columns in the Student_Table if the Grade_Pt column
is greater than the Average Grade_Pt within its own Class_Code.
SELECT * FROM Student_Table as TopS
WHEREGrade_Pt > (
SELECTAVG(Grade_Pt)
FROMStudent_Table as BotS
WHERETopS. Class_Code = BotS.Class_Code )
ORDER BY Class_Code ;
Write a correlated subquery that will bring back an answer set that returns all columns from the Course_Table if that course is being taken by a student who has a greater than average grade point within their own class code.
Use a subquery to get the answer set requested above. The answer is on the next page.
SELECT *
FROM Course_Table
WHERE Course_ID IN
(SELECT Course_ID
FROM Student_Course_Table
WHERE Student_ID IN
(SELECT Student_ID
FROM Student_Table AS s1
WHERE Grade_Pt >
(SELECT AVG(Grade_Pt)
FROM Student_Table AS s2
WHERE s1.Class_Code=s2.Class_Code)
)
) ;
Above is something to enjoy and learn from.
Write the Subquery
Select all columns in the Customer_Table if the
Customer has NOT placed an order.
Another opportunity knocking! Write the above query!
When a NOT IN subquery encounters a NULL value it returns nothing. Since the bottom query is passing up the Customer_Number to the top query, if there are NULL values in any Customer_Number, the top query returns nothing. That is why we used the IS NOT NULL statement in the bottom WHERE clause.
Write the Subquery
Select all columns in the Order_Table that were placed
by a customer with ‘Bill’ anywhere in their name.
Write the above query and then check out the results on the next page.
Great job on writing your query just like the above.
Write the Subquery
What is the highest dollar order for each Customer?
This Subquery will involve two parameters!
Get ready to be amazed at either yourself or the Answer on the next page!
This is how you utilize multiple parameters in a Subquery! Turn the page for more.
SELECT Customer_Number, Order_Number, Order_Total
FROM Order_Table
WHERE (Customer_Number, Order_Total) IN
(SELECT Customer_Number, MAX(Order_Total)
FROM Order_Table GROUP BY 1) ;
The bottom query runs first returning two columns. Next page for more info!
The IN list is built and the top query can now process for the final Answer Set.
Write the Subquery
What is the Customer_Name who has the highest dollar order
among all customers? This query will have multiple Subqueries!
Good luck in writing this. Remember that this will involve multiple Subqueries.
The answer is XYZ Plumbing.
How many rows return from the query now that a
NULL value is in a Customer_Number?
We really didn’t place a new row inside the Order_Table with a NULL value for the Customer_Number column, but in theory, if we had, how many rows would return?
How many rows return from the query now that a
NULL value is in a Customer_Number?
ZERO rows will return
The answer is no rows come back. This is because when you have a NULL value in a NOT IN list, the system doesn’t know the value of NULL, so it returns nothing.
How many rows return NOW from the query? 1 Acme Products
You can utilize a WHERE clause that tests to make sure Customer_Number IS NOT NULL. This should be used when a NOT IN could encounter a NULL.
Instead of using the IN, you can use the = ANY command. These queries work the SAME. The above queries will produce the same result set.
The EXISTS command will determine via a Boolean if something is True or False. If a customer placed an order, it EXISTS, and using the Correlated Exists statement, only customers who have placed an order will return in the answer set.
SELECT Customer_Number, Customer_Name
FROM Customer_Table as Top1
WHERE EXISTS
(SELECT * FROM Order_Table as Bot1
Where Top1.Customer_Number = Bot1.Customer_Number ) ;
Customer_Number Customer_Name
_________________ _________________
11111111 Billy’s Best Choice
31323134 ACE Consulting
57896883 XYZ Plumbing
87323456 Databases N-U
Only customers who placed an order return with the above Correlated EXISTS.
The EXISTS command will determine via a Boolean if something is True or False. If a customer has not placed an order, it does not EXIST, and using the Correlated Exists statement, only customers who have not placed an order will return in the answer set. Null values do not affect a NOT EXIST statement like they do a NOT IN statement.
Use NOT EXISTS to find which Customers have NOT placed an Order?
SELECT Customer_Number, Customer_Name
FROM Customer_Table as Top1
WHERE NOT EXISTS
(SELECT * FROM Order_Table as Bot1
Where Top1.Customer_Number = Bot1.Customer_Number ) ;
Customer_Number Customer_Name
________________ ______________
31313131 Acme Products
The only customer who did NOT place an order was Acme Products.
SELECT Customer_Number, Customer_Name
FROM Customer_Table as Top1
WHERE NOT EXISTS
(SELECT * FROM Order_Table as Bot1
Where Top1.Customer_Number = Bot1.Customer_Number ) ;
How many rows return from the query?
A NULL value in a list for queries with NOT IN returned nothing, but you must now decide if that is also true for the NOT EXISTS. How many rows will return?
SELECT Customer_Number, Customer_Name
FROM Customer_Table as Top1
WHERE NOT EXISTS
(SELECT * FROM Order_Table as Bot1
Where Top1.Customer_Number = Bot1.Customer_Number ) ;
How many rows return from the query?
One row
Acme Products
NOT EXISTS is unaffected by a NULL in the list. That’s why it is more flexible!