Overview 124
Introduction 124
Objectives 124
Prerequisites 125
Summary 149
Quiz 151
Suppose you are generating a report based on data from a health clinic. You want to display the results of individual patient stress tests taken in 1998 (which are stored in table A), followed by the results from stress tests taken in 1999 (which are stored in table B). Instead of combining the table rows horizontally, as you would in a PROC SQL join, you want to combine the table rows vertically (one on top of the other).
When you need to select data from multiple tables and combine the tables vertically, PROC SQL can be an efficient alternative to using other SAS procedures or the DATA step. In a PROC SQL set operation, you use one of four set operators (EXCEPT, INTERSECT, UNION, and OUTER UNION) to combine tables (and views) vertically by combining the results of two queries:
proc sql;
select *
from a
set-operator
select *
from b;
Each set operator combines the query results in a different way.
In this chapter, you will learn how to use the various set operators, with or without the optional keywords ALL and CORR (CORRESPONDING), to combine the results of multiple queries.
In this chapter, the references to tables are also applicable to views, unless otherwise noted. Δ
In this chapter, you learn to
combine the results of multiple PROC SQL queries in different ways by using the set operators EXCEPT, INTERSECT, UNION, and OUTER UNION
modify the results of a PROC SQL set operation by using the keywords ALL and CORR (CORRESPONDING)
compare PROC SQL outer unions with other SAS techniques.
A set operation is a SELECT statement that contains
two groups of query clauses (each group beginning with a SELECT clause)
a set operator
one or both of the keywords ALL and CORR (CORRESPONDING).
In the following PROC SQL step, the SELECT statement contains one set operation. The set operation uses the set operator UNION to combine the result of a query on the table Sasuser.Stress98 with the result of a query on the table Sasuser.Stress99.
proc sql; select * from sasuser.stress98 union select * from sasuser.stress99;
You will learn the details about using each set operator later in this chapter.
PROC SQL evaluates a SELECT statement with one set operation as follows:
Each query is evaluated to produce an intermediate (internal) result table.
Each intermediate result table then becomes an operand linked with a set operator to form an expression (for example, Table1 UNION Table2
).
PROC SQL evaluates the entire expression to produce a single output result set.
A single SELECT statement can contain more than one set operation. Each additional set operation includes a set operator and a group of query clauses, as shown in the following example:
proc sql; select * from table1 set-operator select * from table2 set-operator select * from table3;
This SELECT statement uses two set operators to link together three queries. Regardless of the number of set operations in a SELECT statement, the statement contains only one semicolon, which is placed after the last group of query clauses.
The following PROC SQL step contains two set operators (both are OUTER UNION) that combine three queries:
proc sql; select * from sasuser.mechanicslevel1 outer union select *
from sasuser.mechanicslevel2 outer union select * from sasuser.mechanicslevel3;
When PROC SQL evaluates a SELECT statement that contains multiple set operations, an additional processing step (step 3 below) is required:
Each query is evaluated to produce an intermediate (internal) result table.
Each intermediate result table then becomes an operand linked with a set operator to form an expression (for example, Table1 UNION Table2
).
If the set operation contains more than two queries, then the result from the first two queries (enclosed in parentheses in the following examples) becomes an operand for the next set operator and operand. For example:
with two set operators: (Table1 UNION Table2) EXCEPT Table3
with three set operators: ((Table1 UNION Table2) EXCEPT Table3) INTERSECT Table4
.
PROC SQL evaluates the entire expression to produce a single output result set.
When processing set operators, PROC SQL follows a default order of precedence, unless this order is overridden by parentheses in the expression(s). By default, INTERSECT is evaluated first. OUTER UNION, UNION, and EXCEPT all have the same level of precedence. Δ
Each of the four set operators EXCEPT, INTERSECT, UNION, and OUTER UNION selects rows and handles columns in a different way, as described below.
In the following chart, Table 1 is the table that is referenced in the first query and Table 2 is the table that is referenced in the second query. Δ
Set Operator | Treatment of Rows | Treatment of Columns | Example |
---|---|---|---|
EXCEPT | Selects unique rows from the first table that are not found in the second table. | Overlays columns based on their position in the SELECT clause without regard to the individual column names. | proc sql; select * from table1 except select * from table2; |
INTERSECT | Selects unique rows that are common to both tables. | Overlays columns based on their position in the SELECT clause without regard to the individual column names. | proc sql; select * from table1 intersect select * from table2; |
UNION | Selects unique rows from one or both tables. | Overlays columns based on their position in the SELECT clause without regard to the individual column names. | proc sql; select * from table1 union select * from table2; |
OUTER UNION | Selects all rows from both tables. The OUTER UNION operator concatenates the results of the queries. | Does not overlay columns. | proc sql; select * from table1 outer union select * from table2; |
A set operator that selects only unique rows will display one occurrence of a given row in output. Δ
When processing a set operation that displays only unique rows (a set operation that contains the set operator EXCEPT, INTERSECT, or UNION), PROC SQL makes two passes through the data, by default:
PROC SQL eliminates duplicate (nonunique) rows in the tables.
PROC SQL selects the rows that meet the criteria and, where requested, overlays columns.
For set operations that display both unique and duplicate rows, only one pass through the data (step 2 above) is required.
You can use a set operation to combine tables that have different numbers of columns and rows or that have columns in a different order.
Three of the four set operators (EXCEPT, INTERSECT, and UNION) combine columns by overlaying them. (The set operator OUTER UNION does not overlay columns.)
By default, the set operators EXCEPT, INTERSECT, and UNION overlay columns based on the relative position of the columns in the SELECT clause. Column names are ignored. You control how PROC SQL maps columns in one table to columns in another table by specifying the columns in the appropriate order in the SELECT clause. The first column specified in the first query's SELECT clause and the first column specified in the second query's SELECT clause are overlaid, and so on.
When columns are overlaid, PROC SQL uses the column name from the first table (the table referenced in the first query). If there is no column name in the first table, the column name from the second table is used. When the SELECT clause contains an asterisk (*) instead of a list of column names, the set operation combines the tables (and, if applicable, overlays columns) based on the positions of the columns in the tables.
For example, the following set operation uses the set operator EXCEPT, so columns are overlaid. The SELECT clause in each query uses an asterisk (*), so the columns are overlaid based on their positions in the tables. The first column in table One (X)is overlaid on the first column in table Two (X), and so on.
In order to be overlaid, columns in the same relative position in the two SELECT clauses must have the same data type. If they do not, PROC SQL generates a warning message in the SAS log and stops executing. For example, in the tables shown above, if the column One.X had a different data type than column Two.X, the SAS log would display the following error message.
Table 4.1. SAS Log
|
Next, we will use the keywords ALL and CORR to modify the default action of the set operators.
To modify the behavior of set operators, you can use either or both of the keywords ALL and CORR immediately following the set operator:
proc sql;
select *
from table
set-operator <all> <corr>
select *
from table2;
The use of each keyword is described below.
Keyword | Action | Used When… |
---|---|---|
ALL | Makes only one pass through the data and does not remove duplicate rows. | You do not care if there are duplicates. Duplicates are not possible. ALL cannot be used with OUTER UNION. |
CORR (or CORRESPONDING) | Compares and overlays columns by name instead of by position:
If an alias is assigned to a column in the SELECT clause, CORR will use the alias instead of the permanent column name. | Two tables have some or all columns in common, but the columns are not in the same order. |
In the remainder of this chapter, you will learn more about the use of each set operator, with and without the keywords ALL and CORR.
The set operator EXCEPT does both of the following:
selects unique rows from the first table (the table specified in the first query) that are not found in the second table (the table specified in the second query)
overlays columns.
Consider how EXCEPT works when used alone and with the keywords ALL and CORR.
Suppose you want to display the unique rows in table One that are not found in table Two. The PROC SQL set operation that includes the EXCEPT operator, the tables One and Two, and the output of the set operation are shown below:
The set operator EXCEPT overlays columns by their position. In this output, the following columns are overlaid:
the first columns, One.X and Two.X, both of which are numeric
the second columns, One.A and Two.B, both of which are character.
The column names from table One are used, so the second column of output is named A rather than B.
Consider how PROC SQL selects rows from table One to display in output.
In the first pass, PROC SQL eliminates any duplicate rows from the tables. As shown below, there is one duplicate row: in table One, the second row is a duplicate of the first row. All remaining rows in table One are still candidates in PROC SQL's selection process.
In the second pass, PROC SQL identifies any rows in table One for which there is a matching row in table Two and eliminates them. There is one matching row in the two tables, as shown below, which is eliminated.
The five remaining rows in table One, the unique rows, are displayed in the output.
To select all rows in the first table (both unique and duplicate) that do not have a matching row in the second table, add the keyword ALL after the EXCEPT set operator. The modified PROC SQL set operation, the tables One and Two, and the output are shown below:
The output now contains six rows. PROC SQL has again eliminated the one row in table One (the fifth row) that has a matching row in table Two (the fourth row). Remember that when the keyword ALL is used with the EXCEPT operator, PROC SQL does not make an extra pass through the data to remove duplicate rows within table One. Therefore, the second row in table One, which is a duplicate of the first row, is now included in the output.
To display both of the following, add the keyword CORR after the set operator.
only columns that have the same name
all unique rows in the first table that do not appear in the second table.
The modified PROC SQL set operation, the tables One and Two, and the output are shown below:
X is the only column that has the same name in both tables, so X is the only column that PROC SQL examines and displays in the output.
In the first pass, PROC SQL eliminates the second and third rows of table One from the output because they are not unique within the table; they contain values of X that duplicate the value of X in the first row of table One. In the second pass, PROC SQL eliminates the first, fourth, and fifth rows of table One because each contains a value of X that matches a value of X in a row of table Two. The output displays the two remaining rows in table One, the rows that are unique in table One and that do not have a row in table Two that has a matching value of X.
If the keywords ALL and CORR are used together, the EXCEPT operator will display all unique and duplicate rows in the first table that do not appear in the second table, and will overlay and display only columns that have the same name. The modified PROC SQL set operation, the tables One and Two, and the output are shown below:
Once again, PROC SQL looks at and displays only the column that has the same name in the two tables: X. Because the ALL keyword is used, PROC SQL does not eliminate any duplicate rows in table One. Therefore, the second and third rows in table One, which are duplicates of the first row in table One, appear in the output. PROC SQL does eliminate the first, fourth, and fifth rows in table One from the output because for each one of these three rows there is a corresponding row in table Two that has a matching value of X.
As this example shows, when the ALL keyword is used with the EXCEPT operator, a row in table One cannot be eliminated from the output unless it has a separate matching row in table Two. Table One contains three rows in which the value of X is 1, but table Two contains only one row in which the value of X is 1. That one row in table Two causes the first of the three rows in table One that have a matching value of X to be eliminated from the output. However, table Two does not have two additional rows in which the value of X is 1, so the other two rows in table One are not eliminated, and do appear in the output.
The EXCEPT operator can be used to solve a realistic business problem. Suppose you want to display the names of all new employees of a company. There is no table that contains information for only the new employees, so you will have to use data from the following two tables.
Table | Relevant Columns |
---|---|
Sasuser.Staffchanges lists information for all new employees and existing employees who have had a change in salary or job code |
|
Sasuser.Staffmaster lists information for all existing employees |
|
The relationship between these two tables is shown in the diagram below:
The intersection of these two tables includes information for all existing employees who have had changes in job code or salary. The shaded portion, the portion of Sasuser.Staffchanges that does not overlap with Sasuser.Staffmaster, includes information for the people that you want: new employees.
To separate the new employees from the existing employees in Sasuser.Staffchanges, you create a set operation that displays all rows from the first table (Sasuser.Staffchanges) that do not exist in the second table (Sasuser.Staffmaster). The following PROC SQL step solves the problem:
proc sql;
select firstname, lastname
from sasuser.staffchanges
except all
select firstname, lastname
from sasuser.staffmaster;
This PROC SQL set operation includes the operator EXCEPT and the keyword ALL. Although you do not want the output to contain duplicate rows, you already know that there are no duplicates in these two tables. Therefore, ALL is specified to prevent PROC SQL from making an extra pass through the data, which speeds up the processing of this query.
PROC SQL compares only the columns that are specified in the SELECT clauses, and these columns are compared in the order in which they are specified. The output displays the first and last names of the two new employees.
In a set operation that uses the EXCEPT operator, the order in which the tables are listed in the SELECT statement makes a difference. If the tables in this example were listed in the opposite order, the output would display all existing employees who have had no changes in salary or job code. Δ
This example is a variation of the preceding set operation. Suppose you want to display the number of existing employees who have had no changes in salary or job code. Once again, the query uses the following tables and columns.
Table | Relevant Columns |
---|---|
Sasuser.Staffchanges lists information for all new employees and existing employees who have had a change in salary or job code |
|
Sasuser.Staffmaster lists information for all existing employees |
|
The following PROC SQL query solves this problem:
proc sql; select count(*) label='No. of Persons' from (select EmpID from sasuser.staffmaster except all select EmpID from sasuser.staffchanges);
This PROC SQL query uses
the COUNT function with an asterisk (*) as an argument to count the number of employee IDs returned from the set operation
the set operator EXCEPT within an in-line view.
The in-line view returns a virtual table that contains employees who have had no changes in salary or job code. This virtual table is then passed to the COUNT(*) summary function, which counts the number of rows in the virtual table. The output shows that there are 144 existing employees who have had no changes in salary or job code.
The set operator INTERSECT does both of the following:
selects unique rows that are common to both tables
overlays columns.
The following example demonstrates how INTERSECT works when used alone and with the keywords ALL and CORR.
The INTERSECT operator compares and overlays columns in the sameway as the EXCEPT operator, by column position instead of column name. However, INTERSECT selects rows differently and displays in output the unique rows that are common to both tables. The following PROC SQL set operation uses the INTERSECT operator to combine the tables One and Two, which were introduced previously:
Tables One and Two have only one unique row in common and this row is displayed in the output. (This is the same row that was eliminated in the earlier example that contained the EXCEPT operator.)
Adding the keyword ALL to the preceding PROC SQL query prevents PROC SQL from making an extra pass through the data. If there were any rows common to tables One and Two that were duplicates of other common rows, they would also be included in output. However, as you have seen, there is only one common row in these tables. The modified PROC SQL query, the tables One and Two, and the output are shown below:
As before, there is just one row of output.
To display the unique rows that are common to the two tables based on the column name instead of the column position, add the CORR keyword to the PROC SQL set operation. The modified query, the tables One and Two, and the output are shown below:
X is the only column name that is common to both tables, so X is the only column that PROC SQL examines and displays in the output. In the first pass, PROC SQL eliminates the rows that are duplicated within each table: the second and third rows in table One contain the same value for X as the first row, and the fourth row in table Two contains the same value for X as the third row. In the second pass, PROC SQL eliminates any rows that are not common across tables: the fourth and fifth rows in table One and the fifth row in table Two do not have a matching value of X in the other table. The output displays the three rows with unique values of X that are also common to both tables.
If the keywords ALL and CORR are used together, the INTERSECT operator will display all unique and nonunique (duplicate) rows that are common to the two tables, based on columns that have the same name. The modified query, the tables One and Two, and the output are shown below:
PROC SQL examines and displays only the column with the same name, X. There are three common rows across the two tables, which are highlighted above, and these are the three rows that are displayed in the output.
Note that each of the tables contains at least one other row that duplicates a value of X in one of the common rows. For example, in the second and third rows in table One, the value of X is 1, as in one of the common rows. However, in order to be considered a common row and to be included in the output, every duplicate row in one table must have a separate duplicate row in the other table. In this example, there are no rows that have duplicate values and that are also common across tables. Therefore, in this example, the set operation with the keywords ALL and CORR generates the same output as with the keyword CORR alone.
Now that you have seen how the INTERSECT set operator works with very small tables, we can use INTERSECT in a realistic business problem. Suppose you want to display the names of the existing employees who have changed their salary or job code. (This query is the opposite of the query that you solved with the EXCEPT operator.)
Once again, you will use the following tables.
Table | Relevant Columns |
---|---|
Sasuser.Staffchanges lists information for all new employees and existing employees who have had a change in salary or job code |
|
Sasuser.Staffmaster lists information for all existing employees |
|
The relationship between these two tables is shown in the diagram below:
As shown in the earlier example with EXCEPT, the intersection of these two tables includes information for all existing employees who have had changes in job code or salary. It is the intersection of these two tables, shaded above, that you want to display.
To display the unique rows that are common to both tables, you use a PROC SQL set operation that contains INTERSECT. It is known that these tables contain no duplicates, so ALL is used to speed up query processing. The PROC SQL set operation is shown below:
proc sql;
select firstname, lastname
from sasuser.staffchanges
intersect all
select firstname, lastname
from sasuser.staffmaster;
In this PROC SQL step, which contains just one INTERSECT set operator, the order in which you list the tables in the SELECT statement does not make a difference. However, in a more complex PROC SQL step that contains multiple stacked INTERSECT set operators, it is important to think through the table order carefully, depending on when you want the non-matches to be eliminated.
The output shows that there are four existing employees who have changed their salary or job code. Δ
The set operator UNION does both of the following:
selects unique rows from both tables together
overlays columns.
The following example demonstrates how UNION works when used alone and with the keywords ALL and CORR.
To display all rows from the tables One and Two that are unique in the combined set of rows from both tables, use a PROC SQL set operation that includes the UNION operator:
With the UNION operator, PROC SQL first concatenates and sorts the rows from the two tables, and eliminates any duplicate rows. In this example, two rows are eliminated: the second row in table One is a duplicate of the first row, and the fourth row in table Two matches the fifth row in table One. All remaining rows, the unique rows, are included in the output. The columns are overlaid by position.
When the keyword ALL is added to the UNION operator, the output displays all rows from both tables, both unique and duplicate. The modified PROC SQL set operation, the tables One and Two, and the new output are shown below:
When the ALL keyword is used, PROC SQL does not remove duplicates or sort the rows. The output now includes the two duplicate rows that were eliminated in the previous example: the second row in table One and the fourth row in table Two. Note that the rows are in a different order in this output than they were in the output from the previous set operation.
To display all rows from the tables One and Two that are unique in the combined set of rows from both tables, based on columns that have the same name rather than the same position, add the keyword CORR after the set operator. The modified query, the tables One and Two, and the output are shown below:
X is the only column name that is common to both tables, so X is the only column that PROC SQL examines and displays in the output. In the combined set of rows from the two tables, there are duplicates of the values 1, 2, and 3, and these duplicate rows are eliminated from the output. The output displays the six unique values of X.
If the keywords ALL and CORR are used together, the UNION operator will display all rows in the two tables both unique and duplicate, based on the columns that have the same name. In this example, the output displays all 12 values for X, the one column that has the same name in both tables.
The UNION operator can be used to solve a realistic business problem. Suppose you are generating a report based on data from a health clinic. You want to display the results of individual patient stress tests taken in 1998, followed by the results from stress tests taken in 1999. To do this, you will use the UNION operator to combine the tables Sasuser.Stress98 and Sasuser.Stress99. These two tables are similar in structure:
both tables contain nine columns that have the same names
each row contains data for an individual patient.
You are not sure whether the tables contain duplicate records, but you do not want duplicates in your output. Because the tables have the same column structure, you can overlay the columns by position and the CORR keyword is not necessary. The PROC SQL set operation and output are shown below:
If you can determine that these tables have no duplicate records, you could add the keyword ALL to speed up processing by avoiding an extra pass through the data.
We can demonstrate another realistic business problem, to see how summary functions can be used with a set operator (in this case, UNION). Suppose you want to display the following summarized data for members of a frequent-flyer program: total points earned, total points used, and total miles traveled. All three values can be calculated from columns in the table Sasuser.Frequentflyers by using summary functions.
You might wonder why set operations are needed when only one table is involved. If you wanted to display the three summarized values horizontally, in three separate columns, you could solve the problem without a set operation, using the following simple SELECT statement:
proc sql; select sum(pointsearned) format=comma12. label='Total Points Earned', sum(pointsused) format=comma12. label='Total Points Used', sum(milestraveled) format=comma12. label='Total Miles Traveled' from sasuser.frequentflyers;
Assume, however, that you want the three values to be displayed vertically in a single column. To generate this output, you create three different queries on the same table, and then use two UNION set operators to combine the three query results:
proc sql; title 'Points and Miles Traveled'; title2 'by Frequent Flyers'; select 'Total Points Traveled:', sum(MilesTraveled) format=comma12. from sasuser.frequentflyers union select 'Total Points Earned:', sum(PointsEarned) format=comma12. from sasuser.frequentflyers union select 'Total Points Used:', sum(PointsUsed) format=comma12. from sasuser.frequentflyers;
Each SELECT clause defines two columns: a character constant as a label and the summarized value. The output is shown below.
The preceding program reads the same table three times, so it is not the most efficient way to solve this problem. Δ
The set operator OUTER UNION concatenates the results of the queries by
selecting all rows (both unique and nonunique) from both tables
not overlaying columns.
We can demonstrate how OUTER UNION works when used alone and with the keyword CORR. The ALL keyword is not used with OUTER UNION because this operator's default action is to include all rows in output.
Suppose you want to display all rows from both of the tables One and Two, without overlaying columns. The PROC SQL set operation that includes the OUTER UNION operator, the two tables, and the output are shown below:
In the output, the columns have not been overlaid. Instead, all four columns from both tables are displayed. Each row of output contains missing values in the two columns that correspond to the other table.
The output from the preceding set operation contains two columns with the same name. To overlay the columns with a common name, add the CORR keyword to the set operation:
The output from the modified set operation contains only three columns, because the two columns named X are overlaid.
There are many business situations that require two or more tables to be concatenated. For example, suppose you want to display the employee numbers, job codes, and salaries of all mechanics working for an airline. The mechanic job has three levels and there is a separate table containing data for the mechanics at each level: Sasuser.Mechanicslevel1, Sasuser.Mechanicslevel2, and Sasuser.Mechanicslevel3. These tables all contain the same three columns.
The following PROC SQL step uses two OUTER UNION operators to concatenate the tables, and the CORR keyword to overlay the columns that have common names:
A PROC SQL set operation that uses the OUTER UNION operator is just one SAS technique that you can use to concatenate tables. Program 1, below, is the PROC SQL set operation that was shown earlier in this chapter. Program 2 uses a different SAS technique to concatenate the hypothetical tables One and Two.
proc sql; create table three as select * from one outer union corr select * from two; quit;
data three; set one two; run;
proc print data=three noobs; run;
These two programs create the same table as output, as shown below.
When tables have a same-named column, the PROC SQL outer union will not produce the same output unless the keyword CORR is also used. CORR causes the same-named columns (in this example, the two columns named X) to be overlaid; without CORR, the OUTER UNION operator will include both of the same-named columns in the result set. The DATA step program will generate only one column X.
The two concatenation techniques shown above also vary in efficiency. A PROC SQL set operation generally requires more computer resources but might be more convenient and flexible than the DATA step equivalent.
This section contains the following:
a text summary of the material taught in this chapter
syntax for statements and option
a sample program
points to remember.
A set operation combines tables or views vertically (one on top of the other) by combining the results of two queries. A set operation is a SELECT statement that contains
two groups of query clauses (each group beginning with a SELECT clause)
one of the set operators EXCEPT, INTERSECT, UNION, and OUTER UNION
one or both of the keywords ALL and CORR (CORRESPONDING) as modifiers.
A single SELECT statement can contain multiple set operations.
When processing a set operation that displays only unique rows (a set operation that contains the set operator EXCEPT, INTERSECT, or UNION), PROC SQL makes two passes through the data, by default. For set operations that display both unique and duplicate rows, only one pass through the data is required.
For the set operators EXCEPT, INTERSECT, and UNION, columns are overlaid based on the relative position of the columns in the SELECT clause rather than by column name. In order to be overlaid, columns in the same relative position in the two SELECT clauses must have the same data type.
One or both keywords can be used to modify the default action of a set operator.
The set operator EXCEPT selects unique rows from the first table (the table specified in the first query) that are not found in the second table (the table specified in the second query) and overlays columns. This set operation can be modified by using either or both of the keywords ALL and CORR.
The set operator INTERSECT selects unique rows that are common to both tables and overlays columns. This set operation can be modified by using either or both of the keywords ALL and CORR.
The set operator UNION selects unique rows from both tables together and overlays columns. This set operation can be modified by using either or both of the keywords ALL and CORR.
The set operator OUTER UNION concatenates the results of two queries by selecting all rows (both unique and nonunique) from both tables and not
overlaying columns. This set operation can be modified by using the keyword CORR.
PROC SQL;
SELECT column-1<, … column-n>
FROM table-1 | view-1< , … table-n | view-n >
<optional query clauses >
set-operator <ALL> <CORR >
SELECT column-1 <, … column-n>
FROM table-1 | view-1< , … table-n | view-n >
<optional query clauses>;
QUIT;
proc sql; select firstname, lastname from sasuser.staffchanges intersect all select firstname, lastname from sasuser.staffmaster; quit;
Select the best answer for each question. After completing the quiz, check your answers using the answer key in the appendix.
Which statement is false with respect to a set operation that uses the EXCEPT, UNION, or INTERSECT set operator without a keyword?
Column names in the result set are determined by the first table.
To be overlaid, columns must be of the same data type.
To be overlaid, columns must have the same name.
By default, only unique rows are displayed in the result set.
The keyword ALL cannot be used with which of the following set operators?
EXCEPT
INTERSECT
UNION
OUTER UNION
Which PROC SQL step combines the tables Summer and Winter to produce the output displayed below?
proc sql; select * from summer intersect all select * from winter;
proc sql; select * from summer outer union select * from winter;
proc sql; select * from summer union corr select * from winter;
proc sql; select * from summer union select * from winter;
Which PROC SQL step combines tables but does not overlay any columns?
proc sql; select * from groupa outer union select *
from groupb;
proc sql; select * from groupa as a outer union corr select * from groupb as b;
proc sql; select coalesce(a.obs, b.obs) label='Obs', med, duration from groupa as a full join groupb as b on a.obs=b.obs;
proc sql; select * from groupa as a intersect select * from groupb as b;
Which statement is false regarding the keyword CORRESPONDING?
It cannot be used with the keyword ALL.
It overlays columns by name, not by position.
When used in EXCEPT, INTERSECT, and UNION set operations, it removes any columns not found in both tables.
When used in OUTER UNION set operations, it causes same-named columns to be overlaid.
Which PROC SQL step generates the following output from the tables Dogs and Pets?
proc sql; select name, price from pets except all select * from dogs;
proc sql; select name, price from pets except select * from dogs;
proc sql; select name, price from pets except corr all select * from dogs;
proc sql; select * from dogs except corr select name, price from pets;
The PROG1 and PROG2 tables list students who took the PROG1 and PROG2 courses, respectively. Which PROC SQL step will give you the names of the students who took only the PROG1 class?
proc sql; select fname, lname from prog1 intersect select fname, lname from prog2;
proc sql; select fname, lname from prog1 except all select fname, lname from prog2;
proc sql; select * from prog2 intersect corr select * from prog1;
proc sql; select * from prog2 union select * from prog1;
Which PROC SQL step will return the names of all the students who took PROG1, PROG2, or both classes?
proc sql; select fname, lname from prog1 intersect select fname, lname from prog2;
proc sql; select fname, lname from prog1 outer union corr select fname, lname from prog2;
proc sql; select fname, lname from prog1 union select fname, lname from prog2;
proc sql; select fname, lname from prog1 except corr select fname, lname from prog2;
Which PROC SQL step will return the names of all the students who took both the PROG1 and PROG2 classes?
proc sql; select fname, lname from prog1 union select fname, lname from prog2;
proc sql; select fname, lname from prog1 except corr select fname, lname from prog2;
proc sql; select fname, lname from prog1 intersect all select fname, lname from prog2;
proc sql; select fname, lname from prog1 union corr select fname, lname from prog2;
Which PROC SQL step will generate the same results as the following DATA step?
proc sql; select fname, lname from prog1 outer union corr select fname, lname from prog2 order by lname;
proc sql; select fname, lname from prog1 union select fname, lname from prog2 order by lname;
proc sql; select fname, lname from prog2 outer union select fname, lname from prog1 order by lname;
proc sql; select fname, lname from prog2 union corr select fname, lname from prog1 order by lname;