Using the INTERSECT Set Operator

A Brief Overview

The set operator INTERSECT does both of the following:
  • selects unique rows that are common to both tables
  • overlays columns
Figure 4.3 INTERSECT Set Operator Relationship
INTERSECT Set Operator Relationship

Example: Using the INTERSECT Operator Alone

The INTERSECT operator compares and overlays columns in the same way as the EXCEPT operator, by column position instead of column name. However, INTERSECT selects rows differently and displays in output the unique rows that are common to both tables.
The following PROC SQL set operation uses the INTERSECT operator to combine the tables Col1 and Col2.
proc sql;
   select *
      from certadv.col1 intersect
   select *
      from certadv.col2;
quit;
Output 4.11 PROC SQL Query Result Illustration
Tables Col1, Col2, and Output
Tables Col1 and Col2 have only one unique row in common and this row is displayed in the output.

Example: Using the Keyword ALL with the INTERSECT Set Operator

Adding the keyword ALL to the preceding PROC SQL query prevents PROC SQL from making an extra pass through the data. If there were any rows common to tables Col1 and Col2 that were duplicates of other common rows, they would also be included in output. However, as you have seen, there is only one common row in these tables.
proc sql;
   select *
      from certadv.col1 intersect all
   select *
      from certadv.col2;
quit;
Output 4.12 PROC SQL Query Result Illustration
Tables Col1, Col2, and Output

Example: Using the Keyword CORR with the INTERSECT Set Operator

To display the unique rows that are common to the two tables based on the column name instead of the column position, add the keyword CORR to the PROC SQL set operation.
proc sql;
   select * 
      from certadv.col1 intersect corr
   select *
      from certadv.col2;
quit;
Output 4.13 PROC SQL Query Result Illustration
Tables Col1, Col2, and Output
X is the only column name that is common to both tables, so X is the only column that PROC SQL examines and displays in the output. In the first pass, PROC SQL eliminates the rows that are duplicated within each table: the second and third rows in table Col1 contain the same value for X as the first row, and the fourth row in table Col2 contains the same value for X as the third row. In the second pass, PROC SQL eliminates any rows that are not common across tables: the fourth and fifth rows in table Col1 and the fifth row in table Col2 do not have a matching value of X in the other table. The output displays the three rows with unique values of X that are also common to both tables.

Example: Using the Keywords ALL and CORR with the INTERSECT Set Operator

If the keywords ALL and CORR are used together, the INTERSECT operator displays all unique and duplicate rows that are common to the two tables, based on columns that have the same name.
proc sql;
   select *
      from certadv.col1 intersect all corr
   select *
      from certadv.col2;
quit;
Output 4.14 PROC SQL Query Result Illustration
Tables Col1, Col2, and Output
PROC SQL examines and displays only the column with the same name, X. There are three common rows across the two tables, which are highlighted above. These three rows are displayed in the output.
Each of the tables contains at least one other row that duplicates a value of X in one of the common rows. For example, in the second and third rows in table Col1, the value of X is 1, and only one row is displayed in the output. However, in order to be considered a common row and to be included in the output, every duplicate row in one table must have a separate duplicate row in the other table. In this example, there are no rows that have duplicate values and that are also common across tables. Therefore, in this example, the set operation with the keywords ALL and CORR generates the same output as with the keyword CORR alone.

Complex Example Using the INTERSECT Operator

Suppose you want to display the names of the existing employees who have changed their salary or job code.
Table
Relevant Columns
Certadv.Staffchanges lists information for all new employees and existing employees who have had a change in salary or job code.
FirstName, LastName
Certadv.Staffmaster lists information for all existing employees.
FirstName, LastName
The relationship between these two tables is shown in the diagram.
Figure 4.4 INTERSECT Set Operator
INTERSECT Set Operator
To display the unique rows that are common to both tables, you use a PROC SQL set operation that contains INTERSECT. It is known that these tables contain no duplicates, so ALL is used to speed up query processing.
proc sql;
   select firstname, lastname
      from certadv.staffchanges
   intersect all
   select firstname, lastname
      from certadv.staffmaster;
quit;
Output 4.15 PROC SQL Query Result: Using INTERSECT Operator and Keyword ALL
PROC SQL Query Result: Using INTERSECT Operator and ALL Keyword
In this PROC SQL step, which contains just one INTERSECT set operator, the order in which you list the tables in the SELECT statement does not make a difference. However, in a more complex PROC SQL step that contains multiple stacked INTERSECT set operators, it is important to think through the table order carefully, depending on when you want the nonmatching rows to be eliminated. The output shows that there are four existing employees who have changed their salary or job code.
Last updated: October 16, 2019
..................Content has been hidden....................

You can't read the all page of ebook, please click here login for view all page.
Reset