Overview 262
Introduction 262
Objectives 262
Prerequisites 263
Example 271
Summary 278
Text Summary 278
Syntax 280
Sample Programs 280
Quiz 281
The SQL procedure offers a variety of options that control processing. Some options control execution. For example, you can limit the number of rows read or written during a query. Other options control output. For example, you can control the appearance of long character columns, double-space output, or (as shown below) number your rows. Options are also available for testing and evaluating performance.
Metadata is a description or definition of data or information. SAS session metadata is stored in Dictionary tables, which are special, read-only SAS tables that contain information about SAS libraries, SAS data sets, SAS macros, and external files that are available in the current SAS session. Dictionary tables also contain the settings for SAS system options and SAS titles and footnotes that are currently in effect. You can use the SQL procedure to access the metadata stored in Dictionary tables. For example, you can query a Dictionary table to find out which tables in a SAS library contain a specified column.
In this chapter, you learn to
use PROC SQL options to control execution
use PROC SQL options to control output
use PROC SQL to evaluate performance
reset PROC SQL options without re-invoking the procedure
use Dictionary tables and views to obtain information about SAS files.
Remember that PROC SQL options are specified in the PROC SQL statement.
After you specify an option, it remains in effect until you change it or you re-invoke PROC SQL. Δ
The following tables list the options for controlling processing that are covered in this chapter. A complete description and an example of each option appears in the following sections.
Table 8.1. Options to Control Execution
To do this… | Use this option… |
Restrict the number of input rows | INOBS= |
Restrict the number of output rows | OUTOBS= |
Table 8.2. Options to Control Output
To do this… | Use this option… |
Double-space the output | DOUBLE | NODOUBLE |
Flow characters within a column | FLOW | NOFLOW | FLOW=n | FLOW=n m |
Table 8.3. Options for Testing and Evaluating Performance
To do this… | Use this option… |
Specify whether PROC SQL writes timing information for each statement to the SAS log | STIMER | NOSTIMER |
For a complete list of options, see the SAS documentation for the SQL procedure. Δ
When you are developing queries against large tables, you can reduce the amount of time that it takes for the queries to run by reducing the number of rows that PROC SQL processes. Subsetting the tables with WHERE clauses is one way to do this. Using the INOBS= and OUTOBS= options in PROC SQL is another way.
You already know that you can use the OUTOBS= option to restrict the number of rows that PROC SQL displays or writes to a table. However, the OUTOBS= option does not restrict the rows that are read. The INOBS= option restricts the number of rows that PROC SQL takes as input from any single source. The INOBS= option is similar to the SAS system option OBS= and is useful for debugging queries on large tables.
For more information about the OUTOBS= option, see Chapter 2, "Performing Advanced Queries Using PROC SQL," on page 25. Δ
In the following PROC SQL set operation, INOBS= 5 is specified. As indicated in the log, only five rows from each source table, Sasuser.Mechanicslevel1 and Sasuser.Mechanicslevel2, are used. The resulting table contains 10 rows.
proc sql inobs=5; select * from sasuser.mechanicslevel1 outer union corr select * from sasuser.mechanicslevel2;
You can use the PROMPT | NOPROMPT option with the INOBS= and OUTOBS= options so that you are prompted to stop or continue processing when the limits set by these options are reached.
For more information about PROC SQL set operations, see Chapter 4, "Combining Tables Vertically Using PROC SQL," on page 123. Δ
In a simple query, there might be no apparent differences between using INOBS= or OUTOBS=. Other times, it is important to choose the correct option. For example, using the average function on a column with the PROC SQL option INOBS=10 returns an average of only the 10 values read for that column. Δ
The NUMBER | NONUMBER option specifies whether the output from a query should include a column named ROW, which displays row numbers. NONUMBER is the default. The option is similar to the OBS | NOOBS option in the PRINT procedure.
The following PROC SQL step specifies the NUMBER option. Output from the step includes a column named Row, which contains row numbers.
proc sql inobs=10 number;
select flightnumber, destination
from sasuser.internationalflights;
In some cases, double-spacing your output can make it easier to read. The DOUBLE | NODOUBLE option specifies whether PROC SQL output is double-spaced. The default is NODOUBLE.
The DOUBLE | NODOUBLE option does not affect the appearance of the HTML output. To see the effect of this option, you must have text output selected in SAS Enterprise Guide. Δ
The following PROC SQL step specifies the DOUBLE option. The listing output from this step is double spaced. The HTML output from this step remains single-spaced.
proc sql inobs=10 double;
select flightnumber, destination
from sasuser.internationalflights;
The FLOW | NOFLOW | FLOW=n | FLOW=n m option controls the appearance of wide character columns in listing output. The FLOW option causes text to be flowed in its column instead of wrapping the entire row. n sets the width of the flowed column. Specifying n and m floats the width of the column between limits to achieve a balanced layout.
The FLOW | NOFLOW | FLOW= n | FLOW= n m option does not affect the appearance of HTML, PDF, or RTF output. To see the effect of this option, you must have text output selected in SAS Enterprise Guide. Δ
The following PROC SQL step does not specify the FLOW option. Notice that in the output the name and values for the column ZipCode appear under the name and values for the column FFID due to the wide character columns.
proc sql inobs=5; select ffid, membertype, name, address, city, state, zipcode from sasuser.frequentflyers order by pointsused;
Specifying flow=10 15
causes the text within each character column to float between 10 and 15 spaces, which prevents the ZipCode
column from wrapping underneath the FFID
column.
proc sql inobs=5 flow=10 15;
select ffid, membertype, name, address, city,
state, zipcode
from sasuser.frequentflyers
order by pointsused;
The PROC SQL option STIMER | NOSTIMER specifies whether PROC SQL writes timing information for each statement to the SAS log, instead of writing a cumulative value for the entire procedure. NOSTIMER is the default.
In order to use the STIMER option in PROC SQL, the SAS system option STIMER (the default) must also be in effect. Some host operating environments require that you specify the SAS system option STIMER when you invoke SAS. The STIMER system option controls the printing of performance statistics in the SAS log. If you use the system option alone, the results will contain timing information for the entire procedure, not on a statement-by-statement basis.
You can use the OPTIONS procedure to list the current settings of SAS system options. To find out if the SAS system STIMER option is enabled on your operating environment, submit the following program:
proc options option=stimer value; run;
PROC OPTIONS produces additional information that is specific to the operating environment under which you are running SAS. For more information about this and for descriptions of host-specific options, see the SAS documentation for your operating environment. Δ
Both of the queries in the following PROC SQL step list the name, address, city, state, and ZIP code of customers listed in the Sasuser.FrequentFlyers table. However, the second query only lists this information for customers who have earned more than 7000 points and used less than 3000 points.
When the PROC SQL statement is submitted without the STIMER option, timing information for both queries is written to the SAS log as a cumulative value for the entire procedure.
proc sql; select name, address, city, state, zipcode from sasuser.frequentflyers; select name, address, city, state, zipcode from sasuser.frequentflyers where pointsearned gt 7000 and pointsused lt 3000; quit;
Timing information for a PROC SQL step is not written to the SAS log until a QUIT statement is submitted or another PROC or DATA step is started.Δ
When the PROC SQL statement is submitted with the STIMER option, timing information is written to the SAS log for each SELECT statement.
proc sql stimer;
select name, address, city, state, zipcode
from sasuser.frequentflyers;
select name, address, city, state, zipcode
from sasuser.frequentflyers
where pointsearned gt 7000 and pointsused lt 3000;
quit;
When the STIMER option is used in PROC SQL, the exact wording of the Notes that are written to the SAS log might vary for different versions of SAS. Δ
The STIMER option in PROC SQL is useful when an operation can be accomplished in more than one way and you are benchmarking each technique. Although factors such as code readability and maintenance come into consideration, you might also want to know which PROC SQL step runs the fastest. Δ
After you specify an option, it remains in effect until you change it, or you re-invoke PROC SQL. You can use the RESET statement to add, drop, or change PROC SQL options without re-invoking the SQL procedure.
Options are additive. For example, you can specify the NOPRINT option in a PROC SQL statement, submit a query, and submit the RESET statement with the NUMBER option, without affecting the NOPRINT option.
Suppose you want to submit two PROC SQL queries in a single PROC SQL step. You want
both queries to display only the first five rows of output
the second query to display row numbers in the output.
In the following PROC SQL step, the PROC SQL statement specifies the OUTOBS= option to restrict the number of rows that will be displayed in the output. After the first SELECT statement, the RESET statement adds the NUMBER option to display row numbers in the result set.
proc sql outobs=5; select flightnumber, destination from sasuser.internationalflights; reset number; select flightnumber, destination from sasuser.internationalflights where boarded gt 200;
The output, which contains two result sets, is shown below. The result set from the first SELECT statement reflects only by the OUTOBS= option. The result set from the second SELECT statement reflects both the OUTOBS= option and the NUMBER option that is specified in the RESET statement.
Now suppose you want to modify the PROC SQL step so that the result set from only the first SELECT statement is restricted to five rows of output. In the modified PROC SQL step, the OUTOBS= option is added to the RESET statement to change (reset) the OUTOBS= option that is specified in the PROC SQL statement. The modified step follows:
proc sql outobs=5;
select flightnumber, destination
from sasuser.internationalflights;
reset outobs= number;
select flightnumber, destination
from sasuser.internationalflights
where boarded gt 200;
In the output, the result set from the second SELECT statement now contains all the rows that are generated by the query.
Dictionary tables are commonly used to monitor and manage SAS sessions because the data is easier to manipulate than the output from procedures such as PROC DATASETS.
Dictionary tables are special, read-only SAS tables that contain information about SAS libraries, SAS macros, and external files that are in use or available in the current SAS session. Dictionary tables also contain the settings for SAS system options and SAS titles and footnotes that are currently in effect. For example, the Dictionary.Columns table contains information (such as name, type, length, and format) about all columns in all tables that are known to the current SAS session.
Dictionary tables are
created each time they are referenced in a SAS program
updated automatically
limited to read-only access.
Accessing a Dictionary table causes SAS to determine the current state of the SAS session and return the information that you want. Dictionary tables can be accessed by running a PROC SQL query against the table, using the Dictionary libref. Though SAS librefs are usually limited to eight characters, Dictionary is an automatically assigned, reserved word. You can also access a Dictionary table by referring to the PROC SQL view of the table that is stored in the Sashelp library.
The following table describes some of the Dictionary tables that are available and lists the corresponding Sashelp views. For a complete list of Dictionary tables, see the SAS documentation for the SQL procedure.
Dictionary table | Sashelp view | Contains |
Catalogs | Vcatalg | information about catalog entries |
Columns | Vcolumn | detailed information about variables and their attributes |
Extfiles | Vextfl | currently assigned filerefs |
Indexes | Vindex | information about indexes defined for data files |
Macros | Vmacro | information about both user and system defined macro variables |
Members | Vmember Vsacces Vscatlg Vslib Vstable Vstabvw Vsview | general information about data library members |
Options | Voption | current settings of SAS system options |
Tables | Vtable | detailed information about data sets |
Titles | Vtitle | text assigned to titles and footnotes |
Views | Vview | general information about data views |
You can query Dictionary tables the same way that you query any other table, including subsetting with a WHERE clause, ordering the results, creating tables, and creating PROC SQL views. Because Dictionary tables are read-only objects, you cannot insert rows or columns, alter column attributes, or add integrity constraints to them.
To see how each Dictionary table is defined, submit a DESCRIBE TABLE statement. The DESCRIBE TABLE statement writes a CREATE TABLE statement to the SAS log for the table specified in the DESCRIBE TABLE statement. After you know how a table is defined, you can use its column names in a subsetting WHERE clause in order to retrieve specific information.
The Dictionary.Tables table contains detailed information about tables. The following DESCRIBE TABLE statement displays information about the Dictionary.Tables table in the log window. The information includes the names of the columns stored in the table.
proc sql; describe table dictionary.tables;
To display information about the files in a specific library, specify the column names in a SELECT statement and the Dictionary table name in the FROM clause.
For example, the following PROC SQL step displays the columns
Memname
(name)
Nobs
(number of observations)
Nvar
(number of variables)
Crdate
(creation date) of the tables in the Sasuser library.
The Dictionary column names are specified in the SELECT statement and the Dictionary table name, Dictionary.Tables, is specified in the FROM clause. The library name, Sasuser, is specified in the WHERE clause.
Note that you must specify the library name in the WHERE clause in uppercase letters (because that is how it is stored within SAS) and enclose it in quotation marks. Δ
proc sql; select memname format=$20., nobs, nvar, crdate from dictionary.tables where libname='SASUSER';
Partial output is shown below.
Your output might differ from that shown above, depending on the contents of your Sasuser library. Δ
You can also use Dictionary tables to determine more specific information such as which tables in a SAS library contain a specific column.
The Dictionary.Columns table contains detailed information about variables and their attributes. As in Dictionary.Tables, the Dictionary.Columns table contains a column that is titled Memname
, which lists the name of each table within a library.
proc sql; describe table dictionary.columns;
The following PROC SQL step lists all the tables in the Sasuser library that contain a column named EmpID
. The dictionary column name, Memname
, is specified in the SELECT statement. The Dictionary table, Dictionary.Columns, is specified in the FROM clause. The library name, Sasuser, and the column name, EmpID
, are specified in the WHERE clause.
proc sql; select memname
from dictionary.columns where libname='SASUSER' and name='EmpID';
Partial output is shown below.
Remember that you can also access a Dictionary table by referring to the PROC SQL view of the table that is stored in the Sashelp library. In the following PROC SQL step, the Sashelp view Vcolumn is specified in the FROM clause. The results of the query are identical to the preceding output.
proc sql;
select memname
from sashelp.vcolumn
where libname='SASUSER'
and name='EmpID';
Note that column names in the WHERE clause must be specified in the same case that is used in the Dictionary table and must be enclosed in quotation marks. Δ
You can use Sashelp views in any SAS procedure or DATA step. However, Dictionary tables can be only read by using the SQL procedure. Δ
The LOOPS= option restricts the number of iterations of the inner loop in PROC SQL. By setting a limit, you can prevent queries from consuming excessive resources.
For example, joining three large tables without meeting the join-matching conditions could create a huge internal table that would be inefficient to process. Use the LOOPS= option to prevent this from happening.
You can use the PROMPT | NOPROMPT option to modify the effect of the LOOPS= option so that you are prompted to stop or continue processing when the limit set by the LOOPS= option is reached.
You can use the number of iterations that are reported in the SQLOOPS macro variable (after each PROC SQL statement is executed) to gauge an appropriate value for the LOOPS= option. For more information about the SQLOOPS macro variable, see the SAS documentation for the SQL procedure. Δ
You already know that you can use the EXEC | NOEXEC option to specify whether a statement should be executed after its syntax is checked for accuracy. If the EXEC option is in effect, SAS checks the PROC SQL syntax for accuracy and, if no error is found, executes the SQL statement.
The ERRORSTOP | NOERRORSTOP option specifies whether PROC SQL stops executing if it encounters an error. This option is useful only when the EXEC option is in effect. The default is ERRORSTOP in batch or in a noninteractive session and NOERRORSTOP in an interactive SAS session.
ERRORSTOP instructs PROC SQL to stop executing the statements but to continue checking the syntax after it has encountered an error. ERRORSTOP has an effect only when SAS is running in batch or in noninteractive execution mode.
NOERRORSTOP instructs PROC SQL to execute the statements and to continue checking the syntax after an error occurs. NOERRORSTOP is useful if you want a batch job to continue executing SQL procedure statements after an error is encountered.
This section contains the following:
a text summary of the material taught in this chapter
syntax for statements and options
sample programs
points to remember.
The SQL procedure offers a variety of options that affect processing. Some options control execution. For example, you can limit the number of rows read or written during a query or limit the number of internal loops PROC SQL performs. Other options control output. For example, you can flow character columns, number your rows, or double-space output. Options are also available for testing and evaluating performance. Options are specified in the PROC SQL statement.
The OUTOBS= option restricts the number of rows that PROC SQL displays or writes to a table. The INOBS= option restricts the number of rows that PROC SQL takes as input from any single source. The INOBS= option is similar to the SAS system option OBS= and is useful for debugging queries on large tables.
The NUMBER | NONUMBER option specifies whether the SELECT statement should include a column named ROW, which is the row number of the data as it is retrieved. NONUMBER is the default. The option is similar to the OBS | NOOBS option in the PRINT procedure.
In some cases, double-spacing your output can make it easier to read. The DOUBLE | NODOUBLE option specifies whether PROC SQL output is double-spaced in the listing output. The default is NODOUBLE.
The FLOW | NOFLOW | FLOW= n| FLOW= n m option controls the appearance of wide character columns in the listing output. The FLOW option causes text to be flowed in its column instead of wrapping the entire row. Specifying n sets the width of the flowed column. Specifying n and m floats the width of the column between limits to achieve a balanced layout.
The STIMER | NOSTIMER option specifies whether PROC SQL writes timing information for each statement to the SAS log, in addition to writing a cumulative value for the entire procedure. NOSTIMER is the default. In order to use the STIMER option in PROC SQL, the SAS system option STIMER (the default) must also be in effect.
After you specify an option, it remains in effect until you change it or you re-invoke PROC SQL. You can use the RESET statement to add, drop, or change PROC SQL options without re-invoking the SQL procedure.
SAS session metadata is stored in Dictionary tables, which are special, read-only SAS tables that contain information about SAS libraries, SAS macros, and external files that are available in the current SAS session. A Dictionary table also contains the settings for SAS system options and SAS titles and footnotes that are currently in effect.
Accessing a Dictionary table causes PROC SQL to determine the current state of the SAS session and return the information that you want. Dictionary tables can be accessed by running a PROC SQL query against the table, using the Dictionary libref. You can also access a Dictionary table by referring to the PROC SQL view of the table that is stored in the Sashelp library.
To see how each Dictionary table is defined, submit a DESCRIBE TABLE statement. After you know how a table is defined, you can use its column names in a subsetting WHERE clause in order to retrieve specific information. To display information about the files in a specific library, specify the column names in a SELECT statement and the dictionary table name in the FROM clause. You can also use Dictionary tables to determine more specific information such as which tables in a SAS library contain a specific column.
PROC SQL <option(s)>;
DESCRIBE TABLE table-name <, …table-name>;
SELECT column-1<, …column-n>
FROM table-1 | view-1<, …table-n | view-n>
<WHERE expression>;
RESET <option(s)>;
QUIT;
proc sql outobs=5; select flightnumber, destination from sasuser.internationalflights; reset number; select flightnumber, destination from sasuser.internationalflights where boarded gt 200; quit;
After you specify an option, it remains in effect until you change it or you re-invoke PROC SQL.
The DOUBLE | NODOUBLE and the FLOW | NOFLOW | FLOW= n| FLOW= n m options do not affect the appearance of HTML, PDF, or RTF output that is created with the Output Delivery System.
If you query a Dictionary table about the files in a specific library, the library name used in the WHERE clause must be specified in uppercase letters because that is how it is stored in SAS. Column names used in the WHERE clause must be specified in the same case as they appear in the Dictionary table.
Select the best answer for each question. After completing the quiz, check your answers using the answer key in the appendix.
PROC SQL options are specified in
the PROC SQL statement.
an OPTIONS statement.
a SELECT statement.
the OPTIONS procedure.
Which of the following options restricts the number of rows that PROC SQL takes as input from any single source?
OUTOBS=
INOBS=
OBS=
none of the above
Which PROC SQL step creates the output shown below?
proc sql nonumber outobs=10; select * from sasuser.flightattendants where jobcode='FA1'; select * from sasuser.flightattendants where jobcode='FA2';
proc sql number; select * from sasuser.flightattendants where jobcode='FA1'; reset nonumber outobs=10; select * from sasuser.flightattendants where jobcode='FA2';
proc sql nonumber; select * from sasuser.flightattendants where jobcode='FA1'; reset number outobs=10; select * from sasuser.flightattendants where jobcode='FA2';
proc sql; select * from sasuser.flightattendants where jobcode='FA1'; reset outobs=10; select * from sasuser.flightattendants where jobcode='FA2';
Which of the following options does not affect the appearance of HTML, PDF, or RTF output?
NUMBER | NONUMBER
DOUBLE | NODOUBLE
FLOW | NOFLOW | FLOW= n | FLOW= n m
b and c
Which of the following statements is true regarding the STIMER option in PROC SQL?
The STIMER option in PROC SQL writes timing information for each statement to the SAS log.
The STIMER option in PROC SQL writes only cumulative timing information for the entire procedure to the SAS log.
When using the STIMER option in PROC SQL, the SAS system option STIMER must also be in effect.
a and c
information about SAS libraries.
information about SAS data sets.
information about SAS macros.
all of the above
Dictionary tables are
created each time they are referenced in a SAS program.
updated automatically.
limited to read-only access.
all of the above
Dictionary tables can be accessed
by running a PROC SQL query against the table, using the Dictionary libref.
by referring to the PROC SQL view of the table that is stored in the Sashelp library.
by referring to the PROC SQL view of the table that is stored in the Sasuser library.
a and b
Which of the following PROC SQL steps displays information about the Dictionary table Dictionary.Titles?
proc sql; describe dictionary.titles;
proc sql; describe table dictionary.titles;
proc sql describe table dictionary.titles;
proc sql describe dictionary titles;
Which of the following PROC SQL steps displays the name (Memname
), modification date (Modate
), number of variables (Nvar
), and the number of observations (Nobs
) for each table in the Sasuser library?
proc sql; select memname, modate, nvar, nobs from dictionary.tables where libname='SASUSER';
proc sql; select memname, modate, nvar, nobs from dictionary.tables where libname='Sasuser';
proc sql; select memname, modate, nvar, nobs from 'SASUSER' where table=dictionary.tables;
proc sql; select SASUSER from dictionary.tables where cols= 'memname, modate, nvar, nobs';