Appendix 1. Quiz Answer Keys

Chapter 1: Basic Concepts
Chapter 2: Referencing Files and Setting Options
Chapter 3: Editing and Debugging SAS Programs
Chapter 4: Creating List Reports
Chapter 5: Creating SAS Data Sets From Raw Files and Excel Worksheets
Chapter 6: Understanding DATA Step Processing
Chapter 7: Creating and Applying User-Defined Formats
Chapter 8: Creating Enhanced List and Summary Reports
Chapter 9: Producing Descriptive Statistics
Chapter 10: Producing HTML Output
Chapter 11: Creating and Managing Variables
Chapter 12: Reading SAS Data Sets Overview
Chapter 13: Combining SAS Data Sets
Chapter 14: Transforming Data with SAS Functions
Chapter 15: Generating Data with DO Loops
Chapter 16: Processing Variables with Arrays
Chapter 17: Reading Raw Data in Fixed Fields
Chapter 18: Reading Free-Format Data
Chapter 19: Reading Date and Time Values
Chapter 20: Creating a Single Observation from Multiple Records
Chapter 21: Creating Multiple Observations from a Single Record
Chapter 22: Reading Hierarchical Files

Chapter 1: Basic Concepts

How many observations and variables does the data set below contain?
1. 3 observations, 4 variables
2. 3 observations, 3 variables
3. 4 observations, 3 variables
4. can't tell because some values are missing
Correct answer: c
Rows in the data set are called observations, and columns are called variables. Missing values don't affect the structure of the data set.
How many program steps are executed when the program below is processed?
```
data user.tables;
   infile jobs;
   input date name $ job $;
run;
proc sort data=user.tables;
   by name;
run;
proc print data=user.tables;
run;
```
1. three
2. four
3. five
4. six
Correct answer: a
When it encounters a DATA, PROC, or RUN statement, SAS stops reading statements and executes the previous step in the program. The program above contains one DATA step and two PROC steps, for a total of three program steps.
What type of variable is the variable AcctNum in the data set below?
1. numeric
2. character
3. can be either character or numeric
4. can't tell from the data shown
Correct answer: b
It must be a character variable, because the values contain letters and underscores, which are not valid characters for numeric values.
What type of variable is the variable Wear in the data set below, assuming that there is a missing value in the data set?
1. numeric
2. character
3. can be either character or numeric
4. can't tell from the data shown
Correct answer: a
It must be a numeric variable, because the missing value is indicated by a period rather than by a blank.
Which of the following variable names is valid?
1. 4BirthDate
2. $Cost
3. _Items_
4. Tax-Rate
Correct answer: c
Variable names follow the same rules as SAS data set names. They can be 1 to 32 characters long, must begin with a letter (A-Z, either uppercase or lowercase) or an underscore, and can continue with any combination of numbers, letters, or underscores.
Which of the following files is a permanent SAS file?
1. Sashelp.PrdSale
2. Sasuser.MySales
3. Profits.Quarter1
4. all of the above
Correct answer: d
To store a file permanently in a SAS data library, you assign it a libref other than the default Work. For example, by assigning the libref Profits to a SAS data library, you specify that files within the library are to be stored until you delete them. Therefore, SAS files in the Sashelp and Sasuser libraries are permanent files.
In a DATA step, how can you reference a temporary SAS data set named Forecast?
1. Forecast
2. Work.Forecast
3. Sales.Forecast (after assigning the libref Sales)
4. only a and b above
Correct answer: d
To reference a temporary SAS file in a DATA step or PROC step, you can specify the one-level name of the file (for example, Forecast) or the two-level name using the libref Work (for example, Work.Forecast).
What is the default length for the numeric variable Balance?
1. 5
2. 6
3. 7
4. 8
Correct answer: d
The numeric variable Balance has a default length of 8. Numeric values (no matter how many digits they contain) are stored in 8 bytes of storage unless you specify a different length.
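One way to confirm the default length is to create a small data set and inspect it with PROC CONTENTS. This is a minimal sketch; the data set name Work.Accounts and the value assigned to Balance are hypothetical and used only for illustration.

```sas
/* Hypothetical data set, created only to inspect the variable attributes */
data work.accounts;
   Balance=1250.75;   /* numeric variable, no LENGTH statement specified */
run;

/* The variable listing reports a length (Len) of 8 for Balance */
proc contents data=work.accounts;
run;
```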
How many statements does the following SAS program contain?
```
proc print data=new.prodsale
                label double;
   var state day price1 price2; where state='NC';
   label state='Name of State'; run;
```
1. three
2. four
3. five
4. six
Correct answer: c
The five statements are: 1) the PROC PRINT statement (two lines long); 2) the VAR statement; 3) the WHERE statement (on the same line as the VAR statement); 4) the LABEL statement; and 5) the RUN statement (on the same line as the LABEL statement).
What is a SAS library?
1. collection of SAS files, such as SAS data sets and catalogs
2. in some operating environments, a physical collection of SAS files
3. in some operating environments, a logically related collection of SAS files
4. all of the above
Correct answer: d
Every SAS file is stored in a SAS library, which is a collection of SAS files, such as SAS data sets and catalogs. In some operating environments, a SAS library is a physical collection of files. In others, the files are only logically related. In the Windows and UNIX environments, a SAS library is typically a group of SAS files in the same folder or directory.

Chapter 2: Referencing Files and Setting Options

If you submit the following program, how does the output look?
```
options pagesize=55 nonumber;
proc tabulate data=clinic.admit;
   class actlevel;
   var age height weight;
   table actlevel,(age height weight)*mean;
run;
options linesize=80;
proc means data=clinic.heart min max maxdec=1;
   var arterial heart cardiac urinary;
   class survive sex;
run;
```
1. The PROC MEANS output has a print line width of 80 characters, but the PROC TABULATE output has no print line width.
2. The PROC TABULATE output has no page numbers, but the PROC MEANS output has page numbers.
3. Each page of output from both PROC steps is 55 lines long and has no page numbers, and the PROC MEANS output has a print line width of 80 characters.
4. The date does not appear on output from either PROC step.
Correct answer: c
When you specify a system option, it remains in effect until you change the option or end your SAS session, so both PROC steps generate output that is printed 55 lines per page with no page numbers. If you don't specify a system option, SAS uses the default value for that system option.
How can you create SAS output in HTML format on any SAS platform?
1. by specifying system options
2. by using programming statements
3. by using SAS windows to specify the result format
4. you can't create HTML output on all SAS platforms
Correct answer: b
You can create HTML output using programming statements on any SAS platform. In addition, on all except mainframe platforms, you can use SAS windows to specify HTML as a result format.
In order for the date values 05May1955 and 04Mar2046 to be read correctly, what value must the YEARCUTOFF= option have?
1. a value between 1947 and 1954, inclusive
2. 1955 or higher
3. 1946 or higher
4. any value
Correct answer: d
As long as you specify an informat with the correct field width for reading the entire date value, the YEARCUTOFF= option doesn't affect date values that have four-digit years.
When you specify an engine for a library, you are always specifying
1. the file format for files that are stored in the library.
2. the version of SAS that you are using.
3. access to other software vendors' files.
4. instructions for creating temporary SAS files.
Correct answer: a
A SAS engine is a set of internal instructions that SAS uses for writing to and reading from files in a SAS library. Each engine specifies the file format for files that are stored in the library, which in turn enables SAS to access files with a particular format. Some engines access SAS files, and other engines support access to other vendors' files.
Which statement prints a summary of all the files stored in the library named Area51?
1. proc contents data=area51._all_ nods;
2. proc contents data=area51 _all_ nods;
3. proc contents data=area51 _all_ noobs
4. proc contents data=area51 _all_.nods;
Correct answer: a
To print a summary of library contents with the CONTENTS procedure, use a period to append the _ALL_ option to the libref. Adding the NODS option suppresses detailed information about the files.
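As an illustration of the same syntax, the sketch below lists the contents of the Sashelp library, which is assumed to be available in the SAS session; substitute any libref that you have assigned.

```sas
/* _ALL_ is appended to the libref with a period;                  */
/* NODS suppresses the detailed listing for each individual file.  */
proc contents data=sashelp._all_ nods;
run;
```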
The following PROC PRINT output was created immediately after PROC TABULATE output. Which system options were specified when the report was created?
1. OBS=, DATE, and NONUMBER
2. NUMBER, PAGENO=1, and DATE
3. NUMBER and DATE only
4. none of the above
Correct answer: b
Clearly, the DATE and NUMBER (page number) options are specified. Because the page number on the output is 1, even though PROC TABULATE output was just produced, PAGENO=1 must also have been specified. If you don't specify PAGENO=, all output in the Output window is numbered sequentially throughout your SAS session.
Which of the following programs correctly references a SAS data set named SalesAnalysis that is stored in a permanent SAS library?
1. ```
data saleslibrary.salesanalysis;
    set mydata.quarter1sales;
    if sales>100000;
run;
```
2. ```
data mysales.totals;
    set sales_99.salesanalysis;
    if totalsales>50000;
run;
```
3. ```
proc print data=salesanalysis.quarter1;
    var sales salesrep month;
run;
```
4. ```
proc freq data=1999data.salesanalysis;
    tables quarter*sales;
run;
```
Correct answer: b
Librefs must be 1 to 8 characters long, must begin with a letter or underscore, and can contain only letters, numbers, or underscores. After you assign a libref, you specify it as the first element in the two-level name for a SAS file.
Which time span is used to interpret two-digit year values if the YEARCUTOFF= option is set to 1950?
1. 1950-2049
2. 1950-2050
3. 1949-2050
4. 1950-2000
Correct answer: a
The YEARCUTOFF= option specifies which 100-year span is used to interpret two-digit year values. The default value of YEARCUTOFF= is 1920. However, you can override the default and change the value of YEARCUTOFF= to the first year of another 100-year span. If you specify YEARCUTOFF=1950, then the 100-year span will be from 1950 to 2049.
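The effect of the 100-year span can be checked with two boundary values. The following is a minimal sketch; the data set name Work.YearTest and the variable names are hypothetical.

```sas
options yearcutoff=1950;

/* Hypothetical test data: two dates with two-digit years */
data work.yeartest;
   input RawDate $ 1-7 @1 SasDate date7.;
   format SasDate date9.;
   datalines;
01JAN50
31DEC49
;
run;

/* With YEARCUTOFF=1950, year 50 is read as 1950 and year 49 as 2049 */
proc print data=work.yeartest;
run;
```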
Assuming you are using SAS code and not special SAS windows, which one of the following statements is false?
1. LIBNAME statements can be stored with a SAS program to reference the SAS library automatically when you submit the program.
2. When you delete a libref, SAS no longer has access to the files in the library. However, the contents of the library still exist on your operating system.
3. Librefs can last from one SAS session to another.
4. You can access files that were created with other vendors' software by submitting a LIBNAME statement.
Correct answer: c
The LIBNAME statement is global, which means that librefs remain in effect until you modify them, cancel them, or end your SAS session. Therefore, the LIBNAME statement assigns the libref for the current SAS session only. You must assign a libref before accessing SAS files that are stored in a permanent SAS data library.
What does the following statement do?
```
libname osiris spss 'c:\myfiles\sasdata\data';
```
1. defines a library called Spss using the OSIRIS engine
2. defines a library called Osiris using the SPSS engine
3. defines two libraries called Osiris and Spss using the default engine
4. defines the default library using the OSIRIS and SPSS engines
Correct answer: b
In the LIBNAME statement, you specify the library name before the engine name. Both are followed by the path.

Chapter 3: Editing and Debugging SAS Programs

As you write and edit SAS programs it's a good idea to
1. begin DATA and PROC steps in column one.
2. indent statements within a step.
3. begin RUN statements in column one.
4. do all of the above.
Correct answer: d
Although you can write SAS statements in almost any format, a consistent layout enhances readability and enables you to understand the program's purpose. It's a good idea to begin DATA and PROC steps in column one, to indent statements within a step, to begin RUN statements in column one, and to include a RUN statement after every DATA step or PROC step.
Suppose you have submitted a SAS program that contains spelling errors. Which set of steps should you perform, in the order shown, to revise and resubmit the program?
1. 1. Correct the errors.
  2. Clear the Log window.
  3. Resubmit the program.
  4. Check the Log window.
2. 1. Correct the errors.
  2. Resubmit the program.
  3. Check the Output window.
  4. Check the Log window.
3. 1. Correct the errors.
  2. Clear the Log window.
  3. Resubmit the program.
  4. Check the Output window.
4. 1. Correct the errors.
  2. Clear the Output window.
  3. Resubmit the program.
  4. Check the Output window.
Correct answer: a
To modify programs that contain errors, if you use the Program Editor window, you usually need to recall the submitted statements from the recall buffer to the Program Editor window, where you can correct the problems. After correcting the errors, you can resubmit the revised program. However, before doing so, it's a good idea to clear the messages from the Log window so that you don't confuse the old error messages with the new messages. Remember to check the Log window again to verify that your program ran correctly.
What happens if you submit the following program?
```
proc sort data=clinic.stress out=maxrates;
   by maxhr;
run;
proc print data=maxrates label double noobs;
   label rechr='Recovery Heart Rate;
   var resthr maxhr rechr date;
   where toler='I' and resthr>90;
   sum fee;
run;
```
1. Log messages indicate that the program ran successfully.
2. A "PROC SORT running" message appears at the top of the active window, and a log message may indicate an error in a statement that seems to be valid.
3. A log message indicates that an option is not valid or not recognized.
4. A "PROC PRINT running" message appears at the top of the active window, and a log message may indicate that a quoted string has become too long or that the statement is ambiguous.
Correct answer: d
The missing quotation mark in the LABEL statement causes SAS to misinterpret the statements in the program. When you submit the program, SAS is unable to resolve the PROC step, and a "PROC PRINT running" message appears at the top of the active window.
What generally happens when a syntax error is detected?
1. SAS continues processing the step.
2. SAS continues to process the step, and the Log window displays messages about the error.
3. SAS stops processing the step in which the error occurred, and the Log window displays messages about the error.
4. SAS stops processing the step in which the error occurred, and the Output window displays messages about the error.
Correct answer: c
Syntax errors generally cause SAS to stop processing the step in which the error occurred. When a program that contains an error is submitted, messages regarding the problem also appear in the Log window. When a syntax error is detected, the Log window displays the word ERROR, identifies the possible location of the error, and gives an explanation of the error.
A syntax error occurs when
1. some data values are not appropriate for the SAS statements that are specified in a program.
2. the form of the elements in a SAS statement is correct, but the elements are not valid for that usage.
3. program statements do not conform to the rules of the SAS language.
4. none of the above
Correct answer: c
Syntax errors are common types of errors. Some SAS system options, features of the code editing window, and the DATA step debugger can help you identify syntax errors. Other types of errors include data errors (a), semantic errors (b), and execution-time errors.
How can you tell whether you have specified an invalid option in a SAS program?
1. A log message indicates an error in a statement that seems to be valid.
2. A log message indicates that an option is not valid or not recognized.
3. The message "PROC running" or "DATA step running" appears at the top of the active window.
4. You can't tell until you view the output from the program.
Correct answer: b
When you submit a SAS statement that contains an invalid option, a log message notifies you that the option is not valid or not recognized. You should recall the program, remove or replace the invalid option, check your statement syntax as needed, and resubmit the corrected program.
Which of the following programs contains a syntax error?
1. ```
proc sort data=sasuser.mysales;
   by region;
run;
```
2. ```
dat sasuser.mysales;
   set mydata.sales99;
run;
```
3. ```
proc print data=sasuser.mysales label;
   label region='Sales Region';
run;
```
4. none of the above
Correct answer: b
The DATA step contains a misspelled keyword (dat instead of data). However, this is such a common (and easily interpretable) error that SAS produces only a warning message, not an error.
What should you do after submitting the following program in the Windows or UNIX operating environment?
```
proc print data=mysales;
   where state='NC;
run;
```
1. Submit a RUN statement to complete the PROC step.
2. Recall the program. Then add a quotation mark and resubmit the corrected program.
3. Cancel the submitted statements. Then recall the program, add a quotation mark, and resubmit the corrected program.
4. Recall the program. Then replace the invalid option and resubmit the corrected program.
Correct answer: c
This program contains an unbalanced quotation mark. When you have an unbalanced quotation mark, SAS is often unable to detect the end of the statement in which it occurs. Simply adding a quotation mark and resubmitting your program usually does not solve the problem. SAS still considers the quotation marks to be unbalanced. To correct the error, you need to resolve the unbalanced quotation mark before you recall, correct, and resubmit the program.
Which of the following commands opens a file in the code editing window?
1. file 'd:\programs\sas\newprog.sas'
2. include 'd:\programs\sas\newprog.sas'
3. open 'd:\programs\sas\newprog.sas'
4. all of the above
Correct answer: b
One way of opening a file in the code editing window is by using the INCLUDE command. Using the INCLUDE command enables you to open a single program or combine stored programs in a single window. To save a SAS program, you can use the FILE command.
Suppose you submit a short, simple DATA step. If the active window displays the message "DATA step running" for a long time, what probably happened?
1. You misspelled a keyword.
2. You forgot to end the DATA step with a RUN statement.
3. You specified an invalid data set option.
4. Some data values weren't appropriate for the SAS statements that you specified.
Correct answer: b
Without a RUN statement (or a following DATA or PROC step), the DATA step doesn't execute, so it continues to run. Unbalanced quotation marks can also cause the "DATA step running" message if relatively little code follows the unbalanced quotation mark. The other three problems above generate errors in the Log window.

Chapter 4: Creating List Reports

Which PROC PRINT step below creates the following output?
1. ```
proc print data=flights.laguardia noobs;
   var on changed flight;
   where on>=160;
run;
```
2. ```
proc print data=flights.laguardia;
   var date on changed flight;
   where changed>3;
run;
```
3. ```
proc print data=flights.laguardia label;
   id date;
   var boarded transferred flight;
   label boarded='On' transferred='Changed';
   where flight='219';
run;
```
4. ```
proc print flights.laguardia noobs;
   id date;
   var date on changed flight;
   where flight='219';
run;
```
Correct answer: c
The DATA= option specifies the data set that you are listing, and the ID statement replaces the Obs column with the specified variable. The VAR statement specifies variables and controls the order in which they appear, and the WHERE statement selects rows based on a condition. The LABEL option in the PROC PRINT statement causes the labels specified in the LABEL statement to be displayed.
Which of the following PROC PRINT steps is correct if labels are not stored with the data set?
1. ```
proc print data=allsales.totals label;
   label region8='Region 8 Yearly Totals';
run;
```
2. ```
proc print data=allsales.totals;
   label region8='Region 8 Yearly Totals';
run;
```
3. ```
proc print data allsales.totals label noobs;
run;
```
4. ```
proc print allsales.totals label;
run;
```
Correct answer: a
You use the DATA= option to specify the data set to be printed. The LABEL option specifies that variable labels appear in output instead of variable names.
Which of the following statements selects from a data set only those observations for which the value of the variable Style is RANCH, SPLIT, or TWOSTORY?
1. where style='RANCH' or 'SPLIT' or 'TWOSTORY';
2. where style in 'RANCH' or 'SPLIT' or 'TWOSTORY';
3. where style in (RANCH, SPLIT, TWOSTORY);
4. where style in ('RANCH','SPLIT','TWOSTORY');
Correct answer: d
In the WHERE statement, the IN operator enables you to select observations based on several values. You specify the values in parentheses, separated by spaces or commas. Character values must be enclosed in quotation marks and must be in the same case as in the data set.
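The IN operator can be tried against the sample data set Sashelp.Class, which is assumed to be available in the SAS session. This is only a sketch; the names listed are values stored in that sample data set.

```sas
/* Select several character values; quotation marks are required  */
/* and the case must match the stored values (for example, Alice, */
/* not ALICE).                                                    */
proc print data=sashelp.class;
   where name in ('Alfred','Alice','Barbara');
run;
```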
If you want to sort your data and create a temporary data set named Calc to store the sorted data, which of the following steps should you submit?
1. ```
proc sort data=work.calc out=finance.dividend;
run;
```
2. ```
proc sort dividend out=calc;
   by account;
run;
```
3. ```
proc sort data=finance.dividend out=work.calc;
   by account;
run;
```
4. ```
proc sort from finance.dividend to calc;
   by account;
run;
```
Correct answer: c
In a PROC SORT step, you use the DATA= option to specify the data set to sort. The OUT= option specifies an output data set. The required BY statement specifies the variable(s) to use in sorting the data.
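A minimal sketch of the same pattern, using the sample data set Sashelp.Class (assumed to be available) as input; the output name Work.Class_Sorted is hypothetical.

```sas
/* DATA= names the input, OUT= names the temporary output data set. */
/* Without OUT=, PROC SORT would try to replace the input data set. */
proc sort data=sashelp.class out=work.class_sorted;
   by age;
run;

proc print data=work.class_sorted;
run;
```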
Which options are used to create the following PROC PRINT output?
1. the DATE system option and the LABEL option in PROC PRINT
2. the DATE and NONUMBER system options and the DOUBLE and NOOBS options in PROC PRINT
3. the DATE and NONUMBER system options and the DOUBLE option in PROC PRINT
4. the DATE and NONUMBER system options and the NOOBS option in PROC PRINT
Correct answer: b
The DATE and NONUMBER system options cause the output to appear with the date but without page numbers. In the PROC PRINT step, the DOUBLE option specifies double spacing, and the NOOBS option removes the default Obs column.
Which of the following statements can you use in a PROC PRINT step to create this output?
1. ```
var month instructors;
sum instructors aerclass walkjogrun swim;
```
2. ```
var month;
sum instructors aerclass walkjogrun swim;
```
3. ```
var month instructors aerclass;
sum instructors aerclass walkjogrun swim;
```
4. all of the above
Correct answer: d
You do not need to name the variables in a VAR statement if you specify them in the SUM statement, but you can. If you choose not to name the variables in the VAR statement as well, then the SUM statement determines their order in the output.
What happens if you submit the following program?
```
proc sort data=clinic.diabetes;
run;
proc print data=clinic.diabetes;
   var age height weight pulse;
   where sex='F';
run;
```
1. The PROC PRINT step runs successfully, printing observations in their sorted order.
2. The PROC SORT step permanently sorts the input data set.
3. The PROC SORT step generates errors and stops processing, but the PROC PRINT step runs successfully, printing observations in their original (unsorted) order.
4. The PROC SORT step runs successfully, but the PROC PRINT step generates errors and stops processing.
Correct answer: c
The BY statement is required in PROC SORT. Without it, the PROC SORT step fails. However, the PROC PRINT step prints the original data set as requested.
If you submit the following program, which output does it create?
```
proc sort data=finance.loans out=work.loans;
   by months amount;
run;
proc print data=work.loans noobs;
   var months;
   sum amount payment;
   where months<360;
run;
```
Correct answer: a
Column totals appear at the end of the report in the same format as the values of the variables, so b is incorrect. Work.Loans is sorted by Months and Amount, so c is incorrect. The program sums both Amount and Payment, so d is incorrect.
Choose the statement below that selects rows in which
- the amount is less than or equal to $5000
- the account is 101-1092 or the rate equals 0.095.
1. ```
where amount <= 5000 and
      account='101-1092' or rate = 0.095;
```
2. ```
where (amount le 5000 and account='101-1092')
      or rate = 0.095;
```
3. ```
where amount <= 5000 and
      (account='101-1092' or rate eq 0.095);
```
4. ```
where amount <= 5000 or account='101-1092'
      and rate = 0.095;
```
Correct answer: c
To ensure that the compound expression is evaluated correctly, you can use parentheses to group
```
account='101-1092' or rate eq 0.095
```
For example, from the data set above, a and b above select observations 2 and 8 (those that have a rate of 0.095); c selects no observations; and d selects observations 4 and 7 (those that have an amount less than or equal to 5000).
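The difference that the parentheses make can be seen on a small test data set. The sketch below uses hypothetical values in a hypothetical data set named Work.Loan_Demo; it is not the data set that the quiz question refers to.

```sas
/* Hypothetical data, read with column input */
data work.loan_demo;
   input Account $ 1-8 Amount 10-14 Rate 16-20;
   datalines;
101-1092 22000 0.100
101-1731 4000  0.095
101-1289 30000 0.095
101-1092 5000  0.080
;
run;

/* Without parentheses, AND is evaluated before OR:  */
/* observations 2, 3, and 4 are selected.            */
proc print data=work.loan_demo;
   where amount <= 5000 and account='101-1092' or rate = 0.095;
run;

/* With parentheses, the OR condition is evaluated first: */
/* only observations 2 and 4 are selected.                */
proc print data=work.loan_demo;
   where amount <= 5000 and (account='101-1092' or rate = 0.095);
run;
```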
What does PROC PRINT display by default?
1. PROC PRINT does not create a default report; you must specify the rows and columns to be displayed.
2. PROC PRINT displays all observations and variables in the data set. If you want an additional column for observation numbers, you can request it.
3. PROC PRINT displays columns in the following order: a column for observation numbers, all character variables, and all numeric variables.
4. PROC PRINT displays all observations and variables in the data set, a column for observation numbers on the far left, and variables in the order in which they occur in the data set.
Correct answer: d
You can remove the column for observation numbers. You can also specify the variables you want, and you can select observations according to conditions.

Chapter 5: Creating SAS Data Sets From Raw Files and Excel Worksheets

Which SAS statement associates the fileref Crime with the raw data file C:\States\Data\Crime?
1. filename crime 'c:\states\data\crime';
2. filename crime c:\states\data\crime;
3. fileref crime 'c:\states\data\crime';
4. filename 'c:\states\data\crime' crime;
Correct answer: a
Before you can read your raw data, you must reference the raw data file by creating a fileref. You assign a fileref by using a FILENAME statement in the same way that you assign a libref by using a LIBNAME statement.
Filerefs remain in effect until …
1. you change them.
2. you cancel them.
3. you end your SAS session.
4. all of the above
Correct answer: d
Like LIBNAME statements, FILENAME statements are global; they remain in effect until you change them, cancel them, or end your SAS session.
Which statement identifies the name of a raw data file to be read with the fileref Products and specifies that the DATA step read only records 1-15?
1. infile products obs 15;
2. infile products obs=15;
3. input products obs=15;
4. input products 1-15;
Correct answer: b
You use an INFILE statement to specify the raw data file to be read. You can specify a fileref or an actual filename (in quotation marks). The OBS= option in the INFILE statement enables you to process only records 1 through n.

Which of the following programs correctly writes the observations from the data set below to a raw data file?

1. ```
data _null_;
   set work.patients;
   infile 'c:\clinic\patients\referals.dat';
   input id 1-4 sex 6 age 8-9 height 11-12
         weight 14-16 pulse 18-20;
run;
```
2. ```
data referals.dat;
   set work.patients;
   input id 1-4 sex 6 age 8-9 height 11-12
         weight 14-16 pulse 18-20;
run;
```
3. ```
data _null_;
   set work.patients;
   file c:\clinic\patients\referals.dat;
   put id 1-4 sex 6 age 8-9 height 11-12
       weight 14-16 pulse 18-20;
run;
```
4. ```
data _null_;
   set work.patients;
   file 'c:\clinic\patients\referals.dat';
   put id 1-4 sex 6 age 8-9 height 11-12
       weight 14-16 pulse 18-20;
run;
```

Correct answer: d

The keyword _NULL_ in the DATA statement enables you to use the power of the DATA step without actually creating a SAS data set. You use the FILE and PUT statements to write out the observations from a SAS data set to a raw data file. The FILE statement specifies the raw data file and the PUT statement describes the lines to write to the raw data file. The filename and location specified in the FILE statement must be enclosed in quotation marks.
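The same FILE and PUT pattern can be sketched with the sample data set Sashelp.Class, assumed to be available in the SAS session. To avoid inventing a path, this example writes to a temporary file through a hypothetical fileref named Outref; a fileref is not quoted in the FILE statement, whereas a physical file name would be.

```sas
/* Hypothetical fileref; TEMP lets SAS choose a temporary location */
filename outref temp;

data _null_;               /* no SAS data set is created */
   set sashelp.class;      /* read each observation */
   file outref;            /* direct PUT output to the raw data file */
   put Name $ 1-8 Sex $ 10 Age 12-13;   /* column output */
run;
```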

Which raw data file can be read using column input?
4. all of the above.
Correct answer: b
Column input is appropriate only in some situations. When you use column input, your data must be standard character or numeric values, and they must be in fixed fields. That is, values for a particular variable must be in the same location in all records.

Which program creates the output shown below?

1. ```
data work.salesrep;
   infile empdata;
   input ID $ 1-4 LastName $ 6-12
         FirstName $ 14-18 City $ 20-29;
run;
proc print data=work.salesrep;
run;
```
2. ```
data work.salesrep;
   infile empdata;
   input ID $ 1-4 Name $ 6-12
         FirstName $ 14-18 City $ 20-29;
run;
proc print data=work.salesrep;
run;
```
3. ```
data work.salesrep;
   infile empdata;
   input ID $ 1-4 name1 $ 6-12
         name2 $ 14-18 City $ 20-29;
run;
proc print data=work.salesrep;
run;
```
4. all of the above.

Correct answer: a

The INPUT statement creates a variable using the name that you assign to each field. Therefore, when you write an INPUT statement, you need to specify the variable names exactly as you want them to appear in the SAS data set.

Which statement correctly reads the fields in the following order: StockNumber, Price, Item, Finish, Style?
| Field Name | Start Column | End Column | Data Type |
|---|---|---|---|
| StockNumber | 1 | 3 | character |
| Finish | 5 | 9 | character |
| Style | 11 | 18 | character |
| Item | 20 | 24 | character |
| Price | 27 | 32 | numeric |

1. ```
input StockNumber $ 1-3 Finish $ 5-9 Style $ 11-18
      Item $ 20-24 Price 27-32;
```
2. ```
input StockNumber $ 1-3 Price 27-32
      Item $ 20-24 Finish $ 5-9 Style $ 11-18;
```
3. ```
input $ StockNumber 1-3 Price 27-32   $
      Item   20-24 $ Finish 5-9 $ Style 11-18;
```
4. ```
input StockNumber $ 1-3 Price $ 27-32
      Item $ 20-24 Finish $ 5-9 Style $ 11-18;
```
Correct answer: b
You can use column input to read fields in any order. You must specify the variable name to be created, identify character values with a $, and name the correct starting column and ending column for each field.
Which statement correctly re-defines the values of the variable Income as 100 percent higher?
1. income=income*1.00;
2. income=income+(income*2.00);
3. income=income*2;
4. income=*2;
Correct answer: c
To re-define the values of the variable Income in an assignment statement, you specify the variable name on the left side of the equal sign and an appropriate expression including the variable name on the right side of the equal sign.

Which program correctly reads instream data?

1. ```
data finance.newloan;
   input datalines;
   if country='JAPAN';
   MonthAvg=amount/12;
1998 US     CARS   194324.12
1998 US     TRUCKS 142290.30
1998 CANADA CARS    10483.44
1998 CANADA TRUCKS  93543.64
1998 MEXICO CARS    22500.57
1998 MEXICO TRUCKS  10098.88
1998 JAPAN  CARS    15066.43
1998 JAPAN  TRUCKS  40700.34
;
```
2. ```
data finance.newloan;
   input Year 1-4 Country $ 6-11
         Vehicle $ 13-18 Amount 20-28;
   if country='JAPAN';
   MonthAvg=amount/12;
   datalines;
run;
```
3. ```
data finance.newloan;
   input Year 1-4 Country 6-11
         Vehicle 13-18 Amount 20-28;
   if country='JAPAN';
   MonthAvg=amount/12;
   datalines;
1998 US     CARS   194324.12
1998 US     TRUCKS 142290.30
1998 CANADA CARS    10483.44
1998 CANADA TRUCKS  93543.64
1998 MEXICO CARS    22500.57
1998 MEXICO TRUCKS  10098.88
1998 JAPAN  CARS    15066.43
1998 JAPAN  TRUCKS  40700.34
;
```
4. ```
data finance.newloan;
   input Year 1-4 Country $ 6-11
         Vehicle $ 13-18 Amount 20-28;
   if country='JAPAN';
   MonthAvg=amount/12;
   datalines;
1998 US     CARS   194324.12
1998 US     TRUCKS 142290.30
1998 CANADA CARS    10483.44
1998 CANADA TRUCKS  93543.64
1998 MEXICO CARS    22500.57
1998 MEXICO TRUCKS  10098.88
1998 JAPAN  CARS    15066.43
1998 JAPAN  TRUCKS  40700.34
;
```

Correct answer: d

To read instream data, you specify a DATALINES statement and data lines, followed by a null statement (single semicolon) to indicate the end of the input data. Program a contains no DATALINES statement, and the INPUT statement doesn't specify the fields to read. Program b contains no data lines, and the INPUT statement in program c doesn't specify the necessary dollar signs for the character variables Country and Vehicle.

Which SAS statement subsets the raw data shown below so that only the observations in which Sex (in the second field) has a value of F are processed?
1. if sex=f;
2. if sex=F;
3. if sex='F';
4. a or b
Correct answer: c
To subset data, you can use a subsetting IF statement in any DATA step to process only those observations that meet a specified condition. Because Sex is a character variable, the value F must be enclosed in quotation marks and must be in the same case as in the data set.
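A minimal sketch of a subsetting IF statement follows. The raw data from the question is not reproduced here, so the three instream records and the variable names Name and Age are assumptions for illustration only.

```sas
/* Hypothetical instream records; Sex is the second field. */
data work.females;
   input Name $ Sex $ Age;
   /* Character constant: quoted, and in the same case as the data. */
   if sex='F';
   datalines;
Alice F 34
Bob M 41
Carla F 29
;
run;

proc print data=work.females;
run;
```

Only the records for Alice and Carla are written to `work.females`; the record with Sex equal to M fails the condition and is not processed further.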

Chapter 6: Understanding DATA Step Processing

Which of the following is not created during the compilation phase?
1. the data set descriptor
2. the first observation
3. the program data vector
4. the _N_ and _ERROR_ automatic variables
Correct answer: b
At the beginning of the compilation phase, the program data vector is created. The program data vector includes the two automatic variables _N_ and _ERROR_. The descriptor portion of the new SAS data set is created at the end of the compilation phase. The descriptor portion includes the name of the data set, the number of observations and variables, and the names and attributes of the variables. Observations are not written until the execution phase.
During the compilation phase, SAS scans each statement in the DATA step, looking for syntax errors. Which of the following is not considered a syntax error?
1. incorrect values and formats
2. invalid options or variable names
3. missing or invalid punctuation
4. missing or misspelled keywords
Correct answer: a
Syntax checking can detect many common errors, but it cannot verify the values of variables or the correctness of formats.
Unless otherwise directed, the DATA step executes...
1. once for each compilation phase.
2. once for each DATA step statement.
3. once for each record in the input file.
4. once for each variable in the input file.
Correct answer: c
The DATA step executes once for each record in the input file, unless otherwise directed.
At the beginning of the execution phase, the value of _N_ is 1, the value of _ERROR_ is 0, and the values of the remaining variables are set to:
1. 0
2. 1
3. undefined
4. missing
Correct answer: d
The remaining variables are initialized to missing. Missing numeric values are represented by periods, and missing character values are represented by blanks.
Suppose you run a program that causes three DATA step errors. What is the value of the automatic variable _ERROR_ when the observation that contains the third error is processed?
1. 0
2. 1
3. 2
4. 3
Correct answer: b
The default value of _ERROR_ is 0, which means there is no error. When an error occurs, whether one error or multiple errors, the value is set to 1.
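The behavior described above can be checked with a small sketch. The data set name and the two instream records are hypothetical; the second record contains two invalid numeric fields on purpose.

```sas
/* Hypothetical data: the second record has two invalid numeric values. */
data work.errdemo;
   input A B;
   /* Write the automatic variables to the log for each iteration. */
   put _n_= _error_=;
   datalines;
1 2
x y
;
run;
```

For the second record the log shows that _ERROR_ is 1, not 2, even though two fields could not be read as numbers.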
Which of the following actions occurs at the end of an iteration of the DATA step?
1. The automatic variables _N_ and _ERROR_ are incremented by one.
2. The DATA step stops execution.
3. The descriptor portion of the data set is written.
4. The values of variables created in programming statements are re-set to missing in the program data vector.
Correct answer: d
By default, at the end of each iteration of the DATA step, the values in the program data vector are written to the data set as an observation, the value of the automatic variable _N_ is incremented by one, control returns to the top of the DATA step, and the values of variables created in programming statements are re-set to missing. The automatic variable _ERROR_ retains its value.
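A small sketch of the reset behavior, assuming two hypothetical instream records. Y is created by an assignment statement, so the PUT statement that runs before the assignment should show it as missing in every iteration, while _N_ goes up by one each time.

```sas
/* Hypothetical data set and records, for illustration only. */
data work.iter;
   input X;
   /* Y has not been assigned yet in this iteration. */
   put 'Before assignment: ' _n_= y=;
   Y=x*2;
   datalines;
1
2
;
run;
```

If Y kept its value from the previous iteration, the second log line would show a nonmissing value for Y.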
Look carefully at the DATA step shown below. Based on the INPUT statement, in what order will the variables be stored in the new data set?
```
data perm.update;
   infile invent;
   input IDnum $ Item $ 1-13 Instock 21-22
         BackOrd 24-25;
   Total=instock+backord;
run;
```
1. IDnum Item InStock BackOrd Total
2. Item IDnum InStock BackOrd Total
3. Total IDnum Item InStock BackOrd
4. Total Item IDnum InStock BackOrd
Correct answer: a
The order in which variables are defined in the DATA step determines the order in which the variables are stored in the data set.
If SAS cannot interpret syntax errors, then...
1. data set variables will contain missing values.
2. the DATA step does not compile.
3. the DATA step still compiles, but it does not execute.
4. the DATA step still compiles and executes.
Correct answer: c
When SAS can't interpret syntax errors, the DATA step compiles, but it does not execute.
What is wrong with this program?
```
data perm.update;
   infile invent
   input Item $ 1-13 IDnum $ 15-19 Instock 21-22
         BackOrd 24-25;
   total=instock+backord;
run;
```
1. missing semicolon on second line
2. missing semicolon on third line
3. incorrect order of variables
4. incorrect variable type
Correct answer: a
A semicolon is missing from the second line. It will cause an error because the INPUT statement will be interpreted as invalid INFILE statement options.
Look carefully at this section of a SAS session log. Based on the note, what was the most likely problem with the DATA step?
1. A keyword was misspelled in the DATA step.
2. A semicolon was missing from the INFILE statement.
3. A variable was misspelled in the INPUT statement.
4. A dollar sign was missing in the INPUT statement.
Correct answer: d
The third line of the log displays the values for IDnum, which are clearly character values. The fourth line displays the values in the program data vector and shows that the values for IDnum are missing, even though the other values are correctly assigned. Thus, it appears that numeric values were expected for IDnum. A dollar sign, to indicate character values, must be missing from the INPUT statement.

Chapter 7: Creating and Applying User-Defined Formats

If you don't specify the LIBRARY= option, your formats are stored in Work.Formats, and they exist ...
1. only for the current procedure.
2. only for the current DATA step.
3. only for the current SAS session.
4. permanently.
Correct answer: c
If you do not specify the LIBRARY= option, formats are stored in a default format catalog named Work.Formats. As the libref Work implies, any format that is stored in Work.Formats is a temporary format that exists only for the current SAS session.
Which of the following statements will store your formats in a permanent catalog?
1. ```
libname library 'c:\sas\formats\lib';
proc format lib=library
   ...;
```
2. ```
libname library 'c:\sas\formats\lib';
format lib=library
   ...;
```
3. ```
library='c:\sas\formats\lib';
proc format library
   ...;
```
4. ```
library='c:\sas\formats\lib';
proc library
   ...;
```
Correct answer: a
To store formats in a permanent catalog, you first write a LIBNAME statement to associate the libref with the SAS data library in which the catalog will be stored. Then add the LIB= (or LIBRARY=) option to the PROC FORMAT statement, specifying the name of the catalog.
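Put together, a complete step might look like the sketch below. The directory is a placeholder that you would replace with a real location, and the format name `deptfmt` and its values are invented for the example.

```sas
/* Placeholder path: replace with a directory that exists on your system. */
libname library 'your-format-directory';

/* LIB= names the libref, so the format is stored in Library.Formats. */
proc format lib=library;
   value deptfmt
         10='Sales'
         20='Support';
run;
```

Because the catalog is stored in a permanent library rather than in Work, the format is still available in later SAS sessions once the libref is assigned again.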
When creating a format with the VALUE statement, the new format's name
- cannot end with a number
- cannot end with a period
- cannot be the name of a SAS format, and...
1. cannot be the name of a data set variable.
2. must be at least two characters long.
3. must be at least eight characters long.
4. must begin with a dollar sign ($) if used with a character variable.
Correct answer: d
The name of a format that is created with a VALUE statement must begin with a dollar sign ($) if it applies to a character variable.

Which of the following FORMAT procedures is written correctly?

1. ```
proc format lib=library
   value colorfmt;
         1='Red'
         2='Green'
         3='Blue'
run;
```
2. ```
proc format lib=library;
   value colorfmt
         1='Red'
         2='Green'
         3='Blue';
run;
```
3. ```
proc format lib=library;
   value colorfmt;
         1='Red'
         2='Green'
         3='Blue'
run;
```
4. ```
proc format lib=library;
   value colorfmt
         1='Red';
         2='Green';
         3='Blue';
run;
```

Correct answer: b

A semicolon is needed after the PROC FORMAT statement. The VALUE statement begins with the keyword VALUE and ends with a semicolon after all the labels have been defined.

Which of these is false? Ranges in the VALUE statement can specify...
1. a single value, such as 24 or 'S'.
2. a range of numeric values, such as 0-1500.
3. a range of character values, such as 'a'-'M'.
4. a list of numeric and character values separated by commas, such as 90,'B',180,'D',270.
Correct answer: d
You can list values separated by commas, but the list must contain either all numeric values or all character values. Data set variables are either numeric or character.
How many characters can be used in a label?
1. 40
2. 96
3. 200
4. 256
Correct answer: d
When specifying a label, enclose it in quotation marks and limit the label to 256 characters.
Which keyword can be used to label missing numeric values as well as any values that are not specified in a range?
1. LOW
2. MISS
3. MISSING
4. OTHER
Correct answer: d
MISS and MISSING are invalid keywords, and LOW does not include missing numeric values. The keyword OTHER can be used in the VALUE statement to label missing values as well as any values that are not specifically included in a range.
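The sketch below shows one way to see the difference between LOW and OTHER. The format name `agefmt`, its ranges, and the test values are assumptions chosen for illustration.

```sas
/* Hypothetical format: LOW starts at the smallest nonmissing value. */
proc format;
   value agefmt
         low-<13='child'
         13-<20='teen'
         20-high='adult'
         other='unknown';
run;

/* Apply the format to a few values, including a missing value. */
data _null_;
   do Age=5, 15, 40, .;
      put age= age agefmt.;
   end;
run;
```

The missing value is not covered by the range that begins with LOW, so it receives the label assigned to OTHER.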
You can place the FORMAT statement in either a DATA step or a PROC step. What happens when you place it in a DATA step?
1. You temporarily associate the formats with variables.
2. You permanently associate the formats with variables.
3. You replace the original data with the format labels.
4. You make the formats available to other data sets.
Correct answer: b
By placing the FORMAT statement in a DATA step, you permanently associate the defined format with variables.
The format JOBFMT was created in a FORMAT procedure. Which FORMAT statement will apply it to the variable JobTitle in the program output?
1. format jobtitle jobfmt;
2. format jobtitle jobfmt.;
3. format jobtitle=jobfmt;
4. format jobtitle='jobfmt';
Correct answer: b
To associate a user-defined format with a variable, place a period at the end of the format name when it is used in the FORMAT statement.
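A short end-to-end sketch of this rule follows. The code values, labels, and the data set `work.staff` are made up for the example; the format name matches the one in the question.

```sas
/* Hypothetical job codes and labels. */
proc format;
   value jobfmt
         103='manager'
         105='text processor';
run;

/* Hypothetical data set with a numeric JobTitle code. */
data work.staff;
   input Name $ JobTitle;
   datalines;
Lee 103
Kim 105
;
run;

proc print data=work.staff;
   /* The trailing period marks jobfmt as a format name. */
   format jobtitle jobfmt.;
run;
```

Without the period, SAS would read `jobfmt` as another variable name in the FORMAT statement rather than as a format.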
Which keyword, when added to the PROC FORMAT statement, will display all the formats in your catalog?
1. CATALOG
2. LISTFMT
3. FMTCAT
4. FMTLIB
Correct answer: d
Adding the keyword FMTLIB to the PROC FORMAT statement displays a list of all the formats in your catalog, along with descriptions of their values.
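For example, the following sketch first defines a format in the default Work.Formats catalog and then lists the catalog in a second step. The format definition reuses the colorfmt values from the earlier question; splitting the work into two steps is simply a choice made for clarity here.

```sas
/* Define a format in the default catalog, Work.Formats. */
proc format;
   value colorfmt
         1='Red'
         2='Green'
         3='Blue';
run;

/* List the formats in the catalog along with their ranges and labels. */
proc format fmtlib;
run;
```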

Chapter 8: Creating Enhanced List and Summary Reports

If Style has four unique values and you submit the following program, which output do you get? (Assume that all the other variables are numeric.)
```
proc report data=sasuser.houses nowd;
   column style sqfeet bedrooms price;
   define style / group;
run;
```
Correct answer: a
This program creates a summary report, which consolidates into one row all observations from the data set that have a unique combination of values for the variable Style.
When you define an order variable,
1. the detail rows are ordered according to their formatted values.
2. you can't create summary reports.
3. PROC REPORT displays only the first occurrence of each order variable value in a set of rows that have the same value for all order variables.
4. all of the above
Correct answer: d
Order variables do order rows according to the formatted values of the order variable, and PROC REPORT suppresses repetitious printing of order values. However, you can't use order variables in a summary report.
Which attributes or options are reflected in this PROC REPORT output?
1. SKIPLINE and FORMAT=
2. CENTER, HEADLINE, HEADSKIP, and either WIDTH=, SPACING=, or FORMAT=
3. SPACING= only
4. CENTER, FORMAT=, and HEADLINE
Correct answer: b
The HEADLINE option underlines the headings, and the HEADSKIP option skips a line between the headings and the rows in the report. Also, Style is centered, and the column for Price is wider than the default.

To create a summary report that shows the average number of bedrooms and the maximum number of baths for each style of house, which DEFINE statements do you use in your PROC REPORT step?

define style / center 'Style of/House';
define bedrooms / mean 'Average/Bedrooms';
define baths / max 'Maximum/Baths';

define style / group;
define bedrooms / mean 'Average/Bedrooms';
define baths / max 'Maximum/Baths';

define style / order;
define bedrooms / mean 'Average/Bedrooms';
define baths / max 'Maximum/Baths';

define style / group;
define bedrooms / 'Average/Bedrooms';
define baths / 'Maximum/Baths'

Correct answer: b

To create a summary report, you must define a group variable. To produce the statistics that you want, you must specify the MEAN and MAX statistics for Bedrooms and Baths.

Which program does not contain an error?

proc report data=sasuser.houses nowd;
   column style bedrooms baths;
   define style / order;
   define bedbathratio / computed format=4.2;
   compute bedbathratio;
      bedbathratio=baths.sum/bedrooms.sum; endcomp;
run;

proc report data=sasuser.houses nowd;
   column style bedrooms baths BedBathRatio;
   define style / order;
   define bedbathratio / order format=4.2;
   compute bedbathratio;
      bedbathratio=baths.sum/bedrooms.sum; endcomp;
run;

proc report data=sasuser.houses nowd;
   column style bedrooms baths BedBathRatio;
   define style / order;
   define bedbathratio / computed format=4.2;
   compute bedbathratio;
      bedbathratio=baths.sum/bedrooms.sum; endcomp;
run;

proc report data=sasuser.houses nowd;
   column style bedrooms baths BedBathRatio;
   define style / order;
   define bedbathratio / computed format=4.2;
   compute bedbathratio;
      bedbathratio=baths/bedrooms;
   endcomp;
run;

Correct answer: c

Program c correctly specifies a computed variable in the COLUMN statement, defines the variable in a DEFINE statement, and computes values using the form variable-name.statistic in a compute block.

What output does this PROC REPORT step produce?
```
proc report data=sasuser.houses nowd;
   column style sqfeet bedrooms price;
run;
```
1. a list report ordered by values of the first variable in the COLUMN statement
2. a summary report ordered by values of the first variable in the COLUMN statement
3. a list report that displays a row for each observation in the input data set and which calculates the SUM statistic for numeric variables
4. a list report that calculates the N (frequency) statistic for character variables
Correct answer: c
By default, PROC REPORT treats character variables as display variables and numeric variables as analysis variables that use the SUM statistic. A report that contains one or more display variables has a detail row for each observation in the data set.

Which of the following programs produces this output?

1. ```
proc report data=sasuser.houses nowd;
   column style condo range split
          twostory price;
   define price / mean 'Average Price';
run;
```
2. ```
proc report data=sasuser.houses nowd;
   column style price;
   define style / group;
   define price / mean 'Average Price';
run;
```
3. ```
proc report data=sasuser.houses nowd;
   column style price;
   define style / across;
   define price / mean 'Average Price';
run;
```
4. ```
proc report data=sasuser.houses nowd;
   column style price;
   define style / across 'CONDO' 'RANCH'
         'SPLIT' 'TWOSTORY';
   define price / mean 'Average Price';
run;
```

Correct answer: c

In this output, the table cells contain a frequency count for each unique value of an across variable, Style. You don't have to specify across variable values in your PROC REPORT step.

If you submit this program, where does your PROC REPORT output appear?
```
proc report data=sasuser.houses nowd;
   column style sqfeet bedrooms price;
   define style / group;
run;
```
1. in the PROC REPORT window
2. as HTML and/or SAS listing output
3. both of the above
4. neither of the above
Correct answer: b
In nonwindowing mode, your PROC REPORT output appears as HTML and/or as SAS listing output, depending on your option settings.
How can you create output with headings that break as shown below?
1. You must specify the SPLIT= option in the PROC REPORT statement and use the split character in column headings in DEFINE statements.
2. You must use the default split character in column headings in DEFINE statements.
3. You must specify either the WIDTH= or the SPACING= attribute in DEFINE statements.
4. These headings split this way by default.
Correct answer: d
By default, the column width for a character variable equals the variable's length, and the column width for a numeric variable is 9. So these headings split this way by default.
Suppose you want to create a report using both character and numeric variables. If you don't use any DEFINE statements in your PROC REPORT step,
1. your PROC REPORT step will not execute successfully.
2. you can produce only list reports.
3. you can order rows by specifying options in the PROC REPORT statement.
4. you can produce only summary reports.
Correct answer: b
Unless you use DEFINE statements to define order variables or group variables, you can't order rows or produce summary reports. However, DEFINE statements are not required in all PROC REPORT steps.

Chapter 9: Producing Descriptive Statistics

The default statistics produced by the MEANS procedure are n-count, mean, minimum, maximum, and...
1. median
2. range
3. standard deviation
4. standard error of the mean.
Correct answer: c
By default, the MEANS procedure produces the n-count, mean, minimum, maximum, and standard deviation.
Which statement will limit a PROC MEANS analysis to the variables Boarded, Transfer, and Deplane?
1. by boarded transfer deplane;
2. class boarded transfer deplane;
3. output boarded transfer deplane;
4. var boarded transfer deplane;
Correct answer: d
To specify the variables that PROC MEANS analyzes, add a VAR statement and list the variable names.
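A brief sketch of the VAR statement in context. The data set `work.flights`, the extra variable Capacity, and all values are hypothetical and serve only to show that unlisted numeric variables are left out of the analysis.

```sas
/* Hypothetical flight data with one extra numeric variable. */
data work.flights;
   input Flight $ Boarded Transfer Deplane Capacity;
   datalines;
IA101 120 15 110 180
IA102 98 22 101 180
IA103 143 9 150 180
;
run;

proc means data=work.flights;
   /* Capacity is numeric but is not analyzed because it is not listed. */
   var boarded transfer deplane;
run;
```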
The data set Survey.Health includes the following variables. Which is a poor candidate for PROC MEANS analysis?
1. IDnum
2. Age
3. Height
4. Weight
Correct answer: a
Unlike Age, Height, or Weight, the values of IDnum are unlikely to yield any useful statistics.
Which of the following statements is true regarding BY group processing?
1. BY variables must be either indexed or sorted.
2. Summary statistics are computed for BY variables.
3. BY group processing is preferred when you are categorizing data that contains few variables.
4. BY group processing overwrites your data set with the newly grouped observations.
Correct answer: a
Unlike CLASS processing, BY group processing requires that your data already be indexed or sorted in the order of the BY variables. You might need to run the SORT procedure before using PROC MEANS with a BY group.
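The two-step pattern described above can be sketched as follows, using a small hypothetical data set `work.patients` whose records are deliberately not in Sex order.

```sas
/* Hypothetical unsorted data. */
data work.patients;
   input Sex $ Age;
   datalines;
M 34
F 29
M 41
F 52
;
run;

/* Sort into a new data set so the original order is preserved. */
proc sort data=work.patients out=work.sorted;
   by sex;
run;

/* BY-group processing now finds the data in BY-variable order. */
proc means data=work.sorted;
   by sex;
   var age;
run;
```

Running the PROC MEANS step against the unsorted `work.patients` data set instead would stop with an error saying that the data is not sorted in ascending sequence.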
Which group processing statement produced the PROC MEANS output shown below?
1. class sex survive;
2. class survive sex;
3. by sex survive;
4. by survive sex;
Correct answer: b
A CLASS statement produces a single large table, whereas BY group processing creates a series of small tables. The order of the variables in the CLASS statement determines their order in the output table.

Which program can be used to create the following output?

1. ```
proc means data=clinic.diabetes;
   var age height weight;
   class sex;
   output out=work.sum_gender
      mean=AvgAge AvgHeight AvgWeight;
run;
```
2. ```
proc summary data=clinic.diabetes print;
   var age height weight; class sex;
   output out=work.sum_gender
      mean=AvgAge AvgHeight AvgWeight;
run;
```
3. ```
proc means data=clinic.diabetes noprint;
   var age height weight;
   class sex;
   output out=work.sum_gender
      mean=AvgAge AvgHeight AvgWeight;
run;
```
4. Both a and b.

Correct answer: d

You can use either PROC MEANS or PROC SUMMARY to create the table. Adding a PRINT option to the PROC SUMMARY statement produces the same report as if you used PROC MEANS.

By default, PROC FREQ creates a table of frequencies and percentages for which data set variables?
1. character variables
2. numeric variables
3. both character and numeric variables
4. none: variables must always be specified
Correct answer: c
By default, PROC FREQ creates a table for all variables in a data set.
Frequency distributions work best with variables that contain
1. continuous values.
2. numeric values.
3. categorical values.
4. unique values.
Correct answer: c
Both continuous values and many unique values can result in lengthy and meaningless tables. Frequency distributions work best with categorical values.

Which PROC FREQ step produced this two-way table?

1. ```
proc freq data=clinic.diabetes;
   tables height weight;
   format height htfmt. weight wtfmt.;
run;
```
2. ```
proc freq data=clinic.diabetes;
   tables weight height;
   format weight wtfmt. height htfmt.;
run;
```
3. ```
proc freq data=clinic.diabetes;
   tables height*weight;
   format height htfmt. weight wtfmt.;
run;
```
4. ```
proc freq data=clinic.diabetes;
   tables weight*height;
   format weight wtfmt. height htfmt.;
run;
```

Correct answer: d

An asterisk is used to join the variables in a two-way TABLES statement. The first variable forms the table rows, and the second variable forms the table columns.

Which PROC FREQ step produced this table?

1. ```
proc freq data=clinic.diabetes;
   tables sex weight / list;
   format weight wtfmt.;
run;
```
2. ```
proc freq data=clinic.diabetes;
   tables sex*weight / nocol;
   format weight wtfmt.;
run;
```
3. ```
proc freq data=clinic.diabetes;
   tables sex weight / norow nocol;
   format weight wtfmt.;
run;
```
4. ```
proc freq data=clinic.diabetes;
   tables sex*weight / nofreq norow nocol;
   format weight wtfmt.;
run;
```

Correct answer: d

An asterisk is used to join the variables in crosstabulation tables. The only results shown in this table are cell percentages. The NOFREQ option suppresses cell frequencies, the NOROW option suppresses row percentages, and the NOCOL option suppresses column percentages.

Chapter 10: Producing HTML Output

Using ODS statements, how many types of output can you generate at once?
1. 1 (only listing output)
2. 2
3. 3
4. as many as you want
Correct answer: d
You can generate any number of output types as long as you open the ODS destination for each type of output you want to create.
If ODS is set to its default settings, what types of output are created by the code below?
```
ods html file='c:\myhtml.htm';
ods pdf file='c:\mypdf.pdf';
```
1. HTML and PDF
2. PDF only
3. HTML, PDF, and listing
4. No output is created because ODS is closed by default.
Correct answer: c
Listing output is created by default, so these statements create HTML, PDF, and listing output.
What is the purpose of closing the Listing destination in the code shown below?
```
ods listing close;
ods html ... ;
```
1. It conserves system resources.
2. It simplifies your program.
3. It makes your program compatible with other hardware platforms.
4. It makes your program compatible with previous versions of SAS.
Correct answer: a
By default, SAS programs produce listing output. If you want only HTML output, it's a good idea to close the Listing destination before creating HTML output, as an open destination uses system resources.
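A typical pattern, sketched under the assumption that only HTML output is wanted, closes the Listing destination first and reopens it afterward. The file name `report.html` is a placeholder, and `sashelp.class` is used only because it is a sample data set that ships with SAS.

```sas
/* Close Listing so that only the HTML destination is open. */
ods listing close;

/* Placeholder file name; written to the current working directory. */
ods html body='report.html';

proc print data=sashelp.class;
run;

/* Close HTML to finish the file, then reopen Listing for later steps. */
ods html close;
ods listing;
```

Reopening the Listing destination at the end restores the default behavior for any steps that follow.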
When the code shown below is run, what will the file D:\Output\body.html contain?
```
ods html body='d:\output\body.html';
proc print data=work.alpha;
run;
proc print data=work.beta;
run;
ods html close;
```
1. The PROC PRINT output for Work.Alpha.
2. The PROC PRINT output for Work.Beta.
3. The PROC PRINT output for both Work.Alpha and Work.Beta.
4. Nothing. No output will be written to D:\Output\body.html.
Correct answer: c
When multiple procedures are run while HTML output is open, procedure output is appended to the same body file.
When the code shown below is run, what file will be loaded by the links in D:\Output\contents.html?
```
ods html body='d:\output\body.html'
         contents='d:\output\contents.html'
         frame='d:\output\frame.html';
```
1. D:\Output\body.html
2. D:\Output\contents.html
3. D:\Output\frame.html
4. There are no links from the file D:\Output\contents.html.
Correct answer: a
The CONTENTS= option creates a table of contents containing links to the body file, D:\Output\body.html.
The table of contents created by the CONTENTS= option contains a numbered heading for
1. each procedure.
2. each procedure that creates output.
3. each procedure and DATA step.
4. each HTML file created by your program.
Correct answer: b
The table of contents contains a numbered heading for each procedure that creates output.
When the code shown below is run, what will the file D:\Output\frame.html display?
```
ods html body='d:\output\body.html'
         contents='d:\output\contents.html'
         frame='d:\output\frame.html';
```
1. The file D:\Output\contents.html.
2. The file D:\Output\frame.html.
3. The files D:\Output\contents.html and D:\Output\body.html.
4. It displays no other files.
Correct answer: c
The FRAME= option creates an HTML file that integrates the table of contents and the body file.
What is the purpose of the URL= suboptions shown below?
```
ods html body='d:\output\body.html'  (url='body.html')
         contents='d:\output\contents.html'
         (url='contents.html')
         frame='d:\output\frame.html';
```
1. To create absolute link addresses for loading the files from a server.
2. To create relative link addresses for loading the files from a server.
3. To allow HTML files to be loaded from a local drive.
4. To send HTML output to two locations.
Correct answer: b
Specifying the URL= suboption in the file specification provides a URL that ODS uses in the links it creates. Specifying a simple (one name) URL creates a relative link address to the file.
Which ODS HTML option was used in creating the following table?
1. format=brown
2. format='brown'
3. style=brown
4. style='brown'
Correct answer: c
You can change the appearance of HTML output by using the STYLE= option in the ODS HTML statement. The style name doesn't need quotation marks.
What is the purpose of the PATH= option?
```
ods html  path='d:\output' (url=none)
          body='body.html'
          contents='contents.html'
          frame='frame.html';
```
1. It creates absolute link addresses for loading HTML files from a server.
2. It creates relative link addresses for loading HTML files from a server.
3. It allows HTML files to be loaded from a local drive.
4. It specifies the location of HTML file output.
Correct answer: d
You use the PATH= option to specify the location for HTML files to be stored. When you use the PATH= option, you don't need to specify the full path name for the body, contents, or frame files.

Chapter 11: Creating and Managing Variables

Which program creates the output shown below?

1. ```
data test2;
   infile furnture;
   input StockNum $ 1-3 Finish $ 5-9 Style $ 11-18
         Item $ 20-24 Price 26-31;
   if finish='oak' then delete;
   retain TotPrice 100;
   totalprice+price;
   drop price;
run;
proc print data=test2 noobs;
run;
```
2. ```
data test2;
   infile furnture;
   input StockNum $ 1-3 Finish $ 5-9 Style $ 11-18
         Item $ 20-24 Price 26-31;
   if finish='oak' and price<200 then delete;
   TotalPrice+price;
run;
proc print data=test2 noobs;
run;
```
3. ```
data test2(drop=price);
   infile furnture;
   input StockNum $ 1-3 Finish $ 5-9 Style $ 11-18
         Item $ 20-24 Price 26-31;
   if finish='oak' and price<200 then delete;
   TotalPrice+price;
run;
proc print data=test2 noobs;
run;
```
4. ```
data test2;
   infile furnture;
   input StockNum $ 1-3 Finish $ 5-9 Style $ 11-18
         Item $ 20-24 Price 26-31;
   if finish=oak and price<200 then delete price;
   TotalPrice+price;
run;
proc print data=test2 noobs;
run;
```

Correct answer: c

Program c correctly deletes the observation in which the value of Finish is oak and the value of Price is less than 200. It also creates TotalPrice by summing the variable Price down observations, then drops Price by using the DROP= data set option in the DATA statement.

How is the variable Amount labeled and formatted in the PROC PRINT output?
```
data credit;
   infile creddata;
   input Account $ 1-5 Name $ 7-25 Type $ 27
          Transact $ 29-35 Amount 37-50;
   label amount='Amount of Loan';
   format amount dollar12.2;
run;
proc print data=credit label;
   label amount='Total Amount Loaned';
   format amount comma10.;
run;
```
1. label Amount of Loan, format DOLLAR12.2
2. label Total Amount Loaned, format COMMA10.
3. label Amount, default format
4. The PROC PRINT step does not execute because two labels and two formats are assigned to the same variable.
Correct answer: b
The PROC PRINT output displays the label Total Amount Loaned for the variable Amount and formats this variable using the COMMA10. format. Temporary labels or formats that are assigned in a PROC step override permanent labels or formats that are assigned in a DATA step.
Consider the IF-THEN statement shown below. When the statement is executed, which expression is evaluated first?
```
if finlexam>=95
   and (research='A' or
       (project='A' and present='A'))
   then Grade='A+';
```
1. finlexam>=95
2. research='A'
3. project='A' and present='A'
4. research='A' or (project='A' and present='A')
Correct answer: c
Logical comparisons that are enclosed in parentheses are evaluated as true or false before they are compared to other expressions. In the example above, the AND comparison within the nested parentheses is evaluated before being compared to the OR comparison.
Consider the small raw data file and program shown below. What is the value of Count after the fourth record is read?
1. missing
2. 0
3. 30
4. 70
Correct answer: d
The sum statement adds the result of the expression that is on the right side of the plus sign to the numeric variable that is on the left side. The new value is then retained for subsequent observations. The sum statement treats the missing value as a 0, so the value of Count in the fourth observation would be 10+20+0+40, or 70.
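The original raw data file and program are not reproduced in this answer key, so the sketch below is a reconstruction under stated assumptions: four records whose values are 10, 20, missing, and 40, read into a hypothetical variable named Value.

```sas
/* Assumed records: 10, 20, a missing value, and 40. */
data work.counts;
   input Value;
   /* Sum statement: Count starts at 0, is retained, and ignores missing values. */
   Count+value;
   put _n_= value= count=;
   datalines;
10
20
.
40
;
run;
```

In the log, Count stays at 30 when the missing value is read and reaches 70 on the fourth record.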
Now consider the revised program below. What is the value of Count after the third observation is read?
1. missing
2. 0
3. 100
4. 130
Correct answer: d
The RETAIN statement assigns an initial value of 100 to the variable Count, so the value of Count in the third observation would be 100+10+20+0, or 130.
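The revised program is likewise not shown here. A sketch of what it might look like, assuming the same four records (10, 20, missing, 40) and a hypothetical variable named Value:

```sas
/* Assumed records: 10, 20, a missing value, and 40. */
data work.counts2;
   /* RETAIN gives Count an initial value of 100 instead of 0. */
   retain Count 100;
   input Value;
   count+value;
   put _n_= value= count=;
   datalines;
10
20
.
40
;
run;
```

Count is 110 after the first record, 130 after the second, and still 130 after the third, because the missing value adds nothing.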
For the observation shown below, what is the result of the IF-THEN statements?

| Status | Type | Count | Action | Control |
|---|---|---|---|---|
| Ok | 3 | 12 | E | Go |

```
if status='OK' and type=3
   then Count+1;
if status='S' or action='E'
   then Control='Stop';
```
1. ```
Count = 12    Control = Go
```
2. ```
Count = 13    Control = Stop
```
3. ```
Count = 12    Control = Stop
```
4. ```
Count = 13    Control = Go
```
Correct answer: c
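You must enclose character values in quotation marks, and you must specify them in the same case in which they appear in the data set. The value Ok is not identical to OK, so the first condition is false and the value of Count is not changed by the IF-THEN statement. In the second statement, Action is E, so the OR condition is true and Control is set to Stop.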
Which of the following can determine the length of a new variable?
1. the length of the variable's first value
2. the assignment statement
3. the LENGTH statement
4. all of the above
Correct answer: d
The length of a variable is determined by its first reference in the DATA step. When creating a new character variable, SAS allocates as many bytes of storage space as there are characters in the first value that it encounters for that variable. The first reference to a new variable can also be made with a LENGTH statement or an assignment statement. The length of the variable's first value does not matter once the variable has been referenced in your program.

Which set of statements is equivalent to the code shown below?
```
if code='1' then Type='Fixed';
if code='2' then Type='Variable';
if code^='1' and code^='2' then Type='Unknown';
```
1. ```
if code='1' then Type='Fixed';
else if code='2' then Type='Variable';
else Type='Unknown';
```
2. ```
if code='1' then Type='Fixed';
if code='2' then Type='Variable';
else Type='Unknown';
```
3. ```
if code='1' then type='Fixed';
else code='2' and type='Variable';
else type='Unknown';
```
4. ```
if code='1' and type='Fixed';
then code='2' and type='Variable';
else type='Unknown';
```
Correct answer: a

You can write multiple ELSE statements to specify a series of mutually exclusive conditions. The ELSE statement must immediately follow the IF-THEN statement in your program. An ELSE statement executes only if the previous IF-THEN/ELSE statement is false.

What is the length of the variable Type, as created in the DATA step below?
```
data finance.newloan;
   set finance.records;
   TotLoan+payment;
   if code='1' then Type='Fixed';
   else Type='Variable';
   length type $ 10;
run;
```
1. 5
2. 8
3. 10
4. it depends on the first value of Type
Correct answer: a
The length of a new variable is determined by the first reference in the DATA step, not by data values. In this case, the length of Type is determined by the value Fixed. The LENGTH statement is in the wrong place; it must be read before any other reference to the variable in the DATA step. The LENGTH statement cannot change the length of an existing variable.
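One way to get a length of 10 for Type, sketched below, is to move the LENGTH statement ahead of the first reference to the variable. The sketch assumes that the Finance library is already assigned and that Finance.Records contains the variables Code and Payment but not Type, as the question implies.

```sas
data finance.newloan;
   /* LENGTH is now the first reference to Type, so it sets the length. */
   length Type $ 10;
   set finance.records;
   TotLoan+payment;
   if code='1' then Type='Fixed';
   else Type='Variable';
run;
```

With the LENGTH statement first, the value Variable is stored in full instead of being truncated to five characters.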

Which program contains an error?
1. ```
data clinic.stress(drop=timemin timesec);
   infile tests;
   input ID $ 1-4 Name $ 6-25 RestHR 27-29 MaxHR 31-33
         RecHR 35-37 TimeMin 39-40 TimeSec 42-43
         Tolerance $ 45;
   TotalTime=(timemin*60)+timesec;
   SumSec+totaltime;
run;
```
2. ```
proc print data=clinic.stress;
   label totaltime='Total Duration of Test';
   format timemin 5.2;
   drop sumsec;
run;
```
3. ```
proc print data=clinic.stress(keep=totaltime timemin);
   label totaltime='Total Duration of Test';
   format timemin 5.2;
run;
```
4. ```
data clinic.stress;
   infile tests;
   input ID $ 1-4 Name $ 6-25 RestHR 27-29 MaxHR 31-33
         RecHR 35-37 TimeMin 39-40 TimeSec 42-43
         Tolerance $ 45;
   TotalTime=(timemin*60)+timesec;
   keep id totaltime tolerance;
run;
```
Correct answer: b

To select variables, you can use a DROP or KEEP statement in any DATA step. You can also use the DROP= or KEEP= data set options following a data set name in any DATA or PROC step. However, you cannot use DROP or KEEP statements in PROC steps.
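The PROC PRINT step that contains the DROP statement could be repaired in at least two ways, sketched below. Both versions assume that Clinic.Stress contains the variables TotalTime, TimeMin, and SumSec; the LABEL option is added here only so that the label is displayed.

```sas
/* Data set option: SumSec is excluded before PROC PRINT reads the data. */
proc print data=clinic.stress(drop=sumsec) label;
   label totaltime='Total Duration of Test';
   format timemin 5.2;
run;

/* Alternative: name only the variables to print in a VAR statement. */
proc print data=clinic.stress label;
   var totaltime timemin;
   label totaltime='Total Duration of Test';
   format timemin 5.2;
run;
```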

Chapter 12: Reading SAS Data Sets Overview

If you submit the following program, which variables appear in the new data set?
```
data work.cardiac(drop=age group);
   set clinic.fitness(keep=age weight group);
   if group=2 and age>40;
run;
```
1. none
2. Weight
3. Age, Group
4. Age, Weight, Group
Correct answer: b
The variables Age, Weight, and Group are specified using the KEEP= option in the SET statement. After processing, Age and Group are dropped in the DATA statement.
Which of the following programs correctly reads the data set Orders and creates the data set FastOrdr?
1. ```
data catalog.fastordr(drop=ordrtime);
   set july.orders(keep=product units price);
   if ordrtime<4;
   Total=units*price;
run;
```
2. ```
data catalog.orders(drop=ordrtime);
   set july.fastordr(keep=product units price);
   if ordrtime<4;
   Total=units*price;
run;
```
3. ```
data catalog.fastordr(drop=ordrtime);
   set july.orders(keep=product units price
                   ordrtime);
   if ordrtime<4;
   Total=units*price;
run;
```
4. none of the above
Correct answer: c
You specify the data set to be created in the DATA statement. The DROP= data set option prevents variables from being written to the data set. Because you use the variable OrdrTime when processing your data, you cannot drop OrdrTime in the SET statement. If you use the KEEP= option in the SET statement, then you must list OrdrTime as one of the variables to be kept.
Which of the following statements is false about BY-group processing?
When you use the BY statement with the SET statement:
1. The data sets listed in the SET statement must be indexed or sorted by the values of the BY variable(s).
2. The DATA step automatically creates two variables, FIRST. and LAST., for each variable in the BY statement.
3. FIRST. and LAST. identify the first and last observation in each BY group, respectively.
4. FIRST. and LAST. are stored in the data set.
Correct answer: d
When you use the BY statement with the SET statement, the DATA step creates the temporary variables FIRST. and LAST. They are not stored in the data set.
There are 500 observations in the data set Usa. What is the result of submitting the following program?
```
data work.getobs5(drop=obsnum);
   obsnum=5;
   set company.usa(keep=manager payroll) point=obsnum;
   stop;
run;
```
1. an error
2. an empty data set
3. continuous loop
4. a data set that contains one observation
Correct answer: b
The DATA step writes observations to output at the end of the DATA step. However, in this program, the STOP statement stops processing before the end of the DATA step. An explicit OUTPUT statement is needed in order to produce observations.
There is no end-of-file condition when you use direct access to read data, so how can your program prevent a continuous loop?
1. Do not use a POINT= variable.
2. Check for an invalid value of the POINT= variable.
3. Do not use an END= variable.
4. Include an OUTPUT statement.
Correct answer: b
To avoid a continuous loop when using direct access, either include a STOP statement or use programming logic that checks for an invalid value of the POINT= variable. If SAS reads an invalid value of the POINT= variable, it sets the automatic variable _ERROR_ to 1. You can use this information to check for conditions that cause continuous processing.
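The following sketch combines both safeguards: it tests _ERROR_ after each direct-access read and ends with a STOP statement. The requested observation numbers (5, 10, and 9999) and the output data set name are assumptions for illustration; 9999 is chosen to be invalid for a data set of 500 observations.

```sas
data work.getobs;
   do obsnum=5, 10, 9999;
      set company.usa(keep=manager payroll) point=obsnum;
      if _error_ then do;
         /* Invalid POINT= value: report it and reset the flag. */
         put 'No observation number ' obsnum;
         _error_=0;
      end;
      else output;
   end;
   /* No end-of-file with direct access, so end the step explicitly. */
   stop;
run;
```

Because ObsNum is the POINT= variable, it is temporary and is not written to Work.Getobs.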
Assuming that the data set Company.USA has five or more observations, what is the result of submitting the following program?
```
data work.getobs5(drop=obsnum);
   obsnum=5;
   set company.usa(keep=manager payroll) point=obsnum;
   output;
   stop;
run;
```
1. an error
2. an empty data set
3. a continuous loop
4. a data set that contains one observation
Correct answer: d
By combining the POINT= option with the OUTPUT and STOP statements, your program can write a single observation to output.
Which of the following statements is true regarding direct access of data sets?
1. You cannot specify END= with POINT=.
2. You cannot specify OUTPUT with POINT=.
3. You cannot specify STOP with END=.
4. You cannot specify FIRST. with LAST.
Correct answer: a
The END= option and POINT= option are incompatible in the same SET statement. Use one or the other in your program.
What is the result of submitting the following program?
```
data work.addtoend;
   set clinic.stress2 end=last;
   if last;
run;
```
1. an error
2. an empty data set
3. a continuous loop
4. a data set that contains one observation
Correct answer: d
This program uses the END= option to name a temporary variable that contains an end-of-file marker. That variable – last – is set to 1 when the SET statement reads the last observation of the data set.
At the start of DATA step processing, during the compilation phase, variables are created in the program data vector (PDV), and observations are set to:
1. blank
2. missing
3. 0
4. there are no observations.
Correct answer: d
At the bottom of the DATA step, the compilation phase is complete, and the descriptor portion of the new SAS data set is created. There are no observations because the DATA step has not yet executed.
The DATA step executes:
1. continuously if you use the POINT= option and the STOP statement.
2. once for each variable in the output data set.
3. once for each observation in the input data set.
4. until it encounters an OUTPUT statement.
Correct answer: c
The DATA step executes once for each observation in the input data set. You use the POINT= option with the STOP statement to prevent continuous looping.

Chapter 13: Combining SAS Data Sets

Which program will combine Brothers.One and Brothers.Two to produce Brothers.Three?
1. ```
data brothers.three;
   set brothers.one;
   set brothers.two;
run;
```
2. ```
data brothers.three;
   set brothers.one brothers.two;
run;
```
3. ```
data brothers.three;
   set brothers.one brothers.two;
   by varx;
run;
```
4. ```
data brothers.three;
   merge brothers.one brothers.two;
   by varx;
run;
```
Correct answer: a
This is a case of one-to-one reading, which requires multiple SET statements. Notice that where same-named variables occur, the values that are read in from the second data set replace those that are read in from the first one. Also, the number of observations in the new data set is the number of observations in the smallest original data set.
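To see both effects of one-to-one reading, consider the small sketch below. The data sets Work.One and Work.Two and their values are invented for illustration and are not the Brothers data sets from the question.

```sas
data work.one;
   input VarX $ VarY $;
   datalines;
A1 B1
A2 B2
A3 B3
;
run;

data work.two;
   input VarX $ VarZ $;
   datalines;
X1 Z1
X2 Z2
;
run;

data work.three;
   set work.one;   /* reads VarX and VarY */
   set work.two;   /* VarX from Work.Two replaces VarX from Work.One */
run;
```

Work.Three contains two observations, because Work.Two has only two. VarX holds X1 and X2, the values from the second data set.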
Which program will combine Actors.Props1 and Actors.Props2 to produce Actors.Props3?
1. ```
data actors.props3;
   set actors.props1;
   set actors.props2;
run;
```
2. ```
data actors.props3;
   set actors.props1 actors.props2;
run;
```
3. ```
data actors.props3;
   set actors.props1 actors.props2;
   by actor;
run;
```
4. ```
data actors.props3;
   merge actors.props1 actors.props2;
   by actor;
run;
```
Correct answer: c
This is a case of interleaving, which requires a list of data set names in the SET statement and one or more BY variables in the BY statement. Notice that observations in each BY group are read sequentially, in the order in which the data sets and BY variables are listed. The new data set contains all the variables from all the input data sets, as well as the total number of records from all input data sets.
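A minimal sketch of interleaving is shown below. The Work data sets and their values are assumptions for illustration; both are already in order by Actor, which interleaving requires.

```sas
data work.props1;
   input Actor $ Prop $;
   datalines;
Curly Anvil
Larry Ladder
;
run;

data work.props2;
   input Actor $ Prop $;
   datalines;
Curly Ladder
Moe Pliers
;
run;

data work.props3;
   set work.props1 work.props2;
   by actor;
run;
```

Work.Props3 contains four observations in this order: Curly (Anvil), Curly (Ladder), Larry, Moe. Within the Curly BY group, the observation from Work.Props1 comes first because that data set is listed first.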
If you submit the following program, which new data set is created?
```
data work.jobsatis;
   set work.dataone work.datatwo;
run;
```
4. none of the above
Correct answer: a
Concatenating appends the observations from one data set to another data set. The new data set contains the total number of records from all input data sets, so b is incorrect. All the variables from all the input data sets appear in the new data set, so c is incorrect.
If you concatenate the data sets below in the order shown, what is the value of Sale in observation 2 of the new data set?
1. missing
2. $30,000
3. $40,000
4. you cannot concatenate these data sets
Correct answer: a
The concatenated data sets are read sequentially, in the order in which they are listed in the SET statement. The second observation in Sales.Reps does not contain a value for Sale, so a missing value appears for this variable. (Note that if you merge the data sets, the value of Sale for the second observation is $30,000.)
What happens if you merge the following data sets by the variable SSN?
1. The values of Age in the 1st data set overwrite the values of Age in the 2nd data set.
2. The values of Age in the 2nd data set overwrite the values of Age in the 1st data set.
3. The DATA step fails because the two data sets contain same-named variables that have different values.
4. The values of Age in the 2nd data set are set to missing.
Correct answer: b
If you have variables with the same name in more than one input data set, values of the same-named variable in the first data set in which it appears are overwritten by values of the same-named variable in subsequent data sets.
Suppose you merge data sets Health.Set1 and Health.Set2 below:
Which output does the following program create?
```
data work.merged;
   merge health.set1(in=in1) health.set2(in=in2);
   by id;
   if in1 and in2;
run;
proc print data=work.merged;
run;
```
4. none of the above
Correct answer: c
The DATA step uses the IN= data set option and the subsetting IF statement to exclude unmatched observations from the output data set. So a and b, which contain unmatched observations, are incorrect.
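The effect of the IN= variables can be sketched with two small data sets. Work.Set1, Work.Set2, and their values are assumptions for illustration, not the Health data sets from the question.

```sas
data work.set1;
   input ID Sex $;
   datalines;
1 F
2 M
3 F
;
run;

data work.set2;
   input ID Height;
   datalines;
2 61
3 65
4 70
;
run;

data work.matched;
   merge work.set1(in=in1) work.set2(in=in2);
   by id;
   /* Keep an observation only if both data sets contributed to it. */
   if in1 and in2;
run;
```

Only IDs 2 and 3 appear in Work.Matched. ID 1 is found only in Work.Set1 and ID 4 only in Work.Set2, so both are excluded.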
The data sets Ensemble.Spring and Ensemble.Summer both contain a variable named Blue. How do you prevent the values of the variable Blue from being overwritten when you merge the two data sets?
1. ```
data ensemble.merged;
   merge ensemble.spring(in=blue)
         ensemble.summer;
   by fabric;
run;
```
2. ```
data ensemble.merged;
   merge ensemble.spring(out=blue)
         ensemble.summer;
   by fabric;
run;
```
3. ```
data ensemble.merged;
   merge ensemble.spring(blue=navy)
         ensemble.summer;
   by fabric;
run;
```
4. ```
data ensemble.merged;
   merge ensemble.spring(rename=(blue=navy))
         ensemble.summer;
   by fabric;
run;
```
Correct answer: d
Match-merging overwrites same-named variables in the first data set with same-named variables in subsequent data sets. To prevent overwriting, rename variables by using the RENAME= data set option in the MERGE statement.
What happens if you submit the following program to merge Blood.Donors1 and Blood.Donors2, shown below?
```
data work.merged;
   merge blood.donors1 blood.donors2;
   by id;
run;
```
1. The Merged data set contains some missing values because not all observations have matching observations in the other data set.
2. The Merged data set contains eight observations.
3. The DATA step produces errors.
4. Values for Units in Blood.Donors2 overwrite values of Units in Blood.Donors1.
Correct answer: c
The two input data sets are not sorted by values of the BY variable, so the DATA step produces errors and stops processing.
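One way to avoid the errors, sketched below, is to sort each data set by the BY variable before merging. The OUT= data set names in the Work library are assumptions, used here so that the original Blood data sets are left unchanged.

```sas
proc sort data=blood.donors1 out=work.donors1;
   by id;
run;

proc sort data=blood.donors2 out=work.donors2;
   by id;
run;

data work.merged;
   merge work.donors1 work.donors2;
   by id;
run;
```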
If you merge Company.Staff1 and Company.Staff2 below by ID, how many observations does the new data set contain?
1. 4
2. 5
3. 6
4. 9
Correct answer: c
In this example, the new data set contains one observation for each unique value of ID. The merged data set is shown below.
If you merge data sets Sales.Reps, Sales.Close, and Sales.Bonus by ID, what is the value of Bonus in the third observation in the new data set?
1. $4,000
2. $3,000
3. missing
4. can't tell from the information given
Correct answer: a
In the new data set, the third observation is the second observation for ID number 2 (Kelly Windsor). The value for Bonus is retained from the previous observation because the BY variable value didn't change. The new data set is shown below.

Chapter 14: Transforming Data with SAS Functions

Which function calculates the average of the variables Var1, Var2, Var3, and Var4?
1. ```
mean(var1,var4)
```
2. ```
mean(var1-var4)
```
3. ```
mean(of var1,var4)
```
4. ```
mean(of var1-var4)
```
Correct answer: d
Use a variable list to specify a range of variables as the function argument. When specifying a variable list, be sure to precede the list with the word OF. If you omit the word OF, the function argument might not be interpreted as expected.
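The short DATA _NULL_ step below illustrates how the argument forms differ. The four values are arbitrary and were chosen only so that each form gives a different result.

```sas
data _null_;
   var1=1; var2=2; var3=3; var4=10;
   ListMean=mean(of var1-var4);   /* mean of all four variables */
   TwoMean=mean(var1,var4);       /* mean of Var1 and Var4 only */
   DiffMean=mean(var1-var4);      /* mean of one value: Var1 minus Var4 */
   put ListMean= TwoMean= DiffMean=;
run;
```

The log shows ListMean=4, TwoMean=5.5, and DiffMean=-9. Without the word OF, the hyphen is treated as a subtraction operator.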
Within the data set Hrd.Temp, PayRate is a character variable and Hours is a numeric variable. What happens when the following program is run?
```
data work.temp;
   set hrd.temp;
   Salary=payrate*hours;
run;
```
1. SAS converts the values of PayRate to numeric values. No message is written to the log.
2. SAS converts the values of PayRate to numeric values. A message is written to the log.
3. SAS converts the values of Hours to character values. No message is written to the log.
4. SAS converts the values of Hours to character values. A message is written to the log.
Correct answer: b
When this DATA step is executed, SAS automatically converts the character values of PayRate to numeric values so that the calculation can occur. Whenever data is automatically converted, a message is written to the SAS log stating that the conversion has occurred.
A typical value for the character variable Target is 123,456. Which statement correctly converts the values of Target to numeric values when creating the variable TargetNo?
1. ```
TargetNo=input(target,comma6.);
```
2. ```
TargetNo=input(target,comma7.);
```
3. ```
TargetNo=put(target,comma6.);
```
4. ```
TargetNo=put(target,comma7.);
```
Correct answer: b
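You explicitly convert character values to numeric values by using the INPUT function. Be sure to select an informat that can read the form of the values. The value 123,456 is seven characters wide, including the comma, so the COMMA7. informat is required.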
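The sketch below shows why the informat width matters. The variable Short is a name introduced here for comparison only.

```sas
data _null_;
   target='123,456';
   TargetNo=input(target,comma7.);   /* reads all seven characters */
   Short=input(target,comma6.);      /* reads only 123,45 */
   put TargetNo= Short=;
run;
```

The log shows TargetNo=123456 and Short=12345. The narrower informat silently drops the last digit.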
A typical value for the numeric variable SiteNum is 12.3. Which statement correctly converts the values of SiteNum to character values when creating the variable Location?
1. ```
Location=dept||'/'||input(sitenum,3.1);
```
2. ```
Location=dept||'/'||input(sitenum,4.1);
```
3. ```
Location=dept||'/'||put(sitenum,3.1);
```
4. ```
Location=dept||'/'||put(sitenum,4.1);
```
Correct answer: d
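You explicitly convert numeric values to character values by using the PUT function. Be sure to select a format that is wide enough to write the values. The value 12.3 occupies four columns, including the decimal point, so the 4.1 format is required.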
Suppose the YEARCUTOFF= system option is set to 1920. Which MDY function creates the date value for January 3, 2020?
1. ```
MDY(1,3,20)
```
2. ```
MDY(3,1,20)
```
3. ```
MDY(1,3,2020)
```
4. ```
MDY(3,1,2020)
```
Correct answer: c
Because the YEARCUTOFF= system option is set to 1920, SAS sees the two-digit year value 20 as 1920. Four-digit year values are always read correctly.
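You can check this behavior with a short step such as the sketch below. The variable names TwoDigit and FourDigit are introduced here for illustration. Note that the OPTIONS statement changes YEARCUTOFF= for the rest of the SAS session.

```sas
options yearcutoff=1920;

data _null_;
   TwoDigit=mdy(1,3,20);      /* two-digit year falls in the span 1920-2019 */
   FourDigit=mdy(1,3,2020);   /* four-digit year is unambiguous */
   put TwoDigit= date9. FourDigit= date9.;
run;
```

The log shows TwoDigit=03JAN1920 and FourDigit=03JAN2020.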
The variable Address2 contains values such as Piscataway, NJ. How do you assign the two-letter state abbreviations to a new variable named State?
1. ```
State=scan(address2,2);
```
2. ```
State=scan(address2,13,2);
```
3. ```
State=substr(address2,2);
```
4. ```
State=substr(address2,13,2);
```
Correct answer: a
The SCAN function is used to extract words from a character value when you know the order of the words, when their position varies, and when the words are marked by some delimiter. In this case, you don't need to specify delimiters, because the blank and the comma are default delimiters.
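A minimal sketch of this use of SCAN follows. The LENGTH statement is an addition made here so that State is stored in two bytes rather than in the default length that SCAN would otherwise give it.

```sas
data _null_;
   length State $ 2;
   address2='Piscataway, NJ';
   /* The comma and the blank are default delimiters, so NJ is word 2. */
   State=scan(address2,2);
   put State=;
run;
```

The log shows State=NJ.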
The variable IDCode contains values such as 123FA and 321MB. The fourth character identifies sex. How do you assign these character codes to a new variable named Sex?
1. ```
Sex=scan(idcode,4);
```
2. ```
Sex=scan(idcode,4,1);
```
3. ```
Sex=substr(idcode,4);
```
4. ```
Sex=substr(idcode,4,1);
```
Correct answer: d
The SUBSTR function is best used when you know the exact position of the substring to extract from the character value. You specify the position to start from and the number of characters to extract.

Due to growth within the 919 area code, the telephone exchange 555 is being reassigned to the 920 area code. The data set Clients.Piedmont includes the variable Phone, which contains telephone numbers in the form 919-555-1234. Which of the following programs will correctly change the values of Phone?
1. ```
data work.piedmont(drop=areacode exchange);
   set clients.piedmont;
   Areacode=substr(phone,1,3);
   Exchange=substr(phone,5,3);
   if areacode='919' and exchange='555'
      then scan(phone,1,3)='920';
run;
```
2. ```
data work.piedmont(drop=areacode exchange);
   set clients.piedmont;
   Areacode=substr(phone,1,3);
   Exchange=substr(phone,5,3);
   if areacode='919' and exchange='555'
      then phone=scan('920',1,3);
run;
```
3. ```
data work.piedmont(drop=areacode exchange);
   set clients.piedmont;
   Areacode=substr(phone,1,3);
   Exchange=substr(phone,5,3);
   if areacode='919' and exchange='555'
      then substr(phone,1,3)='920';
run;
```
4. ```
data work.piedmont(drop=areacode exchange);
   set clients.piedmont;
   Areacode=substr(phone,1,3);
   Exchange=substr(phone,5,3);
   if areacode='919' and exchange='555'
      then phone=substr('920',1,3);
run;
```
Correct answer: c

The SUBSTR function replaces variable values if it is placed on the left side of an assignment statement. When placed on the right side (as in Question 7), the function extracts a substring.
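The sketch below uses SUBSTR on both sides of an assignment for a single hard-coded value. It assumes that the goal is to replace the area code, which occupies positions 1 through 3 of Phone.

```sas
data _null_;
   phone='919-555-1234';
   Exchange=substr(phone,5,3);          /* right side: extracts 555 */
   if substr(phone,1,3)='919' and exchange='555'
      then substr(phone,1,3)='920';     /* left side: replaces the area code */
   put phone=;
run;
```

The log shows phone=920-555-1234. The rest of the value is left untouched.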

Suppose you need to create the variable FullName by concatenating the values of FirstName, which contains first names, and LastName, which contains last names. What's the best way to remove extra blanks between first names and last names?
1. ```
data work.maillist;
   set retail.maillist;
   length FullName $ 40;
   fullname=trim firstname||' '||lastname;
run;
```
2. ```
data work.maillist;
   set retail.maillist;
   length FullName $ 40;
   fullname=trim(firstname)||' '||lastname;
run;
```
3. ```
data work.maillist;
   set retail.maillist;
   length FullName $ 40;
   fullname=trim(firstname)||' '||trim(lastname);
run;
```
4. ```
data work.maillist;
   set retail.maillist;
   length FullName $ 40;
   fullname=trim(firstname||' '||lastname);
run;
```
Correct answer: b
The TRIM function removes trailing blanks from character values. In this case, extra blanks must be removed from the values of FirstName. Although answer c also works, the extra TRIM function for the variable LastName is unnecessary. Because of the LENGTH statement, all values of FullName are padded to 40 characters.
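To compare concatenation with and without TRIM, you could run a step like the sketch below. The names, the lengths of 15 for FirstName and LastName, and the variable Untrimmed are assumptions for illustration.

```sas
data _null_;
   length FirstName LastName $ 15 FullName $ 40;
   firstname='Maria';
   lastname='Chen';
   /* FirstName is padded to 15 bytes, so the padding is concatenated too. */
   Untrimmed=firstname||' '||lastname;
   /* TRIM removes the trailing blanks before concatenation. */
   fullname=trim(firstname)||' '||lastname;
   put Untrimmed= / fullname=;
run;
```

In the log, Untrimmed shows a wide gap between the two names, whereas FullName shows a single blank.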
Within the data set Furnitur.Bookcase, the variable Finish contains values such as ash/cherry/teak/matte-black. Which of the following creates a subset of the data in which the values of Finish contain the string walnut? Make the search for the string case-insensitive.
1. ```
data work.bookcase;
   set furnitur.bookcase;
   if index(finish,walnut) = 0;
run;
```
2. ```
data work.bookcase;
   set furnitur.bookcase;
   if index(finish,'walnut') > 0;
run;
```
3. ```
data work.bookcase;
   set furnitur.bookcase;
   if index(lowcase(finish),walnut) = 0;
run;
```
4. ```
data work.bookcase;
   set furnitur.bookcase;
   if index(lowcase(finish),'walnut') > 0;
run;
```
Correct answer: d
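Use the INDEX function in a subsetting IF statement, enclosing the character string in quotation marks. Only those observations in which the function locates the string and returns a value greater than 0 are written to the data set. The LOWCASE function converts the values of Finish to lowercase before the search, which makes the comparison case-insensitive.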
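The effect of LOWCASE on the search can be sketched with one mixed-case value. The value ash/Walnut and the two result variables are assumptions for illustration.

```sas
data _null_;
   finish='ash/Walnut';
   CaseSensitive=index(finish,'walnut');             /* no match */
   CaseInsensitive=index(lowcase(finish),'walnut');  /* match at position 5 */
   put CaseSensitive= CaseInsensitive=;
run;
```

The log shows CaseSensitive=0 and CaseInsensitive=5.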

Chapter 15: Generating Data with DO Loops

Which statement is false regarding the use of DO loops?
1. They can contain conditional clauses.
2. They can generate multiple observations.
3. They can be used to combine DATA and PROC steps.
4. They can be used to read data.
Correct answer: c
DO loops are DATA step statements and cannot be used in conjunction with PROC steps.
During each execution of the following DO loop, the value of Earned is calculated and is added to its previous value. How many times does this DO loop execute?
```
data finance.earnings;
   Amount=1000;
   Rate=.075/12;
   do month=1 to 12;
      Earned+(amount+earned)*rate;
   end;
run;
```
1. 0
2. 1
3. 12
4. 13
Correct answer: c
The number of iterations is determined by the DO statement's stop value, which in this case is 12.
On January 1 of each year, $5000 is invested in an account. Complete the DATA step below to determine the value of the account after 15 years if a constant interest rate of ten percent is expected.
```
data work.invest;
   ...
     Capital+5000;
     capital+(capital*.10);
   end;
run;
```
1. ```
do count=1 to 15;
```
2. ```
do count=1 to 15 by 10%;
```
3. ```
do count=1 to capital;
```
4. ```
do count=capital to (capital*.10);
```
Correct answer: a
Use a DO loop to perform repetitive calculations starting at 1 and looping 15 times.
In the data set Work.Invest, what would be the stored value for Year?
```
data work.invest;
   do year=1990 to 2004;
      Capital+5000;
      capital+(capital*.10);
   end;
run;
```
1. missing
2. 1990
3. 2004
4. 2005
Correct answer: d
At the end of the fifteenth iteration of the DO loop, the value for Year is incremented to 2005. Because this value exceeds the stop value, the DO loop ends. At the bottom of the DATA step, the current values are written to the data set.
Which of the following statements is false regarding the program shown below?
```
data work.invest;
   do year=1990 to 2004;
      Capital+5000;
      capital+(capital*.10);
      output;
   end;
run;
```
1. The OUTPUT statement writes current values to the data set immediately.
2. The stored value for Year is 2005.
3. The OUTPUT statement overrides the automatic output at the end of the DATA step.
4. The DO loop performs 15 iterations.
Correct answer: b
The OUTPUT statement overrides the automatic output at the end of the DATA step. On the last iteration of the DO loop, the value of Year, 2004, is written to the data set.
How many observations will the data set Work.Earn contain?
```
data work.earn;
   Value=2000;
   do year=1 to 20;
      Interest=value*.075;
      value+interest;
      output;
   end;
run;
```
1. 0
2. 1
3. 19
4. 20
Correct answer: d
The number of observations is based on the number of times the OUTPUT statement executes. The new data set has 20 observations, one for each iteration of the DO loop.
Which of the following would you use to compare the result of investing $4,000 a year for five years in three different banks that compound interest monthly?
Assume a fixed rate for the five-year period.
1. DO WHILE statement
2. nested DO loops
3. DO UNTIL statement
4. a DO group
Correct answer: b
Place the monthly calculation in a DO loop within a DO loop that iterates once for each year. The DO WHILE and DO UNTIL statements are not used here because the number of required iterations is fixed. A non-iterative DO group would not be useful.
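One possible shape for such a program is sketched below. The bank names and annual rates are invented, and the data set names Work.Banks and Work.Compare are assumptions; the point is only the monthly loop nested inside the yearly loop.

```sas
data work.banks;
   input Bank $ Rate;
   datalines;
A 0.0425
B 0.0450
C 0.0475
;
run;

data work.compare(drop=year month);
   set work.banks;
   Capital=0;                         /* start over for each bank */
   do year=1 to 5;
      Capital+4000;                   /* yearly deposit */
      do month=1 to 12;
         Capital+(Capital*Rate/12);   /* monthly compounding */
      end;
   end;
run;
```

Work.Compare contains one observation per bank, with the final value of Capital after five years.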
Which statement is false regarding DO UNTIL statements?
1. The condition is evaluated at the top of the loop, before the enclosed statements are executed.
2. The enclosed statements are always executed at least once.
3. SAS statements in the DO loop are executed until the specified condition is true.
4. The DO loop must have a closing END statement.
Correct answer: a
The DO UNTIL condition is evaluated at the bottom of the loop, so the enclosed statements are always executed at least once.
Select the DO WHILE statement that would generate the same result as the program below.
```
data work.invest;
   capital=100000;
   do until(Capital gt 500000);
      Year+1;
      capital+(capital*.10);
   end;
run;
```
1. ```
do while(Capital ge 500000);
```
2. ```
do while(Capital=500000);
```
3. ```
do while(Capital le 500000);
```
4. ```
do while(Capital>500000);
```
Correct answer: c
Because the DO WHILE loop is evaluated at the top of the loop, you specify the condition that must exist in order to execute the enclosed statements.
In the following program, complete the statement so that the program stops generating observations when Distance reaches 250 miles or when 10 gallons of fuel have been used.
```
data work.go250;
   set perm.cars;
   do gallons=1 to 10 ... ;
      Distance=gallons*mpg;
      output;
   end;
run;
```
1. ```
while(Distance<250)
```
2. ```
when(Distance>250)
```
3. ```
over(Distance le 250)
```
4. ```
until(Distance=250)
```
Correct answer: a
The WHILE expression causes the DO loop to stop executing when the value of Distance becomes equal to or greater than 250.

Chapter 16: Processing Variables with Arrays

Which statement is false regarding an ARRAY statement?
1. It is an executable statement.
2. It can be used to create variables.
3. It must contain either all numeric or all character elements.
4. It must be used to define an array before the array name can be referenced.
Correct answer: a
An ARRAY statement is not an executable statement; it merely defines an array.
What belongs within the braces of this ARRAY statement?
```
array contrib{?} qtr1-qtr4;
```
1. ```
quarter
```
2. ```
quarter*
```
3. ```
1-4
```
4. ```
4
```
Correct answer: d
The value in braces indicates the number of elements in the array. In this case, there are four elements.
For the program below, select an iterative DO statement to process all elements in the contrib array.
```
data work.contrib;
   array contrib{4} qtr1-qtr4;
       ...
       contrib{i}=contrib{i}*1.25;
   end;
run;
```
1. ```
do i=4;
```
2. ```
do i=1 to 4;
```
3. ```
do until i=4;
```
4. ```
do while i le 4;
```
Correct answer: b
In the DO statement, you specify the index variable that represents the values of the array elements. Then specify the start and stop positions of the array elements.
What is the value of the index variable that references Jul in the statements below?
```
array quarter{4} Jan Apr Jul Oct;
do i=1 to 4;
   yeargoal=quarter{i}*1.2;
end;
```
1. 1
2. 2
3. 3
4. 4
Correct answer: c
The index value represents the position of the array element. In this case, the third element is Jul.
Which DO statement would not process all the elements in the factors array shown below?
```
array factors{*} age height weight bloodpr;
```
1. ```
do i=1 to dim(factors);
```
2. ```
do i=1 to dim(*);
```
3. ```
do i=1,2,3,4;
```
4. ```
do i=1 to 4;
```
Correct answer: b
To process all the elements in an array, you can either specify the array dimension or use the DIM function with the array name as the argument.
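A minimal sketch of the DIM approach follows. The data set name, the single line of data, and the choice to replace missing values with 0 are assumptions for illustration.

```sas
data work.factors(drop=i);
   input age height weight bloodpr;
   array factors{*} age height weight bloodpr;
   /* DIM returns 4 here, so the loop adapts if elements are added later. */
   do i=1 to dim(factors);
      if factors{i}=. then factors{i}=0;
   end;
   datalines;
40 . 180 120
;
run;
```

In the output data set, Height is 0 and the other three values are unchanged.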
Which statement below is false regarding the use of arrays to create variables?
1. The variables are added to the program data vector during the compilation of the DATA step.
2. You do not need to specify the array elements in the ARRAY statement.
3. By default, all character variables are assigned a length of eight.
4. Only character variables can be created.
Correct answer: d
Either numeric or character variables can be created by an ARRAY statement.
For the first observation, what is the value of diff{i} at the end of the second iteration of the DO loop?
```
array wt{*} weight1-weight10;
array diff{9};
do i=1 to 9;
   diff{i}=wt{i+1}-wt{i};
end;
```
1. 15
2. 10
3. 8
4. -7
Correct answer: a
At the end of the second iteration, diff{i} resolves as follows:
```
diff{2}=wt{2+1}-wt{2};
diff{2}=215-200
```
Finish the ARRAY statement below to create temporary array elements that have initial values of 9000, 9300, 9600, and 9900.
```
array goal{4}  ... ;
```
1. ```
_temporary_ (9000 9300 9600 9900)
```
2. ```
temporary (9000 9300 9600 9900)
```
3. ```
_temporary_ 9000 9300 9600 9900
```
4. ```
(temporary) 9000 9300 9600 9900
```
Correct answer: a
To create temporary array elements, specify _TEMPORARY_ after the array name and dimension. Specify an initial value for each element, separated by either blanks or commas, and enclose the values in parentheses.
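The sketch below uses temporary array elements as lookup values. The variables Sale1 through Sale4, the Achieved array, and the data values are assumptions for illustration.

```sas
data work.report(drop=i);
   input sale1-sale4;
   array sale{4} sale1-sale4;
   /* Temporary elements: used in calculations but never written out. */
   array goal{4} _temporary_ (9000 9300 9600 9900);
   array achieved{4};
   do i=1 to 4;
      achieved{i}=100*sale{i}/goal{i};
   end;
   datalines;
9000 9300 4800 9900
;
run;
```

Work.Report contains Sale1 through Sale4 and Achieved1 through Achieved4 (100, 100, 50, and 100), but no variables for the goals.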
Based on the ARRAY statement below, select the array reference for the array element q50.
```
array ques{3,25} q1-q75;
```
1. ```
ques{q50}
```
2. ```
ques{1,50}
```
3. ```
ques{2,25}
```
4. ```
ques{3,0}
```
Correct answer: c
This two-dimensional array would consist of three rows of 25 elements. The first row would contain q1 through q25, the second row would start with q26 and end with q50, and the third row would start with q51 and end with q75.
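You can confirm the row-by-row layout with a short step like the sketch below. The variables Picked, Rows, and Cols are introduced here for illustration.

```sas
data _null_;
   array ques{3,25} q1-q75;
   q50=1;
   Picked=ques{2,25};   /* row 2, column 25 is the 50th element, q50 */
   Rows=dim1(ques);
   Cols=dim2(ques);
   put Picked= Rows= Cols=;
run;
```

The log shows Picked=1, Rows=3, and Cols=25.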

Select the ARRAY statement that defines the array in the following program.
```
data rainwear.coat;
   input category high1-high3 / low1-low3;
   ...
   do i=1 to 2;
      do j=1 to 3;
         compare{i,j}=round(compare{i,j}*1.12);
      end;
   end;
run;
```
1. ```
array compare{1,6} high1-high3 low1-low3;
```
2. ```
array compare{2,3} high1-high3 low1-low3;
```
3. ```
array compare{3,2} high1-high3 low1-low3;
```
4. ```
array compare{3,3} high1-high3 low1-low3;
```
Correct answer: b

The nested DO loops indicate that the array is named compare and is a two-dimensional array that has two rows and three columns.

Chapter 17: Reading Raw Data in Fixed Fields

Which SAS statement correctly uses column input to read the values in the raw data file below in this order: Address (fourth field), SquareFeet (second field), Style (first field), Bedrooms (third field)?
1. ```
input Address 15-29 SquareFeet 8-11 Style 1-6
      Bedrooms 13;
```
2. ```
input $ 15-29 Address 8-11 SquareFeet $ 1-6 Style
      13 Bedrooms;
```
3. ```
input Address $ 15-29 SquareFeet 8-11 Style $ 1-6
      Bedrooms 13;
```
4. ```
input Address 15-29 $ SquareFeet 8-11 Style 1-6
      $ Bedrooms 13;
```
Correct answer: c
Column input specifies the variable's name, followed by a dollar sign ($) if the values are character values, and the beginning and ending column locations of the raw data values.
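The sketch below applies column input to in-stream data. The two records and the data set name Work.Houses are assumptions, laid out so that Style is in columns 1-6, SquareFeet in 8-11, Bedrooms in 13, and Address in 15-29.

```sas
data work.houses;
   /* Fields can be read in any order because each has explicit columns. */
   input Address $ 15-29 SquareFeet 8-11 Style $ 1-6
         Bedrooms 13;
   datalines;
RANCH  1250 2 Sheppard Avenue
SPLIT  1190 1 Rand Street
;
run;
```

Address is read correctly even though its values contain embedded blanks.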
Which is not an advantage of column input?
1. It can be used to read character variables that contain embedded blanks.
2. No placeholder is required for missing data.
3. Standard as well as nonstandard data values can be read.
4. Fields do not have to be separated by blanks or other delimiters.
Correct answer: c
Column input is useful for reading standard values only.
Which is an example of standard numeric data?
1. -34.245
2. $24,234.25
3. 1/2
4. 50%
Correct answer: a
A standard numeric value can contain numbers, scientific notation, decimal points, and plus and minus signs. Nonstandard numeric data includes values that contain fractions or special characters such as commas, dollar signs, and percent signs.
Formatted input can be used to read
1. standard free-format data
2. standard data in fixed fields
3. nonstandard data in fixed fields
4. both standard and nonstandard data in fixed fields
Correct answer: d
Formatted input can be used to read both standard and nonstandard data in fixed fields.
Which informat should you use to read the values in columns 1-5?
1. w.
2. $w.
3. w.d
4. COMMAw.d
Correct answer: b
The $w. informat enables you to read character data. The w represents the field width of the data value or the total number of columns that contain the raw data field.
The COMMAw.d informat can be used to read which of the following values?
1. 12,805
2. $177.95
3. 18%
4. all of the above
Correct answer: d
The COMMAw.d informat strips out special characters, such as commas, dollar signs, and percent signs, from numeric data and stores only numeric values in a SAS data set.
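As a minimal sketch of this behavior, the step below reads the three values from the question with the COMMA7. informat. The data set name work.commademo and the variable name Value are assumptions made for the illustration.

```sas
data work.commademo;
   /* Value is a hypothetical variable name; the three data lines are
      the values listed in the question */
   input @1 Value comma7.;
   datalines;
12,805
$177.95
18%
;
run;

proc print data=work.commademo;
run;
```

Each value should be stored as a plain number, with the comma, dollar sign, and percent sign removed.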
Which INPUT statement correctly reads the values for ModelNumber (first field) after the values for Item (second field)? Both Item and ModelNumber are character variables.
1. ```
input +7 Item $9. @1 ModelNumber $5.;
```
2. ```
input +6 Item $9. @1 ModelNumber $5.;
```
3. ```
input @7 Item $9. +1 ModelNumber $5.;
```
4. ```
input @7 Item $9 @1 ModelNumber 5.;
```
Correct answer: b
The +6 pointer control moves the input pointer to the beginning column of Item, and the values are read. Then the @1 pointer control returns to column 1, where the values for ModelNumber are located.
Which INPUT statement correctly reads the numeric values for Cost (third field)?
1. ```
input @17 Cost 7.2;
```
2. ```
input @17 Cost 9.2.;
```
3. ```
input @17 Cost comma7.;
```
4. ```
input @17 Cost comma9.;
```
Correct answer: d
The values for Cost contain dollar signs and commas, so you must use the COMMAw.d informat. Counting the numbers, dollar sign, comma, and decimal point, the field width is 9 columns. Because the data value contains decimal places, a d value is not needed.
Which SAS statement correctly uses formatted input to read the values in this order: Item (first field), UnitCost (second field), Quantity (third field)?
1. ```
input @1 Item $9. +1 UnitCost comma6.
      @18 Quantity 3.;
```
2. ```
input Item $9. @11 UnitCost comma6.
      @18 Quantity 3.;
```
3. ```
input Item $9. +1 UnitCost comma6.
      @18 Quantity 3.;
```
4. all of the above
Correct answer: d
The default location of the column pointer control is column 1, so a column pointer control is optional for reading the first field. You can use the @n or +n pointer controls to specify the beginning column of the other fields. You can use the $w. informat to read the values for Item, the COMMAw.d informat for UnitCost, and the w.d informat for Quantity.
Which raw data file requires the PAD option in the INFILE statement in order to correctly read the data using either column input or formatted input?
Correct answer: a
Use the PAD option in the INFILE statement to read variable-length records that contain fixed-field data. The PAD option pads each record with blanks so that all data lines have the same length.

Chapter 18: Reading Free-Format Data

The raw data file referenced by the fileref Students contains data that is
1. arranged in fixed fields
2. free-format
3. mixed-format
4. arranged in columns
Correct answer: b
The raw data file contains data that is free-format, meaning that the data is not arranged in columns or fixed fields.
Which input style should be used to read the values in the raw data file that is referenced by the fileref Students?
1. column
2. formatted
3. list
4. mixed
Correct answer: c
List input should be used to read data that is free-format because you do not need to specify the column locations of the data.

Which SAS program was used to create the raw data file Teamdat from the SAS data set Work.Scores?

1. ```
data _null_;
   set work.scores;
   file 'c:\data\teamdat' dlm=',';
   put name highscore team;
run;
```
2. ```
data _null_;
   set work.scores;
   file 'c:\data\teamdat' dlm=' ';
   put name highscore team;
run;
```
3. ```
data _null_;
   set work.scores;
   file 'c:\data\teamdat' dsd;
   put name highscore team;
run;
```
4. ```
data _null_;
   set work.scores;
   file 'c:\data\teamdat';
   put name highscore team;
run;
```

Correct answer: c

You can use the DSD option in the FILE statement to specify that data values containing commas should be enclosed in quotation marks. The DSD option uses a comma as the delimiter by default.
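The following sketch shows the DSD option at work. It assumes a small, made-up version of Work.Scores and writes to a temporary fileref named out instead of a fixed path; both are hypothetical choices for the illustration.

```sas
/* Hypothetical input data; the first name contains a comma */
data work.scores;
   length name $ 12 team $ 8;
   name='Lee, Ann'; highscore=212; team='Red';  output;
   name='Ortiz';    highscore=198; team='Blue'; output;
run;

/* out is a temporary fileref used only for this sketch */
filename out temp;

data _null_;
   set work.scores;
   file out dsd;               /* comma-delimited, quoting where needed */
   put name highscore team;
run;
```

In the resulting file, the first record should have the name enclosed in quotation marks because it contains a comma, while the second record needs no quotation marks.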

Which SAS statement reads the raw data values in order and assigns them to the variables shown below?
Variables: FirstName (character), LastName (character), Age (numeric), School (character), Class (numeric)
1. ```
input FirstName $ LastName $ Age School $ Class;
```
2. ```
input FirstName LastName Age School Class;
```
3. ```
input FirstName $ 1-4 LastName $ 6-12 Age 14-15
      School $ 17-19 Class 21;
```
4. ```
input FirstName 1-4 LastName 6-12 Age 14-15
      School 17-19 Class 21;
```
Correct answer: a
Because the data is free-format, list input is used to read the values. With list input, you simply name each variable and identify its type.
Which SAS statement should be used to read the raw data file that is referenced by the fileref Salesrep?
1. ```
infile salesrep;
```
2. ```
infile salesrep ':';
```
3. ```
infile salesrep dlm;
```
4. ```
infile salesrep dlm=':';
```
Correct answer: d
The INFILE statement identifies the location of the external data file. The DLM= option specifies the colon (:) as the delimiter that separates each field.
Which of the following raw data files can be read by using the MISSOVER option in the INFILE statement? Spaces for missing values are highlighted with colored blocks.
Correct answer: a
You can use the MISSOVER option in the INFILE statement to read the missing values at the end of a record. The MISSOVER option prevents SAS from moving to the next record if values are missing in the current record.

Which SAS program correctly reads the data in the raw data file that is referenced by the fileref Volunteer?

1. ```
data perm.contest;
   infile volunteer;
   input FirstName $ LastName $ Age
         School $ Class;
run;
```
2. ```
data perm.contest;
   infile volunteer;
   length LastName $ 11;
   input FirstName $ lastname $ Age
         School $ Class;
run;
```
3. ```
data perm.contest;
   infile volunteer;
   input FirstName $ lastname $ Age
         School $ Class;
   length LastName $ 11;
run;
```
4. ```
data perm.contest;
   infile volunteer;
   input FirstName $ LastName $ 11. Age
         School $ Class;
run;
```

Correct answer: b

The LENGTH statement extends the length of the character variable LastName so that it is large enough to accommodate the data. Variable attributes such as length are defined the first time a variable is named in a DATA step. The LENGTH statement should precede the INPUT statement so that the correct length is defined.
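A minimal sketch of the same pattern, using in-stream data in place of the Volunteer file. The data set name work.contest and the data lines are invented for the illustration.

```sas
data work.contest;
   /* LENGTH comes first, so LastName is created with 11 bytes */
   length LastName $ 11;
   input FirstName $ LastName $ Age School $ Class;
   datalines;
Ana Fitzgerald 14 Cary 9
Li Montgomery 15 Apex 10
;
run;

proc print data=work.contest;
run;
```

Without the LENGTH statement, list input would create LastName with the default length of 8 and the longer names would be truncated. Note that naming LastName first also makes it the first variable in the data set.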

Which type of input should be used to read the values in the raw data file that is referenced by the fileref University?
1. column
2. formatted
3. list
4. modified list
Correct answer: d
Notice that the values for School contain embedded blanks, and the values for Enrolled are nonstandard numeric values. Modified list input can be used to read the values that contain embedded blanks and nonstandard values.
Which SAS statement correctly reads the values for Flavor and Quantity? Make sure the length of each variable can accommodate the values shown.
1. ```
input Flavor & $9. Quantity : comma.;
```
2. ```
input Flavor & $14. Quantity : comma.;
```
3. ```
input Flavor : $14. Quantity & comma.;
```
4. ```
input Flavor $14. Quantity : comma.;
```
Correct answer: b
The INPUT statement uses list input with format modifiers and informats to read the values for each variable. The ampersand (&) modifier enables you to read character values that contain single embedded blanks. The colon (:) modifier enables you to read nonstandard data values and character values that are longer than eight characters, but which contain no embedded blanks.
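The two modifiers can be seen side by side in a short sketch. The data lines below are hypothetical; note the two blanks that end each Flavor value, which the ampersand modifier requires.

```sas
data work.flavors;
   /* & allows single embedded blanks; two blanks end the value */
   /* : reads up to the next delimiter using the named informat */
   input Flavor & $14. Quantity : comma.;
   datalines;
Rocky Road  1,250
Mint Chip  980
;
run;

proc print data=work.flavors;
run;
```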
Which SAS statement correctly reads the raw data values in order and assigns them to these corresponding variables: Year (numeric), School (character), Enrolled (numeric)?
1. ```
input Year School & $27.
      Enrolled : comma.;
```
2. ```
input Year 1-4 School & $27.
      Enrolled : comma.;
```
3. ```
input @1 Year 4. +1 School & $27.
      Enrolled : comma.;
```
4. all of the above
Correct answer: d
The values for Year can be read with column, formatted, or list input. However, the values for School and Enrolled are free-format data that contain embedded blanks or nonstandard values. Therefore, these last two variables must be read with modified list input.

Chapter 19: Reading Date and Time Values

SAS date values are the number of days since which date?
1. January 1, 1900
2. January 1, 1950
3. January 1, 1960
4. January 1, 1970
Correct answer: c
A SAS date value is the number of days from January 1, 1960, to the given date.
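A quick way to see this is to write a few date constants to the log as unformatted numbers. This is only a sketch; the variable names are arbitrary.

```sas
data _null_;
   epoch='01jan1960'd;      /* the reference date */
   nextday='02jan1960'd;    /* one day later */
   before='31dec1959'd;     /* one day earlier */
   put epoch= nextday= before=;
run;
```

The log shows epoch=0 nextday=1 before=-1, so dates before the reference date are stored as negative numbers.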
A great advantage of storing dates and times as SAS numeric date and time values is that
1. they can easily be edited.
2. they can easily be read and understood.
3. they can be used in text strings like other character values.
4. they can be used in calculations like other numeric values.
Correct answer: d
In addition to tracking time intervals, SAS date and time values can be used in calculations like other numeric values. This lets you calculate values that involve dates much more easily than in other programming languages.
SAS does not automatically make adjustments for daylight saving time, but it does make adjustments for:
1. leap seconds
2. leap years
3. Julian dates
4. time zones
Correct answer: b
SAS automatically makes adjustments for leap years.
An input data file has date expressions in the form 10222001. Which SAS informat should you use to read these dates?
1. DATE6.
2. DATE8.
3. MMDDYY6.
4. MMDDYY8.
Correct answer: d
The SAS informat MMDDYYw. reads dates such as 10222001, 10/22/01, or 10-22-01. In this case, the field width is eight.
The minimum width of the TIMEw. informat is:
1. 4
2. 5
3. 6
4. 7
Correct answer: b
The minimum acceptable field width for the TIMEw. informat is five. If you specify a w value less than five, you will receive an error message in the SAS log.
Shown below are date and time expressions and corresponding SAS datetime informats. Which date and time expression cannot be read by the informat that is shown beside it?
1. 30May2000:10:03:17.2 DATETIME20.
2. 30May00 10:03:17.2 DATETIME18.
3. 30May2000/10:03 DATETIME15.
4. 30May2000/1003 DATETIME14.
Correct answer: d
In the time value of a date and time expression, you must use delimiters to separate the values for hour, minutes, and seconds.
What is the default value of the YEARCUTOFF= system option?
1. 1920
2. 1910
3. 1900
4. 1930
Correct answer: a
The default value of YEARCUTOFF= is 1920. This enables you to read two-digit years from 00-19 as the years 2000 through 2019.
Suppose your input data file contains the date expression 13APR2009. The YEARCUTOFF= system option is set to 1910. SAS will read the date as:
1. 13APR1909
2. 13APR1920
3. 13APR2009
4. 13APR2020
Correct answer: c
The value of the YEARCUTOFF= system option does not affect four-digit year values. Four-digit values are always read correctly.
Suppose the YEARCUTOFF= system option is set to 1920. An input file contains the date expression 12/08/1925, which is being read with the MMDDYY8. informat. Which date will appear in your data?
1. 08DEC1920
2. 08DEC1925
3. 08DEC2019
4. 08DEC2025
Correct answer: c
The w value of the informat MMDDYY8. is too small to read the entire value, so the last two digits of the year are truncated. The last two digits thus become 19 instead of 25. Because the YEARCUTOFF= system option is set to 1920, SAS interprets this year as 2019. To avoid such errors, be sure to specify an informat that is wide enough for your date expressions.
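The truncation can be reproduced with a short sketch that reads the same ten-character value twice, once with each width. The variable names Short and Full are assumptions made for the illustration.

```sas
options yearcutoff=1920;

data _null_;
   /* Read the same field with a width of 8 and then a width of 10 */
   input @1 Short mmddyy8. @1 Full mmddyy10.;
   put Short= date9. Full= date9.;
   datalines;
12/08/1925
;
run;
```

Under the behavior described above, Short should come out as 08DEC2019 and Full as 08DEC1925.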
Suppose your program creates two variables from an input file. Both variables are stored as SAS date values: FirstDay records the start of a billing cycle, and LastDay records the end of that cycle. The code for calculating the total number of days in the cycle would be:
1. ```
TotDays=lastday-firstday;
```
2. ```
TotDays=lastday-firstday+1;
```
3. ```
TotDays=lastday/firstday;
```
4. You cannot use date values in calculations.
Correct answer: b
To find the number of days spanned by two dates, subtract the first day from the last day and add one. Because SAS date values are numeric values, they can easily be used in calculations.
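As a worked example, assume a hypothetical billing cycle that runs from 1 March 2009 through 31 March 2009.

```sas
data _null_;
   /* Hypothetical billing cycle: 1 March through 31 March 2009 */
   FirstDay='01mar2009'd;
   LastDay='31mar2009'd;
   Difference=lastday-firstday;     /* days between the two dates */
   TotDays=lastday-firstday+1;      /* days in the cycle, both ends counted */
   put Difference= TotDays=;
run;
```

The log shows Difference=30 TotDays=31. Plain subtraction leaves out one of the two end days, which is why 1 is added.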

Chapter 20: Creating a Single Observation from Multiple Records

You can position the input pointer on a specific record by using
1. column pointer controls.
2. column specifications.
3. line pointer controls.
4. line hold specifiers.
Correct answer: c
Information for one observation can be spread out over several records. You can write one INPUT statement that contains line pointer controls to specify the record(s) from which values are read.
Which pointer control is used to read multiple records sequentially?
1. @n
2. +n
3. /
4. all of the above
Correct answer: c
The forward slash (/) line pointer control is used to read multiple records sequentially. Each time a / pointer is encountered, the input pointer advances to the next line. @n and +n are column pointer controls.
Which pointer control can be used to read records non-sequentially?
1. @n
2. #n
3. +n
4. /
Correct answer: b
The #n line pointer control is used to read records non-sequentially. The #n specifies the absolute number of the line to which you want to move the pointer.
Which SAS statement correctly reads the values for Fname, Lname, Address, City, State and Zip in order?
1. ```
input Fname $ Lname $ /
      Address $20. /
      City $ State $ Zip $;
```
2. ```
input Fname $ Lname $ /;
      Address $20. /;
      City $ State $ Zip $;
```
3. ```
input / Fname $ Lname $
      / Address $20.
      City $ State $ Zip $;
```
4. ```
input / Fname $ Lname $;
      / Address $20.;
      City $ State $ Zip $;
```
Correct answer: a
The INPUT statement uses the / line pointer control to move the input pointer forward from the first record to the second record, and from the second record to the third record. The / line pointer control only moves the input pointer forward and must be specified after the instructions for reading the values in the current record. You should place a semicolon only at the end of a complete INPUT statement.
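A minimal sketch of the pattern with in-stream data follows. The data set name and the three data lines are invented for the illustration.

```sas
data work.addresses;
   /* Each / moves the pointer to the next record of the same observation */
   input Fname $ Lname $ /
         Address $20. /
         City $ State $ Zip $;
   datalines;
Ann Lee
12 Oak Street
Cary NC 27513
;
run;

proc print data=work.addresses;
run;
```

The three data lines should produce a single observation.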
Which INPUT statement correctly reads the values for ID in the fourth record, then returns to the first record to read the values for Fname and Lname?
1. ```
input #4 ID $5.
      #1 Fname $ Lname $;
```
2. ```
input #4 ID $ 1-5
      #1 Fname $ Lname $;
```
3. ```
input #4 ID $
      #1 Fname $ Lname $;
```
4. all of the above
Correct answer: d
The first #n line pointer control enables you to read the values for ID from the fourth record. The second #n line pointer control moves back to the first record and reads the values for Fname and Lname. You can use formatted input, column input, or list input to read the values for ID.
How many records will be read for each execution of the DATA step?
```
data spring.sportswr;
   infile newitems;
   input #1 Item $ Color $
         #3 @8 Price comma6.
         #2 Fabric $
         #3 SKU $ 1-6;
run;
```
1. one
2. two
3. three
4. four
Correct answer: c
The first time the DATA step executes, the first three records are read, and an observation is written to the data set. During the second iteration, the next three records are read, and the second observation is written to the data set. During the third iteration, the last three records are read, and the final observation is written to the data set.
Which INPUT statement correctly reads the values for City, State, and Zip?
1. ```
input #3 City $ State $ Zip $;
```
2. ```
input #3 City & $11. State $ Zip $;
```
3. ```
input #3 City $11. +2 State $2. + 2 Zip $5.;
```
4. all of the above
Correct answer: b
A combination of modified and simple list input can be used to read the values for City, State, and Zip. You need to use modified list input to read the values for City, because one of the values is longer than eight characters and contains an embedded blank. You cannot use formatted input, because the values do not begin and end in the same column in each record.

Which program does not read the values in the first record as a variable named Item and the values in the second record as two variables named Inventory and Type?

1. ```
data perm.supplies;
   infile instock pad;
   input Item & $16. /
         Inventory 2. Type $8.;
run;
```
2. ```
data perm.supplies;
   infile instock pad;
   input Item & $16.
         / Inventory 2. Type $8.;
run;
```
3. ```
data perm.supplies;
   infile instock pad;
   input #1 Item & $16.
         Inventory 2. Type $8.;
run;
```
4. ```
data perm.supplies;
   infile instock pad;
   input Item & $16.
         #2 Inventory 2. Type $8.;
run;
```

Correct answer: c

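The values for Item in the first record are read, then the following / or #n line pointer control advances the input pointer to the second record to read the values for Inventory and Type. The third program specifies #1 and no other line pointer control, so it attempts to read all three variables from the first record.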

Which INPUT statement reads the values for Lname, Fname, Department and Salary (in that order)?
1. ```
input #1 Lname $ Fname $ /
      Department $12. Salary comma10.;
```
2. ```
input #1 Lname $ Fname $ /
      Department : $12. Salary : comma.;
```
3. ```
input #1 Lname $ Fname $
      #2 Department : $12. Salary : comma.;
```
4. both b and c
Correct answer: d
You can use either the / or #n line pointer control to advance the input pointer to the second line, in order to read the values for Department and Salary. The colon (:) modifier is used to read the character values that are longer than eight characters (Department) and the nonstandard data values (Salary).
Which raw data file poses potential problems when you are reading multiple records for each observation?
Correct answer: c
The third raw data file does not contain the same number of records for each observation, so the output from this data set will show invalid data for the ID and salary information in the fourth line.

Chapter 21: Creating Multiple Observations from a Single Record

Which is true for the double trailing at sign (@@)?
1. It enables the next INPUT statement to read from the current record across multiple iterations of the DATA step.
2. It must be the last item specified in the INPUT statement.
3. It is released when the input pointer moves past the end of the record.
4. all of the above
Correct answer: d
The double trailing at sign (@@) enables the next INPUT statement to read from the current record across multiple iterations of the DATA step. It must be the last item specified in the INPUT statement. A record that is being held by the double trailing at sign (@@) is not released until the input pointer moves past the end of the record, or until an INPUT statement that has no line-hold specifier executes.
A record that is being held by a single trailing at sign (@) is automatically released when
1. the input pointer moves past the end of the record.
2. the next iteration of the DATA step begins.
3. another INPUT statement that has an @ executes.
4. another value is read from the observation.
Correct answer: b
Unlike the double trailing at sign (@@), the single trailing at sign (@) is automatically released when control returns to the top of the DATA step for the next iteration. The trailing @ does not toggle on and off. If another INPUT statement that has a trailing @ executes, the holding effect is still on.
Which SAS program correctly creates a separate observation for each block of data?
1. ```
data perm.produce;
   infile fruit;
   input Item $ Variety : $10.;
run;
```
2. ```
data perm.produce;
   infile fruit;
   input Item $ Variety : $10. @;
run;
```
3. ```
data perm.produce;
   infile fruit;
   input Item $ Variety : $10. @@;
run;
```
4. ```
data perm.produce;
   infile fruit @@;
   input Item $ Variety : $10.;
run;
```
Correct answer: c
Each record in this file contains three repeating blocks of data values for Item and Variety. The INPUT statement reads a block of values for Item and Variety, and then holds the current record by using the double-trailing at sign (@@). The values in the program data vector are written to the data set as the first observation. In the next iteration, the INPUT statement reads the next block of values for Item and Variety from the same record.
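The same technique can be sketched with in-stream data. The data set name and the values below are hypothetical; each data line holds three Item and Variety pairs.

```sas
data work.produce;
   /* @@ holds the record across iterations until it is exhausted */
   input Item $ Variety : $10. @@;
   datalines;
apple gala pear bosc plum damson
grape concord kiwi hayward fig mission
;
run;

proc print data=work.produce;
run;
```

The two data lines yield six observations, one for each pair.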
Which SAS program reads the values for ID and holds the record for each value of Quantity, so that three observations are created for each record?
1. ```
data work.sales;
   infile unitsold;
   input ID $;
   do week=1 to 3;
      input Quantity : comma.;
      output;
   end;
run;
```
2. ```
data work.sales;
   infile unitsold;
   input ID $ @@;
   do week=1 to 3;
      input Quantity : comma.;
      output;
   end;
run;
```
3. ```
data work.sales;
   infile unitsold;
   input ID $ @;
   do week=1 to 3;
      input Quantity : comma.;
      output;
   end;
run;
```
4. ```
data work.sales;
   infile unitsold;
   input ID $ @;
   do week=1 to 3;
      input Quantity : comma. @;
      output;
   end;
run;
```
Correct answer: d
This raw data file contains an ID field followed by repeating fields. The first INPUT statement reads the values for ID and uses the @ line-hold specifier to hold the current record for the next INPUT statement in the DATA step. The second INPUT statement reads the values for Quantity. When all of the repeating fields have been read, control returns to the top of the DATA step, and the record is released.
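A minimal sketch of this program with in-stream data. The ID and Quantity values are invented for the illustration.

```sas
data work.sales;
   input ID $ @;                      /* read ID and hold the record */
   do week=1 to 3;
      input Quantity : comma. @;      /* read the next repeating field */
      output;                         /* one observation per Quantity */
   end;
   datalines;
A101 1,200 950 1,480
B202 300 275 410
;
run;

proc print data=work.sales;
run;
```

Each data line produces three observations that share the same ID, so the two lines yield six observations in total.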
Which SAS statement repetitively executes several statements when the value of an index variable named count ranges from 1 to 50, incremented by 5?
1. ```
do count=1 to 50 by 5;
```
2. ```
do while count=1 to 50 by 5;
```
3. ```
do count=1 to 50 + 5;
```
4. ```
do while (count=1 to 50 + 5);
```
Correct answer: a
The iterative DO statement begins the execution of a loop based on the value of an index variable. Here, the loop executes when the value of count ranges from 1 to 50, incremented by 5.
Which option below, when used in a DATA step, writes an observation to the data set after each value for Activity has been read?
1. ```
do choice=1 to 3;
   input Activity : $10. @;
   output;
end;
run;
```
2. ```
do choice=1 to 3;
   input Activity : $10. @;
end;
output;
run;
```
3. ```
do choice=1 to 3;
   input Activity : $10. @;
end;
run;
```
4. both a and b
Correct answer: a
The OUTPUT statement must be included in the loop so that each time a value for Activity is read, an observation is immediately written to the data set.
Which SAS statement repetitively executes several statements while the value of Cholesterol is greater than 200?
1. ```
do cholesterol > 200;
```
2. ```
do cholesterol gt 200;
```
3. ```
do while (cholesterol > 200);
```
4. ```
do while cholesterol > 200;
```
Correct answer: c
The DO WHILE statement checks for the condition that Cholesterol is greater than 200. The expression must be enclosed in parentheses. The expression is evaluated at the top of the loop, before any statements are executed. If the condition is true, the DO WHILE loop executes. If the expression is false the first time it is evaluated, then the loop never executes.
Which choice below is an example of a sum statement?
1. ```
totalpay=1;
```
2. ```
totalpay+1;
```
3. ```
totalpay*1;
```
4. ```
totalpay by 1;
```
Correct answer: b
The sum statement adds the result of an expression to a counter variable. So the + sign is an essential part of the sum statement. Here, the value of TotalPay is incremented by 1.

Which program creates the SAS data set Perm.Topstores from the raw data file shown below?

1. ```
data perm.topstores;
   infile sales98 missover;
   input Store Sales : comma. @;
   do while (sales ne .);
      month + 1;
      output;
      input sales : comma. @;
   end;
run;
```
2. ```
data perm.topstores;
   infile sales98 missover;
   input Store Sales : comma. @;
   do while (sales ne .);
      Month=0;
      month + 1;
      output;
      input sales : comma. @;
   end;
run;
```
3. ```
data perm.topstores;
   infile sales98 missover;
   input Store Sales : comma.
         Month @;
   do while (sales ne .);
      month + 1;
      input sales : comma. @;
   end;
   output;
run;
```
4. ```
data perm.topstores;
   infile sales98 missover;
   input Store Sales : comma. @;
   Month=0;
   do while (sales ne .);
      month + 1;
      output;
      input sales : comma. @;
   end;
run;
```

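Correct answer: d

The first INPUT statement reads the value for Store and the first value for Sales, and holds the record with the trailing @. Assigning Month=0 before the DO WHILE loop resets the counter for each record. Inside the loop, the sum statement increments Month once for each Sales value before the OUTPUT statement writes an observation. Placing Month=0 inside the loop, as in the second program, resets the counter on every pass, so Month would always be 1.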

How many observations are produced by the DATA step that reads this external file?
```
data perm.choices;
   infile icecream missover;
   input Day $ Flavor : $10. @;
   do while (flavor ne ' ');
      output;
      input flavor : $10. @;
   end;
run;
```
1. 3
2. 5
3. 12
4. 15
Correct answer: c
This DATA step produces one observation for each repeating field. The MISSOVER option in the INFILE statement prevents SAS from reading the next record when missing values occur at the end of a record. Every observation contains one value for Flavor, paired with the corresponding value for Day. Because there are 12 values for Flavor, there are 12 observations in the data set.
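The counting rule can be tried on a smaller, made-up file. The sketch below uses in-stream data with a varying number of Flavor values per line; the data set name and values are hypothetical.

```sas
data work.choices;
   infile datalines missover;         /* do not read past the end of a line */
   input Day $ Flavor : $10. @;
   do while (flavor ne ' ');
      output;                         /* one observation per Flavor value */
      input flavor : $10. @;
   end;
   datalines;
Mon vanilla mint
Tue chocolate
Wed peach coffee pecan
;
run;

proc print data=work.choices;
run;
```

These three data lines contain six Flavor values, so the step creates six observations.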

Chapter 22: Reading Hierarchical Files

When you write a DATA step to create one observation per detail record you need to
1. distinguish between header and detail records.
2. keep the header record as a part of each observation until the next header record is encountered.
3. hold the current value of each record type so that the other values in the record can be read.
4. all of the above
Correct answer: d
In order to create one observation per detail record, it is necessary to distinguish between header and detail records. Use a RETAIN statement to keep the header record as part of each observation until the next header record is encountered. You also need to use the @ line-hold specifier to hold the current value of each record type so that the other values in the record can be read.
Which SAS statement reads the value for code (in the first field), and then holds the value until an INPUT statement reads the remaining value in each observation in the same iteration of the DATA step?
1. ```
input code $2. @;
```
2. ```
input code $2. @@;
```
3. ```
retain code;
```
4. none of the above
Correct answer: a
An INPUT statement is used to read the value for code. The single @ sign at the end of the INPUT statement holds the current record for a later INPUT statement in the same iteration of the DATA step.
Which SAS statement checks for the condition that Record equals C and executes a single statement to read the values for Amount?
1. ```
if record=c then input @3 Amount comma7.;
```
2. ```
if record='C' then input @3 Amount comma7.;
```
3. ```
if record='C' then do input @3 Amount comma7.;
```
4. ```
if record=C then do input @3 Amount comma7.;
```
Correct answer: b
The IF-THEN statement defines the condition that Record equals C and executes an INPUT statement to read the values for Amount when the condition is true. C must be enclosed in quotation marks and must be specified exactly as shown because it is a character value.
After the value for code is read in the sixth iteration, which illustration of the program data vector is correct?
```
data perm.produce (drop=code);
   infile orders;
   retain Vegetable;
   input code $1. @;
   if code='H' then input @3 vegetable $6.;
   if code='P';
   input @3 Variety : $10. @15 Supplier : $15.;
run;
proc print data=perm.produce;
run;
```
Correct answer: b
The value of Vegetable is retained across iterations of the DATA step. As the sixth iteration begins, the INPUT statement reads the value for code and holds the record, so that the values for Variety and Supplier can be read with an additional INPUT statement.
What happens when the fourth iteration of the DATA step is complete?
```
data perm.orders (drop=type);
   infile produce;
   retain Fruit;
   input type $1. @;
   if type='F' then input @3 fruit $7.;
   if type='V';
   input @3 Variety : $16. @20 Price comma5.;
run;
```
1. All of the values in the program data vector are written to the data set as the third observation.
2. All of the values in the program data vector are written to the data set as the fourth observation.
3. The values for Fruit, Variety, and Price are written to the data set as the third observation.
4. The values for Fruit, Variety, and Price are written to the data set as the fourth observation.
Correct answer: c
This program creates one observation for each detail record. The RETAIN statement retains the value for Fruit as part of each observation until the values for Variety and Price can be read. The DROP= option in the DATA statement prevents the values for type from being written to the data set.
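The sketch below is a simplified variant of this program that uses in-stream data. To avoid depending on exact column positions, it reads the remaining fields with modified list input instead of the @n pointer controls shown in the question; the data values are invented.

```sas
data work.orders (drop=type);
   retain Fruit;                      /* keep the header value across records */
   input type $1. @;                  /* read the record type, hold the record */
   if type='F' then input fruit : $7.;
   if type='V';                       /* continue only for detail records */
   input Variety : $16. Price : comma.;
   datalines;
F Apples
V Gala $2.10
V Fuji $1.95
F Grapes
V Concord $3.40
;
run;

proc print data=work.orders;
run;
```

The five data lines produce three observations, one for each V record, and each carries the Fruit value from the most recent F record.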
Which SAS statement indicates that several other statements should be executed when Record has a value of A?
1. ```
if record='A' then do;
```
2. ```
if record=A then do;
```
3. ```
if record='A' then;
```
4. ```
if record=A then;
```
Correct answer: a
The IF-THEN statement defines the condition that Record equals A and specifies a simple DO group. The keyword DO indicates that several executable statements follow until the DO group is closed by an END statement. The value A must be enclosed in quotation marks and specified exactly as shown because it is a character value.
Which is true for the following statements (X indicates a header record)?
```
if code='X' then do;
   if _n_ > 1 then output;
   Total=0;
   input Name $ 3-20;
end;
```
1. _N_ equals the number of times the DATA step has begun to execute.
2. When code='X' and _n_ > 1 are true, an OUTPUT statement is executed.
3. Each header record causes an observation to be written to the data set.
4. a and b
Correct answer: d
_N_ is an automatic variable whose value is the number of times the DATA step has begun to execute. The expression _n_ > 1 defines a condition where the DATA step has executed more than once. When the conditions code='X' and _n_ > 1 are true, an OUTPUT statement is executed, and Total is initialized to zero. Thus, each header record except for the first one causes an observation to be written to the data set.
What happens when the condition type='P' is false?
```
if type='P' then input @3 ID $5. @9 Address $20.;
else if type='V' then input @3 Charge 6.;
```
1. The values for ID and Address are read.
2. The values for Charge are read.
3. type is assigned the value of V.
4. The ELSE statement is executed.
Correct answer: d
The condition is false, so the values for ID and Address are not read. Instead, the ELSE statement is executed and defines another condition which may or may not be true.
What happens when last has a value other than zero?
```
data perm.househld (drop=code);
   infile citydata end=last;
   retain Address;
   input code $1. @;
   if code='A' then do;
      if _n_ > 1 then output;
      Total=0;
      input address $ 3-17;
   end;
   else if code='N' then total+1;
   if last then output;
run;
```
1. last has a value of 1.
2. The OUTPUT statement writes the last observation to the data set.
3. The current value of last is written to the data set.
4. a and b
Correct answer: d
You can determine when the current record is the last record in an external file by specifying the END= option in the INFILE statement. last is a temporary numeric variable whose value is zero until the last line is read. last has a value of 1 after the last line is read. Like automatic variables, the END= variable is not written to the data set.
Based on the values in the program data vector, what happens next?
```
data work.supplies (drop=type amount);
   infile orders end=last;
   retain Department Extension;
   input type $1. @;
   if type='D' then do;
      if _n_ > 1 then output;
      Total=0;
      input @3 department $10. @16 extension $5.;
   end;
   else if type='S' then do;
      input @16 Amount comma5.;
      total+amount;
      if last then output;
   end;
run;
```
1. All the values in the program data vector are written to the data set as the first observation.
2. The values for Department, Total, and Extension are written to the data set as the first observation.
3. The values for Department, Total, and Extension are written to the data set as the fourth observation.
4. The value of last changes to 1.
Correct answer: b
This program creates one observation for each header record and combines information from each detail record into the summary variable, Total. When the value of type is D and the value of _N_ is greater than 1, the OUTPUT statement executes, and the values for Department, Total, and Extension are written to the data set as the first observation. The variables _N_, last, type, and Amount are not written to the data set.