Appendix 1. Quiz Answer Keys

Chapter 1: Performing Queries Using PROC SQL 879
Chapter 2: Performing Advanced Queries Using PROC SQL 882
Chapter 3: Combining Tables Horizontally Using PROC SQL 887
Chapter 4: Combining Tables Vertically Using PROC SQL 893
Chapter 5: Creating and Managing Tables Using PROC SQL 901
Chapter 6: Creating and Managing Indexes Using PROC SQL 904
Chapter 7: Creating and Managing Views Using PROC SQL 908
Chapter 8: Managing Processing Using PROC SQL 911
Chapter 9: Introducing Macro Variables 915
Chapter 10: Processing Macro Variables at Execution Time 918
Chapter 11: Creating and Using Macro Programs 922
Chapter 12: Storing Macro Programs 926
Chapter 13: Creating Samples and Indexes 930
Chapter 14: Combining Data Vertically 934
Chapter 15: Combining Data Horizontally 940
Chapter 16: Using Lookup Tables to Match Data 945
Chapter 17: Formatting Data 951
Chapter 18: Modifying SAS Data Sets and Tracking Changes 954
Chapter 19: Introduction to Efficient SAS Programming 957
Chapter 20: Controlling Memory Usage 957
Chapter 21: Controlling Data Storage Space 959
Chapter 22: Using Best Practices 961
Chapter 23: Selecting Efficient Sorting Strategies 962
Chapter 24: Querying Data Efficiently 964

Chapter 1: Performing Queries Using PROC SQL

  1. Which of the clauses in the PROC SQL program below is written incorrectly?

    proc sql;
       select style sqfeet bedrooms
          from choice.houses
          where sqfeet ge 800;
    1. SELECT

    2. FROM

    3. WHERE

    4. both a and c

    Correct answer: a

    The SELECT clause in the program is written incorrectly. Columns that are listed in the clause must be separated by commas, not just blanks.

  2. How many statements does the program below contain?

    proc sql;
       select grapes,oranges,
              grapes + oranges as sumsales
          from sales.produce
          order by sumsales;
    1. two

    2. three

    3. four

    4. five

    Correct answer: a

    There are two statements, the PROC SQL statement and the SELECT statement. The SELECT statement contains three clauses.

  3. Complete the following PROC SQL query to select the columns Address and SqFeet from the table List.Size and to select Price from the table List.Price. (Only the Address column appears in both tables.)

    proc sql;
       _____________
           from list.size,list.price;
    1. select address,sqfeet,price
    2. select size.address,sqfeet,price
    3. select price.address,sqfeet,price
    4. either b or c

    Correct answer: b

    The SELECT clause lists the columns from both tables to be queried. You must use a prefix with the Address column because it appears in both tables. The prefix specifies the table from which you want the column to be read.

  4. Which of the clauses below correctly sorts rows by the values of the columns Price and SqFeet?

    1. order price, sqfeet
    2. order by price,sqfeet
    3. sort by price sqfeet
    4. sort price sqfeet

    Correct answer: b

    The ORDER BY clause specifies how the rows are to be sorted. You follow the keywords ORDER BY by one or more column names or numbers, separated by commas.

  5. Which clause below specifies that the two tables Produce and Hardware be queried? Both tables are located in a library to which the libref Sales has been assigned.

    1. select sales.produce sales.hardware
    2. from sales.produce sales.hardware
    3. from sales.produce,sales.hardware
    4. where sales.produce, sales.hardware

    Correct answer: c

    In the FROM clause, you list the names of the tables to be queried, separated by commas.

  6. Complete the SELECT clause below to create a new column named Profit by subtracting the values of the column Cost from those of the column Price.

    select fruit,cost,price,
        ________________
    1. Profit=price-cost
    2. price-cost as Profit
    3. profit=price-cost
    4. Profit as price-cost

    Correct answer: b

    To create a new column and assign a column alias to the column, you specify the following in the SELECT clause, in the order shown here: an expression, (optionally) the keyword AS, and a column alias. The case that you use when you create the column name is the one that will be displayed in the output.

  7. What happens if you use a GROUP BY clause in a PROC SQL step without a summary function?

    1. The step does not execute.

    2. The first numeric column is summed by default.

    3. The GROUP BY clause is changed to an ORDER BY clause.

    4. The step executes but does not group or sort data.

    Correct answer: c

    The GROUP BY clause is used in queries that include one or more summary functions. If you specify a GROUP BY clause in a query that does not contain a summary function, your clause is changed to an ORDER BY clause.

  8. If you specify a CREATE TABLE statement in your PROC SQL step,

    1. the results of the query are displayed, and a new table is created.

    2. a new table is created, but it does not contain any summarization that was specified in the PROC SQL step.

    3. a new table is created, but no report is displayed.

    4. results are grouped by the value of the summarized column.

    Correct answer: c

    The CREATE TABLE statement enables you to store your results in a SAS table instead of displaying the query results as a report.

  9. Which statement is true regarding the use of the PROC SQL step to query data that is stored in two or more tables?

    1. When you join multiple tables, the tables must contain a common column.

    2. You must specify the table from which you want each column to be read.

    3. The tables that are being joined must be from the same type of data source.

    4. If two tables that are being joined contain a same-named column, then you must specify the table from which you want the column to be read.

    Correct answer: d

    If you are joining two tables that contain a same-named column, then you must use a prefix to specify the table(s) from which you want the column to be read. Remember that if you join tables that don't contain columns that have matching data values, you can produce a huge amount of output. Be sure to specify a WHERE clause to select only the rows that you want.

  10. Which clause in the following program is incorrect?

    proc sql;
       select sex,mean(weight) as avgweight
          from company.employees company.health
          where employees.id=health.id
          group by sex;
    1. SELECT

    2. FROM

    3. WHERE

    4. GROUP BY

    Correct answer: b

    The table names that are specified in the FROM clause must be separated by commas. Note that you can specify columns in the WHERE clause that are not specified in the SELECT clause.

Chapter 2: Performing Advanced Queries Using PROC SQL

  1. Which PROC SQL query will remove duplicate values of MemberType from the query output, so that only the unique values are listed?

    1. proc sql nodup;
         select membertype
            from sasuser.frequentflyers;
    2. proc sql;
         select distinct(membertype)
                as MemberType
            from sasuser.frequentflyers;
    3. proc sql;
         select unique membertype
            from sasuser.frequentflyers
            group by membertype;
    4. proc sql;
         select distinct membertype
            from sasuser.frequentflyers;

    Correct answer: d

    To remove duplicate values from PROC SQL output, you specify the DISTINCT keyword before the column name in the SELECT clause.

  2. Which of the following will cause PROC SQL to list rows that have no data in the Address column?

    1. WHERE address is missing
    2. WHERE address not exists
    3. WHERE address is null
    4. both a and c

    Correct answer: d

    To list rows that have no data (that is, missing data), you can use either of these other conditional operators: IS MISSING or IS NULL. The NOT EXISTS operator is used specifically with a subquery, and resolves to true if the subquery returns no values to the outer query.

  3. You are creating a PROC SQL query that will list all employees who have spent (or overspent) their allotted 120 hours of vacation for the current year. The hours that each employee used are stored in the existing column Spent. Your query defines a new column, Balance, to calculate each employee's balance of vacation hours.

    Which query will produce the report that you want?

    1. proc sql;
         select name, spent,
                120-spent as calculated Balance
            from Company.Absences
            where balance <= 0;
    2. proc sql;
         select name, spent,
                120-spent as Balance
            from Company.Absences
            where calculated balance <= 0;
    3. proc sql;
         select name, spent,
                120-spent as Balance
            from Company.Absences
            where balance <= 0;
    4. proc sql;
         select name, spent,
                120-spent as calculated Balance
            from Company.Absences
            where calculated balance <= 0;

    Correct answer: b

    When a WHERE clause references a new column that was defined in the SELECT clause, the WHERE clause must specify the keyword CALCULATED before the column name.

  4. Consider this PROC SQL query:

    proc sql;
       select flightnumber,
              count(*) as Flights,
              avg(boarded)
              label="Average Boarded"
              format=3.
          from sasuser.internationalflights
          group by flightnumber
          having avg(boarded) > 150;

    The table Sasuser.Internationalflights contains 201 rows, 7 unique values of FlightNumber, 115 unique values of Boarded, and 4 different flight numbers that have an average value of Boarded that is greater than 150. How many rows of output will the query generate?

    1. 150

    2. 7

    3. 4

    4. 1

    Correct answer: c

    To determine how PROC SQL calculates and displays output from summary functions, consider the key factors. This PROC SQL query has a GROUP BY clause, and it does not specify any columns that are outside of summary functions. Therefore, PROC SQL calculates and displays the summary function for each group. There are 7 unique values of FlightNumber, but the HAVING clause specifies only the flights that have an average number of boarded passengers greater than 150. Because 4 of the 7 flight numbers meet this condition, the output will contain 4 rows.

  5. You are writing a PROC SQL query that will display the names of all library cardholders who work as volunteers for the library, and the number of books that each volunteer currently has checked out. You will use one or both of the following tables:

    • Library.Circulation lists the name and contact information for all library cardholders, and the number of books that each cardholder currently has checked out.

    • Library.Volunteers lists the name and contact information for all library volunteers.

    Assume that the values of Name are unique in both tables.

    Which of the following PROC SQL queries will produce your report?

    1. proc sql;
         select name, checkedout
            from library.circulation
            where * in
               (select *
                  from library.volunteers);
    2. proc sql;
         select name, checkedout
            from library.circulation
            where name in
               (select name
                  from library.volunteers);
    3. proc sql;
         select name
            from library.volunteers
            where name, checkedout in
               (select name, checkedout
                  from library.circulation);
    4. proc sql;
         select name, checkedout
            from library.circulation
            where name in
               (select name
                  from library.volunteers;);

    Correct answer: b

    Your PROC SQL query needs to use data from both tables. The outer query reads the name and number of books checked out from Library.Circulation. The multiple-value noncorrelated subquery selects the names of volunteers from Library.Volunteers and passes these names back to the outer query. The outer query then selects data for only the volunteers whose names match names returned by the subquery. The subquery is indented under the outer query's WHERE clause, is enclosed in parentheses, and does not require a semicolon inside the closing parenthesis.

  6. By definition, a noncorrelated subquery is a nested query that

    1. returns a single value to the outer query.

    2. contains at least one summary function.

    3. executes independently of the outer query.

    4. requires only a single value to be passed to it by the outer query.

    Correct answer: c

    A noncorrelated subquery is a nested query that executes independently of the outer query. The outer query passes no values to the subquery.

  7. Which statement about the following PROC SQL query is false?

    proc sql;
       validate
       select name label='Country',
              rate label='Literacy Rate'
          from world.literacy
          where 'Asia' =
             (select continent
                from world.continents
                where literacy.name =
                      continents.country)
          order by 2;
    1. The query syntax is not valid.

    2. The outer query must pass values to the subquery before the subquery can return values to the outer query.

    3. PROC SQL will not execute this query when it is submitted.

    4. After the query is submitted, the SAS log will indicate whether the query has valid syntax.

    Correct answer: a

    The syntax in this PROC SQL query is valid, so the first statement is false. The query contains a correlated subquery, so the second statement is true. The VALIDATE keyword is used after the PROC SQL statement, so the third statement is true. And the last statement correctly indicates that the VALIDATE keyword causes the SAS log to display a special message if the query syntax is valid, or standard error messages if the syntax is not valid.

  8. Consider the following PROC SQL query:

    proc sql;
       select lastname, firstname,
              total, since
          from charity.donors
          where not exists
             (select lastname
                from charity.current
                where donors.lastname =
                      current.lastname);

    The query references two tables:

    • Charity.Donors lists name and contact information for all donors who have made contributions since the charity was founded. The table also contains these two columns: Total, which shows the total dollars given by each donor, and Since, which stores the first year in which each donor gave money.

    • Charity.Current lists the names of all donors who have made contributions in the current year, and the total dollars each has given this year (YearTotal).

    Assume that the values of LastName are unique in both tables.

    The output of this query displays

    1. all donors whose rows do not contain any missing values.

    2. all donors who made a contribution in the current year.

    3. all donors who did not make a contribution in the current year.

    4. all donors whose current year's donation in Charity.Current has not yet been added to Total in Charity.Donors.

    Correct answer: c

    In this PROC SQL query, the outer query uses the operator NOT EXISTS with a correlated subquery. The outer query selects all rows from Charity.Donors whose names do not appear in Charity.Current. In other words, this PROC SQL query output lists all donors who did not make a contribution in the current year.

  9. Which statement about data remerging is true?

    1. When PROC SQL remerges data, it combines data from two tables.

    2. By using data remerging, PROC SQL can avoid making two passes through the data.

    3. When PROC SQL remerges data, it displays a related message in the SAS log.

    4. PROC SQL does not attempt to remerge data unless a subquery is used.

    Correct answer: c

    The third statement about data remerging is correct.

  10. A public library has several categories of books. Each book in the library is assigned to only one category. The table Library.Inventory contains one row for each book in the library. The Checkouts column indicates the number of times that each book has been checked out.

    You want to display only the categories that have an average circulation (number of checkouts) that is less than 2500. Does the following PROC SQL query produce the results that you want?

    proc sql;
    title 'Categories with Average Circulation';
    title2 'Less Than 2500';
       select category,
              avg(checkouts) as AvgCheckouts
          from library.inventory
          having avg(checkouts) < 2500
          order by 1;
    1. No. This query will not run because a HAVING clause cannot contain a summary function.

    2. No. This query will not run because the HAVING clause must include the CALCULATED keyword before the summary function.

    3. No. Because there is no GROUP BY clause, the HAVING clause treats the entire table as one group.

    4. Yes.

    Correct answer: c

    PROC SQL can execute this query, but the query will not produce the results that you want. If you omit the GROUP BY clause in a query that contains a HAVING clause, then the HAVING clause and any summary functions treat the entire table as one group. Without a GROUP BY clause, the HAVING clause in this example calculates the average circulation for the table as a whole (all books in the library), not for each group (each category of books). The output contains either all the rows in the table (if the average circulation for the entire table is less than 2500) or none of the rows in the table (if the average circulation for the entire table is greater than 2500).

Chapter 3: Combining Tables Horizontally Using PROC SQL

  1. A Cartesian product is returned when

    1. join conditions are not specified in a PROC SQL join.

    2. join conditions are not specified in a PROC SQL set operation.

    3. more than two tables are specified in a PROC SQL join.

    4. the keyword ALL is used with the OUTER UNION operator.

    Correct answer: a

    A Cartesian product is returned when join conditions are not specified in a PROC SQL join. In a Cartesian product, each row from the first table is combined with every row from the second table.

  2. Given the PROC SQL query and tables shown below, which output is generated?

    Chapter 3: Combining Tables Horizontally Using PROC SQL
    1. Chapter 3: Combining Tables Horizontally Using PROC SQL
    2. Chapter 3: Combining Tables Horizontally Using PROC SQL
    3. Chapter 3: Combining Tables Horizontally Using PROC SQL
    4. Chapter 3: Combining Tables Horizontally Using PROC SQL

    Correct answer: b

    This PROC SQL query is an inner join. It combines the rows from the first table that match rows from the second table, based on the matching criteria specified in the WHERE clause. Columns are not overlaid, so all columns from the referenced tables (including any columns with duplicate names) are displayed. Any unmatched rows from either table are not displayed.

  3. Given the PROC SQL query and tables shown below, which output is generated?

    Chapter 3: Combining Tables Horizontally Using PROC SQL
    1. Chapter 3: Combining Tables Horizontally Using PROC SQL
    2. Chapter 3: Combining Tables Horizontally Using PROC SQL
    3. Chapter 3: Combining Tables Horizontally Using PROC SQL
    4. Chapter 3: Combining Tables Horizontally Using PROC SQL

    Correct answer: d

    This PROC SQL query is a right outer join, which retrieves all rows that match across tables, based on the join conditions in the ON clause, plus nonmatching rows from the right (second) table.

  4. Which PROC SQL query produces the same output as the query shown here?

    proc sql;
       select a.*,
              duration
          from groupa as a,
               groupb as b
          where a.obs=b.obs;

    Note

    Assume that the table Groupa contains the columns Obs and Med. Groupb contains the columns Obs and Duration. Δ

    1. proc sql;
         select a.obs label='Obs',
          med
          b.obs label='Obs',
          duration
      from groupa as a, groupb as b
      where a.obs=b.obs;
    2. proc sql;
         select coalesce(a.obs, b.obs)
                label='Obs', med, duration
            from groupa as a
            full join
            groupb as b
            on a.obs=b.obs;
    3. proc sql;
         select a.*, duration
            from groupa as a
            left join
            groupb as b
            where a.obs=b.obs;
    4. proc sql;
         select a.*, duration
            from groupa as a
            inner join
            groupb as b
            on a.obs=b.obs;

    Correct answer: d

    There are two valid formats for writing a PROC SQL inner join. The PROC SQL query shown at the top of this question uses the first inner join format, which does not use a keyword to indicate the type of join. The alternate format is similar to an outer join and uses the keyword INNER JOIN.

  5. Which output will the following PROC SQL query generate?

    Chapter 3: Combining Tables Horizontally Using PROC SQL
    1. Chapter 3: Combining Tables Horizontally Using PROC SQL
    2. Chapter 3: Combining Tables Horizontally Using PROC SQL
    3. Chapter 3: Combining Tables Horizontally Using PROC SQL
    4. Chapter 3: Combining Tables Horizontally Using PROC SQL

    Correct answer: a

    This PROC SQL query is a left outer join, which retrieves all rows that match across tables (based on the join conditions in the ON clause), plus nonmatching rows from the left (first) table. No columns are overlaid, so all columns from both tables are displayed.

  6. In order for PROC SQL to perform an inner join,

    1. the tables being joined must contain the same number of columns.

    2. the tables must be sorted before they are joined.

    3. the columns that are specified in a join condition in the WHERE clause must have the same data type.

    4. the columns that are specified in a join condition in the WHERE clause must have the same name.

    Correct answer: c

    Inner joins combine the rows from the first table that match rows from the second table, based on one or more join conditions in the WHERE clause. The columns being matched must have the same data type, but they are not required to have the same name. For joins, the tables being joined can have different numbers of columns, and the rows do not need to be sorted.

  7. Which statement about in-line views is false?

    1. Once defined, an in-line view can be referenced in any PROC SQL query in the current SAS session.

    2. An in-line view can be assigned a table alias but not a permanent name.

    3. In-line views can be combined with tables in PROC SQL joins and set operations.

    4. This PROC SQL query contains an in-line view that uses valid syntax:

      proc sql;
         select name, numvisits
            from (select name, sum(checkin)
                 as numvisits
                    from facility as f, members as m
              
                 where area='POOL' and
                       f.id=m.id
                 group by name)
      where numvisits<=10
      order by 1;

    Correct answer: a

    Unlike a table, an in-line view exists only during query execution. Because it is temporary, an in-line view can be referenced only in the query in which it is defined.

  8. Which PROC SQL query will generate the same output as the DATA step match-merge and PRINT step shown below?

    Chapter 3: Combining Tables Horizontally Using PROC SQL
    1. proc sql;
      title 'Merged';
         select a.g3, z, r
            from table1 as a
            full join
            table2 as b
            on a.g3 = b.g3
            order by 1;
    2. proc sql;
      title 'Merged';
         select a.g3, z, r
            from table1 as a
            full join corr
            table2 as b
            on a.g3 = b.g3
            order by 1;
    3. proc sql;
      title 'Merged';
         select coalesce(a.g3, b.g3)
                label='G3', z, r
            from table1 as a
            full join
            table2 as b
            on a.g3 = b.g3
            order by 1;
    4. proc sql;
      title 'Merged';
         select g3, z, r
            from table1 as a
      full join
      table2 as b
      on a.g3 = b.g3
      order by 1;

    Correct answer: c

    In order to generate the same output as the DATA step and PRINT steps, the PROC SQL full outer join must use the COALESCE function with the duplicate columns specified as arguments.

  9. A PROC SQL inner join can combine

    1. a maximum of 2 tables or in-line views, but multiple joins can be chained together.

    2. a maximum of 32 tables or 2 in-line views.

    3. a maximum of 32 tables, which includes any tables referenced by an in-line view.

    4. a maximum of 2 tables and 32 columns.

    Correct answer: c

    A maximum of 32 tables can be combined in a single inner join. If the join involves views (either in-line views or PROC SQL views), it is the number of tables that underlie the views, not the number of views, that counts towards the limit of 32.

  10. Which statement about the use of table aliases is false?

    1. Table aliases must be used when referencing identical table names from different libraries.

    2. Table aliases can be referenced by using the keyword AS.

    3. Table aliases (or full table names) must be used when referencing a column name that is the same in two or more tables.

    4. Table aliases must be used when using summary functions.

    Correct answer: d

    The use of summary functions does not require the use of table aliases. All of the other statements about table aliases that are shown here are true.

Chapter 4: Combining Tables Vertically Using PROC SQL

  1. Which statement is false with respect to a set operation that uses the EXCEPT, UNION, or INTERSECT set operator without a keyword?

    1. Column names in the result set are determined by the first table.

    2. To be overlaid, columns must be of the same data type.

    3. To be overlaid, columns must have the same name.

    4. By default, only unique rows are displayed in the result set.

    Correct answer: c

    In set operations that use the operator EXCEPT, INTERSECT, or UNION, and no keyword, columns are overlaid based on their position in the SELECT clause. It does not matter whether the overlaid columns have the same name. When columns are overlaid, the column name is taken from the first table that is specified in the SELECT clause.

  2. The keyword ALL cannot be used with which of the following set operators?

    1. EXCEPT

    2. INTERSECT

    3. UNION

    4. OUTER UNION

    Correct answer: d

    By default, when processing a set operation that contains the EXCEPT, INTERSECT, and UNION set operators, PROC SQL makes an extra pass through the data to eliminate duplicate rows. The keyword ALL is used to suppress that additional pass through the tables, allowing duplicate rows to appear in the result set. Because the OUTER UNION set operator displays all rows, the keyword ALL is invalid and cannot be used with OUTER UNION.

  3. Which PROC SQL step combines the tables Summer and Winter to produce the output displayed below?

    Chapter 4: Combining Tables Vertically Using PROC SQL
    1. proc sql;
         select *
            from summer
         intersect all
         select *
            from winter;
    2. proc sql;
         select *
            from summer
         outer union
         select *
            from winter;
    3. proc sql;
         select *
            from summer
         union corr
         select *
            from winter;
    4. proc sql;
         select *
            from summer
         union
         select *
            from winter;

    Correct answer: d

    The output shown above contains all rows that are unique in the combined set of rows from both tables, and the columns have been overlaid by position. This output is generated by a set operation that uses the set operator UNION without keywords.

  4. Which PROC SQL step combines tables but does not overlay any columns?

    1. proc sql;
         select *
            from groupa
         outer union
         select *
            from groupb;
    2. proc sql;
         select *
            from groupa as a
         outer union corr
         select *
            from groupb as b;
    3. proc sql;
         select coalesce(a.obs, b.obs)
                label='Obs', med, duration
            from groupa as a
            full join
            groupb as b
            on a.obs=b.obs;
    4. proc sql;
         select *
            from groupa as a
         intersect
         select *
            from groupb as b;

    Correct answer: a

    The PROC SQL set operation that uses the set operator OUTER UNION without a keyword is the only code shown that does not overlay any columns in output.

  5. Which statement is false regarding the keyword CORRESPONDING?

    1. It cannot be used with the keyword ALL.

    2. It overlays columns by name, not by position.

    3. When used in EXCEPT, INTERSECT, and UNION set operations, it removes any columns not found in both tables.

    4. When used in OUTER UNION set operations, it causes same-named columns to be overlaid.

    Correct answer: a

    The keyword CORRESPONDING (CORR) can be used alone or together with the keyword ALL.

  6. Which PROC SQL step generates the following output from the tables Dogs and Pets?

    Chapter 4: Combining Tables Vertically Using PROC SQL
    1. proc sql;
         select name, price
            from pets
         except all
         select *
            from dogs;
    2. proc sql;
         select name, price
            from pets
         except
         select *
            from dogs;
    3. proc sql;
         select name, price
            from pets
         except corr all
         select *
            from dogs;
    4. proc sql;
         select *
            from dogs
         except corr
         select name, price
            from pets;

    Correct answer: b

    This PROC SQL output includes all rows from the table Pets that do not appear in the table Dogs. No duplicates are displayed. A PROC SQL set operation that contains the set operator EXCEPT without keywords produces these results.

  7. The PROG1 and PROG2 tables list students who took the PROG1 and PROG2 courses, respectively. Which PROC SQL step will give you the names of the students who took only the PROG1 class?

    Chapter 4: Combining Tables Vertically Using PROC SQL
    1. proc sql;
         select fname, lname
            from prog1
         intersect
         select fname, lname
            from prog2;
    2. proc sql;
         select fname, lname
            from prog1
         except all
         select fname, lname
            from prog2;
    3. proc sql;
         select *
            from prog2
         intersect corr
         select *
            from prog1;
    4. proc sql;
         select *
            from prog2
         union
         select *
            from prog1;

    Correct answer: b

    The set operator EXCEPT returns all the rows in the first table that do not appear in the second table. The keyword ALL suppresses the extra pass that PROC SQL makes through the data to eliminate duplicate rows. The EXCEPT operator when used alone will also produce the output specified in the question.

  8. Which PROC SQL step will return the names of all the students who took PROG1, PROG2, or both classes?

    Chapter 4: Combining Tables Vertically Using PROC SQL
    1. proc sql;
         select fname, lname
            from prog1
         intersect
         select fname, lname
            from prog2;
    2. proc sql;
         select fname, lname
            from prog1
         outer union corr
         select fname, lname
            from prog2;
    3. proc sql;
         select fname, lname
            from prog1
         union
         select fname, lname
            from prog2;
    4. proc sql;
         select fname, lname
            from prog1
         except corr
         select fname, lname
            from prog2;

    Correct answer: c

    The set operator UNION returns all rows that are unique in the combined set of rows from both tables.

  9. Which PROC SQL step will return the names of all the students who took both the PROG1 and PROG2 classes?

    Chapter 4: Combining Tables Vertically Using PROC SQL
    1. proc sql;
         select fname, lname
            from prog1
         union
         select fname, lname
            from prog2;
    2. proc sql;
         select fname, lname
            from prog1
         except corr
         select fname, lname
            from prog2;
    3. proc sql;
         select fname, lname
            from prog1
         intersect all
         select fname, lname
            from prog2;
    4. proc sql;
         select fname, lname
            from prog1
         union corr
         select fname, lname
            from prog2;

    Correct answer: c

    The set operator INTERSECT returns all rows that are common to both tables. Specifying the keyword ALL suppresses PROC SQL's additional pass through the data to eliminate duplicate rows.

  10. Which PROC SQL step will generate the same results as the following DATA step?

    Chapter 4: Combining Tables Vertically Using PROC SQL
    1. proc sql;
         select fname, lname
            from prog1
         outer union corr
         select fname, lname
            from prog2
         order by lname;
    2. proc sql;
         select fname, lname
            from prog1
         union
         select fname, lname
            from prog2
         order by lname;
    3. proc sql;
         select fname, lname
            from prog2
         outer union
         select fname, lname
            from prog1
         order by lname;
    4. proc sql;
         select fname, lname
            from prog2
         union corr
         select fname, lname
            from prog1
         order by lname;

    Correct answer: a

    The DATA step returns all rows from the first table along with all rows from the second table, maintaining the order specified in the BY statement. Same-named columns are overlaid by default. The set operator OUTER UNION returns all rows from both tables. The CORR keyword causes same-named columns to be overlaid. The ORDER BY clause causes the result rows to be ordered by values of the specified column (LName).

Chapter 5: Creating and Managing Tables Using PROC SQL

  1. Which of the following PROC SQL steps creates a new table by copying only the column structure (but not the rows) of an existing table?

    1. proc sql;
         create table work.newpayroll as
            select *
               from sasuser.payrollmaster;
    2. proc sql;
         create table work.newpayroll
            like sasuser.payrollmaster;
    3. proc sql;
         create table work.newpayroll
            copy sasuser.payrollmaster;
    4. proc sql;
         create table work.newpayroll
            describe sasuser.payrollmaster;

    Correct answer: b

    The CREATE TABLE statement that includes a LIKE clause copies the column names and attributes from an existing table into a new table. No rows of data are inserted.

  2. Which of the following PROC SQL steps creates a table that contains rows for the level-1 flight attendants only?

    1. proc sql;
         create table work.newpayroll as
            select *
               from sasuser.payrollmaster
               where jobcode='FA1';
    2. proc sql;
         create work.newpayroll as
            select *
               from sasuser.payrollmaster
               where jobcode='FA1';
    3. proc sql;
         create table work.newpayroll
            copy sasuser.payrollmaster
               where jobcode='FA1';
    4. proc sql;
         create table work.newpayroll as
                sasuser.payrollmaster
            where jobcode='FA1';

    Correct answer: a

    The CREATE TABLE statement that includes the AS keyword and query clauses creates a table and loads the results of the query into the new table. The WHERE clause selects only the rows for the level-1 flight attendants.

  3. Which of the following statements is true regarding the UNDO_POLICY=REQUIRED option?

    1. It must be used with the REQUIRED integrity constraint.

    2. It ignores the specified integrity constraints if any of the rows that you want to insert or update do not meet the constraint criteria.

    3. It restores your table to its original state if any of the rows that you try to insert or update do not meet the specified integrity constraint criteria.

    4. It allows rows that meet the specified integrity constraint criteria to be inserted or updated, but rejects rows that do not meet the integrity constraint criteria.

    Correct answer: c

    UNDO POLICY=REQUIRED is the default setting for PROC SQL. This setting undoes all inserts or updates if 1 or more rows violate the integrity constraint criteria, and restores the table to its original state before the inserts or updates.

  4. Which of the following is not a type of integrity constraint?

    1. CHECK

    2. NULL

    3. UNIQUE

    4. PRIMARY KEY

    Correct answer: b

    The NOT NULL integrity constraint specifies that data is required and cannot have a null (missing) value.

  5. Which of the following PROC SQL steps deletes rows for all frequent-flyer program members who traveled less than 10,000 miles?

    1. proc sql;
         delete rows
            from work.frequentflyers
            where milestraveled < 10000;
    2. proc sql;
         drop rows
            from work.frequentflyers
            where milestraveled < 10000;
    3. proc sql;
         drop table
            from work.frequentflyers
            where milestraveled < 10000;
    4. proc sql;
         delete
            from work.frequentflyers
            where milestraveled < 10000;

    Correct answer: d

    The DELETE statement deletes rows that are specified in the WHERE clause from the table. If no WHERE clause is specified, all rows are deleted. The DROP TABLE statement drops (deletes) an entire table; the syntax shown in option c is not valid.

  6. Which of the following PROC SQL steps gives bonuses (in points) to frequent-flyer program members as follows:

    • a 50% bonus for members who traveled less than 10,000 miles

    • a 100% bonus for members who traveled 10,000 miles or more?

    1. proc sql;
         update work.frequentflyers
         set pointsearned=pointsearned*
            case if milestraveled < 10000
                    then 1.5
                 if milestraveled >= 10000
                    then 2
                 else 1
            end;
    2. proc sql;
         update work.frequentflyers
         set pointsearned=pointsearned*
            case when milestraveled < 10000
                      then 1.5
                 when milestraveled >= 10000
                      then 2
                 else 1
            end;
    3. proc sql;
         update work.frequentflyers
         set pointsearned=pointsearned*
            case if milestraveled < 10000
                    then pointsearned*1.5
                 if milestraveled >= 10000
                    then pointsearned*2
                 else 1
            end;
    4. proc sql;
         update work.frequentflyers
         set pointsearned=pointsearned*
            case if milestraveled < 10000
                    then pointsearned*1.5
                 if milestraveled >= 10000
                    then pointsearned*2
                 else pointsearned*1
            end;

    Correct answer: b

    The UPDATE statement that includes a SET clause is used to modify rows in a table. WHEN-THEN clauses in the CASE expression enable you to update a column value based on specified criteria.

  7. Which of the following statements is used to add new rows to a table?

    1. INSERT

    2. LOAD

    3. VALUES

    4. CREATE TABLE

    Correct answer: a

    The INSERT statement is used to insert new rows into a new or existing table. There is no LOAD statement in PROC SQL, VALUES is a clause, and the CREATE TABLE statement is used to create a table.

  8. Which of the following statements regarding the ALTER TABLE statement is false?

    1. It allows you to update column attributes.

    2. It allows you to add new columns in your table.

    3. It allows you to drop columns in your table.

    4. It allows you to change a character column to a numeric column.

    Correct answer: d

    The ALTER TABLE statement is used to modify attributes of existing columns (include the MODIFY clause), add new column definitions (include the ADD clause), or delete existing columns (include the DROP clause).

  9. Which of the following displays the structure of a table in the SAS log?

    1. proc sql;
         describe as
            select *
               from sasuser.payrollmaster;
    2. proc sql;
         describe contents sasuser.payrollmaster;
    3. proc sql;
         describe table sasuser.payrollmaster;
    4. proc sql;
         describe * from sasuser.payrollmaster;

    Correct answer: c

    The DESCRIBE TABLE statement lists the column attributes for a specified table.

  10. Which of the following creates an empty table that contains the 2 columns FullName and Age?

    1. proc sql;
         create table work.names
            (FullName char(25), Age num);
    2. proc sql;
         create table work.names as
            (FullName char(25), Age num);
    3. proc sql;
         create work.names
            (FullName char(25), Age num);
    4. proc sql;
         create table work.names
            set (FullName char(25), Age num);

    Correct answer: a

    The CREATE TABLE statement can include column specifications to create an empty table. The entire group of column specifications must be enclosed in a single set of parentheses. You must list each column's name, data type, and (for character columns) length. The length is specified as an integer in parentheses. Multiple column specifications must be separated by commas.

Chapter 6: Creating and Managing Indexes Using PROC SQL

  1. Which of the following will create an index on the column EmpID for the table Sasuser.Staffmaster?

    1. proc sql;
         create simple index(empid)
            on sasuser.staffmaster;
    2. proc sql;
         create empid index
            on sasuser.staffmaster(empid);
    3. proc sql;
         create simple index
            on empid from sasuser.staffmaster;
    4. proc sql;
         create index empid
            on sasuser.staffmaster(empid);

    Correct answer: d

    The index specified above is based on one column, so it is a simple index. In the CREATE INDEX statement, you specify the index name after the keywords CREATE INDEX. You do not include a keyword to specify that this is a simple index. The name of the key column is specified in parentheses after the table name. The name of a simple index must be the same as the name of the key column.

  2. Which keyword must you add to your index definition in the CREATE INDEX statement to ensure that no duplicate values of the key column can exist?

    1. KEY

    2. UNIQUE

    3. NODUPS

    4. NODUPKEY

    Correct answer: b

    To create a unique index, the UNIQUE keyword is added to the CREATE INDEX statement, between the keywords CREATE and INDEX.

  3. Which of the following will create a composite index for the table Sasuser.Flightdelays?(Sasuser.Flightdelays contains the following columns: Date, FlightNumber, Origin, Destination, DelayCategory, DestinationType, DayOfWeek, and Delay.)

    1. proc sql;
         create index destination
            on sasuser.flightdelays(flightnumber,
                                    destination);
    2. proc sql;
         create composite index places
            on sasuser.flightdelays (flightnumber,
                                     destination);
    3. proc sql;
         create index on flightnumber,destination
            from sasuser.flightdelays (places);
    4. proc sql;
         create index places
            on sasuser.flightdelays (flightnumber,
                                     destination);

    Correct answer: d

    A composite index is based on two or more columns. In the CREATE INDEX statement, you specify the index name after the keywords CREATE INDEX. You do not include a keyword to specify that this is a composite index. The names of the key columns are specified in parentheses after the table name. The name of a composite index cannot be the same as the name of any columns in the table.

  4. Which of the following will write a message to the SAS log that shows whether PROC SQL has used an index?

    1. options msglevel=i;
      proc sql;
         select *
            from sasuser.internationalflights
            where date between '01mar2000'd
                  and '07mar2000'd;
    2. options index=yes;
      proc sql;
         select *
            from sasuser.internationalflights
            where date between '01mar2000'd
                  and '07mar2000'd;
    3. proc sql;
         select * (idxwhere=yes)
            from sasuser.internationalflights
            where date between '01mar2000'd
                  and '07mar2000'd;
    4. proc sql;
         select * (msglevel=i)
            from sasuser.internationalflights
            where date between '01mar2000'd
                  and '07mar2000'd;

    Correct answer: a

    Specifying the option MSGLEVEL=I causes informational messages about index usage to be written to the SAS log.

  5. Which of the following will drop (delete) an index from a table?

    1. proc sql;
         drop composite index flights
            from sasuser.marchflights;
    2. proc sql;
         delete index flights
            on sasuser.staffmaster(flightnumber, date);
    3. proc sql;
         drop index flights
            from sasuser.marchflights;
    4. proc sql;
         delete index
            on sasuser.marchflights(flightnumber,
                                    flightdate);

    Correct answer: c

    The DROP INDEX statement drops one or more specified indexes from a table. You specify the name of each index to be dropped after the keywords DROP INDEX. The table name is specified after the keyword FROM. The type of index and the names of the indexed columns are not specified in the statement.

  6. Which of the following statements will show you all the indexes that are defined for a table?

    1. DESCRIBE INDEX

    2. DESCRIBE TABLE

    3. SELECT

    4. IDXNAME

    Correct answer: b

    The DESCRIBE TABLE statement lists all indexes for one or more tables that you specify, along with other information about the table(s).

  7. What is the purpose of specifying the data set option IDXWHERE=YES?

    1. It forces SAS to use the best available index to process the WHERE expression.

    2. It creates an index from the expression in the WHERE clause.

    3. It writes messages about index usage to the SAS log.

    4. It stops SAS from using any index.

    Correct answer: a

    The IDXWHERE=YES data set option tells SAS to use the best available index, even if the index does not optimize performance.

  8. Which of the following is false regarding the use of an index?

    1. Equijoins can be performed without internal sorts.

    2. Indexes provide fast access to a small subset of data.

    3. Indexes can be created for numeric columns only.

    4. Indexes can enforce uniqueness.

    Correct answer: c

    Indexes can be created on either character or numeric columns.

  9. Using an index is not likely to optimize a PROC SQL query in which of the following situations?

    1. The query contains an IN subquery that references the key column.

    2. The key column is specified in a WHERE clause expression that contains a comparison operator, the TRIM or SUBSTR function, the CONTAINS operator, or the LIKE operator.

    3. The query is an equijoin, and all the columns in the join expression are indexed in one of the tables being joined.

    4. The key column is specified only in a SELECT clause.

    Correct answer: d

    Using an index will optimize specific classes of PROC SQL queries. A query in which the key column is specified only in a SELECT clause is not one of these queries.

  10. Which of the following is false regarding the IDXNAME= data set option?

    1. The specified index must exist.

    2. The specified index must be suitable by having at least its first or only column match a condition in the WHERE expression.

    3. The option allows you to create and name an index on the table.

    4. The option directs SAS to use an index that you specify.

    Correct answer: c

    The IDXNAME= data set option directs PROC SQL to use an index that you specify. The specified index must exist and must be suitable by having at least its first or only column match the condition in the WHERE expression.

Chapter 7: Creating and Managing Views Using PROC SQL

  1. Which of the following statements is false regarding a PROC SQL view?

    1. A view cannot be used in a join.

    2. A view accesses the most current underlying data.

    3. A view follows the same naming conventions as a table.

    4. A view can be used in SAS programs in place of an actual SAS data file.

    Correct answer: a

    A PROC SQL view accesses the most current underlying data and can be joined with tables or other views. In addition, a PROC SQL view can

    • be used in SAS programs in place of an actual SAS data file

    • be derived from one or more tables, PROC SQL views, or DATA step views.

  2. Which of the following statements describes an advantage of using a PROC SQL view?

    1. Views often save space, because a view is usually quite small compared with the data that it accesses.

    2. Views prevent users from continually submitting queries to omit unwanted columns or rows.

    3. Views hide complex joins or queries from users.

    4. all of the above

    Correct answer: d

    PROC SQL views are useful because they

    • often save space (a view is usually quite small compared with the data that it accesses)

    • prevent users from continually submitting queries to omit unwanted columns or rows

    • hide complex joins or queries from users.

    In addition, PROC SQL views

    • ensure that input data sets are always current, because data is derived from tables at execution time

    • can be used to shield sensitive or confidential columns from users while enabling the same users to view other columns in the same table.

  3. Which PROC SQL step creates a view that queries the table Sasuser.Payrollmaster?

    1. proc sql;
         insert into sasuser.newview
            select * from sasuser.payrollmaster;
    2. proc sql;
         create sasuser.newview as
            select * from sasuser.payrollmaster;
    3. proc sql;
         create view sasuser.newview as
            select * from sasuser.payrollmaster;
    4. proc sql;
         select * from sasuser.payrollmaster
            into view sasuser.newview;

    Correct answer: c

    You use the CREATE VIEW statement to create a view. The keywords CREATE VIEW are followed by the name of the view and the keyword AS.

  4. Which of the following PROC SQL steps enables you to see a description of the view definition?

    1. proc sql;
         select * from sasuser.payrollmasterv;
    2. proc sql;
         describe view sasuser.payrollmasterv;
    3. proc sql;
         list sasuser.payrollmasterv;
    4. proc sql;
         contents view=sasuser.payrollmasterv;

    Correct answer: b

    The DESCRIBE VIEW statement displays the view definition in the SAS log.

  5. Which PROC SQL step correctly references the view Data.Empview?

    1. proc sql;
         select *
            from data.empview;
    2. proc sql;
         select *
            from view data.empview;
    3. proc sql;
         select view *
            from data.empview;
    4. proc sql;
         select *
            from data
            where view='empview';

    Correct answer: a

    A view can be used in a PROC SQL step just as you would use an actual SAS table.

  6. Which of the following PROC SQL steps correctly embeds a LIBNAME statement with a view definition?

    1. proc sql;
         insert into sasuser.newview
            select * from airline.supervisors
               libname airline 'c:mysql';
    2. proc sql;
         create view sasuser.newview as
            from airline.supervisors
               embed libname airline 'c:mysql';
    3. proc sql;
         using airline 'c:mysql';
            insert into sasuser.newview
               select * from airline.supervisors;
    4. proc sql;
         create view sasuser.newview as
            select * from airline.supervisors
               using libname airline 'c:mysql';

    Correct answer: d

    The USING clause enables you to embed a LIBNAME statement in your view definition. The USING clause must be the last clause in the CREATE VIEW statement.

  7. PROC SQL views can access data from

    1. a SAS data file.

    2. another PROC SQL view.

    3. a relational database table.

    4. all of the above

    Correct answer: d

    PROC SQL views can access data from a SAS data file, a DATA step view, a PROC SQL view, or a relational database table.

  8. When you are working with PROC SQL views, it is best to

    1. avoid using an ORDER BY clause in a view.

    2. avoid creating views that are based on tables whose structure might change.

    3. specify a one-level name in the FROM clause if the view resides in the same SAS data library as the contributing table(s).

    4. all of the above

    Correct answer: d

    When you are working with PROC SQL views, it is best to

    • avoid using an ORDER BY clause in a view. If you specify an ORDER BY clause, the data must be sorted each time the view is referenced.

    • avoid creating views that are based on tables whose structure might change. A view is no longer valid when it references a nonexistent column.

    • specify a one-level name in the FROM clause if the view resides in the same SAS data library as the contributing table(s). Using a one-level name in the FROM clause prevents you from having to change the view if you assign a different libref to the SAS data library that contains the view and its contributing table or tables.

  9. You can update the data underlying PROC SQL view using the INSERT, DELETE, and UPDATE statements under which of the following conditions:

    1. The view is joined or linked to another table.

    2. The view contains a subquery.

    3. The view contains a WHERE clause.

    4. all of the above

    Correct answer: c

    You can update a PROC SQL view provided that the view does not join or link to another table, the view does not have a subquery, or you try to update a derived column. You can update a view that contains a WHERE clause. The WHERE clause can be in the UPDATE clause or in the view. You cannot update a view that contains any other clause such as an ORDER BY or a HAVING clause.

  10. Which of the following programs drops (deletes) a view?

    1. proc sql;
         delete sasuser.newview;
    2. proc sql;
         drop view sasuser.newview;
    3. proc sql;
         erase view sasuser.newview;
    4. proc sql;
         remove newview from sasuser;

    Correct answer: b

    The DROP VIEW statement drops a view from the specified library.

Chapter 8: Managing Processing Using PROC SQL

  1. PROC SQL options are specified in

    1. the PROC SQL statement.

    2. an OPTIONS statement.

    3. a SELECT statement.

    4. the OPTIONS procedure.

    Correct answer: a

    PROC SQL options are specified in the PROC SQL statement. After you specify an option, it remains in effect until you change it or you re-invoke PROC SQL.

  2. Which of the following options restricts the number of rows that PROC SQL takes as input from any single source?

    1. OUTOBS=

    2. INOBS=

    3. OBS=

    4. none of the above

    Correct answer: b

    The INOBS= option restricts the number of rows that PROC SQL takes as input from any single source. The INOBS= option is similar to the SAS system option OBS= and is useful for debugging queries on large tables. The OUTOBS= option restricts the number of rows that PROC SQL displays or writes to a table.

  3. Which PROC SQL step creates the output shown below?

    Chapter 8: Managing Processing Using PROC SQL
    Chapter 8: Managing Processing Using PROC SQL
    1. proc sql nonumber outobs=10;
         select *
            from sasuser.flightattendants
            where jobcode='FA1';
         select *
            from sasuser.flightattendants
            where jobcode='FA2';
    2. proc sql number;
         select *
            from sasuser.flightattendants
            where jobcode='FA1';
      reset nonumber outobs=10;
         select *
            from sasuser.flightattendants
            where jobcode='FA2';
    3. proc sql nonumber;
         select *
            from sasuser.flightattendants
            where jobcode='FA1';
      reset number outobs=10;
         select *
            from sasuser.flightattendants
            where jobcode='FA2';
    4. proc sql;
         select *
            from sasuser.flightattendants
            where jobcode='FA1';
      reset outobs=10;
         select *
             from sasuser.flightattendants
             where jobcode='FA2';

    Correct answer: c

    After you specify an option, it remains in effect until you change it or you re-invoke PROC SQL. You can use the RESET statement to add, drop, or change PROC SQL options without re-invoking the SQL procedure. In the correct answer, the RESET statement adds the NUMBER option and the OUTOBS= option. The resulting output lists the first 10 rows in the table Sasuser.Flightattendants where the value of Jobcode equals FA2 and includes a column named Row.

  4. Which of the following options does not affect the appearance of HTML, PDF, or RTF output?

    1. NUMBER | NONUMBER

    2. DOUBLE | NODOUBLE

    3. FLOW | NOFLOW | FLOW=n | FLOW=n m

    4. b and c

    Correct answer: d

    The DOUBLE | NODOUBLE option specifies whether PROC SQL output is double-spaced in listing output. The FLOW | NOFLOW | FLOW=n | FLOW=n m option controls the appearance of wide character columns in listing output. Neither option affects the appearance of HTML output.

  5. Which of the following statements is true regarding the STIMER option in PROC SQL?

    1. The STIMER option in PROC SQL writes timing information for each statement to the SAS log.

    2. The STIMER option in PROC SQL writes only cumulative timing information for the entire procedure to the SAS log.

    3. When using the STIMER option in PROC SQL, the SAS system option STIMER must also be in effect.

    4. a and c

    Correct answer: d

    The STIMER | NOSTIMER option in PROC SQL specifies whether PROC SQL writes timing information for each statement to the SAS log, instead of as a cumulative value for the entire procedure. NOSTIMER is the default. In order to use the STIMER option in PROC SQL, the SAS system option STIMER (the default) must also be in effect. If you use the system option alone, you will receive timing information for the entire procedure, not on a statement-by-statement basis.

  6. A Dictionary table contains

    1. information about SAS data libraries.

    2. information about SAS data sets.

    3. information about SAS macros.

    4. all of the above

    Correct answer: d

    A Dictionary table is a special, read-only SAS data view that contains information about SAS data libraries, SAS data sets, SAS macros, and external files that are in use or available in the current SAS session. A Dictionary table also contains the settings for SAS system options that are currently in effect.

  7. Dictionary tables are

    1. created each time they are referenced in a SAS program.

    2. updated automatically.

    3. limited to read-only access.

    4. all of the above

    Correct answer: d

    Dictionary tables are created each time they are referenced in a SAS program, updated automatically, and limited to read-only access. Accessing a Dictionary table causes SAS to determine the current state of the SAS session and return the information that you want.

  8. Dictionary tables can be accessed

    1. by running a PROC SQL query against the table, using the Dictionary libref.

    2. by referring to the PROC SQL view of the table that is stored in the Sashelp library.

    3. by referring to the PROC SQL view of the table that is stored in the Sasuser library.

    4. a and b

    Correct answer: d

    Dictionary tables can be accessed by running a PROC SQL query against the table, using the Dictionary libref. Though SAS librefs are usually limited to eight characters, Dictionary is an automatically assigned, reserved word. You can also access a Dictionary table by referring to the PROC SQL view of the table that is stored in the Sashelp library.

  9. Which of the following PROC SQL steps displays information about the Dictionary table Dictionary.Titles?

    1. proc sql;
         describe dictionary.titles;
    2. proc sql;
         describe table dictionary.titles;
    3. proc sql describe table dictionary.titles;
    4. proc sql describe dictionary titles;

    Correct answer: b

    To see how a Dictionary table is defined, submit a DESCRIBE TABLE statement. The DESCRIBE TABLE statement writes a CREATE TABLE statement to the SAS log for the table specified in the DESCRIBE TABLE statement.

  10. Which of the following PROC SQL steps displays the name (Memname), modification date (Modate), number of variables (Nvar), and the number of observations (Nobs) for each table in the Sasuser library?

    1. proc sql;
         select memname, modate, nvar, nobs
            from dictionary.tables
            where libname='SASUSER';
    2. proc sql;
         select memname, modate, nvar, nobs
            from dictionary.tables
            where libname='Sasuser';
    3. proc sql;
         select memname, modate, nvar, nobs
            from 'SASUSER'
            where table=dictionary.tables;
    4. proc sql;
         select SASUSER
            from dictionary.tables
            where cols= 'memname, modate, nvar, nobs';

    Correct answer: a

    To display information about the files in a specific library, specify the column names in a SELECT statement and the Dictionary table name in the FROM clause. The library name in the WHERE clause must be specified in uppercase letters because that is how it is stored in SAS and it must be enclosed in quotation marks.

Chapter 9: Introducing Macro Variables

  1. Which of the following statements is false?

    1. A macro variable can be defined and referenced anywhere in a SAS program except within data lines.

    2. Macro variables are always user-defined, and their values remain constant until they are changed by the user.

    3. Macro variables are text strings that are independent of SAS data sets.

    4. The values of macro variables can be up to 65,534 characters long.

    Correct answer: b

    Macro variables are always text strings that are independent of SAS data sets. The value of a macro variable can be up to 65,534 characters long, and the name of a macro variable can be up to 32 characters long. A macro variable can be defined or referenced anywhere in a SAS program except within data lines. There are two types of macro variables: automatic and user-defined.

  2. Which of the following statements will generate an error message while trying to display the value of the macro variable month in the SAS log?

    1. title "Total Sales for '&month' ";
    2. title "Total Sales for 'month'";
    3. title "Total Sales for &month";
    4. title Total Sales for "&month";

    Correct answer: c

    To reference a macro variable, you precede the name with an ampersand. You do not need to enclose the macro variable reference in quotation marks.

  3. Which of the following statements will generate an error message while trying to display the value of the macro variable month in the SAS log?

    1. options &month;
    2. %PUT &month;
    3. options symbolgen;
    4. %PUT the macro variable MONTH has the value &month.;

    Correct answer: a

    There are two ways to display the value of a macro variable in the SAS log: you can turn on the SYMBOLGEN system option to list the values of all macro variables that are used, or you can use the %PUT statement to write specific text, including macro variable values, to the log.

  4. Which statement will create a macro variable named location that has the value storage?

    1. &let location = storage;
    2. let &location = storage;
    3. %let location = "storage";
    4. %let location = storage;

    Correct answer: d

    You use the %LET statement to define a macro variable. You do not need to enclose the value in quotation marks. If you do include quotation marks in the assigned value for a macro variable, the quotation marks will be stored as part of the value.

  5. What value will these statements assign to the macro variable reptitle:

    %let area = "Southeast";
    %let reptitle = *   Sales Report for &area Area   *;
    1. Sales Report for Southeast Area

    2. Sales Report for "Southeast" Area

    3. *Sales Report for "Southeast" Area*

    4. * Sales Report for "Southeast" Area *

    Correct answer: d

    Macro variables are stored as character strings. Quotation marks and most special characters are stored exactly as they are assigned, but leading blanks are stripped from assigned values. You can also include references to other macro variables within %LET statements.

  6. Assuming that you began your SAS session today, which of the following statements correctly sets the macro variable currdate to today's date:

    1. %let currdate = %sysfunc(today(), worddate.);
    2. %let currdate = &sysdate9;
    3. %let currdate = %sysfunc(date());
    4. all of the above

    Correct answer: d

    SYSDATE9 is an automatic macro variable that stores the date that your SAS session began in ddmmmyyyy format. You can use the %SYSFUNC function along with any DATA step function, so both the TODAY() function and the DATE() function will result in the current date.

  7. Macro character functions

    1. can be used to manipulate character strings in macro variable values.

    2. have the same basic syntax as the corresponding DATA step functions and yield similar results.

    3. all of the above

    4. none of the above

    Correct answer: c

    Macro character functions such as %UPCASE and %SUBSTR enable you to perform character manipulations on your macro variable values.

  8. The four types of tokens that SAS recognizes are

    1. expressions, literals, names, and special characters.

    2. literals, names, numbers, and special characters.

    3. expressions, names, numbers, and special characters.

    4. expressions, literals, numbers, and special characters.

    Correct answer: b

    The word scanner recognizes four types of tokens. Expressions are not a type of token.

  9. What are the resulting values for the macro variables that are defined here?

    %let month1 = June;
    %let month2 = July;
    %let period1 = &month1&month2;
    %let period2 = May&month1;
    %let period3 = &month2.Aug;
    1. month1 June

      month2 July

      period1 June July

      period2 May June

      period3 July Aug

    2. month1 June

      month2 July

      period1 JuneJuly

      period2 MayJune

      period3 July.Aug

    3. month1 June

      month2 July

      period1 JuneJuly

      period2 MayJune

      period3 JulyAug

    4. month1 June

      month2 July

      period1 junejuly

      period2 Mayjune

      period3 julyaug

    Correct answer: c

    You can combine macro variable references with text to create new text strings. If you precede a macro variable with text, the ampersand at the beginning of the macro variable name signals the end of the text and the beginning of a macro variable name. If you want text to follow the macro variable value, you must signal the end of the macro variable name with a period.

  10. Which of the following correctly produces a title in which the current date is left justified in order to remove extra blanks?

    1. title "Report for %sysfunc(left(%sysfunc(today(),worddate.)))";
    2. title "Report for %sysfunc(left(today(), worddate.))";
    3. title "Report for %sysfunc(left(%qsysfunc(today(), worddate.)))";
    4. title "Report for %left(today(), worddate.))";

    Correct answer: c

    You use the %QSYSFUNC function in this case, in order to mask the comma that results from the worddate. format. You must mask this comma since the LEFT() function expects only one argument.

Chapter 10: Processing Macro Variables at Execution Time

  1. Which of the following is false?

    1. A %LET statement causes the macro processor to create a macro variable before the program is compiled.

    2. To create a macro variable that is based on data calculated by the DATA step, you use the SYMPUT function.

    3. Macro functions are always processed during the execution of the DATA step.

    4. Macro variable references in a DATA step are always resolved prior to DATA step execution.

    Correct answer: c

    Most macro functions are handled by the macro processor before any SAS language statements in the DATA step are executed. For example, the %LET statement and any macro variable references (&macvar) are passed to the macro processor before the program is compiled. In order to create or update macro variables during DATA step execution, you use the SYMPUT routine.

  2. Which of the following correctly creates a macro variable named region and assigns to it a value that is based on the value of the data set variable Location?

    1. data new;
         set sasuser.all;
         if location='Boston' then do;
            call symput('region', 'East');
         end;
         else do;
            call symput('region', 'West');
         end;
      run;
    2. data new;
         set sasuser.all;
         if location='Boston' then do;
            %let region=East;
         end;
         else
            %let region=West;
         end;
      run;
    3. data new;
         set sasuser.all;
         if location='Boston' then do;
            call symput(region, "East");
         end;
         else
            call symput(region, "West");
         end;
      run;
    4. data new;
         set sasuser.all;
         if location='Boston' then do;
            symput(region, East);
         end;
         else
            symput(region, West);
         end;
      run;

    Correct answer: a

    To create a macro variable and assign to it a value that is based on the value of a DATA step variable, you use the SYMPUT routine. In the SYMPUT routine, to assign a literal string as a macro variable name, you enclose the literal in quotation marks. To assign a literal string as a value of the macro variable, you enclose the literal in quotation marks.

  3. The SYMPUT routine cannot

    1. be used to assign a data set variable as a value to a macro variable.

    2. create a series of macro variables in one DATA step.

    3. automatically convert a numeric value to a character value when used to assign a value to a macro variable in a DATA step.

    4. be used to assign a numeric value to a macro variable in an SCL program.

    Correct answer: d

    The SYMPUT routine enables you to assign a data set variable as the value of a macro variable. You can also use the SYMPUT routine to create a series of related macro variables. Because all macro variable values are character strings, SYMPUT automatically converts any numeric value that you attempt to assign as a value for a macro variable. In an SCL program, you must use SYMPUTN rather than SYMPUT if you are attempting to assign a numeric value to a macro variable.

  4. Which of the following programs correctly creates a series of macro variables whose names are values of the data set variable Course_code, then indirectly references one of those macro variables in a later step?

    1. data _null_;
         set sasuser.courses;
         call symput(course_code, trim(course_title));
      %let crsid=C005;
      proc print data=sasuser.schedule noobs label;
         where course_code="&crsid";
         var location begin_date teacher;
         title1 "Schedule for &c005";
      run;
    2. data _null_;
         set sasuser.courses;
         call symput(course_code, trim(course_title));
      run;
      %let crsid=C005;
      proc print data=sasuser.schedule noobs label;
         where course_code="&crsid";
         var location begin_date teacher;
         title1 "Schedule for &&&crsid";
      run;
    3. data _null_;
         set sasuser.courses;
         call symput('course_code', trim(course_title));
      run;
      %let crsid=C005;
      proc print data=sasuser.schedule noobs label;
         where course_code="&crsid";
         var location begin_date teacher;
         title1 "Schedule for &&&crsid";
      run;
    4. data _null_;
         set sasuser.courses;
         call symget(course_code, trim(course_title));
      run;
      
      
      %let crsid=C005;
      proc print data=sasuser.schedule noobs label;
         where course_code="&crsid";
         var location begin_date teacher;
         title1 "Schedule for &&&crsid";
      run;

    Correct answer: b

    You can use multiple ampersands to create an indirect reference when the value of one macro variable is the name of another. If you enclose the DATA step variable name in quotation marks in the SYMPUT routine, the new macro variable will have the same name as the DATA step variable rather than having the DATA step variable's value as a name. Use the SYMGET function to obtain the value of a macro variable during the execution of a DATA step.

  5. Which of the following statements about the resolution of macro variable references is false?

    1. Two ampersands resolve to one ampersand.

    2. If more than four consecutive ampersands precede a name token, the macro processor generates an error message.

    3. Re-scanning continues until there are no remaining macro triggers that the macro processor can resolve.

    4. The macro processor always re-scans a name token that is preceded by multiple ampersands or by multiple percent signs.

    Correct answer: b

    If more than four consecutive ampersands precede a name token, rescanning continues from left to right until no more triggers can be resolved. The Forward Re-scan rule describes how the macro processor resolves macro variable references that start with multiple ampersands or with multiple percent signs.

  6. In which of the following situations would you use SYMGET rather than a macro variable reference (&macvar)?

    1. to create a DATA step variable from a macro variable value during the execution of the DATA step

    2. to include a macro variable reference in a PROC SQL view

    3. to access the value of a macro variable during the execution of an SCL program

    4. all of the above

    Correct answer: d

    A macro variable reference (&macvar) is resolved before any SAS language statements are sent to the compiler. The SYMGET function enables you to obtain the value of a macro variable during the execution of a DATA step or a PROC SQL step. The SYMGET function can also be used to obtain the value of a macro variable during the execution of an SCL program.

  7. Which of the following correctly creates a macro variable in a PROC SQL step?

    1. call symput(daily_fee, put(fee/days, dollar8.);
    2. %let daily_fee=put(fee/days, dollar8.)
    3. select fee/days format=dollar8.
      into :daily_fee from sasuser.all;
    4. select fee/days format=dollar8.
      into daily_fee from sasuser.all;

    Correct answer: c

    To create a macro variable during the execution of a PROC SQL step, use the INTO clause of the SELECT statement. In the INTO clause, you precede the name of the macro variable with a colon.

  8. According to the global symbol table shown here, what value will a reference to &&teach&crs resolve to?

    Chapter 10: Processing Macro Variables at Execution Time
    1. &TEACH3

    2. TEACH3

    3. Forest, Mr. Peter

    4. none of the above

    Correct answer: c

    You can use multiple ampersands to delay the resolution of a macro variable reference. You can also combine macro variable references in order to create new tokens. In this example, the reference &&teach&crs resolves to &teach3 on the first scan. On the next scan, &teach3 resolves to Forest, Mr. Peter.

  9. Which of the following statements correctly creates a DATA step variable named Price and assigns to it the value of the macro variable daily_fee during DATA step execution?

    1. price=&daily_fee;
    2. price=symget(daily_fee);
    3. price=symget(&daily_fee);
    4. price=symget("daily_fee");

    Correct answer: d

    You can use the SYMGET function in an assignment statement to obtain the current value of a macro variable and to assign that value to a DATA step variable. The SYMGET function enables you to obtain the value of a macro variable during execution of a DATA step, a PROC SQL step, or an SCL program.

  10. Which of the following is false?

    1. The SYMPUT routine can be used to create a macro variable during execution of the DATA step or during execution of an SCL program.

    2. In the DATA step, the SYMPUT routine automatically converts to a character value any numeric value that you attempt to assign as the value of a macro variable.

    3. PROC SQL automatically converts to a numeric value any macro variable value that you attempt to compare to a numeric value.

    4. In an SCL program, the SYMPUTN routine can be used to assign a numeric value to a macro variable.

    Correct answer: c

    The SYMPUT routine can be used in either the DATA step or in an SCL program. In the DATA step, the SYMPUT routine will perform automatic conversion on numeric values that you attempt to assign as values for macro variables, using the BEST12. format. In an SCL program, you should use the SYMPUTN routine if you want to assign a numeric value as a value for a macro variable. In a PROC SQL step, you need to use the INPUT function in order to convert macro variable values to numeric before you compare them to other numeric values.

Chapter 11: Creating and Using Macro Programs

  1. Which of the following is false?

    1. A %MACRO statement must always be paired with a %MEND statement.

    2. A macro definition can include macro variable references, but it cannot include SAS language statements.

    3. Only macro language statements are checked for syntax errors when the macro is compiled.

    4. Compiled macros are stored in a temporary SAS catalog by default.

    Correct answer: b

    A macro definition must begin with a %MACRO statement and must end with a %MEND statement. The macro definition can include macro language statements as well as SAS language statements. When the macro is compiled, macro language statements are checked for syntax errors. The compiled macro is stored in a temporary SAS catalog by default.

  2. Which of the following examples correctly defines a macro named Print that includes parameters named vars and total?

    1. %macro print(vars, total);
         proc print data=classes;
            var vars;
            sum total;
         run;
      %mend print;
    2. %macro print('vars', 'total');
         proc print data=classes;
            var &vars;
            sum &total;
         run;
      %mend print;
    3. %macro print(vars, total);
         proc print data=classes;
            var &vars;
            sum &total;
         run;
      %mend print;
    4. %macro print(vars, total);
         proc print data=classes;
            var :vars;
            sum :total;
         run;
      %mend print;

    Correct answer: c

    To include positional parameters in a macro definition, you list the parameters in parentheses and separate them with commas. When the macro is executed, macro variables will be created in the local symbol table and will have the same names as the parameters. You can then use these macro variables within the macro.

  3. Which of the following correctly references the macro named Printdsn as shown here:

    %macro printdsn(dsn,vars);
       %if &vars= %then %do;
          proc print data=&dsn;
          title "Full Listing of %upcase(&dsn) data set";
          run;
       %end;
       %else %do;
          proc print data=&dsn;
             var &vars;
          title "Listing of %upcase(&dsn) data set";
          run;
       %end;
    %mend;
    1. %printdsn(sasuser.courses, course_title days);
    2. %printdsn(dsn=sasuser.courses, vars=course_title days)
    3. %printdsn(sasuser.courses, course_title days)
    4. %printdsn(sasuser.courses, course_title, days)

    Correct answer: c

    To call a macro that includes positional parameters, you precede the macro name with a percent sign. You list the values for the macro variables that are defined by the parameters in parentheses. List values in the same order in which the parameters are listed, and separate them with commas. Remember that a macro call is not a SAS language statement and does not require a semicolon.

  4. If you use a mixed parameter list in your macro program definition, which of the following is false?

    1. You must list positional parameters before any keyword parameters.

    2. Values for both positional and keyword parameters are stored in a local symbol table.

    3. Default values for keyword parameters are the values that are assigned in the macro definition, whereas positional parameters have a default value of null.

    4. You can assign a null value to a keyword parameter in a call to the macro by omitting the parameter from the call.

    Correct answer: d

    In a mixed parameter list, positional parameters must be listed before any keyword parameters. Both positional and keyword parameters create macro variables in the local symbol table. To assign a null value to a keyword parameter, you list the parameter without a value in the macro call.

  5. Which of the following is false?

    1. A macro program is compiled when you submit the macro definition.

    2. A macro program is executed when you call it (%macro-name).

    3. A macro program is stored in a SAS catalog entry only after it is executed.

    4. A macro program is available for execution throughout the SAS session in which it is compiled.

    Correct answer: c

    When you submit a macro definition, the macro is compiled and is stored in a SAS catalog. Then when you call the macro, the macro is executed. The macro is available for execution anytime throughout the current SAS session.

  6. When you use an %IF-%THEN statement in your macro program,

    1. you must place %DO and %END statements around code that describes the conditional action, if that code contains multiple statements.

    2. the %ELSE statement is optional.

    3. you cannot refer to DATA step variables in the logical expression of the %IF statement.

    4. all of the above.

    Correct answer: d

    You can use %IF-%THEN statements to conditionally process code. Within a %IF-%THEN statement, you must use %DO and %END statements to enclose multiple statements. %IF-%THEN statements are similar to IF THEN statements in the DATA step, but they are part of the macro language.

  7. Which of the following can be placed onto the input stack?

    1. only whole steps.

    2. only whole steps or whole statements.

    3. only whole statements or pieces of text within a statement.

    4. whole steps, whole statements, or pieces of text within statements.

    Correct answer: d

    By using %IF-%THEN statements, you can place whole steps, individual statements, or parts of statements onto the input stack.

  8. Which of the following will create a macro variable named class in a local symbol table?

    1. data _null_;
         set sasuser.courses;
         %let class=course_title;
      run;
    2. data _null_;
         set sasuser.courses;
         call symput('class', course_title);
      run;
    3. %macro sample(dsn);
         %local class;
         %let class=course_title;
         data_null_;
            set &dsn;
         run;
      %mend;
    4. %global class;
      %macro sample(dsn);
         %let class=course_title;
         data _null_;
            set &dsn;
         run;
      %mend;

    Correct answer: c

    There are several ways to create macro variables in the local symbol table. Macro variables that are created by parameters in a macro definition or by a %LOCAL statement are always created in the local table. Macro variables that are created by a %LET statement or by the SYMPUT routine inside a macro definition might be created in the local table as well.

  9. Which of the following examples correctly defines the macro program Hex?

    1. %macro hex(start=1, stop=10, incr=1);
         %local i;
         data _null_;
         %do i=&start to &stop by &incr;
            value=&i;
            put "Hexadecimal form of &i is " value hex6.;
         %end;
         run;
      %mend hex;
    2. %macro hex(start=1, stop=10, incr=1);
         %local i;
         data _null_;
         %do i=&start %to &stop %by &incr;
            value=&i;
            put "Hexadecimal form of &i is " value hex6.;
         %end;
         run;
      %mend hex;
    3. %macro hex(start=1, stop=10, incr=1);
         %local i;
         data _null_;
         %do i=&start to &stop by &incr;
            value=&i;
            put "Hexadecimal form of &i is " value hex6.;
         run;
      %mend hex;
    4. %macro hex(start=1, stop=10, incr=1);
         %local i;
         data _null_;
         %do i=&start to &stop by &incr;
            value=&i;
            put "Hexadeciaml form of &i is " value hex6.;
         %end
         run;
      %mend hex;

    Correct answer: b

    To define macros with %DO loops you use a %DO statement and a %END statement. Be sure to precede all keywords in the statements with percent signs since the %DO and %END statements are macro language statements. Also, be sure to end these statements with semicolons.

  10. When you submit a call to a compiled macro, what happens?

    1. First, the macro processor checks all macro programming statements in the macro for syntax errors.

      Then the macro processor executes all statements in the macro.

    2. The macro processor executes compiled macro programming statements.

      Then any SAS programming language statements are executed by the macro processor.

    3. First, all compiled macro programming statements are executed by the macro processor.

      After all macro statements have been processed, any SAS language statements are passed back to the input stack in order to be passed to the compiler and then executed.

    4. The macro processor executes compiled macro statements.

      If any SAS language statements are encountered, they are passed back to the input stack.

      The macro processor pauses while those statements are passed to the compiler and then executed.

      Then the macro processor continues to repeat these steps until it reaches the %MEND statement.

    Correct answer: d

    When you submit a call to a compiled macro, the macro is executed. Specifically, the macro processor executes compiled macro language statements first. When any SAS language statements are encountered, the macro processor places these statements onto the input stack and pauses while they are passed to the compiler and then executed. Then the macro processor continues to repeat these steps until the %MEND statement is reached.

Chapter 12: Storing Macro Programs

  1. The %INCLUDE statement

    1. can be used to insert the contents of an external file into a program.

    2. will cause a macro definition that is stored in an external file to be compiled when the contents of that file are inserted into a program and submitted.

    3. can be specified with the SOURCE2 option in order to write the contents of the external file that is inserted into a program to the SAS log.

    4. all of the above

    Correct answer: d

    The %INCLUDE statement can be used to insert the contents of an external file into a SAS program. If a macro definition is stored in an external file, the %INCLUDE statement causes the macro definition to be compiled when it is inserted into the SAS program. The contents of the macro definition will be written to the SAS log only if the SOURCE2 option is specified.

  2. If you store a macro definition in a SAS catalog SOURCE entry

    1. the macro definition can be submitted for compilation by using the FILENAME and %INCLUDE statements.

    2. you can use the PROC CATALOG statement to compile the macro.

    3. the SOURCE entry will be deleted at the end of the session.

    4. you do not need to compile the macro before you invoke it in a program.

    Correct answer: a

    When a macro definition is stored as a catalog SOURCE entry, you must compile it before you can call it from a SAS program. You compile a macro that is stored as a catalog SOURCE entry by using the CATALOG access method. This creates a session-compiled macro that will be deleted at the end of the SAS session. The PROC CATALOG statement enables you to view a list of the contents of a SAS catalog.

  3. Which of the following programs correctly sets the appropriate system options and calls the macro Prtlast? Assume that Prtlast is stored in an autocall library as a text file and that it has not been compiled during the current SAS session.

    1. libname mylib 'c:mylib';
      filename macsrc 'mylib.macsrc';
      options mautosource sasautos=(macsrc, sasautos);
      %prtlast
    2. libname mylib 'c:mylib';
      filename macsrc catalog 'mylib.macsrc';
      %prtlast
    3. filename mylib 'c:mylib';
      options mautosource sasautos=(sasautos,mylib);
      %prtlast
    4. libname mylib 'c:mylib';
      options mautosource sasautos=mylib;
      %prtlast

    Correct answer: c

    To call a macro that is stored in an autocall library, you must specify both the MAUTOSOURCE system options and the SASAUTOS system option. The SASAUTOS system option can be set to include multiple pathnames or filerefs. Once these two system options are set, you can call the macro by preceding the macro name with a percent sign.

  4. If you use the Stored Compiled Macro Facility,

    1. the macro processor does not compile a macro every time it is used.

    2. the only compiled macros that the Stored Compiled Macro Facility can access are those that are stored in the Sasmacr catalog.

    3. you need to specify the MSTORED and SASMSTORE system options.

    4. all of the above

    Correct answer: d

    The Stored Compiled Macro Facility enables you to store compiled macros permanently so that you can reuse them in later SAS sessions without compiling them again. Compiled macros must be stored in a catalog named Sasmacr, and both the MSTORED system option and the SASMSTORE system option must be specified.

  5. Which of the following correctly creates a permanently stored compiled macro?

    1. libname macrolib 'c:mylib';
      options sasmstore;
      %macro prtlast; / store
         proc print data=&syslast (obs=5);
            title "Listing of &syslast data set";
         run;
      %mend;
    2. libname macrolib 'c:mylib';
      options mstored sasmstore=macrolib;
      %macro prtlast / store;
         proc print data=&syslast (obs=5);
            title "Listing of &syslast data set";
         run;
      %mend;
    3. libname macrolib 'c:mylib';
      options mstored sasmstore=macrolib;
      %macro prtlast;
         proc print data=&syslast (obs=5);
            title "Listing of &syslast data set";
         run;
      %mend;
    4. libname macrolib 'c:mylib';
      %macro prtlast / store;
         proc print data=&syslast (obs=5);
            title "Listing of &syslast data set";
         run;
      %mend;

    Correct answer: b

    In order to create a permanently stored compiled macro, you must specify the MSTORED system option. The SASMSTORE system option must be specified to point to the library in which you want your macros to be stored. You must also use the STORE option in the %MACRO statement.

  6. When you submit the following code, what happens?

    %macro prtlast;
       proc print data=&syslast (obs=5);
          title "Listing of &syslast data set";
       run;
    %mend;
    1. A session-compiled macro named Prtlast is stored in Work.Sasmacr.

    2. A macro named Prtlast is stored in the autocall library.

    3. The Prtlast macro is stored as a stored compiled macro.

    4. The Prtlast macro is stored as a SOURCE entry in a permanent SAS catalog.

    Correct answer: a

    When you submit a macro definition, SAS creates a session-compiled macro and stores it in the temporary SAS catalog Work.Sasmacr. This macro will be deleted at the end of the SAS session.

  7. Why would you want to store your macros in external files?

    1. You could easily share your macros with others.

    2. You could edit your macros with any text editor.

    3. Your macros would be available for use in later SAS sessions.

    4. all of the above

    Correct answer: d

    If you store your macro definitions in external files, you can easily share these files with others. Also, you can edit a macro definition that is stored in an external file with any text editor, and you can reuse the macro in other SAS sessions.

  8. What will the following PROC CATALOG step do?

    proc catalog cat=mylib.sasmacr;
       contents;
    quit;
    1. Copy the contents of the Sasmacr catalog to a temporary data set.

    2. List the contents of the Sasmacr catalog as output.

    3. Copy the contents of the output window to the Sasmacr catalog.

    4. none of the above

    Correct answer: b

    The PROC CATALOG step enables you to view a list of the contents of a SAS catalog. This might be especially useful if you store your macro definitions as SOURCE entries in permanent SAS catalogs. You might also use the PROC CATALOG step to see a list of the session-compiled macros that are stored in Work.Sasmacr.

  9. Which of the following is not true about stored compiled macros?

    1. Because these stored macros are compiled, you should save and maintain the source for the macro definitions in a different location.

    2. The Stored Compiled Macro Facility compiles and saves compiled macros in a permanent catalog, in a library that you specify.

    3. You do not need to specify any system options in order to use the Stored Compiled Macro Facility.

    4. You cannot move a stored compiled macro to another operating system.

    Correct answer: c

    In order to use the Stored Compiled Macro Facility, you need to specify the MSTORED and SASMSTORE system options. The Stored Compiled Macro Facility saves the compiled macro in a permanent SAS catalog, but it does not save the macro definition. You cannot move a compiled macro across operating systems. Since you cannot re-create the macro definition from a compiled macro, it is a good idea to save your source program permanently as well.

  10. Which of the following is not true?

    1. The autocall macro facility stores compiled SAS macros in a collection of external files called an autocall library.

    2. Autocall libraries can be concatenated together.

    3. One disadvantage of the autocall facility is that the first time you call an autocall macro in a SAS session, the macro processor must use system resources to compile it.

    4. The autocall facility can be used in conjunction with the Stored Compiled Macro Facility.

    Correct answer: a

    The autocall macro facility stores macro definitions — not compiled macros — permanently. The first time an autocall macro is called during a SAS session, the macro is compiled and a session-compiled macro is created in Work.Sasmacr.You can have multiple autocall libraries that are concatenated, and you can use the autocall facility in conjunction with the Stored Compiled Macro Facility.

Chapter 13: Creating Samples and Indexes

  1. The variable that is created by the POINT= option is assigned a value

    1. automatically during compilation of the DATA step.

    2. automatically during execution of the DATA step.

    3. during compilation of the DATA step, by program statements.

    4. during execution of the DATA step, by program statements.

    Correct answer: d

    The POINT= option in the SET statement names a variable. You must use program statements to assign a value to this variable during execution of the DATA step, before execution of the SET statement. Also, the value of the POINT= variable should be a number that corresponds to an observation number in the input data set, and it should be different each time the SET statement executes.

  2. Which of the following programs correctly creates a systematic sample from a data set with an unknown number of observations and outputs these sample observations to a data set named Sample?

    1. data sample;
         set sasuser.sale2000 point=thisone nobs=totnum;
         output;
         stop;
      run;
    2. data sample;
         do thisone=100 to totnum by 100;
            set sasuser.sale2000 point=thisone nobs=totnum;
            output;
         end;
         stop;
      run;
    3. data sample;
         do thisone=100 to 1000 by 100;
            set sasuser.sale2000 point=thisone;
            output;
         end;
         stop;
      run;
    4. data sample;
         do thisone=100 to totnum by 100;
            set sasuser.sale2000 point=thisone nobs=totnum;
         end;
      run;

    Correct answer: b

    To create a systematic sample from a data set that has an unknown number of observations, you use the NOBS= option in conjunction with the POINT= option in the SET statement. The NOBS= variable is automatically assigned a value of the total number of observations in the input data set, and you must assign a value to the POINT= variable before the SET statement executes.

  3. Which of the following expressions will generate a random integer between 1 and 50?

    1. ceil(ranuni(50))
    2. ranuni(50)
    3. ceil(ranuni(0)*50)
    4. ceil(ranuni(0))*50

    Correct answer: c

    In order to create a random sample of a data set, you need to generate a random integer. You can use the RANUNI function in conjunction with the CEIL function to create a random integer. You can use a multiplier with the RANUNI function to increase the range from which the random number is chosen to include as many numbers as you need.

  4. An index

    1. is an optional file that is associated with a data set.

    2. provides direct access to specific observations of a data set, based on the value of one or more key variables.

    3. can be classified as simple or composite, either of which can consist of unique values.

    4. all of the above

    Correct answer: d

    An index is a separate file from a data set that contains information about observations within the data set. Specifically, an index contains value/identifier pairs that indicate the location of observations within the data set and the value of one or more key variables in that observation.

  5. Which of the following correctly creates a data set named Flights from the Sasuser.Revenue data set, creates a composite index named Fromto that is based on the values of Origin and Dest, and prints informational messages about the index to the SAS log?

    1. options msglevel=i;
      data flights index=(Fromto=origin dest);
         set sasuser.revenue;
      run;
    2. options msglevel=n;
      data flights (index=(Fromto=origin dest));
         set sasuser.revenue;
      run;
    3. options msglevel=i;
      data flights (index=(Fromto=(origin dest)));
         set sasuser.revenue;
      run;
    4. options msglevel=n;
      data flights (index=Fromto);
         set sasuser.revenue;
      run;

    Correct answer: c

    To create an index at the same time that you create a data set, you use the INDEX= option in the DATA statement. You must assign a unique name to a composite index, while a simple index is automatically assigned the name of the key variable as its name. You can set the value of the MSGLEVEL= system option to I in order to see messages about indexes in the SAS log.

  6. Which of the following is true?

    1. When you add observations to a data set, the index(es) are automatically updated with additional value/identifier pairs.

    2. When you rename a variable that is used as the key variable in a simple index, you must re-create the index.

    3. When you delete a data set, the index file remains until you delete it as well.

    4. When you copy a data set with the COPY statement, you must also copy the index file in another step.

    Correct answer: a

    For many maintenance tasks that you perform on a data set, SAS automatically performs corresponding tasks to the index file. For example, if you delete a data set, the index file is deleted as well. If you rename a data set with the CHANGE statement in the DATASETS procedure, SAS automatically renames the index file. If you copy a data set to a new location with the COPY statement in the DATASETS procedure, SAS automatically reconstructs the index file in the new location.

  7. To create an index on an existing data set, you use

    1. PROC DATASETS.

    2. PROC SQL.

    3. the DATA step with the INDEX= option, to rebuild the data set.

    4. any of the above

    Correct answer: d

    You can use the DATASETS procedure or the SQL procedure to create an index on or delete an index from an existing data set. You can also rebuild the index with a DATA step and use the INDEX= option to create an index on the rebuilt data set. However, rebuilding a data set uses more system resources than adding an index to an existing data set with either the DATASETS or the SQL procedure.

  8. Which of the following correctly creates a simple index named Origin on the Revenue data set?

    1. proc sql;
         create index origin on revenue(origin);
      quit;
    2. proc sql;
         modify revenue;
         index=origin;
      quit;
    3. proc sql data=revenue;
         create index origin;
      quit;
    4. proc sql;
         index=origin on revenue;
      quit;

    Correct answer: a

    You use the CREATE INDEX statement of the SQL procedure to create an index on an existing data set. In the SQL procedure, you must name the index in the CREATE INDEX statement; for a simple index, the index name must match the name of the key variable.

  9. To view a list of the indexes that are associated with a data set, you use

    1. PROC COPY or the COPY statement in PROC DATASETS.

    2. PROC CONTENTS or the CONTENTS statement in PROC DATASETS.

    3. the MSGLEVEL= system option and a PROC PRINT step.

    4. any of the above

    Correct answer: b

    You can use either the CONTENTS procedure or the CONTENTS statement in the DATASETS procedure to generate a list of information about a data set, including a list of existing indexes. All indexes for a data set are stored in a single file that is separate from but has the same name as the data set.

  10. Suppose that the Sasuser.Revenue data set has a simple index named FlightID. For which of the following programs will the index be used?

    1. proc print data=sasuser.revenue;
         where flightid ne 'IA11200';
      run;
    2. data someflights;
         set sasuser.revenue;
         where flightid > 'IA11200';
      run;
    3. data someflights;
         set sasuser.revenue;
         if flightid > 'IA11200';
      run;
    4. proc print data=sasuser.revenue;
         where origin='RDU' or flightid='IA03400';
      run;

    Correct answer: b

    An index can improve the efficiency with which SAS is able to access certain observations in a data set. However, an index is not always useful. SAS will not use an index to process subsetting IF statements, or other statements that SAS determines might be more efficiently processed without an index.

Chapter 14: Combining Data Vertically

  1. Which of the following statements associates the fileref OnSale with the raw data files London.dat, Paris.dat, and Zurich.dat? The files are stored in the C:RoutesNew directory in the Windows operating environment.

    1. filename onsale (c:
      outes
      ewlondon.dat,
         c:
      outes
      ewparis.dat,
         c:
      outes
      ewzurich.dat);
    2. filename onsale 'c:
      outes
      ewlondon.dat'
         'c:
      outes
      ewparis.dat'
         'c:
      outes
      ewzurich.dat';
    3. filename onsale ('c:
      outes
      ewlondon.dat'
         'c:
      outes
      ewparis.dat'
         'c:
      outes
      ewzurich.dat');
    4. filename onsale 'c:
      outes
      ewlondon.dat
         c:
      outes
      ewparis.dat
         c:
      outes
      ewzurich.dat';

    Correct answer: c

    When a FILENAME statement is used to assign a fileref to multiple raw data files, the list of files must be enclosed in a single set of parentheses. Each filename specified must be enclosed in quotation marks.

  2. Which of the following statements is true?

    1. The FILEVAR= option can be used to dynamically change the currently opened input file to a new physical file.

    2. The FILEVAR= variable is not written to the data set.

    3. The FILEVAR= variable must contain a character string that is a physical filename.

    4. all of the above

    Correct answer: d

    The FILEVAR= option enables you to dynamically change the currently opened input file to a new input file. The FILEVAR= variable must contain a character string that is a physical filename. Like automatic variables, the FILEVAR= variable is not written to the data set.

  3. Given the following program, which table correctly shows the corresponding values of the variables x and readfile?

    data work.revenue;
       do x = 8, 9, 10;
          readfile=compress("c:datamonth"
            !!put(x,2.)!!".dat",' ');
          do until (lastobs);
             infile temp filevar=nextfile
               end=lastobs;
             input Date : date7. Location $
                   Sales : dollar10.2;
             output;
          end;
       end;
       stop;
    run;
    1. Chapter 14: Combining Data Vertically
    2. Chapter 14: Combining Data Vertically
    3. Chapter 14: Combining Data Vertically
    4. Chapter 14: Combining Data Vertically

    Correct answer: b

    The DO statement creates the index variable x and assigns it the values of 8, 9, and 10. The assignment statement assigns the name of a raw data file to readfile using the current value of x and the PUT function, which concatenates the values of x with the text strings c:datamonth and .dat. The COMPRESS function removes blank spaces from the values of readfile.

  4. If the current date is March 30, 2003, which table correctly shows the corresponding values of the variables y1, y2, y3, and nextfile?

    data work.quarter (drop=monthnum midmon lastmon);
       y3=year(today());
       y2=y3-1;
       y1=y3-2;
       do i = y3, y2, y1;
          nextfile="c:dataY"!!put(i,4.)!!".dat";
          do until (lastobs);
             infile temp filevar=nextfile
               end=lastobs;
             input Flight $ Origin $ Dest $
                   Date : date9.;
             output;
          end;
       end;
       stop;
    run;
    1. Chapter 14: Combining Data Vertically
    2. Chapter 14: Combining Data Vertically
    3. Chapter 14: Combining Data Vertically
    4. Chapter 14: Combining Data Vertically

    Correct answer: a

    The TODAY function returns the current date from the system clock as a SAS date value. The year number is then extracted from the current date using the YEAR function. The value of the current year, 2003, is assigned to y3. The year values 2002 and 2001 are assigned to y2 and y1, respectively. The PUT function concatenates the text string c:dataY with the year values and the text string .dat.

  5. Which of the following statements is false?

    1. The END= variable is set to 0 when SAS processes the last data record in the input file.

    2. The END= variable is set to 1 when SAS processes the last data record in the input file.

    3. The END= variable is not written to the data set.

    4. a and c

    Correct answer: a

    The END= option enables you to name a variable whose value is controlled by SAS. The value of the variable is 0 when you are not reading the last record in an input file and 1 when you are reading the last record in an input file. You can test the value of the END= variable to determine if the DATA step should continue processing. Like automatic variables, the END= variable is not written to the SAS data set.

  6. Which program appends Work.London to Work.Flights?

    Chapter 14: Combining Data Vertically
    1. proc append base=work.london
                  data=work.flights;
      run;
    2. proc append data=work.london
                  base=work.flights;
      run;
    3. proc append data=work.london work.flights;
      run;
    4. proc append data=work.flights work.london;
      run;

    Correct answer: b

    PROC APPEND uses the BASE= and DATA= arguments. BASE=SAS-data-set names the data set to which you want to add observations. DATA=SAS-data-set names the SAS data set containing observations that you want to append to the end of the SAS data set specified in the BASE= argument.

  7. What happens when the following program is submitted?

    proc append base=staff.marketing
                data=staff.sales force;
    run;
    Chapter 14: Combining Data Vertically
    1. The length of LastName is converted to 20 in Staff.Marketing.

    2. LastName is dropped from Staff.Marketing.

    3. Missing values are assigned to LastName observations that are read in from Staff.Sales.

    4. Some of the values of LastName may be truncated in the observations that are read in from Staff.Sales.

    Correct answer: d

    If a DATA= data set contains variables that are longer than the corresponding variables in the BASE= data set, the FORCE option must be used with PROC APPEND. Using the FORCE option enables you to append the data sets. However, some of the variable values may be truncated in the observations that are read in from the DATA= data set.

  8. Which program appends Work.April to Work.Y2003?

    Chapter 14: Combining Data Vertically
    1. proc append base=work.y2003
                  data=work.april;
      run;
    2. proc append base=work.april
                  data=work.y2003 force;
      run;
    3. proc append base=work.y2003
                  data=work.april force;
      run;
    4. proc append base=work.april
                  data=work.y2003;
      run;

    Correct answer: c

    You must use the FORCE option with PROC APPEND when the DATA= data set contains a variable that does not have the same type as the corresponding variable in the BASE= data set.

  9. What happens when the SAS data set Work.NewHires is appended to the SAS data set Work.Employees using PROC APPEND?

    Chapter 14: Combining Data Vertically
    1. Missing values are assigned to Room for the observations that are read in from Work.NewHires.

    2. Missing values are assigned to Room for all of the observations in Work.Employees.

    3. Room is dropped from Work.Employees.

    4. The values of Name are truncated in the observations that are read in from Work.NewHires.

    Correct answer: a

    PROC APPEND reads only the data in the DATA= SAS data set, not the BASE= SAS data set. When the BASE= data set contains more variables than the DATA= data set, missing values for the additional variables are assigned to the observations that are read in from the DATA= data set.

  10. You do not need to use the FORCE option with PROC APPEND when

    1. the DATA= data set contains variables that are not in the BASE= data set.

    2. the BASE= data set contains variables that are not in the DATA= data set.

    3. the variables in the DATA= data set are longer than the corresponding variables in the BASE= data set

    4. the variables in the DATA= data set have a different type than the corresponding variables in the BASE= data set.

    Correct answer: b

    The FORCE option does not need to be used if the BASE= data set contains variables that are not in the DATA= data set. The FORCE option must be used if

    • the DATA= data set contains variables that are not in the BASE= data set

    • the variables in the DATA= data set are longer than the corresponding variables in the BASE= data set

    • the variables in the DATA= data set have a different type than the corresponding variables in the BASE= data set.

Chapter 15: Combining Data Horizontally

  1. According to the data set descriptions below, which of the variables listed qualify as BY variables for a DATA step match-merge?

    Chapter 15: Combining Data Horizontally
    1. Code and IDnum

    2. Manager and Supervisor

    3. Manager and IDnum

    4. There are no variables that are common to both of these data sets.

    Correct answer: c

    Remember that common variables might not have the same names. Manager and IDnum are the only two variables listed that match according to type and description. You can use the RENAME= option to rename one of these variables so that they can be used as BY variables in the MERGE statement of the DATA step.

  2. Suppose you want to merge Dataset1, Dataset2, and Dataset3. Also suppose that Dataset1 and Dataset2 have the common variable Startdate, Dataset2 and Dataset3 have the common variable Instructor, and that these data sets have no other common variables. How can you use a DATA step to merge these three data sets into one new data set?

    1. You use a MERGE statement in one DATA step to merge Dataset1, Dataset2, and Dataset3 by Startdate and Instructor.

    2. You sort Dataset1 and Dataset2 by Startdate and merge them into a temporary data set in a DATA step. Then you sort the temporary data set and Dataset3 by Instructor and merge them into a new data set in a DATA step.

    3. You can merge these data sets only with a PROC SQL step.

    4. You cannot merge these three data sets at all because they do not have a common variable.

    Correct answer: b

    In order to merge multiple data sets in a DATA step, the data sets must have a common variable. However, if there are variables that are common to at least two of the input data sets, and if each input data set contains at least one of these variables, then you can use subsequent DATA steps to merge the data sets. You can also use a PROC SQL step to merge data sets that do not have common variables.

  3. Which of the following programs correctly creates a table with the results of a PROC SQL inner join matched on the values of empcode?

    1. proc sql;
         select newsals.empcode allemps.lastname
                newsals.salary contrib.amount
            from sasuser.allemps, sasuser.contrib,
                 sasuser.newsals
            where empcode=allemps.empid
                  and empcode=contrib.empid;
      quit;
    2. proc sql;
         create table usesql as
            select newsals.empcode allemps.lastname
                   newsals.salsry contrib.amount
               from sasuser.allemps, sasuser.contrib,
                    sasuser.newsals
      quit;
    3. proc sql;
         create table usesql as;
            select newsals.empcode, allemps.lastname,
                   newsals.salary, contrib.amount;
               from sasuser.allemps, sasuser.contrib,
                    sasuser.newsals;
               where empcode=allemps.empid
                     and empcode=contrib.empid;
      quit;
    4. proc sql;
         create table usesql as
            select newsals.empcode, allemps.lastname,
                   newsals.salary, contrib.amount
               from sasuser.allemps, sasuser.contrib,
                    sasuser.newsals
               where empcode=allemps.empid
                     and empcode=contrib.empid;
      quit;

    Correct answer: d

    You can use PROC SQL to join data from data sets that do not have a single common variable among them. If you create a new table with the result of an inner join in a PROC SQL step, the resulting data set can be similar or identical to the result of a DATA step match-merge.

  4. To process a default DATA step match-merge, SAS first reads the descriptor portion of each data set and sets up the PDV and the descriptor portion of the new data set. Which of the following accurately describes the rest of this process?

    1. Next, SAS sequentially match-merges observations and writes the new observation to the PDV, then to the new data set. When the BY value changes in all the input data sets, the PDV is initialized to missing. Missing values for variables, as well as missing values that result from unmatched observations, are written to the new data set.

    2. Next, SAS sequentially match-merges observations and writes the new observation to the PDV, then to the new data set. After each DATA step iteration, the PDV is initialized to missing. Missing values for variables, as well as missing values that result from unmatched observations, are omitted from the new data set.

    3. Next, SAS creates a Cartesian product of all possible combinations of observations and writes them to the PDV, then to the new data set. Then SAS goes through the new data set and eliminates all observations that do not have matching values of the BY variable.

    4. Next, SAS creates a Cartesian product of all possible combinations of observations and writes them to the PDV, then to the new data set. The new data set is then ordered by values of the BY variable.

    Correct answer: a

    In a DATA step match-merge, SAS reads observations from the input data sets sequentially and match-merges them with observations from other input data sets. Combined observations are created when SAS writes values from all input data sets to the variables in the PDV. These observations, as well as any observations that contain missing or nonmatched values, are then written to the new data set. A PROC SQL join creates a Cartesian product of matches and then eliminates nonmatching data.

  5. Which of the following statements is false about using multiple SET statements in one DATA step?

    1. You can use multiple SET statements to combine observations from several SAS data sets.

    2. Processing stops when SAS encounters the end-of-file (EOF) marker on either data set (even if there is more data in the other data set).

    3. You can use multiple SET statements in one DATA step only if the data sets in each SET statement have a common variable.

    4. The variables in the PDV are not reinitialized when a second SET statement is executed.

    Correct answer: c

    You can use multiple SET statements in one DATA step to combine observations from several data sets, and the data sets do not need to have a common variable. When you use multiple SET statements, you need to keep in mind the process that SAS uses to combine data from the input data sets. Otherwise, you might achieve unexpected results.

  6. Select the program that correctly creates a new data set named Sasuser.Summary that contains one observation with summary data created from the Salary variable of the Sasuser.Empdata data set.

    1. proc sum data=sasuser.emdata noprint;
         output out=sasuser.summary sum=Salarysum;
      run;
    2. proc means data=sasuser.empdata noprint;
         var salary;
         output out=sasuser.summary sum=Salarysum;
      run;
    3. proc sum data=sasuser.empdata noprint;
         var salary;
         output out=sasuser.summary sum=Salarysum;
      run;
    4. proc means data=sasuser.empdata noprint;
         output=sasuser.summary sum=Salarysum;
      run;

    Correct answer: b

    You can use the MEANS procedure to create a new data set that contains a summary statistic. You use the NOPRINT option to suppress the default report and the OUTPUT statement to route the results from the MEANS procedure to a new data set. You use the VAR statement to focus the procedure on one or more specific variables from the input data set.

  7. If the value of Cargosum is $1000 at the end of the first iteration of the DATA step shown below, what is the value of Cargosum in the PDV when the DATA step is in its third iteration?

    data sasuser.percent1;
       if _n_=1 then set sasuser.summary (keep=cargosum);
       set sasuser.monthsum (keep=salemon revcargo);
       PctRev=revcargo/cargosum;
    run;
    1. $1000

    2. $3000

    3. The value is missing.

    4. The value cannot be determined without seeing the data that is in Sasuser.Summary.

    Correct answer: a

    The _N_ variable records how many times the DATA step has iterated. In the example shown above, _N_ is used to ensure that only the first observation is read from Sasuser.Summary. Since the values in the PDV are not reinitialized after each DATA step iteration, this value will be retained as long as the DATA step continues to iterate. Therefore, if the value of Cargosum is $1000 in the first iteration, it will be $1000 in each subsequent iteration as well.

  8. According to the data set shown, what is the value of Totalrev in the PDV at the end of the fourth iteration of the DATA step?

    Chapter 15: Combining Data Horizontally
    1. The value is missing.

    2. $350.00

    3. $520.00

    4. $1100.00

    Correct answer: d

    Totalrev is the accumulator variable of the sum statement, which is automatically initialized with a value of 0. If the expression in a sum statement produces a missing value, SAS replaces the missing value with a value of 0. As the DATA step iterates, the sum statement retains the accumulator variable so that it will accumulate a total.

  9. Which of the following programs correctly uses an index to combine data from two input data sets?

    1. data work.profit;
         set sasuser.sale2000(keep=routeid flightid date
             rev1st revbusiness revecon revcargo)
             key=flightdate;
         set sasuser.dnunder;
         Profit=sum(rev1st, revbusiness, revecon, revcargo,
                -expenses);
      run;
    2. data work.profit;
         set sasuser.dnunder;
         set sasuser.sale2000(keep=routeid flightid date
             rev1st revbusiness revecon revcargo)
             key=flightdate;
         where routeid='0000103';
         Profit=sum(rev1st, revbusiness, revecon, revcargo,
                -expenses);
      run;
    3. data work.profit;
         set sasuser.dnunder;
         set sasuser.sale2000(keep=routeid flightid date
             rev1st revbusiness revecon revcargo);
             key=flightdate;
         Profit=sum(rev1st, revbusiness, revecon, revcargo,
                -expenses);
      run;
    4. data work.profit;
         set sasuser.dnunder;
         set sasuser.sale2000(keep=routeid flightid date
             rev1st revbusiness revecon revcargo)
             key=flightdate;
         Profit=sum(rev1st, revbusiness, revecon, revcargo,
                -expenses);
      run;

    Correct answer: d

    You use the KEY= option in a SET statement to cause SAS to use an index to combine data from multiple data sets. When the SET statement with the KEY= option executes, the program data vector must already contain a value for the indexed variable. You cannot use WHERE processing on a data set that has been read with the KEY= option within the same DATA step.

  10. Which of the following statements about the _IORC_ variable is false?

    1. It is automatically created when you use either a SET statement with the KEY= option or the MODIFY statement with the KEY= option in a DATA step.

    2. A value of zero for _IORC_ means that the most recent SET statement with the KEY= option (or MODIFY statement with the KEY= option) did not execute successfully.

    3. A value of zero for _IORC_ means that the most recent SET statement with the KEY= option (or MODIFY statement with the KEY= option) executed successfully.

    4. You can use the _IORC_ variable to prevent nonmatching data from being included when you use an index to combine data from multiple data sets.

    Correct answer: b

    When you use the KEY= option, SAS creates an automatic variable named _IORC_, which stands for INPUT/OUTPUT Return Code. If the value of _IORC_ is zero, the index search was successful. The _IORC_ variable is also created automatically when you use a MODIFY statement in a DATA step.

Chapter 16: Using Lookup Tables to Match Data

  1. Which SAS statement correctly specifies the array Sales as illustrated in the following table?

    Chapter 16: Using Lookup Tables to Match Data
    1. array Sales{3,4} m1-m12;
    2. array Sales{4,3} m1-m12;
    3. array {3,4} Sales m1-m12;
    4. array {4,12} Sales m1-m12;

    Correct answer: a

    An array is specified using the keyword ARRAY followed by the name of the array and the dimensions of the array. In a two-dimensional array, the two dimensions can be thought of as a table of rows and columns. The first dimension in the ARRAY statement specifies the number of rows. The second dimension specifies the number of columns.

  2. Which of the following statements creates temporary array elements?

    1. array new {*} _temporary_;
    2. array new {6} _temporary_;
    3. array new {*} _temporary_ Jan Feb Mar Apr May Jun;
    4. array _temporary_ new {6} Jan Feb Mar Apr May Jun;

    Correct answer: b

    To create temporary array elements, specify the keyword _TEMPORARY_ after the array name and dimension. Remember that if you use an asterisk to count the array elements, you must list the array elements. You cannot use the asterisk and the _TEMPORARY_ keyword together in an ARRAY statement.

  3. Which DO statement processes all of the elements in the Yearx array?

    array Yearx{12} Jan--Dec;
    1. do i=1 to dim(yearx);
    2. do i=1 to 12;
    3. do i=Jan to Dec;
    4. a and b

    Correct answer: d

    To process all of the elements in an array, you can use either the DIM function with the array name as the argument or specify the array dimension.

  4. Given the following program, what is the value of Points in the fifth observation in the data set Work.Results?

    Chapter 16: Using Lookup Tables to Match Data
    1. 40

    2. 50

    3. 60

    4. 70

    Correct answer: a

    The ARRAY statement creates the two-dimensional array score and specifies the dimensions of the array: two rows and four columns. The value of Points for each observation is determined by referencing the array based on the values of Week and Finish in the Work.Contest data set. The row number for the array reference is determined by the value of Week. The column number for the array reference is determined by the value of Finish.

  5. Array values should be stored in a SAS data set when

    1. there are too many values to initialize easily in an array.

    2. the values change frequently.

    3. the same values are used in many programs.

    4. all of the above

    Correct answer: d

    Array values should be stored in a SAS data set when there are too many values to initialize easily in an array, the values change frequently, or the same values are used in many programs.

  6. Given the following program, which statement is not true?

    data work.lookup1;
       array Targets{1997:1999,12} _temporary_;
       if _n_=1 then do i= 1 to 3;
                set sasuser.ctargets;
                array Mnth{*} Jan--Dec;
                do j=1 to dim(mnth);
                   targets{year,j}=mnth{j};
                end;
       end;
       set sasuser.monthsum(keep=salemon revcargo monthno);
       year=input(substr(salemon,4),4.);
       Ctarget=targets{year,monthno};
    run;
    1. The IF-THEN statement specifies that the Targets array is loaded once.

    2. During the first iteration of the DATA step, the outer DO loop executes three times.

    3. After the first iteration of the DO loop, the pointer drops down to the second SET statement.

    4. During the second iteration of the DATA step, the condition _N_=1 is false. So, the DO loop doesn't execute.

    Correct answer: c

    The IF-THEN statement specifies that the Targets array is loaded only once, during the first iteration of the DATA step. During the first iteration of the DATA step, the condition _N_=1 is true, so the outer DO loop executes three times; once for each observation in Sasuser.Ctargets. After the third iteration of the DO loop, the pointer drops down to the second SET statement and the values from the first observation in Sasuser.Monthum are read into the program data vector. During the second iteration of the DATA step, the condition _N_=1 is false. So, the DO loop doesn't execute.

  7. Given the following program, which variable names will appear in the data set Work.New?

    Chapter 16: Using Lookup Tables to Match Data
    1. Year, Jan, Feb, Mar, Apr
    2. Year, 2000, 2001, 2002
    3. _NAME_, Col1, Col2, Col3
    4. _NAME_, Jan, Feb, Mar, Apr

    Correct answer: c

    The TRANSPOSE procedure creates an output data set by restructuring the values in a SAS data set. When the data set is restructured, selected variables are transposed into observations. The procedure creates several variable names by default. _NAME_ is the default name of the variable that PROC TRANSPOSE creates to identify the source of the values in each observation in the output data set. The remaining transposed variables are named COL1…COLn by default.

  8. Which program creates the output data set Work.Temp2?

    Chapter 16: Using Lookup Tables to Match Data
    1. proc transpose data=work.temp
           out=work.temp2
           prefix=Quarter;
      run;
    2. proc transpose data=work.temp
           out=work.temp2
           name=Month
           prefix=Quarter;
      run;
    3. proc transpose data=work.temp
           out=work.temp2
           prefix=Month
           name=Quarter;
        run
    4. proc transpose data=work.temp
           out=work.temp2
           prefix=Month
           index=Quarter;
      run;

    Correct answer: b

    You can use several options with PROC TRANSPOSE to give the variables in the output data set descriptive names. The NAME= option specifies a name for _NAME_ variable. The PREFIX= option specifies a prefix to use in constructing names for the other variables in the output data set.

  9. Which version of the data set Work.Sales2 is created by the following program?

    Chapter 16: Using Lookup Tables to Match Data
    1. Chapter 16: Using Lookup Tables to Match Data
    2. Chapter 16: Using Lookup Tables to Match Data
    3. Chapter 16: Using Lookup Tables to Match Data
    4. Chapter 16: Using Lookup Tables to Match Data

    Correct answer: b

    A BY statement can be used with PROC TRANSPOSE. For each BY group, PROC TRANSPOSE creates one observation for each variable that it transposes. The BY variable itself is not transposed. The original data set must be sorted or indexed prior to using a BY statement with PROC TRANSPOSE.

  10. Which program creates the data set Work.Fishsize?

    Chapter 16: Using Lookup Tables to Match Data
    Chapter 16: Using Lookup Tables to Match Data
    1. proc transpose data=work.fishdata
           out=work.fishsize
           prefix=Measurement;
      run;
    2. proc transpose data=work.fishdata
           out=work.fishsize
           prefix=Measurement;
           by location;
      run;
    3. proc transpose data=work.fishdata
           out=work.fishsize
           prefix=Measurement;
           by date;
      run;
    4. proc transpose data=work.fishdata
           out=work.fishsize
           prefix=Measurement;
           by location date;
      run;

    Correct answer: d

    The observations in Work.Fishsize are grouped by Location and Date. For each BY group, PROC TRANSPOSE creates four observations, one for each variable (Length1, Weight1, Length2, and Weight2) that it is transposing.

Chapter 17: Formatting Data

  1. Which SAS system option is used to identify format catalogs to SAS?

    1. FMTERR

    2. FMTLIB

    3. NOFMTERR

    4. FMTSEARCH=

    Correct answer: d

    By default, SAS searches for custom formats in the Work and Library libraries. The FMTSEARCH= system option specifies other catalogs to search when a format is referenced.

  2. Given the following PROC FORMAT step, how is the value 70 displayed when the AGEGRP. format is applied?

    proc format;
       picture agegrp
               1-<13='00 Youth'
               13-<20='00 Teen'
               20-<70='00 Adult'
               70-high='000 Senior';
    run;
    1. 000 Senior

    2. 70 Adult

    3. 70 Senior

    4. 070 Senior

    Correct answer: c

    A non-inclusive range is used such that the age at the high end of the range is not included. To create the picture format, three zeros are used to create a position for a three-digit numeric value. Because zero is used as a digit selector rather than a nonzero value, leading zeros are not included in the formatted value.

  3. When the NOFMTERR system option is in effect, what happens when SAS encounters a format it cannot locate?

    1. Creates the format in the default Work.Formats directory and continues processing.

    2. Substitutes the $w.or w. format and continues processing.

    3. Stops processing and writes an error message to the log.

    4. Skips processing at that step and continues with the next step and writes a note to the log.

    Correct answer: b

    By default, FMTERR is in effect and SAS stops processing if it cannot find a format that is referenced. When NOFMTERR is in effect, SAS substitutes the $w. or w. format and continues processing.

  4. Which of the following variables must be in the data set that is specified on the CNTLIN= option?

    1. End
    2. FmtName
    3. Value
    4. Description

    Correct answer: b

    A data set that is used to create a format with the CNTLIN= option must have the variables FmtName, Start, and Label. If a range is specified, it must also include the variable End.

  5. Given the following code, what option is missing?

    proc format;
        value times (?) 
             '00:00't-'04:59't = 'Red Eye'
             '05:00't-'11:59't = 'Morning'
             '12:00't-'17:59't = 'Afternoon'
             '18:00't-'23:59't = 'Evening'
             '00:00't-'11:59't = 'AM'
             '12:00't-'23:59't = 'PM';
    run;
    1. MULTILABEL

    2. MULTIRANGE

    3. MLF

    4. MULTIFORMAT

    Correct answer: a

    The format created by this value statement has overlapping ranges, so the MULTILABEL option must be used. A multilabel format can be used by any procedure that supports the MLF option.

  6. Which PROC FORMAT option is used to create a SAS data set from a format?

    1. CNTLIN=

    2. LIB=

    3. CNTLOUT=

    4. FMTLIB

    Correct answer: c

    The CNTLOUT= option is used to create a SAS data set from a format.

  7. Given the following OPTIONS statement, in what order will SAS search to find a user-defined format?

    options fmtsearch=(work abc.newfmt sasuser);
    1. Work.Formats

      Chapter 17: Formatting Data
      Abc.Newfmt
      Chapter 17: Formatting Data
      Sasuser.Formats
      Chapter 17: Formatting Data
      Library.Formats

    2. Work.Formats

      Chapter 17: Formatting Data
      Library.Formats
      Chapter 17: Formatting Data
      Abc.Newfmt
      Chapter 17: Formatting Data
      Sasuser.Formats

    3. Work.Formats

      Chapter 17: Formatting Data
      Abc.Newfmt
      Chapter 17: Formatting Data
      Sasuser.Format

    4. the default search order

    Correct answer: b

    SAS will search in the order specified on the FMTSEARCH= option. By default, SAS searches in the Work and Library libraries first unless they are specified on the option. Because Library is not specified here, it is searched after Work.

  8. What option is used with PROC FORMAT to document the formats in a particular format catalog?

    1. FMTSEARCH

    2. FMTERR

    3. CATALOG

    4. FMTLIB

    Correct answer: d

    The FMTLIB keyword is used to document the formats in a catalog. You can use the SELECT and EXCLUDE statements to process specific formats rather than the entire catalog.

  9. Which set of statements would you add to the PROC CATALOG code to copy the LEVELS. and $PICKS. formats from the Sasuser.Formats catalog to the Work.Formats catalog?

    proc catalog cat=sasuser.formats;
       ? 
       ? 
    run;
    1. copy out=sasuser.formats;
      select levels.format $picks.format;
    2. copy out=work.formats;
      select levels $picks;
    3. copy out=work.formats;
      select levels.format picks.formatc;
    4. copy out=work.formats;
      select levels.format $picks.format;

    Correct answer: c

    In the COPY statement, OUT= specifies the catalog to which you want to copy the format catalog entry. In the SELECT statement you specify the catalog entries by their entire name. Remember that numeric formats are stored with the extension .FORMAT and character formats are stored with the extension .FORMATC.

  10. Given the following PROC FORMAT step, how is the value 6.1 displayed when the SKICOND format is applied?

    proc format;
       value skicond
              0-3='Poor'
             3<-6='Fair'
             6<-9='Good'
             9<-high='Excellent';
    run;
    1. 6.1

    2. Fair

    3. Good

    4. .

    Correct answer: c

    The value 6.1 falls in the range 6<-9, which is labeled 'Good.' The non-inclusive range does not include the value 6, but it does include everything above 6.

Chapter 18: Modifying SAS Data Sets and Tracking Changes

  1. Which type of integrity constraint would you place on the variable StoreID to ensure that there are no missing values and that there are no duplicate values?

    1. UNIQUE

    2. CHECK

    3. PRIMARY KEY

    4. NOT NULL

    Correct answer: c

    The PRIMARY KEY integrity constraint includes both the NOT NULL and UNIQUE constraints.

  2. Which code creates an audit trail on the SAS data set Reports.Quarter1?

    1. proc datasets nolist;
         audit quarter1;
         initiate;
      quit;
    2. proc datasets lib=reports nolist;
         audit initiate reports.quarter1;
      quit;
    3. proc datasets lib=reports nolist;
         initiate audit quarter1;
      quit;
    4. proc datasets lib=reports nolist;
         audit quarter1;
         initiate;
      quit;

    Correct answer: d

    To initiate an audit on an existing SAS data set with the DATASETS procedure, you specify the data set in the AUDIT statement, and then you specify the INITIATE statement. You specify the library with the LIB= option.

  3. Which DATA step uses the transaction data set Records.Overnight to update the master data set Records.Snowfall by accumAmt?

    1. data records.snowfall;
         modify records.snowfall records.overnight
         key=accumAmt;
      run;
    2. data records.snowfall;
         modify records.overnight records.snowfall;
         by accumAmt;
      run;
    3. data records.snowfall;
         modify records.snowfall records.overnight;
         by accumAmt;
      run;
    4. data records.snowfall;
         modify records.snowfall records.overnight;
         update accumAmt;
      run;

    Correct answer: c

    In the MODIFY statement, you specify the master data set followed by the transaction data set. Then you specify the variable in the BY statement.

  4. The automatic variable _IORC_ is created when you use the MODIFY statement with a BY statement or the KEY= option. How can you use the value of _IORC_?

    1. to determine whether the index specified on the KEY= option is a valid index

    2. to determine the number of observations that were updated in the master data set

    3. to determine the status of the I/O operation

    4. to determine the number of observations that could not be updated in the master data set

    Correct answer: c

    The value of _IORC_ is a numeric return code that indicates the status of the most recently executed I/O operation. Checking the value of this variable allows you to detect abnormal I/O conditions and direct execution in particular ways.

  5. Which PROC DATASETS step creates an integrity constraint named val_age on the data set Survey to ensure that values of the variable age are greater than or equal to 18?

    1. proc datasets nolist;
         modify age;
         ic create val_age=check(where=(age>=18));
      quit;
    2. proc datasets nolist;
         modify Survey;
         ic create val_age=check(age>=18);
      quit;
    3. proc datasets nolist;
         modify survey;
         integrity constraint
                   val_age=check(where=(age>=18));
      quit;
    4. proc datasets nolist;
         modify survey;
         ic create val_age=check(where=(age>=18));
      quit;

    Correct answer: d

    In the MODIFY statement, you list the SAS data set that you want to modify. Then you use the IC CREATE statement to create the integrity constraint. This integrity constraint is a CHECK constraint and you use a WHERE clause to specify the condition that the variable values must meet.

  6. Which statement about using the MODIFY statement in a DATA step is true?

    1. MODIFY creates a second copy of the data while variables in the data are being matched with a WHERE clause and then deletes the second copy.

    2. You cannot modify the descriptor portion of the data set using the MODIFY statement.

    3. You can use the MODIFY statement to change the name of a variable.

    4. If the system terminates abnormally while a DATA step that is using the WHERE statement is processing, SAS automatically saves a copy of the unaltered data set.

    Correct answer: b

    The MODIFY statement in a DATA step can be used only to modify the values in a data set. It cannot be used to modify the descriptor portion of the data set.

  7. Which of the following statements about audit trails is true?

    1. They create historical versions of data so that a copy of the data set is saved each time the data is replaced.

    2. They record information about changes to observations in a data set each time the data set is replaced.

    3. They record information about changes to observations in a data set each time the data is modified in place.

    4. The audit trail file has the same name as the SAS data file it is monitoring, but has #AUDIT at the end of the data set name.

    Correct answer: c

    Audit trails are used to track changes that are made to a data set in place.

  8. Which code initiates generation data sets on the existing SAS data set Sasuser.Amounts and specifies that five historical versions are saved in addition to the base version?

    1. proc datasets lib=sasuser nolist;
         modify Amounts (genmax=6);
      quit;
    2. proc datasets lib=sasuser nolist;
         modify Amounts (genmax=5);
      quit;
    3. proc datasets lib=sasuser nolist;
         modify Amounts (gennum=6);
      quit;
    4. proc datasets lib=sasuser nolist;
         modify Amounts (gennum=5);
      quit;

    Correct answer: a

    You use the DATASETS procedure and the MODIFY statement to specify a number of generation data sets for a data set. The GENMAX= option is used to specify the number of versions to save. The number you specify includes the base version.

  9. Which statement about using the KEY= option in the MODIFY statement is true?

    1. SAS locates the variables to update using the index specified in the KEY= option and then automatically overlays nonmissing transaction values as it does when you use the MODIFY/BY statements.

    2. When you use the KEY= option, you must explicitly state the update that you want to make. SAS does not automatically overlay nonmissing transaction values.

    3. The KEY= option is used to specify a variable to match for updating observations.

    4. The index named in the KEY= option must be a simple index.

    Correct answer: b

    When you use the KEY= option, you must specify the update that you want to make to the data set.

  10. Which code deletes all generations of the data set Sasuser.Amounts including the base data set?

    1. proc datasets lib=sasuser nolist;
         delete amounts (gennum=ALL);
      quit;
    2. proc datasets lib=sasuser nolist;
         delete amounts (gennum=HIST);
      quit;
    3. proc datasets lib=sasuser nolist;
         delete amounts (gennum=0);
      quit;
    4. proc datasets lib=sasuser nolist;
         delete amounts;
      quit;

    Correct answer: a

    The keyword ALL is used to indicate that you want to delete all generations of the specified data set including the base version. The keyword HIST deletes the generation data sets, but saves the base version.

Chapter 19: Introduction to Efficient SAS Programming

This chapter has no quiz.

Chapter 20: Controlling Memory Usage

  1. Which of the following statements is true regarding the BUFNO= option?

    1. The BUFNO= option specifies the size of each buffer that is used for reading or writing a SAS data set.

    2. The BUFNO= option can improve execution time by limiting the number of input/output operations that are required.

    3. Using the BUFNO= option results in permanent changes to the data set.

    4. Using the BUFNO= option to increase the number of buffers results in decreased memory consumption.

    Correct answer: b

    You can use the BUFNO= system option or data set option to control how many buffers are available for reading or writing a SAS data set. Using BUFNO= can improve execution time by limiting the number of input/output operations that are required for a particular SAS data set. However, the improvement in I/O comes at the cost of increased memory consumption. The buffer number is not a permanent attribute of the data set and is valid only for the current step or SAS session.

  2. Which of the following statements is not true regarding a page?

    1. A page is the unit of data transfer between the engine and the operating environment.

    2. A page includes the number of bytes used by the descriptor portion, the data values, and the overhead.

    3. The size of a page is analogous to buffer size.

    4. The size of a page can be changed at any time.

    Correct answer: d

    A page is fixed in size when the data set is created, either to a default value or a specified value. You can use the BUFSIZE= option to control the page size of an output SAS data set. BUFSIZE= specifies not only the page size (in bytes), but also the size of each buffer that is used for reading or writing the SAS data set. The new buffer size is permanent; after it is specified, it is used whenever the data set is processed.

  3. The total number of bytes occupied by a data set equals…?

    1. the page size multiplied by the number of pages.

    2. the page size multiplied by the number of observations.

    3. the sum of the page size and the number of pages.

    4. the number of pages multiplied by the number of variables.

    Correct answer: a

    The total number of bytes occupied by a data set equals the page size multiplied by the number of pages. You can use the CONTENTS procedure to report the page size and the number of pages.

  4. Which statement opens the file Work.Quarter1, allocates enough buffers to hold the entire file in memory, and reads the data into memory?

    1. sasfile work.quarter1 open;
    2. sasfile work.quarter1 load;
    3. sasfile work.quarter1 bufno=max;
    4. sasfile work.quarter1 bufsize=max;

    Correct answer: b

    The SASFILE LOAD statement opens the file, allocates the buffers, and reads the data into memory.

  5. Which of the following statements is true regarding a file that is opened with the SASFILE statement?

    1. The file is available to subsequent DATA and PROC steps or applications until a SASFILE CLOSE statement is executed or until the program ends.

    2. The file is available to subsequent DATA and PROC steps or applications until a SASFILE END statement is executed.

    3. The file is available for subsequent utility or output processing until the program ends.

    4. If the file increases in size during processing, the number of buffers remains the same.

    Correct answer: a

    When a SAS data file is opened using the SASFILE statement, the data is held in memory, and is available to subsequent DATA and PROC steps or applications, until either a SASFILE CLOSE statement is executed or the program ends. Though a file that is opened with the SASFILE statement can be used for subsequent input or update processing, it cannot be used for subsequent utility or output processing. If the file in memory increases in size during processing, the number of buffers also increases.

Chapter 21: Controlling Data Storage Space

  1. Which of the following statements about uncompressed SAS data files is true?

    1. The descriptor portion is stored on whatever page has enough room for it.

    2. New observations are always added in the first sufficient available space.

    3. Deleted observation space is tracked.

    4. New observations are always added at the end of the data set.

    Correct answer: d

    The descriptor portion of an uncompressed data file is always stored at the end of the first data set page. New observations are always added to the end of the data set, and deleted observation space is neither tracked nor reused.

  2. Which of the following statements about compressed SAS data files is true?

    1. The descriptor portion is stored on whatever data set page has enough room for it.

    2. Deleted observation spaced can be reused.

    3. Compressed SAS data files have a smaller overhead than uncompressed SAS data files.

    4. In a compressed SAS data set, each observation must be the same size.

    Correct answer: b

    The descriptor portion of a compressed data file is always stored at the end of the first data set page. If you specify REUSE=YES, SAS tracks and reuses deleted observation space within a compressed data file. Therefore, every observation in a compressed data file can be a different size. Compressed data files do have a larger overhead than uncompressed data files.

  3. Which of the following programs correctly creates reduced-length numeric variables?

    1. data temp;
         infile file1;
         input x 4.
               y 3.
               z 2.;
        run;
    2. data temp;
         format x 4.
                y 3.
                z 2.;
         infile file1;
         input x 4.
               y 3.
               z 2.;
      run;
    3. data temp;
         length x 4
                y 3
                z 2;
         infile file1;
         input x 4.
               y 3.
               z 2.;
        run;
    4. data temp;
         informat x 4.
                  y 3.
                  z 2.;
         infile file1;
         input x 4.
               y 3.
               z 2.;
      run;

    Correct answer: c

    You use the LENGTH statement to assign a reduced length to a numeric variable. If you do not use the LENGTH statement to define a reduced length for numeric variables, their default length is 8 bytes. The FORMAT statement associates a format with a variable, and the INFORMAT statement associates an informat with a variable.

  4. Which of the following statements about SAS data views is true?

    1. SAS data views use less disk space but more CPU resources than SAS data files.

    2. SAS data views can be created only in permanent SAS data libraries.

    3. SAS data views use less CPU resources but more disk space than SAS data files.

    4. SAS data views can be created only in temporary SAS data libraries.

    Correct answer: a

    SAS data views use significantly less disk space than SAS data files. However, SAS data views typically need more CPU resources than SAS data files. You can create a SAS data view in either the temporary SAS data library or in a permanent SAS data library.

  5. Which of the following programs should you use to detect any loss of precision between the default-length numeric variables in Company.Regular and the reduced-length numeric variables in the data set Company.Reduced?

    1. proc contents data=company.regular;
         compare data=company.reduced;
      run;
    2. proc compare base=company.regular
         compare=company.reduced;
      run;
    3. proc print data=company.regular;
      run;
      
      proc print data=company.reduced;
      run;
    4. proc datasets library=company;
         contents data=regular compare=reduced;
      run;

    Correct answer: b

    You use the COMPARE procedure to detect any differences in the values of two data sets. The COMPARE statement is not valid syntax in either the CONTENTS procedure or the DATASETS procedure. Printing both data sets might not reveal differences in the precise values of the shortened variables, depending on the formats that are used.

Chapter 22: Using Best Practices

  1. Placing the subsetting IF statement at the top rather than near the bottom of a DATA step results in a savings in CPU usage. What happens if the subset is large rather than small?

    1. The savings in CPU usage increases as the subset grows larger because the I/O increases.

    2. The savings in CPU usage decreases as the subset grows larger. However, placing the subsetting IF statement at the top of a DATA step always uses fewer resources than placing it at the bottom.

    3. The savings in CPU usage remains constant as the subset grows larger. However, placing the subsetting IF statement near the bottom of a data set is preferable.

    4. The savings in CPU usage decreases as the subset grows larger. However, placing the subsetting IF statement near the bottom of a data set increases the I/O.

    Correct answer: b

    As SAS processes a larger subset of the data, more CPU resources are required. However, positioning of the subsetting IF statement in a DATA step does affect performance and efficiency.

  2. Which of the following statements is true about techniques that are used for modifying data and attributes?

    1. You can use PROC DATASETS to modify both data values and variable attributes.

    2. You can use PROC DATASETS to modify only data values.

    3. You can use the DATA step to modify both data values and variable attributes.

    4. You can use the DATA step to modify only variable attributes.

    Correct answer: c

    The DATA step is the only technique that can be used to modify both data values and variable attributes. The DATASETS procedure enables you to modify only variable attributes.

  3. For selecting observations, is a subsetting IF statement or a WHERE statement more efficient? Why?

    1. A subsetting IF statement is more efficient because it loads all observations sequentially into the program data vector.

    2. A subsetting IF statement is more efficient because it examines what is in the input buffer and selects observations before they are loaded into the program data vector, which results in a savings in CPU operations.

    3. A WHERE statement is more efficient because it loads all observations sequentially into the program data vector.

    4. A WHERE statement is more efficient because it examines what is in the input buffer and selects observations before they are loaded into the program data vector, which results in a savings in CPU operations.

    Correct answer: d

    For selecting observations, a WHERE statement is more efficient than a subsetting IF statement because it examines what is in the input buffer and selects observations before they are loaded into the program data vector, which results in a savings in CPU operations.

  4. When is it more advantageous to create a temporary SAS data set rather than a permanent SAS data set?

    1. When the external file on which the data set is based might change between SAS sessions.

    2. When the external file on which the data set is based does not change between SAS sessions.

    3. When the data set is needed for more than one SAS session.

    4. When you are converting raw numeric values to SAS data values.

    Correct answer: a

    It is more advantageous to create a temporary SAS data set rather than a permanent SAS data set when the external file on which the data set is based is frequently updated between SAS sessions.

  5. When you compare the technique of using multiple DATASETS procedures to using a single DATASETS procedure to modify the descriptor portion of a data set, which is true?

    1. A one-step DATASETS procedure results in an increase in I/O operations.

    2. Multiple DATASETS procedures result in a decrease in I/O operations.

    3. A one-step DATASETS procedure results in a decrease in CPU usage.

    4. Multiple DATASETS procedures result in a decrease in CPU usage.

    Correct answer: c

    A one-step DATASETS procedure results in a savings of CPU usage and I/O operations. PROC DATASETS supports RUN-group processing, which enables you to process multiple SAS data sets from the same library with one invocation of the procedure.

Chapter 23: Selecting Efficient Sorting Strategies

  1. When the following program is submitted, what is the value of FIRST.Product_Line for the third observation in the data set Work.Products?

    data new.products;
      set work.products
      by product_line notsorted;
    run;

    SAS Data Set Work.Products

    Obs

    Product_Line

    Product_Name

    Supplier_Name

    1

    Children

    Kids Children's Fleece Hat

    3Top Sports

    2

    Children

    Ski Jacket w/Removable Fleece

    Scandinavian Clothing A/S

    3

    Clothes & Shoes

    Tyfoon Linen Pants

    Typhoon Clothing

    4

    Clothes & Shoes

    Watchit 120 Sterling/Reflective

    Eclipse Inc

    5

    Clothes & Shoes

    Money Belt, Black

    Top Sports

    1. 1

    2. 3

    3. 0

    4. Clothes & Shoes

    Correct answer: a

    First. is a temporary automatic variable that identifies the first observation in each BY group. When an observation is the first in a BY group, SAS sets the value of the FIRST.variable to 1. For all other observations in the BY group, the value of the FIRST.variable is 0.

  2. Which option is used with the SORT procedure to store only the BY variables and the observation numbers in temporary files?

    1. NOTSORTED

    2. GROUPFORMAT

    3. TAGSORT

    4. SORTEDBY=

    Correct answer: c

    The TAGSORT option stores only the BY variables and the observation numbers in temporary files. The BY variables and the observation numbers are called tags. At the completion of the sorting process, PROC SORT uses the tags to retrieve records from the input data set in sorted order.

  3. Which of the following is not an advantage of BY-group processing with an index that is based on the BY variables?

    1. The data can be sequenced without using the SORT procedure.

    2. There is no need to re-sort a data set when observations are modified or added.

    3. It is generally more efficient than reading a sorted data set sequentially.

    4. The data can be sequenced by different variables if multiple indexes are used.

    Correct answer: c

    When BY-group processing is used with an index that is based on the BY variables, the data can be sequenced without using the SORT procedure. The data can be sequenced by different variables if multiple indexes are used. Because indexes are updated automatically, there is no need to re-sort a data set when observations are modified or added. However, BY-group processing with an index is generally less efficient than reading a sorted data set sequentially.

  4. Which SORT procedure option compares all of the variable values for each observation to those for the previous observation that was written to the output data set?

    1. NODUPKEY

    2. NODUPRECS

    3. EQUALS

    4. NOEQUALS

    Correct answer: b

    The NODUPRECS option compares all of the variable values for each observation to those for the previous observation that was written to the output data set. If an exact match is found, then the observation is not written to the output data set.

  5. What happens if the workspace that is required for completing a sort is less than or equal to the value that is specified in the SORTSIZE= system option or procedure option?

    1. CPU time is increased.

    2. I/O is increased.

    3. The entire sort can take place in memory.

    4. A temporary utility file is created in the Work directory or in a mainframe temporary area.

    Correct answer: c

    The SORTSIZE= system option or procedure option specifies how much memory is available to the SORT procedure. If the required workspace is less than or equal to the value specified in the SORTSIZE= system option or procedure option, then the entire sort can take place in memory, which reduces processing time.

Chapter 24: Querying Data Efficiently

  1. Why can using an index reduce the number of I/O operations that are required for accessing a small subset?

    1. Using an index requires larger input buffers, which can hold more pages.

    2. The index does not have to be loaded into an input buffer.

    3. The number of observations that SAS has to load into the program data vector (PDV) is decreased.

    4. The number of pages that SAS has to load into input buffers is decreased.

    Correct answer: d

    When using an index to select a subset, SAS loads only the pages that contain a qualified observation into input buffers. When accessing observations sequentially, SAS must load all observations into input buffers. Loading more pages requires more I/O operations.

  2. You want to select a subset of observations in the data set Company.Products, and you have defined a simple index on the variable Rating. SAS cannot use the index to process which of the following WHERE statements?

    1. where rating is missing;
    2. where rating=int(rating);
    3. where rating between 3.5 and 7.5;
    4. where rating=5.5;

    Correct answer: b

    SAS considers using an index to process a WHERE condition that contains one of a specific group of operators and functions. However, SAS will not consider using an index for a WHERE condition that contains other elements, such as a function other than TRIM or SUBSTR.

  3. In which of the following situations is sequential access likely to be more efficient than direct access for WHERE processing?

    1. The subset contains over 75% of the observations in the data set.

    2. The WHERE expression specifies both key variables in a single composite index.

    3. The data is sorted on the key variable.

    4. The data set is very large.

    Correct answer: a

    The size of the subset relative to the size of the data set is an important factor in determining which access method is most efficient. If a subset is large (more than 33% of the data set), it is likely to be more efficient to use sequential access than direct access. Direct access is usually more efficient when you are selecting a small subset (less than 33% of the data set), especially if the data set is large (has a high page count). However, if the data set is very small (less than three pages), using an index is not efficient. The number of key variables specified in a WHERE expression does not determine which access method is most efficient. If the two key variables that are specified are the first two variables in the same index, the WHERE expression is a candidate for compound optimization. Sorting the data also does not determine which access method is most efficient. However, sorting the data before subsetting improves the efficiency of WHERE processing regardless of the access method.

  4. You want to summarize data for one class variable, and you are trying to decide whether to use PROC MEANS (or PROC SUMMARY), PROC REPORT, PROC TABULATE, PROC SQL, or the DATA step with PROC SORT. Which of the following statements about the efficiency of these summarization tools is not true?

    1. PROC MEANS (or PROC SUMMARY), PROC REPORT, and PROC TABULATE have similar resource usage.

    2. The efficiency of all these tools is affected by the shape of the data.

    3. The SQL procedure is always the least efficient because it is a general-purpose tool.

    4. PROC SQL and the DATA step with PROC SORT have similar resource usage.

    Correct answer: c

    When you are summarizing data for one or more class variables, the tools in each of the following groups are similar in resource usage:

    • PROC MEANS (or PROC SUMMARY), PROC REPORT, and PROC TABULATE

    • PROC SQL and the DATA step with PROC SORT.

    However, the relative efficiency of the two groups of tools varies according to the shape of the data.

  5. Which of the following techniques is most efficient for summarizing data for specific combinations of class variables?

    1. the NWAY option in multiple PROC MEANS steps

    2. the TYPES statement in a PROC MEANS step

    3. the WHERE= option in a PROC MEANS step

    4. a basic PROC MEANS step

    Correct answer: b

    The TYPES statement in a PROC MEANS step is the most efficient technique for summarizing data for one or more specific combinations of class variables. A program that contains the NWAY option in multiple PROC MEANS steps is usually least efficient because SAS must read the entire data set in each step. A PROC MEANS step that contains the WHERE= option in the OUTPUT statement is also less efficient than the TYPES statement in PROC MEANS because SAS must calculate all possible combinations of class variables, and subsetting does not occur until the results are written to output. A basic PROC MEANS step summarizes data for all possible combinations of class variables instead of for specific combinations.

..................Content has been hidden....................

You can't read the all page of ebook, please click here login for view all page.
Reset