Chapter 16. Processing Variables with Arrays

Overview

Introduction

In DATA step programming, you often need to perform the same action on more than one variable. Although you can process variables individually, it is easier to handle them as a group. You can do this by using array processing.

For example, using an array and DO loop, the program below eliminates the need for 365 separate programming statements to convert the daily temperature from Fahrenheit to Celsius for the year.

data work.report(drop=i);
   set master.temps;
   array daytemp{365} day1-day365; 
   do i=1 to 365; 
      daytemp{i}=5*(daytemp{i}-32)/9; 
   end;
run;

You can use arrays to simplify the code needed to

  • perform repetitive calculations

  • create many variables that have the same attributes

  • read data

  • rotate SAS data sets by changing variables to observations or observations to variables

  • compare variables

  • perform a table lookup.

This chapter teaches you how to define an array and how to reference elements of the array in the DATA step.

Objectives

In this chapter, you learn to

  • group variables into one- and two-dimensional arrays

  • perform an action on array elements

  • create new variables with an ARRAY statement

  • assign initial values to array elements

  • create temporary array elements with an ARRAY statement.

Creating One-Dimensional Arrays

Understanding SAS Arrays

A SAS array is a temporary grouping of SAS variables under a single name. An array exists only for the duration of the DATA step.


One reason for using an array is to reduce the number of statements that are required for processing variables. For example, in the DATA step below, the values of seven data set variables are converted from Fahrenheit to Celsius temperatures.

data work.report;
   set master.temps;
   mon=5*(mon-32)/9;
   tue=5*(tue-32)/9;
   wed=5*(wed-32)/9;
   thr=5*(thr-32)/9;
   fri=5*(fri-32)/9;
   sat=5*(sat-32)/9;
   sun=5*(sun-32)/9;
run;

As you can see, the assignment statements perform the same calculation on each variable in this series of statements. Only the name of the variable changes in each statement.

By grouping the variables into a one-dimensional array, you can process the variables in a DO loop. You use fewer statements, and the DATA step program is more easily modified or corrected.

data work.report(drop=i);
   set master.temps;
   array wkday{7} mon tue wed thr fri sat sun;
   do i=1 to 7;
      wkday{i}=5*(wkday{i}-32)/9;
   end;
run;

You will learn other uses for arrays as you continue through this chapter.

Defining an Array

To group previously defined data set variables into an array, use an ARRAY statement.

For example, in the data set Finance.Sales91, you might want to process the variables Qtr1, Qtr2, Qtr3, and Qtr4 in the same way.


Specifying the Array Name

To group the variables in the array, first give the array a name. In this example, make the array name sales.

array sales{4} qtr1 qtr2 qtr3 qtr4;

Specifying the Dimension

Following the array name, you must specify the dimension of the array. The dimension describes the number and arrangement of elements in the array. There are several ways to specify the dimension.

  • In a one-dimensional array, you can simply specify the number of array elements. The array elements are the existing variables that you want to reference and process elsewhere in the DATA step.

    array sales{4} qtr1 qtr2 qtr3 qtr4;
  • The dimension of an array doesn't have to be the number of array elements. You can specify a range of values for the dimension when you define the array. For example, you can define the array sales as follows:

    array sales{96:99} totals96 totals97 totals98 totals99;
  • You can also indicate the dimension of a one-dimensional array by using an asterisk (*). This way, SAS determines the dimension of the array by counting the number of elements.

    array sales{*} qtr1 qtr2 qtr3 qtr4;
  • Enclose the dimension in either parentheses, braces, or brackets.

    array sales(4) qtr1 qtr2 qtr3 qtr4;
    array sales{4} qtr1 qtr2 qtr3 qtr4;
    array sales[4] qtr1 qtr2 qtr3 qtr4;

Specifying Array Elements

When specifying the elements of an array, you can list each variable name that you want to include in the array. When listing elements, separate each element with a space. As with all SAS statements, you end the ARRAY statement with a semicolon (;).

array sales{4}  qtr1 qtr2 qtr3 qtr4;

You can also specify array elements as a variable list. Here is an example of an ARRAY statement that groups the variables Qtr1 through Qtr4 into a one-dimensional array, using a variable list.

array sales{4}  qtr1-qtr4;

Let's look more closely at array elements that are specified as variable lists.

Variable Lists as Array Elements

You can specify variable lists in the forms shown below. Each type of variable list is explained in more detail following the table.

Variables                        Form
a numbered range of variables    Var1-Varn
all numeric variables            _NUMERIC_
all character variables          _CHARACTER_
all variables                    _ALL_

A Numbered Range of Variables

Qtr1 Qtr2 Qtr3 Qtr4 → Qtr1-Qtr4
  • The variables must have the same name except for the last character or characters.

  • The last character of each variable must be numeric.

  • The variables must be numbered consecutively.

array sales{4}  qtr1-qtr4;

In the preceding example, you would use sales(4) to reference Qtr4. However, the index of an array doesn't have to range from one to the number of array elements. You can specify a range of values for the index when you define the array. For example, you can define the array sales as follows:

array sales{96:99}  totals96-totals99;
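
As a brief sketch of how such bounds might be used, the step below loops over the index range with the LBOUND and HBOUND functions instead of hard-coding 96 and 99. The data set name Work.Totals, the index variable yr, and the input values are made up for illustration.

```sas
data work.totals(drop=yr);
   input totals96 totals97 totals98 totals99;
   /* index values run from 96 to 99 instead of 1 to 4 */
   array sales{96:99} totals96-totals99;
   /* lbound(sales) is 96 and hbound(sales) is 99 */
   do yr=lbound(sales) to hbound(sales);
      sales{yr}=sales{yr}*2;
   end;
   datalines;
100 200 300 400
;
run;
```

Here sales{96} refers to Totals96 and sales{99} refers to Totals99. An index value of 1 would be out of range for this array.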

All Numeric Variables

Amount Rate Term → _NUMERIC_

_NUMERIC_ specifies all numeric variables that have already been defined in the current DATA step.

array sales{*}  _numeric_;

All Character Variables

FrstName LastName Address → _CHARACTER_

_CHARACTER_ specifies all character variables that have already been defined in the current DATA step.

array sales{*} _character_;

All Variables

FrstName LastName Address Amount Rate Term → _ALL_

_ALL_ specifies all variables that have already been defined in the current DATA step. The variables must all be of the same type: all character or all numeric.

array sales{*}  _all_;
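
The following sketch shows one way a special variable list might be used: it replaces missing numeric values with zero. The data set name Work.Loans and the input values are hypothetical. Note that the index variable i is first mentioned after the ARRAY statement, so it is not yet defined when _NUMERIC_ is resolved and does not become an array element.

```sas
data work.loans(drop=i);
   input Name $ Amount Rate Term;
   /* groups Amount, Rate, and Term: the numeric variables defined so far */
   array nums{*} _numeric_;
   /* dim(nums) returns the number of elements, which is 3 here */
   do i=1 to dim(nums);
      if nums{i}=. then nums{i}=0;
   end;
   datalines;
Ann 100 . 12
Bob . 5 .
;
run;
```

The character variable Name is not part of the array, so it is left unchanged.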

Referencing Elements of an Array

Now let's look at some ways you can use arrays to process variables in the DATA step.

data work.report(drop=i);
   set master.temps;
   array wkday{7} mon tue wed thr fri sat sun;
   do i=1 to 7;
      if wkday{i}>95 then output;
   end;
run;


data work.weights(drop=i);
   set master.class;
   array wt{6} w1-w6;
   do i=1 to 6;
      wt{i}=wt{i}*2.2;
   end;
run;


data work.new(drop=i);
   set master.synyms;
   array term{9} also1-also9;
   do i=1 to 9;
      if term{i} ne " " then output;
   end;
run;

The ability to reference the elements of an array by an index value is what gives arrays their power. Typically, arrays are used with DO loops to process multiple variables and to perform repetitive calculations.

array quarter{4} jan apr jul oct;
do i=1 to 4;
   YearGoal=quarter{i}*1.2;
end;

When you define an array in a DATA step, an index value is assigned to each array element. The index values are assigned in the order of the array elements.

                  1   2   3   4
array quarter{4} jan apr jul oct;
do i=1 to 4;
   YearGoal=quarter{i}*1.2;
end;

You use an array reference to perform an action on an array element during execution. To reference an array element in the DATA step, specify the name of the array, followed by an index value enclosed in parentheses, braces, or brackets.

When used in a DO loop, the index variable of the iterative DO statement can reference each element of the array.

array quarter{4} jan apr jul oct;
do i=1 to 4;
   YearGoal=quarter{i}*1.2;
end;

For example, the DO loop above increments the index variable i from the lower bound of the quarter array, 1, to the upper bound, 4. The following sequence illustrates this process:

                  1
array quarter{4} jan apr jul oct;
do i=1 to 4;
   YearGoal=quarter{1}*1.2;
end;

                      2
array quarter{4} jan apr jul oct;
do i=1 to 4;
   YearGoal=quarter{2}*1.2;
end;

                          3
array quarter{4} jan apr jul oct;
do i=1 to 4;
   YearGoal=quarter{3}*1.2;
end;

                              4
array quarter{4} jan apr jul oct;
do i=1 to 4;
   YearGoal=quarter{4}*1.2;
end;

During each iteration of the DO loop, quarter{i} refers to an element of the array quarter in the order listed.

Let's look at another example of a DATA step that contains an array with a DO loop.

The Health Center of a company conducts a fitness class for its employees. Each week, participants are weighed so that they can monitor their progress. The weight data, currently stored in kilograms, needs to be converted to pounds.


You can use a DO loop to update the variables Weight1 through Weight6 for each observation in the Hrd.Fitclass data set.

data hrd.convert;
   set hrd.fitclass;
   array wt{6} weight1-weight6;
   do i=1 to 6; 
      wt{i}=wt{i}*2.2046; 
   end;
run;

Compilation and Execution

To understand how the DO loop processes the array elements, let's examine the compilation and execution phases of this DATA step.

During compilation, the program data vector is created for the Hrd.Convert data set.


The DATA step is scanned for syntax errors. If there are any syntax errors in the ARRAY statement, they are detected at this time.

The index values of the array elements are assigned. Note that the array name and the array references are not included in the program data vector. The array name and array references exist only for the duration of the DATA step.

During the first iteration of the DATA step, the first observation in Hrd.Fitclass is read into the program data vector.

data hrd.convert;
   set hrd.fitclass;
   array wt{6} weight1-weight6;
   do i=1 to 6;
      wt{i}=wt{i}*2.2046;
   end;
run;

Because the ARRAY statement is a compile-time only statement, it is ignored during execution. The DO loop is executed next.

During the first iteration of the DO loop, the index variable i is set to 1. As a result, the array reference wt{i} becomes wt{1}. Because wt{1} refers to the first array element, Weight1, the value of Weight1 is converted from kilograms to pounds.

data hrd.convert;
   set hrd.fitclass;
   array wt{6} weight1-weight6;
   do i=1 to 6;
      wt{1}=wt{1}*2.2046; 
   end;
run;

Graphical Display of Array Processing

As the DATA step continues its DO loop iterations, the index variable i is changed from 1 to 2, 3, 4, 5, and 6, causing Weight1 through Weight6 to receive new values in the program data vector. Watch how the program data vector is built as you examine the process step by step below.

data hrd.convert;
   set hrd.fitclass;
   array wt{6} weight1-weight6;
   do i=1 to 6;
      wt{i}=wt{i}*2.2046;
   end;
run;

Using the DIM Function in an Iterative DO Statement

When using DO loops to process arrays, you can also use the DIM function to specify the TO clause of the iterative DO statement. For a one-dimensional array, specify the array name as the argument for the DIM function. The function returns the number of elements in the array.

In this example, dim(wt) returns a value of 6.

data hrd.convert;
   set hrd.fitclass;
   array wt{*} weight1-weight6;
   do i=1 to dim(wt);
      wt{i}=wt{i}*2.2046;
   end;
run;

When you use the DIM function, you do not have to re-specify the stop value of an iterative DO statement if you change the dimension of the array.


data hrd.convert;
   set hrd.fitclass;
   array wt{*} weight1-weight6;
   do i=1 to dim(wt);
      wt{i}=wt{i}*2.2046;
   end;
run;

data hrd.convert;
   set hrd.fitclass;
   array wt{*} weight1-weight10;
   do i=1 to dim(wt);
      wt{i}=wt{i}*2.2046;
   end;
run;

Expanding Your Use of Arrays

Creating Variables in an ARRAY Statement

So far, you have learned several ways to reference existing variables in an ARRAY statement. You can also create variables in an ARRAY statement by omitting the array elements from the statement. Because you are not referencing existing variables, SAS automatically creates the variables for you and assigns default names to them.

For example, suppose you need to calculate the weight gain or loss from week to week for each member of a fitness class, shown below.


You'd like to create variables that contain this weekly difference. To perform the calculation, you first group the variables Weight1 through Weight6 into an array.

data hrd.diff;
   set hrd.convert;
   array wt{6} weight1-weight6;

Next, you want to create the new variables to store the differences between the six recorded weights. You can use an additional ARRAY statement without elements to create the new variables.

data hrd.diff;
   set hrd.convert;
   array wt{6} weight1-weight6;
   array WgtDiff{5};

Remember, when creating variables in an ARRAY statement, you do not need to specify array elements as long as you specify how many elements will be in the array.

array WgtDiff{5};

Default Variable Names

The default variable names are created by concatenating the array name and the numbers 1, 2, 3, and so on, up to the array dimension.

array WgtDiff{5};

This statement creates the variables WgtDiff1, WgtDiff2, WgtDiff3, WgtDiff4, and WgtDiff5.

If you list variable names that do not already exist, SAS creates variables with those names instead of the default names. The following statement creates the variables Oct12, Oct19, Oct26, Nov02, and Nov09:

array WgtDiff{5} Oct12 Oct19 Oct26 Nov02 Nov09;

Arrays of Character Variables

To create an array of character variables, add a dollar sign ($) after the array dimension.

array firstname{5}  $;

By default, all character variables that are created in an ARRAY statement are assigned a length of 8. You can assign your own length by specifying the length after the dollar sign.

array firstname{5} $ 24;

The length that you specify is automatically assigned to all variables that are created by the ARRAY statement.
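
A small sketch, with made-up names and values, shows the effect of the length specification. With the default length of 8, the values below would be truncated.

```sas
data work.names;
   /* creates firstname1-firstname3, each with a length of 24 */
   array firstname{3} $ 24;
   firstname{1}='Alexandria';
   firstname{2}='Bartholomew';
   firstname{3}='Christopher';
run;
```

Because no elements are listed, SAS names the new character variables firstname1, firstname2, and firstname3.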

During the compilation of the DATA step, the variables that this ARRAY statement creates are added to the program data vector and are stored in the resulting data set.

data hrd.diff;
   set hrd.convert;
   array wt{6} Weight1-Weight6;
   array WgtDiff{5};

Now you can use a DO loop to calculate the differences between each of the recorded weights. Notice that each value of WgtDiff{i} is calculated by subtracting wt{i} from wt{i+1}. By manipulating the index variable, you can easily reference any array element.

data hrd.diff;
   set hrd.convert;
   array wt{6} weight1-weight6;
   array WgtDiff{5};
   do i=1 to 5; 
      wgtdiff{i}=wt{i+1}-wt{i}; 
   end;
run;

A portion of the resulting data set is shown below.


Assigning Initial Values to Arrays

Sometimes it is useful to assign initial values to elements of an array when you define the array.

array goal{4} g1 g2 g3 g4 (initial values);

To assign initial values in an ARRAY statement:

  1. Place the values after the array elements.

    array goal{4} g1 g2 g3 g4 (9000 9300 9600 9900);
  2. Specify one initial value for each corresponding array element.

  3. Separate each value with a comma or blank.

  4. Enclose the initial values in parentheses.

  5. Enclose each character value in quotation marks.

It's also possible to assign initial values to an array without specifying each array element. The following statement creates the variables Var1, Var2, Var3, and Var4, and assigns them initial values of 1, 2, 3, and 4:

array Var{4} (1 2 3 4);
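
As a minimal sketch, the step below combines numeric and character initial values. The data set name Work.Init, the array name State, and the state codes are assumptions chosen for illustration.

```sas
data work.init;
   /* creates Var1-Var4 with initial values 1, 2, 3, and 4 */
   array Var{4} (1 2 3 4);
   /* character values are quoted; $ 2 gives each variable a length of 2 */
   array State{3} $ 2 ('NC' 'SC' 'VA');
run;
```

This step reads no input data, so it executes once and writes a single observation that contains Var1 through Var4 and State1 through State3.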

For this example, assume that you have the task of comparing the actual sales figures in the Finance.Qsales data set to the sales goals for each sales representative at the beginning of the year. The sales goals are not recorded in Finance.Qsales.


The DATA step below reads the Finance.Qsales data set to create the Finance.Report data set. The ARRAY statement creates an array to process sales data for each quarter.

data finance.report;
   set finance.qsales;
   array sale{4} sales1-sales4;

To compare the actual sales to the sales goals, you must create the variables for the sales goals and assign values to them.

data finance.report;
   set finance.qsales;
   array sale{4} sales1-sales4;
   array Goal{4} (9000 9300 9600 9900);

A third ARRAY statement creates the variables Achieved1 through Achieved4 to store the comparison of actual sales versus sales goals.

data finance.report;
   set finance.qsales;
   array sale{4} sales1-sales4;
   array Goal{4} (9000 9300 9600 9900);
   array Achieved{4};
   do i=1 to 4;
      achieved{i}=100*sale{i}/goal{i};
   end;
run;

A DO loop executes four times to calculate the value of each element of the achieved array (expressed as a percentage).

data finance.report;
   set finance.qsales;
   array sale{4} sales1-sales4;
   array Goal{4} (9000 9300 9600 9900);
   array Achieved{4};
   do i=1 to 4; 
      achieved{i}=100*sale{i}/goal{i}; 
   end;
run;

Before submitting this DATA step, you can drop the index variable from the new data set by adding a DROP= option to the DATA statement.

data finance.report(drop=i);
   set finance.qsales;
   array sale{4} sales1-sales4;
   array Goal{4} (9000 9300 9600 9900);
   array Achieved{4};
   do i=1 to 4;
      achieved{i}=100*sale{i}/goal{i};
   end;
run;

This is an example of a simple table-lookup program. The resulting data set contains the variables that were read from Finance.Qsales, plus the eight variables that were created with ARRAY statements.


The variables Goal1 through Goal4 should not be stored in the data set, because they are needed only to calculate the values of Achieved1 through Achieved4. The next example shows you how to create temporary array elements.

Creating Temporary Array Elements

To create temporary array elements for DATA step processing without creating new variables, specify _TEMPORARY_ after the array name and dimension.

data finance.report(drop=i);
   set finance.qsales;
   array sale{4} sales1-sales4;
   array goal{4}  _temporary_ (9000 9300 9600 9900);
   array Achieved{4};
   do i=1 to 4;
      achieved{i}=100*sale{i}/goal{i};
   end;
run;

Temporary array elements do not appear in the resulting data set.


Temporary array elements are useful when the array is needed only to perform a calculation. You can also reduce processing time by using temporary array elements.
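
Temporary arrays can hold character values as well. The sketch below is one possible lookup that assigns a letter grade from a score. The data set name Work.Graded, the variables Score and Grade, and the cutoff values are all hypothetical.

```sas
data work.graded(drop=i);
   input Score;
   /* lookup values only; neither array adds variables to the data set */
   array cutoff{3} _temporary_ (90 80 70);
   array letter{3} $ 1 _temporary_ ('A' 'B' 'C');
   Grade='F';
   do i=1 to 3;
      /* stop at the first cutoff that the score reaches */
      if Score>=cutoff{i} then do;
         Grade=letter{i};
         leave;
      end;
   end;
   datalines;
95
85
72
40
;
run;
```

The resulting data set contains only Score and Grade, with the grades A, B, C, and F for the four observations.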

Understanding Multidimensional Arrays

So far, you have learned how to group variables into one-dimensional arrays. You can also group variables into table-like structures called multidimensional arrays. This section teaches you how to define and use two-dimensional arrays, which are a common type of multidimensional array.

Suppose you want to write a DATA step to compare responses on a quiz to the correct answers. As long as there is only one correct answer per question, this is a simple one-to-one comparison.

Resp1  →  Answer1
Resp2  →  Answer2
Resp3  →  Answer3
Resp4  →  Answer4

However, if there is more than one correct answer per question, you must compare each response to each possible correct answer in order to determine whether there is a match.

Resp1  →  Answer1  Answer2  Answer3
Resp2  →  Answer4  Answer5  Answer6
Resp3  →  Answer7  Answer8  Answer9
Resp4  →  Answer10  Answer11  Answer12

You can process the above data more easily by grouping the Answer variables into a two-dimensional array. Just as you can think of a one-dimensional array as a single row of variables, as in this example …

Answer1 Answer2 Answer3 Answer4 … Answer9 Answer10 Answer11 Answer12

… you can think of a two-dimensional array as multiple rows of variables.

Answer1  Answer2  Answer3
Answer4  Answer5  Answer6
Answer7  Answer8  Answer9
Answer10  Answer11  Answer12

Defining a Multidimensional Array

To define a multidimensional array, you specify the number of elements in each dimension, separated by a comma. This ARRAY statement defines a two-dimensional array:

array new{3,4} x1-x12;

In a two-dimensional array, the two dimensions can be thought of as a table of rows and columns.


The first dimension in the ARRAY statement specifies the number of rows.


The second dimension specifies the number of columns.


You can reference any element of the array by specifying the two dimensions. In the example below, you can perform an action on the variable x7 by specifying the array reference new(2,3). You can easily locate the array element in the table by finding the row (2), then the column (3).


When you define a two-dimensional array, the array elements are grouped in the order in which they are listed in the ARRAY statement. For example, the array elements x1 through x4 can be thought of as the first row of the table.

                        
array new{3,4} x1 x2 x3 x4 x5 x6 x7 x8 x9 x10 x11 x12;

The elements x5 through x8 become the second row of the table, and so on.

                        
array new{3,4} x1 x2 x3 x4 x5 x6 x7 x8 x9 x10 x11 x12;
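
To check how elements map to rows and columns, a sketch like the following can help. It also groups the same twelve variables into a one-dimensional array (called flat here, a name chosen for illustration) so that each variable can be given its own position number as a value. The data set name Work.Grid and the variable Check are likewise assumptions.

```sas
data work.grid(drop=k);
   /* two arrays can group the same variables */
   array flat{12} x1-x12;
   array new{3,4} x1-x12;
   /* store each variable's position in the list: x1=1, ..., x12=12 */
   do k=1 to 12;
      flat{k}=k;
   end;
   /* row 2, column 3 is position (2-1)*4+3=7, which is x7 */
   Check=new{2,3};
run;
```

In general, for this array the element in row r and column c is the variable at position (r-1)*4+c in the element list.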

Referencing Elements of a Two-Dimensional Array

Multidimensional arrays are typically used with nested DO loops. The next example uses a one-dimensional array, a two-dimensional array, and a nested DO loop to re-structure a set of variables.

Your company's sales figures are stored by month in the SAS data set Finance.Monthly. Your task is to generate a new data set of quarterly sales rather than monthly sales.


Defining the array m{4,3} puts the variables Month1 through Month12 into four groups of three months (yearly quarters).

data finance.quarters;
   set finance.monthly;
   array m{4,3} month1-month12;

Defining the array Qtr{4} creates the numeric variables Qtr1, Qtr2, Qtr3, Qtr4, which will be used to sum the sales for each quarter.

data finance.quarters;
   set finance.monthly;
   array m{4,3} month1-month12;
   array Qtr{4};

A nested DO loop is used to reference the values of the variables Month1 through Month12 and to calculate the values of Qtr1 through Qtr4. Because the variables i and j are used only for loop processing, the DROP= option is used to exclude them from the Finance.Quarters data set.

data finance.quarters(drop=i j);
   set finance.monthly;
   array m{4,3} month1-month12;
   array Qtr{4};
   do i=1 to 4;
      qtr{i}=0;
      do j=1 to 3;
         qtr{i}+m{i,j};
      end; 
   end;
run;

Each element in the Qtr array represents the sum of one row in the m array. The number of elements in the Qtr array should match the first dimension of the m array (that is, the number of rows in the m array). The first DO loop executes once for each of the four elements of the Qtr array.

The assignment statement, qtr{i}=0, resets the value of qtr{i} to zero at the start of each iteration of the outer DO loop, before the inner DO loop adds the monthly values. Without the assignment statement, the values of Qtr1, Qtr2, Qtr3, and Qtr4 would accumulate across iterations of the DATA step, because the sum statement qtr{i}+m{i,j} automatically retains the value of its accumulator variable from one observation to the next.

data finance.quarters(drop=i j);
   set finance.monthly;
   array m{4,3} month1-month12;
   array Qtr{4};
   do i=1 to 4;
      qtr{i}=0;
      do j=1 to 3;
         qtr{i}+m{i,j};
      end;
   end;
run;

The second DO loop executes the same number of times as the second dimension of the m array (that is, the number of columns in each row of the m array).

data finance.quarters(drop=i j);
   set finance.monthly;
   array m{4,3} month1-month12;
   array Qtr{4};
   do i=1 to 4;
      qtr{i}=0;
      do j=1 to 3;
         qtr{i}+m{i,j};
      end;
   end;
run;

To see how the nested DO loop processes these arrays, let's examine the execution of this DATA step.

When this DATA step is compiled, the program data vector is created. The PDV contains the variables Year, Month1 through Month12, and the new variables Qtr1 through Qtr4. (Only the beginning and ending portions of the program data vector are represented here.)

data finance.quarters(drop=i j);
   set finance.monthly;
   array m{4,3} month1-month12;
   array Qtr{4};
   do i=1 to 4;
      qtr{i}=0;
      do j=1 to 3;
         qtr{i}+m{i,j};
      end;
   end;
run;

During the first execution of the DATA step, the values of the first observation of Finance.Monthly are read into the program data vector. When the first DO loop executes the first time, the index variable i is set to 1.

data finance.quarters(drop=i j);
   set finance.monthly;
   array m{4,3} month1-month12;
   array Qtr{4};
   do i=1 to 4;            /* i=1 */
      qtr{i}=0;
      do j=1 to 3;
         qtr{i}+m{i,j};
      end;
   end;
run;

During the first iteration of the nested DO loop, the value of Month1, which is referenced by m{i,j}, is added to Qtr1.

data finance.quarters(drop=i j);
   set finance.monthly;
   array m{4,3} month1-month12;
   array Qtr{4};
   do i=1 to 4;            /* i=1 */
      qtr{i}=0;
      do j=1 to 3;         /* j=1 */
         qtr{1}+m{1,1};
      end;
   end;
run;

During the second iteration of the nested DO loop, the value of Month2, which is referenced by m{i,j}, is added to Qtr1.

data finance.quarters(drop=i j);
   set finance.monthly;
   array m{4,3} month1-month12;
   array Qtr{4};
   do i=1 to 4;            /* i=1 */
      qtr{i}=0;
      do j=1 to 3;         /* j=2 */
         qtr{1}+m{1,2};
      end;
   end;
run;

The nested DO loop continues to execute until the index variable j exceeds the stop value, 3. When the nested DO loop completes execution, the total sales for the first quarter, Qtr1, have been computed.

data finance.quarters(drop=i j);
   set finance.monthly;
   array m{4,3} month1-month12;
   array Qtr{4};
   do i=1 to 4;            /* i=1 */
      qtr{i}=0;
      do j=1 to 3;         /* j=3 */
         qtr{1}+m{1,3};
      end;
   end;
run;

The outer DO loop increments i to 2, and the process continues for the array element Qtr2 and the m array elements Month4 through Month6.

data finance.quarters(drop=i j);
   set finance.monthly;
   array m{4,3} month1-month12;
   array Qtr{4};
   do i=1 to 4;            /* i=2 */
      qtr{i}=0;
      do j=1 to 3;         /* j=1 */
         qtr{i}+m{i,j};
      end;
   end;
run;

After the outer DO loop completes execution, the end of the DATA step is reached, and the variable values for the first observation are written to the data set Finance.Quarters.

data finance.quarters(drop=i j);
   set finance.monthly;
   array m{4,3} month1-month12;
   array Qtr{4};
   do i=1 to 4;            /* i=5 (loop ends) */
      qtr{i}=0;
      do j=1 to 3;
         qtr{i}+m{i,j};
      end;
   end;
run;

What you have seen so far represents the first iteration of the DATA step. All observations in the data set Finance.Monthly are processed in the same manner. Below is a portion of the resulting data set, which contains the sales figures grouped by quarters.
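
As a variation on the program in this section, the sketch below reads one made-up observation from in-stream data instead of Finance.Monthly, and uses the DIM function with a second argument to get the size of each dimension instead of hard-coding 4 and 3. The data set name Work.Quarters and the data values are assumptions for illustration.

```sas
data work.quarters(drop=i j);
   input Year month1-month12;
   array m{4,3} month1-month12;
   array Qtr{4};
   /* dim(m,1) is the number of rows (4) */
   do i=1 to dim(m,1);
      qtr{i}=0;
      /* dim(m,2) is the number of columns (3) */
      do j=1 to dim(m,2);
         qtr{i}+m{i,j};
      end;
   end;
   datalines;
2001 1 2 3 4 5 6 7 8 9 10 11 12
;
run;
```

For this observation, Qtr1 through Qtr4 are 6, 15, 24, and 33, which are the sums of Month1-Month3, Month4-Month6, Month7-Month9, and Month10-Month12.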


Additional Features

You've seen a number of uses for arrays, including creating variables, performing repetitive calculations, and performing table lookups. You can also use arrays for rotating (transposing) a SAS data set.

When you rotate a SAS data set, you change variables to observations or observations to variables. For example, suppose you want to rotate the Finance.Funddrive data set to create four output observations from each input observation.


The following program rotates the data set and lists the first 16 observations in the new data set.
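
As a rough sketch of the technique, assuming each input observation holds four quarterly amounts in variables named Qtr1 through Qtr4, a rotation step might look like the following. The data set name Work.Rotate, the variable names LastName, Qtr, and Amount, the array name contrib, and the data values are hypothetical and are not the actual contents of Finance.Funddrive.

```sas
data work.rotate(drop=qtr1-qtr4);
   input LastName $ qtr1-qtr4;
   array contrib{4} qtr1-qtr4;
   /* write one output observation per quarter */
   do Qtr=1 to 4;
      Amount=contrib{Qtr};
      output;
   end;
   datalines;
ADAMS 18 18 20 20
JONES 15 18 15 10
;
run;

proc print data=work.rotate noobs;
run;
```

Each input observation produces four output observations, so the two input lines here yield eight observations that contain LastName, Qtr, and Amount.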


Chapter Summary

Text Summary

Purpose of SAS Arrays

An array is a temporary grouping of variables under a single name. This can reduce the number of statements needed to process variables and can simplify the maintenance of DATA step programs.

Defining an Array

To group previously defined data set variables into an array, use an ARRAY statement that specifies the array's name; its dimension enclosed in braces, brackets, or parentheses; and the elements to include. For example: array sales{4} qtr1 qtr2 qtr3 qtr4;

Variable Lists as Array Elements

You can use a variable list to specify array elements. Depending on the form of the variable list, it can specify all numeric or all character variables, or a numbered range of variables.

Referencing Elements of an Array

When you define an array in a DATA step, an index value is assigned to each element. During execution, you can use an array reference to perform actions on specific array elements. When used in a DO loop, for example, the index variable of the iterative DO statement can reference each element of the array.

The DIM Function

When using DO loops to process arrays, you can also use the DIM function to specify the TO clause of the iterative DO statement. When you use the DIM function, you do not have to re-specify the stop value of a DO statement if you change the dimension of the array.

Creating Variables with the ARRAY Statement

If you don't specify array elements in an ARRAY statement, SAS automatically creates the variables for you by concatenating the array name and the numbers 1, 2, 3 … up to the array dimension. To create an array of character variables, add a dollar sign ($) after the array dimension. By default, all character variables that are created with an ARRAY statement are assigned a length of 8; however, you can specify a different length after the dollar sign.

Assigning Initial Values to Arrays

To assign initial values in an ARRAY statement, place the values in parentheses after the array elements, specifying one initial value for each array element and separating each value with a comma or blank. To assign initial values to character variables, enclose each value in quotation marks.

Creating Temporary Array Elements

You can create temporary array elements for DATA step processing without creating additional variables. Just specify _TEMPORARY_ after the array name and dimension. This is useful when the array is needed only to perform a calculation.

Multidimensional Arrays

To define a multidimensional array, specify the number of elements in each dimension, separated by a comma. For example, array new{3,4} x1-x12; defines a two-dimensional array, with the first dimension specifying the number of rows (3) and the second dimension specifying the number of columns (4).

Referencing Elements of a Two-Dimensional Array

Multidimensional arrays are typically processed with nested DO loops, one loop for each dimension. Within the loops, you reference an element of a two-dimensional array by specifying both index values, separated by a comma, as in m{i,j}.
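
To make the row and column layout concrete, here is a minimal sketch built on the array new{3,4} x1-x12 definition mentioned above. The data set name work.grid and the index variables r and c are hypothetical, and the step assumes that x1-x12 do not already exist, so the ARRAY statement creates them.

```sas
data work.grid(drop=r c);
   /* Elements fill the array row by row:              */
   /* x1-x4 form row 1, x5-x8 row 2, and x9-x12 row 3. */
   array new{3,4} x1-x12;
   do r=1 to 3;
      do c=1 to 4;
         /* new{r,c} refers to the variable numbered 4*(r-1)+c, */
         /* so each variable receives its own position number.  */
         new{r,c}=4*(r-1)+c;
      end;
   end;
run;
```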

Rotating Data Sets

You can use arrays to rotate a data set. Rotating a data set changes variables to observations or observations to variables.
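
As one possible sketch of changing variables to observations, the step below writes one observation for each element of an array. It assumes the finance.qsales data set and the variables sales1-sales4 from the sample programs; the output data set work.rotated and the variables Quarter and Amount are hypothetical names.

```sas
data work.rotated(keep=Quarter Amount);
   set finance.qsales;
   array sale{4} sales1-sales4;
   do Quarter=1 to 4;
      Amount=sale{Quarter};
      /* The explicit OUTPUT statement writes one observation */
      /* for each array element instead of one per input row. */
      output;
   end;
run;
```

Each input observation produces four output observations, one for each quarter.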

Syntax

ARRAY array-name{dimension} <elements>;

array-name(index value)

DIM(array-name)

Sample Programs

data work.report(drop=i);
   set master.temps;
   array wkday{7} mon tue wed thr fri sat sun;
   do i=1 to 7;
      wkday{i}=5*(wkday{i}-32)/9;
   end;
run;

data hrd.convert(drop=i);
   set hrd.fitclass;
   array wt{6} weight1-weight6;
   do i=1 to dim(wt);
      wt{i}=wt{i}*2.2046;
   end;
run;

data hrd.diff(drop=i);
   set hrd.convert;
   array wt{6} weight1-weight6;
   array WgtDiff{5};
   do i=1 to 5;
      wgtdiff{i}=wt{i+1}-wt{i};
   end;
run;

data finance.report(drop=i);
   set finance.qsales;
   array sale{4} sales1-sales4;
   array goal{4} _temporary_ (9000 9300 9600 9900);
   array Achieved{4};
   do i=1 to 4;
      achieved{i}=100*sale{i}/goal{i};
   end;
run;

data finance.quarters(drop=i j);
   set finance.monthly;
   array m{4,3} month1-month12;
   array Qtr{4};
   do i=1 to 4;
      qtr{i}=0;
      do j=1 to 3;
         qtr{i}+m{i,j};
      end;
   end;
run;

Points to Remember

  • A SAS array exists only for the duration of the DATA step.

  • Do not give an array the same name as a variable in the same DATA step. Also, avoid using the name of a SAS function as an array name; the array will be correct, but you won't be able to use the function in the same DATA step, and a warning will be written to the SAS log.

  • You can indicate the dimension of a one-dimensional array with an asterisk (*) as long as you specify the elements of the array.

  • When referencing array elements, be careful not to confuse variable names with the array references. WgtDiff1 through WgtDiff5 is not the same as WgtDiff(1) through WgtDiff(5).
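
The difference between an array reference and a similar-looking variable name can be sketched as follows. The step assumes that hrd.convert contains weight1-weight6 and no variable named wt1; the data set name work.names and the variables FromArray and FromName are hypothetical.

```sas
data work.names;
   set hrd.convert;
   array wt{6} weight1-weight6;
   /* Array reference: wt{1} reads the variable weight1. */
   FromArray=wt{1};
   /* Variable name: wt1 is a separate variable, not an element of wt. */
   /* Under the stated assumption it is uninitialized, so FromName is  */
   /* missing and SAS writes a note about wt1 to the log.              */
   FromName=wt1;
run;
```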

Chapter Quiz

Select the best answer for each question. After completing the quiz, check your answers using the answer key in the appendix.

  1. Which statement is false regarding an ARRAY statement?

    1. It is an executable statement.

    2. It can be used to create variables.

    3. It must contain either all numeric or all character elements.

    4. It must be used to define an array before the array name can be referenced.

  2. What belongs within the braces of this ARRAY statement?

    array contrib{?} qtr1-qtr4;
    1. quarter
    2. quarter*
    3. 1-4
    4. 4
  3. For the program below, select an iterative DO statement to process all elements in the contrib array.

    data work.contrib;
       array contrib{4} qtr1-qtr4;
          
          contrib{i}=contrib{i}*1.25;
       end;
    run;
    1. do i=4;
    2. do i=1 to 4;
    3. do until i=4;
    4. do while i le 4;
  4. What is the value of the index variable that references Jul in the statements below?

    array quarter{4} Jan Apr Jul Oct;
    do i=1 to 4;
       yeargoal=quarter{i}*1.2;
    end;
    1. 1
    2. 2
    3. 3
    4. 4
  5. Which DO statement would not process all the elements in the factors array shown below?

    array factors{*} age height weight bloodpr;
    1. do i=1 to dim(factors);
    2. do i=1 to dim(*);
    3. do i=1,2,3,4;
    4. do i=1 to 4;
  6. Which statement below is false regarding the use of arrays to create variables?

    1. The variables are added to the program data vector during the compilation of the DATA step.

    2. You do not need to specify the array elements in the ARRAY statement.

    3. By default, all character variables are assigned a length of eight.

    4. Only character variables can be created.

  7. For the first observation, what is the value of diff{i} at the end of the second iteration of the DO loop?

    array wt{*} weight1-weight10;
    array diff{9};
    do i=1 to 9;
       diff{i}=wt{i+1}-wt{i};
    end;
    1. 15
    2. 10
    3. 8
    4. -7
  8. Finish the ARRAY statement below to create temporary array elements that have initial values of 9000, 9300, 9600, and 9900.

    array goal{4}   ;
    1. _temporary_ (9000 9300 9600 9900)
    2. temporary (9000 9300 9600 9900)
    3. _temporary_ 9000 9300 9600 9900
    4. (temporary) 9000 9300 9600 9900
  9. Based on the ARRAY statement below, select the array reference for the array element q50.

    array ques{3,25} q1-q75;
    1. ques{q50}
    2. ques{1,50}
    3. ques{2,25}
    4. ques{3,0}
  10. Select the ARRAY statement that defines the array in the following program.

    data rainwear.coat;
       input category high1-high3 / low1-low3;
       …
       do i=1 to 2;
          do j=1 to 3;
             compare{i,j}=round(compare{i,j}*1.12);
          end;
       end;
    run;
    1. array compare{1,6} high1-high3 low1-low3;
    2. array compare{2,3} high1-high3 low1-low3;
    3. array compare{3,2} high1-high3 low1-low3;
    4. array compare{3,3} high1-high3 low1-low3;