When you submit a DATA step, SAS processes the DATA step and creates a new SAS data set. A SAS DATA step is processed in two phases: the compilation phase and the execution phase.

When you submit a DATA step for execution, SAS first checks the syntax of the SAS statements and compiles them. During this compilation phase, SAS identifies the type and length of each new variable and determines whether a variable type conversion is necessary for each subsequent reference to a variable. SAS also creates the following items:
- input buffer
- program data vector (PDV)
- descriptor information

When the compilation phase is complete, the descriptor portion of the new data set is created.
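
The compile-time decisions described above can be seen in a small sketch. The data set name work.sizes and the variables score, size, code, and doubled are hypothetical, chosen only for illustration. The example assumes the usual behavior in which a character variable takes its length from the first statement that references it.

```sas
/* Hypothetical data set and variable names, for illustration only. */
data work.sizes;
   input score;
   /* First reference to SIZE: at compile time it becomes a character
      variable of length 5, taken from the literal 'small'. */
   if score < 50 then size = 'small';
   /* The length is already fixed at 5, so this longer value is
      stored truncated when the statement executes. */
   else size = 'very large';
   /* CODE is character, so SAS plans an automatic character-to-numeric
      conversion for the expression below and reports it in the log. */
   code = '7';
   doubled = code * 2;
   datalines;
10
90
;
run;

/* Display the descriptor portion: variable names, types, and lengths. */
proc contents data=work.sizes;
run;
```

Placing a LENGTH statement for size before the IF statement is one way to avoid the truncation, because the length is then set before the first assignment is compiled.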

By default, a simple DATA step iterates once for each observation that is being created. The flow of action in the execution phase of a simple DATA step is as follows:
- The DATA step begins with a DATA statement. Each time the DATA statement executes, a new iteration of the DATA step begins, and the _N_ automatic variable is incremented by 1. The _N_ automatic variable represents the number of times the DATA step has iterated.
- SAS sets the newly created program variables to missing in the program data vector (PDV).
- SAS reads a data record from a raw data file into the input buffer, or it reads an observation from a SAS data set directly into the program data vector. You can use an INPUT, MERGE, SET, MODIFY, or UPDATE statement to read a record.
- SAS executes any subsequent programming statements for the current record and updates the PDV.
- When SAS executes the last statement in the DATA step, all values in the PDV, except those marked to be dropped, are written as a single observation to the data set. Note that variables that you read with a SET, MERGE, MODIFY, or UPDATE statement are not reset to missing here.
- SAS counts another iteration, reads the next record or observation, and executes the subsequent programming statements for the current observation.
- The DATA step terminates when SAS encounters the end-of-file in a SAS data set or a raw data file.
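
The flow above can be traced in the log with a minimal sketch that writes the contents of the PDV at the top and bottom of each iteration. The data set name work.totals and the variables x, y, and total are hypothetical. The PUT statements are added only for tracing and are not part of the flow itself.

```sas
/* Hypothetical data set and variable names, for illustration only. */
data work.totals;
   /* Top of the iteration: _N_ has been incremented, and the program
      variables are expected to be missing at this point. */
   put 'Top of iteration:';
   put _all_;

   /* Read one raw data record through the input buffer into the PDV.
      When no records remain, the DATA step stops here. */
   input x y;

   /* Subsequent programming statement: updates the PDV. */
   total = x + y;

   /* Bottom of the iteration: these values are written to the data
      set as one observation by the implicit output at the end. */
   put 'Bottom of iteration:';
   put _all_;
   datalines;
1 2
3 4
;
run;
```

With two data lines, the step is expected to start a third iteration and end at the INPUT statement, so the top trace should appear once more than the bottom trace, and work.totals should contain two observations.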