The SAS Macro Facility

SAS Processing of Macros

A SAS program can be any combination of the following elements:
  • DATA steps or PROC steps
  • global statements
  • Structured Query Language (SQL) code
  • SAS macro language code
When you submit a SAS program, the code is copied to a memory location called the input stack. The presence of text in the input stack triggers a component called the word scanner to begin its work.
Figure 7.1 SAS Processing with an Input Stack
The word scanner has two major functions. First, it pulls the raw text from the input stack character by character and separates it into components called tokens. Second, it sends each token to the compiler or the macro processor for processing. There are four types of tokens: name, number, special, and literal.
To build a token, the word scanner extracts characters until it reaches a delimiter, or until the next character does not meet the rules of the current token. A delimiter is any whitespace character such as a space, tab, or end-of-line character.
Name tokens consist of a maximum of 32 characters, must begin with a letter or underscore, and can include only letter, digit, and underscore characters.
Number tokens define a SAS floating-point numeric value. They can consist of digits, a decimal point, a leading sign, and an exponent indicator (e or E). Date, time, and datetime specifications also become number tokens (for example: '29APR2019'd, '14:05:32.1't, '29APR2019 14:05:32.1'dt).
Literal tokens consist of a string of any characters enclosed in single or double quotation marks. They can contain up to 32,767 characters and are handled as a single unit.
Special tokens are made up of any character or group of characters that have special meaning in the SAS language. Examples include * / + - ; ( ) . & %
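The following DATA step is a small sketch, not part of the original example set, that shows the four token types side by side. The data set name work.token_demo and its variable names are hypothetical, and the comments indicate how the word scanner would be expected to classify the tokens on each line under the rules above.

```sas
data work.token_demo;            /* name, name, special (.), name, special (;)           */
   cutoff = '29APR2019'd;        /* name, special (=), number (date constant), special   */
   ratio  = 1.5e2 / 3;           /* name, special, number, special (/), number, special  */
   note   = "MPG City Over 25";  /* name, special, literal, special                      */
run;                             /* name, special                                        */
```

The blanks between tokens are delimiters only; they are not tokens themselves.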
Knowing how tokenization works helps you understand how the various parts of SAS and the macro processor work together. Understanding differences in timing between macro processing and SAS code compilation and execution is especially important.

Tokenization

Between the input stack and the compiler, SAS programs are tokenized into smaller pieces.
  1. Tokens are passed on demand to the compiler.
  2. The compiler requests tokens until it receives a semicolon.
  3. The compiler performs a syntax check on the statement.
The following example illustrates how the input stack, word scanner, and compiler work together.
title "MPG City Over 25";
proc print data=sashelp.cars noobs;
   var Make Model Type MPG_City MPG_Highway MSRP;
   where MPG_City>25;
run;
Figure 7.2 The First Step in the Tokenization Process
When the code is copied to the input stack, the word scanner retrieves one character at a time until it reaches the first delimiter, a blank. When TITLE is recognized as a name token, the word scanner tags it and passes it to the compiler.
Figure 7.3 The Tokenization Process, continued
The word scanner tags the double quotation mark as the start of a literal token.
Figure 7.4 The Tokenization Process, continued
It then retrieves, tokenizes, and holds additional text until it retrieves another double quotation mark. It passes the text as a single literal token to the compiler, and then tokenization continues. The semicolon is a special token, and the end-of-line character is a delimiter.
Figure 7.5 The Tokenization Process, continued
The semicolon is sent to the compiler, ending the TITLE statement. The compiler checks the syntax, and because TITLE is a global statement, it is executed immediately. The tokenization process continues with the PROC PRINT step. The compiler performs a syntax check at the end of each statement.
Figure 7.6 The Tokenization Process, concluded
The PROC PRINT step executes when the compiler encounters a step boundary, in this case the RUN statement.
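RUN is not the only step boundary. As a sketch that reuses the sashelp.cars data from the example above, the variation below omits the first RUN statement. The PROC MEANS statement (chosen here only for illustration) then acts as the step boundary for the PROC PRINT step.

```sas
title "MPG City Over 25";         /* global statement: executed as soon as it is compiled */
proc print data=sashelp.cars noobs;
   var Make Model MPG_City;
   where MPG_City>25;
                                  /* no RUN here: the PROC PRINT step is not executed yet */
proc means data=sashelp.cars;     /* a new PROC statement is a step boundary,             */
   var MPG_City;                  /* so PROC PRINT executes when this statement is read   */
run;                              /* step boundary for the PROC MEANS step                */
```

Because the TITLE statement is global and has already executed, the same title appears on the output of both steps.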

Macro Triggers

The macro facility includes a macro processor that is responsible for handling all macro language elements. Certain token sequences, known as macro triggers, alert the word scanner that the subsequent code should be sent to the macro processor.
The word scanner recognizes the following token sequences as macro triggers:
  • % followed immediately by a name token (such as %LET)
  • & followed immediately by a name token (such as &AMT)
When a macro trigger is detected, the word scanner passes it to the macro processor for evaluation. Here is the sequence that the macro processor follows:
  • It examines these tokens.
  • It requests additional tokens as necessary.
  • It performs the action indicated.
For macro variables, the processor does one of the following:
  • It creates a macro variable in the global symbol table and assigns a value to the variable.
  • It changes the value of an existing macro variable in the global symbol table.
  • It looks up an existing macro variable in the global symbol table and returns the variable's value to the input stack in place of the original reference.
The word scanner then resumes processing tokens from the input stack.
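The three macro variable actions listed above can be sketched as follows. The variable name AMT comes from the trigger example earlier in this section; the values 25 and 30 are assumptions chosen for illustration.

```sas
%let amt=25;                     /* creates AMT in the global symbol table             */
%let amt=30;                     /* changes the value of the existing variable AMT     */
%put The value of amt is &amt;   /* looks up AMT; the log shows: The value of amt is 30 */

proc print data=sashelp.cars noobs;
   var Make Model MPG_City;
   where MPG_City>&amt;          /* the compiler receives: where MPG_City>30;          */
run;
```

In each case the macro processor finishes its work before the compiler sees the affected text, so the compiler never receives the %LET statements or the &amt reference itself.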
Note: The word scanner does not recognize macro triggers that are enclosed in single quotation marks. Remember that if you need to reference a macro variable within a literal token, such as the title text in a TITLE statement, you must enclose the text string in double quotation marks or else the macro variable reference is not resolved.
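As a minimal sketch of the note above, the two titles below differ only in their quotation marks. The macro variable AMT and its value are assumed for illustration.

```sas
%let amt=25;
title  'MPG City Over &amt';   /* single quotation marks: the title stays MPG City Over &amt */
title2 "MPG City Over &amt";   /* double quotation marks: the title becomes MPG City Over 25 */
proc print data=sashelp.cars noobs;
   var Make Model MPG_City;
   where MPG_City>&amt;
run;
title;                         /* clears the titles so they do not carry into later steps   */
```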
Last updated: October 16, 2019