CHAPTER 2

SAS Introduction

Chapter 1 introduced SAS, one of the most widely used tools in analytics. SAS is a software suite that can retrieve data from a variety of data sources. It can help you clean the data and perform statistical operations on it. For nontechnical users, it also provides a graphical user interface (GUI) to perform various analytics operations. The core of SAS is its programming language, which is used by most analysts. It provides more advanced data handling and analytical capabilities than the GUI. The SAS programming language, also known as the SAS scripting language, is generally considered easier to learn than general-purpose programming languages such as FORTRAN, C, and Java.

In this book, you will work with the SAS programming language, not the GUI, but you are encouraged to explore the GUI on your own. This book is not intended to provide in-depth coverage of the SAS programming language; only Base SAS and the procedures required for analytics will be covered here.

SAS was originally used in statistics applications for agriculture projects. Now it’s used in the following industries: casinos, communications, education, financial services, government, health insurance, healthcare providers, hotels, insurance, life sciences, manufacturing, media, oil and gas, retail, travel, transportation, utilities, and many more.

Next you will learn some simple SAS programming steps.

Starting SAS in Windows

In Windows, you can start SAS by selecting Start > Programs, as you would with any other software. The SAS folder in the Start menu lists the SAS-related software you have installed. Click Base SAS (that is, SAS 9.1, SAS 9.2, or SAS 9.3, depending on the version installed). Figure 2-1 shows how to start SAS in Windows.

9781484200445_Fig02-01.jpg

Figure 2-1. Starting SAS in Windows

Sometimes you may get the error shown in Figure 2-2. This error appears when your SAS license has expired, in which case you need to renew it by buying a renewal license from SAS. SAS will then provide a new SAS installation data (SID) file, which restores normal operation.

9781484200445_Fig02-02.jpg

Figure 2-2. License expiration error in SAS

The SAS Opening Screen

When you open SAS in Windows, your screen should, in most cases, look like the environment shown in Figure 2-3, which shows a Microsoft Windows SAS session. The layout may differ on other operating systems, and some windows or icons might be hidden depending on your installation and settings.

9781484200445_Fig02-03.jpg

Figure 2-3. A typical SAS Windows environment

The Five Main Windows

After opening SAS, you will see many icons and windows. There are five main windows in the SAS environment, as shown in Figure 2-4.

9781484200445_Fig02-04.jpg

Figure 2-4. Windows in the SAS environment

The five main SAS windows are the Editor window, Log window, Output window, Explorer window, and Results window. If you can’t find any of these windows, you can make them visible by using the View option from the top menu bar.

Editor Window

The Editor window is used for writing SAS scripts that will be used in data modeling and analysis. It’s like any other programming text editor. The editor is syntax sensitive and color codes your SAS scripts so that reading the program or identifying errors in the program is easy.

To execute the code, either click the Submit icon at the top or select Run > Submit from the top menu bar (Figure 2-5).

9781484200445_Fig02-05.jpg

Figure 2-5. Submitting a SAS program

Figure 2-6 shows the Editor window with some sample code. This code prints the prdsale table from the SAS help library. (This will be explained in detail later in this book.)

proc print data=sashelp.prdsale;
run;

9781484200445_Fig02-06.jpg

Figure 2-6. The SAS Editor window

This SAS code is saved with the .sas extension, which is the usual extension for SAS program files. You can open these files with SAS or any text editor.

Log Window

The Log window is used for debugging. Any observations or debugging suggestions from the SAS package appear in the Log window. Specifically, the programming statements, notes, errors, or warnings associated with your program will appear in this window. Usually errors in the code are highlighted in red along with an explanation.

Figure 2-7 shows the Log window when the following program is run. There is no syntax error in this code.

proc print data=sashelp.prdsale;
run;

9781484200445_Fig02-07.jpg

Figure 2-7. SAS Log window

Figure 2-8 shows the Log window with a syntax error intentionally introduced in the code.

proc printing data=sashelp.prdsale;
run;

9781484200445_Fig02-08.jpg

Figure 2-8. Syntax error in SAS code

Note that the error correctly indicates that the procedure name is misspelled.

Figure 2-9 shows the Log window with a non-syntax-related error in the code.

proc print data=sashelp.prdsales;
run;

9781484200445_Fig02-09.jpg

Figure 2-9. A typical non-syntax-related error

The error here correctly identifies that the data file name is misspelled and it does not exist in the SAS help library.

If you save the contents of the Log window, the file is saved with the .log extension. Within a session, the Log window accumulates messages, so you might see information from earlier runs alongside your current log. You can press Ctrl+E to clear the Log window.
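
The same cleanup can be scripted. As a minimal sketch, assuming you are working in the SAS windowing environment described in this chapter, the DM statement submits display manager commands from code; here it clears the Log window before the next run:

* Clear the Log window from code (windowing environment only);
dm 'log; clear;';

Placing this line at the top of a program is one way to start each run with an empty log, though it is a convenience rather than a requirement.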

Output Window

The actual program output, like print data or output data from any SAS procedure, is shown in the Output window. Only the printable output from your program will appear in this window. If the program doesn’t generate any output or if there is any syntax error in the code that caused the SAS system to stop abruptly, then the Output window might be blank.

Figure 2-10 shows the Output window when you run the following code:

proc print data=sashelp.prdsale;
run;

9781484200445_Fig02-10.jpg

Figure 2-10. SAS Output window

This code prints the prdsale data set, and the Output window shows the records of the table. The output shows that the table contains product sales data with fields such as actual, predict, country, and division.

The default output is in listing format. If you save this output, it is saved with the .lst extension. However, for SAS 9.3 and newer, HTML is the default option.

HTML Output

In versions before SAS 9.3, SAS generates listing output by default, but HTML output presents results in a more readable, formatted way. HTML files are generally easier to navigate than listing files, and their contents can be copied and pasted into other applications such as Excel and PowerPoint. You can use HTML options in the code to create HTML output, or you can set the SAS preferences to create HTML output for every program execution.
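
As a minimal sketch of the code-based route, the ODS HTML statement opens an HTML destination and ODS HTML CLOSE closes it. The file name prdsale_report.html is a hypothetical name chosen for illustration; without a path, where the file is written depends on your session settings.

* Route procedure output to an HTML file (file name is illustrative);
ods html file='prdsale_report.html';
proc print data=sashelp.prdsale;
run;
ods html close;

Everything printed between the two ODS statements goes to the HTML file. In SAS 9.3 and newer, closing the destination may also close the default HTML output, in which case submitting ods html; again reopens it.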

Here are the steps to set HTML output creation from the SAS menu environment:

  1. Select Tools > Options > Preferences (Figure 2-11).

    9781484200445_Fig02-11.jpg

    Figure 2-11. Selecting Preferences

  2. In the Preferences window, select the Results tab and check Create HTML (Figure 2-12).

    9781484200445_Fig02-12.jpg

    Figure 2-12. Selecting Create HTML

Figure 2-13 shows the Preferences dialog box after checking the Create HTML option.

9781484200445_Fig02-13.jpg

Figure 2-13. After selecting Create HTML

The HTML output type gives you several styles to choose from, so you can pick one that suits your report and your business. HTML and listing output differ only in format; the content is the same. In this book, HTML is set as the default output format. Figure 2-14 shows the HTML output for the print code from earlier.

9781484200445_Fig02-14.jpg

Figure 2-14. Typical HTML output

Explorer Window

The Explorer window serves as an easy access point for all your files and libraries (see Figure 2-15). Libraries are the objects where data sets and other SAS files are stored. You will be mostly visiting the Explorer window to see various libraries and the files inside.

9781484200445_Fig02-15.jpg

Figure 2-15. SAS Explorer window
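
To inspect a data set from code rather than through the Explorer window, one option is PROC CONTENTS, which reports the variables and attributes of a data set. Here is a minimal sketch using the prdsale data set from the sashelp library:

* List the variables and attributes of sashelp.prdsale;
proc contents data=sashelp.prdsale;
run;

Checking variable names this way can help you avoid misspelling them in later code.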

Results Window

The Results window serves as a table of contents (TOC) for the Output window, listing each part of your results in an outline form. The Results window shows both listing and HTML output files as well as the cumulative tree of results that are run in the current session (see Figure 2-16).

9781484200445_Fig02-16.jpg

Figure 2-16. SAS Results window

The Explorer and Results windows are shown on the left side of the GUI, one below the other. You can toggle between the Explorer and Results windows by clicking Explorer or Results on the bottom taskbar.

Important Menu Options and Icons

This section discusses some important SAS menu options, such as creating, opening, and saving SAS program files, which you will use in your day-to-day work in the SAS environment.

To create a new SAS program file, select File > New Program from the top menu bar, as shown in Figure 2-17.

9781484200445_Fig02-17.jpg

Figure 2-17. Menu option to create a new program file

To open an existing SAS program file, select File > Open Program from the top menu bar, as shown in Figure 2-18.

9781484200445_Fig02-18.jpg

Figure 2-18. Opening a program file

To save a SAS program file, select File > Save or File > Save As from the top menu bar, as shown in Figure 2-19.

9781484200445_Fig02-19.jpg

Figure 2-19. Saving a program file

View Options

Sometimes you can’t find the window you are looking for or maybe you closed one of the windows by mistake. You can use the View menu option to bring them back. Just click the View option on the top menu bar and open the desired window, as shown in Figure 2-20.

9781484200445_Fig02-20.jpg

Figure 2-20. View options

Run Menu

The Run menu is used for submitting the whole or a selected portion of the SAS program, as shown in Figure 2-21.

9781484200445_Fig02-21.jpg

Figure 2-21. Submitting a SAS program

Solutions Menu

The Solutions menu gives you access to various customized SAS solutions depending on the software options you have installed, as shown in Figure 2-22. In this book, you will be using only the simple Base SAS scripts; no customized solutions are used.

9781484200445_Fig02-22.jpg

Figure 2-22. SAS Solutions options

You have now seen a small SAS print program and all the important windows that will matter to you going forward with analytics using SAS.

Shortcut Icons

Figure 2-23 shows some useful shortcut icons.

9781484200445_Fig02-23.jpg

Figure 2-23. Convenient shortcut icons

Here are the most useful shortcuts:

  • New: Starts a new program
  • Open: Opens an existing code file
  • Save: Saves the current code file
  • Submit: Submits the code
  • Break: Stops a submitted code execution
  • Help: Gives help on SAS syntax and options

Writing and Executing a SAS Program

To start writing a SAS program, open a new Editor window. SAS programming scripts are nothing but a sequence of statements. SAS code contains statements, expressions, functions and call routines, options, and formats. SAS code is easy to write when compared to other programming languages. It is a simplified programming language with built-in programs known as SAS procedures (prewritten code). These procedures have already been written and tested in SAS. You just need to call the right procedure at the right place in your SAS script. For example, if you want to find the average of a variable, then there is no need to write the code to compute the average. Just call the SAS procedure Proc Means, which calculates the average for you. There are a few rules and some user-friendly features in SAS programs that you should know.

  • SAS statements end with a semicolon.
  • SAS statements can begin and end anywhere in a line.
  • One statement can continue over several lines.
  • Several statements can be on a line.
  • SAS statements are not case sensitive.
  • Blanks or special characters separate the “words” in a SAS statement.
  • SAS steps end with a run or quit statement, which prompts SAS to start executing the code.

Here is the code for finding the average of the actual (actual sales) variable in the prdsale data set of the sashelp library:

proc means data =sashelp.prdsale;
var actual;
run;

This same code can be written as shown here (all these code samples will execute without any error):

proc means data =sashelp.prdsale;
var actual;
run;

proc
means data =sashelp.prdsale;
var actual;
run;

proc means data =sashelp.prdsale;var actual;run;

PROC MEANS Data =sashelp.prdsale;
var actual;
Run;

In this book, you will be writing simple SAS code scripts to do all the important tasks in an analytics project, such as import the data, clean the data, analyze the data, and report the results.

Comments in the Code

If you have done even a bit of programming, you know that comments document the code: they explain its logic or flow, or simply remind you of something later. You can write comments in SAS scripts in the same way. Comments remain useful even though most SAS scripts are short compared to conventional COBOL or Java code, which can be pages long.

There are two styles of comments you can use.

  • One starts with an asterisk (*) and ends with a semicolon (;). Here’s an example:
* This script does logistic regression for the price table data;
  • The other style starts with a slash asterisk (/*) and ends with an asterisk slash (*/). Here is an example:
/* This script does logistic regression for the price table data */

The following is some sample code with comments.

*program to find the mean of actual sales;
PROC MEANS Data =sashelp.prdsale;
var actual;
Run;

The previous is the same as the following:

/*The program below illustrates how to write SAS code that finds the average actual sales.
The data set used here is prdsale, which is in the sashelp library.
MEANS is the procedure that finds the averages.
The VAR statement is used to specify the variables.*/

PROC MEANS Data =sashelp.prdsale;
var actual;
Run;

Your First SAS Program

Type the code shown next in your Editor window. This code prints the prdsale data set from the sashelp library. (Chapter 3 discusses libraries.) You can think of sashelp.prdsale as a table in a database, where sashelp is the database and prdsale is a table in it.

proc print data=sashelp.prdsale;
run;

The run statement at the end tells SAS to start processing the preceding statements. To submit the code, use either the Submit icon or the Run > Submit menu option. It is a good habit to select the code first and then execute it; if you submit without selecting anything, SAS executes all the code in the Editor window. When you execute this code, you will see information in both the Output window and the Log window. In this case, the log shows no errors, and the output shows the data values in the prdsale table (data set), as shown in Table 2-1.

Table 2-1. Data Values in the prdsale Table

Tab1

Similarly, you can use the following code to find the average of the actual variable:

proc means data =sashelp.prdsale;
var actual;
run;

When you execute the previous code, the log file shows no error, and you get the output shown in Table 2-2.

Table 2-2. Output of SAS Means Procedure for prdsale

Tab2

Here is one more code example:

data income_data;
Input income expenses;
Cards;
1200 1000
9000 600
;
run;
Proc print data=income_data;
Run;

This generates the output shown in Table 2-3.

Table 2-3. Print of income_data

Obs    income    expenses
1      1200      1000
2      9000      600

Debugging SAS Code Using a Log File

Reading the log file is an important part of SAS program execution. New users tend to write the code and look directly at the output, but looking at the log file is equally important. The log file has three main notification types: errors, warnings, and notes.

  • An error means that there is a syntax or other error. Sometimes an error can stop a script from executing.
  • A warning indicates a potential problem that did not stop execution. In some cases SAS has guessed at a correction for your error and executed the code anyway. This is dangerous when SAS guesses wrongly.
  • Notes give a running commentary of important steps while executing programs.
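
As a hedged illustration of the warning case, the following step misspells the DATA= option as dat=. In many SAS versions the step still runs, and the log carries a warning saying that SAS assumed DATA was misspelled; the exact wording and behavior may vary by version, so check your own log.

* dat= is a deliberate misspelling of data=;
proc print dat=sashelp.prdsale;
run;

Here the guess happens to match the intent, but the same mechanism can silently run something you did not mean, which is why warnings deserve a careful read.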

The log file for the SAS code used to print prdsale gives the following information:

31   proc print data=sashelp.prdsale;
32   run;

NOTE: Writing HTML Body file: sashtml8.htm
NOTE: There were 1440 observations read from the data set SASHELP.PRDSALE.
NOTE: PROCEDURE PRINT used (Total process time):
      real time           3.85 seconds
      cpu time            3.79 seconds

The previous message from the log file shows that there is no sign of an error in the code and that 1,440 observations were read from the data for printing.
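
Printing all 1,440 observations is rarely necessary when you only want a quick look at the data. As a sketch, the OBS= data set option limits how many observations the procedure reads; the value 10 here is an arbitrary choice:

* Print only the first 10 observations of the data set;
proc print data=sashelp.prdsale (obs=10);
run;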

The log file for the SAS code used to find the mean gives the following information. The note in the log file shows 1,440 observations were read from the data set.

33   proc means data =sashelp.prdsale;
34   var actual;
35   run;

NOTE: Writing HTML Body file: sashtml9.htm
NOTE: There were 1440 observations read from the data set SASHELP.PRDSALE.
NOTE: PROCEDURE MEANS used (Total process time):
      real time           0.25 seconds
      cpu time            0.09 seconds

Again, there are no errors, and the means procedure ran successfully.

The following is the log file for the third code snippet, which was used to create and print income_data:

53   data income_data;
54   Input income expenses;
55   Cards;

NOTE: The data set WORK.INCOME_DATA has 2 observations and 2 variables.
NOTE: DATA statement used (Total process time):
      real time           0.00 seconds
      cpu time            0.00 seconds

58   ;
59   run;
60   Proc print data=income_data;
61   Run;

NOTE: Writing HTML Body file: sashtml11.htm
NOTE: There were 2 observations read from the data set WORK.INCOME_DATA.
NOTE: PROCEDURE PRINT used (Total process time):
      real time             0.29 seconds
      cpu time              0.18 seconds

The log file doesn’t show any errors. If you deliberately write some incorrect syntax and submit it, you see an error in the log file.

The following code generates the subsequent message in the log file.

proc dataprinting data=sashelp.prdsale;
run;

40   proc dataprinting data=sashelp.prdsale;
ERROR: Procedure DATAPRINTING not found.
41   run;

NOTE: The SAS System stopped processing this step because of errors.
NOTE: PROCEDURE DATAPRINTING used (Total process time):
      real time           0.01 seconds
      cpu time            0.00 seconds

The error is clearly mentioned.

The following is one more example of an error:

proc print data=sashelp.prdsales;
run;

42   proc print data=sashelp.prdsales;
ERROR: File SASHELP.PRDSALES.DATA does not exist.
43   run;

NOTE: The SAS System stopped processing this step because of errors.
NOTE: PROCEDURE PRINT used (Total process time):
      real time           0.03 seconds
      cpu time            0.01 seconds

Example of Warnings in the Log File

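The following code copies observations from the prdsale data set in the SAS help library into a new data set, keeping only those with sales below 1,000. Note that the variable name is misspelled as actuals; the variable in this data set is actual.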

data new_data;
set sashelp.prdsale;
where actuals<1000;
run;

The following is the log file for the previous code:

14   data new_data;
15   set sashelp.prdsale;
16   where actuals<1000;
ERROR: Variable actuals is not on file SASHELP.PRDSALE.
17   run;

NOTE: The SAS System stopped processing this step because of errors.
WARNING: The data set WORK.NEW_DATA may be incomplete.  When this step was stopped there were
         0 observations and 10 variables.
WARNING: Data set WORK.NEW_DATA was not replaced because this step was stopped.
NOTE: DATA statement used (Total process time):
      real time           0.07 seconds
      cpu time            0.03 seconds

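The error stops the step. The warnings show that WORK.NEW_DATA may be incomplete, with 10 variables and 0 observations at the point where the step stopped, and that any existing WORK.NEW_DATA data set was not replaced.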
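
A corrected version, assuming the intended variable is actual (the name used for this data set earlier in the chapter), might look like this:

data new_data;
set sashelp.prdsale;
* Keep only the rows with actual sales below 1000;
where actual<1000;
run;

With the variable name corrected, the step should run without the error shown in the log above, and the notes in the log should report how many observations were written to WORK.NEW_DATA.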

Tips for Writing, Reading the Log File, and Debugging

Here are a few tips for writing and debugging the SAS programs, meant for beginners:

  • Try to keep the log file clean by erasing it after code execution. Press Ctrl+E to clean the log file.
  • Do not directly go to output, even when the code executes successfully. Make it a habit to see the log file first and then move to the output file.
  • Always start from the first error at the top of the log. SAS generally shows the last few lines of the log, but correcting the first few errors often makes the later ones disappear, depending on the type of errors your code contains.
  • Understand the notifications; there are three kinds of notifications in log files: errors, warnings, and notes.
  • Try to avoid these most common errors.
    • Missing semicolons
    • Missing semicolon at the beginning or earlier in the code, which may create an error in a much later statement
    • Missing run statement
    • Misspelling keywords and data set names
    • Unbalanced quotation marks and parentheses
  • Always write the code explanations in the form of comments.
  • Instead of executing the whole SAS program in one go, try to execute the code snippets one after the other. That way it will be easier for you to pinpoint the error.
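
One more technique that pairs well with these tips is writing values to the log yourself. The sketch below, which goes slightly beyond what this chapter covers, uses a DATA _NULL_ step (a step that creates no data set) and a PUT statement to write the actual and predict values of the first five observations to the log:

* Write a few values to the log without creating a data set;
data _null_;
set sashelp.prdsale (obs=5);
put actual= predict=;
run;

Seeing the values in the log next to the notes and errors can make it easier to confirm what a step is actually reading.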

Saving SAS Files

You can use the Save or Save as option in the File menu, which will prompt you to save the SAS program file in the desired location, as shown in Figure 2-24.

9781484200445_Fig02-24.jpg

Figure 2-24. Saving SAS files

Exercise

Here are some exercises to help you become more familiar with reading log files.

  1. Write and execute the following code. Look at the log file and identify the errors, if any.
    proc print data=sashelp.airr;run;
  2. Save your SAS file and open it using the File menu.
  3. Write and execute the following code. Look at the log file and identify the errors, if any.
    proc print data=sashelp.buy;run;

Conclusion

This chapter introduced you to the SAS programming environment. It discussed navigation in the SAS Windows environment, various menu options, and some shortcut icons. It also discussed writing simple SAS code and reading log files. In the next few chapters, you will get into more details of SAS programming. Later, this programming knowledge will help you in analysis, where you interact with SAS by writing programs to analyze the data.
