Chapter 6: Reading and Writing Data from an Excel Spreadsheet

6.1  Introduction

6.2  Using the Import Wizard to Convert a Spreadsheet to a SAS Data Set

6.3  Creating an Excel Spreadsheet from a SAS Data Set

6.4  Using an Engine to Read an Excel Spreadsheet

6.5  Using the SAS Output Delivery System to Convert a SAS Data Set to an Excel Spreadsheet

6.6  A Quick Look at the Import Utility in SAS Studio

6.7  Problems

 

6.1  Introduction

It is quite common to be given a Microsoft Office Excel spreadsheet as your data source. Luckily, SAS has several methods to easily convert a spreadsheet into a SAS data set. One way is to convert the spreadsheet into a comma-separated values (CSV) file and to read the file using INFILE statements (using the DSD option—see Chapter 3 for more information) and INPUT statements. However, if you have licensed SAS/ACCESS Interface to PC Files, or you are using SAS Enterprise Guide or SAS Studio (perhaps with the University Edition), you can have SAS do the conversion automatically.

6.2  Using the Import Wizard to Convert a Spreadsheet to a SAS Data Set

The workbook used in the next example was created using Microsoft Excel. The Wage column (E) was computed by multiplying the Hours Worked column (C) by the Rate Per Hour column (D). Notice that the column headings are not valid SAS variable names. In addition, this workbook has two worksheets: one named Temporary (for temporary workers) and the other named Permanent (for permanent employees).

Figure 6.1: Excel Workbook Wages

Figure 6.1: Excel Workbook Wages

The first step is to select Import Data from the SAS File menu, as shown here:

Figure 6.2: Select Import Data

Figure 6.2: Select Import Data

This brings up a screen where you can select from a variety of formats (Excel, Access, dBase, Lotus, and several others). Select Microsoft Excel from the pull-down menu.

Figure 6.3: Select Excel

Figure 6.3: Select Excel

Click Next to bring up the next screen. Here you can either type in the name for the Excel file you want to read, or select Browse to obtain a list of the Excel spreadsheets on your system.

Figure 6.4: Select Next and Browse for Your File

Figure 6.4: Select Next and Browse for Your File

If you have multiple worksheets, you can select the one you want to import. The default worksheet name is Sheet1$ (SAS places a dollar sign after the worksheet name). Any named ranges that have been created in the worksheet are also listed. Worksheet names end in a $; the names of named ranges don’t.

Figure 6.5: Select a Worksheet if More than One

Figure 6.5: Select a Worksheet if More than One

Once you have selected the worksheet you want to import, click Next. You are presented with a screen where you can enter a library (Work, if you want a temporary SAS data set, or a libref created with a LIBNAME statement) and the name of the SAS data set (labeled as Member in the destination screen). See the following:

Figure 6.6: Choose a Library and File Name

Figure 6.6: Choose a Library and File Name

If you then select Finish, SAS imports your spreadsheet. If you select Next, you can choose to save the conversion program (PROC IMPORT). If you save this program (you will be prompted for the name of a file in which to save the program), you can run it later to perform your conversion.

You are done. That’s all there is to it. In this example, you chose the Work library and called your SAS data set Wages_Permanent. It is a good idea to list the contents of this data set using the SAS viewer or PROC PRINT before attempting to use it for reports or further analysis. If the data set is large, you can list the first few observations in the data set by using the OBS=n data set option to limit the number of observations you want to process. For example, to see the first four observations in data set Wages_Permanent, you could run the following program:

Program 6.1: Using PROC PRINT to List the First Four Observations in a Data Set

  title "The First Four Observations of Wages_Permanent";

  proc print data=Wages_Permanent(obs=4);

  run;

Here is the output:

Figure 6.7: Output from Program 6.1

Figure 6.7: Output from Program 6.1

Notice that SAS has created valid variable names from the Excel column headings. It replaces any invalid characters in column headings (in this case, blanks) with an underscore (_). It is also a good idea to use the SAS Explorer or PROC CONTENTS to view the type (character or numeric) and length of each variable in this data set. You may need to use PROC DATASETS to change the format of one or more variables or to write a short DATA step to perform a character-to-numeric conversion for variables that you want to be numeric but, for one reason or another, wound up as character.

Using the OBS=n data set option is a very useful way to check the data values of very large data sets. By the way, while we are on the topic, you can combine the OBS=n data set option with the FIRSTOBS=m option. The value of FIRSTOBS= defines the first observation you want to process; the value of the OBS= option is the last observation you want to process. It is a good idea to think of OBS= as LASTOBS. Suppose you want to list observations 100 through 110 in a very large SAS data set called Verybig in a library with a libref of Project. You would combine the FIRSTOBS= and OBS= options like this:

Program 6.2: Using the FIRSTOBS= and OBS= Options Together

  title "Observations 100 through 110 in VERYBIG";

  proc print data=Project.Verybig(firstobs=100 obs=110);

  run;

Program 6.2 results in a listing containing 11 observations (observation 100 through observation 110).

Importing Excel Worksheets using either Enterprise Guide or SAS Studio is even easier. For example, when using SAS Studio, you don't even have to select an input file format—it looks at the file extension (for example, XLS or XLSX) and automatically selects the correct import method.

6.3  Creating an Excel Spreadsheet from a SAS Data Set

Before we leave this chapter, let’s see how you can use the Export Wizard to convert a SAS data set into an Excel spreadsheet. From the File menu, select Export Data and then select Microsoft Excel.

Suppose you want to convert your Sales data set (located in the C:ookslearning folder) to an Excel spreadsheet. First, you need to be sure you have a library reference to the folder. For example, issue the following statement:

libname Learn 'C:ookslearning';

Then, follow these steps:

From the File menu, select Export Data.

Figure 6.8: First Step in Exporting an Excel Spreadsheet (using the SAS Windowing System)

Figure 6.8: First Step in Exporting an Excel Spreadsheet (using the SAS Windowing System)

On the next screen, enter the LIBNAME and the name of the SAS data set you want to export (Sales in this example) and click Next.

Figure 6.9: Name the Library and Data Set Name

Figure 6.9: Name the Library and Data Set Name

Next, select Microsoft Excel from the pull-down menu.

Figure 6.10: Select Microsoft Excel

Figure 6.10: Select Microsoft Excel

You can browse or enter the name of the Excel spreadsheet.

Figure 6.11: Name the Excel Workbook

Figure 6.11: Name the Excel Workbook

You can also name the specific table.

Figure 6.12: Supply a Table Name

Figure 6.12: Supply a Table Name

Select Finish and SAS writes the data to an Excel spreadsheet (or select Next to write the PROC EXPORT statements to a file). Here, the file is called Sales.xlsx (in the C:ookslearning folder) and the table name is Table1.

6.4  Using an Engine to Read an Excel Spreadsheet

You can have SAS treat an Excel spreadsheet as if it were a SAS data set by using the XLS or XLSX engine. As an example, suppose you want to run a SAS procedure with the data in a spreadsheet called Wages.xls in your C:ookslearning folder. The following LIBNAME statement enables you to access this spreadsheet directly:

libname Read 'C:ookslearningWages.xls';

You can now access any of the worksheets within this file. This particular spreadsheet file contains two worksheets, Temporary and Permanent. Suppose you want to compute the mean of Wage and Hours Worked in the Permanent worksheet. Here are the SAS statements to do that:

Program 6.3: Reading a Spreadsheet Using an XLSX Engine

  title "Statistics from Sales Spreadsheet";

  libname Read 'C:ookslearningWages.xls';

  proc means data=Read.'Permanent$'n mean;

     var Wage Hours_Worked;

  run;

There are several important points to notice in this program.

First, because SAS requires you to follow the worksheet name with a dollar sign and because dollar signs are not normally allowed in SAS data set names, you need to use a name literal to do this. In Program 6.3, you place the worksheet name (Permanent$) in single quotes and follow this with an n. This notation allows you to use invalid characters as part of SAS names.

Note: In order for this named literal to work, you must have the option VALIDVARNAME set to ANY. To do this, submit the following statement:

options validvarname=any;  

Next, remember that the column heading Hours Worked is not a valid SAS variable name and the rule is that SAS will substitute an underscore for any invalid variable names. Thus, you need to refer to this column as Hours_Worked. Here is the output from PROC MEANS:

Figure 6.13: Output from Program 6.3

Figure 6.13: Output from Program 6.3

Once you have created the appropriate LIBNAME statement, you can treat your spreadsheets as if they were SAS data sets.

6.5  Using the SAS Output Delivery System to Convert a SAS Data Set to an Excel Spreadsheet

You can use the Output Delivery System (ODS) to create CSV files that Excel can open directly. For more information on ODS, refer to Chapter 19.

As an example, suppose you want to send the contents of the permanent SAS data set Survey to Excel. Program 6.4 creates a CSV file from the SAS data set:

Program 6.4: Using ODS to Convert a SAS Data Set into a CSV File (to Be Read by Excel)

  libname Learn 'C:ookslearning';

  ods csv file='C:ookslearningODS_Example.csv';

  proc print data=Learn.Survey noobs;

  run;

  ods csv close;

The ODS CSV statement opens the CSV file as an output destination. Notice that the NOOBS option of PROC PRINT is used to remove the OBS column from the output. It is important to close the file with an ODS CLOSE statement following the PROC PRINT. Here is a listing of the CSV file C:ookslearningODS_Example.csv:

"ID","Gender","Age","Salary","Ques1","Ques2","Ques3","Ques4","Ques5"

 001,"M",23,28000,1,2,1,2,3

 002,"F",55,76123,4,5,2,1,1

 003,"M",38,36500,2,2,2,2,1

 004,"F",67,128000,5,3,2,2,4

 005,"M",22,23060,3,3,3,4,2

 006,"M",63,90000,2,3,5,4,3

 007,"F",45,76100,5,3,4,3,3

 

You can open this file directly in Excel with the following result:

Figure 6.14: Opening the CSV File in Excel

Figure 6.14: Opening the CSV File in Excel

SAS can read and write Excel data very easily. Be sure to check the resulting files following a transfer to ensure that data values, especially dates, were processed properly. In addition, if you are creating a SAS data set, be sure to run PROC CONTENTS or use the SAS Explorer to verify variable types and lengths.

6.6  A Quick Look at the Import Utility in SAS Studio

As mentioned earlier in this chapter, both SAS Enterprise Guide and SAS Studio can be used to import and export Microsoft Excel files. This section shows some screen shots of importing a spreadsheet into SAS using SAS Studio.

You start by choosing Tasks and Utilities in the Navigation Pane.

Figure 6.15: Choose Tasks and Utilities in the Navigation Pane

Figure 6.15: Choose Tasks and Utilities in the Navigation Pane

This gives you access to submenus, including Import Data.

Figure 6.16: Select Import Data

Figure 6.16: Select Import Data

 

You can drag a file from a list or click Select File to choose which file to import.

Figure 6.17: Drag a File to the Drag-and-Drop Area or Click Select File

Figure 6.17: Drag a File to the Drag-and-Drop Area or Click Select File

Select the file you want to import. Remember, this file must either be in the default shared folder (SASUniversityEditionMyfolders) or another shared folder that you have created on your virtual machine.

Figure 6.18: Selecting a File

Figure 6.18: Selecting a File

SAS Studio will automatically use the correct import method, based on the file extension (XLS or XLSX, for example).

Figure 6.19: Click the Run Icon

Figure 6.19: Click the Run Icon

You will be given a choice of naming the SAS data set and the library. Once this is done, click the Run icon.

SAS Studio will automatically show you output from PROC CONTENTS so that you can see the variable names, variable types, and storage lengths of all the variables.

Figure 6.20: Portion of PROC CONTENTS

Figure 6.20: Portion of PROC CONTENTS

Hopefully, this short section will convince you that SAS Studio (with or without SAS University Edition) is easy to use.

6.7  Problems

Solutions to odd-numbered problems are located at the back of this book. Solutions to all problems are available to professors. If you are a professor, visit the book’s companion website at support.sas.com/cody for information about how to obtain the solutions to all problems.

1.       Use the SAS Explorer Window (or the import facility in Enterprise Guide or SAS Studio) to read the spreadsheet DrugTest.xls and convert this to a temporary SAS data set called Drugtest. Use PROC PRINT to list the observations in this data set.

2.       Run the following program to create a CSV file. Substitute a folder of your choice for the one specified in the program:

  data Soccer;

     input Team : $20. Wins Losses;

  datalines;

  Readington 20 3

  Raritan 10 10

  Branchburg 3 18

  Somerville 5 18

  ;

  options nodate nonumber;

  title;

  ods listing close;

  ods csv file='C:ookslearningSoccer.csv';

  proc print data=Soccer noobs;

  run;

  ods csv close;

  ods listing;

Open Excel on your computer and open the CSV file (you will have to change the file type to .csv). It should look like this:

image shown here

Save this as a spreadsheet using the FileSave As pull down menu and naming the file Soccer.xls.

Now, use the SAS IMPORT wizard to convert this spreadsheet into a permanent SAS data set called Soccer in a folder of your choice.

3.       Read the file soccer.xls created in Problem 2 using an XLS engine. The table name is SOCCER. (You will need to use a name constant 'SOCCER$'n to read this file.)

..................Content has been hidden....................

You can't read the all page of ebook, please click here login for view all page.
Reset