Chapter 10
Portability

“Why is there a machete in this ambulance?!”

Not “Do you guys carry epinephrine?” or “Where is the bag valve mask?” No, my first question after jumping in the back of the ambulance as we raced to a call for suspected anaphylaxis was to question the legitimacy of the machete tucked precariously inside the sliding door.

If you're a firefighter, one of your first stops in any city—especially overseas—is the local firehouse. While it's not uncommon for a drop-in to become a half-day of riding calls, somehow I found myself volunteering at the Bomberos Voluntarios (volunteer fire department) in Antigua, Guatemala, not even a week after moving to the city. So much for relaxing after Afghanistan…

The predominantly Spanish-speaking crews welcomed me, given that Antigua is rife with tourists, and Americans were especially elated when an English speaker showed up after a stabbing or tuc-tuc accident. While I didn't always expect to understand my crew or patients, I did expect emergency medical training to be relatively universal in tools, techniques, and protocols. This wasn't always the case.

Climbing into the back of the minivan—oh, yeah, the ambulance was a converted Chrysler minivan—the glint of the machete blade first caught my eye, but the amazement didn't end there.

The hatchback had neither latch nor lock, so a seatbelt had been tied to the metal hasp on the floor (where the door originally would have locked) and also the hatchback—such that the rear door was secured only by buckling the seatbelt.

Yet the hatchback hydraulics still somehow functioned, so while the door had to be closed from the outside, the seatbelt could only be fastened (to secure the door) from the inside, which made for some interesting acrobatics on a two-man crew.

Of course the belt left a gap of a couple inches, so the door incessantly bounced and rattled as we raced through the cobblestone streets of Antigua. Because the litter was not secured to the ambulance, half my job seemed to entail reassuring patients that I would prevent them from sliding out the back—and the other half was actually preventing them from sliding out the back!

Start an IV—sure thing, that's a transferable skill in any language. Start an IV while bouncing down cobblestone streets—no, thanks, please pull this thing over.

There was oxygen in the ambulance, but we were advised not to use it (in nearly all cases) because the bottles could only be refilled in Guatemala City, an hour's drive away. The Bomberos Voluntarios didn't have the money for that commute.

Most supplies were haphazardly garnered through donations from the United States and Europe, so we took nothing for granted and made every effort to conserve tape, gauze, bandages, and everything else. But gas was the scarcest resource of all, with the gas gauge of every apparatus I ever rode registering on empty—just enough to triangulate to town, the hospital, and back to the firehouse. There were absolutely no supermarket runs, joyriding, or “staging” in interesting locations, as are common back in the States.

While the vast majority of my skills and training were transferable, the environment, calls, and equipment were at times so different from anything I'd ever experienced that I simply had to take a knee and wait for instruction.

Despite the many differences, as I stood in a receiving line one night helping distribute firefighter patches and badges to a graduating class of bomberos, I was certain that the pride I felt was no different from their own.

And what of the machete? Although I asked every firefighter in the station, the most common responses were “¡Ten cuidado!” (Be careful) or “¿Por qué no?” (Why wouldn't we have a machete in the ambulance?!)

Something I often take for granted in the United States is the level of care provided through emergency medical services. Despite subtle protocol and equipment differences, treatment is largely standardized, and patients may not realize the discrepancies in level of care that exist elsewhere in the world. SAS practitioners who spend their careers working within a single environment may likewise never encounter the system and environmental variability discussed in this chapter.

The majority of my medical training transferred directly from the United States to Guatemala. After all, there are only so many ways to splint an arm or bandage wounds after a bike accident. The Base SAS language is also tremendously portable between environments and, in the vast majority of cases, software written in one environment will perform equivalently in another.

In other cases, however, protocols differed or equipment was unavailable, but we made do. We once transported a patient to an ambulance in a wheelbarrow, following his collapse during the Carrera de Charolas (race of the trays)—a sprint through the cobblestone streets in which waiters, waitresses, and bartenders each must single-handedly carry a tray of beverages. But the solution worked.

A similar aim of software portability is to deliver equivalent functionality despite environmental variability, and this can be achieved when SAS software can detect its environment and respond dynamically.

Sometimes we couldn't replicate equipment; for example, we didn't carry pulse oximeters in the ambulance, so oxygen saturation vitals were never collected. SAS software, too, can encounter environments in which some function or task cannot be completed, but this should be detected and handled as an exception, rather than allowing software to fail aimlessly.

In all cases, my goal while working with the Bomberos Voluntarios remained to provide the highest standard of care possible, regardless of what environmental obstacles we faced. Software portability strives for the same objective, often overcoming environmental variability to achieve equivalent results that deliver full or partial business value.

DEFINING PORTABILITY

Portability is “the ease with which a system or component can be transferred from one hardware or software environment to another.”1 As a subcomponent of transferability within the International Organization for Standardization (ISO) software product quality model, software portability describes the ability of software to operate equivalently across variable environments. Variability can be manifested in the hardware, operating system (OS), infrastructure (e.g., network, folder structure), SAS application (e.g., interface, version, available components, compatibility options), or SAS processing mode (i.e., interactive, batch).

With few exceptions, Base SAS is naturally portable between environments without code modification, and this flexibility is expected in fourth-generation languages (4GLs) that largely aim to remove environment-specific code that must be customized by developers. Nevertheless, the compounding effects of the numerous dimensions of environmental variability can cause rare but unwanted results in SAS software function or performance. Therefore, implementing portability in software effectively validates that the software has been tested across specific environments and that it can be safely transferred to and executed in environments identical to those in which it was tested.

This chapter introduces several dimensions of portability that can effectively expand the intended audience of SAS software, making it viable in a greater range of environments. Even within development environments that are relatively stable over time, portability is sought because it can facilitate software that executes without modification in separate development, test, and production environments, militating against the inefficient practice of maintaining separate code bases. To the extent that development, test, and production environments can execute identical software, code maintainability will be substantially improved.

DISAMBIGUATING PORTABILITY

Transferability is “the degree to which the software product can be transferred from one environment to another.”2 Within the ISO software product quality model, the transferability characteristic includes four sub-characteristics: portability, installability, adaptability, and compliance. Portability is the focus of this chapter and typically reflects the ease, efficiency, and effectiveness with which software operates across two or more environments. Portability is often defined dichotomously as software that either functions or does not. However, as demonstrated throughout this chapter, gradations of portability can exist, especially where software functionality is portable between systems but performance varies substantially.

Installability is “the degree to which the software product can be successfully installed and uninstalled in a specified environment.”3 Installability can facilitate software that “unpacks” its environment, including required infrastructure, folders, and files, and often enables users to customize software adequately during initialization. Often, in literature, both portability and installability are defined as static performance attributes (and thus characteristics of internal software quality) because examination of code can demonstrate the environments to which software is intended to be portable. However, because portability and installability failures are demonstrated through software execution (not inspection), they are included as dynamic performance requirements within this text.

Adaptability is defined as synonymous with flexibility, “the degree to which the software product can be adapted for different specified environments without applying actions or means other than those provided for this purpose for the software considered.”4 Flexibility is referenced throughout this text, principally in the roles that it plays in supporting software stability, reusability, and extensibility. Transferability compliance is not discussed in this text but is “the degree to which the software product adheres to standards or conventions relating to portability.”5

Although subsumed under the compatibility quality characteristic in the ISO software product quality model, interoperability can be closely tied to portability and is defined as “the degree to which the software product can be cooperatively operable with one or more other software products.”6 The SAS application demonstrates interoperability in its ability to interface with third-party software products that can include drivers, APIs, and databases. Interoperability is not discussed further in this text.

3GL VERSUS 4GL PORTABILITY

Software portability is so ubiquitous that it is widely overlooked and taken for granted—and that is the intent. I wake up in the morning, and before even climbing out of bed, I perform a Google search on my phone to check the weather and news. Minutes later, I flip open a laptop and check traffic on Google Maps before driving to work. At work, it's a different computer and possibly a different Internet browser, but the same Google interface to conduct countless searches. Millions of people are using the same web application around the world on various devices and interfaces, all having similar experiences due to software portability. Without this portability, user experience on the application could vary tremendously by Internet browser, hardware, device, and other environmental characteristics.

Data analytic software often differs from traditional software applications because solutions are developed to meet niche analytical requirements rather than for broad dissemination. SAS practitioners don't typically buy SAS Enterprise Guide to develop software and then attempt to market that software beyond its original use or environment. SAS software may be shared and reused within an organization, and sometimes more broadly disseminated through literature and technical white papers, but it inherently differs from applications development because en masse distribution and use are never goals.

4GLs such as Base SAS are inherently more portable than third-generation languages (3GLs). 4GLs focus less on resource management and more on functionality. Thus, Base SAS is designed to offer similar service across the breadth of hardware on which it operates, so that running the general linear model or GLM procedure in one environment will perform equivalently to other environments. When stakeholders purchase the SAS application, they expect the SAS software they author to function identically regardless of the environment in which it's run.

Thus, SAS practitioners have far fewer portability obstacles than can plague 3GL developers, and in many cases can ignore the environment in which they are operating. Most SAS software written on one system will function on another. Because portability can often be overlooked in SAS development, however, it's often not contemplated until functional or performance failures are investigated and portability is determined to be the culprit. The following “Facets of Portability” sections define the overarching dimensions of SAS environments that can pose portability challenges to SAS practitioners.

FACETS OF PORTABILITY

Portability aims to overcome functional and performance discrepancies that exist among systems due to the myriad environmental attributes that can vary in SAS environments. These attributes can include the following:

  • OS—for example, Windows, UNIX
  • SAS interface—for example, SAS Display Manager, SAS Enterprise Guide, SAS Studio
  • SAS processing mode—for example, batch or interactive
  • SAS version—for example, 9.3, 9.4
  • Licensed SAS components—for example, SAS/GRAPH, SAS/STAT, SAS/ACCESS
  • SAS environment portability—the local installation of SAS, including folder and file organization, SAS options, and customization of configuration files
  • SAS data—data set file formats
  • Software development life cycle (SDLC)—for example, independent environments that comprise the development, testing, and production phases

The following sections don't aim to give exhaustive lists of all portability challenges that may exist. Rather, they demonstrate aspects of the environment that may pose portability challenges and of which SAS practitioners should be aware.

OS Portability

The vast majority of the Base SAS language operates equivalently across OS environments, such as Windows, UNIX, and z/OS. In fact, most Base SAS technical documentation is contained in OS-agnostic literature, such as the SAS® 9.4 Macro Language Reference,7 Base SAS® 9.4 Procedures Guide,8 or the SAS® 9.4 Functions and CALL Routines: Reference.9 Within those compendia, the beauty of Base SAS portability is revealed in that SAS practitioners rarely have to make programmatic decisions based on the OS on which they are developing. To a large extent and in much of data analytic development, the OS can be ignored.

In some cases, idiosyncrasies do exist in SAS architecture, implementation, syntax, or how code executes in different environments. Thus, in addition to the broad SAS software documentation, SAS also produces OS-specific documentation such as the SAS® 9.4 Companion for UNIX Environments10 or the SAS® 9.4 Companion for Windows.11 Because Windows is a popular environment for the installation of SAS client software, and because Linux/UNIX is common for server environments and undergirds the SAS University Edition, SAS practitioners may occasionally perceive a lack of portability between these OSs.
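
As a minimal sketch of how software might detect its OS and respond dynamically, the following macro tests the automatic macro variable &SYSSCP and sets a path delimiter accordingly. The %GETDELIM macro name and the &DELIM macro variable are hypothetical and are not part of SAS. The sketch also assumes that every OS other than Windows uses the forward slash, which would need to be validated on z/OS or any other environment before it is relied upon:

```sas
* hypothetical sketch that sets a path delimiter based on the OS;
%macro getdelim();
%global delim;
%if &SYSSCP=WIN %then %let delim=\;
%else %let delim=/; %* assumption that all non-Windows environments use a forward slash;
%mend;

%getdelim;
%put DELIM: &delim;
```

Paths subsequently built with &DELIM rather than a hardcoded slash have one fewer OS dependency, although capitalization and root folder differences remain.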

Capitalization in Windows and UNIX

Regardless of the OS environment, the SAS application is not case sensitive in its interpretation of SAS code. Even reserved words such as procedures, functions, and SAS global macro variables don't require capitalization as in some languages. For example, while the SAS procedures SORT and MEANS are capitalized within the text of this book, they have not been capitalized in code examples. The Windows environment operates similarly to the SAS application and is not case sensitive. UNIX, however, does enforce capitalization. Thus, when SAS software must interact with its environment through I/O functions or environmental variables, this discrepancy can cause SAS software to fail.

For example, the following code retrieves the SASROOT operating environment variable, which identifies the location of the SAS.exe executable file. The code executes in both Windows and UNIX environments because SASROOT has been capitalized:

%put %sysget(SASROOT);
C:\Program Files\SASHome\SASFoundation\9.4

However, when SASROOT is not capitalized (as might be common practice in some SAS Windows environments) the code fails when executed in UNIX because “sasroot” is not recognized:

%put %sysget(sasroot);
WARNING: The argument to macro function %SYSGET is not defined as a system variable.

A straightforward development best practice that overcomes this limitation is to capitalize environmental variables to ensure portability. However, capitalization restrictions extend beyond environmental variables to references of SAS path names. For example, the SAS University Edition uses the default folder /folders/myfolders/ (all lowercase) as the root directory for the SAS environment. The following LIBNAME statement successfully assigns the library LOWER to this location:

libname lower '/folders/myfolders/';
NOTE: Libref LOWER was successfully assigned as follows:
      Engine:        V9
      Physical Name: /folders/myfolders

However, the following library assignment fails in the SAS University Edition because its capitalization is inconsistent:

libname lower '/folders/MyFolders/';
NOTE: Library LOWER does not exist.

Thus, in some cases, Windows users may experience runtime errors when porting software from Windows to UNIX.
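
One way to handle this exception rather than allowing a failed library assignment to cascade is to validate the path before assigning the library. The following is a minimal sketch only; the %ASSIGNLIB macro and its return code &ASSIGNLIBRC are hypothetical names introduced for illustration, and the sketch assumes the path contains no commas or unbalanced quotes:

```sas
* hypothetical sketch that assigns a library only if the path exists as written;
%macro assignlib(lib= /* libref */, path= /* unquoted folder path */);
%global assignlibRC;
%let assignlibRC=FAILED;
%if %sysfunc(fileexist(&path)) %then %do;
   libname &lib "&path";
   %if &syslibrc=0 %then %let assignlibRC=;
   %end;
%else %put Path not found (check capitalization): &path;
%mend;

%assignlib(lib=lower, path=/folders/MyFolders/);
%put RC: &assignlibRC;
```

Because FILEEXIST defers to the OS, the same invocation can succeed on Windows yet fail on UNIX, and the return code lets the calling process respond to either outcome.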

SAS Interface Portability

The SAS Display Manager, SAS Enterprise Guide, and SAS Studio represent individual SAS interfaces that compile and execute Base SAS software. While collectively referred to as the “SAS application” throughout this text, these unique interfaces can demonstrate subtle differences in functionality and performance. For example, SAS environmental macro variables may be assigned uniquely within an interface and not be available in all interfaces. Performance characteristics such as software speed can also vary greatly between SAS interfaces, and because SAS practitioners most commonly operate on a single SAS interface, these performance differences may be widely unrecognized.

Environmental Variables

SAS environmental variables can vary by SAS interface. For example, when you open a new SAS Display Manager session, no global macro variables are initially defined. This is demonstrated by submitting the following code, which produces no output:

%put _global_;

However, within Enterprise Guide, the output is very different:

%put _global_;
GLOBAL SASWORKLOCATION "C:\Users\stud\AppData\Local\Temp\SEG17308\SAS Temporary Files\_TD15952_COMPUTADOR_\Prc2/"
GLOBAL _CLIENTAPP 'SAS Enterprise Guide'
GLOBAL _CLIENTAPPABREV EG
GLOBAL _CLIENTMACHINE 'STUD'
GLOBAL _CLIENTPROJECTNAME ''
GLOBAL _CLIENTPROJECTPATH ''
GLOBAL _CLIENTTASKLABEL 'Program'
GLOBAL _CLIENTUSERID 'stud1'
GLOBAL _CLIENTUSERNAME ''
GLOBAL _CLIENTVERSION '7.100.1.2805'
GLOBAL _EG_WORKSPACEINIT 1
GLOBAL _SASHOSTNAME 'stud'
GLOBAL _SASPROGRAMFILE
GLOBAL _SASSERVERNAME 'Local'

When submitted from the SAS University Edition, the code again produces an entirely different set of SAS environmental variables:

%put _global_;
GLOBAL BASEDIR /folders/myfolders/
GLOBAL GRAPHINIT
GLOBAL GRAPHTERM
GLOBAL OLDPREFS /folders/myfolders/.wepreferences
GLOBAL OLDSNIPPETS /folders/myfolders/.mysnippets
GLOBAL OLDTASKS /folders/myfolders/.mytasks
GLOBAL STUDIODIR /folders/myfolders/.sasstudio
GLOBAL STUDIODIRNAME .sasstudio
GLOBAL STUDIOPARENTDIR /folders/myfolders
GLOBAL USERDIR /folders/myfolders
GLOBAL _BASEURL http://localhost:10080/SASStudio
GLOBAL _CLIENTAPP SAS Studio
GLOBAL _CLIENTAPPVERSION 3.4
GLOBAL _EXECENV SASStudio
GLOBAL _SASPROGRAMFILE
GLOBAL _SASWSTEMP_ /folders/myfolders/.images/cda3fc30dc864cf9a17bbd7afcad3d83
GLOBAL _SASWS_ /folders/myfolders

These differences may seem insignificant until you're writing software that is designed to be ported across different SAS interfaces that must rely on one or more of these variables. Consider the seemingly straightforward task of obtaining the current folder in which a SAS program is saved, which is complicated because the SAS Display Manager, SAS Enterprise Guide, and SAS University Edition each utilize different methods for obtaining this environmental variable. To execute, the %GETCURRENTPATH macro must be saved to a named SAS program file:

* saved as /folders/myfolders/test/mycode.sas;
%macro getcurrentpath();
%let syscc=0;
%global path;
%let path=;
%global getcurrentpathRC;
%let getcurrentpathRC=GENERAL FAILURE;
%local pathfil pathno;
%if %symexist(_clientprojectpath) %then %do;
   %* SAS Enterprise Guide stores a quoted project path so remove quotes then the file name;
   %let path=%sysfunc(dequote(&_clientprojectpath));
   %let path=%substr(&path,1,%length(&path)-%length(%scan(&path,-1,\)));
   %end;
%else %if &SYSSCP=WIN %then %do;
   %* SAS Display Manager on Windows exposes the program path as an environment variable;
   %let path=%sysget(SAS_EXECFILEPATH);
   %let path=%substr(&path,1,%length(&path)-%length(%scan(&path,-1,\)));
   %end;
%else %if &_CLIENTAPP=SAS Studio %then %do;
   %* SAS Studio stores the program path in a global macro variable;
   %let pathfil=&_SASPROGRAMFILE;
   %let pathno=%index(%sysfunc(reverse("&pathfil")),/);
   %let path=%substr(&pathfil,1,%eval(%length(&pathfil)-&pathno+1));
   %end;
%else %do;
   %let getcurrentpathRC= Environment Not Defined!;
   %put &getcurrentpathRC;
   %end;
%if &syscc=0 and %length(&path)>0 %then %let getcurrentpathRC=;
%mend;

When executed from within the SAS University Edition, the global macro variable &_CLIENTAPP is detected and the following output is produced:

%getcurrentpath;
%put PATH: &path;
PATH: /folders/myfolders/test

If executed from an unnamed SAS program, the %GETCURRENTPATH macro cannot determine a path and produces the following warnings:

%getcurrentpath;
WARNING: Argument 2 to macro function %SUBSTR is out of range.
WARNING: Argument 3 to macro function %SUBSTR is out of range.
%put PATH: &path;
PATH:

The exception handling in the %GETCURRENTPATH macro detects general failures by validating the value of &SYSCC and detects the specific failure that occurs when an unknown interface is encountered. While this fault-tolerance attempts to prevent errors that would occur if the code were run in interfaces other than the three specified, the degree of portability still is unknown until tested in those specific interfaces. For example, because SAS Display Manager was only tested on a Windows platform, it's unclear whether SAS Display Manager on a UNIX platform would behave identically. This underscores the complexities of portability that can occur when dimensions of portability interact, as well as the need to test software in the specific environment in which it will be operated to validate its portability.
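
The return code can then drive program flow in a calling process. The following sketch, in which the %USEPATH macro and the HERE libref are hypothetical, assigns a library to the detected folder only when &GETCURRENTPATHRC is empty and otherwise exits before any dependent code runs. It assumes the %GETCURRENTPATH macro above has already been compiled:

```sas
* hypothetical sketch that uses the return code to drive program flow;
%macro usepath();
%getcurrentpath;
%if %length(&getcurrentpathRC)=0 %then %do;
   libname here "&path";
   %end;
%else %do;
   %put Current path could not be determined: &getcurrentpathRC;
   %return;
   %end;
%mend;

%usepath;
```

When run from an unnamed program, the warnings shown above raise &SYSCC, so the return code remains GENERAL FAILURE and the library is never assigned.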

Contrasting the SAS University Edition

The SAS University Edition is the gateway drug to SAS, offered free to students, to schools, and for general noncommercial use. When Dr. Goodnight announced the release of the SAS University Edition at the 2014 SAS Global Forum, I flashed a Cheshire-Cat grin as thousands of attendees wildly applauded. The cost-prohibitive nature of SAS (as proprietary software) had limited its accessibility, while the surging popularity of Python, R, and other open-source and free software languages within the analytics and data science communities had provided fierce competition. Thus, the introduction of SAS University Edition allows neophytes to get their feet wet for free, just as they do with other popular data analytic languages.

Beyond its license limitation to noncommercial use, the SAS University Edition has technical limitations that may be less well known, such as a limit of two CPUs, which can reduce performance compared with other SAS interfaces. For example, Figure 10.1 demonstrates the substantially higher performance of the SORT procedure in the SAS Display Manager as compared with the SAS University Edition when run on the same computer.

Figure 10.1 SAS University Edition Performance Limitations

Because of the decreased resources available to the SAS University Edition, the software encounters the inefficiency elbow (discussed in the “Inefficiency Elbow” section in chapter 9, “Scalability”) at significantly smaller file sizes than the SAS Display Manager. This difference results in functional failures (out-of-memory errors) that also occur on much smaller data sets in SAS University Edition than in either the SAS Display Manager or SAS Enterprise Guide.

These performance disparities are important to understand, especially for readers whose sole experience with SAS may be the SAS University Edition. Because the SAS University Edition is intended to be the first face of SAS that many SAS practitioners will encounter, readers should recognize that pay-to-play SAS interfaces (like SAS Display Manager and SAS Enterprise Guide) do provide higher performance than the SAS University Edition.

Users experienced with other SAS interfaces may recognize other specific functional limitations in the SAS University Edition. For example, the following code attempts to shell to the OS and list files in the current directory. However, it fails because the interface for University Edition by default does not enable SYSTASK:

systask command "dir";
ERROR: Insufficient authorization to access SYSTASK COMMAND.

Disabling SYSTASK in SAS Studio tremendously limits parallel processing because independent SAS sessions cannot be asynchronously spawned, as demonstrated in the “Decomposing SYSTASK” section in chapter 12, “Automation.” Moreover, while batch jobs were introduced within the SAS University Edition in 2016, they must be run from within a web browser from inside the virtual machine, further preventing scheduled production jobs from being run in this interface. Notwithstanding these limitations, the SAS University Edition offers substantial SAS functionality and performance for free.
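
Software that shells to the OS might detect this limitation before SYSTASK is ever invoked. The following sketch tests the XCMD system option, which I assume here to be the setting that governs SYSTASK access in this interface; that assumption should be confirmed in each environment. The %TESTXCMD macro and its return code &TESTXCMDRC are hypothetical names:

```sas
* hypothetical sketch that tests whether OS commands are enabled;
%macro testxcmd();
%global testxcmdRC;
%let testxcmdRC=FAILED;
%if %sysfunc(getoption(xcmd))=XCMD %then %let testxcmdRC=;
%else %put OS commands are not enabled in this SAS session;
%mend;

%testxcmd;
%put RC: &testxcmdRC;
```

A parent process could test &TESTXCMDRC and fall back to serial processing when asynchronous sessions cannot be spawned.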

SAS Processing Mode Portability

The processing mode represents whether SAS is running interactively, in which code is submitted and executed manually, or in batch mode, in which SAS programs are executed from the SYSTASK command or the OS directly via a command line statement or batch program. SAS software is typically developed in the interactive mode because it enables SAS practitioners to execute code, immediately view results in the log, output, and data windows, and continue to modify the code as necessary. However, because a common goal is to automate and schedule SAS production software to execute with little to no human interaction, the transition to SAS batch jobs is often the final step in development.

Few differences exist between how software executes in interactive mode versus batch; however, because batch opens the SAS application (not just a specific program), system options and other options (such as an Autoexec.sas or configuration file) can be specified at initialization, allowing tremendous diversity in software execution. At times, it may be necessary to determine programmatically whether software is executing interactively or in batch so that program flow can be dynamically altered. To demonstrate the distinction between the two modes, the following code can be saved to C:\perm\batchtest.sas.

%put SYSENV: &SYSENV;
%put SYSPROCESSMODE: &SYSPROCESSMODE;

When the code is executed from the SAS Display Manager (in interactive mode), the following output is produced. Note that the SAS automatic macro variable &SYSENV is set to FORE to indicate that SAS is running interactively in the foreground.

%put SYSENV: &SYSENV;
SYSENV: FORE
%put SYSPROCESSMODE: &SYSPROCESSMODE;
SYSPROCESSMODE: SAS DMS Session

To execute Batchtest.sas in batch mode from Windows, the following command should be entered (in a single line) at the command prompt.

"C:\program files\SASHome\SASFoundation\9.4\sas.exe" -noterminal -sysin c:\perm\batchtest.sas -log c:\perm\batchtest.log

When executed, the log file Batchtest.log indicates that the &SYSENV macro variable now reflects that the software was executed in the BACK (background) rather than foreground while the &SYSPROCESSMODE macro variable now also indicates batch operation.

%put SYSENV: &SYSENV;
SYSENV: BACK
%put SYSPROCESSMODE: &SYSPROCESSMODE;
SYSPROCESSMODE: SAS Batch Mode

Note that the NOTERMINAL command line option must be specified to enable batch mode; without NOTERMINAL specified, the &SYSENV variable will always reflect FORE, even when run from batch mode.

Because SAS system options can be passed via batch invocation, and customized parameters can be passed with the SYSPARM option, batch jobs can function very differently than their interactive equivalents. In fact, if software depends on SAS options or parameters passed through batch invocation, batch software may lack backward compatibility, in that it no longer can be executed directly from the interactive mode. The “Batch Backward Compatibility” section in chapter 12, “Automation,” demonstrates the importance of facilitating software that can still be run interactively to support necessary maintenance and testing.
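
One hedged way to preserve interactive execution is to supply a default whenever no parameter was passed at invocation. In the following sketch, the %GETPARM macro, its DEFAULT parameter, and the &PARM macro variable are hypothetical; only &SYSPARM and &SYSENV are SAS automatic macro variables:

```sas
* hypothetical sketch that falls back to a default when SYSPARM is empty;
%macro getparm(default= /* value used when SYSPARM is empty */);
%global parm;
%if %length(&sysparm)>0 %then %let parm=&sysparm;
%else %let parm=&default;
%put Running in &sysenv with PARM: &parm;
%mend;

%getparm(default=test);
```

Downstream code references &PARM rather than &SYSPARM, so the same program can run from a batch invocation that specifies SYSPARM or be submitted interactively without modification.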

SAS Version Portability

As each SAS version is released, additional functionality is included, and for the most part, backward compatibility is assured so that code that functioned in the past will continue to do so in the future. Performance can also be improved, as with the SORT and SQL procedures, which were upgraded to embrace multithreading with the SAS 9 release. As a result, sorts performed in SAS 9 should perform faster than their equivalent SAS 8 sorts despite identical code. The FULLSTIMER performance metrics comparing multithreaded and single-threaded sorting are demonstrated in the “CPU Processing” section in chapter 8, “Efficiency.”

Version portability is typically not a problem for organizations; after all, why would a team using a newer version of SAS suddenly switch to an older version? On occasion, however, multiple versions may be simultaneously maintained in an environment, especially when the SAS applications are being upgraded. For example, it's not uncommon for a team to upgrade its development server to get the kinks out before upgrading test and production servers. Thus, for periods of time, SAS practitioners may need to become cognizant of functionality or performance differences between the environments.

Version compatibility also becomes an issue when code is ported from SAS white papers or other literature using newer versions of SAS. For example, with the release of SAS 9.4, I've started using the DELETE procedure that debuted. However, because many organizations still run SAS 9.3, if I were publishing code, it might be beneficial to utilize the older DATASETS procedure to delete a file to ensure portability to prior SAS versions. In rare cases—and I won't shame a certain federal agency that in 2016 still uses SAS 8—a significant amount of work may be required to translate new functionality into antediluvian SAS versions and environments.

When software is developed without regard for backward compatibility and utilizes newer functionality, one method to prevent catastrophic failure (in the event that the software later finds itself being run by an older version of SAS) is to implement exception handling that validates SAS software version. By testing the automatic macro variable &SYSVER, software can fail safe or dynamically perform equivalent functionality when encountering an older version of SAS. For example, the following code ensures SAS 9.4 is running before executing the remainder of the %TESTVER macro, which requires the DELETE procedure:

%macro testver();
%if &sysver^=9.4 %then %do;
   %put Requires SAS v9.4;
   %return;
   %end;
proc delete data=perm.mydata;
run;
%mend;
%testver;

However, when SAS 9.5 (or 10?) is released, this code will unintentionally fail because it uses the not equal operator. A more robust and reusable solution tests a parameterized version number and provides a return code to demonstrate success or failure if an equivalent or newer version of SAS is detected:

* tests whether SAS version is sufficiently high;
%macro testver(ver= /* version number in 9 or 9.4 format--no V */);
%global testverRC;
%let testverRC=FAILED;
%if %sysevalf(&sysver<&ver) %then %do;
   %put Sad Panda Alert: Requires SAS &ver but you only have &sysver;
   %return;
   %end;
%else %let testverRC=;
%mend;

Running the code on SAS 9.4 now produces the expected exception and reroutes program flow when SAS 9.5 is required through the invocation:

%testver(ver=9.5);
Sad Panda Alert: Requires SAS 9.5 but you only have 9.4
%put RC: &testverRC;
RC: FAILED

In production software, rather than printing a message about the exception in the SAS log, the return code &TESTVERRC would be utilized to drive program flow dynamically within the parent process that had called the %TESTVER macro.
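
As a minimal sketch of that pattern, the hypothetical %DELETE_DS macro below calls %TESTVER and uses &TESTVERRC to choose between the DELETE procedure and the older DATASETS procedure. It assumes the parameterized %TESTVER macro has already been compiled; the macro name and its LIB and DSN parameters are illustrative only:

```sas
/* hypothetical wrapper that deletes a data set with whichever procedure is supported */
%macro delete_ds(lib= /* libref */, dsn= /* data set name */);
%testver(ver=9.4);
%if %length(&testverRC)=0 %then %do;
   /* empty return code: SAS 9.4 or newer, so the DELETE procedure is used */
   proc delete data=&lib..&dsn;
   run;
   %end;
%else %do;
   /* FAILED return code: fall back to the DATASETS procedure */
   proc datasets library=&lib nolist;
      delete &dsn;
   quit;
   %end;
%mend;

%delete_ds(lib=perm, dsn=mydata);
```

Because the parent macro inspects only the return code, the version test itself stays reusable for any other version-dependent functionality.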

SAS Component Portability

SAS software is licensed as separate components or modules, the most common of which is Base SAS, which provides foundational software development and SAS macro development capabilities. Other components, such as SAS/GRAPH, SAS/STAT, or SAS/ACCESS, can be purchased individually or are sometimes bundled together in licensing packages. Even if different organizations have roughly equivalent hardware, infrastructure, and system resources, SAS functionality and performance can still vary wildly based on add-on components that have been purchased.

For example, I recently reviewed the code for the first SAS white paper I ever presented: Winning the War on Terror with Waffles: Maximizing GINSIDE Efficiency for Blue Force Tracking Big Data.12 I had received an email referencing the paper and wanted to answer a question posed by a reader. But when I ran the code, I received a spate of runtime errors; the GINSIDE procedure—at the heart of the paper—did not function because I no longer have a license for SAS/GRAPH. In the same light, organizations should carefully deliberate the decision to eliminate SAS licenses and first ensure that production software does not rely on SAS procedures or other functionality that specific licenses support.

The SETINIT procedure lists all SAS licensed components and licensing periods to the log. The following example demonstrates the format of output in the SAS Display Manager, although licenses and dates will differ by installation, and format may differ by SAS interface and OS:

proc setinit;
run;
NOTE: PROCEDURE SETINIT used (Total process time):
      real time           0.05 seconds
      cpu time            0.00 seconds
Original site validation data
Current version: 9.04.01M3P062415
Site name:    'SASSY McSASSY'.
Site number:  12345678.
Expiration:   14OCT2019.
Grace Period:  45 days (ending 28NOV2019).
Warning Period: 45 days (ending 12JAN2020).
System birthday:   12SEP2013.
Operating System:   WX64_WKS.
Product expiration dates:
---Base SAS Software
      14OCT2016
---SAS/Secure 168-bit
      14OCT2016
---SAS/Secure Windows
      14OCT2016
---SAS Enterprise Guide
      14OCT2016
---SAS Workspace Server for Local Access
      14OCT2016

The SAS log is useful to someone interested in confirming license availability, but unfortunately SAS offers no automatic macro variables that can be used to retrieve license information programmatically. For example, in more robust code, it would be useful to validate SAS/GRAPH component availability before attempting the GINSIDE procedure to ensure the procedure is licensed.

The following code saves the SETINIT output to a text file, parses the file, and generates an asterisk-delimited macro variable &COMPONENTS that contains a list of all licensed SAS components:

%macro get_components();
%global components;
%let components=;
%local out;
%let out=c:\perm\components.txt;
/* redirect the log to a text file while SETINIT runs */
proc printto log="&out" new;
run;
proc setinit;
run;
/* restore the default log destination */
proc printto;
run;
data _null_;
   length text $100 components $1000;
   infile "&out" truncover end=eof;
   input text $100.;
   if _n_=1 then components='';
   /* component names are prefixed with three hyphens in SETINIT output */
   if substr(text,1,3)='---' then components=strip(components)
      || strip(substr(text,4,90)) || '*';
   /* drop the trailing asterisk before saving the list */
   if eof then call symput('components',
      substr(strip(components),1,length(strip(components))-1));
   retain components;
run;
%mend;

When executed, the following output demonstrates the macro variable &COMPONENTS that could be parsed further (not shown) to determine dynamically whether all required components for specific software are available:

%get_components;
NOTE: PROCEDURE PRINTTO used (Total process time):
           real time           0.00 seconds
           cpu time            0.00 seconds
NOTE: The infile "c:\perm\components.txt" is:
           Filename=c:\perm\components.txt,
           RECFM=V,LRECL=32767,File Size (bytes)=1217,
           Last Modified=07May2016:23:13:25,
           Create Time=07May2016:23:10:46
NOTE: 32 records were read from the infile "c:\perm\components.txt".
           The minimum record length was 0.
           The maximum record length was 98.
NOTE: DATA statement used (Total process time):
           real time           0.01 seconds
           cpu time            0.00 seconds
%put SAS Components: &components;
SAS Components: Base SAS Software*SAS/Secure 168-bit*SAS/Secure Windows*SAS Enterprise Guide*SAS Workspace Server for Local Access

This methodology is extremely beneficial where code is intended to be ported to other and especially unknown environments, as is common when SAS code is published. For example, including the %GET_COMPONENTS macro in software that relies on the SAS/GRAPH license could enable the software to determine programmatically that SAS/GRAPH was (or was not) installed and dynamically alter program flow to prevent runtime errors.
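
One way the further parsing could look is sketched below. The %HAS_COMPONENT macro is hypothetical (it is not part of the original code) and assumes that the global macro variable &COMPONENTS has already been populated by %GET_COMPONENTS and that the component name is passed exactly as SETINIT prints it. Quoting functions are used because component names such as SAS/GRAPH contain characters that would otherwise be treated as operators in a %IF comparison:

```sas
/* hypothetical helper that resolves to 1 if COMP is in the list and 0 otherwise */
%macro has_component(comp /* component name as printed by SETINIT */);
%local i item found;
%let found=0;
%let i=1;
%let item=%qscan(%superq(components),&i,*);
%do %while(%length(&item)>0);
   %if %qupcase(&item)=%qupcase(&comp) %then %let found=1;
   %let i=%eval(&i+1);
   %let item=%qscan(%superq(components),&i,*);
   %end;
&found
%mend;

/* hypothetical parent process that alters program flow based on the result */
%macro map_if_licensed();
%if %has_component(SAS/GRAPH)=1 %then %put SAS/GRAPH found so GINSIDE can be attempted;
%else %put SAS/GRAPH not found so GINSIDE is skipped;
%mend;

%map_if_licensed;
```

In real software the %PUT statements would be replaced by the licensed procedure and by whatever fail-safe path is appropriate when the component is missing.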

SAS Environment Portability

Install the same SAS application and version on identical hardware in two different organizations, and SAS software may still run differently. SAS system options, configuration files, and Autoexec.sas files provide endless opportunities to customize not only the look and feel of SAS but moreover the way the SAS application functions. Another source of variability often stems from the hierarchy of logical drives, folders, and files that comprise the SAS infrastructure.

SAS System Options

Because SAS system options can so profoundly affect software function and performance, many cannot be modified within a SAS session and must instead be set through Autoexec.sas, configuration files, or command line statements at program invocation. Some system options can also be restricted by SAS administrators to protect against accidental or malicious modification. Extensive information about system options and their respective limitations is included in SAS® 9.4 System Options: Reference, Fourth Edition.13

To further complicate the options landscape, the mode in which SAS executes (interactive or batch) can affect default option settings. To demonstrate these differences, the following code should be saved to the file C:\perm\options_batch.sas and executed:

* saved as c:\perm\options_batch.sas;
libname perm 'c:\perm';
%put SYSENV: &sysenv;
proc optsave out=perm.options_batch;
run;

The program uses the OPTSAVE procedure to write all option values to the PERM.Options_batch data set. When the following code is subsequently executed, it runs the Options_batch.sas program a second time in batch mode (with the SYSTASK statement) to demonstrate differences between system options:

libname perm 'c:\perm';
proc optsave out=perm.options_interactive;
run;
* run the saved program in batch mode and wait for it to finish;
systask command """%sysget(SASROOT)\sas.exe"" -noterminal -sysin ""c:\perm\options_batch.sas"" -log ""c:\perm\options_batch.log""" wait status=rc_batch;
proc sort data=perm.options_batch;
   by optname;
run;
proc sort data=perm.options_interactive;
   by optname;
run;
data perm.options;
   merge perm.options_batch (in=a)
      perm.options_interactive (in=b rename=(optvalue=optvalue2));
   by optname;
   * A flags batch options and B flags interactive options;
   if a and not b then put optname 'not in interactive';
   else if b and not a then put optname 'not in batch';
   else if optvalue^=optvalue2 then do;
      put optname 'values differ';
      put '-- BATCH: ' optvalue;
      put '-- INTER: ' optvalue2;
      end;
run;

The good news is that the same set of system options exists in both batch and interactive modes. However, the following abridged log does demonstrate subtle differences in option settings. Other installations and environments will of course demonstrate unique results:

DLDMGACTION values differ
-- BATCH: FAIL
-- INTER: REPAIR
FONT values differ
-- BATCH:
-- INTER: (Sasfont 8)
_LAST_ values differ
-- BATCH: _NULL_
-- INTER: PERM.OPTIONS

When SAS software requires substantial customization of SAS system options, including those defined both at software invocation and in the code itself, one best practice to ensure reliable execution is to implement a quality control early in the program that compares current system option settings to expected or required options. Thus, by establishing a baseline data set that contains the required system options settings for specific software, that software can validate current options against the baseline and prevent execution if options have been modified. Especially when running software in a new environment, this type of quality control can prevent disaster while elucidating environmental discrepancies.
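
A minimal sketch of such a quality control follows. The %CHECK_OPTIONS macro, its BASELINE parameter, the &OPTIONSRC return code, and the PERM.Options_required data set are all hypothetical names. The sketch assumes the baseline was created with the OPTSAVE procedure and then reduced to only the options the software actually requires, since session-dependent options such as _LAST_ will otherwise always differ:

```sas
/* hypothetical quality control that compares current options to a saved baseline */
%macro check_options(baseline= /* data set of required options saved by OPTSAVE */);
%global optionsRC;
%let optionsRC=FAILED;
proc optsave out=work.current_options;
run;
proc sort data=work.current_options;
   by optname;
run;
proc sort data=&baseline out=work.baseline_options;
   by optname;
run;
data _null_;
   merge work.baseline_options (in=a)
      work.current_options (in=b rename=(optvalue=optvalue2)) end=eof;
   by optname;
   /* only options present in the baseline are validated */
   if a and (not b or optvalue^=optvalue2) then do;
      diffs+1;
      put 'Option differs from baseline: ' optname;
      end;
   /* clear the return code only when no differences were found */
   if eof and not diffs then call symputx('optionsRC','','g');
run;
%mend;

%check_options(baseline=perm.options_required);
%put RC: &optionsRC;
```

As with &TESTVERRC, an empty return code signals success, so a parent process can test &OPTIONSRC and terminate before any data are touched.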

SAS Folder Structure

Every SAS environment has an inherent structure that defines where programs, macros, saved formats, configuration files, control tables, data sets, and other files are located. When software is ported to and installed in a new environment, the new infrastructure may be vastly different and require software components that are missing. Even when necessary environmental structure and files do exist, portable software should flexibly reference those components so multiple copies of software don't need to be maintained. For example, server names or logical folder names will often differ from a development server to a production server, but through flexible software design, these attributes can be dynamically encoded to maximize maintainability and ensure a single software instance can suffice for all environments.

Throughout this text, the C:\perm folder is repeatedly referenced and assigned to the PERM library. To facilitate readability, the library assignment is hardcoded:

libname perm 'c:\perm';

However, this statement assumes that the folder C:\perm exists and, with the default option NODLCREATEDIR enabled, will produce a note to the SAS log if it does not exist:

libname perm 'c:\perm';
NOTE: Library PERM does not exist.

To overcome this exception, the DLCREATEDIR option can be specified, which will create the folder through the LIBNAME statement if C:\perm does not exist:

options dlcreatedir;
libname perm 'c:\perm';
NOTE: Library PERM was created.
NOTE: Libref PERM was successfully assigned as follows:
      Engine:        V9
      Physical Name: c:\perm

This code will still fail if the C: drive itself doesn't exist. Moreover, the code is biased toward a Windows environment. Another method to make library assignments more flexible is to assess and use the current path, that is, the folder in which a SAS program is saved. This method is demonstrated earlier in the “Environmental Variables” section, and can be used to create or access a subordinate folder structure. For example, if the %GETCURRENTPATH macro assesses that the current path is C:\perm and assigns this value to the global macro variable &PATH, then &PATH could be used subsequently to create or access subordinate folders like C:\perm\data or C:\perm\output. This functionality allows SAS practitioners to drop code into an environment and automatically populate its infrastructure, similar to dropping Sea-Monkey or Sea-People capsules into an aquarium.
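
The subordinate folders could be populated along the following lines. This is only a sketch: the %LET statement stands in for the value that %GETCURRENTPATH would assign, the PROJDATA and PROJOUT librefs are hypothetical, and the backslash separators keep the example Windows-specific:

```sas
/* stand-in for the value that the GETCURRENTPATH macro would assign */
%let path=c:\perm;

/* allow LIBNAME to create each folder if it does not already exist */
options dlcreatedir;
libname projdata "&path\data";
libname projout "&path\output";
```

DLCREATEDIR creates only the final folder named in the path, so the parent folder held in &PATH must already exist.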

Another way to eliminate hardcoded references is to enforce that SAS library assignments be made outside of the software—through the SAS Management Console, Autoexec.sas, SAS configuration files, the SAS Autocall Macro Facility, or other dynamic methods. These assignments establish a single, hardcoded reference to a logical folder and replace the necessity to hardcode common library assignments throughout various programs. This may not seem like a tremendous advantage until your information technology (IT) department unexpectedly decides to rename your SAS server, causing hundreds of programs to crash simultaneously as every library reference fails. I was once faced with this crisis and, while a team of developers scrambled to alter our entire software base to run in the “new environment,” we also improved the portability of all software by removing all LIBNAME statements from production software in favor of dynamic library assignments.

SAS Servers

Some SAS environments maintain separate development, testing, and production servers to handle discrete phases of the SDLC. Portability becomes a common concern in these environments because of software transfer between systems that may maintain different versions of SAS, file structures, or even functional intent. For example, in development and test environments, SAS software is often run interactively so that SAS practitioners can modify it, which presumably will require repeated interrogation of the SAS log. Production software, on the other hand, hopefully embraces some method to parse SAS logs automatically for warnings, runtime errors, and exceptions, but might not require maximum verbosity needed during earlier phases of the SDLC.

Because a unique SAS license is required for each installation, evaluation of the &SYSSITE automatic macro variable can be extremely useful in dynamically altering program flow based on execution environment. For example, the following %INIT macro assesses in which environment software is executing, appropriately modifies system options, dynamically assigns libraries, and even prints a “TEST” caveat on all output produced in the development environment:

%macro init();
%if &syssite=12341234 %then %do; * DEVELOPMENT SERVER;
   options fullstimer mlogic msglevel=i;
   libname perm 'c:\perm';
   title3 'TEST';
   %put "Maximum verbosity.";
   %end;
%else %if &syssite=56785678 %then %do; * PRODUCTION SERVER;
   options nofullstimer nomlogic msglevel=n;
   libname perm 'd:\perm';
   title3;
   %put "Brief.";
   %end;
%mend;

Software maintainability is substantially increased when a single version of software can be maintained that functions across all internal environments and servers. Without dynamic server recognition, SAS practitioners might be forced to implement error-prone hardcoding as software is transferred between environments. This unnecessary maintenance violates maintainability and stability principles discussed in the “Toward Software Stability” section in chapter 17, “Stability.”

Data Portability

SAS practitioners who have operated within a single environment, or have switched environments but haven't stolen data from their previous employer, might be surprised to learn that data set encoding can also vary from one system to another. Encoding gives greater flexibility to how data are interpreted—including language and other considerations—but can be defaulted to different values across environments.

To demonstrate this phenomenon, the following code creates the TEST.University data set in the SAS University Edition:

data test.university (drop=i);
   length num1 8;
   do i=1 to 1000;
      num1=round(rand('uniform')*10);
      output;
      end;
run;

However, when the data set is opened in the SAS Display Manager for Windows on the same computer, a note states that the data set encoding does not match the environment encoding:

data windows;
   set test.university;
NOTE: Data file TEST.UNIVERSITY.DATA is in a format that is native to another host, or the file encoding does not match the session encoding. Cross Environment Data Access will be used, which might require additional CPU resources and might reduce performance.
run;
NOTE: There were 1000 observations read from the data set TEST.UNIVERSITY.
NOTE: The data set WORK.WINDOWS has 1000 observations and 1 variables.
NOTE: DATA statement used (Total process time):
      real time           0.15 seconds
      cpu time            0.01 seconds

Even the COMPARE procedure finds no discrepancies between the SAS University Edition and SAS Display Manager data sets. However, through an investigation of the encoding for the two SAS interfaces with the OPTIONS procedure, the default encodings are shown to differ. The SAS University Edition defaults to UTF-8 while the SAS Display Manager for Windows defaults to WLATIN1.

proc options option=encoding; * run in SAS Display Manager;
run;
ENCODING=WLATIN1  Specifies the default character-set encoding for the SAS session.
NOTE: PROCEDURE OPTIONS used (Total process time):
      real time           0.01 seconds
      cpu time            0.00 seconds
proc options option=encoding; * run in the SAS University Edition;
run;
ENCODING=UTF-8    Specifies the default character-set encoding for the SAS session.
NOTE: PROCEDURE OPTIONS used (Total process time):
      real time           0.02 seconds
      cpu time            0.03 seconds

In this example, the data are portable between the two SAS environments thanks to a conversion that the SAS application automatically applies; no data were deleted or misrepresented. When encoding values differ more substantially, however, such as when data are ported between SAS installations that have different default languages, discrepancies can emerge and corrupt data. To eliminate the risk of misrepresenting data or ingesting corrupt data, a quality control can be built (not demonstrated) that programmatically validates actual data set encoding against expected encoding values. This control can be especially beneficial to environments required to ingest third-party data over which they have no influence in data quality, construct, or encoding.
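
One possible shape for such a control is sketched here. The %CHECK_ENCODING macro, its parameters, and the &ENCODINGRC return code are hypothetical. The sketch assumes that the ATTRC function's ENCODING attribute returns a string whose first word is the encoding name (for example, wlatin1 or utf-8), and it defaults the expected value to the session encoding held in the automatic macro variable &SYSENCODING:

```sas
/* hypothetical quality control that validates data set encoding */
%macro check_encoding(dsn= /* data set in LIB.DSN format */,
   expected= /* expected encoding name, defaults to the session encoding */);
%global encodingRC;
%local dsid actual rc;
%let encodingRC=FAILED;
%if %length(&expected)=0 %then %let expected=&sysencoding;
%let dsid=%sysfunc(open(&dsn,i));
%if &dsid=0 %then %do;
   %put Could not open &dsn;
   %return;
   %end;
/* keep only the first word of the encoding description */
%let actual=%scan(%qsysfunc(attrc(&dsid,encoding)),1,%str( ));
%let rc=%sysfunc(close(&dsid));
%if %qupcase(&actual)=%qupcase(&expected) %then %let encodingRC=;
%else %put Encoding of &dsn is &actual but &expected was expected;
%mend;

%check_encoding(dsn=test.university, expected=utf-8);
%put RC: &encodingRC;
```

An empty &ENCODINGRC indicates that the encodings match, so a parent process could refuse to ingest any data set for which the return code remains FAILED.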

PORTABILITY IN THE SDLC

Within teams and organizations that maintain a single SAS server or infrastructure, portability will have little relevance in software design. More universal aspects of portability include actions that SAS practitioners might undertake to implement external SAS code or ingest SAS data from third-party sources. However, for SAS practitioners who operate in diverse environments or intend to publish SAS code for general distribution, environmental aspects that influence portability should be considered and implemented commensurate to the diversity of the intended execution environments.

Requiring Portability

Portability requirements are typically straightforward and specify the environments in which software must operate. For a team operating in a single SAS infrastructure, a requirement could state that software should operate on the SAS Display Manager for Windows 9.4. Other beneficial statements could explicitly state that a single code base for software should be maintained to support batch and interactive modes, as well as throughout all phases of the SDLC. This approach demonstrates a commitment to software maintainability and stability and ensures that software will not diverge into separate development and production versions.

To the extent that teams operate on multiple SAS servers or infrastructures, their portability requirements should increase. Even if the environments are virtual mirror images of each other, SAS license numbers, server names, and other subtle variations may require that exception handling dynamically identify the execution environment to perform unique functions. Even in complex environments, however, successful portability requirements can often be expressed in a single statement.

In some cases, portability will need to be interwoven into other performance requirements. For example, given the substantial differences in execution speed between the SAS University Edition and the SAS Display Manager demonstrated in the “Contrasting the SAS University Edition” section, if an organization or team utilized both SAS interfaces, they might need to differentiate required execution speed based on the interface used. Moreover, given that the SAS University Edition is incapable of processing larger files that SAS Display Manager and SAS Enterprise Guide can easily handle, file size thresholds and other requirements might also need to be customized to each specific SAS interface.

Measuring Portability

The ultimate goal of portability is effectively to expand the range of environments in which SAS software successfully operates. In most cases, the outcome of software portability is dichotomous—software either executes in UNIX or it fails. To measure portability, however, software must be tested within the specific environments to which it will be deployed and against all variability that it is designed to handle. For example, a SAS practitioner developing software in the SAS University Edition cannot certify that the software will execute correctly in SAS Enterprise Guide until the software is tested within SAS Enterprise Guide. Thus, measuring portability essentially provides validation that software has been tested in a specific environment against established environmental diversity or variability.

While software functionality is fairly dichotomous, software performance such as execution speed can vary substantially based on variability in the execution environment. A SAS development environment might have a relatively tame infrastructure while its production environment includes grid architecture and other performance enhancements. These and other disparities between environments would necessitate that performance be measured in each environment to ensure portability specifications are met in both environments. In another example, development and testing environments might utilize test data sets with data volumes significantly lower than expected during operational conditions. Given this disparity, as described in chapter 9, “Scalability,” software would need to be tested to demonstrate it scales successfully to the volume and velocity of production data.

WHAT'S NEXT?

While the primary objective of software portability is to expand the functionality of software across diverse environments, a secondary goal is to ensure that software does not encounter an unknown (or untested) environment and subsequently act in an unpredictable manner. In other words, 'tis better to fail safe than to fail stupid. This second objective of portability focuses on software security, which is introduced in the next chapter. Secure software aims to ensure that it never hurts itself or its environment.

NOTES
