Chapter 6: Packages

User-Defined Packages

Instantiation

Using a Package Variable

Package as Object

Packages and Scope

Package as Method Parameter and Method Return Variable

System-Defined Packages

FCMP Package

TZ Package

Methods are code modules that encapsulate data and logic and modularize your code. Packages are collections of methods that facilitate code reuse and extend the strength of methods. A method in a program has its source code included in the program. As a result, changes to the source code can be made intentionally or inadvertently. A method in a package gets compiled with the package, and the package is saved to a file. The package file can be made accessible to other users in the organization without providing the underlying source code. A considerable amount of code can be built into a package, but a program that accesses this package is not burdened with the source code of the package, making maintenance easier for everyone.

There are user-defined packages and system-defined packages.

User-Defined Packages

DS2 follows a reasonably consistent block pattern for each of the components, and packages are no different. Here is the basic syntax:

package <packageName> / overwrite=YES;                 

       declare  <datatype> <packageGlobalVar>;         

       /* more package global vars */

       method <methodName> ;

          declare <datatype> <methodLocalVar>;

          /* method code */

       end;

       /* more methods */

endpackage;

   The package block starts with the keyword package and ends with endpackage. If a one-level package name is used, the package is saved to the Work library. If a two-level name is used, the package is saved to the specified library.

   Variables that are global to the package are declared. In OOP terms, variables are attributes (or states) of the package (or object). You commonly have get and set methods to allow programs to access these attributes. All package global variables are declared at the beginning of the package.

   Methods are defined within the package.

   Multiple methods can be defined.

Let’s look back at the package defined in Chapter 1:

proc DS2;

package packlib.conv /overwrite=yes;

 method C_to_F(integer C) returns double;

  /* convert degrees celsius to degrees fahrenheit */

  return 32. + (C * (9. / 5.));

 end;

method F_to_C(double F) returns double;

  /* convert degrees fahrenheit to degrees celsius */

  return (F - 32.) * (5. / 9.);

 end;

endpackage;

run;

quit;

A package called conv is compiled and saved in a library named packlib. By saving in a non-Work library, the package can be safely made available to others. Note that this step only compiles the package; none of its methods have been run yet. Before you use the package, let’s look at an important concept related to packages: instantiation.

Instantiation

By now, you know you need to declare a variable in DS2. The same goes for packages. You must declare a package variable, which is a variable (identifier) whose data type is package. Here is a common way to declare a package variable:

declare package packlib.conv cnv();

First, there is the package data type, and second, there is the specific package (in this example, packlib.conv). Next, the package variable name is declared (in this example, cnv). Then, there are the parentheses. This declaration tells DS2 to read the package from disk, allocate the memory, and load package contents into memory. The variable references the memory location containing all of the package contents, including attributes and methods. This process of loading the package contents into memory is called instantiation. You can also declare a package variable as follows:

declare package packlib.conv cnv;

This declaration tells DS2 to use a package variable named cnv, but not to automatically allocate the memory. You have a variable, but the variable cannot do anything because memory has yet to be allocated. To instantiate the variable, use the _NEW_ operator:

cnv = _new_ packlib.conv();

The cnv variable now points to a valid memory structure.

Using a Package Variable

Remember, you need to declare and instantiate a package variable before you can use it. Furthermore, because the variable has references to its methods, you need a way to access the methods. You can access a method using dot (.) notation. For example, to access the C_to_F() method of the variable cnv, you submit:

cnv.C_to_F(degC);

If you are familiar with the HASH object in the DATA step, you should be comfortable with this. Here is the program to use the compiled package:

proc DS2;

data ds2DegF_5 (keep=(degF) overwrite=YES)

     ds2AvgF_5 (keep=(avgF) overwrite=YES)

     ;

declare double degF having label 'Temp in Fahrenheit' format F6.1;

declare double avgF having label 'Avg Temp in Fahrenheit' format F6.1;

declare double  sum;

declare integer cnt;

declare package packlib.conv cnv();             

retain sum cnt;

method init();

    sum = 0;

    cnt = 0;

end;

 

method run();

    set ds2DegC_1;

    degF = cnv.C_to_F(degC);             

    sum = sum + degF;

    cnt = cnt + 1;

    output ds2DegF_5;

end;

method term();

    avgF = sum / cnt;

    output ds2AvgF_5;

end;

enddata; 

run;

quit;

   A package variable called cnv is declared and instantiated.

   C_to_F() is invoked in the conv package.
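The deferred form of instantiation described earlier can be used the same way. The following is a minimal sketch, assuming the packlib.conv package compiled above is available. The input value 100 is an illustrative literal and is not taken from the original program.

```sas
proc ds2;
data _null_;
    /* Declared without parentheses: the variable exists, */
    /* but no package instance has been created yet.      */
    declare package packlib.conv cnv;

    method init();
        declare double degF;
        /* Instantiate on demand with the _NEW_ operator. */
        cnv = _new_ packlib.conv();
        /* 100 is an assumed sample temperature in Celsius. */
        degF = cnv.C_to_F(100);
        put degF=;
    end;
enddata;
run;
quit;
```

Deferring instantiation in this way can be useful when the constructor arguments are not known until the program is running.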

Package as Object

In the conv package above, you were simply creating a collection of methods. The package has no attributes. In OOP, objects (or packages) usually have both attributes and methods. To make the conv package act more like an object, it needs to have more than a collection of methods.

Constructor

Instantiation allocates memory for a package. In addition, instantiation can set initial values for attributes. Whenever a package is instantiated, a special method called a constructor is called. The constructor has the same name as the package. If a constructor method was not explicitly defined, DS2 creates an empty constructor the same way it creates the init(), run(), and term() methods if they were not explicitly defined. In addition, like these system-defined methods, an explicitly defined constructor method cannot return a value. Unlike the system-defined methods, a constructor method can have arguments. In fact, it is common to have an overloaded constructor method, allowing the package variable to be created with specific initial attributes.

Here a package called accumulate is created with three constructor methods:

proc ds2;

package packlib.accumulate / overwrite = yes;

declare integer cnt;                            

declare numeric(10,2) amt;

method accumulate();                            

    cnt = 0;

    amt = 0.00;

end;

method accumulate(numeric(10,2) inAmt);                

    accumulate();

    if inAmt > 0.00

    then

      do;

        cnt = 1;

        amt = inAmt;

      end;

end;

method accumulate(integer inCnt, numeric(10,2) inAmt);

       accumulate();

       cnt = inCnt;

       amt = inAmt;

end;

method add(numeric(10,2) inAmt);                       

    cnt = cnt + 1;

    amt = amt + inAmt;

end;

method setCnt(integer inCnt);                          

    if (cnt = 0) then cnt = inCnt;

end;

method setAmt(numeric(10,2) inAmt);

    if (amt = 0.00) then amt = inAmt;

end;

method getCnt() returns integer;                       

    return (cnt);

end;

method getAmt() returns numeric(10,2);

    return (amt);

end;

method getAvg() returns numeric(10,2);

    return (amt/cnt);                                         

end;

endpackage;

run;

quit;

   The package has two attributes: cnt to represent the number of times the accumulator has been incremented and amt, which is the running total.

   This is the default constructor method. This constructor is called when the accumulate package is instantiated with no arguments. It sets both attributes to zero. In this example, there are no other actions that it needs to take. For more complex packages, the constructor can perform several initialization actions.

   The constructor is overloaded. When the package is instantiated with one argument, the argument represents the initial amount; if that amount is greater than zero, the counter is set to 1. There is another constructor to initialize both the counter and the amount. Each overloaded constructor first invokes the default constructor. This is a good practice because it ensures that all initialization steps are performed no matter which constructor is used.

   The add method increments the counter by 1 and updates the running total with inAmt. The underlying assumption is that amounts are updated one at a time. Hence, the counter is incremented by 1. If amounts can be batched, you would need a second add method that provided both the count and the amount.

   A package encapsulates data and logic, so you need methods to control access to the data. The setCnt method sets a new value for the counter only if the counter is zero. The similar setAmt method sets a new value for the running total only if the running total is zero. These set methods control updates to the package attributes.

   The getCnt method retrieves the current value of the counter. The similar getAmt method retrieves the value of the running total. These get methods control read access to the package attributes.

   The getAvg method calculates and returns the current average amount.
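The batched add method mentioned above can be sketched as a second, overloaded add method. This is a hedged illustration and not part of the original package: batchAccum is a hypothetical package name, it is saved to the Work library, and it uses double instead of numeric(10,2) only to keep the example short.

```sas
proc ds2;
/* Hypothetical, trimmed-down variant of the accumulate package */
package batchAccum / overwrite=yes;
    declare integer cnt;
    declare double amt;

    /* Default constructor */
    method batchAccum();
        cnt = 0;
        amt = 0;
    end;

    /* One amount at a time: the counter goes up by 1 */
    method add(double inAmt);
        cnt = cnt + 1;
        amt = amt + inAmt;
    end;

    /* Overload: a batch of inCnt amounts whose total is inAmt */
    method add(integer inCnt, double inAmt);
        cnt = cnt + inCnt;
        amt = amt + inAmt;
    end;

    method getAvg() returns double;
        return amt / cnt;
    end;
endpackage;
run;

data _null_;
    declare package batchAccum acc();
    method init();
        declare double avg;
        acc.add(10.0);       /* single amount        */
        acc.add(3, 50.0);    /* batch of 3 amounts   */
        /* (10 + 50) / (1 + 3) = 15 */
        avg = acc.getAvg();
        put avg=;
    end;
enddata;
run;
quit;
```

DS2 chooses between the two add methods by the number and types of the arguments, in the same way that it chooses between the overloaded constructors.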

Get and Set

In the accumulate package, in addition to the overloaded constructor methods, you have defined get and set methods to control access to the package attributes. You need to ensure that the attributes can be updated only in a controlled manner. In the previous example, the set methods allow the counter and the running total to be changed only when their current value is zero.

Using the Object

In the first package example, you accessed a temperature conversion package to convert degrees Celsius to degrees Fahrenheit. After the conversion, you accumulated the temperatures and incremented a counter. After all of the records were processed, an average was calculated and written out. To do all of this, you had to create two global variables (cnt and sum), explicitly retain them, do the accumulation, and in the term method, calculate the average. You can create a package that does all that for you in a more controlled manner.

proc DS2;

data ds2DegF_6 (keep=(degF) overwrite=YES)

     ds2AvgF_6 (keep=(avgF) overwrite=YES)

     ;

declare double degF having label 'Temp in Fahrenheit' format F6.1;   

declare double avgF having label 'Avg Temp in Fahrenheit' format F6.1;

declare package packlib.accumulate temps();                          

declare package packlib.conv cnv();

 

method run();

    set ds2DegC_1;

    degF = cnv.C_to_F(degC);

    temps.add(degF);                                                 

    output ds2DegF_6 ;

end;

method term();

    avgF = temps.getAvg();                                           

    output ds2AvgF_6 ;

end;

enddata; 

run;

quit;

   You need the global variables degF and avgF because you want to write them to the result data set. Only the global variables can be written to the result data set.

   The temps variable is declared and instantiated with the default constructor method. This constructor sets both cnt and amt to zero. The temps variable is global, so it is accessible in all methods.

   The add method is invoked with one argument. This method increments the counter and adds to the running total.

   In the term method, the getAvg method is invoked.

The package has simplified the code.

•   There are fewer global variables.

•   The retain statement is not needed.

•   The cnt and sum variables do not have to be explicitly initialized or incremented.

In the future, you will not need to fundamentally change this DS2 program. You can add logic and attributes to the package to accomplish new requirements.

Packages and Scope

Within a package, variables follow the same global and local rules discussed in this book. Variables declared within a method are local to that method. In addition, with the exception of the IN_OUT parameter, parameters of a method are implicitly local to the method. Variables declared outside of all methods are global to the package. They are not visible to any program that uses the package. So, these variables are thought of as attributes of the package. If access to these package attributes is required, you should create get and set methods rather than allowing direct access.
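The separation between program variables and package attributes can be illustrated with a short sketch. It assumes the packlib.accumulate package defined earlier is available. The program-level variable cnt, the local variable pkgCnt, and the values 99 and 5.00 are illustrative choices, not part of the original examples.

```sas
proc ds2;
data _null_;
    /* Global to this program only; unrelated to the */
    /* cnt attribute inside the accumulate package.   */
    declare integer cnt;
    declare package packlib.accumulate acc();

    method init();
        declare integer pkgCnt;   /* local to init() */
        cnt = 99;
        acc.add(5.00);            /* updates the package's own cnt */
        /* The package attribute is reachable only through */
        /* its get method, not by name.                     */
        pkgCnt = acc.getCnt();
        put 'program cnt: ' cnt;
        put 'package cnt: ' pkgCnt;
    end;
enddata;
run;
quit;
```

Under these assumptions, the program's cnt keeps the value assigned in init(), while the value returned by getCnt() reflects only the calls made to the package.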

Package as Method Parameter and Method Return Variable

Package instances (that is, an already instantiated object) can be passed into a method as a parameter.

declare package packlib.accumulate temps(10, 20000);          

method validate(package packlib.accumulate temps) returns integer;   

       return if temps.getCnt() = 0 then 0 else 1;

end;

   The package is declared and instantiated with one of the overloaded constructors.

   The validate method takes a package instance as a parameter.

In addition, a method can return an instantiated package using the _new_ operator.

declare package packlib.accumulate temps;                     

method create(integer cnt, double amt) returns package packlib.accumulate;      

       return _new_ [this] packlib.accumulate(cnt, amt);             

 end;

   A global package temps variable is declared, but not instantiated.

   The method signature specifies that a package instance will be returned.

   The package is instantiated using the _new_ operator. The [this] operator specifies that the new package instance has global scope.

When instantiating a package in a method to be returned by the method, you must ensure that you are referencing a global variable if you are assigning to that variable:

temps = _new_  packlib.accumulate(cnt, amt);

Or, you can use the [this] operator to ensure that the object is in global scope.
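The two fragments above can be combined into one program. The following is a minimal sketch, assuming packlib.accumulate is available as defined earlier. The parameter name acc and the local variable ok are introduced here for illustration. The methods are defined before init() so that they are known when they are called.

```sas
proc ds2;
data _null_;
    /* Declared globally, but not instantiated */
    declare package packlib.accumulate temps;

    /* Returns a new instance; [this] gives it global scope */
    method create(integer cnt, double amt) returns package packlib.accumulate;
        return _new_ [this] packlib.accumulate(cnt, amt);
    end;

    /* Takes an already instantiated package as a parameter */
    method validate(package packlib.accumulate acc) returns integer;
        return if acc.getCnt() = 0 then 0 else 1;
    end;

    method init();
        declare integer ok;
        temps = create(10, 20000);
        /* getCnt() is 10 here, so validate() takes the else branch */
        ok = validate(temps);
        put ok=;
    end;
enddata;
run;
quit;
```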

System-Defined Packages

DS2 provides eight system-defined packages. This section briefly discusses the FCMP and TZ packages. See the SAS® 9.4 DS2 Language Reference, Fifth Edition for more information.

Here are the system-defined packages:

FCMP
Allows access to libraries of user-created FCMP functions.

Hash and hash iterator
Enables you to create and use hash tables. There are minor differences in method syntax compared to the DATA step HASH object, but usage is fundamentally the same.

HTTP
Constructs an HTTP client to access HTTP web services.

JSON
JavaScript Object Notation (JSON) is a text-based, open standard data format that is designed for human-readable data interchange. This package provides an interface to create and parse JSON text.

Logger
Provides a basic interface to the SAS logging facility.

Matrix
Provides access to the matrix programming capability similar to SAS/IML functionality.

SQLSTMT
Provides a way to pass FedSQL statements to a DBMS for execution and to access the result set returned by the DBMS.

TZ
Provides a way to process local and international time and date values.

FCMP Package

In the 9.2 release, SAS added the ability to create user-written functions and call routines using the SAS DATA step language. You could make these functions and call routines available to any DATA step programmer. These functions are called the same way a SAS built-in function is called. In the DATA step, there is no apparent difference between a user-written function and a built-in function. With DS2, you can still access these user-written functions. However, you must access them through the FCMP package. Creating user-defined functions is not covered in detail in this section. See Eberhardt [2009], Eberhardt [2010], and Secosky [2007] for more information about using PROC FCMP.

PROC FCMP with Functions

Let’s look at an FCMP implementation of the F_to_C() and C_to_F() methods from Chapter 1:

proc fcmp outlib=work.fcmpconv.base ;

function C_to_F(C) ;      

  /* convert degrees celsius to degrees fahrenheit */

  return (32 + (C * (9./5.)));

endsub;

function F_to_C(F) ;

  /* convert degrees fahrenheit to degrees celsius */

  return ((F -32) * (5./9.));

endsub;

run;

quit;                             

   The FCMP procedure is invoked and the compiled functions are saved to work.fcmpconv.base.

   The function is created. Because no return type is specified, a numeric return type is assumed.

   PROC FCMP ends with run and quit.

PROC FCMP with Functions in a DATA Step

The functions are created and saved. You can now call them in a DATA step program. Before you can use the functions, you have to let the DATA step know where to find the saved functions using the cmplib= option.

options cmplib=work.fcmpconv;

data testDS_2;

    drop i;

    format f 6.2;

    do i = 1 to 1000000;

       do c = 0 to 100;

          f = C_to_F(C);   

       end;

    end;

       do c = 0 to 100;

          f = C_to_F(C);   

      output;

   end;

run;

   This is the search path for the FCMP functions.

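   The C_to_F() function is invoked inside the nested loop, where no rows are output.

   The C_to_F() function is invoked again, and each result is written to the output data set.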

The functions created by PROC FCMP are invoked the same way a built-in function is called.

FCMP Package with Functions in DS2

To use PROC FCMP functions in DS2, you must wrap them in a package that DS2 can access. You create a DS2 package that references the FCMP library. Once the package is created, you declare it, and then access it in the DS2 program similar to the way you access methods in a user-defined package.

First, create the FCMP package:

proc ds2;

package fcmpCnv /                

       encrypt=SAS               

        language='fcmp'          

        table='work.fcmpconv'     

        overwrite=YES             

       ;

run;

quit;

   Create a package called fcmpCnv.

   The package uses SAS encryption.

   The package language is FCMP, so the package exposes FCMP functions.

   The location of the FCMP library is specified.

   If the package exists, overwrite it.

Declare a package instance in a DS2 program, and then access the functions through the package instance:

proc ds2;

data testDS2_3 (overwrite=YES);

DECLARE double f having format 6.2;

DECLARE double c;

declare package fcmpCnv cnv();    

  method init();

    declare bigInt i;

    do i = 1 to 1000000;

       do c = 0 to 100 ;

          f = cnv.C_to_F(C);      

       end;

    end;

       do c = 0 to 100 ;

          f = cnv.C_to_F(C);      

          output;

       end;

  end;

enddata;

run;

quit;

   The package is declared and instantiated.

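   The C_to_F() function is invoked through the package inside the nested loop, where no rows are output.

   The C_to_F() function is invoked through the package again, and each result is written to the output data set.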

Accessing the FCMP functions looks the same as accessing the DS2 method through a package.

PROC FCMP versus DS2 Methods

If you already have libraries full of FCMP functions that you need to access, you should use the FCMP package. If you need to create functions that will be used by both the DATA step and DS2, you should create FCMP functions and use the FCMP package. If you do not have FCMP functions and you do not need to share code, you should create DS2 methods.

TZ Package

Suppose you have three transactions:

1.   2016-02-01 at 11:30 p.m.

2.   2016-02-01 at 10:32 p.m.

3.   2016-02-01 at 10:31 p.m.

You have to sequence them and get the interval between them. At first, this looks easy. However, each transaction is in a different time zone.

1.   2016-02-01 at 11:30 p.m. Honolulu, USA local time

2.   2016-02-01 at 10:32 p.m. Shanghai, China local time

3.   2016-02-01 at 10:31 p.m. Johannesburg, South Africa local time

To correctly sequence these transactions, you need to convert the times to a common time zone, which is usually Coordinated Universal Time (UTC). To do this in DS2, use the TZ package. The TZ package extends SAS date and datetime processing by adding time zone capabilities. To facilitate time zone processing, the TZ package has multiple methods. Because many of the methods are overloaded, see SAS® 9.4 DS2 Language Reference, Fifth Edition for information about each one. Here are the methods:

GETLOCALTIME
Returns the local time. The method is overloaded to accept a time zone parameter.

GETOFFSET
Returns the offset (in hours) from UTC.

GETTIMEZONEID
Returns the current time zone ID.

GETTIMEZONENAME
Returns the current time zone name.

GETUTCTIME
Returns the current UTC time.

TOISO8601
Converts a local time to the ISO 8601 format.

TOLOCALTIME
Converts UTC time to local time.

TOTIMESTAMPZ
Converts a local time to a time stamp string. The string includes the time zone.

TOUTCTIME
Converts a local time to UTC time.

For more information about the time zone ID names, see “Time Zone IDs and Time Zone Names” in SAS National Language Support (NLS): Reference Guide. For more information about ISO 8601, see Eberhardt and Qin [2013].

TZ Example

Remember the three time stamps?

1.   2016-02-01 at 11:30 p.m. Honolulu local time

2.   2016-02-01 at 10:32 p.m. Shanghai local time

3.   2016-02-01 at 10:31 p.m. Johannesburg local time

This program takes these three times, writes each one in ISO 8601 format, and converts each one to UTC.

proc ds2;

data _null_ ;

   method init();

      declare package tz tzone();                             

      dcl double local_time ;

      dcl varchar(40) local_time_iso local_time_utc local_time_tz;

      dcl timestamp h_time s_time j_time utc_time;

      h_time = timestamp'2016-02-01 23:30:00';                

      s_time = timestamp'2016-02-01 22:32:00';

      j_time = timestamp'2016-02-01 22:31:00';

      put 'ISO8601 time';                                            

      local_time_iso = tzone.toiso8601(h_time, 'Pacific/Honolulu');

      put '1h. ' local_time_iso;

      local_time_iso = tzone.toiso8601(s_time, 'Asia/Shanghai');

      put '1s. ' local_time_iso;

      local_time_iso = tzone.toiso8601(j_time, 'Africa/Johannesburg');

      put '1j. ' local_time_iso;

      put;

      put 'UTC time';                                                

      local_time_utc = tzone.toutctime(h_time, 'Pacific/Honolulu');

      put '2h. ' local_time_utc datetime32.;

      local_time_utc = tzone.toutctime(s_time, 'Asia/Shanghai');

      put '2s. ' local_time_utc datetime32.;

      local_time_utc = tzone.toutctime(j_time, 'Africa/Johannesburg');

      put '2j. ' local_time_utc datetime32.;

      put;

   end;

enddata;

run;

quit;

   A TZ package is instantiated. Because no time zone was specified, the time zone in the SAS timezone= option is used. When the TZ package references a local_time, it references the local time of the package.

   Three time stamps are created to represent the times in the three time zones.

   The time stamp is converted to ISO 8601 format. In these method calls, both a timestamp variable and its time zone are provided. toiso8601() converts the time stamp to the ISO 8601 format using the time zone offset associated with the time zone provided. If no time zone is provided, the time zone in the SAS timezone option is used.

   The local time is converted to UTC time. Because a time zone is provided, the result is the UTC time of the local time in the time zone provided. If no time zone is provided, the time zone in the SAS timezone option is used.

Here is the log:

ISO8601 time

1h.  2016-02-01T23:30:00.00-10:00

1s.  2016-02-01T22:32:00.00+08:00

1j.  2016-02-01T22:31:00.00+02:00

UTC time

2h.                02FEB2016:09:30:00

2s.                01FEB2016:14:32:00

2j.                01FEB2016:20:31:00

With increasing global data, time zone processing has become more crucial to ensure the correct sequencing of events, whether it is for global stock trading or sequencing patient visits in an international drug trial.
