Chapter 2: Object-Oriented Programming for SAS Programmers

Background and Definition

Dog Class

An Example of OOP

Moving Forward

Background and Definition

Object-oriented programming (OOP) is a programming style built around data structures that hold data in fields, commonly called attributes, together with code in the form of procedures, commonly called methods. This combination of a data structure and code is called a class. An object is a particular instantiation of a class. All interaction between the data in an object and the program that created the object is controlled through the object’s methods.

Many explanations and definitions of OOP try to clarify the concept with a concrete example, such as a dog:

Dog Class

Attributes

•   Breed

•   Name

•   Color

Methods

•   Bark

•   Run

•   Eat

•   Wag tail

You can create an object from the dog class and assign it values. It might look like this:

DECLARE dogClass myDog();

myDog.breed = 'Wolf Hound';

myDog.name = 'Sherlock';

myDog.bark();

myDog.run();

This is simple code and can be easily understood. However, as a SAS programmer, you probably have never needed a dog object in your work assignments!

Let’s look deeper. The typical DATA step program starts with DATA and ends with RUN. It executes the statements in between from top to bottom, except where flow control statements (for example, IF-THEN/ELSE) change the order. Because of this, some statements are not executed in every iteration. However, the step as a whole is executed once for every row in the input data.
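The row-by-row behavior described above can be seen in a minimal sketch. The table work.demo and its three values are made up for illustration; they are not part of the example that follows in this chapter.

```sas
/* Hypothetical input: three rows with one numeric column */
data work.demo;
   input amount;
   datalines;
50
150
75
;
run;

data _null_;
   set work.demo;              /* one row is read in each iteration */
   put _n_= amount=;           /* executed in every iteration */
   if amount > 100 then        /* the next PUT runs only when the condition is true */
      put '   large amount';
run;
```

The first PUT statement writes one line to the log for each input row, while the conditional PUT statement writes a line only for rows that pass the test.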

An Example of OOP

Suppose you have a table containing dollar amounts, and you want to read each row, sum the amounts, and keep a running total. When all rows are read, you want to calculate an average amount. Here is some simple code to get this:

DATA avgAmt;

       retain total 0;

       drop amount;

       set amounts end = done;

       total = total + amount;

       if done then

       do;

         avgAmt = total / _n_;

         output;

       end;

run;

Now, suppose that hundreds of people in your company want to do the same thing. They copy your code, change the input data source, and use different variable names. You and your code become a legend.

One day, five minutes before you are ready to leave for a six-week around-the-world holiday, the CFO calls you (because you are a legend). The financial health of the company depends on an immediate change to your program, and this change must be applied immediately to all versions of the program in use throughout the company. He needs the minimum amount and maximum amount values right now or the company will flounder and you will be without a job.

If it were just your program that needed to change, you could do the following:

DATA summaryAmts;

       retain total max min 0;   

       drop amount;

       set amounts end = done;

       if _n_ = 1 then

       do;

          min = amount; max = amount;   

       end;

       total = total + amount;

       if amount < min then min = amount;   

       if amount > max then max = amount;   

       if done then

       do;

         avgAmt = total / _n_;

         output;

       end;

run;

   Two numeric variables (max and min) are added to the program.

   The values of max and min are initialized in the first iteration of the DATA step.

   The min value is checked and updated.

   The max value is checked and updated.

Instead of just coding this simple change in your program, you update your resume because there is no way or time for you to change the hundreds of versions of the program throughout the company.

If you knew OOP best practices and could turn back time and rewrite the original program, what could you do differently?

You could write something called an accumulator. The task of your accumulator is to keep a running total. In addition, it keeps track of the number of times an amount is added to the accumulator. It provides the average amount whenever it is asked. You could add the accumulator to your program.

proc DS2;

package packlib.accumulate /overwrite=YES;   

declare integer cnt;   

declare numeric(10,2) amt;

method accumulate();   

    cnt = 0;

    amt = 0.00;

end;

method accumulate(numeric(10,2) inAmt);   

    accumulate();   

    if inAmt > 0.00    

    then

      do;

        cnt = 1;

        amt = inAmt;

      end;

end;

method accumulate(integer inCnt, numeric(10,2) inAmt);   

    accumulate();

    cnt = inCnt;

    amt = inAmt;

end;

method add(numeric(10,2) inAmt);   

    cnt = cnt + 1;

    amt = amt + inAmt;

end;

method setcnt(integer inCnt);      

    if (cnt = 0) then cnt = inCnt;

end;

method setamt(numeric(10,2) inAmt);      

    if (amt = 0.00) then amt = inAmt;

end;

method getCnt() returns integer;

    return (cnt);

end;

method getAmt() returns numeric(10,2);

    return (amt);

end;

method getAvg() returns numeric(10,2);

    return (amt/cnt);

end;

endpackage;

run;

quit;

What is all this OOP doing?

   A package named accumulate is created and saved in the packlib library.

   The accumulate package has two attributes: cnt and amt.

   accumulate() is the default constructor method. It is invoked when the package is instantiated (that is, when the object is created in the calling program). The default constructor has all of the actions to create the object. In this method, the two attributes are both initialized to zero.

   In addition to the default constructor method, accumulate(inAmt) is a constructor method that creates an object with a starting amount.

   Its initialization starts by calling the default constructor. This ensures that all actions to create the object are performed.

   The cnt and amt attributes are updated only if the starting amount is greater than zero.

   A third constructor, accumulate(inCnt, inAmt), takes both a starting cnt value and a starting amt value. Having multiple methods with the same name but different arguments is called method overloading.

   The add() method increments the running total and the counter.

   The setcnt() method allows the cnt attribute to be set, but only while it is still zero.

   The setamt() method allows the amt attribute to be set, but only while it is still zero.

   The getCnt() method queries the current value of the cnt attribute.

   The getAmt() method queries the current value of the amt attribute.

   The getAvg() method queries the current average amount.

“WAIT!!!” you cry, “my original program was just 11 lines long, and now it is about 60 lines long! And it doesn’t even do anything!” What this 60-line program does do is create a reusable package that has more functionality than the original program. Now, the original program plus the package meets the CFO’s current needs.

proc ds2;

data acc (overwrite=yes);

declare package packlib.accumulate fees();  

declare numeric(10,2) total avgAmt having format comma15.2;

drop amount;

method run();

    set amounts;

    fees.add(amount);  

end;

method term();

    total = fees.getAmt();

    avgAmt = fees.getAvg();

    output;

end;

enddata;

run;

quit;

   An object named fees from the accumulate package is created. Because there are no arguments in the instantiation call, the default constructor method is used.

   The add() method of the fees object is invoked.

   In the term() method, separate get methods capture the running total and the average.
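For comparison, the same declaration syntax can select the other constructors by passing arguments. The sketch below assumes that the accumulate package has already been compiled into a library assigned as packlib. The object names acc0, acc1, and acc2 and the variables curCnt and curAmt are invented for this illustration.

```sas
proc ds2;
data _null_;
   /* The argument list in each declaration selects the matching constructor */
   declare package packlib.accumulate acc0();           /* accumulate()             */
   declare package packlib.accumulate acc1(25.00);      /* accumulate(inAmt)        */
   declare package packlib.accumulate acc2(3, 75.00);   /* accumulate(inCnt, inAmt) */
   declare integer curCnt;
   declare numeric(10,2) curAmt;

   method init();
      /* Read each object's state back through its get methods */
      curCnt = acc0.getCnt();
      curAmt = acc0.getAmt();
      put 'acc0:' curCnt= curAmt=;

      curCnt = acc1.getCnt();
      curAmt = acc1.getAmt();
      put 'acc1:' curCnt= curAmt=;

      curCnt = acc2.getCnt();
      curAmt = acc2.getAmt();
      put 'acc2:' curCnt= curAmt=;
   end;
enddata;
run;
quit;
```

Each object keeps its own cnt and amt values, so the three log lines should differ according to the constructor that built the object.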

At this point, the program looks more manageable. The object named fees adds the amount of each row to the total fees. When all rows have been processed, you get totals. But really, how has this helped you?

First, the accumulate package logic (those 60 lines) has been separated from the program. Every time someone needs to keep a running total or to calculate an average, they need only to request a copy of the current accumulate package. Focus on the keyword “current.” If the accumulate package is updated, every program that uses it is updated.

Second, the accumulate package can be used to accumulate any numeric variable, regardless of its name. You just invoke the add() method. You can get an average by invoking the getAvg() method. And, access to the accumulate package is controlled.

Because the accumulate package is outside of the program, you can update just the package to keep track of the minimum and maximum amounts. You do not need to change each program to get these values. You can provide a getMin method and a getMax method to make these available to all versions of the program.

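declare numeric(10,2) min max;   /* new attributes, declared with cnt and amt */

method accumulate();

    cnt = 0;

    amt = 0.00;

    min = 99999999.99;    /* largest numeric(10,2) value, so any amount is lower */

    max = -99999999.99;   /* smallest numeric(10,2) value, so any amount is higher */

end;

method add(numeric(10,2) inAmt);

    cnt = cnt + 1;

    amt = amt + inAmt;

    /* Test each bound on its own so that the first amount sets both min and max */

    if inAmt > max then max = inAmt;

    if inAmt < min then min = inAmt;

end;

method getMin() returns numeric(10,2);

    return (min);

end;

method getMax() returns numeric(10,2);

    return (max);

end;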

   The default constructor method was changed to initialize min and max.

   The max and min values are tested.

   Two new get methods for the minimum and maximum are added.

Now, all programs that use the accumulate package can capture minimum and maximum values.
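As a sketch of what that could look like, the calling program shown earlier might be extended as follows. It assumes that the updated accumulate package has been recompiled into packlib. The output table name summaryAmts is reused from the DATA step version, and the variable names minAmt and maxAmt are chosen here for illustration.

```sas
proc ds2;
data summaryAmts (overwrite=yes);
   declare package packlib.accumulate fees();
   declare numeric(10,2) total avgAmt minAmt maxAmt having format comma15.2;
   drop amount;

   method run();
      set amounts;
      fees.add(amount);      /* unchanged: add() now also tracks min and max */
   end;

   method term();
      total  = fees.getAmt();
      avgAmt = fees.getAvg();
      minAmt = fees.getMin();   /* new get method */
      maxAmt = fees.getMax();   /* new get method */
      output;
   end;
enddata;
run;
quit;
```

The run() method is identical to the earlier version. Only programs that want the new values need the two extra calls in term().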

Moving Forward

The moral of the story is this—object-oriented programming enables you to make objects that do many things. Objects have attributes, such as running total, minimum amount, or maximum amount. Objects have methods that control access, such as add(), getAvg(), or getMin(). Objects can be used by many applications for many different reasons. One application might need to add dollar amounts. Another application might need to add fuel fill-ups for the corporate truck fleet. In one program, you can create multiple objects to do all of these things.
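A minimal sketch of one program that uses two accumulate objects follows. The input table truckLog, its columns fee and gallons, and the output names are hypothetical. The accumulate package is assumed to be available in packlib.

```sas
proc ds2;
data fleetSummary (overwrite=yes);
   declare package packlib.accumulate fees();   /* accumulates dollar amounts */
   declare package packlib.accumulate fuel();   /* accumulates fuel fill-ups  */
   declare numeric(10,2) totalFees avgFee totalFuel avgFillUp having format comma15.2;
   keep totalFees avgFee totalFuel avgFillUp;

   method run();
      set truckLog;          /* hypothetical table with columns fee and gallons */
      fees.add(fee);
      fuel.add(gallons);
   end;

   method term();
      totalFees = fees.getAmt();
      avgFee    = fees.getAvg();
      totalFuel = fuel.getAmt();
      avgFillUp = fuel.getAvg();
      output;
   end;
enddata;
run;
quit;
```

Each object holds its own running total and count, so the two sets of figures do not interfere with each other.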

Because access to an object is controlled through methods, changes to the object itself do not affect access. Furthermore, new access methods can be added without breaking applications that use the older access methods.

Do you have to understand everything about OOP to take advantage of it in your DS2 programs? No. Think of it like driving a car. You do not have to understand the physics of understeering and oversteering to drive, but if you do, driving on challenging roads can be a lot more fun.
