Your benchmarking is most likely to yield useful results
if you follow these guidelines:
-
Before you test the programming
techniques, turn on the SAS system options that report resource usage.
As explained earlier,
to track and report on resource usage, you can use some or all of
the system options STIMER, MEMRPT, FULLSTIMER, and STATS. The availability,
usage, and functionality of these options vary by operating environment.
You can also specify MSGLEVEL=I to display additional notes in the
SAS log. Use the FULLSTIMER option to log a complete list of the resources
that are used.
Note: To turn on the FULLSTIMER
option, use the following statement:
options fullstimer;
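As a sketch of the setup described in this guideline, the following statements turn on detailed resource reporting and the additional log notes, and then confirm the settings in the SAS log. It assumes that both options are available in your operating environment.

```sas
/* Report the complete list of resource statistics and write */
/* additional informational notes to the SAS log.            */
options fullstimer msglevel=i;

/* Display the current value of each option in the log. */
proc options option=fullstimer;
run;

proc options option=msglevel;
run;
```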
-
Execute the code for each programming
technique in a separate SAS session.
The
first time that program code (including the DATA step, functions,
formats, and SAS procedures) is referenced, the operating system might
have to load the code into memory or assign virtual address space
to it. The first time data is read, it is often loaded into a cache
from which it can be retrieved more quickly the next time it is read.
The resource usage that is required for performing these actions is
overhead. Using a separate SAS session for each technique can
minimize the effect of this overhead on your resource statistics.
-
In each programming technique that
you are testing, include only the SAS code that is essential for performing
the task.
If you include too
many elements in the code for each technique, you cannot tell which element
caused the results. If the program that you are benchmarking is not
large, you can optimize it by changing individual programming techniques,
one at a time, and running the entire program after each change to
measure the effect on resource usage. However, a more complex program
might be easier to optimize by identifying the steps that use the
most resources and extracting those steps into separate programs.
You can measure the effects of different programming techniques by
repeatedly changing, running, and measuring the separate programs.
When isolating parts of your program, be careful to measure their
resource usage under the conditions in which they are used in the
complete program.
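A minimal sketch of an isolated benchmark might look like the following. The library path, the data set BENCH.SALES, the variable REGION, and the two techniques being compared (a subsetting IF statement and a WHERE statement) are hypothetical and serve only to illustrate the layout. Each program contains only the step under test, and each is meant to be saved as its own file and run in its own SAS session.

```sas
/* ---- technique_a.sas: run in its own SAS session ---- */
options fullstimer;
libname bench "/path/to/benchmark/data";   /* hypothetical location */

/* Technique A: subsetting IF. Each observation is read into */
/* the program data vector and then tested.                  */
data work.east;
   set bench.sales;                        /* hypothetical data set */
   if region = 'EAST';
run;

/* ---- technique_b.sas: run in a separate SAS session ---- */
options fullstimer;
libname bench "/path/to/benchmark/data";   /* hypothetical location */

/* Technique B: WHERE statement. Observations are selected   */
/* before they are read into the program data vector.        */
data work.east;
   set bench.sales;
   where region = 'EAST';
run;
```

Because the two programs differ only in the statement under test, any difference in the FULLSTIMER statistics can be attributed to that statement rather than to surrounding code.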
-
If your system is doing other work
at the same time that you are running your benchmarking tests, be
sure to run the code for each programming technique several times.
Running the code several
times lets you account for the variability in resource consumption that is
caused by other work that the system is doing. How you handle multiple
measurements depends on the resource, as indicated below:
  -
  Use the minimum real time and CPU
  time measurements, because these represent most closely the amount
  of time your programming technique actually requires. The larger time
  values (especially in the case of real time) are the result of interference
  from other work that the computer was doing while your program ran.
  -
  The amount of memory should not
  vary from trial to trial. If memory does vary, it is possible that
  your program sometimes shares a resource with another program. In
  this situation, you must determine whether the higher or lower memory
  consumption is more likely to be the case when your program is used
  in production.
  -
  I/O can be an especially elusive
  resource to measure. With modern file systems and storage systems,
  the effect of your program on the I/O activity of the computer sometimes
  must be observed by operating system tools, file system tools, or
  storage system tools because it cannot be captured by your SAS session.
  Data is often aggressively cached by modern file systems and storage
  systems, and file caches are greatly affected by other activity in
  the file system. Be realistic when you measure I/O: it is possible
  to achieve good performance on a system that is not doing other work,
  but performance is likely to worsen when the application is deployed
  in a more realistic environment.
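One way to apply the rule for time measurements is to record the real time and CPU time from each run and let SAS report the minimum for each technique. The sketch below assumes that the timings were copied by hand from the logs of repeated runs. The data set name, variable names, and all values are invented for illustration and are not measurements.

```sas
/* Timings copied by hand from the logs of repeated runs.  */
/* All values below are invented for illustration only.    */
data work.timings;
   input technique $ trial real_time cpu_time;
   datalines;
A 1 4.21 3.10
A 2 3.95 3.08
A 3 5.60 3.12
B 1 3.40 2.51
B 2 3.38 2.49
B 3 4.75 2.50
;
run;

/* Report the minimum real time and CPU time per technique. */
proc means data=work.timings min maxdec=2;
   class technique;
   var real_time cpu_time;
run;
```

Only the time measurements are summarized this way. Memory and I/O figures call for the judgment described in the list above, not a simple minimum.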
-
Run your benchmarking tests under
the conditions in which your final program will run.
Results might vary
under different conditions, so it is important to control the conditions
under which your benchmarks are run. For example, if your production
environment uses batch execution and large data sets, incorporate
these conditions into your benchmarking environment.
-
After testing is finished, consider
turning off the options that report resource usage.
The options that report
resource usage are themselves consumers of resources. If it is a higher
priority in your environment to minimize resource usage than to periodically
check an application's resource usage, then it is most efficient
to turn off these options.
Note: To turn off the FULLSTIMER
option, use the following statement:
options nofullstimer;
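If you do not know whether FULLSTIMER was already in effect before testing, you might prefer to restore the earlier setting instead of turning the option off unconditionally. The following is a minimal sketch that assumes the GETOPTION function returns the option name with or without the NO prefix for an on/off option such as FULLSTIMER. The macro variable name SAVED_TIMER is hypothetical.

```sas
/* Remember the setting that is in effect before the test.   */
/* Assumption: GETOPTION returns FULLSTIMER or NOFULLSTIMER. */
%let saved_timer = %sysfunc(getoption(fullstimer));

/* Turn on detailed resource reporting for the test. */
options fullstimer;

/* ... code being benchmarked goes here ... */

/* Restore the setting that was in effect before the test. */
options &saved_timer;
```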