Chapter 6

Strategies for Efficient and Effective Simulation

6.1 Overview of Simulation Strategies

An efficient simulation is one that optimizes computational resources so that it finishes in a reasonable amount of time. An effective simulation is one that produces a desired statistical result. Efficiency is achieved through good programming practices and is a primary theme of this book. Effectiveness is achieved through a solid understanding of statistical concepts and methodology.

The design of a simulation study affects its efficiency and effectiveness. One design choice is the number of samples to use in a simulation. If you use too few samples, then the program runs quickly, but uncertainty in the estimates renders the simulation useless. If you use too many samples, then the standard errors of the estimates are tiny, but the program requires hours or days to finish. Clearly, the best choice lies between these two extremes.

This chapter discusses issues that are important for designing efficient and effective simulation studies. The ideas in this chapter are applied throughout this book, but this chapter collects the various strategies in one place. This chapter also describes details of implementing simulations in SAS software, and points out why some implementations are more efficient than others.

 

6.2 The Design of a Simulation Study

Gentle (2009, Appendix A) discusses the fact that simulation studies are computer experiments. As such, Gentle argues, the principles of statistical design can be used to improve the efficiency, reproducibility, and documentation of a simulation study.

Gentle outlines a typical simulation study. Suppose that you want to study the power of ordinary least squares regression to test the hypothesis H0: β1 = 0 for the simple regression model Yi = β0 + β1Xi + Ei, where each response value, Yi, is modeled as a linear function of an explanatory variable value, Xi, and a random variable, Ei, which represents an error term.

This problem contains many experimental factors that might or might not be important in computing the power. Among these factors are the following:

  • The true value of β1. You might choose equally spaced values, such as 0, 0.2, 0.4, …
  • The sample size. You might choose three sample sizes, such as 20, 200, and 2,000.
  • The distribution of the error term. You might choose five distributions: a standard normal distribution and four contaminated normal distributions (see Section 7.5.2) with various percentages and magnitudes of contamination.
  • The design points of the independent variable, x. You might choose three designs: points that are equally spaced, points that have a cluster of extreme values, and points drawn randomly from a skewed distribution.

If you use a full factorial design with, for example, 16 values for β1, then there are 16×3×5×3 = 720 experiments that you need to run. Each experiment is a simulation study, which consists of generating thousands of random samples and running a regression analysis on each sample. It is tempting to feel overwhelmed by the magnitude of the study and by the challenge of summarizing the results. How should you proceed?
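One way to keep track of so many scenarios is to enumerate the design points in a data set before you run anything. The following DATA step is a minimal sketch that builds the full factorial design; the variable names are placeholders, and the 16 values of β1 are assumed to be equally spaced from 0 to 3.

/* sketch: enumerate the 16 x 3 x 5 x 3 = 720 design points */
data Design;
do iBeta = 0 to 15;                      /* 16 slope values: 0, 0.2, ..., 3 */
   beta1 = 0.2*iBeta;
   do SampleSize = 20, 200, 2000;        /* 3 sample sizes                  */
      do ErrDist = 1 to 5;               /* 5 error distributions           */
         do XDesign = 1 to 3;            /* 3 designs for x                 */
            output;
         end;
      end;
   end;
end;
drop iBeta;
run;

The Design data set documents every scenario, so you can check off (or merge results into) each row as the sub-studies are completed.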

The best advice is to start small and start simply. For this experiment, you might want to choose the smallest sample size (20) and the simplest choice for the design of x (equally spaced) and for the distribution of the error term (normally distributed). You should also use a smaller number of samples than the number that you intend to use for the final study. One-tenth the number is often appropriate during the development, debugging, profiling, and optimization of the simulation algorithm.

When you run the small-scale simulation, consider whether you can present the results graphically. For example, Figure 6.1 presents the results as a curve that shows the power as a function of the parameter β1. Standard errors for the estimates are displayed. Section 11.4 shows how to create graphs like Figure 6.1.

Figure 6.1 Preliminary Results of a Simulation Study


You can learn a lot from small-scale simulations, such as the values of β1 for which the power curve is steep or flat, and the magnitude of the standard error for the estimates along the curve. Use this preliminary analysis to plan the final simulation. For example, use the preliminary analysis to choose an appropriate range for β1. For small samples, you need a large range for β1; for large samples, you need a much smaller range. Because the preliminary analysis estimates the standard error along the curve, you might decide to generate more samples when the variance of the estimator is relatively large (approximately 1 ≤ β1 ≤ 2 in Figure 6.1) and use fewer samples when the variance is smaller.

You can also use the preliminary analysis to estimate the run time for the final simulation. If you used 1,000 samples for each simulation in the preliminary study, then a simulation that uses 10,000 samples will take about 10 times as long.

In a similar way, you can run preliminary analyses for some of the other parameter values and scenarios, such as for a contaminated normal error term and for different design points for x. For each scenario, you should save the results (the points and upper/lower limits of the power curve) so that, for example, you can combine several curves on a single graph. By the time you run your final simulation study, you should have a good idea what the results will be.

In summary, when you begin a large simulation study, resist the urge to immediately write and run the entire simulation. Start small. Debug and optimize the simple cases. Run a small-scale simulation. These techniques provide important information that you can incorporate into the final study.

 

6.3 The Effect of the Number of Samples

There are two “sizes” in a simulation study, and it is important to not confuse them. One is the sample size; the other is the number of samples, which is also known as the number of repetitions. As demonstrated in Section 4.4.5, the sample size, N, affects the width of the sampling distribution (that is, the standard error of the statistic). For most well-behaved statistics that are used in practice, small sample sizes correspond to large standard errors, whereas large sample sizes result in smaller standard errors. The value of N is often controlled in a simulation study. It might be set to the sample size of an observed data set, or it might be systematically varied, such as in the power study that is described in Section 6.2.

In contrast, the number of samples in the simulation is something that you have to choose every time that you run a simulation. The number of samples (which in this book is determined by the NumSamples macro variable) determines how well the approximate sampling distribution (ASD) of a statistic approximates the exact sampling distribution.

How many samples are sufficient? Unfortunately, that is a difficult question to answer. The number of samples that you need depends on characteristics of the sampling distribution. Statistics that depend on lower-order moments of the sampling distribution (such as the mean) require fewer samples than statistics that are functions of higher-order moments, such as the variance and skewness. You might need many, many samples to capture the extreme tail behavior of a sampling distribution.

A popular choice in research studies is 10,000 or more samples. However, do not be a slave to any particular number. The best approach is to understand what you are trying to estimate and to report not only point estimates but also standard errors and/or confidence intervals.

If you are only interested in the Monte Carlo estimate of the mean, then you can often use a relatively small number of samples. You can quantify this statement by looking at the Monte Carlo standard error, which is a measure of how close the mean of the sampling distribution is to the unknown population parameter. The Monte Carlo standard error is of the form σ/√m, where σ is the standard deviation of the statistic and m is the number of samples that you generate (Ross 2006). This implies that to halve the Monte Carlo standard error, you need to quadruple the number of samples in the simulation.
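If you have saved the statistic for each sample in a data set (such as the OutStats data set with one SampleMean per sample, as in the macro example in Section 6.4.4), then PROC MEANS reports the Monte Carlo estimate and its standard error directly. The following statements are a minimal sketch; the STDERR statistic is the standard deviation of the statistics divided by the square root of the number of samples.

/* sketch: Monte Carlo estimate and Monte Carlo standard error          */
/* assumes OutStats contains one statistic (SampleMean) for each sample */
proc means data=OutStats N mean stderr;
   var SampleMean;               /* STDERR = std(SampleMean) / sqrt(m) */
run;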

Increasing the number of samples is one technique to reduce the Monte Carlo standard error, but other methods also exist. These so-called variance reduction techniques are described in Ross (2006, Ch. 8), Ripley (1987, Ch. 5), and Jones, Maillardet, and Robinson (2009, Ch. 20). There are two difficulties with using these techniques. First, they sometimes require considerable ingenuity because you need to construct certain auxiliary random variables with special properties. Second, they make programming the simulation more complicated because there is more “bookkeeping” in order to keep track of the auxiliary variables. In other words, these techniques enable you to use fewer simulations to obtain better estimates, but at the cost of a more complicated program.

 

6.4 Writing Efficient Simulations

The previous sections described several efficient simulation techniques. This section summarizes those techniques and describes additional techniques for efficiency.

6.4.1 The Basic Structure of Efficient Simulations in SAS

Recall that there are two basic techniques for simulating and analyzing data: the BY-group technique and the in-memory technique.

When you use the BY-group technique, do the following:

  • Identify each sample by a unique value of the SampleID variable. Use a BY statement to compute statistics for each BY group.
  • Suppress all output during the BY group analysis. Many procedures have a NOPRINT option in the PROC statement. Otherwise, use the method described in Section 6.4.2 to suppress ODS output.

When you use the SAS/IML in-memory technique, do the following:

  • Use the J function to allocate a vector (or matrix) to store the simulated data before you call the RANDGEN subroutine. This enables you to generate an entire sample (or even multiple samples) with a single call to the RANDGEN subroutine. Do not generate one random value at a time.
  • When possible, compute statistics for all samples with a single call. For example, the MEAN function can compute the means of all columns in a matrix, and the subscript reduction operator x[,:] computes the means of all rows in the x matrix (a sketch that combines this item with the previous one follows this list).
  • If you are analyzing multivariate data, then it is often convenient to generate the data in a DO loop. At each iteration, generate a single sample and compute the statistic on that sample. This approach is described further in Chapter 8, “Simulating Data from Basic Multivariate Distributions.”
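The following SAS/IML statements are a minimal sketch that combines the first two items: allocate a matrix whose columns are samples, fill it with a single call to the RANDGEN subroutine, and compute all of the sample means with one subscript reduction. The sample size, number of samples, distribution, and seed are merely illustrative.

proc iml;
call randseed(123);                  /* illustrative seed                  */
N = 10;  NumSamples = 1000;          /* sample size and number of samples  */
x = j(N, NumSamples);                /* allocate all samples at once       */
call randgen(x, "Uniform");          /* fill every sample with one call    */
SampleMean = x[:,];                  /* mean of each column (each sample)  */
quit;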

6.4.2 How to Suppress ODS Output and Graphics

This section assumes that you are familiar with the Output Delivery System (ODS), as described in Section 3.5. ODS enables you to select, exclude, and output tables and graphics.

When you use a SAS procedure to compute statistics for each BY group, you should create an output data set that contains the statistics, as described in Section 4.4.1. Because you are writing the statistics to a data set, you do not need to display any output from the procedure. There are two ways to suppress procedure output: by using a NOPRINT option and by using the ODS EXCLUDE ALL statement and related ODS statements.

About 50 SAS/STAT procedures support the NOPRINT option in the PROC statement. When you specify the NOPRINT option, ODS is temporarily disabled while the procedure runs. This prevents SAS from displaying tables and graphs that would otherwise be produced for each BY group. For a simulation that computes statistics for thousands of BY groups, suppressing the display of tables results in a substantial savings of time. (However, the SAS/STAT User's Guide notes that “there are a few procedures that for historical reasons still might produce some output even when NOPRINT is specified.”)

The NOPRINT option is ideal for writing statistics to a data set by using procedure syntax such as the OUTPUT statement and other “OUT” options, such as the OUTP= option in PROC CORR, the OUT= option in PROC FREQ, the OUTEST= option in PROC REG, and the OUTSTAT= option in PROC GLM.
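For example, the following sketch combines the NOPRINT option, an OUTP= data set, and BY-group processing; the input data set and variable names are placeholders.

/* sketch: compute BY-group correlations without displaying any tables */
proc corr data=SimData noprint outp=CorrOut;
   by SampleID;
   var x y;
run;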

However, sometimes the statistic of interest is available only in an ODS table. In these cases, you cannot use the NOPRINT option because it suppresses all ODS tables, including the one that contains the statistic of interest. In these cases, use ODS to prevent tables and graphics from displaying on your computer monitor, but create an output data set by using the ODS OUTPUT statement.

For example, you can use the following technique to turn off ODS output. Prior to calling the procedure, execute the following statements:

/* suppress output to ODS destinations */
ods graphics off;
ods exclude all;
ods noresults;

The first statement turns off ODS graphics. Technically, you only need to use this statement prior to calling a procedure (such as the REG procedure) that produces ODS graphics by default, but there is no harm in unconditionally turning off ODS graphics. The second statement excludes all tables from open destinations such as HTML or LISTING. The third statement prevents ODS from making entries in the ODS Results window, which is shown in Figure 6.2.

Figure 6.2 The ODS Results Window


You can now run a SAS procedure without seeing any output displayed, and you can use the ODS OUTPUT statement to save a table of statistics to an output data set. After you compute the statistics for each BY group, re-enable ODS output by using the following statements:

ods graphics on;
ods exclude none;
ods results;

Because these sequences of commands are used so frequently in simulation studies, it is convenient to package them into SAS macros:

%macro ODSOff;               /* Call prior to BY-group processing */
ods graphics off;
ods exclude all;
ods noresults;
%mend;
%macro ODSOn;                /* Call after BY-group processing    */
ods graphics on;
ods exclude none;
ods results;
%mend;

With these definitions, you can easily disable ODS output temporarily while computing the statistics for each BY group. For example, the following statements write descriptive statistics to a data set named Desc but suppress the display of tables and graphs. The SimNormal data set was created in Section 4.4.4.

%ODSOff
proc means data=SimNormal;
   by SampleID;
   var x;
   ods output Summary=Desc;
run;
%ODSOn

Sometimes it is convenient to also suppress SAS notes during a simulation. This is covered in the next section.

6.4.3 How to Suppress Notes to the SAS Log

Some SAS procedures write a note to the SAS log as part of their normal operation. For example, procedures that use maximum likelihood estimation write a note for each BY group that reports whether the numerical optimization succeeded for that BY group, as shown in the following example:

NOTE: Convergence criterion (GCONV=1E-8) satisfied.
NOTE: The above message was for the following BY group:
      SampleID=3

Not only do these messages take time to print, but they can also fill up the SAS log, which (when you are running SAS software interactively) results in a dialog box that gives you the option to clear the log. It is therefore desirable to deal with these messages in one of two ways:

  • If you are confident that the notes are uninformative, then you can suppress notes by executing the following command:
options nonotes;

After the procedure runs, you can re-enable notes:

options notes;
  • If you are concerned that some of the notes might be important and you want the opportunity to review the SAS log after the simulation is complete, then you can redirect the SAS log to a file by using the PRINTTO procedure, as follows:
proc printto log='name-of-log-file' new;
run;

After the simulation completes, you can check the log file, reset the log, and restore the output destinations to their default values:

proc printto;              /* no options ==> restore defaults */
run;

6.4.4 Caution: Avoid Macro Loops

Many SAS users attempt to run a simulation by using a macro loop instead of using the template presented in Section 4.4.1. Do not do this. Novikov (2003) compares times for the two methods and concludes (for his application) that the macro-loop technique is 80–100 times slower than the BY-group technique.

To be concrete, the following program is an example of macro code that computes the same quantities as in Section 4.4.2. You should avoid writing programs like this:

/*************************************/
/* DO NOT USE THIS CODE: INEFFICIENT */
/*************************************/
%macro Simulate(N, NumSamples);
options nonotes;                       /* turn off notes to log    */
proc datasets nolist;
   delete OutStats;                    /* delete data if it exists */
run;

%do i = 1 %to &NumSamples;
   data Temp;                          /* create one sample        */
   call streaminit(0);
   do i = 1 to &N;
      x = rand("Uniform");
      output;
   end;
   run;

   proc means data=Temp noprint;       /* compute one statistic   */
      var x;
      output out=Out mean=SampleMean;
   run;

   proc append base=OutStats data=Out;       /* accumulate stats */
   run;
%end;
options notes;
%mend;

/* call macro to simulate data and compute ASD */
%Simulate(10, 100)                     /* means of 100 samples of size 10 */

This approach suffers from a low ratio of work to overhead. The DATA step and the MEANS procedure are called 100 times, but they generate or analyze only 10 observations in each call. This is inefficient because every time that SAS encounters a procedure call, it must parse the SAS code, open the data set, load data into memory, do the computation, close the data set, and exit the procedure. When a procedure computes complicated statistics on a large data set, these “overhead” costs are small relative to the computation performed by the procedure. However, for this example, the overhead costs are large relative to the computational work. For an example of a hybrid approach that uses macro and BY-group processing together, see the next section.

The macro approach also makes it difficult to reproduce the simulation. Notice that zero is used as the seed value, which means that the random number stream is initialized by using the system clock. In this example, you cannot call the STREAMINIT subroutine with a nonzero seed because then every sample data set would contain exactly the same data. An alternative to using a zero seed is to use the macro variable &i as the seed value.

Caution: If you do not turn off the NOTES option, then the performance of the %Simulate macro will be even worse. If the number of simulation loops is sufficiently large, then you might fill the SAS log with irrelevant text.

Exercise 6.1: Compare the run time for the %Simulate macro with the run time for the BY-group technique in Section 4.4.2. Because the NONOTES option is used, you might want to use PROC PRINT to print something when the macro completes.

6.4.5 When Macros Are Useful

The point of the previous section is that SAS simulations run faster when you use a small number of DATA steps and procedure calls to generate and analyze a large amount of data. The BY-group approach is efficient because a single call to a SAS procedure results in computing thousands of statistics.

This does not imply that macros are “bad.” After all, a macro merely generates SAS code. However, if you use macros naively, then you might generate inefficient SAS code.

For huge simulation studies with a large number of design parameters, macros can be useful. Provided that a macro generates an efficient SAS program that uses BY-groups to read and process a lot of data, it is perfectly acceptable to encapsulate your simulation into a macro. In fact, for huge simulation studies, it makes sense to run the simulation as a series of smaller sub-studies. For example, for the simulation study in Section 6.2, you might write a macro that takes parameters for the sample size, the distribution of the error term, and the design of the independent variable. You could then call the macro 3 × 5 × 3 times. Each call creates results that are sufficient to create a graph similar to Figure 6.1.
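The following skeleton is a sketch of what such a macro might look like. The macro name, parameters, and the steps inside it are placeholders; each numbered comment corresponds to an efficient BY-group step like those described in Section 4.4.1.

/* sketch: one sub-study per macro call; each call uses BY-group processing */
%macro PowerStudy(N=, ErrDist=, XDesign=, NumSamples=1000);
   /* 1. DATA step: simulate &NumSamples samples of size &N for this      */
   /*    scenario (&ErrDist errors, &XDesign design), indexed by SampleID */
   /* 2. Regression analysis with a BY SampleID statement (output off)    */
   /* 3. Summarize rejection rates; append the power curve to a results   */
   /*    data set for later graphing                                      */
%mend;

%PowerStudy(N=20,  ErrDist=Normal, XDesign=Equal)
%PowerStudy(N=200, ErrDist=Normal, XDesign=Equal)
/* ... one call for each of the 3 x 5 x 3 = 45 scenarios ... */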

For huge simulation studies, encapsulating the program into a macro can provide several benefits:

  • Efficiency: Novikov and Oberman (2007) noted that if you generate a single huge SAS data set that contains millions of samples, then the performance might be worse than running k analyses, each on 1/k of the samples. Novikov and Oberman suggest using values of k in the range of 20–50. The optimal value depends on the analysis, as well as on the performance characteristics of your computer, but the key point to remember is that running one huge simulation might take longer than running several smaller (but still “large”) simulations.
  • Robustness: If you break up the study into several sub-studies, then you protect against catastrophic events, such as a power outage or a bug in your program that manifests itself just before the simulation completes. With smaller sub-studies, you do not lose all of your results if something catastrophic occurs near the end of the simulation.
  • Distributed processing: If you are conducting a study that involves a large number of independent parameter values and you have access to multiple computers that each have SAS software, then you can distribute the computations by running a subset of parameter values on each computer. Some researchers call simulation studies embarrassingly parallel because it is so easy to distribute the computations. You can distribute these computations manually, or you can use a tool such as SAS Grid Manager.

6.4.6 Profiling a SAS/IML Simulation

Some simulations take a long time to run. Consequently, it is essential to know how to profile a program, which means estimating the performance of various portions of the program. Wicklin (2010, p. 371) defines the performance of a program as “the time required for it to run on typical data, and also how that time changes with the size or characteristics of the data.”

Suppose that part of your simulation involves finding the eigenvalues of a large matrix, and that the size of the matrix varies with the number of variables in the simulation. The following SAS/IML program generates random symmetric n × n matrices for various values of n, and times how long it takes to compute the eigenvalues:

proc iml;
size = do(500, 2000, 250);        /* 500, 750, ..., 2000             */
time = j(1, ncol(size));          /* allocate vector for results     */
call randseed(12345);
do i = 1 to ncol(size);
   n = size[i];
   r = j(n*(n+1)/2, 1);           /* generate lower triangular elements */
   call randgen(r, "uniform");
   A = sqrvech(r);                /* create symmetric matrix            */

   t0 = time();
   evals = eigval(A);
   time[i] = time()-t0;           /* elapsed time for computation    */
end;

The TIME function returns the time of day as the number of seconds after midnight. The TIME function is called once prior to the eigenvalue computation and then again after the eigenvalue computation. The difference between those times is the elapsed time in seconds.

You can write the elapsed times to a SAS data set and use PROC SGPLOT to visualize the performance as a function of the size of the matrix. See Figure 6.3.

create eigen var {"Size" "Time"}; append; close;
quit;
proc sgplot data=eigen;
   title "Performance of Eigenvalue Computation";
   series x=Size y=Time / markers;
   yaxis grid label="Time to Compute Eigenvalues (s)";
   xaxis grid label="Size of Matrix";
run;

Figure 6.3 Performance of a Simple Computation


Figure 6.3 shows that the computation time increases nonlinearly with the size of the matrix. The time required for a 500 × 500 matrix is a fraction of a second. For a 2000 × 2000 matrix, the computation requires about eight seconds.

You can use this technique to time various components of your simulation. You can then focus on optimizing the program statements that are consuming the most time.

6.4.7 Tips for Shortening Simulation Times

This section collects some general suggestions for making your simulation run faster and more efficiently.

The following tips apply to SAS/IML programs:

  • Vectorize computations. This means that you should write a relatively small number of statements and function calls, each of which performs a lot of work. For example, avoid loops over rows or elements of matrices. Instead, use matrix and vector computations (a short timing sketch follows this list).
  • Allocate space for a matrix of random numbers, and then generate all of the random numbers with a single call to the RANDGEN subroutine.
  • Profile the program to identify trouble spots. Profiling means timing different sections of a program to determine the run time for each section, as described in Section 6.4.6.
  • Allocate result arrays outside of a loop; fill the array inside a loop. Never iteratively append to an array when performance is crucial.
  • Optimize the performance of the statements in the innermost loop. These statements are executed most often.
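As an illustration of the first two tips, the following sketch times an element-by-element loop against a single vectorized call that generates the same number of random values. The number of values and the seed are arbitrary.

proc iml;
call randseed(1);
N = 1e5;                              /* number of random values to generate */
x = j(N, 1);                          /* allocate the vector only once       */

t0 = time();                          /* element-by-element: avoid this      */
do i = 1 to N;
   v = j(1, 1);
   call randgen(v, "Normal");         /* one random value per call           */
   x[i] = v;
end;
tLoop = time() - t0;

t0 = time();
call randgen(x, "Normal");            /* vectorized: one call fills all N    */
tVec = time() - t0;

print tLoop tVec;                     /* the loop is typically much slower   */
quit;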

In addition, there are some general performance tips that apply to simulation studies that involve parameters. For example, Section 5.4.2 uses a grid of uniformly spaced values to estimate the power of the t test. Similarly, Section 16.10 uses an evenly spaced grid of parameters to explore how skewness and kurtosis affect the coverage probability of a confidence interval. In studies such as these, a simulation is run for every parameter value, which means that in a straightforward implementation the total time is directly proportional to the number of parameter values. Consequently, keep in mind the following tips:

  • Reduce the size of your parameter grid. A 20 × 20 grid of parameters will take four times longer than a 10 × 10 grid. Use coarse grids to get a broad overview of the simulation results. Only refine the grid when the response function changes faster than the resolution of the grid.
  • If you need to refine the grid, then consider local refinement rather than a global refinement. There is no reason to use a large number of grid points in regions where the response function is not changing much.
  • Distribute independent computations. A 10 × 10 grid of parameter values corresponds to 100 independent simulations. If you have access to four computers, then consider running 25 simulations on each computer and collating the results.
  • Estimate the run time of a simulation before you run it. If you plan to run 100,000 simulations, then start by running 100 simulations and time how long it takes. The final time will be about 1,000 times longer. This will let you know whether you should run the simulation while you go to lunch, or whether you should run it overnight.
  • Remember that the number of points in a grid grows geometrically with the dimension. If you have two parameters, each with 10 possible values, then you have to run 10² = 100 simulations. If you attempt to use five parameters, and each has 10 possible values, then you have to run 10⁵ = 100,000 simulations. Perhaps for some parameters you can consider only three values: small, medium, and large. Or perhaps the problem is symmetric in one of the parameters, and you do not need to consider using negative values. Or perhaps instead of using a complete factorial design you should consider using an incomplete factorial design.
  • Respect the size of a billion. Due to advances in computing power, some people like to joke that “a billion is the new million,” by which they mean that today you can run billions of computations in the same time that it once took to run millions of computations. That is true, but remember that it used to take a long time to run millions of computations.

It is a good idea to figure out how long it takes to generate a billion random normal variates on your computer. You can use this length of time as a benchmark for running future simulations.
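A minimal sketch of such a benchmark is shown below. It uses a DATA _NULL_ step so that the values are generated but not written to a data set; the seed is arbitrary, and the note that SAS writes to the log reports the elapsed (real) time for the step.

/* sketch: time the generation of a billion random normal variates */
data _null_;
   call streaminit(1);
   do i = 1 to 1e9;
      x = rand("Normal");
   end;
run;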

Exercise 6.2: Run a DATA step that generates one million samples, each containing 1,000 random normal observations. Check the SAS log to determine how long it takes to create that simulated data.

Exercise 6.3: Run a SAS/IML program that generates a vector of 1,000 random normal observations. Use the TIME function (see Section 6.4.6) to time how long it takes to fill the vector one million times within a DO loop. (You probably cannot generate all of these values in a single matrix. A billion doubles requires 8 GB of RAM, and SAS/IML 9.3 cannot allocate matrices larger than 2 GB.)

 

6.5 Disadvantages of Simulation

Simulation is a powerful technique, but it has limitations, which include difficulty in generalizing the results, difficulty in organizing the results, and difficulty in applying the results to real data.

A common criticism of simulation studies is that they cannot be generalized beyond the specific population models and the parameter values in the study. Furthermore, real data are rarely distributed according to any “named” distribution, so how well do the results generalize to real data?

As an example, suppose that you use simulation to reveal how deviations from normality affect the performance of a statistical test. In your study, you might simulate data from a t distribution and a lognormal distribution as two instances of nonnormal distributions. If the test performs well on the simulated nonnormal data, then you might want to conclude that the statistic is robust to departures from normality.

However, can you justify generalizing this result to other nonnormal distributions? Probably not. Do your conclusions hold for an exponential distribution or, more generally, for gamma distributions? Simulations provide evidence that something is true, but you need to be careful not to extrapolate beyond the cases that were simulated.

Because of the previous shortcoming, you might be tempted to run a huge simulation study that incorporates a wide range of nonnormal distributions, such as the exponential, gamma, and beta families. You might feel a need to run a simulation for every nonnormal distribution under the sun.

If you were to give in to these temptations, then how would you organize the results? Would you create pages of tables that list results for every distribution and every parameter value? Some researchers do. However, when a study is not well designed, it is difficult to present the results in an organized manner.

Before you start simulating data, think about what you are trying to demonstrate. Perhaps “show this statistic works well for nonnormal data” is too nebulous a goal. Perhaps a more attainable goal is to study how the test behaves for populations with a wide range of skewness and kurtosis.

With this more modest goal, you can design a better simulation study. As shown in Chapter 16, you can sample from distributions whose skewness and kurtosis values lie on a regularly spaced grid (see Figure 16.16). With this design, you can display the results in an organized graph.

In spite of its limitations, simulation remains a powerful technique that belongs in the toolbox of every statistical programmer. Data simulation enables you to compare two or more statistical techniques by studying their performance on common data with known properties. Most importantly, data simulation enables you to understand the sampling distribution of statistics without resorting to asymptotic theory or overly restrictive assumptions about the distribution of data.

 

6.6 References

Gentle, J. E. (2009), Computational Statistics, New York: Springer-Verlag.

Jones, O., Maillardet, R., and Robinson, A. (2009), Introduction to Scientific Programming and Simulation Using R, Boca Raton, FL: Chapman & Hall/CRC.

Novikov, I. (2003), “A Remark on Efficient Simulations in SAS,” Journal of the Royal Statistical Society, Series D, 52, 83–86.
URL http://www.jstor.org/stable/4128171

Novikov, I. and Oberman, B. (2007), “Optimization of Large Simulations Using Statistical Software,” Computational Statistics and Data Analysis, 51, 2747–2752.

Ripley, B. D. (1987), Stochastic Simulation, New York: John Wiley & Sons.

Ross, S. M. (2006), Simulation, 4th Edition, Orlando, FL: Academic Press.

Wicklin, R. (2010), Statistical Programming with SAS/IML Software, Cary, NC: SAS Institute Inc.
