In the interest of simplicity, the SAS code provided above avoided all macro programming, except for using the %CrucialRates macro. However, analysts can profit greatly by making elementary use of the SAS Macro Language. Below is the full program that was used in developing the EER example. Note how the parameters that shape the problem are specified only once at the beginning. Due to rounding, the results obtained with this code differ slightly from those given above.
options ls=80 nocenter; /************************************************************************ Program Name: EER_SSAnalysis060722.sas Date: 22 July 2006 Investigator: "Sol Capote, MD; CHI Malaria Research Group" Statistician: "Phynd Gooden, PhD" (Actually, Ralph O'Brien)
Purpose: Sample-size analysis for comparing usual care only vs. QCA on elysemine:elysemate ratios at 4 hours (EER4). Assumes data will be logNormal in distribution with same relative range in the two groups. ************************************************************************/ *options symbolgen mlogic mprint; * for macro debugging; *options FORMCHAR="|----|+|---+=|-/<>*"; * for ordinary text tables; title1 "Usual Care Only (UCO) vs. Usual Care + QCA (QCA)"; title2 "Difference in 4-hour elysemine-elysemate ratio (EER4),adjusted"; title3 "for three baseline covariates: EER0, plasma lactate, parasitemia"; /******************************************************************* This program is structured so that all the defining values are set through %let macro statements at the start of the code. *******************************************************************/ /********************************** BEGIN TECHNICAL SPECIFICATIONS /**********************************/ * Set label for Y; * ---------------; %let Ylabel = EE Ratio; * Set variable labels for the two groups ; * ---------------------------------------; %let GrpLabel_1 = UCO; %let GrpLabel_2 = DCA; /************************************************************************* Each distribution is logNormal with different medians, but same relative spread (defined below). This is the same as saying that the distributions have different means but the same coefficients of variation. *************************************************************************/ * Supply guesstimates for medians ; * ------------------------------- ; %let Ymedian0 = 2.0; * median for control arm, only one scenario ; %let Ymedian1A = 1.8; * median for experimental arm, scenario A ; %let Ymedian1B = 1.7; * median for experimental arm, scenario B ; * Supply guesstimates for the 95% relative spread, defined below; * ------------------------------------------------------------- ; %let YRelSpread95_1 = 2.5; * YRelSpread95, scenario 1 ; %let YRelSpread95_2 = 3.0; * YRelSpread95, scenario 2 ; *Set NCovariates and supply guesstimates for PrtlCorr(XXX,logY); * -------------------------------------------------------------; %let NCovariates = 0 3 50; * number of covariates ; %let PrtlCorr_XXXlogY = .2 .35 .50 ; * Multiple partial correlation (R) ; * between covariates ("XXX") and ; * logY, within treatment groups. ; * Supply prior probabilities that null is false;
* ---------------------------------------------; %let PriorPNullFalse = .20 .30; /******************************* END TECHNICAL SPECIFICATIONS *******************************/ /********************************************************************** ===== Notes ===== 1. Each distribution is logNormal with different medians, but same relative spread (defined below). This is the same as saying that the distributions have different means but the same coefficients of variation. Under logNormality, medians are also geometric means. 2. Let Y025, Y500 and Y975 be the 2.5%, 50%, and 97.5% quantiles for Y, i.e., Y500 is the median of Y and Y025 and Y975 are the limits of the mid-95% range for Y. 3. Define YRelSpread95 = Y975/Y025 to be the inter-95% relative spread. These relative spreads are taken to be equal in control and experimental groups. 4. Log(Y025), log(Y500), and log(Y925) are the 2.5% quantile, the median, and the 97.5% quantiles for logY. If Y ˜ logNormal, then log(Y) ˜ Normal, so mean_logY = median_logY = log(Y500). Let SD_logY be the standard deviation of logY. Then, log(Y025) and log(Y925) are each 1.96*SD_log2Y units from mean_logY, so SD_logY = [log(Y025) - log(Y025)]/(2*1.96), where 1.96 is the 97.5% quantile (Z975) of the standard Normal, Z ˜ N(0,1). Taking 1.96 to "equal" 2, we have, SD_logY = [log(Y025) - log(Y025)]/4, With YRelSpread95 = Y975/Y025, we get, SD_logY = log(YRelSpread95)/4. 5. It some cases it may be more convenient to use another relative spread, say YRelSpread90 or YRelSpread50. Using Z900 = 1.65 and Z750 = 0.67, we have SD_logY = log(YRelSpread90)/(2*1.65) and
SD_logY = log(YRelSpread50)/(2*0.67). Whereas [log(Y750) - log(Y250)] is the interquartile range for logY YRelSpread50 could be called the interquartile relative range for Y. 6. One can show that the coefficient of variation is CoefVar_Y = sqrt(exp(SD_logY**2 - 1)). See page 213 of Johnson, Kotz, Balakrishnan (1994), Continuous Univariate Distributions, Vol. I. 7. All logs are taken at base 2, but this choice is irrelevant for sample-sizeanalysis. *********************************************************************/ /********* Main code *********/ %let SD_log2Y_1 = %sysevalf(%sysfunc(log2(&YRelSpread95_1))/4); %let SD_log2Y_2 = %sysevalf(%sysfunc(log2(&YRelSpread95_2))/4); data exemplary; group = "&GrpLabel_1"; CellWgt = 1; mean_log2Y_A = log2(&Ymedian0); mean_log2Y_B = log2(&Ymedian0); output; group = "&GrpLabel_2"; CellWgt = 2; mean_log2Y_A = log2(&Ymedian1A); mean_log2Y_B = log2(&Ymedian1B); output; run; proc print data=exemplary; run; proc GLMpower data=exemplary; ODS output output=Ntotals; class group; model mean_log2Y_A mean_log2Y_B = group; weight CellWgt; power StdDev = &SD_log2Y_1 &SD_log2Y_2 NCovariates = &NCovariates CorrXY = &PrtlCorr_XXXlogY alpha = .01 .05 power = 0.95 0.99 Ntotal = .; run;
data Ntotals; set Ntotals; if dependent = "mean_log2Y_A" then comparison = "&Ymedian0 vs &Ymedian1A"; if dependent = "mean_log2Y_B" then comparison = "&Ymedian0 vs &Ymedian1B"; if UnadjStdDev = &SD_log2Y_1 then YRelSpread95 = &YRelSpread95_1; if UnadjStdDev = &SD_log2Y_2 then YRelSpread95 = &YRelSpread95_2; run; proc tabulate data=Ntotals format=5.0 order=data; format Alpha 4.3 YRelSpread95 3.1; class comparison alpha YRelSpread95 CorrXY NominalPower Ncovariates; var Ntotal; table Ncovariates="Number of covariates; ", comparison="&Ylabel: " * alpha="Alpha" * NominalPower="Power", YRelSpread95="95% Relative Spread" * CorrXY="Partial R for Covariates" * Ntotal=""*mean=" " /rtspace=35; run; proc GLMpower data=exemplary; ODS output output=powers; class group; model mean_log2Y_A mean_log2Y_B = group; weight CellWgt; power StdDev = &SD_log2Y_1 &SD_log2Y_2 Ncovariates = &NCovariates CorrXY = &PrtlCorr_XXXlogY alpha = .01 .05 Ntotal = 300 power = .; run; data powers; set powers; if dependent = "mean_log2Y_A" then comparison = "&Ymedian0 vs &Ymedian1A"; if dependent = "mean_log2Y_B" then comparison = "&Ymedian0 vs &Ymedian1B"; if UnadjStdDev = &SD_log2Y_1 then YRelSpread95 = &YRelSpread95_1; if UnadjStdDev = &SD_log2Y_2 then YRelSpread95 = &YRelSpread95_2; if power > .999
then power999 = .999; else power999 = power; run; proc tabulate data=powers format=4.3 order=data; format Alpha 4.3 YRelSpread95 3.1; class comparison alpha YRelSpread95 CorrXY Ntotal Ncovariates; var power999; table Ntotal="Total Sample Size: " * Ncovariates="Number of covariates; ", comparison="&Ylabel.: " * alpha="Alpha", YRelSpread95="95% Relative Spread" * CorrXY="Partial R for Covariates" * power999=""*mean=" " /rtspace=35; run; %macro CrucialRates (PriorPNullFalse = , Powers = powers, CrucialErrRates = CrucialErrRates ); /********************************************************************** Converts Alphas and Powers to Crucial Error Rates ------------------------------------------------- <> PriorPNullFalse= value1 value2 ... value10 This is gamma = PriorP[Ho false]. <> Powers= InputDSName "InputDSName" corresponds to ODS output statement in PROC POWER or PROC GLMPOWER, such as proc power; ODS output output=MoralityPowers; <> CrucialErrRates= OutputDSName "OutputDSName" is SAS dataset name; default: "CrucialErrRates" *********************************************************************/ data &CrucialErrRates; set &Powers; array PrNullFalseV{10} _temporary_ (&PriorPNullFalse); beta = 1 − power; iPNF = 1; do until (PrNullFalseV{iPNF} = .); gamma = PrNullFalseV{iPNF}; /* Compute Crucial Type I error rate */ TypeError = "Type I"; CrucialRate = alpha*(1 - gamma)/(alpha*(1 - gamma) + (1 - beta)*gamma); output; /* Compute Crucial Type II error rate */ TypeError = "Type II"; CrucialRate = beta*gamma/(beta*gamma + (1 - alpha)*(1 - gamma)); output;
iPNF + 1; end; run; %mend; *CrucialRates; %CrucialRates ( PriorPNullFalse = &PriorPNullFalse, Powers=powers, CrucialErrRates = CrucRates ); proc tabulate data=CrucRates format=4.3 order=data; *&UniversalText; title3 "Crucial Error Rates for QCA Malaria Trial"; format Alpha 4.3 YRelSpread95 3.1; class TypeError gamma comparison alpha YRelSpread95 CorrXY Ntotal Ncovariates; var CrucialRate; table Ntotal="Total Sample Size: " * Ncovariates="Number of covariates; ", comparison="&Ylabel.: " * YRelSpread95="95% Relative Spread" * CorrXY="Partial R for Covariates", alpha="Alpha" * gamma="PriorP[Null False]" * TypeError="Crucial Error Rate" * CrucialRate=""*mean=" " / rtspace = 32; run;