Propensity Scoring with Missing Values
Propensity scores have been used widely as a bias reduction method to estimate the treatment effect in nonrandomized studies. Because many covariates are generally included in the model for estimating propensity scores, the proportion of subjects with at least one missing covariate can be relatively large. In this chapter, we review existing methods for estimating propensity scores when missing values are present. The methods include a complete covariate (CC) method, an indicator variable (IND) method, various multiple imputation (MI) methods, a missingness pattern (MP) method, and a multiple imputation missingness pattern (MIMP) method. We provide SAS programs to implement all five methods for a data set from a clinical study in osteoporosis. We also provide a SAS macro for pooling small patterns of missing data to increase the stability and efficiency of MP and MIMP estimators. Because estimation may be sensitive to model misspecification for imputation and/or propensity score estimation as well as to the tuning parameters of associated algorithms, we also suggest various sensitivity analyses.
Observational studies are becoming increasingly important because they allow us to observe treatment outcomes for large numbers of subjects in real-world treatment practice. Well-designed observational studies could provide valuable information to enhance information from randomized controlled trials (Concato, Shah, and Horwitz, 2000). The propensity score concepts introduced by Rosenbaum and Rubin (1983) are tools for estimating causal effects of alternative treatments in the presence of imbalance in baseline covariate (X) distributions between treatment groups due to lack of randomization.
Let T denote a binary treatment group indicator. Throughout this chapter, we restrict attention to only two treatment groups, with T=0 indicating the reference treatment group and T=1 indicating the investigational or active treatment group. The propensity score, e(X) = Pr(T=1 | X), which is the probability of a subject being assigned to the active treatment given the covariates X, is essentially a mapping of multiple covariates onto a single, scalar-valued variable. Propensity scores are typically estimated using a multiple logistic regression model, as follows:

logit{e(X)} = log[ e(X) / (1 - e(X)) ] = β0 + β1X1 + β2X2 + … + βpXp.
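As a sketch of this mapping (in Python, with made-up coefficients, since fitted values depend on the data at hand; the chapter's own analyses are in SAS), a fitted logistic model collapses a covariate vector into a single number in (0, 1):

```python
import math

# e(X) = 1 / (1 + exp(-(b0 + b1*x1 + ... + bp*xp))): the fitted logistic
# model maps the whole covariate vector to one scalar propensity score.
# The coefficients below are made up purely for illustration.
def propensity(x, b0, b):
    eta = b0 + sum(bj * xj for bj, xj in zip(b, x))
    return 1.0 / (1.0 + math.exp(-eta))

p = propensity([1.0, 2.0], b0=-1.0, b=[0.5, 0.25])  # eta = 0, so p = 0.5
```

However many covariates enter the model, only this single scalar is carried forward into the treatment-effect estimation.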
It has been shown that using propensity scores results in substantial reduction of bias in estimating the treatment effect when treatment assignments are subject to selection bias (Rosenbaum and Rubin, 1983). Furthermore, the propensity score method provides advantages compared to simply incorporating all the covariates in the model for the treatment effect. For example, propensity score methods tend to be more robust than direct covariate adjustment with respect to model overparameterizations (including too many covariates) and situations with different covariance matrices within treated and untreated groups (D’Agostino, 1998, page 2286).
Propensity score methods encourage use of many covariates because only predicted probabilities of alternative treatment choices end up being used, while the relative magnitudes, numerical signs, and p-values of fitted coefficients tend to be ignored. The generally larger number of missing values recorded in observational studies, compared to well-controlled trials, implies that a large proportion of subjects have at least one missing covariate value. The simplest approach is to use only the observations without missing covariates. We call this method the complete covariate (CC) method. Clearly, simply ignoring patients with at least one missing covariate value is not a viable strategy. A simple and intuitive way of handling categorical missing values is to treat a missing value for each categorical variable as an additional category. For a continuous covariate with missing values, we can impute the missing values with the marginal mean and create a new dummy variable to indicate the missingness. We call this the indicator variable (IND) method. However, creating this new value ignores any observed correlations among the original covariate values and, thus, is not an efficient approach.
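The IND bookkeeping can be sketched as follows (a minimal Python illustration on made-up values, not part of the chapter's SAS programs): the continuous covariate gets mean imputation plus a missingness dummy, and the categorical covariate gets an extra category.

```python
# Indicator-variable (IND) handling of one continuous and one categorical
# covariate; None marks a missing value (the data are made up).
x_cont = [1.0, None, 3.0, 4.0]
x_cat = ["Y", "N", None, "Y"]

observed = [v for v in x_cont if v is not None]
mean = sum(observed) / len(observed)

# Continuous: impute the marginal mean, keep a dummy flagging missingness.
x_cont_imp = [v if v is not None else mean for v in x_cont]
m_cont = [0 if v is not None else 1 for v in x_cont]

# Categorical: missingness simply becomes an extra category of its own.
x_cat_imp = [v if v is not None else "MISSING" for v in x_cat]
```

Both the imputed covariates and the missingness dummies would then enter the propensity score model, as Program 5.2 does in SAS.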
A more sophisticated method is to fit separate regressions in estimation of the propensity score for each distinct missingness pattern (MP) (D’Agostino, 2001). Although this approach includes all nonmissing values for those subjects with the same MP, it increases the variability of estimated propensity scores because only a subset of subjects is included in the propensity score model. A much more complicated and computationally intensive approach is to jointly model the propensity score and the missingness, and then use the EM/ECM algorithm (Ibrahim et al., 1999) or Gibbs sampling (D’Agostino et al., 2000) to estimate parameters and propensity scores. Because there is currently no SAS procedure to perform such analyses, the computational complexity makes this alternative approach less attractive and practical.
As a different approach, propensity scores in the presence of missing covariates could also be estimated using multiple imputation (MI) concepts proposed by Rubin (1978, 1987). Recently, Crowe, Lipkovich, and Wang (2009) studied the performance of different multiple imputation strategies in propensity score-based estimation through a simulation study. The central idea of MI is to randomly fill in any missing values multiple times by sampling from the posterior predictive distribution of the missing values given the observed values, thereby creating a sequence of complete data sets. One advantage of this method is that each data set in the imputed sequence can be analyzed using standard complete-data methods. Another advantage is that MI procedures allow us to include ancillary variables that, although they do not directly affect propensity scores, may nonetheless contain useful information about missing values in important variables.
Recently, Qu and Lipkovich (2009) developed a new method called multiple imputation missingness pattern (MIMP), which utilizes not only a multiple imputation method but also the pattern of missing data in the estimation of propensity scores. In this approach, missing data are imputed using a multiple imputation procedure. Then, the propensity scores are estimated from a logistic regression model including the covariates (with missing values imputed) and a factor (a set of indicator variables) indicating the missingness pattern for each observation. A simulation study showed that MIMP performs as well as MI and better than MP when the missingness mechanism is either “missing completely at random” or “missing at random,” and it performs better than MI when data are missing not at random (Qu and Lipkovich, 2009).
We will use a data example from a clinical trial in osteoporosis to show how to estimate propensity scores in the presence of missing data for some covariates using the CC, IND, MI, MP, and MIMP methods. There are many ways to use these estimated propensity scores to estimate treatment effects, including regression, stratification, inverse probability weighting, matching, and some combinations of these (see Chapters 2–4 for detailed discussions of these methods). Note that each of the approaches for estimating propensity scores discussed in this chapter can be used in combination with any method of using the propensity scores to estimate treatment effects. In this chapter, for illustration, we use the inverse probability weighting (IPW) approach to estimate the treatment difference. It has been shown that standardizing the weights before performing the IPW estimation provides a more stable estimator for the treatment difference. Therefore, we use the standardized IPW estimation throughout this chapter. Specifically, the treatment effect is estimated as the difference in weighted mean outcomes between the two treatment groups, Δ̂ = μ̂1 - μ̂0, where

μ̂1 = ( Σ_i T_i Y_i / ê_i ) / ( Σ_i T_i / ê_i ) and μ̂0 = ( Σ_i (1 - T_i) Y_i / (1 - ê_i) ) / ( Σ_i (1 - T_i) / (1 - ê_i) ),

and the individual propensity scores, ê_i = ê(X_i), are estimated from the logistic regression model described previously (Lunceford and Davidian, 2004).
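The standardized IPW estimator can be sketched in a few lines of Python (made-up data; each group's weighted mean divides by the sum of that group's weights, so the weights need not be standardized in advance):

```python
# Standardized (normalized) IPW estimate of the treatment difference.
# t[i] is the treatment indicator, y[i] the outcome, and e[i] the
# estimated propensity score Pr(T=1 | X) for subject i (values made up).
def standardized_ipw_difference(t, y, e):
    w1 = [ti / ei for ti, ei in zip(t, e)]               # treated weights
    w0 = [(1 - ti) / (1 - ei) for ti, ei in zip(t, e)]   # untreated weights
    mu1 = sum(wi * yi for wi, yi in zip(w1, y)) / sum(w1)
    mu0 = sum(wi * yi for wi, yi in zip(w0, y)) / sum(w0)
    return mu1 - mu0

t = [1, 1, 0, 0, 0]                # treatment indicators
y = [2.0, 1.5, 1.0, 0.5, 1.2]      # outcomes
e = [0.8, 0.6, 0.4, 0.3, 0.5]      # estimated propensity scores
diff = standardized_ipw_difference(t, y, e)
```

Dividing by the summed weights within each group is exactly the standardization that makes this estimator more stable than the unnormalized version.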
In this section, we apply the five methods of handling missing data when estimating propensity scores to a set of data from an osteoporosis study: Multiple Outcomes of Raloxifene Evaluation (MORE) (Delmas et al., 2002). In this study, 7,705 women with osteoporosis were randomly assigned to one of the three treatment groups with an intended ratio of 1:1:1 for placebo, raloxifene 60 mg/day, or raloxifene 120 mg/day and were followed up for 4 years. After 3 years of follow-up, women were allowed to take other bone-active agents such as bisphosphonates. In this analysis, we compared the change in the femoral neck bone mineral density (BMD) during the fourth year (Y) between women not taking bisphosphonates (referred to in this analysis as the untreated group, T=0) and women taking bisphosphonates (the treated group, T=1) among the 1,643 women who were originally randomized to placebo. Our primary analytic method for evaluating treatment difference is an analysis of variance (ANOVA) model for the outcome (Y) against the assigned treatment group (T). However, because taking bisphosphonates was not a randomized factor, the response variable Y in treated and untreated groups may be confounded due to selection bias. Therefore, we use a weighted ANOVA model with weights taken as the standardization of the inverse of the estimated probability of treatment received. Because PROC MIXED in SAS automatically standardizes the weights before performing the analysis, there is no need to explicitly compute standardized weights in the SAS program. 
These propensities were estimated using a logistic regression model with 16 covariates (Table 5.1): age at baseline (i.e., prior to randomization), body mass index (BMI) at baseline, family history of breast cancer, 5-year breast cancer risk score (Costantino et al., 1998) at baseline, whether a woman had had a hysterectomy at baseline, lumbar spine BMD at baseline, femoral neck BMD at baseline, change in lumbar spine BMD during the first 3 years, change in femoral neck BMD during the first 3 years, previous hormone replacement therapy status, prevalent vertebral fracture at baseline, new nonvertebral fracture in the first 3 years, new vertebral fracture in the first 3 years, weighted adverse event score during the first 3 years calculated as (1 × #mild AE + 2 × #moderate AE + 3 × #severe AE + 4 × #serious AE), smoking status at baseline, and baseline semi-quantitative vertebral fracture status (0=no fracture, 1=mild fracture, 2=moderate fracture, and 3=severe fracture).
A total of 1,643 women (1,512 with T=0 and 131 with T=1) were included in this sub-analysis and 603 women (36.7%) had at least one missing covariate. Table 5.1 displays the variable names and the numbers of missing values for women originally treated with the placebo. There were 14 distinct patterns of the missing data. The largest pattern included 1,040 subjects with no missing covariates, and each of the three smallest patterns consisted of only one subject.
Table 5.1 Variable names and numbers of missing values for women originally randomized to placebo

Variable Name | Variable Description | # Missing Observations |
AGE | Age at MORE baseline | 0 |
BMIR | Body mass index (BMI) at MORE baseline | 1 |
GAILMORE | 5-year breast cancer risk score | 562 |
LSC | Change in lumbar spine BMD during the first 3 years | 6 |
LTOTBMDR | Lumbar spine BMD at baseline | 6 |
FNC | Change in femoral neck BMD during the first 3 years | 0 |
NECKBMDR | Femoral neck BMD at baseline | 0 |
AESCORE | Weighted adverse event score during the first 3 years | 0 |
SQ | Baseline semi-quantitative vertebral fracture status (0=no fracture, 1=mild fracture, 2=moderate fracture, and 3=severe fracture) | 9 |
FAMHXBCN | Family history of breast cancer (Y/N) | 41 |
KHYSYN | Whether hysterectomized at MORE baseline (Y/N) | 0 |
NVFX | New nonvertebral fracture in the first 3 years (Y/N) | 0 |
PREVHRT | Previous hormone replacement therapy status (Y/N) | 3 |
PREVVERT | Prevalent vertebral fracture at MORE baseline (Y/N) | 8 |
SMOKE | Smoking status at MORE baseline (Y/N) | 23 |
VFX | New vertebral fracture in the first 3 years (Y/N) | 8 |
FNBMD_C | Change in the femoral neck BMD during the fourth year (Outcome variable) | 0 |
BISMORE | Taking bisphosphonates versus not taking (treatment group), (Y/N) | 0 |
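The 14 distinct patterns reported above are obtained by mapping each subject to her vector of missingness indicators across the covariates. A small Python sketch (3 covariates and made-up rows; the chapter's own analysis is in SAS):

```python
from collections import Counter

# Each subject maps to a tuple of missingness indicators (1 = missing);
# distinct tuples are the distinct missingness patterns, and the counts
# give the pattern sizes. The rows below are made up.
rows = [
    [1.0, 2.0, 3.0],
    [1.0, None, 3.0],
    [None, None, 3.0],
    [1.0, 2.0, 0.5],
]
patterns = Counter(tuple(int(v is None) for v in row) for row in rows)
```

In the MORE data this mapping yields 14 tuples, with the all-zeros pattern (no missing covariates) covering 1,040 subjects.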
5.3 Using SAS for IPW Estimation with Missing Values
SAS procedures (MEANS, GLM, and MIXED) automatically standardize the weights before performing the analysis. Therefore, although there is no explicit code for standardizing the weights (defined as the inverse of the propensity scores in the SAS programs), the standardized IPW estimation is actually performed for all methods of handling missing data.
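The reason this works is that a weighted mean is unchanged when every weight is multiplied by a positive constant, and standardizing weights only rescales them. A minimal Python sketch (made-up numbers) of this invariance for the point estimate:

```python
# A weighted mean is invariant to rescaling all weights by a constant,
# which is why no explicit standardization step is needed before passing
# IPW weights to a procedure that rescales them internally.
def weighted_mean(y, w):
    return sum(wi * yi for wi, yi in zip(w, y)) / sum(w)

y = [1.0, 2.0, 3.0]
w = [0.5, 1.5, 2.0]                          # raw IPW-style weights
w_std = [wi * len(w) / sum(w) for wi in w]   # standardized to sum to n
raw = weighted_mean(y, w)
std = weighted_mean(y, w_std)
```

Both calls return the same value, so the raw inverse-propensity weights can be passed directly to the WEIGHT statement.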
The CC method simply omits observations with at least one missing covariate. Program 5.1 shows the SAS code to perform the CC analysis and Output from Program 5.1 shows the results.
PROC FORMAT;
VALUE FORMATYN 0 = 'NO'
1 = 'YES';
RUN;
*****************************************************;
* COMPLETE COVARIATE (CC) ANALYSIS;
*****************************************************;
PROC LOGISTIC DATA = ANALDATA NOPRINT;
MODEL BISMORE = &VARLIST;
OUTPUT OUT=PRED PREDICTED=P;
RUN;
DATA PRED;
SET PRED;
IF BISMORE = 0 THEN PROB = P;
IF BISMORE = 1 THEN PROB = 1-P;
W = 1/PROB;
RUN;
PROC SORT DATA=PRED;
BY BISMORE;
RUN;
TITLE 'ANALYSIS RESULTS USING THE COMPLETE COVARIATE (CC) METHOD';
PROC MIXED DATA = PRED;
CLASS BISMORE;
MODEL FNBMD_C = BISMORE;
WEIGHT W;
FORMAT BISMORE FORMATYN.;
LSMEANS BISMORE/DIFF=ALL;
RUN;
ANALYSIS RESULTS USING THE COMPLETE COVARIATE (CC) METHOD
The Mixed Procedure
Least Squares Means
Effect | Bisphosphonates use in the 4th yr of MORE | Estimate | Standard Error | DF | t Value | Pr > |t| |
BISMORE | No | -0.00225 | 0.001224 | 1038 | -1.84 | 0.0668 |
BISMORE | Yes | 0.006102 | 0.001180 | 1038 | 5.17 | <.0001 |
Differences of Least Squares Means
Effect | Bisphosphonates use in the 4th yr of MORE | Bisphosphonates use in the 4th yr of MORE | Estimate | Standard Error | DF | t Value | Pr > |t| |
BISMORE | No | Yes | -0.00835 | 0.001700 | 1038 | -4.91 | <.0001 |
The IND approach is a straightforward method for categorical data where the missing value is treated as a special category. For continuous variables, we generally impute the missing values with the marginal means and create a dummy variable to indicate missingness. Program 5.2 shows the SAS code to perform the IND analysis and Output from Program 5.2 shows the results.
**********************************************************************;
* INDICATOR VARIABLE (IND) ANALYSIS;
**********************************************************************;
PROC MEANS DATA = ANALDATA NOPRINT;
VAR AGE BMIR GAILMORE LSC LTOTBMDR FNC NECKBMDR AESCORE SQ;
OUTPUT OUT = ANALDATA_MEAN MEAN=AGE_M BMIR_M GAILMORE_M LSC_M
LTOTBMDR_M FNC_M NECKBMDR_M AESCORE_M SQ_M;
BY STUDY;
RUN;
DATA ANALDATA_2;
MERGE ANALDATA ANALDATA_MEAN;
BY STUDY;
RUN;
DATA ANALDATA_IV;
SET ANALDATA_2;
ARRAY X{9} AGE BMIR GAILMORE LSC LTOTBMDR FNC NECKBMDR AESCORE SQ;
ARRAY M{9} M1-M9;
ARRAY XM{9} AGE_M BMIR_M GAILMORE_M LSC_M LTOTBMDR_M FNC_M NECKBMDR_M
AESCORE_M SQ_M;
DO I = 1 TO 9;
IF X{I} = . THEN DO;
M{I} = 1;
X{I} = XM{I};
END;
ELSE M{I} = 0;
END;
ARRAY XC{7} FAMHXBCN KHYSYN NVFX PREVHRT PREVVERT SMOKE VFX;
DO I = 1 TO 7;
IF XC{I} = . THEN XC{I} = -1;
END;
RUN;
PROC LOGISTIC DATA = ANALDATA_IV NOPRINT;
CLASS FAMHXBCN KHYSYN NVFX PREVHRT PREVVERT SMOKE VFX;
MODEL BISMORE = &VARLIST M1-M9;
OUTPUT OUT=PRED PREDICTED=P;
RUN;
DATA PRED;
SET PRED;
IF BISMORE = 0 THEN PROB = P;
IF BISMORE = 1 THEN PROB = 1-P;
W = 1/PROB;
RUN;
PROC SORT DATA=PRED;
BY BISMORE;
RUN;
TITLE 'ANALYSIS RESULTS USING THE INDICATOR VARIABLE (IND) METHOD';
PROC MIXED DATA = PRED;
CLASS BISMORE;
MODEL FNBMD_C = BISMORE;
WEIGHT W;
LSMEANS BISMORE/DIFF=ALL;
FORMAT BISMORE FORMATYN.;
RUN;
ANALYSIS RESULTS USING THE INDICATOR VARIABLE (IND) METHOD
The Mixed Procedure
Least Squares Means
Effect | Bisphosphonates use in the 4th yr of MORE | Estimate | Standard Error | DF | t Value | Pr > |t| |
BISMORE | No | -0.00217 | 0.001031 | 1641 | -2.11 | 0.0354 |
BISMORE | Yes | 0.008131 | 0.000995 | 1641 | 8.17 | <.0001 |
Differences of Least Squares Means
Effect | Bisphosphonates use in the 4th yr of MORE | Bisphosphonates use in the 4th yr of MORE | Estimate | Standard Error | DF | t Value | Pr > |t| |
BISMORE | No | Yes | -0.01030 | 0.001433 | 1641 | -7.19 | <.0001 |
Applying MI for categorical predictors may be challenging because most commercially available statistical software for MI works with continuous data under an assumption of normality. Unfortunately, there is no current procedure in SAS to perform multiple imputation easily for categorical covariates. As an alternative, we could create dummy variables for categorical variables and perform MI treating the dummy variables as continuous variables, which is readily justified for binary variables. For example, let X=1 indicate a subject who smoked regularly at baseline, and X=0 indicate a subject who did not. The imputed value for X could then be, say, 0.4, indicating a subject who has a 40% chance of smoking. This might provide better information about a patient than rounding the probability down to 0 (no smoking). In this chapter, we impute missing binary predictors with probabilities without rounding down or up.
Another difficulty in applying MI inference in the context of propensity-based estimation of treatment effects is that using combining rules (Rubin, 1987) may result in variance estimators that are not valid because the uncertainty in estimated weights has not been accounted for. Another reason why Rubin’s variance estimator may not be appropriate is that one’s imputation model (for baseline covariates) and analysis model (for treatment effects) are unlikely to be compatible. See Meng (1994), Wang and Robins (1998), and Robins and Wang (2000) for more information. One general recipe for improving the variance estimator is bootstrapping the entire estimation procedure. This can be easily done using the available SAS macro suite for bootstrapping and implementing various bootstrap-based confidence intervals (http://cuke.hort.ncsu.edu/cucurbit/wehner/software/pathsas/jackboot.txt). In Section 5.3.6, we outline how to use this macro.
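One way to implement that recipe, sketched here in Python rather than with the JACKBOOT macro (made-up data; `ipw_estimate` is a hypothetical stand-in for the full impute-then-weight pipeline):

```python
import random

# Bootstrap the entire estimation procedure: resample subjects with
# replacement, re-run the full pipeline on each resample, and take a
# percentile interval from the resulting distribution of estimates.
# ipw_estimate is a stub; in practice it would impute missing values,
# fit the propensity model, and compute the standardized IPW estimate.
def ipw_estimate(data):
    treated = [y for t, y in data if t == 1]
    untreated = [y for t, y in data if t == 0]
    return sum(treated) / len(treated) - sum(untreated) / len(untreated)

def bootstrap_ci(data, estimator, n_boot=2000, alpha=0.05, seed=12345):
    rng = random.Random(seed)
    boots = []
    for _ in range(n_boot):
        sample = [rng.choice(data) for _ in data]
        if len({t for t, _ in sample}) < 2:  # need both groups present
            continue
        boots.append(estimator(sample))
    boots.sort()
    lo = boots[int(alpha / 2 * len(boots))]
    hi = boots[int((1 - alpha / 2) * len(boots)) - 1]
    return lo, hi

data = [(1, 2.1), (1, 1.8), (1, 2.4), (0, 1.0), (0, 1.3), (0, 0.8), (0, 1.1)]
lo, hi = bootstrap_ci(data, ipw_estimate)
```

Because the imputation and the propensity model are refit inside every resample, the resulting interval reflects the uncertainty in the estimated weights that Rubin's combining rules miss.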
Program 5.3 shows the SAS code to perform the MI analysis, and Output from Program 5.3 shows the results. The program calculates the point estimator when PROC MI is used to impute missing values; essentially, it is the average of the estimates across the multiply imputed data sets.
**********************************************************************;
* MULTIPLE IMPUTATION (MI) ANALYSIS;
**********************************************************************;
PROC MI DATA = ANALDATA ROUND=.001 NIMPUTE=5 SEED=6731205
OUT=IMPUTED_DATA NOPRINT;
VAR &VARLIST FNBMD_C BISMORE;
RUN;
PROC LOGISTIC DATA = IMPUTED_DATA NOPRINT;
MODEL BISMORE = &VARLIST;
OUTPUT OUT=PRED PREDICTED=P;
BY _IMPUTATION_;
RUN;
DATA PRED;
SET PRED;
IF BISMORE = 0 THEN PROB = P;
IF BISMORE = 1 THEN PROB = 1-P;
W = 1/PROB;
RUN;
PROC SORT DATA=PRED;
BY _IMPUTATION_ BISMORE;
RUN;
ODS LISTING CLOSE;
ODS OUTPUT LSMEANS = LSM DIFFS=DIFFS;
PROC MIXED DATA = PRED;
CLASS BISMORE;
MODEL FNBMD_C = BISMORE;
WEIGHT W;
BY _IMPUTATION_;
LSMEANS BISMORE/ DIFF=ALL;
RUN;
ODS LISTING;
TITLE 'ANALYSIS RESULTS USING THE MULTIPLE IMPUTATION (MI) METHOD';
TITLE2 'POINT ESTIMATES BY TREATMENT GROUP';
PROC MEANS DATA=LSM;
CLASS BISMORE;
VAR ESTIMATE;
FORMAT BISMORE FORMATYN.;
RUN;
TITLE2 'POINT ESTIMATE FOR THE TREATMENT DIFFERENCE';
PROC MEANS DATA = DIFFS;
VAR ESTIMATE;
RUN;
ANALYSIS RESULTS USING THE MULTIPLE IMPUTATION (MI) METHOD
POINT ESTIMATES BY TREATMENT GROUP
The MEANS Procedure
Analysis Variable : Estimate
Bisphosphonates use in the 4th yr of MORE | N Obs | N | Mean | Std Dev | Minimum | Maximum |
No | 5 | 5 | -0.0021784 | 0.000011247 | -0.0021881 | -0.0021606 |
Yes | 5 | 5 | 0.0083319 | 0.000191791 | 0.0081308 | 0.0086499 |
ANALYSIS RESULTS USING THE MULTIPLE IMPUTATION (MI) METHOD
POINT ESTIMATE FOR THE TREATMENT DIFFERENCE
The MEANS Procedure
Analysis Variable : Estimate
N | Mean | Std Dev | Minimum | Maximum |
---------------------------------------------------------------------------------- | ||||
5 | -0.0105103 | 0.000189049 | -0.0108240 | -0.0103159 |
---------------------------------------------------------------------------------- |
The following SAS code produces the MI estimators of treatment effect with naïve estimates of standard error using PROC MIANALYZE. Specifically, it proceeds in three steps: first one creates multiple data sets without missing baseline values by imputation using a multivariate normal model (PROC MI), then one computes IPW estimates of treatment effects for each completed data set (PROC MIXED), and finally one computes a single MI estimator of treatment effect and associated approximate 95% confidence interval (CI) using Rubin’s combining rules (PROC MIANALYZE).
Note that PROC MIANALYZE has different formats for input data sets PARMS and COVB for SAS versions 8 and 9. Our example assumes use of SAS 9. Remember also that the variance estimator from PROC MIANALYZE may not be valid for the reasons mentioned earlier. Program 5.4 shows the SAS code to summarize the estimates from the MI method using PROC MIANALYZE and Output from Program 5.4 shows the results.
TITLE2 'ESTIMATE THE TREATMENT DIFFERENCE USING PROC MIANALYZE';
DATA FOR_MI_EST (KEEP = _IMPUTATION_ EFFECT ESTIMATE RENAME=(EFFECT=
PARAMETER));
SET DIFFS;
EFFECT = 'DIFF';
RUN;
DATA FOR_MI_COV (KEEP = _IMPUTATION_ ROWNAME BISMORE DIFF );
SET DIFFS;
DIFF = STDERR**2;
ROWNAME = "DIFF";
RUN;
PROC MIANALYZE PARMS=FOR_MI_EST COVB=FOR_MI_COV;
MODELEFFECTS DIFF;
ODS OUTPUT PARAMETERESTIMATES=MI_EST
VARIANCEINFO=MI_VAR;
RUN;
ANALYSIS RESULTS USING THE MULTIPLE IMPUTATION (MI) METHOD
ESTIMATE THE TREATMENT DIFFERENCE USING PROC MIANALYZE
The MIANALYZE Procedure
Model Information
PARMS Data Set | WORK.FOR_MI_EST |
COVB Data Set | WORK.FOR_MI_COV |
Number of Imputations | 5 |
Multiple Imputation Variance Information
-----------------Variance----------------- | ||||
Parameter | Between | Within | Total | DF |
DIFF | 3.5739427E-8 | 0.000002072 | 0.000002115 | 9729.1 |
Multiple Imputation Variance Information
Parameter | Relative Increase in Variance | Fraction Missing Information | Relative Efficiency |
DIFF | 0.020696 | 0.020478 | 0.995921 |
Multiple Imputation Parameter Estimates
Parameter | Estimate | Std Error | 95% Confidence | Limits | DF |
DIFF | -0.010510 | 0.001454 | -0.01336 | -0.00766 | 9729.1 |
Multiple Imputation Parameter Estimates
Parameter | Minimum | Maximum | Theta0 | t for H0: Parameter=Theta0 | Pr > |t| |
DIFF | -0.010824 | -0.010316 | 0 | -7.23 | <.0001 |
The two data sets passed to PROC MIANALYZE contain estimated treatment effects (data set FOR_MI_EST) and their associated variance-covariance matrices (squared estimated standard errors in data set FOR_MI_COV) for each imputation. The output of PROC MIANALYZE contains information on partitioning of the total variance associated with estimated treatment effect into between- and within-imputation pieces, multiple imputation parameter estimates, approximate 95% confidence intervals based on t-distributions, and total variance. Note that for a finite number of imputations (M), Rubin’s variance estimator is inconsistent because it has a non-degenerate chi-squared limiting distribution. Therefore, a standard Wald-type inference based on the normal distribution is invalid and a t-distribution is used (Rubin, 1987). The impact of missing covariates on the final estimates of treatment effect can also be assessed by the fraction of missing information (about treatment effect), which is fairly low in this case (only 2%).
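Rubin's combining rules behind this output are simple to state: with M imputations, average within-imputation variance W, and between-imputation variance B, the total variance is T = W + (1 + 1/M)B, and the t reference distribution has df = (M - 1)(1 + 1/r)^2, where r = (1 + 1/M)B/W is the relative increase in variance. A Python sketch plugging in the variance components reported above:

```python
# Rubin's rules for combining M imputation-specific variance estimates.
def rubin_total_variance(within, between, m):
    return within + (1.0 + 1.0 / m) * between

def rubin_df(within, between, m):
    # Rubin (1987) degrees of freedom for the t reference distribution.
    r = (1.0 + 1.0 / m) * between / within  # relative increase in variance
    return (m - 1) * (1.0 + 1.0 / r) ** 2

# Variance components from the PROC MIANALYZE output above.
m = 5
between = 3.5739427e-8
within = 0.000002072
total = rubin_total_variance(within, between, m)
df = rubin_df(within, between, m)
```

The small between-to-within ratio here is exactly what produces the low fraction of missing information (about 2%) and the large degrees of freedom in the output.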
The MP method essentially estimates propensity scores separately for each missingness pattern by including only variables without missing values within a missingness pattern. As a result, the independent variables included in the propensity score estimation models differ across missingness patterns. One challenge in implementing MP analysis is that usually there are some missingness patterns with a small number of observations, which renders estimation using the described model unstable. To address this problem, we developed an algorithm for pooling small missingness patterns according to their similarities to reach a prespecified minimum number of observations in each pattern (Qu and Lipkovich, 2009). After combining similar patterns, we impute all missing values within each pooled pattern with the marginal means (to avoid “holes”) and estimate the propensity scores using a logistic regression model. In this example, we pooled missingness patterns with a minimum of 100 observations for each pooled cell. Program 5.5 provides a macro to create these pooled patterns.
***********************************************************************;
* Input parameters:
* MSDATA = input data set;
* OUTDATA = output data set;
* VARLIST = a list of variables to be included in the propensity score estimation;
* N_MP_MIN = minimum number of observations for each missingness pattern.
* Missingness patterns with fewer than N_MP_MIN observations will be pooled;
***************************************************************************;
%MACRO MP_ASSIGN(MSDATA = , OUTDATA =, VARLIST =, N_MP_MIN = 100);
/* Determine how many variables to include in the propensity score estimation */
%LET N = 1;
%LET VARINT = ;
%DO %UNTIL(%QSCAN(&VARLIST., &N. , %STR( )) EQ %STR( ));
%LET VAR = %QSCAN(&VARLIST. , &N. , %STR( ));
%LET VARINT = &VARINT &VAR.*MP;
%LET N = %EVAL(&N. + 1);
%END;
%LET KO = %EVAL(&N-1);
%LET M_MISSING = %EVAL(&N-1);
%PUT &VARINT;
%PUT &KO;
%PUT &M_MISSING;
/* Create indicators for missing values and missingness patterns */
DATA MS;
SET &MSDATA;
ARRAY MS{&M_MISSING} M1-M&M_MISSING.;
ARRAY X{&M_MISSING} &VARLIST;
MV = 0;
DO I = 1 TO &M_MISSING;
IF X{I} = . THEN MS{I} = 1;
ELSE MS{I} = 0;
MV = 2*MV + MS{I};
END;
MV = MV + 1;
DROP I;
RUN;
/* Only keep one record for each missingness pattern */
PROC SORT DATA = MS OUT = PATTERN NODUPKEY;
BY MV;
RUN;
/* Calculate the number of observations in each missingness pattern */
PROC FREQ DATA = MS NOPRINT;
TABLES MV / OUT = M_MP(KEEP = MV COUNT);
RUN;
DATA PATTERN;
MERGE PATTERN M_MP;
BY MV;
RUN;
PROC SORT DATA = PATTERN;
BY DESCENDING COUNT;
RUN;
/* Assign missingness pattern to new index from the largest to the smallest */
DATA PATTERN;
RETAIN M1-M&M_MISSING MV COUNT MV_S;
SET PATTERN;
KEEP M1-M&M_MISSING MV COUNT MV_S;
MV_S = _N_;
RUN;
PROC IML;
USE PATTERN;
READ ALL INTO A;
CLOSE PATTERN;
MS = A[, 1:&M_MISSING];
MV = A[, 1+&M_MISSING];
N_MP = A[, 2+&M_MISSING];
MV_S = A[, 3+&M_MISSING];
M_MP = NROW(MS);
M = NCOL(MS);
/* Calculate the distance between missingness patterns */
DISTANCE = J(M_MP, M_MP, 0);
DO I = 1 TO M_MP;
DO J = 1 TO I-1;
D = 0;
DO L = 1 TO M;
D = D + ( (MS[I,L]-MS[J,L])*(MS[I,L]-MS[J,L]) );
END;
DISTANCE[I,J] = D;
DISTANCE[J,I] = D;
END;
END;
I = 0;
K_MV_POOL = 0;
MV_POOL = J(M_MP, 1, 0);
/*Pooling small missingness patterns according to their similarities to
reach a prespecified minimum number of observations (&N_MP_MIN) in each
pattern */
DO WHILE( I < M_MP);
I = I + 1;
IF MV_POOL[I] = 0 THEN
DO;
K_MV_POOL = K_MV_POOL + 1;
N_MP_POOL = N_MP[I];
IF N_MP_POOL >= &N_MP_MIN THEN
DO;
MV_POOL[I] = K_MV_POOL;
END;
ELSE
DO;
IF I < M_MP THEN
DO;
A = DISTANCE[(I+1):M_MP, I];
B = MV[(I+1):M_MP];
C = N_MP[(I+1):M_MP];
D = MV_S[(I+1):M_MP];
E = MV_POOL[(I+1):M_MP];
TT = A || B || C || D || E;
CALL SORT( TT, {1 3});
J = 0;
DO WHILE( (N_MP_POOL < &N_MP_MIN) & (I+J < M_MP) );
J = J+1;
IF (TT[J,5] = 0) THEN
DO;
N_MP_POOL = N_MP_POOL + TT[J,3];
TT[J,5] = K_MV_POOL;
END;
END;
END;
IF ( N_MP_POOL >= &N_MP_MIN ) THEN
DO;
MV_POOL[I] = K_MV_POOL;
DO K = 1 TO J;
MV_POOL[TT[K,4]] = K_MV_POOL;
END;
END;
ELSE
DO J = I TO M_MP;
SGN_TMP = 0;
K = 1;
DO WHILE(SGN_TMP = 0 & K <= M_MP);
DO L = 1 TO M_MP;
IF (DISTANCE[J,L] = K) & (MV_POOL[J]=0) &
(MV_POOL[L]>0) THEN
DO;
MV_POOL[J] = MV_POOL[L];
SGN_TMP = 1;
END;
END;
K = K + 1;
END;
END;
END;
END;
END;
MV_FINAL = MV || MV_POOL;
VARNAMES={'MV' 'MV_POOL'};
CREATE MVPOOL FROM MV_FINAL[COLNAME=VARNAMES];
APPEND FROM MV_FINAL;
QUIT;
PROC SORT DATA = MVPOOL;
BY MV;
RUN;
PROC SORT DATA = MS;
BY MV;
RUN;
/* The variable MP in the &OUTDATA data set indicates the pooled missingness pattern */
DATA &OUTDATA(RENAME=(MV=MP_ORIG MV_POOL=MP));
MERGE MS MVPOOL;
BY MV;
RUN;
%MEND MP_ASSIGN;
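The pooling logic of %MP_ASSIGN can be summarized as: process patterns from largest to smallest, let each too-small pattern absorb its nearest unpooled neighbors (Hamming distance between missingness indicator vectors) until the cell reaches the minimum count, and fall back to merging with the nearest already-pooled cell. The following Python sketch is a simplified analogue of that idea, not a line-by-line translation of the IML code:

```python
# Pool small missingness patterns by similarity until each pooled cell
# holds at least min_n observations. `patterns` maps a tuple of
# missingness indicators (1 = missing) to its observation count.
def hamming(a, b):
    return sum(x != y for x, y in zip(a, b))

def pool_patterns(patterns, min_n):
    order = sorted(patterns, key=patterns.get, reverse=True)
    assignment, next_id = {}, 0
    for p in order:
        if p in assignment:
            continue
        cell, size = [p], patterns[p]
        unpooled = [q for q in order if q not in assignment and q != p]
        # Greedily absorb the closest unpooled patterns until big enough.
        for q in sorted(unpooled, key=lambda q: hamming(p, q)):
            if size >= min_n:
                break
            cell.append(q)
            size += patterns[q]
        if size < min_n and assignment:
            # Still too small: merge into the cell of the nearest
            # pattern that has already been pooled.
            nearest = min(assignment, key=lambda q: hamming(p, q))
            target = assignment[nearest]
        else:
            target, next_id = next_id, next_id + 1
        for q in cell:
            assignment[q] = target
    return assignment

patterns = {(0, 0, 0): 120, (0, 0, 1): 60, (0, 1, 0): 50, (1, 1, 0): 5}
assignment = pool_patterns(patterns, min_n=100)
```

With min_n=100, the large complete-data pattern keeps its own cell, while the three incomplete patterns pool into a second cell of 115 observations.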
Program 5.6 shows the SAS code to perform the MP analysis after the macro %MP_ASSIGN is applied. Output from Program 5.6 shows the results.
**********************************************************************;
* MISSINGNESS PATTERN (MP) METHOD;
**********************************************************************;
%MP_ASSIGN(MSDATA = ANALDATA, OUTDATA = ANALDATA2, VARLIST = &VARLIST, N_MP_MIN = 100);
PROC MEANS DATA = ANALDATA2 NOPRINT;
VAR &VARLIST;
OUTPUT OUT = MN MEAN = XM1-XM16;
BY STUDY;
RUN;
DATA TEMP;
MERGE ANALDATA2 MN;
BY STUDY;
RUN;
DATA TEMP;
SET TEMP;
ARRAY X{16} &VARLIST;
ARRAY XM{16} XM1-XM16;
DO I = 1 TO 16;
IF X{I} = . THEN X{I} = XM{I};
END;
DROP I;
RUN;
PROC SORT DATA = TEMP;
BY MP;
RUN;
PROC LOGISTIC DATA = TEMP NOPRINT;
CLASS MP;
MODEL BISMORE = &VARLIST;
OUTPUT OUT=PRED PREDICTED=P;
BY MP;
RUN;
DATA PRED;
SET PRED;
IF BISMORE = 0 THEN PROB = P;
IF BISMORE = 1 THEN PROB = 1-P;
W = 1/PROB;
RUN;
PROC SORT DATA=PRED;
BY BISMORE;
RUN;
TITLE 'ANALYSIS RESULTS USING THE MISSINGNESS PATTERN (MP) METHOD';
PROC MIXED DATA = PRED;
CLASS BISMORE;
MODEL FNBMD_C = BISMORE;
WEIGHT W;
LSMEANS BISMORE/DIFF=ALL;
FORMAT BISMORE FORMATYN.;
RUN;
ANALYSIS RESULTS USING THE MISSINGNESS PATTERN (MP) METHOD
The Mixed Procedure
Least Squares Means
Effect | Bisphosphonates use in the 4th yr of MORE | Estimate | Standard Error | DF | t Value | Pr > |t| |
BISMORE | No | -0.00222 | 0.001021 | 1641 | -2.18 | 0.0295 |
BISMORE | Yes | 0.007187 | 0.000987 | 1641 | 7.28 | <.0001 |
Differences of Least Squares Means
Effect | Bisphosphonates use in the 4th yr of MORE | Bisphosphonates use in the 4th yr of MORE | Estimate | Standard Error | DF | t Value | Pr > |t| |
BISMORE | No | Yes | -0.00941 | 0.001420 | 1641 | -6.63 | <.0001 |
The MIMP method essentially is a combination of MI and MP methods. First of all, missing values are multiply imputed. Then, for each imputed data set, propensity scores are estimated using the baseline covariates and the categorical variable for the missingness pattern. Similar to the MP method, we combine small missingness patterns to reach a minimum of 100 observations for each cell. Once we create data sets with indicators for pooled patterns (as shown in Section 5.3.4), we simply apply PROC MI to this data set (as shown in Section 5.3.3). Program 5.7 shows the SAS code to perform the MIMP analysis and Output from Program 5.7 shows the results.
**********************************************************************;
* Multiple Imputation Missingness Pattern (MIMP) Method;
**********************************************************************;
PROC MI DATA = ANALDATA2 ROUND=.001 NIMPUTE=5 SEED=6731205 OUT=IMPUTED_DATA NOPRINT;
VAR &VARLIST FNBMD_C BISMORE;
RUN;
PROC LOGISTIC DATA = IMPUTED_DATA NOPRINT;
CLASS MP;
MODEL BISMORE = &VARLIST MP;
OUTPUT OUT=PRED PREDICTED=P;
BY _IMPUTATION_;
RUN;
DATA PRED;
SET PRED;
IF BISMORE = 0 THEN PROB = P;
IF BISMORE = 1 THEN PROB = 1-P;
W = 1/PROB;
RUN;
PROC SORT DATA=PRED;
BY _IMPUTATION_ BISMORE;
RUN;
ODS OUTPUT LSMEANS = LSM DIFFS=DIFFS;
PROC MIXED DATA = PRED;
CLASS BISMORE;
MODEL FNBMD_C = BISMORE;
WEIGHT W;
BY _IMPUTATION_;
LSMEANS BISMORE/ DIFF=ALL;
RUN;
TITLE 'ANALYSIS RESULTS USING THE MULTIPLE IMPUTATION MISSINGNESS PATTERN (MIMP) METHOD';
TITLE2 'POINT ESTIMATES BY TREATMENT GROUP';
PROC MEANS DATA=LSM;
CLASS BISMORE;
VAR ESTIMATE;
FORMAT BISMORE FORMATYN.;
RUN;
TITLE2 'POINT ESTIMATE FOR THE TREATMENT DIFFERENCE';
PROC MEANS DATA = DIFFS;
VAR ESTIMATE;
RUN;
ANALYSIS RESULTS USING THE MULTIPLE IMPUTATION MISSINGNESS PATTERN (MIMP) METHOD
POINT ESTIMATES BY TREATMENT GROUP
The MEANS Procedure
Analysis Variable : Estimate
Bisphosphonates use in the 4th yr of MORE | N Obs | N | Mean | Std Dev | Minimum | Maximum
No | 5 | 5 | -0.0021775 | 0.000011208 | -0.002187 | -0.0021596
Yes | 5 | 5 | 0.0083296 | 0.000190528 | 0.0081254 | 0.0086443

ANALYSIS RESULTS USING THE MULTIPLE IMPUTATION MISSINGNESS PATTERN (MIMP) METHOD
POINT ESTIMATE FOR THE TREATMENT DIFFERENCE
The MEANS Procedure
Analysis Variable : Estimate
N | Mean | Std Dev | Minimum | Maximum
5 | -0.0105070 | 0.000187855 | -0.0108178 | -0.0103096
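The weighting carried out in Program 5.7 with PROC LOGISTIC and the WEIGHT statement in PROC MIXED can be sketched directly: each subject receives weight 1/Pr(treatment actually received), and the treatment difference is the difference of the weighted group means. The Python snippet below is an illustration only; the function name and the toy data are hypothetical.

```python
import numpy as np

def ipw_group_means(y, t, ps):
    """Inverse-probability-weighted group means and their difference.

    y  : outcomes; t : 0/1 treatment indicator
    ps : estimated propensity score Pr(T=1 | X) for each subject
    """
    # weight = 1 / Pr(treatment actually received)
    w = np.where(t == 1, 1.0 / ps, 1.0 / (1.0 - ps))
    mean1 = np.sum(w * y * (t == 1)) / np.sum(w * (t == 1))
    mean0 = np.sum(w * y * (t == 0)) / np.sum(w * (t == 0))
    return mean0, mean1, mean1 - mean0

# toy example with made-up numbers
y  = np.array([1.0, 2.0, 3.0, 4.0])
t  = np.array([0, 0, 1, 1])
ps = np.array([0.2, 0.4, 0.6, 0.8])
m0, m1, diff = ipw_group_means(y, t, ps)
```

Normalizing each weighted sum by the sum of the weights within the group, as above, corresponds to the standardized IPW ANOVA estimator used throughout this chapter.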
One can also use PROC MIANALYZE to obtain the point estimate for the treatment difference. However, the variance estimate from PROC MIANALYZE is not valid, because it does not account for the variability introduced by estimating the weights. This approach is shown in Program 5.8, and Output from Program 5.8 shows the results.
TITLE2 'ESTIMATE THE TREATMENT DIFFERENCE USING PROC MIANALYZE';
DATA FOR_MIMP_EST (KEEP = _IMPUTATION_ EFFECT ESTIMATE RENAME=(EFFECT= PARAMETER));
SET DIFFS;
EFFECT = 'DIFF';
RUN;
DATA FOR_MIMP_COV (KEEP = _IMPUTATION_ ROWNAME BISMORE DIFF );
SET DIFFS;
DIFF = STDERR**2;
ROWNAME = "DIFF";
RUN;
PROC MIANALYZE PARMS=FOR_MIMP_EST COVB=FOR_MIMP_COV;
MODELEFFECTS DIFF;
ODS OUTPUT PARAMETERESTIMATES=MI_EST
VARIANCEINFO=MI_VAR;
RUN;
ANALYSIS RESULTS USING THE MULTIPLE IMPUTATION MISSINGNESS PATTERN (MIMP) METHOD
ESTIMATE THE TREATMENT DIFFERENCE USING PROC MIANALYZE
The MIANALYZE Procedure
Model Information
PARMS Data Set | WORK.FOR_MIMP_EST
COVB Data Set | WORK.FOR_MIMP_COV
Number of Imputations | 5

Multiple Imputation Variance Information
Parameter | Between | Within | Total | DF
DIFF | 3.5289507E-8 | 0.000002072 | 0.000002115 | 9972.9

Parameter | Relative Increase in Variance | Fraction Missing Information | Relative Efficiency
DIFF | 0.020436 | 0.020224 | 0.995972

Multiple Imputation Parameter Estimates
Parameter | Estimate | Std Error | 95% Confidence Limits | DF
DIFF | -0.010507 | 0.001454 | (-0.01336, -0.00766) | 9972.9

Parameter | Minimum | Maximum | Theta0 | t for H0: Parameter=Theta0 | Pr > |t|
DIFF | -0.010818 | -0.010310 | 0 | -7.23 | <.0001
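The pooled quantities in this output follow Rubin's combining rules and can be verified by hand. The short Python sketch below is for illustration; it starts from the rounded values printed in the output, so the last digits may differ slightly from those shown above.

```python
# Rubin's combining rules, applied to the values reported in the output above
m  = 5                     # number of imputations
B  = 3.5289507e-8          # between-imputation variance
W  = 0.000002072           # within-imputation variance (rounded, as printed)
df = 9972.9                # degrees of freedom from the output

T   = W + (1 + 1 / m) * B            # total variance
r   = (1 + 1 / m) * B / W            # relative increase in variance
fmi = (r + 2 / (df + 3)) / (r + 1)   # fraction of missing information
re  = 1 / (1 + fmi / m)              # relative efficiency
```

Evaluating these expressions reproduces (to rounding) the Total variance, Relative Increase in Variance, Fraction Missing Information, and Relative Efficiency columns printed by PROC MIANALYZE.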
Direct estimation of standard errors for the point estimators in all of these propensity score-based IPW methods is challenging because the additional variability in the estimated weights is difficult to account for. Therefore, we use bootstrap methods to estimate the standard error and confidence interval for the point estimates. The SAS macros %BOOT and %BOOTCI (http://cuke.hort.ncsu.edu/cucurbit/wehner/software/pathsas/jackboot.txt) provide nonparametric estimates of standard errors and various bootstrap confidence intervals (including the popular bias-corrected accelerated [BCa] method; Efron, 1987) for the IPW treatment difference. First, we need to create a user-defined macro, which must be named %ANALYZE, that computes the point estimate of the IPW treatment difference; it is called repeatedly by the %BOOT and %BOOTCI macros. Next, the %BOOT and %BOOTCI macros are called to compute bootstrap estimates. Note that the %ANALYZE macro must have two parameters: data to identify the input data set and out to name the output data set. Program 5.9 is an illustration of the %ANALYZE macro and calls to the %BOOT and %BOOTCI macros.
/*****************************************************************************
MIMP_ANALYSIS is a macro that calculates the CC, MI, MP, and MIMP estimates;
Q_METH indicates the method
*****************************************************************************/
%INCLUDE 'BOOTS.SAS'; /* this file can be found at http://cuke.hort.ncsu.edu/cucurbit/wehner/software/pathsas/jackboot.txt */
%MACRO ANALYZE(DATA=BMDPS, OUT= );
%MIMP_ANALYSIS(INDATA = &DATA, VARLIST = &VARLIST, Y = FNBMD_C, G = BISMORE, M_MP_MIN = 100);
PROC SORT DATA = EST OUT = &OUT;
BY Q_METH;
RUN;
%MEND ANALYZE;
PROC PRINTTO LOG=NOLOG;
RUN;
TITLE 'BOOTSTRAP: NORMAL ("STANDARD") CONFIDENCE INTERVAL WITH BIAS CORRECTION';
TITLE2;
%BOOT(DATA=BMDPS, ALPHA=.05, SAMPLES=1000, RANDOM=123, ID=Q_METH);
TITLE 'BOOTSTRAP BCA';
%BOOTCI(BCA, ID=Q_METH);
PROC PRINTTO;
RUN;
Carpenter and Bithell (2000) used simulation to show that the BCa bootstrap method generally produces reliable results, even if the distribution of the test statistic is far from symmetric. In this example, we use BCa bootstrapping to estimate confidence intervals for the data set in Section 5.2 (see Table 5.2). The naive estimator (a direct unweighted estimator of treatment effect using a t-test) appears to overestimate the treatment difference. The CC method had the widest confidence interval, and the MP method had wider confidence intervals than the MI and MIMP methods, which produced similar results. Note that, for this example, the 95% confidence intervals produced by the CC and MP methods contained 0, whereas those for the MI and MIMP methods did not.
Method | Untreated (n=1512) | Treated (n=131) | Treated vs. Untreated |
NAIVE | -0.002(-0.004,-0.001) | 0.011( 0.006, 0.017) | 0.013( 0.007, 0.019) |
CC | -0.002(-0.004,-0.001) | 0.006(-0.002, 0.020) | 0.008( 0.000, 0.022) |
MP | -0.002(-0.004,-0.001) | 0.007(-0.003, 0.016) | 0.009(-0.001, 0.018) |
MI | -0.002(-0.003,-0.001) | 0.009( 0.002, 0.017) | 0.011( 0.004, 0.019) |
MIMP | -0.002(-0.003,-0.001) | 0.009( 0.002, 0.017) | 0.011( 0.004, 0.019) |
* Results are from Qu and Lipkovich (2009). The estimates for MI and MIMP were slightly different from the SAS output presented previously due to the choice of different seeds in PROC MI.
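The resampling logic of %BOOT can be sketched generically in Python (illustration only, with a simple percentile interval rather than the BCa interval used above): draw subjects with replacement, recompute the estimate of interest on each resample, and summarize the bootstrap distribution. For the IPW estimators, the estimator function would re-fit the propensity model on each resample so that the variability of the estimated weights is reflected in the standard error. The function and toy data below are hypothetical.

```python
import numpy as np

def percentile_bootstrap(data, estimator, n_boot=1000, alpha=0.05, seed=123):
    """Nonparametric bootstrap of an arbitrary estimator.

    data      : 2-D array, one row per subject
    estimator : function mapping a data array to a scalar estimate
    Returns (point estimate, bootstrap SE, (lower, upper) percentile CI).
    """
    rng = np.random.default_rng(seed)
    n = data.shape[0]
    stats = np.empty(n_boot)
    for b in range(n_boot):
        idx = rng.integers(0, n, size=n)   # resample subjects with replacement
        stats[b] = estimator(data[idx])
    lo, hi = np.quantile(stats, [alpha / 2, 1 - alpha / 2])
    return estimator(data), stats.std(ddof=1), (lo, hi)

# toy use: bootstrap the mean of a single column
data = np.arange(20.0).reshape(20, 1)
est, se, ci = percentile_bootstrap(data, lambda d: d[:, 0].mean(), n_boot=200)
```

The BCa interval computed by %BOOTCI additionally adjusts these percentiles for bias and acceleration (Efron, 1987); the resampling scheme is the same.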
Estimation of treatment effects using propensity scores may be sensitive to the misspecification of imputation and propensity score models, as well as to the tuning parameters of the associated algorithms. To address this issue, we discuss several strategies for performing sensitivity analysis in the context of propensity-based analyses with missing data.
Examples of alternative analytic methods were illustrated in Section 5.3. Within each method, one can also vary parameters. For example, one can vary the minimum pattern size to test how sensitive the results are to the selection of the minimum size of the pooled pattern in MP and MIMP. We also recommend, as illustrated here, computing propensity-stratified estimates of treatment effects in addition to IPW estimates.
The SAS code here constructs five strata (subgroups, bins) based on the fitted values of the propensity for treatment and computes the corresponding stratified estimators of treatment effects. This can be done in various ways. For example, one can compute treatment effects within each stratum and combine them by weighting inversely to the square of the estimated standard errors. In general, this approach is not recommended because it always down-weights results from strata with highly unequal observed treatment fractions and up-weights results from strata with nearly equal observed treatment fractions. Here we follow a stratified regression (SR) approach described in D’Agostino (1998) and Lunceford and Davidian (2004) that combines stratification on propensity scores with regression and computes the overall estimator of the treatment effect as an average of regression-adjusted estimators within each stratum. Specifically, for stratum s, a regression-based estimator of treatment effect is computed as follows:
$$\hat{\Delta}_s = \frac{1}{n_s} \sum_{j \in Q_s} \left\{ \hat{m}(1, X_j; \hat{\beta}_s) - \hat{m}(0, X_j; \hat{\beta}_s) \right\},$$
where $Q_s$ is the set of indices of subjects with estimated propensity scores falling within stratum $s$; $n_s$ is the size of stratum $s$; $\hat{m}(T, X_j; \hat{\beta}_s)$ denotes the predicted outcome for subject $j$ using a regression of the form $m(T, X; \beta_s)$ fitted to the data within stratum $s$, with predictors $T$ and $X$, and $\hat{\beta}_s$ representing the vector of estimated parameters. When a linear regression model is used, as in the following example, the estimated treatment effect within stratum $s$ is simply the estimated regression coefficient for treatment, $\hat{\beta}_{T,s}$.
In this example, shown in Program 5.10, the estimated propensity scores are divided into $K=5$ strata, and the overall SR estimator of treatment effect is computed as an average of the stratum-specific estimators based on linear regression within each stratum, $\hat{\Delta}_{SR} = \frac{1}{K} \sum_{s=1}^{K} \hat{\Delta}_s$. The estimate of the standard error can be obtained using bootstrapping, as explained in Section 5.3.6.
This alternative stratified estimator may be more robust compared to the IPW estimator when the estimated propensity for the treatment actually received is close to 0 for a few patients, resulting in these few subjects having undue impact on the overall estimated treatment difference. Program 5.10 shows the SAS code to perform the SR method and Output from Program 5.10 shows the results. The SR estimate for the treatment effect is 0.0099.
/* THIS STEP CAN BE REPLACED WITH ANY METHOD FOR OBTAINING PROPENSITY SCORES */
PROC LOGISTIC DATA = ANALDATA DESC NOPRINT;
MODEL BISMORE = &VARLIST;
OUTPUT OUT=PRED PREDICTED=P;
RUN;
/* DEFINE THE QUINTILES WITH PROC RANK (GROUP=5) */
PROC RANK DATA = PRED OUT=PRED GROUPS=5;
VAR P;
RANKS PS_STRATA;
RUN;
PROC SORT DATA=PRED; BY PS_STRATA; RUN;
ODS LISTING CLOSE;
ODS OUTPUT SOLUTIONF=SF;
PROC MIXED DATA = PRED (WHERE=(PS_STRATA NE .));
BY PS_STRATA;
MODEL FNBMD_C = BISMORE &VARLIST/SOLUTION;
RUN;
ODS LISTING;
TITLE 'ESTIMATING TREATMENT EFFECT BY ADJUSTING FOR BASELINE COVARIATES WITHIN EACH STRATA';
PROC MEANS DATA = SF N MEAN;
CLASS PS_STRATA;
TYPES PS_STRATA ();
VAR ESTIMATE;
WHERE EFFECT = 'BISMORE';
RUN;
ESTIMATING TREATMENT EFFECT BY ADJUSTING FOR BASELINE COVARIATES WITHIN EACH STRATA
The MEANS Procedure
Analysis Variable : Estimate
N Obs | N | Mean
5 | 5 | 0.0099243

Analysis Variable : Estimate
Rank for Variable P | N Obs | N | Mean
0 | 1 | 1 | 0.0144995
1 | 1 | 1 | 0.0102909
2 | 1 | 1 | 0.0023628
3 | 1 | 1 | 0.0143711
4 | 1 | 1 | 0.0080970
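The same computation can be mirrored outside SAS. The Python sketch below (an illustration only, with synthetic inputs) forms K strata from quantiles of the estimated propensity score, fits ordinary least squares with an intercept, treatment, and covariates within each stratum, and averages the K treatment coefficients, as the SR estimator prescribes.

```python
import numpy as np

def sr_estimate(y, t, X, ps, K=5):
    """Stratified-regression (SR) estimate of the treatment effect.

    Splits subjects into K strata by quantiles of the propensity score ps,
    fits OLS of y on (intercept, t, X) within each stratum, and averages
    the K treatment coefficients.
    """
    edges = np.quantile(ps, np.linspace(0, 1, K + 1))
    strata = np.clip(np.searchsorted(edges, ps, side="right") - 1, 0, K - 1)
    coefs = []
    for s in range(K):
        m = strata == s
        Z = np.column_stack([np.ones(m.sum()), t[m], X[m]])
        beta, *_ = np.linalg.lstsq(Z, y[m], rcond=None)
        coefs.append(beta[1])          # coefficient on treatment
    return float(np.mean(coefs))
```

With a noise-free linear outcome, each within-stratum regression recovers the treatment coefficient exactly, so the averaged estimate equals the true effect; with real data, the bootstrap of Section 5.3.6 would supply the standard error.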
For methods involving multiple imputation, try several imputation models using different sets of predictors for imputation. It has been suggested (Meng, 1994; Schafer, 1997) that it is better to include as many variables as possible in one’s imputation model, because omitting an important covariate results in an incorrect analysis while keeping an irrelevant predictor may only lower efficiency. Also, a recent simulation study comparing performance of different imputation strategies for propensity-based estimation (Crowe, Lipkovich, and Wang, 2009) showed that including treatment (T) and treatment outcome (Y) in imputation models improves estimation of the treatment effect. While it may appear counterintuitive to incorporate future outcomes in the imputation model for baseline covariates, it makes the imputation model more compatible with the analysis model (ANOVA for Y and T, in our case) and, therefore, helps reduce bias in estimating the treatment effect (Meng, 1994). Note that, in our examples of multiple imputation in Sections 5.3.3 and 5.3.5, we include both treatment and outcome variables in addition to baseline covariates in the VAR statement of PROC MI.
It is useful to evaluate the impact of missingness by examining the fraction of missing information (FMI), which is part of the output from PROC MIANALYZE. While the FMI may be substantial when estimating some of the coefficients of the propensity score model, the overall impact of missing covariates may be fairly low when estimating treatment effects (which, after all, is the ultimate goal of one’s analysis). This is the case because the variables with the largest FMI may contribute little to the probability of treatment assignment. In our example, the FMI for estimating the treatment difference was only about 2% (see the output in Section 5.3.3); for some variables included in the logistic model, it was fairly large, such as for FAMHXBCN (30%) and GAILMORE (31%). However, these variables were not significant predictors of treatment assignment, given other covariates, and, therefore, their missing values did not contribute much to the uncertainty associated with estimating treatment effects. On the other hand, the FMIs for the most significant predictors of treatment assignment, LTOTBMDR, FNC, and NECKBMDR, were estimated as <0.1%, 0.1%, and 0.1%, respectively. Program 5.11 shows the SAS code of PROC MIANALYZE for the logistic regression coefficients. Output from Program 5.11 shows the results.
/**** ASSESSING THE FRACTION OF MISSING INFORMATION FOR COEFFICIENTS IN THE PS MODEL *****/
PROC MI DATA = ANALDATA ROUND=.001 NIMPUTE=5 SEED=6731205 OUT=IMPUTED_DATA NOPRINT;
VAR &VARLIST FNBMD_C BISMORE;
RUN;
ODS LISTING;
ODS OUTPUT PARAMETERESTIMATES = _EST
COVB=_COV;
PROC LOGISTIC DATA = IMPUTED_DATA;
MODEL BISMORE = &VARLIST/COVB;
OUTPUT OUT=PRED PREDICTED=P;
BY _IMPUTATION_;
RUN;
TITLE 'ANALYSIS RESULTS USING THE MULTIPLE IMPUTATION (MI) METHOD';
DATA _EST;
SET _EST;
RENAME VARIABLE=PARAMETER;
KEEP _IMPUTATION_ VARIABLE ESTIMATE;
RUN;
PROC MIANALYZE PARMS= _EST COVB=_COV;
MODELEFFECTS INTERCEPT &VARLIST;
ODS OUTPUT PARAMETERESTIMATES=MI_EST
VARIANCEINFO=MI_VAR;
RUN;
Multiple Imputation Variance Information
Parameter | Relative Increase in Variance | Fraction Missing Information | Relative Efficiency
intercept | 0.003791 | 0.003784 | 0.999244
AGE | 0.041368 | 0.040481 | 0.991969
BMIR | 0.003756 | 0.003749 | 0.999251
FAMHXBCN | 0.380552 | 0.301688 | 0.943096
GAILMORE | 0.395063 | 0.310298 | 0.941567
KHYSYN | 0.000864 | 0.000863 | 0.999827
LSC | 0.008437 | 0.008401 | 0.998323
LTOTBMDR | 0.000264 | 0.000264 | 0.999947
FNC | 0.001002 | 0.001001 | 0.999800
NECKBMDR | 0.001434 | 0.001433 | 0.999713
NVFX | 0.001993 | 0.001991 | 0.999602
PREVHRT | 0.003607 | 0.003601 | 0.999280
PREVVERT | 0.008648 | 0.008610 | 0.998281
AESCORE | 0.007936 | 0.007904 | 0.998422
SMOKE | 0.073996 | 0.071100 | 0.985979
SQ | 0.010930 | 0.010870 | 0.997831
VFX | 0.002259 | 0.002256 | 0.999549
Multiple Imputation Parameter Estimates
Parameter | Estimate | Std Error | 95% Confidence Limits | DF
intercept | -3.910976 | 1.415616 | (-6.68554, -1.13641) | 280408
AGE | 0.032916 | 0.016281 | (0.00099, 0.06484) | 2534.8
BMIR | 0.011069 | 0.027360 | (-0.04256, 0.06469) | 285733
FAMHXBCN | -0.131857 | 0.410049 | (-0.95444, 0.69073) | 52.643
GAILMORE | -0.130776 | 0.168963 | (-0.47017, 0.20862) | 49.879
KHYSYN | -0.042621 | 0.230992 | (-0.49536, 0.41011) | 5.37E6
LSC | 5.866674 | 1.922807 | (2.09796, 9.63539) | 57142
LTOTBMDR | 2.713227 | 0.893752 | (0.96150, 4.46495) | 5.73E7
FNC | 9.155874 | 2.772187 | (3.72249, 14.58926) | 3.99E6
NECKBMDR | 4.221311 | 1.430553 | (1.41748, 7.02515) | 1.95E6
NVFX | -0.255861 | 0.294621 | (-0.83331, 0.32159) | 1.01E6
PREVHRT | -0.245808 | 0.210377 | (-0.65814, 0.16652) | 309625
PREVVERT | -0.338755 | 0.372863 | (-1.06957, 0.39206) | 54417
AESCORE | -0.015502 | 0.007301 | (-0.02981, -0.00119) | 64523
SMOKE | 0.562806 | 0.311964 | (-0.04951, 1.17512) | 842.65
SQ | -0.134405 | 0.184790 | (-0.49660, 0.22779) | 34218
VFX | -0.600087 | 0.282802 | (-1.15437, -0.04581) | 787654
One major drawback of the IPW estimator is that it can be unstable when some weights are extremely large or small, that is, when the estimated propensity scores are close to 0 or 1. One immediate remedy is to bound the weights. For example, propensity scores < δ are set to δ, and propensity scores > 1 − δ are set to 1 − δ, where δ is a small, empirically chosen number. In our experience, δ = 0.001 has yielded good results in simulations. For the data example presented in Section 5.3, there were no very small or large weights, so we did not use this technique. Program 5.12 shows how to set boundaries for the weights to produce stable estimators.
%LET DELTA = 0.001;
DATA PRED;
SET PRED;
IF . < P < &DELTA. THEN P = &DELTA.;
IF P > 1-&DELTA. THEN P = 1-&DELTA.;
RUN;
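Program 5.12 is equivalent to clipping the estimated propensity scores into [δ, 1 − δ] before the weights are formed. A minimal Python sketch of the same idea (illustration only, with hypothetical scores):

```python
import numpy as np

DELTA = 0.001

def truncate_ps(ps, delta=DELTA):
    """Clip estimated propensity scores to [delta, 1 - delta] so that
    no IPW weight can exceed 1/delta."""
    return np.clip(ps, delta, 1.0 - delta)

ps = np.array([0.0002, 0.4, 0.9995])   # hypothetical extreme scores
ps_trunc = truncate_ps(ps)             # scores outside [0.001, 0.999] move to the boundary
```

After truncation, the largest possible weight is 1/δ = 1000, which bounds the influence any single subject can exert on the IPW estimate.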
We have presented several methods of estimating propensity scores in the presence of missing values for some covariates. The CC method simply includes only subjects without any missing covariates. The IND method creates dummy variables to indicate the missingness. The MP method estimates the propensity scores within each missingness pattern. For some missingness patterns with a small number of observations, we proposed a method to pool small missingness patterns according to their similarities. The MI method estimates propensity scores with missing values imputed by multiple imputations. The MIMP method essentially combines the MI and MP methods by first imputing missing values with multiple imputations and then including information on the missingness pattern as a covariate in estimating propensity scores.
These approaches to estimating propensity scores can be combined with any propensity score-based method to estimate the treatment difference in the presence of treatment selection bias. In this chapter, all the analytic methods presented were based on the assumption of two treatment groups and used standardized inverse-probability weighted ANOVA with estimated propensities from a logistic regression. When more than two treatment arms are present, propensities are vectors of probabilities that sum to 1. The IPW approach is thus easily extended, with probabilities estimated using a multinomial logistic model (e.g., in SAS using PROC LOGISTIC with GLOGIT in its MODEL statement). Sometimes multiple treatment groups can be naturally ordered (e.g., when representing increasing levels of exposure/dose), and propensity scores can be estimated using ordinal logistic regression as shown in Leon and Hedeker (2005).
If the proportion of subjects with at least one missing value is low (e.g., < 10%), all approaches to handling missing values should intuitively produce similar results. Therefore, one can choose the most convenient method, such as the complete covariate analysis.
Carpenter, J. and J. Bithell. 2000. “Bootstrap confidence intervals: when, which, what? A practical guide for medical statisticians.” Statistics in Medicine 19: 1141–1164.
Concato, J., N. Shah, and R. I. Horwitz. 2000. “Randomized, controlled trials, observational studies, and the hierarchy of research designs.” The New England Journal of Medicine 342: 1887–1892.
Costantino, J., M. Gail, S. Pee, et al. 1999. “Validation studies for models projecting the risk of invasive and total breast cancer incidence.” Journal of the National Cancer Institute 91: 1541–1548.
Crowe, B., I. Lipkovich, and O. Wang. 2009. “Comparison of several imputation methods for missing baseline data in propensity scores analysis of binary outcome.” Pharmaceutical Statistics. In Press. DOI: 10.1002/pst.389.
D’Agostino, R. B. 1998. “Tutorial in biostatistics: propensity score methods for bias reduction in the comparison of a treatment to a non-randomized control group.” Statistics in Medicine 17(19): 2265–2281.
D’Agostino, Jr., R. B., and D. B. Rubin. 2000. “Estimating and using propensity scores with partially missing data.” Journal of the American Statistical Association 95: 749–759.
D’Agostino, R., W. Lang, M. Walkup, and T. Morgan. 2001. “Examining the impact of missing data on propensity score estimation in determining the effectiveness of self-monitoring of blood glucose (SMBG).” Health Services & Outcomes Research Methodology 2: 291–315.
Delmas, P. D., K. E. Ensrud, J. D. Adachi, et al. 2002. “Efficacy of raloxifene on vertebral fracture risk reduction in postmenopausal women with osteoporosis: four-year results from a randomized clinical trial.” The Journal of Clinical Endocrinology & Metabolism 87(8): 3609–3617.
Efron, B. 1987. “Better bootstrap confidence intervals.” Journal of the American Statistical Association 82: 171–185.
Horvitz, D. G., and D. J. Thompson. 1952. “A generalization of sampling without replacement from a finite universe.” Journal of the American Statistical Association 47(260): 663–685.
Ibrahim, J., S. Lipitz, and M. Chen. 1999. “Missing covariates in generalized linear models when the missing data mechanism is non-ignorable.” Journal of the Royal Statistical Society, Series B (Statistical Methodology) 61: 173–190.
Leon, A. C., and D. Hedeker. 2005. “A mixed-effects quintile-stratified propensity adjustment for effectiveness analyses of ordered categorical doses.” Statistics in Medicine 24: 647–658.
Little, R. J., and D. B. Rubin. 2002. Statistical Analysis with Missing Data. 2d ed. New York: Wiley-Interscience.
Lunceford, J. K., and M. Davidian. 2004. “Stratification and weighting via the propensity score in estimation of causal treatment effects: a comparative study.” Statistics in Medicine 23: 2937–2960.
Meng, X. L. 1994. “Multiple-imputation inferences with uncongenial sources of input.” Statistical Science 9: 538–558; discussion 558–573.
Qu, Y., and I. Lipkovich. 2009. “Propensity score estimation with missing values using a multiple imputation missingness pattern (MIMP) approach.” Statistics in Medicine. 28:1402–1414.
Robins, J. M., A. Rotnitzky, and L. P. Zhao. 1994. “Estimation of regression coefficients when some regressors are not always observed.” Journal of the American Statistical Association 89: 846–866.
Robins, J. M., and N. Wang. 2000. “Inference for imputation estimators.” Biometrika 87: 113–124.
Rosenbaum, P. R. 1987. “Model-based direct adjustment.” Journal of the American Statistical Association 82: 387–394.
Rosenbaum, P. R., and D. B. Rubin. 1983. “The central role of the propensity score in observational studies for causal effects.” Biometrika 70: 41–55.
Rosenbaum, P. R., and D. B. Rubin. 1984. “Reducing bias in observational studies using subclassification on the propensity score.” Journal of the American Statistical Association 79: 516–524.
Rubin, D. B. 1978. “Multiple imputations in sample surveys: a phenomenological Bayesian approach to nonresponse.” In Proceedings of the Section on Survey Research Methods, 20–28. Alexandria, VA: American Statistical Association.
Rubin, D. B. 1987. Multiple Imputation for Nonresponse in Surveys. New York: John Wiley & Sons, Inc.
Schafer, J. L. 1997. Analysis of Incomplete Multivariate Data. London: Chapman and Hall/CRC.
Wang, N., and J. M. Robins. 1998. “Large-sample theory for parametric multiple imputation procedures.” Biometrika 85: 935–948.
Xie, J., and C. Liu. 2005. “Adjusted Kaplan-Meier estimator and log-rank test with inverse probability of treatment weighting for survival data.” Statistics in Medicine 24: 3089–3110.