A
agglomerative clustering methods 179
Akaike Information Criterion (AIC) 265, 280
ALPHA= statement, DATA step 224
ANOVA model
evaluating propensity scores 27–28
evaluating treatment differences 107
imputation strategies and 126
LC analysis and 155
PSBB and 317
association
causation vs. 5–7
defined 5
assumption of exchangeability 88
B
B/B (blocking/balancing) score
cluster membership and 187–189
defined 187
balancing score
See also propensity score (PS)
assessing balance in baseline characteristics 55–58
defined 12, 55, 187
evaluating across treatment groups 205–207
Bayesian simulation method 364
bias
See also selection bias
hidden 7
in observational research 385–390
overt 7
sample size and 9, 343
binary outcomes 398–400, 411–414, 419–422
blind assessments 388
blocking, key roles played 152
BOOT macro 122–123
BOOTCI macro 122–123
BOUNDS statement, NLP procedure 245–246
BOXPLOT procedure 36
BPRS (Brief Psychiatric Rating Scale) 266–267, 278
BY statement 102
C
CA (covariate adjustment) method
defined 183
LC analysis and 165
mortality rates analyses 157–158, 160
of propensity scores 62
RCTs and 347, 356
Cardiac Care Network (CCN) 62–65
case-control design 9
CATMOD procedure 225
causation, association vs. 5–7
CC (complete covariate) method
bootstrap confidence intervals 123
defined 106
IPW estimation with missing values 109–110
CCN (Cardiac Care Network) 62–65
CD Trial 351–358
CDF (cumulative distribution function) 153–154
CEA
See cost-effectiveness analysis
CEAC (cost-effectiveness acceptability curve) 344–346, 355–356
censoring
cost-effectiveness analysis and 363–382
induced informative 364
parameter estimation and 347–351
CLASS statement
DR considerations 102
INB example 359
UNIVARIATE procedure 136
Clinical Global Impression scale 389
clustering
alternatives to try 178
defined 154
JMP considerations 168–170
review of concepts 178–179
sensitivity analysis and 178–179
treatment effects 155
cohort studies 9
COMMON_SUPPORT option, WTMODEL statement 90, 92, 97–98
complete covariate method
See CC (complete covariate) method
confidence intervals
bootstrap method 122–123
dose-response analyses 307
ICER and 368
nonparametric bootstrapping and 318
PSBB and 323
confounding
defined 7
examples of 7–8
in observational studies 7–8
research checklist 289, 292–293
unmeasured 29, 214
continuous outcomes
calculating sample size 413, 416–419
DR macro and 92
longitudinal data 392–398
propensity score and 59
correlation 5–7, 415
cost-effectiveness acceptability curve (CEAC) 344–346, 355–356
cost-effectiveness analysis
about 363–365
incremental net benefit 339–362
propensity score bin bootstrapping 315–337
with censored data 363–382
counterfactual causal effect 6
COVARIANCE= option, NLP procedure 245, 248
covariate adjustment
See CA (covariate adjustment) method
Cox proportional hazards model
dose-response analyses 297, 380
propensity score matching and 61, 78–80
cross-sectional studies
defined 9
general design issues 386
cumulative distribution function (CDF) 153–154
D
DATA= option, NLP procedure 245
DATA step
ALPHA= statement 224
creating residuals 255
databases
dose-response analyses 295–311
good research practices 289–294
retrospective 287–288
defibrillator study 365, 369–380
dependent variables
confounding and 7
in logistic regression 25
missing patterns 414–422
DES/BMS safety and efficacy study 62–80
dichotomous outcomes 59–60, 93, 99–101
DISCRETE option, MODEL statement 144
DIST= option, MODEL statement 90–91
divisive clustering methods 179
dose-response analyses 295–311
doubly robust (DR) estimation
assumptions 87–88
conceptual overview 86
defined 85
implementing DR macro 88–94
limitations 101–102
practical considerations 102
sample analysis 95–101
statistical expression 87–88
DR macro
output from 89–94
specifying outcome regression models 89
specifying weight model 88–89
E
eCDF (empirical cumulative distribution function) 174, 177
EM/ECM algorithm 106
empirical cumulative distribution function (eCDF) 174, 177
end-stage renal disease (ESRD) study 296–310
endogeneity
See selection bias
erythropoiesis-stimulating agent (ESA) 296–297
ESRD (end-stage renal disease) study 296–310
ESTIMATE statement, GENMOD procedure 221
estimating
ICER 368
in structured nested models 236–239
mean cost 366–368
parameters 346–351
propensity scores 24–25, 52–53, 64–65
treatment effect 26–27, 58–61
with RMLPS 265
Euro-QOL 5D scale 389
evaluating propensity scores 27–28
exchangeability, assumption of 88
experimental studies
defined 3
observational studies vs. 3–5
exposure
See also independent variables
DR analysis of probability 97
effect on outcomes 74–80
treatment and 3n
F
FDA (Food and Drug Administration) 288
FHS (Framingham Heart Study) 5
Fieller's Theorem 368, 380
FMI (fraction of missing information) 126
Food and Drug Administration (FDA) 288
fraction of missing information (FMI) 126
Framingham Heart Study (FHS) 5
FREQ procedure
IV method for addressing selection bias 134, 139
propensity score example 33
PSBB and 323
G
G-computation formula (Robins) 264
gamma distribution 333, 351, 354–356
GEE (generalized estimating equation) 60, 304–305, 410
generalized models 25, 316
GENISOS (Genetics vs. Environment In Scleroderma Outcome Study) 419–422
GENMOD procedure
dose-response analyses 300, 304, 307
doubly robust estimation 100
ESTIMATE statement 221
INB example 359
IV method for addressing selection bias 148
LINK=LOGIT option 216
propensity score example 36–37
PSBB and 318, 320, 323
schizophrenia study 275
WEIGHT statement 221
GLM procedure
calculating least squares means 138
creating residuals 254–255
IV method for addressing selection bias 147–148
PSBB and 318, 320
SNM example 253
standardized weights and 109
GLOGIT option, MODEL statement 128
GMATCH macro 65
goodness-of-fit testing 25
greedy matching method 53–54
GROUPS= option, RANK procedure 32
H
hazard ratio (HR) 296, 304, 307
Health Collaborative Depression Study (NIH) 198–207
health maintenance organization (HMO) 134, 136
health-related quality of life (HRQOL) 232, 239
Health Services Research journal 132
hemodialysis study 296–310
heteroskedasticity 168
hidden biases 7
HIPAA 288
HMO (health maintenance organization) 134, 136
homoskedasticity 168
HR (hazard ratio) 296, 304, 307
HRQOL (health-related quality of life) 232, 239
I
ICD-9 code system 298
ICER (incremental cost-effectiveness ratio) 316–323, 340–345, 368–379
IML procedure 259–260
INB (incremental net benefit)
about 339–340
CD Trial 351–358
cost-effectiveness analysis 341–346, 365
defined 316
observational studies 359
parameter estimation 346–351
incremental cost-effectiveness ratio (ICER) 316–323, 340–345, 368–379
incremental net benefit
See INB (incremental net benefit)
IND (indicator variable) method
defined 106
IPW estimation with missing values 110–111
independent variables
confounding and 7
in logistic regression 25
missing patterns 414–422
indicator variable (IND) method
defined 106
IPW estimation with missing values 110–111
induced informative censoring 364
instrumental variable method
See IV (instrumental variable) method
International Society for Pharmacoeconomics and Outcomes Research (ISPOR) 288, 316–317
inverse probability weight approach
See IPW (inverse probability weight) approach
IPW (inverse probability weight) approach
CA method and 158, 160
defined 183
dose-response analyses 300–304
doubly robust estimation and 86
estimating treatment differences 107
handling extreme weights 127–128
mortality rates analyses 162–163
sensitivity analysis 124
with missing values 109–123
ISPOR (International Society for Pharmacoeconomics and Outcomes Research) 288, 316–317
ITT LOCF analyses 211–212, 227
IV (instrumental variable) method
addressing selection bias 131–150
applying to case study 139–143
case study description 134–137
challenges 147–148
clustering support 155
defined 12, 183
least squares regression method 138
overview 131–133
QLIM procedure and 143–146
regression adjustment method comparison 146–147
sensitivity analysis and 13
J
JMP
See also LC (local control) analysis
Analyze menu 166, 180
clustering considerations 168–170
launching files 166
LTD distribution considerations 167–182
Next Number of Clusters dialog box 169
Open Data File dialog box 167
Select Columns dialog box 167, 175
K
Kaplan-Meier estimator 366, 370, 380
Kaplan-Meier survival curves 60–61, 76–77
L
LATEs (Local Average Treatment Effects) 155
LC (local control) analysis
defined 183
determining distribution saliency 164, 174–177
fundamental concepts 153–154
identifying baseline characteristics 165, 179–182
patient registry data analysis 156–163
performing sensitivity analysis 165, 178–179
problems with randomization 152–153
revealing bias 163–174
statistical methods for 154–155
tactical phases 163–182
LCLF Study 239–252
least squares regression method 138
LIFETEST procedure 370
likelihood ratio test 198, 203
LINK=GLOGIT option, LOGISTIC procedure 216
LINK=LOGIT option, GENMOD procedure 216
Local Average Treatment Effects (LATEs) 155
local control analysis
See LC (local control) analysis
local treatment differences
See LTD distributions
log-likelihood 239
log-rank test 405–409
LOGISTIC procedure
creating residuals 254–255
IV method for addressing selection bias 148
LINK=GLOGIT option 216
MODEL statement 128
propensity score example 32–33
schizophrenia study 267, 271
logistic regression
DR estimation and 89–90, 95–96
estimating propensity scores 25, 27–28
mixed-effects model 196, 199–202
longitudinal observational study
between-group comparison 410–422
continuous outcomes 392–398
defined 386
model of propensity for treatment 195–209
NIH Health Collaborative Depression Study 198–207
regression models 263–283
sensitivity analysis in 224–227
treatment effectiveness analyses 197–198, 203–205, 211
two-stage propensity adjustment 195–209
LTD distributions
defined 152–154
determining saliency 174–177
identifying baseline characteristics 179–182
in LC tactical phases 164–165
JMP considerations 167–182
mortality rates analyses 161
systemic sensitivity analyses 178–179
M
MADIT (Multicenter Automatic Defibrillator Implantation) 365, 369–380
Mantel-Haenszel procedure 198
Mantel-Haenszel test 398, 400–405
many-to-one matching 55, 57
marginal structural model
See MSM (marginal structural model)
MATCH macro 65
matched sets
defined 53
forming for propensity scores 53–55
maximum likelihood data analysis 238–252
Mayo Clinic case study 65–68
McNemar's test 59–61, 75
MEANS procedure
propensity score example 37
standardized weights and 109
summary statistics and 225
measurement bias 385–390
medical claims databases 287–294
medication effectiveness study 30–46
MEDLINE database 317
Meta-analysis Of Observational Studies in Epidemiology (MOOSE) 14
methods
See statistical methods
MI (multiple imputation) method
bootstrap confidence intervals 123
CEA with censored data and 364
defined 106
IPW estimation with missing values 111–115, 120
MI procedure 112–113
MIANALYZE procedure
calculating FMI 126
IPW estimation with missing values 113–115, 121
Microsoft EXCEL 356
MIMP (multiple imputation missingness pattern) method
bootstrap confidence intervals 123
defined 107
IPW estimation with missing values 120–122
sensitivity analysis 124
missing patterns 414–422
missing values
CC analysis 109–110
CEA with censored data 364
censoring and 364
classifying 364
data quality and 288
examples handling 107–108
IND analysis 110–111
IPW estimation with 109–123
MI analysis 111–115
MIMP analysis 120–122
MP analysis 115–119
MSM methodology and 213, 227
propensity scoring with 105–130
research checklist 289–291
missingness pattern (MP) method
See MP (missingness pattern) method
MIXED procedure
IPW estimation with missing values 113
PSBB and 318, 320
standardized weights and 108–109
MODEL statement
DISCRETE option 144
DIST= option 90–91
DR considerations 102
GLOGIT option 128
schizophrenia study 280
SELECT option 146
specifying outcome regression models 89
specifying weight model 88–89
Monte Carlo simulations 62
MOOSE (Meta-analysis Of Observational Studies in Epidemiology) 14
MORE (Multiple Outcomes of Raloxifene Evaluation) 107
mortality rates analyses
CA method for 157–158
estimated propensity score and 158–161
IPW model 162–163
MP (missingness pattern) method
bootstrap confidence intervals 123
defined 106
IPW estimation with missing values 115–119
sensitivity analysis 124
MSM (marginal structural model)
defined 12
dose-response analyses 295–311
LCLF Study and 239
methodology overview 213–214
notation for 212
schizophrenia study 214–227
sensitivity analysis and 13
treatment effectiveness analysis model 221–223
Multicenter Automatic Defibrillator Implantation (MADIT) 365, 369–380
multiple imputation method
See MI (multiple imputation) method
multiple imputation missingness pattern method
See MIMP (multiple imputation missingness pattern) method
Multiple Outcomes of Raloxifene Evaluation (MORE) 107
N
National Institutes of Health (NIH)
Framingham Heart Study 5
Health Collaborative Depression Study 198–207
Nelson-Aalen estimator 406
neural networks 25
Newton-Raphson algorithm 411
NIH (National Institutes of Health)
Framingham Heart Study 5
Health Collaborative Depression Study 198–207
NLP procedure
BOUNDS statement 245–246
COVARIANCE= option 245, 248
DATA= option 245
LCLF Study and 243–252
loading starting values 254
maximizing log-likelihood 256–259
output considerations 248–251
PARMS statement 245
PCOV option 245
VARDEF= option 245
NNT (number needed to treat) 59
non-experimental studies
See observational studies
nonparametric bootstrapping 318
nonparametric density estimates 56
NPAR1WAY procedure 317
number needed to treat (NNT) 59
O
observational studies
See also longitudinal observational study
addressing research bias 385–390
association vs. causation 5–7
confounding in 7–8
cost estimation in 363–364
defined 3, 232
experimental studies vs. 3–5
general approaches to data analysis 183–184
general design issues 386–387
good research practices 287–294
incremental net benefit 359
issues in 5–9
methods 10–13
replicability of 8–9
reporting guidelines 13–14
sample size calculation 391–425
selection bias in 7–8, 391
sensitivity analysis and 13
study design 9–10
Type I error in 8–9
observer bias 386–387
odds ratio
discouraging use of 60
propensity score example 33–34
ODS OUTPUT statement 244–245
one-to-one matching 55
optimal matching method 54
outcome regression
doubly robust estimation and 86
specifying models 89
outcomes
See also continuous outcomes
See also dependent variables
binary 398–400, 411–414, 419–422
dichotomous 59–60, 93, 99–101
effect of exposures on 74–80
time-to-event 60–61
OUTPUT statement 144
overt biases 7
P
pair matching on propensity scores 55
paired t-test
continuous outcomes and 59
IV method for addressing selection bias 146
parameter estimation 346–351
parametric models
CA methods and 165, 183
LC analysis and 183–184
SNMM and 237–238, 242–243
PARMS statement, NLP procedure 245
partitioned estimator 365
Patient Health Questionnaire (PHQ) 30–46
patient registry data analyses 156–163
PCOV option, NLP procedure 245
pharmacoeconomics
CEA with censored data 363–382
good research practices 288
incremental net benefit 339–362
propensity score bin bootstrapping 315–337
reporting guidelines 14
pharmacoepidemiology
good research practices 287
reporting guidelines 14
PHQ (Patient Health Questionnaire) 30–46
PHREG procedure 304
PLOT procedure 320
Poisson regression model 100–101
POWER procedure 394, 407, 409
PREDICTED option, OUTPUT statement 144
probit regression
defined 52
estimating propensity scores 24
IV method for addressing selection bias 144
propensity score (PS)
See also longitudinal observational study
See also RMLPS (Regression Models on Longitudinal Propensity Scores)
advantages of 29–30
bias reduction methods 105–106
computing 32–33
defined 12, 24, 52, 183
doubly robust estimation and 84–104
estimating 24–25, 52–53, 64–65
evaluating 27–28
evaluating balance produced 36–46
fundamental theorem 185–186
limitations of 29–30
medication effectiveness study 30–46
mortality rates analyses 158–161
objective of 24, 27
problems estimating 186–187
quintile classification 197
regression adjustment of 26–30, 60
sensitivity analysis and 13
stratification of 26–27, 62, 198
with missing values 105–130
propensity score bin bootstrapping (PSBB)
about 315–319
schizophrenia study 320–334
propensity score matching
assessing balance in baseline characteristics 55–58
compared to other methods 62
DES/BMS safety and efficacy study 62–80
estimating propensity scores 52
estimating treatment effect 58–61
forming matched sets 53–55
overview 52
sensitivity analysis for 61–62
PS
See propensity score (PS)
PSBB (propensity score bin bootstrapping)
about 315–319
schizophrenia study 320–334
Q
QLIM procedure 143–146
quantile-quantile plots 56
R
randomization
fundamental problems with 152–153
key roles played 152
randomized controlled trials
See RCTs (randomized controlled trials)
RANK procedure
GROUPS= option 32
propensity score example 32
PSBB and 323
schizophrenia study 267, 323
RCTs (randomized controlled trials)
as gold standard 23
association vs. causation 5–7
bias in 4
CD Trial 351–358
CONSORT statement for 14
covariate adjustment 347, 356
defined 3
generalizability of 4
measurement bias 386
methodological considerations 10–13
observer bias 388
randomization in 58
replicability of 8
reporting guidelines 14
strengths of 3–4
study designs 9–10
RDC (Research Diagnostic Criteria) 198
Receiver Operating Characteristic
See ROC (Receiver Operating Characteristic)
REG procedure 148
regression adjustment method
IV method comparison 146–147
of propensity scores 26–30, 60
Regression Models on Longitudinal Propensity Scores
See RMLPS (Regression Models on Longitudinal Propensity Scores)
replicability of observational studies 8–9
reporting guidelines for observational studies 13–14
Research Diagnostic Criteria (RDC) 198
research practices
addressing study bias 385–390
checklist for 289–294
retrospective databases and 287–288
response analyses using large databases 295–311
RMLPS (Regression Models on Longitudinal Propensity Scores)
about 263–265
estimation 265
schizophrenia study 266–282
Robins' G-computation formula 264
ROC (Receiver Operating Characteristic)
CA method 157
defined 58
IPW method 162
propensity score estimates 158
S
salient treatment differences 153
sample size
bias and 9, 343
calculating in observational studies 391–425
checklist considerations 289, 291
confounding and 9
in retrospective database studies 293–294
nonparametric bootstrapping and 318
skewness and 317
schizophrenia study
MSM and 214–227
PSBB and 320–334
RMLPS and 266–282
scleroderma 419–422
SELECT option, MODEL statement 146
selection bias
defined 387
good research practices and 288
in observational studies 7–8, 391
in RCTs 4
IV method addressing 131–150
revealing in LC analysis 163–174
treatment 52
types of 7
sensitivity analysis
clustering and 178–179
different imputation strategies 126–127
for local control approach 165, 178–179
for propensity score matching 61–62, 75
handling extreme weights 127–128
importance of 13, 27
longitudinal observational study 224–227
schizophrenia study 280–282
varying analytic methods 124–125
SHOWCURVES option, WTMODEL statement 89–90, 97, 100
side-by-side box plots 56
skewness
in cost data 347, 351, 363–364
sample size and 317
smearing method 364
SNM (structured nested model) 12
SNMM (Structural Nested Mean Model)
about 231–234
estimation 236–239
getting starting values 254–256
LCLF Study 239–252
maximum likelihood data analysis 238–252
parametric models and 237–238
time-varying moderation and 235–236
sponsor bias 385–390
SR (stratified regression) approach 124–125
standard error 92–93
standardized differences method 56–57, 68–74
statistical methods
commonly used tools 12
for assessing balance in matched samples 56
for CEA with censored data 364–368
for local control analysis 154–155
issues and considerations 10–11
research checklist 289, 293
rule suggestions 11
stratification of propensity scores 26–27, 62, 198
stratified regression (SR) approach 124–125
STROBE (STrengthening the Reporting of OBservational studies in Epidemiology) 14
Structural Nested Mean Model
See SNMM (Structural Nested Mean Model)
structured nested model (SNM) 12
study designs
general issues 386–387
hierarchy of 10
in observational studies 9–10
survival data study 405–409
systemic sclerosis 419–422
T
t-test
paired 59, 146
PSBB and 316–317
two-sample 393–395
TABULATE procedure 134–135, 139
temporality 7
time-to-event outcomes 60–61
time-varying moderation
about 232–233
in LCLF study 241–242
notation for 234–235
SNMM and 235–236
Transparent Reporting of Evaluations with Nonrandomized Designs (TREND) 14
treatment
See also independent variables
estimating differences 107
evaluating balance across groups 205–207
exposure and 3n
longitudinal model of propensity for 195–209
salient differences in 153
within clusters 155
treatment effects
adjusted for selection bias 133
bias reduction in 105
clustering and 155
computing 33–36
doubly robust estimation of 84–104
estimating 26–27, 58–61, 124–128
evaluation challenges 195
longitudinal observational study 197–198, 203–205, 211
MSM methodology 221–223
sensitivity analysis 124–128
TREND (Transparent Reporting of Evaluations with Nonrandomized Designs) 14
TTEST procedure
IV method for addressing selection bias 134, 136–137, 139, 146
PSBB and 333
Tukey-Kramer adjustment for multiple comparisons 138
two-sample log-rank test 405–409
two-sample t-test 393–395
two-sample test on binary outcome 399–400
two-stage regression 244, 253–256
Type I error
in observational studies 8–9
random effects and 348
U
UNIVARIATE procedure
CLASS statement 136
IV method for addressing selection bias 136
schizophrenia study 279
unmeasured confounding 29, 214
USRDS (United States Renal Data System) 298
V
VARDEF= option, NLP procedure 245
variables
binary 398–400, 411–414, 419–422
checklist for definitions 289, 291
continuous 392–398, 413, 416–419
dependent 7, 25, 414–422
independent 7, 25, 414–422
VMATCH macro 65
W
WEIGHT statement, GENMOD procedure 221
weighted Mantel-Haenszel test 398, 400–405
WHERE statement 102
WHI (Women's Health Initiative) 5
Wilcoxon rank sum test
about 395–398
asymptotic distribution of 423–424
LC approach and 176
PSBB and 317, 333
willingness-to-pay (WTP) 341, 365
Women's Health Initiative (WHI) 5
WTMODEL statement
COMMON_SUPPORT option 90, 92, 97–98
SHOWCURVES option 88–89, 97, 100
WTP (willingness-to-pay) 341, 365
Z
Z score 93