    **Try to compute alpha for parenting items;
    proc corr data=sdqdata alpha;
       var Par:;
    run;

    **Recode and re-run;
    *Recode option 1 - the long way;
    data sdqdata;
       set sdqdata;
       *recode Par2;
       if Par2=1 then Par2_recode=6;
       if Par2=2 then Par2_recode=5;
       if Par2=3 then Par2_recode=4;
       if Par2=4 then Par2_recode=3;
       if Par2=5 then Par2_recode=2;
       if Par2=6 then Par2_recode=1;
       *recode Par4;
       if Par4=1 then Par4_recode=6;
       if Par4=2 then Par4_recode=5;
       if Par4=3 then Par4_recode=4;
       if Par4=4 then Par4_recode=3;
       if Par4=5 then Par4_recode=2;
       if Par4=6 then Par4_recode=1;
    run;

    *Recode option 2 - arrays!;
    data sdqdata (drop=i);
       set sdqdata;
       array old(*) Par2 Par4;                   *old variables to recode;
       array recode(*) Par2_recode Par4_recode;  *new versions;
       do i=1 to dim(old);
          if old(i)=1 then recode(i)=6;
          if old(i)=2 then recode(i)=5;
          if old(i)=3 then recode(i)=4;
          if old(i)=4 then recode(i)=3;
          if old(i)=5 then recode(i)=2;
          if old(i)=6 then recode(i)=1;
       end;
    run;

    **Re-run;
    proc corr data=sdqdata alpha;
       var Par1 Par2_recode Par3 Par4_recode Par5;
    run;
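
Because the recode above simply mirrors a 1-to-6 scale, the same result can be obtained by subtracting each response from 7. The following is a minimal sketch of that shortcut, not part of the original syntax. It assumes every response is either missing or a whole number from 1 to 6; unlike the IF statements above, it would not set out-of-range values to missing.

```sas
*Sketch: reverse-score by subtraction, assuming a 1-6 response scale;
data sdqdata (drop=i);
   set sdqdata;
   array old(*) Par2 Par4;                   *old variables to recode;
   array recode(*) Par2_recode Par4_recode;  *new versions;
   do i=1 to dim(old);
      *skip missing responses so they stay missing;
      if not missing(old(i)) then recode(i) = 7 - old(i);
   end;
run;
```

For a scale that runs from a minimum of lo to a maximum of hi, the general form is (lo + hi) minus the response.
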
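Using PROC SURVEYSELECT, we drew 1000 subsamples of N=50, 1000 subsamples of N=100, and 1000 subsamples of N=250 from the data set using simple random sampling. (METHOD=SRS selects the cases within each subsample without replacement; the subsamples are drawn independently of one another, so the same case can appear in more than one subsample.) The syntax to do this is provided below.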

    **Take 1000 subsamples of N=50;
    proc surveyselect data=sdqdata method=SRS n=50 out=sdqdata_sub50
       seed=5092 reps=1000;
    run;

    **Take 1000 subsamples of N=100;
    proc surveyselect data=sdqdata method=SRS n=100 out=sdqdata_sub100
       seed=821 reps=1000;
    run;

    **Take 1000 subsamples of N=250;
    proc surveyselect data=sdqdata method=SRS n=250 out=sdqdata_sub250
       seed=291 reps=1000;
    run;

    **Estimate alpha in each subsample;
    *ODS OUTPUT saves the CronbachAlpha table as a data set and the nosimple nocorr noprob options trim the printed output;
    ods output CronbachAlpha=alpha50;
    proc corr data=sdqdata_sub50 alpha nosimple nocorr noprob;
       by Replicate;
       var Par1 Par2_recode Par3 Par4_recode Par5;
    run;
    ods output close;

    **Print histogram of alphas;
    ods graphics on / height=3in width=3in;
    proc sgplot data=alpha50 noautolegend;
       title "Sample Size = 50";
       where Variables='Raw';
       histogram Alpha / binwidth=.025 fillattrs=(color=cxE5EAF2);
       xaxis label='Alpha' grid values=(.55 to 1 by .1);
    run;
    ods graphics off;
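
A numeric summary can complement the histogram. The sketch below is an addition rather than part of the original syntax. It assumes the alpha50 data set created by the ODS OUTPUT statement above and summarizes the raw alphas across the 1000 subsamples; the choice of statistics is illustrative.

```sas
*Sketch: summarize the spread of raw alphas across the N=50 subsamples;
proc means data=alpha50 n mean std min p5 median p95 max maxdec=3;
   where Variables='Raw';   *keep raw alphas and drop standardized ones;
   var Alpha;
run;
```

The same step could be repeated for the N=100 and N=250 subsamples once their alphas have been saved in the same way.
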

    proc corr data=marshdata alpha;
       var GDS:;
    run;
Using PROC SURVEYSELECT, we drew random subsamples of N=50, N=100, and N=250 from the sample using simple random sampling. We then conducted 2000 bootstrap resamples from each and estimated 95% confidence intervals around the alphas and item-total correlations. (See Chapter 7 for more information about bootstrapping methods and syntax.)
We used a macro, subNBoot, to do these steps for each sample. The syntax is presented below. Two arguments are fed into subNBoot: 1) ss, the numeric sample size that we would like our subsample to have, and 2) seed, the numeric seed value that PROC SURVEYSELECT will use to draw the subsample. Setting the seed allows our analyses to be replicated. subNBoot then uses a series of procedures that you were introduced to in the current chapter (PROC CORR) and Chapter 7 (PROC SURVEYSELECT and PROC UNIVARIATE) to estimate the CI.
This macro produces a set of data sets that contain the actual estimates (orig_alpha&ss [6] and orig_item&ss) and the bootstrapped CI (ci_alpha&ss and ci_item&ss).

    %MACRO subNBoot(ss,seed);
       *Subsample;
       proc surveyselect data=marshdata method=SRS n=&ss. out=marsh_sub&ss.
          seed=&seed;
       run;

       *Estimate Item Stats;
       ods output CronbachAlpha=orig_alpha&ss CronbachAlphaDel=orig_item&ss;
       proc corr data=marsh_sub&ss. alpha;
          var GDS:;
       run;
       ods output close;

       *Take Bootstrap Sample;
       proc surveyselect data=marsh_sub&ss. method=URS samprate=1 outhits
          out=outboot_marsh&ss. (compress=binary) seed=5 rep=2000;
       run;

       *Estimate Item Stats for each bootstrap sample;
       ods output CronbachAlpha=boot_alpha&ss. (compress=binary)
          CronbachAlphaDel=boot_item&ss. (compress=binary);
       proc corr data=outboot_marsh&ss. alpha nosimple nocorr noprob;
          by replicate;
          var GDS:;
       run;
       ods output close;

       *Estimate CI from bootstrapped results;
       proc univariate data=boot_alpha&ss.;
          where variables='Raw';
          var Alpha;
          output out=ci_alpha&ss. pctlpts=2.5, 97.5 mean=Alpha_mean
             std=Alpha_std pctlpre=Alpha_ci;
       run;

       proc sort data=boot_item&ss. nodupkey;
          by Variable replicate;
       run;

       proc univariate data=boot_item&ss.;
          by Variable;
          var RawCorr;
          output out=ci_item&ss. pctlpts=2.5, 97.5 mean=RawCorr_mean
             std=RawCorr_std pctlpre=RawCorr_ci;
       run;
    %MEND;

    %subNBoot(50,3);     *Run analysis with subsample of N=50;
    %subNBoot(100,8321); *Run analysis with subsample of N=100;
    %subNBoot(250,26);   *Run analysis with subsample of N=250;
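
Once the three macro calls have run, the bootstrapped intervals for alpha sit in three separate data sets. The sketch below, which is an addition and not part of the original syntax, stacks them for side-by-side viewing. The data set name ci_alpha_all and the Subsample label variable are hypothetical names introduced for illustration. The percentile variables should appear with names built from the Alpha_ci prefix (most likely Alpha_ci2_5 and Alpha_ci97_5), so the PROC PRINT step lists all variables rather than naming them.

```sas
*Sketch: stack the bootstrapped alpha CIs from the three subsamples;
data ci_alpha_all;               *hypothetical name for the combined data set;
   length Subsample $8;
   set ci_alpha50  (in=from50)
       ci_alpha100 (in=from100)
       ci_alpha250 (in=from250);
   if from50 then Subsample='N=50';
   else if from100 then Subsample='N=100';
   else Subsample='N=250';
run;

proc print data=ci_alpha_all noobs;
run;
```
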

Subsample | Alpha | 95% CI |
---|---|---|
N=50 | 0.84 | (0.74, 0.89) |
N=100 | 0.88 | (0.82, 0.91) |
N=250 | 0.90 | (0.87, 0.92) |

Var | N=50: Item-total R | N=50: 95% CI | N=100: Item-total R | N=100: 95% CI | N=250: Item-total R | N=250: 95% CI |
---|---|---|---|---|---|---|
GDS01 | .51 | (.16, .77) | .48 | (.24, .67) | .52 | (.38, .65) |
GDS02 | .25 | (.00, .55) | .31 | (.11, .50) | .41 | (.29, .53) |
GDS03 | .36 | (.11, .66) | .53 | (.30, .70) | .58 | (.44, .69) |
GDS04 | .36 | (-.12, .69) | .42 | (.16, .63) | .49 | (.35, .61) |
GDS05 | .32 | (.03, .59) | .38 | (.09, .61) | .55 | (.41, .66) |
GDS06 | .10 | (-.18, .49) | .51 | (.29, .69) | .55 | (.41, .66) |
GDS07 | .52 | (.33, .77) | .50 | (.23, .69) | .47 | (.29, .61) |
GDS08 | .31 | (.24, .62) | .36 | (.12, .58) | .37 | (.19, .52) |
GDS09 | .56 | (.05, .81) | .46 | (.23, .66) | .59 | (.45, .71) |
GDS10 | .23 | (-.02, .53) | .54 | (.32, .70) | .62 | (.49, .72) |
GDS11 | .50 | (.15, .74) | .44 | (.21, .62) | .44 | (.31, .56) |
GDS12 | .39 | (.09, .65) | .43 | (.24, .61) | .31 | (.18, .44) |
GDS13 | .38 | (-.04, .69) | .51 | (.32, .67) | .35 | (.20, .49) |
GDS14 | .26 | (.02, .54) | .00 | (-.14, .18) | .30 | (.16, .43) |
GDS15 | .23 | (-.03, .53) | .63 | (.40, .78) | .37 | (.20, .52) |
GDS16 | .78 | (.53, .90) | .62 | (.43, .77) | .69 | (.58, .77) |
GDS17 | .64 | (.33, .86) | .56 | (.34, .72) | .69 | (.57, .77) |
GDS18 | .31 | (.23, .62) | .34 | (.11, .55) | .35 | (.18, .51) |
GDS19 | .46 | (.15, .71) | .59 | (.46, .71) | .56 | (.46, .65) |
GDS20 | .21 | (-.07, .54) | .25 | (.03, .47) | .37 | (.25, .50) |
GDS21 | .42 | (.12, .68) | .46 | (.27, .62) | .43 | (.31, .53) |
GDS22 | --* | (--, --)* | .48 | (.21, .66) | .62 | (.48, .73) |
GDS23 | .30 | (.07, .60) | .34 | (.08, .58) | .56 | (.41, .68) |
GDS24 | .46 | (.09, .73) | .59 | (.42, .74) | .43 | (.29, .57) |
GDS25 | .38 | (.31, .69) | .48 | (.22, .67) | .57 | (.44, .69) |
GDS26 | .63 | (.33, .84) | .42 | (.21, .60) | .49 | (.36, .60) |
GDS27 | .29 | (-.03, .61) | .25 | (.04, .47) | .37 | (.23, .50) |
GDS28 | .10 | (-.17, .40) | .34 | (.12, .54) | .40 | (.26, .53) |
GDS29 | .28 | (-.01, .60) | .23 | (.02, .42) | .37 | (.24, .50) |
GDS30 | .50 | (.25, .70) | .30 | (.09, .49) | .39 | (.27, .50) |

*This parameter could not be estimated because of lack of variance among the responses for this item in the reduced sample.

Note: Confidence intervals that did not contain the “population” parameter are highlighted.