Chapter 6: Working with Enrollment Data

Introduction and Goals

Review and Approach

Basics of Medicare Enrollment Data

Our Programming Plan

Algorithms: Identifying Continuously Enrolled FFS Beneficiaries

Why Define Continuously Enrolled FFS Beneficiaries?

How to Specify the Programming for Continuous Enrollment in Medicare FFS

Medicare Part A and Part B Enrollment Variables

HMO Coverage Variable

Date of Death Variable

How to Program in SAS to Define Continuous Enrollment in Medicare FFS

Algorithms: Create or Retain Data Elements for Displaying Results by Certain Characteristics

Coverage Characteristics, Month of Death, Sex, and Race

Age Groups

Geographic Characteristics

Algorithms: Create Final Enrollment Data for Remainder of Programming

Chapter Summary

Exercises

 

Introduction and Goals

In Chapter 5, we acquired the Medicare enrollment file for the 2010 calendar year and loaded the file into a SAS data set. In this chapter, our goal is to utilize this Master Beneficiary Summary File (MBSF) data set in our research programming project. As discussed in earlier chapters, the online companion to this book is at http://support.sas.com/publishing/authors/gillingham.html. Here, you will find information on creating dummy source data, the code in this and subsequent chapters, as well as answers to the exercises in this book. I expect you to visit the book’s website, create your own dummy source data, and run the code yourself.

Recall that we have been asked to evaluate the 2010 outcomes of a pilot program designed to incentivize providers to reduce costs and improve quality. In order to do so, we will compute simple measures of payment, utilization, and quality outcomes for those providers that interacted with the beneficiaries in our sample population. The pilot program operated for the full calendar year of 2010. Although a population of providers (and associated beneficiaries) participating in the pilot program was chosen at the outset of the demonstration, we have been asked to perform our analyses only on those beneficiaries who were continuously enrolled in Medicare Fee-for-Service (FFS) throughout all twelve months of 2010. Therefore, we must write and execute SAS code that delimits our beneficiary population to those beneficiaries continuously enrolled in Medicare FFS throughout all twelve months of calendar year 2010.1 In subsequent chapters we will also use this file to delimit our calendar year 2010 claims data. We will then develop algorithms that will query the claims data to produce summaries of payment, utilization, and quality. Given this plan, we will use the MBSF data to do the following:

• Identify beneficiaries continuously enrolled in Medicare FFS throughout all twelve months of calendar year 2010, and use this information to delimit our 2010 MBSF data set to continuously enrolled beneficiaries.

• Create or retain data elements for displaying results by the following beneficiary characteristics:

∘ Medicare coverage characteristics in 2010

∘ Sex

∘ Race

∘ Age groups (less than 65, 65-74, 75-84, 85-94, and 95 and older)

∘ Beneficiary geographic information, defined by Social Security Administration (SSA) state and county codes, along with the corresponding state and county names

• Create a file of beneficiary enrollment and demographic information distilled from the MBSF data for 2010 that we will use for the remainder of our programming for our sample research project. This file will be at the beneficiary level, with one record per beneficiary.

Review and Approach

Basics of Medicare Enrollment Data

In Chapter 2 through Chapter 5, we discussed the following characteristics of Medicare enrollment, the Master Beneficiary Summary File, and programming with Medicare administrative data:

• The majority of Medicare beneficiaries are eligible for Medicare insurance because they are aged 65 and over. However, under certain circumstances, Medicare also insures beneficiaries who are permanently disabled or have ESRD or ALS.

• Beneficiaries can enroll in Medicare Parts A and B or Medicare Part C. Medicare Parts A and B are FFS programs, while Medicare Part C is a managed care program (Medicare Advantage).

• Medicare Part C permits Medicare beneficiaries to enroll in a managed care organization instead of participating in traditional Medicare FFS. Medicare Part A helps pay for care that is provided in an institutional setting, like inpatient hospitals. Medicare Part B helps pay for care that is provided in a non-institutional setting, like a physician’s office, as well as institutional outpatient services.

• Generally, the claims of beneficiaries enrolled in Part C do not appear in the administrative claims data. Some HMOs do report claims, and these claims can appear in the administrative data, but in such small numbers that we cannot study these beneficiaries because we would not have full utilization information.

• The Medicare MBSF data contains information on Medicare beneficiaries, including enrollment and demographic characteristics like reason for entitlement, date of birth, age, sex, state, county, and zip code.

• MBSF data are available by request. Our example enrollment data for 2010 arrived in flat file format, and we loaded it into a SAS data set named SRC.MBSF_AB_2010.

• The MBSF data dictionary defines the variables in the file. The data dictionary is available from CMS’s data distribution contractor and is integral to the effective use of the MBSF.

• It is prudent to plan effectively prior to beginning your programming project. Part of this planning effort involves creating written specifications that will guide the creation of your SAS algorithms. In addition, it is equally important to execute quality assurance and quality control steps throughout your programming process. These QA/QC procedures include reviewing the written specifications and debugging your SAS code through viewing output and test cases, as well as benchmarking output. Finally, because Medicare administrative data files can be quite large, it is important to keep efficient programming techniques in mind when coding.

Our Programming Plan

We will use the 2010 MBSF enrollment data to establish our population of full-year FFS beneficiaries (i.e., those beneficiaries enrolled in Medicare FFS throughout all twelve months of 2010). We will also utilize the 2010 MBSF for all enrollment information needed for our research programming effort. Therefore, our plan is to first query the 2010 MBSF and establish our population of full-year FFS beneficiaries. Next, we will turn our attention to creating descriptive variables. With this in mind, we can begin our programming with MBSF data by identifying beneficiaries in our sample population who were continuously enrolled in Medicare FFS throughout all twelve months of calendar year 2010.

Algorithms: Identifying Continuously Enrolled FFS Beneficiaries

Why Define Continuously Enrolled FFS Beneficiaries?

Identifying beneficiaries who have been continuously enrolled in Medicare FFS (be it for a calendar year or a period prior to or following a specific medical event, like a hospitalization) is a common task in research programming. Because Medicare administrative claims data include all FFS claims submitted on behalf of a beneficiary (and most likely will not include claims for managed care enrollees), focusing on beneficiaries who are enrolled in Medicare FFS helps to ensure that we have a complete picture of each beneficiary’s medical history for that year. In our case, we are seeking to determine the impact of a program in the year 2010 on payment, utilization, and quality. In order to do so comprehensively, we will limit the population we are studying to those beneficiaries who have been continuously enrolled in Medicare FFS for all twelve months of the 2010 calendar year to help ensure that we have an accurate picture of their claims data during that time.

How to Specify the Programming for Continuous Enrollment in Medicare FFS

Depending on your project needs, “continuous enrollment” can be defined in many different ways; it is just a business rule. We will define continuous enrollment as follows: A beneficiary is considered continuously enrolled in Medicare FFS throughout 2010 if they were alive for the entire year, did not have managed care coverage at any point during the year, and were enrolled in Medicare Parts A and B all year.2 Accomplishing this task requires understanding and using the following variables in the Master Beneficiary Summary File, which will be explained in detail in the remainder of this chapter:

• Medicare Hospital Insurance (Part A) and Supplementary Medical Insurance (Part B) coverage variables

• HMO coverage variable

• Date of death variable

Medicare Part A and Part B Enrollment Variables

The Medicare Part A and Part B Enrollment variables (BENE_HI_CVRAGE_TOT_MONS and BENE_SMI_CVRAGE_TOT_MONS) describe the number of months that a beneficiary is enrolled in Part A and Part B. These variables are sometimes referred to by the shortened variable names A_MO_CNT and B_MO_CNT. It is important to determine enrollment because services received when a beneficiary is not enrolled in Medicare Part A or Part B for a period of time are not contained in the claims files. Therefore, a distorted picture of, say, utilization may result if gaps in coverage are allowed.

Based on the data dictionary for the MBSF3, we want to keep only those beneficiaries with values of 12 for each variable, denoting that the beneficiary was enrolled in Part A and Part B for all 12 months of the year. We do not want to keep beneficiaries with a value of less than 12 for either the BENE_HI_CVRAGE_TOT_MONS variable or the BENE_SMI_CVRAGE_TOT_MONS variable because, as stated above, we want to retain only those beneficiaries who were enrolled in both Medicare Part A and Part B throughout 2010.

HMO Coverage Variable

The HMO coverage variable (BENE_HMO_CVRAGE_TOT_MONS) describes the number of months that a beneficiary was enrolled in a managed care plan (also known as a Health Maintenance Organization, or HMO) during the year. As stated above, Medicare Part C permits Medicare beneficiaries to enroll in a private managed care insurance plan instead of participating in traditional Medicare FFS. Generally, the claims of beneficiaries enrolled in these managed care plans do not appear in the administrative claims data, so we seek to exclude these beneficiaries from our study population. Another quick check of the MBSF data dictionary shows we want to keep only those beneficiaries with a value of 0 for the BENE_HMO_CVRAGE_TOT_MONS variable.

Because this variable is a count of months, any value greater than 0 indicates that the beneficiary was enrolled in managed care for at least one month of the year.

Date of Death Variable

The date of death variable (DEATH_DT) describes the date that a beneficiary died. We want to keep only those beneficiaries who were alive throughout 2010. Although we will not use it, the Valid Date of Death Switch variable (V_DOD_SW) can be used to check the date of death; this variable indicates that the beneficiary’s date of death has been confirmed by the Social Security Administration as accurate.4 In this case, we will delimit our population to records where the DEATH_DT variable is missing, indicating that no date of death is available.

How to Program in SAS to Define Continuous Enrollment in Medicare FFS

Now that we have specified how to define continuous enrollment, we can build the code. The great thing about SAS is that there are many, many different ways to accomplish a single task, and there is really no right or wrong method (as long as the different approaches provide the same, correct answer!). That said, we are often concerned with efficiency when using Medicare data because the data sets can be quite large. It is important to get into the habit of reading and writing as few times as possible. Therefore, we will attempt to perform the tasks listed above in as few DATA steps as possible, while creating code that is (hopefully) useful from an instructional perspective. In addition, we will take a first step in quality assuring our code by examining our output and attempting to replicate our results.

I present the following method for identifying and retaining beneficiaries who have been continuously enrolled in Medicare FFS. First, in Step 6.1, we perform a simple DATA step to flag those beneficiaries continuously enrolled in Medicare Parts A and B. Specifically, we create a variable called CONTENRL_AB_2010 that is set equal to ‘ab’ if the values of BENE_HI_CVRAGE_TOT_MONS and BENE_SMI_CVRAGE_TOT_MONS are both equal to 12, denoting that the beneficiary was enrolled in Medicare Parts A and B for all twelve months of the year. In the same fashion, we create a variable called CONTENRL_HMO_2010 that is set equal to ‘nohmo’ if the value of BENE_HMO_CVRAGE_TOT_MONS is equal to 0, denoting that the beneficiary was not enrolled in an HMO at any time during the year. Finally, we create a variable called DEATH_2010 that is set equal to 0 if the value of DEATH_DT is missing, denoting that the beneficiary was alive during all twelve months of the year.

/* STEP 6.1: BUILD CONTINUOUS ENROLLMENT INFORMATION IN 2010 MBSF FILE */

data enr.contenr_2010;

    set src.mbsf_ab_2010;

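    /* DECLARE FLAG LENGTHS SO THE LONGEST VALUE ('nohmo') IS NOT TRUNCATED */

    length contenrl_ab_2010 contenrl_hmo_2010 $5;

    /* FLAG BENEFICIARIES WITH 12 MONTHS OF BOTH PART A AND PART B COVERAGE */

    if bene_hi_cvrage_tot_mons=12 and bene_smi_cvrage_tot_mons=12 then contenrl_ab_2010='ab'; else contenrl_ab_2010='noab';

    /* FLAG BENEFICIARIES WITH ZERO MONTHS OF HMO COVERAGE; ANY OTHER VALUE (INCLUDING MISSING) IS TREATED AS HMO */

    if bene_hmo_cvrage_tot_mons=0 then contenrl_hmo_2010='nohmo'; else contenrl_hmo_2010='hmo';

    /* FLAG BENEFICIARIES WITH A DATE OF DEATH ON THE 2010 FILE */

    if death_dt ne . then death_2010=1; else death_2010=0;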

run;

 

In Step 6.2, we output the following frequency distribution of the enrollment flags we created in Step 6.1, the results of which are illustrated by Output 6.1:

/* STEP 6.2: FREQUENCY OF CONTINUOUS ENROLLMENT VARIABLES */

ods html file="C:\Users\mgillingham\Desktop\SAS Book\FINAL_DATA\ODS_OUTPUT\Gillingham_fig6_2_ENRL.html"

image_dpi=300 style=GrayscalePrinter;

ods graphics on / imagefmt=png;

title "VARIABLES USED TO DETERMINE CONTINUOUS ENROLLMENT IN 2010 DATA";

proc freq data=enr.contenr_2010;

   tables contenrl_ab_2010 contenrl_hmo_2010 death_2010 / missing;

run;

ods html close;

Output 6.1: Continuous Enrollment Variables


Finally, in Step 6.3, we create our file of continuously enrolled beneficiaries by delimiting the ENR.CONTENR_2010 file created in Step 6.1 by the enrollment flags we defined in that same step. A beneficiary is defined as continuously enrolled in Medicare FFS in calendar year 2010 if the value of CONTENRL_AB_2010 is equal to ‘ab,’ the value of CONTENRL_HMO_2010 is equal to ‘nohmo,’ and the value of DEATH_2010 is not equal to 1. We keep only these records.

/* STEP 6.3: CREATE A 2010 ENROLLMENT FILE OF ONLY CONTINUOUSLY ENROLLED BENEFICIARIES */

data enr.contenr_2010_fnl;

    set enr.contenr_2010;

       if contenrl_ab_2010='ab' and contenrl_hmo_2010='nohmo' and death_2010 ne 1;

run;
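One way to act on the replication idea mentioned earlier is to count the continuously enrolled beneficiaries a second time, applying the written specification directly to the source file rather than to the flags. The following is a minimal sketch and is not part of the book's numbered steps; it assumes the SRC and ENR librefs are already assigned and that DEATH_DT is a numeric SAS date. The column aliases N_SPEC and N_FNL are made up for illustration.

```sas
/* SKETCH (ASSUMPTION, NOT A NUMBERED STEP): REPLICATE THE STEP 6.3 COUNT */
proc sql;
    /* COUNT BENEFICIARIES MEETING THE WRITTEN SPECIFICATION IN THE SOURCE FILE */
    select count(*) as n_spec
    from src.mbsf_ab_2010
    where bene_hi_cvrage_tot_mons=12
      and bene_smi_cvrage_tot_mons=12
      and bene_hmo_cvrage_tot_mons=0
      and death_dt is missing;

    /* COUNT RECORDS IN THE FINAL FILE CREATED IN STEP 6.3 */
    select count(*) as n_fnl
    from enr.contenr_2010_fnl;
quit;
```

If the two counts differ, either the flags or the subsetting IF statement departs from the written specification and should be reviewed.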

 

Algorithms: Create or Retain Data Elements for Displaying Results by Certain Characteristics

In addition to delimiting our population to beneficiaries who were continuously enrolled in Medicare FFS throughout 2010, we also must create flags for displaying our results in later chapters by the following beneficiary characteristics:

• Medicare coverage characteristics in 2010

• Sex

• Race

• Age groups (less than 65, 65-74, 75-84, 85-94, and 95 and older)

• Beneficiary geographic information, defined by assigning an SSA state and county code, and corresponding state and county names, to each beneficiary

While we are working with the MBSF data in this chapter, we will create or retain these descriptive data elements. Although we cannot utilize these data elements to segment our payment and utilization statistics until later chapters, it is helpful to create them in this chapter while we are working with the enrollment data. We can put the variables to immediate use by using them to study the composition of our population. Accomplishing this now does not really influence the efficiency of our code (we can just set the output aside until needed), and it comes with the benefit of knowing more about our study population and finalizing the MBSF information we will need to complete the remainder of the programming for our sample project. This will be well worth the effort!

Coverage Characteristics, Month of Death, Sex, and Race

Our input data set is the output of the continuous enrollment exercise, ENR.CONTENR_2010_FNL. We do not need to create separate flags to define enrollment characteristics in the 2010 data because we can simply use the variables created in Step 6.1. In addition, there is no need to perform any programming to create separate variables that contain information on sex or race; the SEX and RACE variables provided in the MBSF files serve our purposes. In Step 6.4, we simply explore the information contained in the SEX and RACE variables using frequency distributions. Note that we use PROC FORMAT so the definitions of the values of SEX and RACE are displayed in our output. We use SAS’ Output Delivery System (ODS) to create our output.

/* STEP 6.4: INITIAL INVESTIGATION OF SEX AND RACE IN THE 2010 DATA */

proc format;

    value $sex_cats_fmt

        '0'='UNKNOWN'

        '1'='MALE'

        '2'='FEMALE';

run;

 

ods html file="C:\Users\mgillingham\Desktop\SAS Book\FINAL_DATA\ODS_OUTPUT\Gillingham_fig6_4_SEX.html"

image_dpi=300 style=GrayscalePrinter;

ods graphics on / imagefmt=png;

title "FREQUENCY OF SEX IN 2010 DATA";

proc freq data=enr.contenr_2010_fnl;

    tables sex / missing;

       format sex $sex_cats_fmt.;

run;

ods html close;

 

proc format;

   value $race_cats_fmt

       '0'='UNKNOWN'

       '1'='WHITE'

       '2'='BLACK'

       '3'='OTHER'

       '4'='ASIAN'

       '5'='HISPANIC'

       '6'='NORTH AMERICAN NATIVE';

run;

 

ods html file="C:\Users\mgillingham\Desktop\SAS Book\FINAL_DATA\ODS_OUTPUT\Gillingham_fig6_4_RACE.html"

image_dpi=300 style=GrayscalePrinter;

ods graphics on / imagefmt=png;

title "FREQUENCY OF RACE IN 2010 DATA";

proc freq data=enr.contenr_2010_fnl;

   tables race / missing;

      format race $race_cats_fmt.;

run;

ods html close;

 

Output 6.2 and Output 6.3 show the results of Step 6.4. You can see that there are more females than males in our population, and that the majority of our beneficiaries are white.

Output 6.2: Percentage of Sex in 2010 Data


Output 6.3: Percentage of Race in 2010 Data


Age Groups

Next, let’s define the age category to which each beneficiary belongs. Suppose for illustrative purposes we wish to calculate each beneficiary’s age as of the beginning of the reference year (in our case, January 1, 2010). In Step 6.5, we use our ENR.CONTENR_2010_FNL data set to calculate this variable and call it STUDY_AGE. Suppose further that we wish to create a variable that describes a beneficiary’s inclusion in one of five age groups: less than 65, 65-74, 75-84, 85-94, and 95 and older.5 Also in Step 6.5, we create a variable called AGE_CATS that groups beneficiaries by their value of STUDY_AGE. Note that we use PROC FORMAT to create a format called AGE_CATS_FMT that we apply to the display of AGE_CATS below.

/* STEP 6.5: CREATE VARIABLE NAMED STUDY_AGE THAT CONTAINS AGE AS OF 01.01.2010 */

/* STEP 6.5 (CONT): CREATE VARIABLE AGE_CATS THAT GROUPS STUDY_AGE INTO AGE CATEGORIES */

proc format;

   value age_cats_fmt

       0='AGE LESS THAN 65'

       1='AGE BETWEEN 65 AND 74, INCLUSIVE'

       2='AGE BETWEEN 75 AND 84, INCLUSIVE'

       3='AGE BETWEEN 85 AND 94, INCLUSIVE'

       4='AGE GREATER THAN OR EQUAL TO 95';

run;

 

data enr.contenr_2010_fnl;

   set enr.contenr_2010_fnl;

   format age_cats age_cats_fmt.;

   study_age=floor((intck('month', bene_dob, '01jan2010'd) - (day('01jan2010'd) < day(bene_dob))) / 12);

   select;

       when (study_age<65)      age_cats=0;

       when (65<=study_age<=74) age_cats=1;

       when (75<=study_age<=84) age_cats=2;

       when (85<=study_age<=94) age_cats=3;

       when (study_age>=95)     age_cats=4;

   end;

   label age_cats='Beneficiary age category at beginning of reference year (January 1, 2010)';

run;
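To see how the STUDY_AGE expression behaves around a birthday, it can help to run it on a few hand-picked dates. The sketch below is an illustration only: the three birth dates are hypothetical, and the variable REF_DT is introduced here simply to hold the January 1, 2010 reference date. The results are written to the SAS log.

```sas
/* SKETCH (ASSUMPTION): CHECK THE STUDY_AGE FORMULA ON HYPOTHETICAL BIRTH DATES */
data _null_;
    ref_dt='01jan2010'd;
    do bene_dob='01jan1945'd, '02jan1945'd, '15jun1930'd;
        /* SAME EXPRESSION AS STEP 6.5: WHOLE MONTHS ELAPSED, LESS ONE IF THE
           DAY OF THE MONTH HAS NOT YET BEEN REACHED, CONVERTED TO WHOLE YEARS */
        study_age=floor((intck('month', bene_dob, ref_dt) - (day(ref_dt) < day(bene_dob))) / 12);
        put bene_dob= date9. study_age=;
    end;
run;
/* EXPECTED IN THE LOG: 65 FOR 01JAN1945, 64 FOR 02JAN1945, 79 FOR 15JUN1930 */
```

The first two dates straddle the 65th birthday, which is the boundary between the first two values of AGE_CATS, so they make a useful test case for the grouping as well.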

Finally, in Step 6.6, we check the results of Step 6.5 (still using the file ENR.CONTENR_2010_FNL) with a cross-tabulation of STUDY_AGE (the age we calculated as of January 1, 2010) and AGE_CATS (the groupings of STUDY_AGE); the results appear in Output 6.4. The output of this frequency can be used to check that our calculations of STUDY_AGE and AGE_CATS were performed correctly. Note that, for display purposes, we delimited our input data to retain only those beneficiaries aged 65 through 70, inclusive. You can remove the WHERE= data set option and perform a frequency distribution for all values of STUDY_AGE at your discretion.

/* STEP 6.6: DISPLAY AGE GROUP CHARACTERISTICS IN 2010 ENROLLMENT DATA */

ods html file="C:\Users\mgillingham\Desktop\SAS Book\FINAL_DATA\ODS_OUTPUT\Gillingham_fig6_6_AGE.html"

image_dpi=300 style=GrayscalePrinter;

ods graphics on / imagefmt=png;

title "CROSS TAB OF STUDY_AGE AND AGE_CATS IN 2010 DATA";

proc freq data=enr.contenr_2010_fnl(where=(65<=study_age<=70));

   tables study_age * age_cats / list missing;

   format age_cats age_cats_fmt.;

run;

ods html close;

Here is the output of the frequency distribution executed in Step 6.6. You can see that the computation of AGE_CATS is accurate.

Output 6.4: Age Groups


Geographic Characteristics

What are some effective ways to display healthcare data by geographic characteristics? Certainly, grouping by state may be effective, but may also yield results that are defined too broadly. On the other hand, viewing by zip code may be too fine a level of granularity to pull out any meaningful results. In some cases, it may be informative to designate Hospital Referral Regions (HRRs) or Hospital Service Areas (HSAs)6, or to assign a Metropolitan Statistical Area (MSA).

In our example research programming project, we will add the SSA state and county name to our enrollment data set. SSA state and county codes already exist on the MBSF data we received and loaded in Chapter 5.7 A publicly available SSA code file contains the state and county names that correspond to the SSA state and county codes.8 In order to add the state and county names to the MBSF data, we must load the SSA code file into SAS and work to merge the county names onto our MBSF data. In other words, our task is to merge state and county names onto our data set that contains enrollment information for those beneficiaries who have been continuously enrolled in Medicare FFS throughout all twelve months of 2010 (called ENR.CONTENR_2010_FNL). We do this by merging our enrollment data with the SSA code file by SSA state and county codes.

In Step 6.7, we begin by loading the SSA code file (called MSABEA.TXT) into a SAS data set. To this end, we create a SAS data set called SRC.MSABEA_SSA that contains the SSA state and county codes (concatenated into a single variable called SSA), the corresponding county name (COUNTY), and the corresponding state name (STATE).

/* STEP 6.7: LOAD SSA STATE AND COUNTY CODE INFORMATION */

data src.msabea_ssa;

       infile "C:\Users\mgillingham\Desktop\SAS Book\FINAL_DATA\source_data\MSABEA03.TXT" missover;

       input

              county $  1-25

              state  $ 26-27

              ssa    $ 30-34;

run;

In Step 6.8, we prepare to merge this file with our MBSF data by sorting the data set by the SSA variable.9

/* STEP 6.8: SORT SSA STATE AND COUNTY CODES FILE TO REMOVE DUPLICATE RECORD FOR DADE OR MIAMI DADE */

proc sort data=src.msabea_ssa nodupkey;

       by ssa;

run;
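Before discarding records with NODUPKEY, it can be worth looking at exactly which records are dropped. The following is a minimal sketch using the DUPOUT= option of PROC SORT; the WORK data set names MSABEA_NODUP and MSABEA_DUPS are hypothetical. It is meant to be run against the file as loaded in Step 6.7, before Step 6.8 has removed the duplicates in place, otherwise the duplicates data set will be empty.

```sas
/* SKETCH (ASSUMPTION): CAPTURE THE RECORDS THAT NODUPKEY WOULD DROP */
proc sort data=src.msabea_ssa out=work.msabea_nodup nodupkey dupout=work.msabea_dups;
    by ssa;
run;

/* REVIEW THE DROPPED RECORDS TO CONFIRM THEY ARE TRULY DUPLICATIVE */
proc print data=work.msabea_dups;
    var ssa state county;
run;
```

Writing the sorted output to WORK leaves SRC.MSABEA_SSA untouched, so the check does not interfere with the numbered steps.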

In Step 6.9, we prepare our enrollment file (ENR.CONTENR_2010_FNL) to receive the SSA state and county name information loaded into the SRC.MSABEA_SSA data set. Prior to merging with the SRC.MSABEA_SSA data set, we must create a variable on the ENR.CONTENR_2010_FNL data set that is equivalent to the SSA variable on the SRC.MSABEA_SSA data set. More specifically, our ENR.CONTENR_2010_FNL data set contains the information stored in the SSA variable in two variables (STATE_CD and CNTY_CD). Therefore, in order to merge our enrollment data with the SSA code file, we concatenate these two separate variables, creating an equivalent SSA variable (also called SSA) on the ENR.CONTENR_2010_FNL data set.

/* STEP 6.9: CREATE SSA VARIABLE ON ENROLLMENT DATA */

data enr.contenr_2010_fnl;

       set enr.contenr_2010_fnl;

       ssa=state_cd||cnty_cd;

run;

In Step 6.10, we proceed to merge the county and state names onto our enrollment data. First, we sort the ENR.CONTENR_2010_FNL data set by SSA, and then we perform a simple merge of the ENR.CONTENR_2010_FNL and the SRC.MSABEA_SSA data sets by SSA, keeping all of the data in our enrollment data set. In this way, we have assigned a state and county name to each beneficiary in our file of beneficiaries who were continuously enrolled in Medicare FFS throughout all twelve months of 2010.

/* STEP 6.10: SORT CONTINUOUS ENROLLMENT DATA AND MERGE WITH MSABEA FILE */

proc sort data=enr.contenr_2010_fnl; by ssa; run;

 

data enr.contenr_2010_fnl;

       merge enr.contenr_2010_fnl(in=a) src.msabea_ssa(in=b);

       by ssa;

       if a;

run;
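Because the merge keeps every enrollment record whether or not a match is found, a quick check for unmatched SSA codes is a reasonable quality assurance step. The sketch below is an illustration rather than one of the book's numbered steps; it assumes that COUNTY is never blank on the SSA code file, so a missing COUNTY after the merge signals a beneficiary whose SSA code found no match.

```sas
/* SKETCH (ASSUMPTION): LIST SSA CODES THAT DID NOT FIND A MATCH IN THE MSABEA FILE */
proc freq data=enr.contenr_2010_fnl;
    where missing(county);
    tables ssa / missing;
run;
```

If no records meet the WHERE condition, every beneficiary received a state and county name; otherwise, the listed codes can be compared against the SSA code file.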

Finally, in Step 6.11, we perform a simple print of the newly added state and county name variables in the ENR.CONTENR_2010_FNL data set (displaying just the first 10 records).

/* STEP 6.11: DISPLAY SSA STATE AND COUNTY NAMES IN 2010 ENROLLMENT DATA */

ods html file="C:\Users\mgillingham\Desktop\SAS Book\FINAL_DATA\ODS_OUTPUT\Gillingham_fig6_11.html"

image_dpi=300 style=GrayscalePrinter;

ods graphics on / imagefmt=png;

title "SSA STATE AND COUNTY NAMES IN 2010 DATA";

proc print data=enr.contenr_2010_fnl(obs=10);

       var bene_id ssa state county;

run;

ods html close;

 

We use ODS to display the first 10 records of the output of Step 6.11, shown below in Output 6.5.

Output 6.5: SSA State and County Names


Algorithms: Create Final Enrollment Data for Remainder of Programming

The last step in our programming with MBSF data is to create a final file that we will carry forward and use throughout the remainder of our programming. In Step 6.12, we merely need to carry the file ENR.CONTENR_2010_FNL through the remainder of our processing. Since we will be doing much of our work at the beneficiary level, the last programming step in this chapter is to sort the ENR.CONTENR_2010_FNL data set by the beneficiary identifier BENE_ID.

/* STEP 6.12: CREATE FINAL ENROLLMENT FILE */

proc sort data=enr.contenr_2010_fnl;

       by bene_id;

run;
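Since the final file is meant to contain one record per beneficiary, it is worth confirming that before carrying it forward. The following is a minimal sketch, not a numbered step from the book; the column aliases N_RECORDS and N_BENES are made up for illustration.

```sas
/* SKETCH (ASSUMPTION): CONFIRM ONE RECORD PER BENEFICIARY IN THE FINAL FILE */
proc sql;
    select count(*) as n_records,
           count(distinct bene_id) as n_benes
    from enr.contenr_2010_fnl;
quit;
```

The two counts should be equal. If N_RECORDS is larger than N_BENES, some BENE_ID values appear more than once and the earlier steps should be reviewed.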

 

Chapter Summary

In this chapter we used the 2010 MBSF data sets to begin our research programming project. We:

• Specified our continuous enrollment criteria and wrote code to identify beneficiaries who were continuously enrolled in Medicare FFS throughout calendar year 2010. We used this information to delimit our 2010 MBSF data.

• Created and retained data elements for displaying results by Medicare coverage characteristics, sex, race, age groupings, and state and county name.

• Learned about commonly used enrollment data elements like the date of death, Parts A and B coverage, HMO enrollment, and beneficiary date of birth.

• Wrote an algorithm to calculate beneficiary age and create beneficiary age categories.

• Learned about SSA state and county codes, and created an algorithm to merge state and county name information onto our enrollment data.

• Created a final analytic file of enrollment information that we will use for the remainder of our programming for our sample research project.

Exercises

1. Alter the continuous enrollment specifications and code to flag beneficiaries who were continuously enrolled for any six consecutive months in 2010.

2. Using the 2010 MBSF data, write code to retain beneficiaries who reside in the following states: Pennsylvania, District of Columbia, Maryland, Michigan, Ohio, and Virginia. What variable describes the beneficiary’s state of residence? Be sure to include a frequency to present your results.

3. In this chapter, we wrote our code in piecemeal fashion for educational purposes. We will use the same steps and illustrative coding process in later chapters. For your own projects, you may want to combine some of the code into fewer steps for efficiency reasons. Can you rewrite the code in this chapter so it is more efficient? How would you measure efficiency?


1 In reality, we would have requested our data from CMS with the restriction of asking CMS’s data distribution contractor to provide only claims for those beneficiaries who were enrolled in Medicare FFS throughout all twelve months of 2010. However, for instructional purposes, we will assume that we must perform this restriction ourselves.

2 Often, researchers also exclude Medicare Secondary Payer and ESRD beneficiaries.

3 Data dictionaries are available on the Chronic Conditions Data Warehouse website at https://www.ccwdata.org/web/guest/data-dictionaries.

4 If a beneficiary’s day of death cannot be confirmed, then it is assigned as the last day of the month of death. For more information, see ResDAC’s article on this issue available on ResDAC’s website at http://www.resdac.org/resconnect/articles/117.

5 We assume that there are no missing values of the STUDY_AGE variable. If missing values did exist, they would end up in the “less than 65” category, because SAS treats a missing numeric value as smaller than any number.

6 For more information on HRRs and HSAs, see The Dartmouth Atlas of Healthcare website at http://www.dartmouthatlas.org/data/region/.

7 It is important to note the existence of multiple state and county coding systems. While the MBSF data use the SSA state and county coding system, other sources may use the Federal Information Processing Standard (FIPS) coding system.

8 The SSA code file used in this book (called MSABEA.TXT) is available on CMS’s MSABEA file webpage available at http://www.cms.gov/Medicare/Medicare-Fee-for-Service-Payment/AcuteInpatientPPS/Acute-Inpatient-Files-for-Download-Items/CMS022639.html.

9 Note that we perform a nodupkey sort to remove a record that is duplicative for our purposes.
