
Finding Datasets for Secondary Analysis

Medicare Data

Medicare and Medicaid are administered by the Centers for Medicare and Medicaid Services (CMS). Medicare is a federal health insurance program for people age 65 and older, people with certain disabilities, and people with end-stage renal disease. It has two principal parts (Part A and Part B).

CMS classifies its data files into three categories, depending on the level of personal information available in each.

  • Research Identifiable Files (RIFs)

These files contain identifying variables such as a beneficiary's Social Security number or a Unique Physician Identification Number.

Researchers who want to use RIFs must submit a formal request to the Research Data Assistance Center (ResDAC), including a Data Use Agreement and a study plan or protocol.

  • Beneficiary Encrypted Files (BEFs, also known as Limited Data Set Files)

Personal identifiers have been encrypted or blanked out.

Researchers must also apply to ResDAC for access to BEFs, and costs are similar to those for RIFs; however, permission to use BEFs is easier to obtain.

  • Public Use Files

These files carry the lowest level of restrictions. They are generally aggregated to levels higher than the individual (e.g., to the state level) and contain no beneficiary-level or physician-level information. They are much cheaper than BEFs and RIFs and are available on the CMS website.

  • Dashboard

The Office of Enterprise Data and Analytics at CMS developed interactive dashboards that present information on state- and county-level variation in standardized per-capita costs for the Medicare fee-for-service population.

Medicaid Data

Medicaid is a state-administered health insurance system, primarily for people who are low income and for those with disabilities, that is partially financed by the federal government. Eligibility and benefits for Medicaid differ by state. 

In Medicaid, an individual’s status may change from eligible to ineligible several times in a single year, depending on income, employment situation, and state of residence. A researcher therefore needs to be careful when using Medicaid data either to represent the health care received by an individual in a given time period or to compare utilization across states.
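One common way to handle changing eligibility is to count the months each person was eligible and, where the analysis requires it, keep only those enrolled for the full period. The Python sketch below illustrates the idea under stated assumptions: the record layout, the beneficiary IDs, and the function names are hypothetical and do not reflect the structure of any actual CMS or state Medicaid file.

```python
from collections import defaultdict

# Hypothetical monthly eligibility records: (beneficiary_id, year, month, eligible).
# The layout and values are invented for illustration only.
records = [
    ("A", 2020, m, True) for m in range(1, 13)
] + [
    # Person B loses eligibility in April, May, and September.
    ("B", 2020, m, m not in (4, 5, 9)) for m in range(1, 13)
]

def eligible_months(records, year):
    """Return the number of distinct eligible months per person in a year."""
    months = defaultdict(set)
    for person, rec_year, month, eligible in records:
        if rec_year != year:
            continue
        person_months = months[person]  # registers the person even with zero months
        if eligible:
            person_months.add(month)
    return {person: len(m) for person, m in months.items()}

def continuously_enrolled(records, year, required_months=12):
    """Return people eligible for at least `required_months` months in the year."""
    counts = eligible_months(records, year)
    return sorted(p for p, n in counts.items() if n >= required_months)

months = eligible_months(records, 2020)
full_year = continuously_enrolled(records, 2020)
print(months)
print(full_year)
```

Restricting an analysis to continuously enrolled individuals makes utilization measures comparable across people, but it also changes the study population, so the choice of `required_months` is itself an analytic decision that should be reported.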