Analyzing MEPS data using R

Loading R packages
Loading MEPS data
    Using the MEPS package (all data years)
    Using the foreign package (1996-2017)
    Using the readr package (2018 and later)
    Automating file download
    Saving R data (.Rdata)
Survey package in R
R examples
    Workshop exercises
    Summary tables examples

Loading R packages

To load and analyze MEPS data in R, additional packages are needed. Packages are collections of R functions built for particular statistical, graphical, or data-management tasks. A package only needs to be installed once per R installation, typically with the install.packages function, which downloads the package from the internet and stores it on your computer. The library function, by contrast, must be run each time a new R session is started in order to load an installed package. Because R has a large number of packages, it is typical to load only the ones needed for a given analysis.

# Only need to run these once:
  install.packages("foreign")  
  install.packages("survey")
  install.packages("devtools")
  install.packages("tidyverse")
  install.packages("readr")

# Run these every time you re-start R:
  library(foreign)
  library(survey)
  library(devtools)
  library(tidyverse)
  library(readr)
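
Since packages only need to be installed once, the installation step can be limited to packages that are not yet present. The following is a minimal sketch of that idea, not part of the original instructions. The helper name missing_packages is hypothetical, and the installation itself only runs in an interactive session:

```r
# Packages used in this document
pkgs <- c("foreign", "survey", "devtools", "tidyverse", "readr")

# Hypothetical helper: return the packages that are not installed yet
missing_packages <- function(pkgs) {
  installed <- vapply(pkgs, requireNamespace, logical(1), quietly = TRUE)
  pkgs[!installed]
}

# Install only what is missing (skipped in non-interactive sessions)
to_install <- missing_packages(pkgs)
if (interactive() && length(to_install) > 0) {
  install.packages(to_install)
}
```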

Loading MEPS data

IMPORTANT! Starting in 2018, the SAS Transport formats for MEPS Public Use Files were converted from the SAS XPORT to the SAS CPORT engine (excluding the 2018 Point-in-Time file, HC-036, and HC-036BRR). These CPORT data files cannot be read directly into R at this time. The ASCII data file format (.dat) must be used instead. This requirement also applies to the 2017 Full-Year Consolidated file (HC-201).

Several methods are available for importing MEPS public use files (PUFs) into R. The easiest method is to use the read_MEPS function from the MEPS package, which was created to facilitate loading and manipulation of MEPS PUFs. Alternatively, R users can use the read.xport function from the foreign package to import SAS transport (.ssp) files from data years 1996-2017, or the read_fwf function from the readr package to import ASCII (.dat) files from data years 2018 and later.

Using the MEPS package (all data years)

The MEPS R package can be installed from GitHub using the following commands:

library(devtools)

install_github("e-mitchell/meps_r_pkg/MEPS")
library(MEPS)

The read_MEPS function can then be used to import MEPS data into R, either directly from the MEPS website, or from a local directory. This function automatically detects the best file format (.ssp or .dat) to import based on the specified data year.

In the following example, the 2017 (h197b) and 2018 (h206b) Dental visits files are automatically downloaded from the MEPS website and imported into R. Either the file name or the year and MEPS data type can be specified:

# Specifying year and MEPS data type
dn2017 <- read_MEPS(year = 2017, type = "DV")
dn2018 <- read_MEPS(year = 2018, type = "DV")

# Specifying MEPS file name
dn2017 <- read_MEPS(file = "h197b")
dn2018 <- read_MEPS(file = "h206b")

Files can also be read from a local folder using the 'dir' argument. This method is faster, since the file has already been downloaded. In the following example, the 2017 and 2018 Dental visits files have already been manually downloaded, unzipped, and stored in the local directory C:/MEPS:

dn2017 <- read_MEPS(year = 2017, type = "DV", dir = "C:/MEPS")
dn2018 <- read_MEPS(year = 2018, type = "DV", dir = "C:/MEPS")

For users who prefer not to use the MEPS R package to load MEPS public use files, care must be taken to import the file format that matches the data year, as detailed in the next sections.

Using the foreign package (1996-2017)

The preferred file format for downloading MEPS public use files from data years 1996-2017 is the SAS transport file format (.ssp). These files can be read into R using the read.xport function from the foreign package. In the following example, the transport file h197b.ssp has been downloaded from the MEPS website, unzipped, and saved in the local directory C:/MEPS.

dn2017 <- read.xport("C:/MEPS/h197b.ssp")

Using the readr package (2018 and later)

Starting in 2018, design changes in the MEPS survey instrument resulted in SAS transport files being converted from the XPORT to the CPORT format (excluding the 2018 Point-in-Time file, HC-036, and HC-036BRR). These CPORT file types are not readable by R at this time. Thus, the ASCII (.dat) files must be used instead. This requirement also applies to the 2017 Full-Year Consolidated file (HC-201).

In the following example, the 2018 Medical Conditions ASCII file (h207.dat) has been downloaded from the MEPS website, unzipped, and saved in the local directory C:/MEPS. After meps_path is set to the location of the file, the data are imported by running the R programming statements provided on the MEPS website.

# Set the location of the .dat file
meps_path <- "C:/MEPS/h207.dat"  

# Run the R programming statements
source("https://meps.ahrq.gov/mepsweb/data_stats/download_data/pufs/h207/h207ru.txt")

# View data
head(h207) 
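
As a rough illustration of how the read_fwf function from the readr package imports a fixed-width ASCII file, here is a minimal sketch on a made-up two-record file. The column positions, the values, and the EXPENSE column are invented for illustration. Real start and end positions must be taken from the documentation of the specific MEPS file:

```r
library(readr)

# Write a made-up fixed-width file: 5 characters of ID, 2 of panel, 6 of expense
toy_file <- tempfile(fileext = ".dat")
writeLines(c("1000122125.50", "1000223310.25"), toy_file)

# Hypothetical column layout (not the layout of any real MEPS file)
toy_layout <- fwf_positions(
  start     = c(1, 6, 8),
  end       = c(5, 7, 13),
  col_names = c("DUPERSID", "PANEL", "EXPENSE"))

# "cid" = character, integer, double (keeps the ID as text)
toy <- read_fwf(toy_file, col_positions = toy_layout, col_types = "cid")

# View data
head(toy)
```

Reading the identifier as character rather than numeric is a design choice in this sketch, so that the ID is stored as text rather than converted to a number.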

Automating file download

Instead of manually downloading, unzipping, and storing MEPS data files in a local directory, the files can be downloaded directly from the MEPS website using the download.file and unzip functions. The following code downloads and unzips the 2017 Dental visits SAS transport (.ssp) file, stores it in a temporary folder (alternatively, the file can be stored permanently by editing the exdir argument), and loads it into R using the read.xport function. The same download and unzip steps can be used for ASCII (.dat) files:

# Download .ssp (or .dat) file (binary mode keeps the zip archive intact on Windows)
url <- "https://meps.ahrq.gov/mepsweb/data_files/pufs/h197bssp.zip"
temp <- tempfile()
download.file(url, temp, mode = "wb")

# Unzip and save .ssp (or .dat) file to temporary folder
meps_file <- unzip(temp, exdir = tempdir())

# Alternatively, this will save a permanent copy of the file to the local folder "C:/MEPS/R-downloads"
# meps_file <- unzip(temp, exdir = "C:/MEPS/R-downloads")

# Read the .ssp file into R (for .dat files, use the 'source' code above)
dn2017 <- read.xport(meps_file)

To download additional files programmatically, replace 'h197b' with the desired filename (see meps_files_names.csv for a list of MEPS file names by data type and year).
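
The file name substitution described above could be wrapped in a small helper. This is only a sketch: meps_ssp_url is a hypothetical function name, and it assumes that every SAS transport file follows the same URL pattern as the h197b example, which may not hold for all files or data years:

```r
# Hypothetical helper: build the download URL for a SAS transport (.ssp) zip file,
# assuming the same URL pattern as the h197b example
meps_ssp_url <- function(file_name) {
  paste0("https://meps.ahrq.gov/mepsweb/data_files/pufs/", file_name, "ssp.zip")
}

meps_ssp_url("h197b")

# Usage (requires an internet connection, so left commented out):
# temp <- tempfile()
# download.file(meps_ssp_url("h197b"), temp, mode = "wb")
# dn2017 <- foreign::read.xport(unzip(temp, exdir = tempdir()))
```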

Saving R data (.Rdata)

Once the MEPS data has been loaded into R using any of the previous methods, it can be saved as a permanent R dataset (.Rdata) for faster loading. In the following code, the h197b dataset, stored in the object dn2017, is saved in the 'R/data' folder (first create the 'R/data' folder if needed):

save(dn2017, file = "C:/MEPS/R/data/h197b.Rdata")

The h197b dataset can then be loaded into subsequent R sessions using the following code, which restores it under the object name it was saved with (dn2017):

load(file = "C:/MEPS/R/data/h197b.Rdata")
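
The save and load steps can be tried out without any MEPS files. The following sketch uses a small made-up data frame (toy_dn, with invented values) and a temporary file in place of the 'R/data' folder:

```r
# Made-up stand-in for a MEPS data frame
toy_dn <- data.frame(DUPERSID = c("10001", "10002", "10003"),
                     DVXP17X  = c(120, 0, 85))

# Save to a temporary .Rdata file, then remove the object from the session
toy_path <- tempfile(fileext = ".Rdata")
save(toy_dn, file = toy_path)
rm(toy_dn)

# load() restores the object under the name it was saved with
# and invisibly returns that name
restored <- load(file = toy_path)
restored
head(toy_dn)
```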

Survey package in R

To analyze MEPS data using R, the survey package should be used to ensure that estimates and their standard errors reflect the complex sample design of MEPS. The survey package contains functions for analyzing survey data by defining a survey design object with information about the sampling procedure, then running analyses on that object. Some of the functions in the survey package that are most useful for analyzing MEPS data include:

  • svydesign: define the survey object
  • svytotal: population totals
  • svymean: proportions and means
  • svyquantile: quantiles (e.g. median)
  • svyratio: ratio statistics (e.g. percentage of total expenditures)
  • svyglm: generalized linear models (e.g. logistic regression)
  • svyby: run other survey functions by group

Before other functions in the survey package can be used, the svydesign function must be called to specify the primary sampling unit, the strata, and the sampling weights for the data frame. The survey.lonely.psu='adjust' option ensures accurate standard error estimates when analyzing subsets. Once the survey design object is defined, population estimates can be calculated using functions from the survey package. As an example, the following code will estimate total dental expenditures in 2017:

options(survey.lonely.psu='adjust')

mepsdsgn <- svydesign(
  id = ~VARPSU,
  strata = ~VARSTR,
  weights = ~PERWT17F,
  data = dn2017,
  nest = TRUE)  

svytotal(~DVXP17X, design = mepsdsgn)
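
The other functions listed above follow the same pattern. The sketch below uses a small made-up data frame so that it runs without downloading any MEPS files. The values are invented, and AGEGRP is a hypothetical grouping variable, not a MEPS variable name:

```r
library(survey)
options(survey.lonely.psu = 'adjust')

# Made-up data: 2 strata x 2 PSUs x 2 persons (not real MEPS values)
toy <- data.frame(
  VARSTR   = rep(c(1, 2), each = 4),
  VARPSU   = rep(c(1, 1, 2, 2), times = 2),
  PERWT17F = c(1000, 1500, 1200, 800, 900, 1100, 1300, 700),
  DVXP17X  = c(120, 0, 85, 300, 45, 210, 0, 150),
  AGEGRP   = rep(c("Under 18", "18 and over"), times = 4))

toy_dsgn <- svydesign(
  id = ~VARPSU,
  strata = ~VARSTR,
  weights = ~PERWT17F,
  data = toy,
  nest = TRUE)

# Mean expenditure per person
est_mean <- svymean(~DVXP17X, design = toy_dsgn)
est_mean

# Mean expenditure by group
svyby(~DVXP17X, by = ~AGEGRP, design = toy_dsgn, FUN = svymean)

# Subset the design object (rather than the data frame) to analyze a subgroup
adult_dsgn <- subset(toy_dsgn, AGEGRP == "18 and over")
est_total_adult <- svytotal(~DVXP17X, design = adult_dsgn)
est_total_adult
```

Subsetting the design object, instead of filtering the data frame before calling svydesign, keeps the full design information available when standard errors are computed for the subgroup.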

R examples

To run the example code, you must first download the relevant MEPS files in SAS transport format (.ssp) from the MEPS website and save them to your local computer, as described above. The code assumes that the .ssp files are saved in the local directory "C:/MEPS/", but the programs can be edited to point to a different directory.

Workshop exercises

The following code from previous MEPS workshops is provided in the workshop_exercises folder:

1. National health care expenses

exercise_1a.R: National health care expenses by age group, 2016
exercise_1b.R: National health care expenses by age group, 2018

2. Prescribed medicine purchases

exercise_2a.R: Purchases and expenses for narcotic analgesics or narcotic analgesic combos, 2016
exercise_2b.R: Purchases and expenses for narcotic analgesics or narcotic analgesic combos, 2018

3. Pooling data files

exercise_3a.R: Pooling MEPS FYC files, 2015 and 2016: Out-of-pocket expenditures for uninsured persons ages 26-30 with high income
exercise_3b.R: Pooling longitudinal files, panels 17-19
exercise_3c.R: Pooling MEPS FYC files, 2017 and 2018: People with joint pain, using JTPAIN31 for 2017 and JTPAIN31_M18 for 2018

4. Regression

exercise_4.R: Logistic regression to identify demographic factors associated with receiving a flu shot in 2018 (using SAQ population)

5. Plots

ggplot_example.R: Code to re-create the data and plot for Figure 1 in Statistical Brief #491

Summary tables examples

The following code, provided in the summary_tables_examples folder, re-creates selected statistics from the MEPS online summary tables:

Accessibility and quality of care

care1_child_dental.R: Children with dental care, by poverty status, 2016
care2_diabetes_a1c.R: Adults with diabetes receiving hemoglobin A1c blood test, by race/ethnicity, 2016
care3_access.R: Ability to schedule a routine appointment, by insurance coverage, 2016

Medical conditions

cond1_expenditures.R: Utilization and expenditures by medical condition, 2015

Health insurance

ins1_age.R: Health insurance coverage by age group, 2016

Prescribed drugs

pmed1_therapeutic_class.R: Purchases and expenditures by Multum therapeutic class, 2016
pmed2_prescribed_drug.R: Purchases and expenditures by generic drug name, 2016

Use, expenditures, and population

use1_race_sex.R: Utilization and expenditures by race and sex, 2016
use2_expenditures.R: Expenditures for office-based and outpatient visits, by source of payment, 2016
use3_events.R: Number of events and mean expenditure per event, for office-based and outpatient events, by source of payment, 2016