BFD RIF Export
The BFD RIF exporter produces files that conform to the following specifications:
- RIF Layout and FHIR Mapping defines each file type and the fields contained within it
- CODEBOOK: Medicare Beneficiary Summary File (MBSF) Base with Medicare Part A, B, C, and D defines each of the data dictionaries and the included code values used in the beneficiary file
- CODEBOOK: Medicare Fee For Service (FFS) Claims defines each of the data dictionaries and the included code values used in the claim files
- CODEBOOK: Medicare Part D Event (PDE)/Drug Characteristics defines each of the data dictionaries and the included code values used in the Part D claim file
The exporter is configured via a set of properties as shown below with their default values:
- exporter.bfd.bene_id_start = -1000000 defines the start value of BENE_ID; the first exported patient will get the specified value, and subsequent ids are monotonically decremented from that value.
- exporter.bfd.clm_id_start = -100000000 defines the start value of CLM_ID; the first exported claim will get the specified value, and subsequent ids are monotonically decremented from that value.
- exporter.bfd.clm_grp_id_start = -100000000 defines the start value of CLM_GRP_ID; the first exported group will get the specified value, and subsequent ids are monotonically decremented from that value.
- exporter.bfd.pde_id_start = -100000000 defines the start value of PDE_ID; the first exported PDE claim will get the specified value, and subsequent ids are monotonically decremented from that value.
- exporter.bfd.mbi_start = 1S00-E00-AA00 defines the start value of MBI_NUM; the first exported patient will use that value, and subsequent ids will monotonically increase from that value.
- exporter.bfd.hicn_start = T01000000A defines the start value of BENE_CRNT_HIC_NUM; the first exported record will use that value, and subsequent ids will monotonically increase from that value.
- exporter.bfd.partc_contract_start = Y0001 defines the start value of Part C contract IDs that will be used in PTC_CNTRCT_JAN_ID to PTC_CNTRCT_DEC_ID; the first contract will use that id, and subsequent ids will monotonically increase from that value.
- exporter.bfd.partc_contract_count = 10 defines the number of Part C contracts that Synthea will use in exports; each year, each patient will be randomly assigned to one of the contracts (or no contract).
- exporter.bfd.partd_contract_start = Z0001 defines the start value of Part D contract IDs that will be used in PLAN_CNTRCT_REC_ID; the first contract will use that id, and subsequent ids will monotonically increase from that value.
- exporter.bfd.partd_contract_count = 10 defines the number of Part D contracts that Synthea will use in exports; each year, each patient will be randomly assigned to one of the contracts (or no contract).
- exporter.bfd.plan_benefit_package_start = 800 defines the starting value of plan benefit package identifiers.
- exporter.bfd.plan_benefit_package_count = 5 defines the number of plan benefit package identifiers; each Part C and Part D plan will share the same set of plan benefit package identifiers.
- exporter.bfd.clia_labs_start = 00A0000000 defines the start number of CLIA lab numbers that will be used to populate CARR_LINE_CLIA_LAB_NUM.
- exporter.bfd.clia_labs_count = 10 defines the number of CLIA lab numbers that will be used.
- exporter.bfd.cutoff_date = 20140529 defines the earliest date for any exported claims.
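To make the "monotonically decremented" behaviour of the numeric id properties concrete, here is a minimal sketch. It is not Synthea's implementation: next_ids is a hypothetical helper, and the step size of exactly 1 per record is an assumption rather than something stated above.

```shell
#!/bin/bash
# next_ids START COUNT: print COUNT ids, beginning at START and decreasing.
# Assumption: each new record moves the id by exactly 1.
next_ids() {
  local start=$1 count=$2 i
  for (( i = 0; i < count; i++ )); do
    echo $(( start - i ))
  done
}

# First three ids for the default exporter.bfd.bene_id_start value.
next_ids -1000000 3
# prints -1000000, -1000001 and -1000002, one per line
```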
At the end of a Synthea run, the exporter creates an end_state.properties file that captures the final value of each configuration option above that requires a monotonically increasing or decreasing value per beneficiary or claim. The values in this file override the configured values, so subsequent runs of Synthea can start where the prior run ended. An example file is shown below.
exporter.bfd.hicn_start=T01000020A
exporter.bfd.mbi_start=1S00E00AA20
exporter.bfd.clm_grp_id_start=-100003266
exporter.bfd.pde_id_start=-100000996
exporter.bfd.fi_doc_cntl_num_start=-100000575
exporter.bfd.bene_id_start=-1000020
exporter.bfd.carr_clm_cntl_num_start=-100001695
exporter.bfd.clm_id_start=-100002270
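If you want to inspect an end-state file from a script, the following sketch reads a single value back out. get_prop is a hypothetical helper, not part of Synthea, and it assumes plain key=value lines with no whitespace around the equals sign, as in the example above. The sample file is written to a temporary location so the snippet runs on its own.

```shell
#!/bin/bash
# get_prop FILE KEY: print the value stored for KEY in a key=value file.
# Assumption: no spaces around "=" and one property per line.
get_prop() {
  local file=$1 key=$2
  # Compare the text before the first "=" with the key, then print the rest.
  awk -F '=' -v k="$key" '$1 == k { print substr($0, length(k) + 2); exit }' "$file"
}

# Temporary sample containing a few lines from the example file.
sample=$(mktemp)
cat > "$sample" <<'EOF'
exporter.bfd.hicn_start=T01000020A
exporter.bfd.bene_id_start=-1000020
exporter.bfd.clm_id_start=-100002270
EOF

bene_next=$(get_prop "$sample" exporter.bfd.bene_id_start)
hicn_next=$(get_prop "$sample" exporter.bfd.hicn_start)
echo "bene_id_start is now ${bene_next}"
# Distance from the default start value of -1000000.
echo "moved by $(( -1000000 - bene_next )) from the default"
rm -f "$sample"
```

For the sample values, the distance from the default is 20, which matches the change in the HICN and MBI suffixes in the example file.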
Synthea does not model values for all the RIF file fields. In these cases, each field is assigned a fixed value, or a value randomly taken from a set of allowed values. These values are configured using the bfd_field_values.tsv tab-separated file. Each cell within this file specifies the allowed values for a particular field (row) in a particular file (column). Where a value can be one of a set of allowed values, the cell holds a comma-separated list; where the field is always empty, the cell holds [Blank].
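The cell convention described above can be sketched as follows. This is only an illustration of the rule, not the exporter's code: pick_value is a hypothetical helper, the sample cell contents are invented, and it assumes list items are separated by bare commas with no surrounding spaces.

```shell
#!/bin/bash
# pick_value CELL: resolve one cell of the TSV to a concrete field value.
#   "[Blank]" (or an empty cell)  -> empty string
#   "A"                           -> "A" (fixed value)
#   "A,B,C"                       -> one of A, B or C, chosen at random
pick_value() {
  local cell=$1
  if [[ -z "$cell" || "$cell" == "[Blank]" ]]; then
    echo ""
    return
  fi
  local IFS=','
  local -a options
  # Split the cell on commas into an array of allowed values.
  read -r -a options <<< "$cell"
  echo "${options[RANDOM % ${#options[@]}]}"
}

pick_value "[Blank]"   # prints an empty line
pick_value "7"         # prints 7
pick_value "A,B,C"     # prints A, B or C
```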
The following shell script generates records for a set of beneficiaries in all 50 states and Washington, DC. The desired total population size is supplied as a command-line argument; the number of beneficiaries in each location is proportional to that location's share of US residents aged 62 or more (based on 2019 census data).
#!/bin/bash
if [[ $# -eq 0 ]]; then
  echo "Usage: $0 size"
  echo "where 'size' is an integer specifying the target population size"
  exit 1
fi
# Weights are based on 2019 census data:
#
# https://data.census.gov/cedsci/table?q=Total%20Population&g=0400000US01,02,04,05,06,08,09,10,11,12,13,15,16,17,18,19,20,21,22,23,24,25,26,27,28,29,30,31,32,33,34,35,36,37,38,39,40,41,42,44,45,46,47,48,49,50,51,53,54,55,56&tid=ACSDP1Y2019.DP05&hidePreview=true&moe=false
#
# Each value represents the number of state residents aged 62 or more divided by the
# total number of USA state residents aged 62 or more expressed as a percentage.
#
states=( ); weights=( )
states+=( "Alabama" ); weights+=( "1.578" )
states+=( "Alaska" ); weights+=( "0.178" )
states+=( "Arizona" ); weights+=( "2.357" )
states+=( "Arkansas" ); weights+=( "0.958" )
# states+=( "California" ); weights+=( "10.801" ) # California is handled separately at the end and is used to absorb any rounding errors
states+=( "Colorado" ); weights+=( "1.586" )
states+=( "Connecticut" ); weights+=( "1.170" )
states+=( "Delaware" ); weights+=( "0.351" )
states+=( "District of Columbia" ); weights+=( "0.161" )
states+=( "Florida" ); weights+=( "8.044" )
states+=( "Georgia" ); weights+=( "2.836" )
states+=( "Hawaii" ); weights+=( "0.492" )
states+=( "Idaho" ); weights+=( "0.536" )
states+=( "Illinois" ); weights+=( "3.796" )
states+=( "Indiana" ); weights+=( "2.016" )
states+=( "Iowa" ); weights+=( "1.016" )
states+=( "Kansas" ); weights+=( "0.891" )
states+=( "Kentucky" ); weights+=( "1.401" )
states+=( "Louisiana" ); weights+=( "1.399" )
states+=( "Maine" ); weights+=( "0.530" )
states+=( "Maryland" ); weights+=( "1.801" )
states+=( "Massachusetts" ); weights+=( "2.179" )
states+=( "Michigan" ); weights+=( "3.288" )
states+=( "Minnesota" ); weights+=( "1.712" )
states+=( "Mississippi" ); weights+=( "0.905" )
states+=( "Missouri" ); weights+=( "1.963" )
states+=( "Montana" ); weights+=( "0.382" )
states+=( "Nebraska" ); weights+=( "0.580" )
states+=( "Nevada" ); weights+=( "0.916" )
states+=( "New Hampshire" ); weights+=( "0.472" )
states+=( "New Jersey" ); weights+=( "2.753" )
states+=( "New Mexico" ); weights+=( "0.698" )
states+=( "New York" ); weights+=( "6.092" )
states+=( "North Carolina" ); weights+=( "3.210" )
states+=( "North Dakota" ); weights+=( "0.220" )
states+=( "Ohio" ); weights+=( "3.804" )
states+=( "Oklahoma" ); weights+=( "1.175" )
states+=( "Oregon" ); weights+=( "1.406" )
states+=( "Pennsylvania" ); weights+=( "4.413" )
states+=( "Rhode Island" ); weights+=( "0.351" )
states+=( "South Carolina" ); weights+=( "1.713" )
states+=( "South Dakota" ); weights+=( "0.285" )
states+=( "Tennessee" ); weights+=( "2.098" )
states+=( "Texas" ); weights+=( "7.031" )
states+=( "Utah" ); weights+=( "0.686" )
states+=( "Vermont" ); weights+=( "0.234" )
states+=( "Virginia" ); weights+=( "2.523" )
states+=( "Washington" ); weights+=( "2.247" )
states+=( "West Virginia" ); weights+=( "0.679" )
states+=( "Wisconsin" ); weights+=( "1.903" )
states+=( "Wyoming" ); weights+=( "0.185" )
END_STATE_PROPS_FILE="./output/bfd/end_state.properties"
total_generated=0
for i in "${!states[@]}"
do
  state=${states[$i]}
  weight=${weights[$i]}
  # bc uses scale=0 by default, so the division truncates to an integer
  count=$(echo "${1}*${weight}/100" | bc)
  total_generated=$(echo "${total_generated}+${count}" | bc)
  if [[ "${count}" -eq 0 ]]
  then
    echo "Skipping generating ${state}, requested patients is ${count} "
    continue
  fi
  # Continue from the ids recorded by the previous run, if any
  if [[ -f "${END_STATE_PROPS_FILE}" ]]
  then
    load_props="-c ${END_STATE_PROPS_FILE}"
  else
    load_props=
  fi
  echo "Generating ${count} patients for ${state}"
  # load_props is deliberately unquoted so it expands to two arguments (or none)
  ./run_synthea -s ${i} -cs ${i} -r 20211020 ${load_props} -p ${count} --exporter.fhir.export=false --exporter.fhir.transaction_bundle=false --exporter.hospital.fhir.export=false --exporter.practitioner.fhir.export=false --exporter.bfd.export=true --exporter.years_of_history=10 --generate.only_alive_patients=true -a 70-80 "${state}"
done
# Generate remaining requested population for California to handle any rounding errors
if [[ -f "${END_STATE_PROPS_FILE}" ]]
then
  load_props="-c ${END_STATE_PROPS_FILE}"
else
  load_props=
fi
remaining=$(echo "${1}-${total_generated}" | bc)
echo "Generating ${remaining} patients for California"
total_generated=$(echo "${total_generated}+${remaining}" | bc)
./run_synthea -s 51 -cs 51 -r 20211020 ${load_props} -p ${remaining} --exporter.fhir.export=false --exporter.fhir.transaction_bundle=false --exporter.hospital.fhir.export=false --exporter.practitioner.fhir.export=false --exporter.bfd.export=true --exporter.years_of_history=10 --generate.only_alive_patients=true -a 70-80 California
echo "Finished generating ${total_generated} of ${1} requested patients"
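
The per-state arithmetic in the script above can be checked without running Synthea. The sketch below reproduces the truncating calculation for three of the weights, using bash integer arithmetic in place of bc. allocate is a hypothetical helper, and it relies on every weight having exactly three decimal places, as they do in the script.

```shell
#!/bin/bash
# allocate SIZE WEIGHT: integer share of SIZE for a location whose WEIGHT is
# a percentage written with exactly three decimal places (e.g. "1.578").
allocate() {
  local size=$1 weight=$2
  # "1.578" -> 1578 thousandths of a percent; the 10# prefix stops a
  # leading zero (as in "0178") from being read as octal.
  local thousandths=$(( 10#${weight/./} ))
  # Integer division truncates, like bc with its default scale of 0.
  echo $(( size * thousandths / 100000 ))
}

size=1000
total=0
for weight in 1.578 0.178 8.044; do   # Alabama, Alaska, Florida
  count=$(allocate "$size" "$weight")
  total=$(( total + count ))
  echo "weight ${weight}% -> ${count} patients"
done
echo "allocated ${total} so far; the script assigns whatever is left to California"
```

With a target size of 1000, these three weights yield 15, 1 and 80 patients. Because each share is rounded down, the per-state counts sum to less than the requested size, which is why the script gives the remainder to California.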