Disparities

This repository contains auxiliary data and code relevant to the modeling effort for the COVID-19 Scenario Modeling Hub Research Disparities rounds.

For any questions or problems, please feel free to open an issue on GitHub.

A high proportion of COVID-19 cases are reported without demographic information such as race/ethnicity. Populations with reduced access to quality healthcare and testing resources are more likely to experience COVID-19 morbidity and mortality, so it is not appropriate to omit missing case data or to distribute it randomly. We adapt methods described in Trangucci et al. 2022 to infer the distribution of missing cases by race/ethnicity. Here, we consider only the cases that were reported without race/ethnicity; we do not consider infections that were never reported.

Source: Trangucci, Rob, Yang Chen, and Jon Zelner. "Modeling rates of disease with missing categorical data." arXiv preprint arXiv:2206.08161 (2022).

We produced synthetic daily contact matrices by race/ethnicity for the household, school, community, and workplace settings using the methodology described in Mistry et al. 2021 and Aleta et al. 2022.
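A common downstream step with setting-specific contact matrices is to combine them into a single overall matrix as a weighted sum over settings. The sketch below illustrates that idea only; the function name `combine_settings`, the two-group matrices, and the weights are all hypothetical and are not taken from this repository or from the cited papers.

```python
def combine_settings(matrices, weights):
    """Return the weighted element-wise sum of square contact matrices.

    matrices: dict mapping setting name -> square list-of-lists matrix
    weights:  dict mapping setting name -> non-negative weight
    """
    settings = list(matrices)
    if not settings:
        raise ValueError("at least one setting matrix is required")
    n = len(matrices[settings[0]])
    overall = [[0.0] * n for _ in range(n)]
    for setting in settings:
        matrix = matrices[setting]
        # Every matrix must be n x n so that rows/columns refer to the same groups.
        if len(matrix) != n or any(len(row) != n for row in matrix):
            raise ValueError(f"matrix for {setting!r} is not {n}x{n}")
        weight = weights[setting]
        for i in range(n):
            for j in range(n):
                overall[i][j] += weight * matrix[i][j]
    return overall


# Hypothetical two-group example (values are illustrative, not real data).
example_matrices = {
    "household": [[1.0, 0.0], [0.0, 1.0]],
    "workplace": [[0.0, 1.0], [1.0, 0.0]],
}
example_weights = {"household": 2.0, "workplace": 1.0}
overall_matrix = combine_settings(example_matrices, example_weights)
```

Whether and how to weight the settings is a modeling decision; consult the associated README.md and the cited references before reusing this with the real matrices.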

For more information, please consult the associated README.md.

The folder contains hospitalizations by race/ethnicity, expressed as a rate per 100,000 people for California and as a number of hospitalizations for North Carolina.
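Because the two states use different units, comparing them requires converting between counts and rates per 100,000 with a matching population denominator. The following is a minimal sketch of that arithmetic; the function names and the example numbers are hypothetical and do not come from the repository's files.

```python
def rate_per_100k(count, population):
    """Convert a raw count into a rate per 100,000 people."""
    if population <= 0:
        raise ValueError("population must be positive")
    return count * 100_000 / population


def count_from_rate(rate, population):
    """Convert a rate per 100,000 people back into an approximate count."""
    if population <= 0:
        raise ValueError("population must be positive")
    return rate * population / 100_000


# Hypothetical values for illustration only.
example_rate = rate_per_100k(count=250, population=1_000_000)
example_count = count_from_rate(rate=example_rate, population=1_000_000)
```

The population used as the denominator should refer to the same race/ethnicity group and location as the hospitalization count.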

The data have been extracted from different sources for each location:

The weekly mobility data at the census tract level are extracted from: Kang, Y., Gao, S., Liang, Y., Li, M., Rao, J., and Kruse, J. Multiscale dynamic human mobility flow dataset in the U.S. during the COVID-19 epidemic. Scientific Data 7, 390 (2020). https://www.nature.com/articles/s41597-020-00734-5

The folder contains:

The folder contains state population structure by age and race/ethnicity.

The serology data were extracted from the CDC COVID Data Tracker, 2020-2021 Nationwide COVID-19 Infection- and Vaccination-Induced Antibody Seroprevalence (Blood donations).

The nationwide blood donor seroprevalence survey estimates the percentage of the U.S. population ages 16 and older that has developed antibodies against SARS-CoV-2. The dataset includes combined infection- and vaccination-induced seroprevalence as well as infection-induced seroprevalence, for three regions in California and one region in North Carolina, by major racial/ethnic groups.

Blood donor data represent a biased sample, so differences between racial/ethnic groups should be interpreted conservatively. Several other serological studies were conducted during the study period:

California: Cross-sectional serological studies conducted within a hospital network from February 4-17, 2021, indicate that the risk of infection for Hispanic/Latino individuals aged 18-64 is ~5x that of White individuals in the same age range. Another serological study found that, from August to December 2020, incidence was 7.5x higher for Hispanic/Latino populations and 2.4x higher for Black populations than for White populations. Notably, the latter study adjusted its sampling techniques to attempt to reach comparable coverage by race/ethnicity.

North Carolina: Serological samples collected from a network of hospitals from October 25 to December 26, 2020, indicated that seroprevalence was 1.8x higher among Black individuals and 3.9x higher among Hispanic/Latino individuals than among White individuals.
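Figures such as "1.8x higher" are ratios of each group's seroprevalence to that of a reference group. The sketch below shows the arithmetic only; the function name `prevalence_ratios` and the counts are made up for illustration and are not data from any of the studies mentioned above.

```python
def prevalence_ratios(positives, tested, reference):
    """Return each group's seroprevalence divided by the reference group's.

    positives: dict mapping group -> number of seropositive samples
    tested:    dict mapping group -> number of samples tested
    """
    prevalence = {}
    for group, n_tested in tested.items():
        if n_tested <= 0:
            raise ValueError(f"no samples tested for {group!r}")
        prevalence[group] = positives[group] / n_tested
    reference_prevalence = prevalence[reference]
    if reference_prevalence == 0:
        raise ValueError("reference group has zero seroprevalence")
    return {group: p / reference_prevalence for group, p in prevalence.items()}


# Made-up counts, for illustration only.
example_positives = {"white": 40, "black": 60}
example_tested = {"white": 800, "black": 600}
example_ratios = prevalence_ratios(example_positives, example_tested, reference="white")
```

A crude ratio like this does not adjust for age, sampling coverage, or other differences between groups, which is one reason the estimates above should be interpreted conservatively.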

An additional serology_data_complete.csv file is available. It originates from the same source data but has been adapted to align with the race/ethnicity classifications used in the Disparities Round. The file also contains more complete estimates for small populations, with upper and lower bounds.

This folder contains weekly vaccination data by key demographics for California and North Carolina. We provide the number of individuals who received at least one dose ("partial_vax") and the number who are fully vaccinated ("full_vax"), by age ('demographic_category' = 'age') and by race/ethnicity ('demographic_category' = 'race_ethnicity'). Age is broken down into '0-17', '18-49', '50-64', '65+', and 'unknown'; race/ethnicity is broken down into 'asian', 'white', 'black', 'latino', 'other', and 'unknown', as denoted in 'demographic_value'.
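As a minimal sketch of how this long-format layout can be read, the example below filters rows by 'demographic_category' and sums a vaccination column by 'demographic_value'. It assumes that "partial_vax" and "full_vax" are column names; the `date` column, the sample rows, and the helper `totals_by_value` are hypothetical, and the real files may be laid out differently.

```python
import csv
import io

# Hypothetical sample rows; values are illustrative and not real data.
SAMPLE_CSV = """date,demographic_category,demographic_value,partial_vax,full_vax
2021-03-01,age,18-49,1200,400
2021-03-01,age,65+,3000,2100
2021-03-01,race_ethnicity,latino,900,300
2021-03-01,race_ethnicity,unknown,150,50
"""


def totals_by_value(csv_text, category, column):
    """Sum `column` per 'demographic_value' for rows matching `category`."""
    totals = {}
    for row in csv.DictReader(io.StringIO(csv_text)):
        if row["demographic_category"] != category:
            continue
        key = row["demographic_value"]
        totals[key] = totals.get(key, 0) + int(row[column])
    return totals


full_vax_by_race = totals_by_value(SAMPLE_CSV, "race_ethnicity", "full_vax")
partial_vax_by_age = totals_by_value(SAMPLE_CSV, "age", "partial_vax")
```

Keeping the 'unknown' group in the output makes it explicit how many vaccinations lack demographic information rather than silently dropping them.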

Source data:

For more detailed information on vaccine efficacy and vaccine rollout schedule assumptions, please consult the Scenario Description associated with the disparities SMH round, available on the COVID-19 Scenario Modeling Hub - Research GitHub repository.