This repository parses & archives COVID-19 test positivity percentages for Indian districts and states. The test positivity is the fraction of COVID-19 tests that are positive.
From 10th May 2021 onwards, the Indian Ministry of Health has been uploading a daily Excel file that reports the 6-day average COVID-19 test positivity for Indian districts with a test positivity ≥ 10%. From 26th May 2021 onwards, this data set has been expanded to include data from all districts.
This repository automatically fetches the daily government data update, parses the data and appends it to a CSV file, and archives the government data file.
We also calculate the test positivity for Indian states & union territories using data collected by covid19india.org as follows: test positivity (7-day average) = new confirmed cases in the past week / tests conducted in the past week.
Note that there may be some discrepancies between states depending on whether they report the number of samples tested or the number of people tested.
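The state-level formula above can be sketched as a rolling sum over daily counts. This is a minimal illustration, not the repository's actual code: the function name `seven_day_positivity` and its list-based inputs are hypothetical, and it returns a fraction (multiply by 100 for a percentage).

```python
from typing import List, Optional


def seven_day_positivity(
    daily_cases: List[int], daily_tests: List[int], window: int = 7
) -> List[Optional[float]]:
    """Rolling test positivity: cases in the window / tests in the window.

    Hypothetical helper for illustration. Both inputs hold one value per
    day, oldest first. Days without a full window, or with zero tests in
    the window, yield None.
    """
    if len(daily_cases) != len(daily_tests):
        raise ValueError("daily_cases and daily_tests must have the same length")

    result: List[Optional[float]] = []
    for i in range(len(daily_cases)):
        if i + 1 < window:
            # Not enough history yet for a full window.
            result.append(None)
            continue
        # The window includes the current day, matching the state-level data.
        cases = sum(daily_cases[i + 1 - window : i + 1])
        tests = sum(daily_tests[i + 1 - window : i + 1])
        result.append(cases / tests if tests > 0 else None)
    return result
```

For example, seven days of 10 new cases and 100 tests each give a 7-day positivity of 70 / 700 on the seventh day.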
- The archive folder contains an archive of the daily government Excel files with test positivity data for districts
- districtdata.csv contains time-series test positivity data for districts
- statedata.csv contains time-series test positivity data, weekly confirmed cases, and weekly tests for states & union territories
The date is in ISO format, i.e. YYYY-MM-DD.
District-level data is for the 6-day period preceding the date on which the data is reported, while state-level data is for the 7-day period including the date on which the data is reported.
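To make the two reporting windows concrete, the sketch below converts an ISO report date into the first and last day each value covers. The function names are hypothetical, and it assumes that "preceding" means the district window ends the day before the report date.

```python
from datetime import date, timedelta
from typing import Tuple


def district_window(report_date: str) -> Tuple[date, date]:
    """First and last day of the 6-day period before the report date.

    Assumption: the report date itself is excluded from the window.
    """
    d = date.fromisoformat(report_date)
    return d - timedelta(days=6), d - timedelta(days=1)


def state_window(report_date: str) -> Tuple[date, date]:
    """First and last day of the 7-day period ending on the report date."""
    d = date.fromisoformat(report_date)
    return d - timedelta(days=6), d
```

Under these assumptions, both windows for a given report date start on the same day; the state window simply extends one day further to include the report date.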
- District Data: Indian Ministry of Health
- State Data: covid19india.org
The following steps were taken to clean the daily data in districtdata.csv:
- 2021-05-13: Renamed misnamed state 'A' to 'Tamil Nadu'
Prior to May 26th, the Ministry of Health only included data from districts with test positivity ≥ 10%, and did not include district-level data for Telangana.
- May 26th: source data format changed (includes data from almost all districts)
- May 27th: source data format changed (added columns for % of testing by RT-PCR and Rapid Antigen)
- May 29th: source data format changed (added rows of summary text on top)