Tasks: Data maintenance

James McKinney edited this page Feb 6, 2017 · 31 revisions

Tasks

Refer to the Troubleshooting page if you encounter errors, and to the Expected output section below if a maintenance task returns output.

Regularly

Quarterly

Run the maintenance tasks of:

In scrapers_ca_app:

  • Run ./manage.py mappings and ./manage.py checkmappings

Check this master spreadsheet for manual requests. If a cell in the Next boundary column is highlighted in yellow, its shapefile is out of date and must be requested. If the cell is highlighted in beige, its shapefile is not online, so it is unknown whether it is out of date.

Annually

  • Purchase postal code centroids and run loadpostcodes
  • Check for updates to postal code concordances and run loadpostcodeconcordance

Expected output

Some of the maintenance tasks return output that is expected and requires no action.

represent-canada-data

invoke definitions

Richmond, NS districts                                       Expected slug to be Richmond districts not Richmond, NS districts
Paroisse de Plessisville districts                           Expected slug to be Plessisville districts not Paroisse de Plessisville districts

We keep the longer slugs because we need to disambiguate Richmond, NS from Richmond, QC and Paroisse de Plessisville, QC from Ville de Plessisville, QC.

ocd-division/country:ca/csd:2425213/borough:1                unknown name: check slug and domain manually
ocd-division/country:ca/csd:2425213/borough:2                unknown name: check slug and domain manually
ocd-division/country:ca/csd:2425213/borough:3                unknown name: check slug and domain manually

The identifiers don't yet exist in ocd-division-ids.

Hamilton wards                                               Non-unique values

Hamilton's licence_url is the same as its source_url.
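A check of this kind can be sketched as follows. This is a hypothetical illustration, not the task's actual implementation: the function name `non_unique_values` and the sample definition are assumptions, and only the `licence_url` and `source_url` keys come from the explanation above.

```python
from collections import defaultdict


def non_unique_values(definition):
    """Return groups of keys in a definition that share the same value.

    Hypothetical helper: the real task's logic may differ.
    """
    keys_by_value = defaultdict(list)
    for key, value in definition.items():
        keys_by_value[value].append(key)
    # Keep only the values that appear under more than one key.
    return sorted(sorted(keys) for keys in keys_by_value.values() if len(keys) > 1)


# Made-up definition in which two URL fields collide, as they do for Hamilton.
example = {
    "name": "Hamilton wards",
    "source_url": "https://example.org/wards",
    "licence_url": "https://example.org/wards",
}
print(non_unique_values(example))
```

A non-empty result flags the colliding keys. For Hamilton the collision is legitimate, so no action is needed.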

Newfoundland and Labrador electoral districts (2006)         Expected LICENSE.txt to match "all rights reserved" template
Newfoundland and Labrador electoral districts                Expected LICENSE.txt to match "all rights reserved" template
Quebec electoral districts                                   Expected authority to be Her Majesty the Queen in Right of Quebec not Directeur général des élections du Québec

Newfoundland and Labrador and Quebec have non-standard licensing arrangements.

Bonnyville No. 87 wards                                      Expected slug to be Bonnyville No. 87 divisions not Bonnyville No. 87 wards

Bonnyville No. 87 has non-standard boundary names.

invoke urls

302 http://chathamkentopendata.chatham-kent.opendata.arcgis.com/datasets/1f4428dd5d764320b4246d190cfb70cb_0
404 https://www.geosask.ca/
404 https://www.geosask.ca/Portal/jsp/terms_popup.jsp

GeoSask and Chatham-Kent's open data catalog no longer exist. Saskatchewan electoral districts are now available from Elections Saskatchewan but without a licence, so we continue to attribute GeoSask.
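If you want to spot new failures quickly, one option is to filter the known results out of the task's output. The sketch below assumes each output line has the form `<status> <url>`, as in the lines above. The names `EXPECTED`, `parse_line` and `unexpected_results` are hypothetical and are not part of the repository.

```python
# Known results that require no action, copied from the expected output above.
EXPECTED = {
    (302, "http://chathamkentopendata.chatham-kent.opendata.arcgis.com/datasets/1f4428dd5d764320b4246d190cfb70cb_0"),
    (404, "https://www.geosask.ca/"),
    (404, "https://www.geosask.ca/Portal/jsp/terms_popup.jsp"),
}


def parse_line(line):
    """Split a '<status> <url>' line into an (int, str) pair."""
    status, url = line.split(None, 1)
    return int(status), url.strip()


def unexpected_results(output):
    """Return the (status, url) pairs that are not in EXPECTED."""
    results = []
    for line in output.splitlines():
        if not line.strip():
            continue  # Skip blank lines.
        result = parse_line(line)
        if result not in EXPECTED:
            results.append(result)
    return results


# Made-up output: one known result and one new failure.
sample = "404 https://www.geosask.ca/\n500 https://example.org/data"
print(unexpected_results(sample))
```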

scrapers-ca

invoke tidy

County of Grande Prairie No. 1 Council                       Expected Grande Prairie County No. 1 Municipal district Council
Strathcona County Council                                    Expected Strathcona County Municipal Council
Langley Township Council                                     Expected Langley District Council
Haldimand County Council                                     Expected Haldimand County City Council
Markham City Council                                         Expected Markham Town Council

The task uses a jurisdiction's Census data to guess its legislature's proper name. However, Census data may not correspond with how the jurisdiction refers to its legislature.
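The guess can be pictured with a minimal sketch. The pattern of place name, then Census type, then "Council" is inferred from the output above, not taken from the task's source. The function names and the `census_type` argument are assumptions for illustration.

```python
def expected_legislature_name(place_name, census_type):
    """Guess a legislature name as '<place> <Census type> Council'.

    Hypothetical reconstruction based on the task's output.
    """
    return f"{place_name} {census_type} Council"


def check_legislature_name(actual, place_name, census_type):
    """Return a message if the actual name differs from the guess, else None."""
    expected = expected_legislature_name(place_name, census_type)
    if actual != expected:
        return f"{actual}: Expected {expected}"
    return None


# Markham calls itself a city, but the sketch assumes the Census type is "Town".
print(check_legislature_name("Markham City Council", "Markham", "Town"))
```

A mismatch like this one reflects a difference between the Census type and local usage, not an error in the scraper.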

Conseil municipal de Dollard-Des-Ormeaux                     Expected Conseil municipal de Dollard-Des Ormeaux

The task uses the English name of a jurisdiction. For Quebec jurisdictions, we use the French name.

ca_on_waterloo_region                                        Expected ca_on_waterloo

We use a different module name for the Region of Waterloo to avoid a collision with the City of Waterloo.

ca_mb                                                        Check: http://www.gov.mb.ca/legislature/
ca_nb                                                        Check: http://www.gnb.ca/legis/index.asp

If the jurisdiction's url contains a path, the task reminds you to double-check that the path is correct.
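The condition that triggers the reminder can be sketched with the standard library. This is an illustration only: the function name `needs_path_check` is hypothetical, and the task may decide differently.

```python
from urllib.parse import urlparse


def needs_path_check(url):
    """Return True if the URL has a path beyond the bare domain root."""
    path = urlparse(url).path
    return path not in ("", "/")


for url in ("http://www.gov.mb.ca/legislature/", "http://www.gnb.ca/legis/index.asp"):
    if needs_path_check(url):
        print(f"Check: {url}")
```

Both URLs from the output above have a path, so both produce a reminder.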

invoke sources_and_assertions

Expected 2 sources after 4 requests ca/people.py
Expected 2 sources after 3 requests ca_ab/people.py
Expected 2 sources after 3 requests ca_sk/people.py
  • ca: We don't add a source URL for the Twitter usernames. A request validates the photo.
  • ca_ab: A request retrieves the temporary CSV file.
  • ca_sk: A request caches the photo.