Add name based OCD IDs to CA #320

Closed
wants to merge 2 commits into from


evannjw
Contributor

@evannjw evannjw commented Oct 13, 2022

Identify districts by name, using the name as the unique identifier in the OCD-ID.

Names should be used as the stable identifier because the mapping between district ID and district name often changes with redistricting. Each redistricting creates new IDs (e.g. ca_federal_electoral_districts-2013.csv), which makes the OCD-IDs unstable.

- If the name changes, the OCD-ID should change, even if the number stays the same.
- If the number changes but the name stays the same, the OCD-ID should stay the same (assuming the OCD-ID uses the name).
- If the lines on a map change but the name does not, then the OCD-ID should stay the same.

Aliases have been created for the numeric-ID-based OCD-IDs.
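
To illustrate the proposal, here is a minimal sketch of how a name-based identifier and its numeric alias could be derived. The `PREFIX` value, the year-suffixed numeric layout, the normalisation rules in `name_segment`, and the sample rows are all assumptions made for illustration; they are not taken from this PR's diff.

```python
import re

# Assumed identifier prefix for Canadian federal electoral districts.
PREFIX = "ocd-division/country:ca/ed:"


def name_segment(name: str) -> str:
    """Normalise a district name into an identifier segment.

    Assumed rules: lowercase, spaces become underscores, and any character
    other than letters, digits, '.', '-', '_' or '~' becomes '~'.
    """
    segment = name.strip().lower().replace(" ", "_")
    return re.sub(r"[^\w.\-~]", "~", segment)


def alias_rows(districts, year):
    """Return (numeric_id, name_based_id) pairs.

    `districts` is an iterable of (number, name) tuples. Each numeric,
    year-suffixed ID is kept as an alias pointing at the name-based ID.
    """
    return [
        (f"{PREFIX}{number}-{year}", PREFIX + name_segment(name))
        for number, name in districts
    ]


# Sample rows; treat the numbers as illustrative.
sample = [(10001, "Avalon"), (10006, "St. John's East")]
for numeric_id, name_id in alias_rows(sample, 2013):
    print(numeric_id, "->", name_id)
```

Under this scheme, a renamed district gets a new identifier even when its number is unchanged, which is the behaviour the rules above describe.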

@evannjw
Contributor Author

evannjw commented Oct 20, 2022

@jpmckinney can you take a look?

@jpmckinney
Member

This is a major change and will break existing applications.

> If the name changes, the OCD-ID should change, even if the number stays the same

No. MPs regularly author bills to change the name of their districts, even though there has been no redistricting. Redistricting occurs every 10 years at the federal level. It's a process. Names change several times within 10 years. They are the same district. The name doesn't change the identity of the district.

> If the number changes but the name stays the same, the OCD-ID should stay the same

Trying to maintain the identity of a district across redistricting is a bit of a fool's errand. Is a given district split into two (and as such the original does not survive), or is a new district split out of two existing ones (in which case the two existing districts survive)?

Different sources provide different answers to these questions. They are all "correct" insofar as they have some reasonable explanation. Rather than picking one methodology (on which not everyone will agree), this project establishes new IDs at each redistricting, since neither the number nor the name consistently survives a redistricting.

The IDs are stable for the period during which they are relevant.
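
A minimal sketch of this policy, assuming an identifier made of a district number plus the year of the redistricting: the name is an attribute rather than part of the ID, so a rename leaves the ID untouched, while a redistricting produces a new ID. The `District` class, the ID layout, and the sample values are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class District:
    number: int  # code assigned for one redistricting only
    year: int    # year of the redistricting that created the district
    name: str    # display name; may change between redistrictings

    @property
    def ocd_id(self) -> str:
        # Assumed layout; the name is deliberately not part of the ID.
        return f"ocd-division/country:ca/ed:{self.number}-{self.year}"


# Hypothetical district, renamed by a bill without any redistricting.
district = District(number=99999, year=2013, name="Old Name")
id_before_rename = district.ocd_id
district.name = "New Name"
assert district.ocd_id == id_before_rename  # same district, same ID

# A later redistricting gets a fresh ID, even if number and name carry over.
successor = District(number=99999, year=2023, name="New Name")
assert successor.ocd_id != id_before_rename
```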

Sorry, this PR doesn't account for the real evolution of names and numbers in Canada.

@jpmckinney jpmckinney closed this Oct 24, 2022
@jdmgoogle
Contributor

> This is a major change and will break existing applications.

We understand that this is a major change and would like to make it in as backwards-compatible a way as possible.

> No. MPs regularly author bills to change the name of their districts, even though there has been no redistricting.

Do you have a link to such bills, happening outside of the usual redistricting cycle? All I can find is this one example from the 2013/2014 redistricting cycle.

@jdmgoogle jdmgoogle reopened this Oct 24, 2022
@jdmgoogle
Contributor

Also it appears that names are the common source of consistent identity on places like Wikipedia. E.g.,

The division numbers most certainly are not consistent and appear to be closer to the equivalent of "if I sorted the ridings by name and took the row number of the spreadsheet" rather than any actual consistent source of identification.

More fundamentally, they provide a false sense of consistent identity and make it more likely that errors will get introduced over time during redistricting.

@jpmckinney
Member

jpmckinney commented Oct 24, 2022

Sure, there's https://openparliament.ca/bills/42-1/C-402/ and in general you can search for "An Act to change the name of the electoral district of" or "An Act to change the name of certain electoral districts". MPs love to do it.

Wikipedia renames pages when names change. That's not consistency. And we don't have that luxury in this project. With respect to the methodological issues I mentioned, Wikipedia has no formal way to decide whether a district is new, survives a redistricting, or doesn't survive a redistricting. The identity issue is basically up to the public contributors of Wikipedia as long as they can provide one source (which is not necessarily the unanimous opinion on the division's "identity").

The division numbers at the federal level are indeed only useful for one redistricting, as far as I remember.

If you want districts to survive redistricting, you'll need to do more research into the concordances of federal ridings, which is not a solved problem, but it would be the direction to take to prepare a PR that captures reality more closely.

@jpmckinney
Member

jpmckinney commented Oct 24, 2022

FWIW, the numbers are used by Statistics Canada, etc. They may very well have been created by sorting a spreadsheet, but they are used consistently in a variety of geographical products...

Statistics Canada assigns numbers because political division identity at the federal level in Canada is a real can of worms. It's not as easy as the numbered congressional districts in the US, for comparison.

@jpmckinney
Member

I still think this PR should be closed so that this discussion can occur in an issue without a bias toward a specific solution.

@evannjw
Contributor Author

evannjw commented Nov 1, 2022

Closing PR in favor of moving discussion to #323. With regard to the frequency of name changes, those changes would be present here, correct? I'm not sure about the 2018 bill you linked above, but it looked like those name changes were already reflected in the elections.ca site and in the repo. It doesn't seem to me that it will require much additional effort to maintain changes to district names (the elections.ca site shows there was just 1 update to district names since the last redistricting cycle). I agree that trying to determine whether a district survives redistricting is a tricky problem, but I think that if we are using names as identifiers and a new district created after redistricting doesn't share a name with any previous district, it makes sense to create a new ID for it rather than trying to map it to a previous district.

@evannjw evannjw closed this Nov 1, 2022
@jpmckinney jpmckinney mentioned this pull request Sep 20, 2023