migrate country:uk ids to country:gb v2 #188

Open
wants to merge 18 commits into
base: master
from

Conversation

chris48s
Contributor

Alternate solution for #184 incorporating a clause for constituent nation (England/Wales/Scotland/Northern Ireland) as proposed in #184 (comment)

This branch has the same commits as #186 with 6f59a09, 4c369fd and 09a5a9a added on top, so it's easier to compare the two. If you want to merge this version instead of #186, let me know and I'll rebase to tidy the history up.

$ ./scripts/country-gb/build.py
$ ./scripts/compile.py gb
processing identifiers/country-gb/uk_parliament_consitituencies.csv
processing identifiers/country-gb/combined_authorities.csv
processing identifiers/country-gb/police_force_areas.csv
processing identifiers/country-gb/naw_consitituencies.csv
processing identifiers/country-gb/scottish_parliament_regions.csv
processing identifiers/country-gb/naw_regions.csv
processing identifiers/country-gb/country-uk.csv
processing identifiers/country-gb/scottish_parliament_consitituencies.csv
types
   pcon                             650
   spc                               73
   pfa                               40
   nawc                              40
   cauth                              9
   spr                                8
   nawr                               5
   country                            1
fields
   id                          826       100%
   name                        826       100%
   gss_code                    826       100%
Contributor

@jloutsenhizer jloutsenhizer left a comment


Overall this looks good to me. I spot-checked some random UK parliament constituencies to make sure they show up in the new file with the expected part.

My only comment is to restore the parts.csv file so there can be OCD IDs defined for those.

identifiers/country-gb/parts.csv
Contributor

@jloutsenhizer jloutsenhizer left a comment


@jpmckinney
Member

@jdmgoogle Since you commented on #184

@jdmgoogle
Contributor

It generally LGTM. The only part I'd like to get more clarity on is why Parliamentary constituencies hang off Wales, Scotland, etc, when it sounds like the level at which they're created and administered is at the UK level. I'll defer to an in-country expert, but would like to at least see the argument as to why they're this way when

Fundamentally though, the thing that defines the boundaries is a piece of legislation passed through the UK-wide Parliament.

and

All UK elections are regulated by The Electoral Commission [...] which is a UK-wide body overseeing electoral administration and political finance.

For example, if a US state had a law saying that state legislative districts couldn't cross county boundaries, would we say that the sldu and sldl identifiers had to hang off the counties and not the state itself, even though the boundaries are set and administered by the state?

@sguenther85
Contributor

@sguenther85 @jpmckinney
Sorry, I have no rights to approve. Another person has to do this. ;)

@chris48s
Contributor Author

I'd like to get more clarity on is why Parliamentary constituencies hang off Wales, Scotland, etc
...
but would like to at least see the argument as to why they're this way

From my perspective (as the PR author), I'd like to get the existing IDs migrated from the country:uk namespace to country:gb. I submitted this PR as an alternative to #186 (like-for-like migration) in response to #184 (comment) and this seems to be the approach that has traction. I'd probably leave out the constituent nation clause, but it also probably doesn't introduce problems to populate it, so if that's the tradeoff that gets us over the line, cool.

Another (probably more useful) way you could think about introducing hierarchy would be:

Instead of thinking about the constituent countries (or "parts" as they've been labelled) as geographic containers for divisions (which is what we're doing with the approach in this PR) we could get rid of that concept and think about divisions as children of legislatures they elect to.

i.e: you say:

  • Parliamentary Constituencies are all just children of the UK Parliament (regardless of which nation they are in geographically)
  • Scottish Parliament Regions and Scottish Parliament Constituencies are children of the Scottish Parliament (as opposed to the country of Scotland, as such)
  • Welsh Assembly Regions and Welsh Assembly Constituencies are children of the Welsh Assembly (as opposed to the country of Wales)

..and so on

Maybe that way of introducing hierarchy is closer to the way @jdmgoogle is thinking about this but less in line with the angle @jloutsenhizer and @sguenther85 see it from? Correct me if I'm wrong..

@jloutsenhizer
Contributor

My original argument for having the UK parliament constituencies hang off each of the constituent countries was that each country has a boundary commission which defines the boundaries of these constituencies, which felt to me in line with the USA.

I think doing the hierarchy on the basis of which governmental body administers the constituencies makes sense, and can get this unblocked. This model aligns with the current US gp-units, since each US state determines its own districting.

So I'm in support of moving forward with this proposed solution by @chris48s. Does this sound good to you, @jdmgoogle?

@sguenther85
Contributor

We now have only one week left until the election. I repeat my request: please merge this after the election, not now.
I think merging now would cause more problems than it would solve for the people (and also for us) who currently work with the OCD paths.

@jdmgoogle
Contributor

Sorry for being the hold-up. SGTM. Approving and merging.

@jdmgoogle
Contributor

Actually, let me sync with JustinL this morning and we'll resolve this ASAP.

@jloutsenhizer
Contributor

Discussed offline with @jdmgoogle. We'll hold off on merging until after the UK election on December 12th.

@sguenther85
Contributor

Thanks, this is very helpful.

@jloutsenhizer
Contributor

@jdmgoogle Now that the election's over, let's revisit merging this.

@jloutsenhizer
Contributor

This totally slipped under the radar.

@chris48s @sguenther85 any concerns with merging this PR now? If not, I'll proceed with merging it.

@sguenther85
Contributor

@jloutsenhizer no concerns from my side. As I indicated with the thumbs-up on your last post: let's merge this :D


@jdimatteo jdimatteo left a comment


Please change uk to gb in identifiers/country-eu.csv (and we can deal with Brexit separately).

Otherwise LGTM -- thanks!

@chris48s
Contributor Author

Happy if you want to go ahead with it in this form, or make changes based on the comments in #188 (comment). I think the issue tailed off somewhat unresolved.

@jloutsenhizer
Contributor

Thanks for calling that out, I had misremembered and thought that the changes from that comment were already in place.

I just want to double-check my understanding of what the diff from the current PR would be:
For UK parliament constituencies, we'd be removing the "part:xxx" from the OCD IDs. For example:
ocd-division/country:gb/part:wls/pcon:cardiff_south_and_penarth -> ocd-division/country:gb/pcon:cardiff_south_and_penarth
ocd-division/country:gb/part:eng/pcon:barking -> ocd-division/country:gb/pcon:barking

Is that correct? If so, then that change sounds good to me and would be good to include in the current PR before merging.
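
For illustration, the rewrite described in this comment could be sketched as below. This is a minimal sketch, not the PR's build script: `drop_part_for_pcon` is a hypothetical helper name, and it assumes an ID is a plain `/`-separated list of `type:value` segments.

```python
def drop_part_for_pcon(ocd_id: str) -> str:
    """Remove the part:* segment from UK Parliament constituency IDs only."""
    segments = ocd_id.split("/")
    # Only IDs whose final segment is a pcon division are rewritten;
    # every other ID is returned unchanged.
    if not segments[-1].startswith("pcon:"):
        return ocd_id
    return "/".join(s for s in segments if not s.startswith("part:"))


print(drop_part_for_pcon("ocd-division/country:gb/part:wls/pcon:cardiff_south_and_penarth"))
print(drop_part_for_pcon("ocd-division/country:gb/part:eng/pcon:barking"))
```

IDs of other types pass through untouched, so the same function could be run over a whole file of mixed identifiers.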

@chris48s
Contributor Author

I was thinking we'd remove the part clause completely and introduce a legislature clause (where relevant):

  • ocd-division/country:gb/legislature:uk_parliament
    all pcon divisions become a child of this e.g:
    ocd-division/country:gb/legislature:uk_parliament/pcon:cardiff_south_and_penarth

  • ocd-division/country:gb/legislature:scottish_parliament
    all spc and spr divisions become a child of this e.g:
    ocd-division/country:gb/legislature:scottish_parliament/spc:aberdeen_central

  • ocd-division/country:gb/legislature:national_assembly_for_wales
    all nawc and nawr divisions become a child of this e.g:
    ocd-division/country:gb/legislature:national_assembly_for_wales/nawc:aberavon

With Combined Authorities and Police Force Areas just hanging off ocd-division/country:gb directly as they are each authorities in their own right.

@jloutsenhizer
Contributor

Thanks for the explanation and examples. I don't think we should be going this route of adding in legislature clauses, since OCD IDs are meant to represent political geography and ocd-division/country:gb/legislature:uk_parliament isn't a piece of political geography (can't be drawn on a map).

@chris48s
Contributor Author

I see. If the constituent nations are just bounding boxes for divisions but don't encode any notion of a division's relationship to a legislature that representatives are elected to, it is probably equally meaningful or meaningless to apply that rule consistently to all boundaries (i.e. #188 as it stands) or to none of them (i.e. #186 as it stands).

@jdmgoogle
Contributor

The general heuristic I use as to what an OCD-ID is is that it's a structured name for a shape on a map. Some of those shapes exist solely because of legislative bodies, but those legislative bodies themselves are not shapes on maps. E.g., ocd-division/country:us/state:va/cd:1 is the first congressional district in the state of Virginia in the United States, but there's no ocd-division/country:us/legislature:house/va:1 because ocd-division/country:us/legislature:house isn't itself a shape on a map.
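
To make the "structured name" idea concrete, here is a minimal sketch that splits an identifier into its ordered `type:value` pairs. `parse_ocd_id` is a hypothetical helper written for this example, not part of any OCD tooling, and it assumes the only structure is the `ocd-division` prefix followed by `/`-separated segments.

```python
def parse_ocd_id(ocd_id: str):
    """Split an OCD division ID into ordered (type, value) pairs."""
    prefix, *segments = ocd_id.split("/")
    if prefix != "ocd-division":
        raise ValueError(f"not an OCD division ID: {ocd_id!r}")
    # Each segment names one shape on the map, e.g. "state:va".
    return [tuple(segment.split(":", 1)) for segment in segments]


print(parse_ocd_id("ocd-division/country:us/state:va/cd:1"))
```

Every pair in the result names a geography (a country, a state, a district); none of them names a legislative body.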

@chris48s
Contributor Author

Makes sense.. so legislature:scottish_parliament (for example) should not be introduced in that case.

What is the heuristic for deciding when an identifier should/shouldn't also encode the concept of "this shape on a map is inside this other shape on a map"?

@jpmckinney
Member

The problem with legislature:scottish_parliament is that those words describe a political body (a legislature), not a political geography (a country), which is what OCDIDs are meant to identify.

What is the heuristic for deciding when an identifier should/shouldn't also encode the concept of "this shape on a map is inside this other shape on a map"?

There was some discussion here: #181 (comment)

OCDIDs are meant to identify geographies, not to encode their hierarchy. In general, we only include as much hierarchy as is needed to produce a stable identifier.

For identifiers that are defined by authorities, we sometimes follow the hierarchy provided by that authority, if it'll better guarantee stability. This isn't always the case.

For example, Canada Census subdivisions are within Census divisions, but we have OCDIDs like ocd-division/country:ca/cd:1234 and ocd-division/country:ca/csd:1234567, instead of like ocd-division/country:ca/cd:1234/csd:1234567. The two division types have different numbers of digits in their ID (so no possibility of conflict), and the hierarchy is already encoded in the Census codes (first 4 identify division, last 3 identify subdivision), so there's really nothing to gain.
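
As a small illustration of why nesting adds nothing here, the parent division can be recovered from the subdivision code alone. This is only a sketch: `parent_cd_id` is a hypothetical helper, and the codes are the placeholder values used above, not real Census codes.

```python
def parent_cd_id(csd_id: str) -> str:
    """Derive the Census division ID from a Census subdivision ID."""
    prefix, _, code = csd_id.rpartition("/csd:")
    if not prefix or len(code) != 7 or not code.isdigit():
        raise ValueError(f"not a 7-digit csd identifier: {csd_id!r}")
    # The first 4 digits identify the division; the last 3 the subdivision.
    return f"{prefix}/cd:{code[:4]}"


print(parent_cd_id("ocd-division/country:ca/csd:1234567"))
```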

In #188, I suggested less hierarchy but later agreed to an extra level of hierarchy for UK parliamentary constituencies, on the understanding that these are set by different boundary commissions, who aren't guaranteed to coordinate unique names. If that understanding is incorrect, I'd rather pcon hang off country:gb directly, but I'm not fussed either way.

@chris48s
Contributor Author

OCDIDs are meant to identify geographies, not to encode their hierarchy

👍

For identifiers that are defined by authorities, we sometimes follow the hierarchy provided by that authority, if it'll better guarantee stability

type:name_slug will always be enough to produce a unique id for all of the division types in this PR, so it sounds like #186 (with no part: clause) is the way forward. It would only be necessary to introduce hierarchy to disambiguate or ensure uniqueness if IDs were created for County Electoral Divisions or Council Wards.
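
One way to check a uniqueness claim like this mechanically is to group identifiers by their final `type:name_slug` segment and look for clashes. The sketch below is an assumption-laden illustration, not the repository's validation code: `find_collisions` is a hypothetical helper, and the sample list holds only two IDs quoted elsewhere in this thread.

```python
from collections import defaultdict


def find_collisions(ocd_ids):
    """Group IDs by their final type:name_slug segment; return clashing groups."""
    by_leaf = defaultdict(set)
    for ocd_id in ocd_ids:
        leaf = ocd_id.rsplit("/", 1)[-1]  # e.g. "pcon:barking"
        by_leaf[leaf].add(ocd_id)
    # A leaf is only a problem if two different full IDs share it.
    return {leaf: sorted(ids) for leaf, ids in by_leaf.items() if len(ids) > 1}


sample = [
    "ocd-division/country:gb/part:eng/pcon:barking",
    "ocd-division/country:gb/part:wls/pcon:cardiff_south_and_penarth",
]
# An empty result means the part: clause is not needed for uniqueness.
print(find_collisions(sample))
```

Run against the full set of compiled identifiers, a non-empty result would list exactly the leaves that need extra hierarchy to disambiguate.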

@jloutsenhizer
Contributor

I'm not sure I agree that the only use of hierarchy is to prevent ID collision. If that's the case, you could move all of the necessary hierarchy into the types, which is what is done in #186, where we use nawc and nawr for the Welsh government's constituencies and regions, and spc and spr for the Scottish government's constituencies and regions.

In that vein, you could rewrite the identifiers for US counties as ocd-division/country:us/<2-letter-state-code>county:<county-name>.

I think what was previously suggested and would keep the amount of hierarchy away from either extreme is using the government in charge of the divisioning to determine the amount of hierarchy. So in this case you'd hang all of the UK parliament constituencies off of ocd-division/country:gb, all National Assembly for Wales constituencies off of ocd-division/country:gb/part:wls, all Scottish parliament constituencies off of ocd-division/country:gb/part:sct, etc.
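
A minimal sketch of this proposal, assuming a simple lookup from division type to the prefix it hangs off. `PARENT_BY_TYPE` and `build_id` are hypothetical names invented for this example; only the three constituency types named above are mapped, and the slugs are ones quoted earlier in the thread.

```python
ROOT = "ocd-division/country:gb"

# Hypothetical mapping: each type hangs off the area of the government
# responsible for drawing that kind of division.
PARENT_BY_TYPE = {
    "pcon": ROOT,                # UK Parliament constituencies
    "nawc": f"{ROOT}/part:wls",  # National Assembly for Wales constituencies
    "spc": f"{ROOT}/part:sct",   # Scottish Parliament constituencies
}


def build_id(division_type: str, slug: str) -> str:
    """Build an OCD ID from a division type and a name slug."""
    return f"{PARENT_BY_TYPE[division_type]}/{division_type}:{slug}"


for division_type, slug in [("pcon", "barking"), ("nawc", "aberavon"), ("spc", "aberdeen_central")]:
    print(build_id(division_type, slug))
```

Keeping the rule in one table makes the amount of hierarchy an explicit, reviewable decision per type rather than something implied by file layout.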

@jpmckinney
Member

jpmckinney commented Mar 19, 2020

I agree with @jloutsenhizer's proposal.

To expand on the use of hierarchy: There are many factors at play. Another factor is to avoid a proliferation of types within a jurisdiction, which increases complexity and reduces comparability.

Most of the types in this PR are for electoral divisions. There isn't a conceptual difference between a Scottish parliamentary constituency and a Welsh parliamentary constituency; as such, they don't deserve their own types, as their semantics are essentially the same.

For example, in Canada, there is a legal difference between provinces and territories, so we use the province and territory types instead of a more generic region type (which wouldn't be desirable anyway, because "region" in Canada often refers to groups of provinces).

We sometimes proliferate types across jurisdictions to use local terms instead of generic terms, especially in cases where no generic type or correspondence is agreed (#170).

In short: the type shouldn't encode hierarchy at all (unless that hierarchy is essential to its semantics – for example, a Census subdivision is essentially a child of a Census division), and the OCDID as a whole should generally only encode as much hierarchy as necessary.

@chris48s
Contributor Author

I've pushed some commits to this branch that apply the constituent nation (or part) clause only to the SP/NAW constituencies/regions, and not to the Parliament Constituencies, Combined Authorities or Police Force Areas.


move all of the hierarchy necessary into the types, which is what is done in #186

Just to clarify: #186 doesn't move hierarchy into the types; it's just a straight migration from the -uk to the -gb namespace. The types in that form already exist in https://github.com/opencivicdata/ocd-division-ids/tree/master/identifiers/country-uk on master. This PR introduces hierarchy.

There isn't a conceptual difference between a Scottish parliamentary constituency and a Welsh parliamentary constituency

There is some difference in the sense that the Scottish Parliament and Welsh Assembly aren't exactly analogous: they have different powers. As one example, the Scottish Parliament has powers to vary the basic rate of income tax, while the Welsh Assembly is reliant solely on funding by UK central government. Hence the elected representative for an SP Region has different powers than the elected representative for a NAW Region does. So while they both use the same words to describe their subdivisions, they're not just the exact same type of body in two different places. Maybe that is a bit like how a Province is different from a Territory, or perhaps it is not 🤷‍♂️

@jpmckinney
Member

jpmckinney commented Mar 20, 2020

The difference you’re describing could be used to argue for a different type for Scotland and Wales instead of both using part. However, I can’t find any common terminology for that difference, unlike province and territory, which is written into the constitution.

The constituencies, on the other hand, are semantically the same. (Adopting a model where every difference at a higher level is inherited by sub-levels would create an extremely large taxonomy of types, which wouldn’t be useful and doesn’t make much sense since the function of types is categorization, not identification.)

It’s important not to conflate political divisions, bodies (legislatures) and roles (representatives).

@jloutsenhizer
Contributor

Thanks for the changes @chris48s. I'm happy with the state of this PR now, @jpmckinney mind taking a look to see if there are any other concerns from your side before I merge?

identifiers/country-gb.csv
identifiers/country-gb/naw_consitituencies.csv (outdated)
identifiers/country-gb/naw_consitituencies.csv (outdated)
@chris48s
Contributor Author

The difference you’re describing could be used to argue for a different type for Scotland and Wales instead of both using part.

OK. I think I see where you're going with this now. I'm going to attempt to summarise your perspective, what we've changed, and why..

In 09a5a9a part:eng, part:wls, part:sct are just "shapes on a map": if a thing is geographically in England, it's a child of part:eng, regardless of what type of thing it is.

In 1c87a1c, rather than just seeing the part:* layer as a bounding box and using it in some places but not others, part:wls and part:sct have moved to representing the area or jurisdiction of the Welsh Assembly and Scottish Parliament, respectively (but crucially not the actual legislatures themselves).

So whereas with the previous scheme, if we were going to add an identifier for Cardiff Council (for the love of god let's not try and get into adding IDs for more areas in this PR, but just as an example), you would have made the ID something like ocd-division/country:gb/part:wls/uta:cardiff (because it's geographically in Wales), now you'd make it ocd-division/country:gb/uta:cardiff because it's not directly a child/division of the Welsh Assembly. Right?

Is that summary correct, or am I still wide of the mark on understanding your thinking?

@jpmckinney
Member

jpmckinney commented Mar 22, 2020

Hmm, not quite.

Political divisions are never just shapes on a map. Having a shape is a necessary condition, not a sufficient condition. It's also important to note that the shape needn't be agreed or stable, e.g. Western Sahara. Another necessary condition is that the shape must have some political identity (instead of splitting hairs on that one, I'll just defer to academic work in political geography, which is the theoretical basis for this project).

ocd-division/country:gb/part:sct is the political division that matches the jurisdiction of the Scottish Parliament, but it is not essentially or only or identical to that jurisdiction. Prior to the 1997 referendum, this political division existed even though the Scottish Parliament did not; for example, it was the jurisdiction of Scottish courts. The same division can match the territory of different jurisdictions, whether over time or at once.

On your third summary: First, we'd never add an identifier for Cardiff Council, because OCDIDs aren't for political bodies. But we can add one for Cardiff.

I don't know the political processes by which UK councils are created, managed, merged, split, etc. or how their jurisdictions and territories are determined. If they are managed directly by the UK Parliament (by-passing the Welsh Assembly), then they could be under country:gb. If by the Welsh Assembly, then under part:wls. On this last point, there are two considerations:

  • If the Ordnance Survey or UK Census or some other relevant authority nests councils directly under UK in their own geographic system, the OCDID maintainers might defer to that, on the assumption that the authority had good reason to do so, even if the reasons are unknown.
  • If the OCDID maintainers for the UK are interested in establishing an identifier system that can work across long periods of time, they might prefer an approach that gives Cardiff a stable identifier that disregards which part of the UK is presently responsible for it.

Coming back to this, as it seemed to be a point of confusion:

Instead of thinking about the constituent countries (or "parts" as they've been labelled)

We considered the reuse of country in ocd-division/country:gb/country:sct to be confusing. We therefore took the only other label that had any consistent authoritative use from the Acts of Union 1707 and 1800.
