Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Filling out Elections proposal #64

Merged
merged 36 commits into from
May 10, 2017
Merged

Conversation

gordonje
Copy link
Contributor

Third time's the charm!

My first PR was on datamade's fork of this repo. This second one was on this repo, but from datamade's fork which included some commits to files I never intended to change (probably as a result of my shoddy attempt to squash the commits).

We've essentially borrowed everything from the VIP XML spec that feels useful for the campaign finance use cases. Differences between proposed OCD data types and VIP elements are noted below.

This proposal doesn't include modeling election results (which also aren't in VIP). Will cover that in a future proposal.

@aepton: Saw your comments on the original PR. I've made a few changes you suggest and will run through each of your comments. After that, if you feel like I haven't adequetely address your questions/concerns, do you mind moving those comments over to this PR? Sorry for the redundant work!

@fgregg, @palewire and @dwillis and whoever else: Please give it look!

@palewire
Copy link

palewire commented Jan 15, 2017

I think this is a great start. As we await feedback from the OCD crew, we're going to push ahead with a rough implementation in our django-calaccess-processed-data repository. That's where we are currently working to transform and simplify data from CAL-ACCESS, the state of California's jumbled, dirty and difficult campaign-finance database.

**optional**
Specifies the date and time when a candidate must have filed for the contest for the office (datetime).

is_runoff
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I liked your suggestion about runoff_for_contest_id as a property here that would a) replace is_runoff (since they're redundant) and b) provide a helpful link in a clean way to the original contest.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sounds good. Made that change.


post_id
Reference to the OCD ``Post`` representing the public office the candidate will either retain or lose as a result of the contest.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think it would be nice to inherit BallotMeasureThreshold for RetentionContest (if it's not already being inherited). Strikes me that some retention contests are likely to have non-50%+1 thresholds.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

RetentionContest should inherit all of the properties of BallotMeasureContest, including passage_threshold. This property is proposed to be a string, as it is in VIP. Possibly it would make more sense as an enum if we can agree on what the options should be.

is_top_ticket
**optional**
Indicates that the candidate is the top of a ticket that includes multiple candidates (boolean). For example, the candidate running for President is consider the top of the President/Vice President ticket. In many states, this is also true of the Governor/Lieutenant Governor.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Might want some pointer to the ticket itself here as well.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think the nearest object that represents the ticket would be the CandidateSelection. Since CandidateSelection already references the Candidates, adding the reverse reference feels like overkill. But maybe there's an important need here that I am just missing.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As long as there's a reasonable way to do it, I'm good. Don't need to be redundant.

@aepton
Copy link
Contributor

aepton commented Jan 16, 2017

This looks really good to me.

Copy link
Contributor Author

@gordonje gordonje left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the great comment @aepton. I have incorporated a couple of small changes and responded to the other comments.

**optional**
Specifies the date and time when a candidate must have filed for the contest for the office (datetime).

is_runoff
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sounds good. Made that change.


post_id
Reference to the OCD ``Post`` representing the public office the candidate will either retain or lose as a result of the contest.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

RetentionContest should inherit all of the properties of BallotMeasureContest, including passage_threshold. This property is proposed to be a string, as it is in VIP. Possibly it would make more sense as an enum if we can agree on what the options should be.

is_top_ticket
**optional**
Indicates that the candidate is the top of a ticket that includes multiple candidates (boolean). For example, the candidate running for President is consider the top of the President/Vice President ticket. In many states, this is also true of the Governor/Lieutenant Governor.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think the nearest object that represents the ticket would be the CandidateSelection. Since CandidateSelection already references the Candidates, adding the reverse reference feels like overkill. But maybe there's an important need here that I am just missing.

@gordonje
Copy link
Contributor Author

Realized I had a couple of redundant post_id fields, which I have just cleared up. I think we all agree that an OCD Post is an analog to a VIP <Office>, but I'm also thinking now that a RetentionContest should reference a particular OCD Membership, which I take to represent a particular person's tenure in a public office (@fgregg, @jpmckinney, and whoever, lemme know if I have that right).

Seems better if Candidacy contains one record for each time a person ran for a public office, not including the times they were in a recall election. That's I believe I have things modeled currently.

@jpmckinney
Copy link
Member

jpmckinney commented Jan 18, 2017

@gordonje Yes, you've got the correct interpretations of Popolo's (OCD's) Post and Membership :)

@jpmckinney
Copy link
Member

I'm assuming the PR isn't ready for review, but let me know if it is.

@gordonje
Copy link
Contributor Author

gordonje commented Feb 8, 2017

@jpmckinney It's ready to review. I've added some sample json pulled out of data CCDC has scraped from CAL-ACCESS. We've come pretty far along with a draft of django models implementing what's proposed here and, as a result, I've made a few more minor tweaks to this spec. Looking forward to some feedback.

@jpmckinney
Copy link
Member

jpmckinney commented Feb 10, 2017

General notes

OCD is written for international use, but VIP is written for the US only. It will require a fair bit more work to transform this proposal into something for international use.

If you just want to have US models, then I propose implementing this as a new Python package, that looks like and interoperates with python-opencivicdata-django, in which classes are prefixed with VIP, to avoid clashing with any eventual internationally-usable elections models.

Prior to realizing this, I noted below several changes I would make to achieve a closer alignment between this proposal and existing relevant international standards. Classes I still have to review are CandidateContest and Candidacy.

Anyway, let me know if all you want is a straight Python/JSON version of VIP, in which case most of my comments may be irrelevant.

Separately, I'll also note here that OCD's Event class probably has too many properties that not all subclasses should share (e.g. contests don't have agenda items).

Small edits

  • Add definitions for all classes, e.g. to explain what is a contest.
  • Rename ContestBase to Contest – same for BallotSelectionBase. In Python it might become an abstract ContestBase class, but a public schema wouldn't normally have a Base suffix.
  • Cut the content like "No important differences between corresponding fields." "No other OCD fields not implemented in VIP." "No other VIP fields not implemented in this OCDEP." The double negatives are hard to parse. The default assumption is that, if you're reusing an upstream standard, then you haven't changed it. These sections should only note the changes. Readers will assume that anything not mentioned is unchanged. See OCDEP 5 for example.

Election

  • administrative_org_id: Avoid abbreviations. Schema.org has an organizer property with the same semantics.
  • Countries without states have elections. Why not use division_id instead of state? We may want to add division_id to Event itself – or relate Events to full objects like Areas in Popolo, which can hold the division ID.
  • Similarly, is_statewide would need to be renamed.

BallotMeasureContest

  • Why pro and con instead of the more readable yes and no?
  • ballot_measure_type: We typically use the term classification, and we never prefix property names with class names, because it's redundant. We don't have Event.event_name for example.
  • other_type: I suggest removing it and just allowing ballot_measure_type to be an open codelist. I don't see an advantage to pushing the non-standard codes into a new field.

PartySelection

  • Can you provide the justification for the change to PartySelection?

Big edits

  • We shouldn't be advising people to throw specific things into extras. If we later decide to promote one of those fields to the main class, it will be a nightmare to transition. Either don't mention the field at all, or put it on the main class.
  • Can you merge the VIP differences discussion into each class' documentation? It's hard to keep jumping up and down the page or scrolling two tabs, and it just makes sense for readers to get all the docs about a class in one place.

Party

  • Parties are already modeled as organizations in OCD (and Popolo). We can create a subclass if appropriate.

Contest

  • Shouldn't Contest also subclass Event (and use the same id prefix)? They are events just like elections. In which case, Event should gain super_event from Popolo (superEvent in Schema.org), which would replace election_id.
  • These contests are starting to look similar to VoteEvent, which OCD adopted from Popolo, and Motion from Popolo. We should at a minimum re-use the same properties where possible. passage_threshold is requirement and full_text is text in Motion, for example; summary_text is summary in Popolo's Event (and in iCal).
  • effect_of_abstain may need to be modeled out differently. Some countries have multiple types of abstentions with different effects. We can model this as a list of objects, one for each vote option, with another property to describe the effect. We can also store 'statements' here, though we will usually only have statements for yes and no.

@palewire
Copy link

@jpmckinney Thank you for this thoughtful and thorough comment. We will review your requests and be back soon with an update.

@gordonje
Copy link
Contributor Author

I've taken a closer look at python-opencivicdata-django, which is where we want to implement this specification. The Event Django model has a required "jurisdiction" field. However, this field is not mentioned in the OCD Event format or in the Event OCDEP.

Currently this proposal requires an Election.division_id field, but because it inherits from Event, an Election will also be linked to a Jurisdiction, and each Jurisdiction is also linked to a Division.

Does each Event require a Jurisdiction? If so, should we remove Election.division_id and rely on the Division linked to the Jurisdiction that's linked to the Event?

@fgregg
Copy link
Contributor

fgregg commented Mar 13, 2017

I can't think of a really deep reason why events, generically, should have require either a jurisdiction or division field.

Practically, pupa people have needed a way of grouping different types of meetings associated with the same legislature together and this is how we've done it. I think that other ways are possible, though I don't have a concrete proposal right now.

I think it makes more sense for an Election to be associated with a division than a jurisdiction.

@gordonje
Copy link
Contributor Author

@fgregg In that case I'll leave this proposal as it is, and raise the question in python-opencivicdata-django repo.

Any other feedback on this? I'm working now and adapting our current implementation to what is described here.

@fgregg
Copy link
Contributor

fgregg commented Mar 16, 2017

No, I think this is ready to be accepted as draft proposal. @jpmckinney, thoughts?

@jpmckinney
Copy link
Member

I haven't read the full thread - any salient points to keep in mind, or can I just read the patch?

@fgregg
Copy link
Contributor

fgregg commented Mar 22, 2017 via email

@jpmckinney
Copy link
Member

For posterity, noting some election modeling elsewhere that I was just made aware of (I had asked folks at mySociety to review this thread):

@palewire
Copy link

@jpmckinney I think you can largely just read the patch. You'll find a little less here than last time due to the change @fgregg described. Our Django implementation is also nearing completion in another repo.

@gordonje
Copy link
Contributor Author

gordonje commented Apr 10, 2017

Just wanted to share that we now have a working implementation of this spec in our fork of python-opencivicdata-django. The attached zip contains csv files exported from our campaign finance app that loads these models.

We'll wait to make submit a PR on the django package until this spec is at least accepted, in case there are any other changes are necessary based on feedback.

@fgregg
Copy link
Contributor

fgregg commented Apr 11, 2017

This is so cool @gordonje !

@gordonje
Copy link
Contributor Author

Question came up in our implementation that I think deserves broader discussion.

Should a candidate contest include all the candidates who declared to run for the office in the election, or should we limit the list to just candidates who ultimately appeared on the ballot?

The campaign finance use cases surely want to allow for any candidate who raised/spent money in that contest irrespective of being included on the ballot. But I expect the election results use cases will want to limit the list to only candidates who could have received any votes.

So then Candidacy probably needs some property indicating if the person is currently an active candidate in the contest. This would likely be a boolean field like is_active or dropped_out. It would be most precise to store the date when a given candidate dropped out of each contest, but I doubt that is reliably available.

@palewire
Copy link

Could it be managed by the BallotSelection classes that were ultimately dropped by this proposal? If you have one of those, you're on the ballot. If you don't, you didn't.

@aepton
Copy link
Contributor

aepton commented Apr 14, 2017 via email

@jdmgoogle
Copy link

I haven't looked at your derivation of the schema, but did you remove the CandidatePreElectionStatus and CandidatePostElectionStatus enumerations from the Candidate element?

@gordonje
Copy link
Contributor Author

@jdmgoogle Thanks for pointing out CandidatePreElectionStatus. Maybe an enum field with exactly these options will do the trick:

  • filed: The candidate has filed for office but not yet been qualified.
  • qualified: The candidate has qualified for the contest.
  • withdrawn: The candidate has withdrawn from the contest (but may still be on the ballot).

@gordonje
Copy link
Contributor Author

VIP also has "write-in" among the CandidatePreElectionStatus options. That's another piece we're currently missing.

@gordonje
Copy link
Contributor Author

gordonje commented May 3, 2017

Anyone have any problem calling this new field on Candidacy registration_status?

I'm also adding/populating it in our fork of opencivicdata-django.

What other steps are necessary for this spec to be accepted (even provisionally) so that we can pivot to discussing the draft implementation?

@fgregg
Copy link
Contributor

fgregg commented May 5, 2017

@jpmckinney thoughts on this?

@jpmckinney
Copy link
Member

I'll actually have time to review this coming week, but my basic feeling is that it's not really going to be something that's robust for international use, but that we're okay with that. So, it'll be a specification, but not one that targets standardization outside the US, except by accident.

@palewire
Copy link

palewire commented May 5, 2017

That's great news, @jpmckinney! We're eager to complete our implementation and begin releasing a first generation of standardized files on our data portal. Please let us know what we can do to help you.

@jpmckinney jpmckinney merged commit d546cf2 into opencivicdata:master May 10, 2017
@jpmckinney
Copy link
Member

Looks good! Thanks for your patience!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Development

Successfully merging this pull request may close these issues.

6 participants