Quick tips on upgrading to 0.4? #124

Closed
jpmckinney opened this issue Oct 14, 2014 · 7 comments

@jpmckinney
Member

I just got scrapers-ca working in Py3, so I'm ready to upgrade Pupa. I know there's the new style __init__.py files, but otherwise I don't really know what differences there are. Point me in the right direction? cc/ @paultag

@paultag
Contributor

paultag commented Oct 14, 2014

Yeah totally @jpmckinney. There have been a few changes to how the scrapers work. We haven't been keeping a migration guide (sorry!), but let me see if I can recall the big points from memory. I'd be happy to go through a scraper with you and migrate it forward, and hopefully document this too. I've been keeping a few scrapers against master, so it's hard to remember what was in 0.3.0 and what was added or changed after 0.3.0.

Big ones are likely:

  • A get_organizations method on the jurisdiction provides the top-level organizations in the jurisdiction. In the US federal case, these would be the branches of government. Committees (etc.) are still scraped in.
  • The Legislator helper is gone in favor of a plain Person, which takes a primary_org keyword argument relating the person to one of the organizations above. If that doesn't fit, leave primary_org at None and relate them manually with Memberships.
  • The Committee helper is also gone in favor of Organization.

Sorry for all the API breakage. If you have a bunch of scrapers, I'd be happy to take a look and see what the big changes are.

@jpmckinney
Member Author

Is name in __init__.py for the division name or jurisdiction name? The example uses "Seattle", but if it's the government of Seattle I would expect the example to use "City of Seattle".

@paultag
Contributor

paultag commented Oct 14, 2014

Jurisdiction. You're likely right, "City of Seattle" sounds like the correct string for name :)


Paul Tagliamonte
Software Developer | Sunlight Foundation

@jpmckinney
Member Author

Besides the changes reported in opencivicdata/docs.opencivicdata.org#32:

Modules

  • pupa.models mostly moved to pupa.scrape
  • pupa.models.utils moved to pupa.utils

__init__.py

  • get_scraper() becomes scrapers
  • sessions becomes legislative_sessions
  • chambers becomes get_organizations()
  • remove jurisdiction_id, provides
  • add division_id, classification
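
To make the renames above concrete, here is a sketch of the shape a new-style __init__.py might take. So that it runs without Pupa installed, `Jurisdiction` and `Organization` are tiny stand-in classes, not Pupa's own; a real file would import the real ones (presumably from pupa.scrape). The `PersonScraper` class, the division ID, the "government" and "legislature" classification values, the dict form of `scrapers`, and the keys inside `legislative_sessions` are all assumptions for illustration and should be checked against the Pupa docs.

```python
# Stand-ins so this runs without Pupa installed. They are NOT Pupa's classes;
# a real __init__.py would import the real ones instead.
class Organization:
    def __init__(self, name, classification):
        self.name = name
        self.classification = classification


class Jurisdiction:
    pass


class PersonScraper:  # hypothetical scraper class
    pass


class Seattle(Jurisdiction):
    # New in 0.4: division_id and classification (values are illustrative).
    # jurisdiction_id and provides are no longer set.
    division_id = "ocd-division/country:us/state:wa/place:seattle"
    classification = "government"

    # The jurisdiction name, not the division name.
    name = "City of Seattle"

    # Was get_scraper(); assumed here to map a scraper name to its class.
    scrapers = {"people": PersonScraper}

    # Was sessions; the keys of each entry are an assumption.
    legislative_sessions = [{"identifier": "2014", "name": "2014 Session"}]

    # Was chambers: yield the top-level organizations of the jurisdiction.
    def get_organizations(self):
        yield Organization(name="Seattle City Council",
                           classification="legislature")
```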

Scrapers

  • .add_extra(key, value) becomes .extras[key] = value
  • Many arguments are now keyword-only, e.g.:
    • note in add_link() and add_source()
    • all in add_contact_detail()
  • Legislator becomes Person.
    • chamber becomes primary_org
    • post_id becomes district
    • Legislator would automatically create a Membership. Any contact details added with Legislator#add_contact were added to that Membership and not to the Legislator. That Membership took the Legislator's role, chamber, and post_id. Now, setting primary_org (which is in fact a classification) automatically creates a Membership with a role of "member", with a post_id corresponding to the district and classification, and with a start_date and end_date.
    • To disambiguate people, Legislator would save chamber, post_id and party properties. Now, it disambiguates people on birth_date.
    • With Legislator, a membership would automatically be created to the Legislator's party. This is still the case. However, a Person does not store a party property.
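
The scraper-side changes above can be illustrated with a toy model. The `Person` class below is a stand-in written for this example, not Pupa's implementation, and its membership dict layout is an assumption. It only mimics three of the points listed: extras as a plain dict, a keyword-only note on add_source(), and a "member" Membership created when primary_org is set.

```python
class Person:
    """Toy stand-in mimicking the behaviour described above; not Pupa's code."""

    def __init__(self, name, *, primary_org=None, district=None):
        self.name = name
        self.extras = {}        # replaces add_extra(key, value)
        self.sources = []
        self.memberships = []
        if primary_org is not None:
            # primary_org is a classification such as "legislature".
            # The dict layout here is illustrative only.
            self.memberships.append(
                {"classification": primary_org,
                 "district": district,
                 "role": "member"}
            )

    def add_source(self, url, *, note=""):
        # note is keyword-only, so add_source(url, "text") raises TypeError.
        self.sources.append({"url": url, "note": note})


person = Person("Jane Doe", primary_org="legislature", district="Ward 1")
person.extras["office"] = "Room 101"   # was person.add_extra("office", ...)
person.add_source("https://example.com/council", note="council roster")
```

Old call sites that pass note positionally fail immediately with a TypeError under a signature like this, which makes them easy to find when running a migrated scraper.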

@jpmckinney
Member Author

Got my tips :)

@paultag
Contributor

paultag commented Oct 15, 2014

Awesome. Once I start on the docs, I'll be sure to add this somewhere (an upgrade guide).

@paultag paultag mentioned this issue Oct 15, 2014
@jpmckinney
Member Author

FYI, got my first two scrapers to complete. Adding explicit posts is really helping with data integrity - I can cut a bunch of code that used to validate that.
