
v2.0.0

@jdemaeyer jdemaeyer released this 26 Jan 11:22

This release brings major updates to shub:

  • Configuration is now done from a dedicated YAML file named scrapinghub.yml. shub will automatically migrate your configuration.
  • We now supply shub binaries. This will be particularly helpful for our Windows users. Find them at the bottom of these release notes.
  • The API received an overhaul. Most notably, the -p option was dropped completely in favour of defining targets in scrapinghub.yml or supplying the project ID as a positional argument: shub deploy -p 12345 becomes shub deploy 12345. Alternatively, add targetname: 12345 to the projects section of your scrapinghub.yml and run shub deploy targetname.
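
The two ways of naming a deploy target described above (a literal project ID, or a name defined in the projects section of scrapinghub.yml) can be illustrated with a short Python sketch. This is not shub's actual code: the resolve_target helper and the projects dictionary are hypothetical and only model the lookup rule stated in these notes.

```python
def resolve_target(target, projects):
    """Return a numeric project ID for a literal ID or a named target.

    `projects` is a hypothetical stand-in for the parsed `projects`
    section of scrapinghub.yml, mapping target names to project IDs.
    """
    # A purely numeric argument is taken as the project ID itself,
    # as in `shub deploy 12345`.
    if target.isdigit():
        return int(target)
    # Otherwise the argument is looked up as a target name,
    # as in `shub deploy targetname`.
    try:
        return int(projects[target])
    except KeyError:
        raise ValueError(f"Unknown target: {target!r}") from None


# Models a projects section containing `targetname: 12345`.
projects = {"targetname": 12345}
print(resolve_target("12345", projects))
print(resolve_target("targetname", projects))
```

Both calls resolve to the same project, which is the equivalence the migration from -p relies on.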

Check out the revamped README for more information.

New features:

  • Add -f flag to items, log, and requests for live view of logs/items/requests as they are being scraped
  • Add -s flag to schedule to allow passing job settings
  • Add onboarding wizard and auto-generation of configuration file on first run of deploy
  • Add automatic check for updates

API changes:

  • Read configuration from scrapinghub.yml and ~/.scrapinghub.yml instead of scrapy.cfg and ~/.scrapy.cfg (old settings will be auto-migrated)
  • Drop -p option
  • Print job items/requests in JSON lines format
  • Show only a summary and not the full log when deploying (use -v to override)
  • Drop -v as the version option when deploying; use --version instead
  • Use more meaningful nonzero exit codes depending on error
  • Don't include egg name in version tag of deployed eggs
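
Since job items and requests are now printed in JSON lines format (one JSON document per line), the output can be consumed line by line. The following is a minimal Python sketch, assuming the output has already been captured as a string; the parse_json_lines helper and the sample records are hypothetical and not part of shub.

```python
import json


def parse_json_lines(text):
    """Parse JSON lines text: one JSON document per non-empty line."""
    return [json.loads(line) for line in text.splitlines() if line.strip()]


# Hypothetical sample standing in for captured items output.
sample = '{"title": "First"}\n{"title": "Second"}\n'

for item in parse_json_lines(sample):
    print(item["title"])
```

Parsing each line separately means a consumer can start processing records before the job has finished, which suits the live view offered by the -f flag.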

Enhancements:

  • Improve usage messages and command help
  • Drop dependency on unzip and tar
  • Use pip as a package rather than spawning subprocesses

Bugfixes:

  • Fix parsing of equal signs in spider arguments and job settings (e.g. shub schedule myspider -a ARG=stringwith=equalsign)
  • Fix reading project version from Mercurial branch/commit when Git is not installed
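
The equal-sign fix concerns arguments of the form NAME=VALUE where the value itself contains "=". One common way to handle this, shown here as a hedged Python sketch and not as shub's actual implementation, is to split only on the first equal sign. The parse_key_value helper is hypothetical.

```python
def parse_key_value(arg):
    """Split 'NAME=VALUE' on the first '=' only, so VALUE may contain '='."""
    if "=" not in arg:
        raise ValueError(f"Expected NAME=VALUE, got {arg!r}")
    # maxsplit=1 keeps any further equal signs inside the value.
    name, value = arg.split("=", 1)
    return name, value


print(parse_key_value("ARG=stringwith=equalsign"))
```

Splitting without a limit would yield three parts for this input, so the value would be truncated or the argument rejected.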