KingfisherFilesStore: Write scrapyd-job.txt #534

jpmckinney · 2020-10-27T17:29:05Z

Presently, it's a little complicated to reconcile data directories with log files, especially if multiple crawls are started for the same source in rapid succession. This is relevant to Kingfisher Archive.

If we write the spider's job ID to a scrapyd-job.txt file in the crawl directory, matching each data directory to its log file becomes much easier. Scrapyd passes a _job kwarg to the spider's initializer.
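A minimal sketch of the idea, not the actual KingfisherFilesStore code: the helper name `write_scrapyd_job_file`, the `StandInSpider` class, and the directory layout are all hypothetical. It assumes only that the job ID arrives as the `_job` keyword argument and that nothing should be written when the spider is run outside Scrapyd.

```python
import os
import tempfile


def write_scrapyd_job_file(crawl_directory, job_id):
    """Write the Scrapyd job ID to scrapyd-job.txt in the crawl directory.

    Returns the path written, or None if there is no job ID
    (for example, when the spider is not started by Scrapyd).
    """
    if not job_id:
        return None
    os.makedirs(crawl_directory, exist_ok=True)
    path = os.path.join(crawl_directory, "scrapyd-job.txt")
    with open(path, "w") as f:
        f.write(job_id)
    return path


class StandInSpider:
    """Hypothetical stand-in for a spider; it only keeps the _job kwarg."""

    def __init__(self, name, **kwargs):
        self.name = name
        # Scrapyd supplies the job ID as the _job keyword argument.
        self._job = kwargs.get("_job")


if __name__ == "__main__":
    spider = StandInSpider("example", _job="0123456789abcdef")
    # Assumed layout: one directory per crawl under a data root.
    crawl_directory = os.path.join(tempfile.mkdtemp(), spider.name, "crawl")
    print(write_scrapyd_job_file(crawl_directory, spider._job))
```

With a file like this in place, a tool such as Kingfisher Archive could read the job ID from the crawl directory and look up the log file by that ID instead of inferring the match from timestamps.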

jpmckinney added the framework-spiders label (Relating to common spider functionality) Oct 27, 2020

jpmckinney mentioned this issue Oct 27, 2020

Simplify code for matching log files open-contracting-archive/kingfisher-archive#51

Open

yolile added this to the Priority milestone Mar 3, 2021

jpmckinney closed this as completed in 0c01707 Aug 31, 2021
