More details in the Jobs table #39

Closed
Blender3D opened this issue Mar 7, 2014 · 3 comments

@Blender3D

The Jobs table in the web interface is really bare. The Scrapy stats collector contains a lot of valuable data, which should be included in this table.

I see a few ways of accessing this data:

  1. Parsing logs. This seems like unnecessary work and will only give access to crawl statistics after a crawl has finished.
  2. Subclassing CrawlerProcess and overriding the methods that start/stop the reactor, removing the need to launch scrapyd.runner as a separate process. This gives us direct access to crawler.stats.get_stats() and has the added benefit of using a single reactor to run multiple crawls (see the sketch after this list).
  3. Using scrapy.contrib.webservice.stats.StatsResource. This doesn't rely on an unstable API (unlike 2), but will force us to parse log files to determine the webservice port.
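
To make option 2 concrete, here's a minimal sketch of reading the stats collector from an in-process crawl, assuming a modern Scrapy API (newer than what existed when this issue was filed); MySpider is a hypothetical spider:

```python
from scrapy import Spider
from scrapy.crawler import CrawlerProcess

class MySpider(Spider):
    name = "my_spider"
    start_urls = ["http://example.com"]

    def parse(self, response):
        yield {"title": response.css("title::text").get()}

# Running the crawl in-process (instead of via scrapyd.runner) keeps a
# handle on the Crawler object, so its stats collector can be read directly.
process = CrawlerProcess(settings={"LOG_LEVEL": "INFO"})
crawler = process.create_crawler(MySpider)
process.crawl(crawler)
process.start()  # blocks until the crawl finishes

# The kind of data the Jobs table could display:
print(crawler.stats.get_stats())
```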

Aside from a prettier UI, Scrapyd needs some other useful upgrades: scheduling periodic crawls, queues, retrying, and so on. They don't seem difficult to implement, but I don't have the time to do this myself, and I don't know whether the community even has interest in Scrapyd.

Thoughts?

Digenis commented Mar 8, 2014

I wouldn't favour parsing logs, at least not by default: crawls can produce big log files, and parsing them would delay the rendering of the Jobs table. The built-in webservice already provides enough functionality, but I understand the need for more features; I myself once ended up patching it locally because I couldn't find any documentation on custom resource classes. More extensive documentation on resource classes, plus a contrib/ package, would unleash the creativity of even more users and their useful ideas without cluttering the builtins. I think the community has interest in Scrapyd; it just takes much more effort to get involved without detailed documentation to start from (compared to Scrapy's docs).
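
For illustration, a hedged sketch of such a custom resource class: WsResource is Scrapyd's real JSON resource base class, but the JobStats endpoint and the launcher attributes it reads are assumptions about internals that may have changed:

```python
from scrapyd.webservice import WsResource

class JobStats(WsResource):
    """Hypothetical endpoint exposing extra detail about running jobs."""

    def render_GET(self, txrequest):
        # self.root is the Scrapyd application root; the launcher keeps
        # a mapping of the currently running crawl processes (assumed).
        running = [
            {"project": p.project, "spider": p.spider, "job": p.job}
            for p in self.root.launcher.processes.values()
        ]
        return {"node_name": self.root.nodename,
                "status": "ok",
                "running": running}
```

It would then be wired in through the [services] section of scrapyd.conf (e.g. jobstats.json = myproject.webservice.JobStats), the same mechanism the built-in endpoints use; documenting that pattern is exactly what a contrib/ package could cover.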

jayzeng commented Jul 4, 2014

I do agree Scrapyd needs more powerful features for different needs, but adding more features adds unnecessary overhead for those who need the absolute minimum. We should think about adding plugins and exposing/managing them through a settings file and/or the web UI.
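
A minimal sketch of that plugin idea, assuming a hypothetical [plugins] section in scrapyd.conf (Scrapy's load_object and Scrapyd's Config are real; the section name and plugin interface are invented for illustration):

```python
from scrapy.utils.misc import load_object
from scrapyd.config import Config

def load_plugins(config=None):
    """Instantiate every class listed in a (hypothetical) [plugins]
    section of scrapyd.conf, so the default install stays minimal."""
    config = config or Config()
    plugins = []
    # Config.items() yields (name, dotted_path) pairs for a section.
    for name, path in config.items("plugins", default=[]):
        cls = load_object(path)  # e.g. "myproject.plugins.JobStatsUI"
        plugins.append(cls())
    return plugins
```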

@jpmckinney

Closing as this feature request has not attracted additional interest since 2014.
