More details in the Jobs table #39
Comments
I wouldn't favour parsing logs, at least not by default: log files can grow large and delay the rendering of the Jobs table. The builtin already provides enough functionality, but I understand the need for more features; I myself once ended up patching it locally because I couldn't find any documentation on custom resource classes. More extensive documentation on resource classes and a contrib/ package might unleash the creativity of even more users and their useful ideas without cluttering the builtins. I think the community has interest in Scrapyd; it just takes much more effort to get involved without detailed documentation to start from (compared to Scrapy's docs).
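For anyone hitting the same wall, here is a minimal sketch of a custom resource class, assuming Scrapyd's `WsResource` base class and the `[services]` section of `scrapyd.conf`; the `jobstats.json` endpoint name and the `myproject.webservice` module path are made up for this example, and the exact attributes vary across Scrapyd versions:

```python
# Hypothetical custom Scrapyd webservice resource; the module path and
# endpoint name are illustrative, not part of Scrapyd itself.
from scrapyd.webservice import WsResource

class JobStats(WsResource):
    """Return a JSON summary of the jobs currently running."""

    def render_GET(self, txrequest):
        # self.root is Scrapyd's application root; launcher.processes
        # maps slot numbers to the running Scrapy process protocols.
        running = [
            {"project": p.project, "spider": p.spider, "job": p.job}
            for p in self.root.launcher.processes.values()
        ]
        # WsResource subclasses return a dict, which Scrapyd serializes as JSON.
        return {"status": "ok", "running": running}
```

It would then be registered with one line in the `[services]` section of `scrapyd.conf`:

```
[services]
jobstats.json = myproject.webservice.JobStats
```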
I do agree Scrapyd needs more powerful features for different needs, but adding more features creates unnecessary overhead for those who need the absolute minimum. I think we should consider a plugin system, exposing and managing plugins through a settings file and/or the web UI.
Closing as this feature request has not attracted additional interest since 2014. |
The Jobs table in the web interface is really bare. The Scrapy stats collector contains a lot of valuable data, which should be included in this table.
I see a few ways of accessing this data:

1. Parsing the log files; the stats collector dumps its data to the log at the end of each crawl (a rough sketch follows below).
2. Running spiders in-process using `CrawlerProcess`, overriding methods that start/stop the reactor, thus removing the need to launch `scrapyd.runner` as a separate process. This gives us direct access to `crawler.stats.get_stats()` and has the added benefit of using only one reactor to run multiple crawls (see the second sketch below).
3. Querying `scrapy.contrib.webservice.stats.StatsResource`. This doesn't rely on an unstable API (unlike 2), but will force us to parse log files to determine the webservice port.

Scrapyd needs some useful upgrades aside from a prettier UI: scheduling periodic crawls, queues, retrying, etc. They don't seem difficult to implement, but I don't have the time to do this myself and don't know whether the community even has interest in Scrapyd.
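A rough sketch of option 1, assuming Scrapyd's usual one-log-per-job layout (the path is illustrative) and Scrapy's default end-of-crawl "Dumping Scrapy stats" output; this regex approach only recovers the integer counters:

```python
import re
from pathlib import Path

# Hypothetical job log path; Scrapyd keeps one log file per job under logs_dir.
log_text = Path("logs/myproject/myspider/3fa85f64.log").read_text()

# Scrapy prints "Dumping Scrapy stats:" followed by a pretty-printed dict.
# Grab that block (stats dicts are flat, so the first '}' closes it) and
# pull out the integer counters; datetime values are skipped.
match = re.search(r"Dumping Scrapy stats:\s*(\{.*?\})", log_text, re.S)
counters = (
    {k: int(v) for k, v in re.findall(r"'([\w/]+)': (\d+)", match.group(1))}
    if match
    else {}
)
print(counters.get("item_scraped_count"), counters.get("log_count/ERROR"))
```

This also illustrates the objection in the first comment: the whole log has to be read just to reach the dump at its end.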
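And a sketch of option 2 using today's `CrawlerProcess` API (the issue predates it, so the method names here are modern Scrapy's, and the spider name is an assumption):

```python
from scrapy.crawler import CrawlerProcess
from scrapy.utils.project import get_project_settings

process = CrawlerProcess(get_project_settings())

# Keeping a handle on the Crawler gives direct access to its stats
# collector, which is the whole point of running in-process.
crawler = process.create_crawler("myspider")  # hypothetical spider name
process.crawl(crawler)
process.start()  # one reactor drives all scheduled crawls; blocks until done

# crawler.stats.get_stats() returns the same dict the log dump prints.
print(crawler.stats.get_stats())
```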
Thoughts?