Stars
🎭 Intelligent browser header & fingerprint generator
Crawlee—A web scraping and browser automation library for Python to build reliable crawlers. Extract data for AI, LLMs, RAG, or GPTs. Download HTML, PDF, JPG, PNG, and other files from websites. Wo…
Make websites accessible for AI agents
A course on aligning smol models.
Puppeteer (Chrome headless Node API) based web page renderer
Scrapy download handler that can impersonate browsers' TLS signatures or JA3 fingerprints.
Python binding for the curl-impersonate fork via cffi. An HTTP client that can impersonate browser TLS/JA3/HTTP/2 fingerprints.
A curated list of awesome Puppeteer resources.
🚀 The CLI for your next Chrome Extension
Gracy helps you handle failures, logging, retries, throttling, and tracking for all your HTTP interactions.
Scrapy extension for monitoring spider execution.
JavaScript object that creates a unique CSS selector for a given element.
Python & Command-line tool to gather text and metadata on the Web: Crawling, scraping, extraction, output as CSV, JSON, HTML, MD, TXT, XML
📜 Extract meaningful content from the chaos of a web page
A Smart, Automatic, Fast and Lightweight Web Scraper for Python
List of libraries, tools and APIs for web scraping and data processing.
Easily train your own text-generating neural network of any size and complexity on any text dataset with a few lines of code.