wayback-downloader

A simple downloader client for the Wayback Machine written in Python.

Usage:
	python <script.py> {--help|-h}
	python <script.py> [--threads <threads>] [--matchType {exact|prefix|host|domain}] [--from <timestamp>] [--to <timestamp>] [--limit <snapshots>] [--dry] <url>

Options:
	--help, -h		Display this help message and exit

	--threads, -T	Number of downloading threads (default: 10)

	--matchType, -m	Which results to download, relative to <url>
		exact		Download results matching <url> exactly
		prefix		Download results under the path <url>
		host		Download results from the host of <url>
		domain		Download results from the host of <url> and all of its subdomains

	--from, -f		Download results that were captured at or after this timestamp
	--to, -t		Download results that were captured at or before this timestamp
		Both <from> and <to> must be a prefix (1 to 14 digits) of "yyyyMMddhhmmss", e.g. 2010 or 201606

	--limit, -l		Download at most <snapshots> snapshots

	--dry, -d		List items to be downloaded without downloading them
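
The timestamp rule for --from and --to can be checked before any request is made. The following is a minimal sketch, not the script's actual code; the function name is_timestamp_prefix is hypothetical and only illustrates the "prefix of yyyyMMddhhmmss" rule stated above.

```python
import re

def is_timestamp_prefix(value):
    """Return True if value is 1 to 14 ASCII digits.

    Hypothetical helper: it checks only the shape of the timestamp
    (a prefix of "yyyyMMddhhmmss"), not whether the date itself is valid.
    """
    return re.fullmatch(r"[0-9]{1,14}", value) is not None

# The values from the example command pass; a dashed date does not.
print(is_timestamp_prefix("2010"))     # True
print(is_timestamp_prefix("201606"))   # True
print(is_timestamp_prefix("2016-06"))  # False
```

A shape check like this rejects inputs such as "2016-06" early, but it would still accept an impossible date such as "20161399".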

Example:
	Use the following command:
		python <script.py> --matchType prefix --from 2010 --to 201606 --limit 1000 example.org
	to download at most 1000 snapshots under example.org captured between the year 2010 and June 2016 (inclusive).

For more information, see: https://github.com/internetarchive/wayback/blob/master/wayback-cdx-server/README.md
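
As a rough illustration of how the options in the example command above could map onto a CDX server query, here is a minimal sketch. It assumes the public endpoint http://web.archive.org/cdx/search/cdx and the url, matchType, from, to, limit and output parameters described in the CDX server README linked above. The function build_cdx_query and its defaults are hypothetical and are not taken from the script.

```python
from urllib.parse import urlencode

# Assumed endpoint of the public Wayback CDX server.
CDX_ENDPOINT = "http://web.archive.org/cdx/search/cdx"

def build_cdx_query(url, match_type="exact", from_ts=None, to_ts=None, limit=None):
    """Build a CDX query URL from downloader-style options (hypothetical helper)."""
    params = {"url": url, "matchType": match_type, "output": "json"}
    # Optional filters are only sent when the user supplied them.
    if from_ts is not None:
        params["from"] = from_ts
    if to_ts is not None:
        params["to"] = to_ts
    if limit is not None:
        params["limit"] = limit
    return CDX_ENDPOINT + "?" + urlencode(params)

# Mirrors: --matchType prefix --from 2010 --to 201606 --limit 1000 example.org
query = build_cdx_query("example.org", "prefix", "2010", "201606", 1000)
print(query)
# http://web.archive.org/cdx/search/cdx?url=example.org&matchType=prefix&output=json&from=2010&to=201606&limit=1000
```

Printing the query URL instead of fetching it is roughly what a --dry style run would need as its first step; the actual listing and downloading logic is left out of this sketch.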
