Trapper Keeper is a collection of scripts for archiving information from around the web so that it is easier to study and use. If you are a researcher working with online material, an educator creating openly licensed content, or a curious person who likes to learn about different subjects, then Trapper Keeper might be helpful to you. Trapper Keeper can currently archive and clean web pages and PDFs.
Trapper Keeper supports these features:
- Archive data from multiple sources;
- Clean data and save it as text;
- List out embedded media and links;
- Retain a copy of embedded images in the source text;
- Track the source material for changes;
- Organize your cleaned, archived data into arbitrary collections. A "collection" can be anything that unifies a set of information: for example, a set of URLs that all relate to a specific topic, or a set of information that will be remixed into chapters;
- Export a list of all tracked URLs.
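To illustrate what tracking source material for changes can involve, here is a minimal sketch that compares a stored fingerprint of cleaned text against a fresh copy. It is not taken from Trapper Keeper's scripts: the function names `fingerprint` and `has_changed`, the choice of SHA-256, and the whitespace normalization are all assumptions made for this example.

```python
import hashlib


def fingerprint(text):
    """Return a SHA-256 hex digest of text, ignoring whitespace-only differences."""
    # Collapse runs of whitespace so reflowed text does not count as a change.
    normalized = " ".join(text.split())
    return hashlib.sha256(normalized.encode("utf-8")).hexdigest()


def has_changed(previous_fingerprint, current_text):
    """Return True if current_text no longer matches the stored fingerprint."""
    return fingerprint(current_text) != previous_fingerprint


if __name__ == "__main__":
    # Hypothetical archived text and two later fetches of the same source.
    saved = fingerprint("Original   article text.")
    print(has_changed(saved, "Original article text."))
    print(has_changed(saved, "Revised article text."))
```

Storing only a fingerprint keeps the comparison cheap, though it can only say that something changed, not what changed; a diff of the saved text would be needed for that.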
Trapper Keeper has been tested on OS X and Ubuntu Linux. It has not been tested on Windows.
The Trapper Keeper Overview page contains instructions on using Trapper Keeper.
The scripts in Trapper Keeper use CSV files to help organize information. Sample CSV files are included in the /samples directory.
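As a rough illustration of how a CSV file can organize URLs into collections, the sketch below reads rows and groups URLs by collection name. The column names `url` and `collection`, the function name `group_urls_by_collection`, and the inline sample data are assumptions made for this example; check the files in /samples for the layout the scripts actually expect.

```python
import csv
import io
from collections import defaultdict

# Hypothetical CSV contents; the real column names are defined by the /samples files.
SAMPLE_CSV = """url,collection
https://example.com/a,topic-one
https://example.com/b,topic-one
https://example.org/c,topic-two
"""


def group_urls_by_collection(csv_file):
    """Return a dict mapping each collection name to its list of URLs."""
    collections = defaultdict(list)
    for row in csv.DictReader(csv_file):
        collections[row["collection"].strip()].append(row["url"].strip())
    return dict(collections)


if __name__ == "__main__":
    groups = group_urls_by_collection(io.StringIO(SAMPLE_CSV))
    for name, urls in groups.items():
        print(name, urls)
```

To read a file on disk instead of the inline sample, pass an open file handle, for example `open(path, newline="", encoding="utf-8")`, in place of the `io.StringIO` object.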