This repository has been archived by the owner on Dec 28, 2020. It is now read-only.

Drudge domain analysis

A simple example of using storytracker and the PastPages API to conduct a link analysis.

Getting started

Create a virtualenv and activate it.

$ virtualenv drudge-domain-analysis
$ cd drudge-domain-analysis
$ . bin/activate

Clone the repository and jump into it.

$ git clone https://github.com/pastpages/drudge-domain-analysis.git repo
$ cd repo

Install the requirements.

$ pip install -r requirements.txt

Running the analysis

Download the archived screenshots from PastPages.

$ python download.py

Extract the hyperlinks from each one.

$ python extract.py
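
To illustrate what extracting hyperlinks involves, here is a minimal sketch using only the Python 3 standard library. It is not the code in `extract.py`, which relies on storytracker. The names `LinkExtractor` and `extract_links` are hypothetical, and the sketch assumes each archived page is available as an HTML string.

```python
from html.parser import HTMLParser


class LinkExtractor(HTMLParser):
    """Collect the href value of every <a> tag in an HTML document."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        # attrs is a list of (name, value) pairs for the tag.
        if tag != "a":
            return
        for name, value in attrs:
            if name == "href" and value:
                self.links.append(value)


def extract_links(html):
    """Return the hrefs found in an HTML string, in document order."""
    parser = LinkExtractor()
    parser.feed(html)
    parser.close()
    return parser.links
```

Anchors without an `href` attribute are skipped, and relative links are returned unchanged.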

Analyze the hyperlinks and spit out the results.

$ python analyze.py
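
As a rough idea of what a domain analysis of hyperlinks can look like, the sketch below tallies how often each domain appears in a list of URLs. It is an assumption about the general approach, not a copy of `analyze.py`. The function name `count_domains` and the choice to strip a leading `www.` are illustrative.

```python
from collections import Counter
from urllib.parse import urlparse


def count_domains(urls):
    """Count how many URLs point at each domain."""
    counts = Counter()
    for url in urls:
        # hostname is lowercased and is None for relative links.
        host = urlparse(url).hostname
        if not host:
            continue
        # Treat www.example.com and example.com as the same domain.
        if host.startswith("www."):
            host = host[4:]
        counts[host] += 1
    return counts


if __name__ == "__main__":
    sample = [
        "http://www.example.com/a",
        "https://example.com/b",
        "http://news.example.org/",
    ]
    for domain, total in count_domains(sample).most_common():
        print(domain, total)
```

Relative links carry no hostname, so this sketch ignores them rather than guessing the domain they belong to.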