This repository contains a suite of scripts used for grokking Presidential Inaugural Addresses from George Washington to Barack Obama.
Fun with NLTK, pygal, and word_cloud
Analyze the full text, tag parts of speech, and provide word frequency distributions
Create a bar chart showing the parts of speech in the form of an .svg
Generate a wordcloud (independent from NLTK) in the form of a .png
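
To give a feel for what a word frequency distribution is, here is a minimal sketch that uses only the Python standard library. It is a stand-in for illustration, not the code in presinaug-nltk.py; the `word_frequencies` helper and the sample sentence are made up for this example.

```python
import re
from collections import Counter


def word_frequencies(text, top_n=5):
    """Return the top_n most common lowercase words in text (hypothetical helper)."""
    # Lowercase the text and keep runs of letters and apostrophes as words.
    words = re.findall(r"[a-z']+", text.lower())
    return Counter(words).most_common(top_n)


# Made-up sample text, not taken from the corpus.
sample = "We the people, in order to form a more perfect union, the people of the United States"
top = word_frequencies(sample, top_n=2)
print(top)
```

NLTK offers a richer frequency distribution object for the same idea; the sketch above only shows the counting step.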
- wget the full set of texts
- strip off the Gutenberg-specific language (but keep the original file for attribution's sake)
- run the presinaug-nltk.py
- run the presinaug-charts.py
- run the presinaug-cloud.py
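
The stripping step above could be sketched roughly as follows. This assumes the downloaded file uses the `*** START OF ...` and `*** END OF ...` marker lines found in many Project Gutenberg texts; the exact wording varies between files, so check your copy first. The `strip_gutenberg` helper is hypothetical and is not one of the scripts in this repository.

```python
def strip_gutenberg(text):
    """Return the body between the Gutenberg START and END marker lines.

    Assumption: marker lines begin with '*** START OF' and '*** END OF'.
    If a marker is missing, that side of the text is left untouched.
    """
    lines = text.splitlines()
    start, end = 0, len(lines)
    for i, line in enumerate(lines):
        if line.startswith("*** START OF"):
            start = i + 1  # body begins on the line after the marker
        elif line.startswith("*** END OF"):
            end = i  # body stops just before the marker
            break
    return "\n".join(lines[start:end]).strip()


# Made-up miniature input that mimics the layout of a Gutenberg file.
raw = "\n".join([
    "License preamble",
    "*** START OF THIS PROJECT GUTENBERG EBOOK ***",
    "",
    "Fellow-Citizens of the Senate and of the House of Representatives:",
    "",
    "*** END OF THIS PROJECT GUTENBERG EBOOK ***",
    "License footer",
])
body = strip_gutenberg(raw)
print(body)
```

Writing the result to a new file, rather than overwriting the download, keeps the original around for attribution.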
Though these are disparate tools, you can install each to run the component scripts manually. Start with:
pip install Cython
The line above will go into an install_requires section in a proper setup.py (coming soon).
After that, run:
pip install -r requirements.txt
which should list all the component libraries you'll need.
- Ralph Bean [email protected]
- Nate Case [email protected]
- Remy DeCausemaker [email protected]