Uses code from this repo. Download the datasets into the datasets folder. You may also have to download the NLTK stopwords...
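
The setup steps above can be sketched in Python roughly as follows. This is a minimal sketch, not the repo's own setup code: the helper names `ensure_datasets_dir` and `ensure_stopwords` are hypothetical, the datasets folder is assumed to sit relative to the working directory, and English is assumed as the stopword language. The datasets themselves still have to be downloaded by hand.

```python
from pathlib import Path


def ensure_datasets_dir(base="."):
    """Create the datasets folder under `base` if it is missing and return its path.

    Hypothetical helper: the folder name comes from the note above, but its
    location relative to the code is an assumption.
    """
    path = Path(base) / "datasets"
    path.mkdir(parents=True, exist_ok=True)
    return path


def ensure_stopwords(language="english"):
    """Return the NLTK stopword list, downloading the corpus only if it is absent.

    Hypothetical helper: the language default is an assumption.
    """
    # Imported lazily so the folder helper works even without NLTK installed.
    import nltk

    try:
        # Raises LookupError when the corpus is not available locally.
        nltk.data.find("corpora/stopwords")
    except LookupError:
        # Needs network access; fetches into NLTK's default data directory.
        nltk.download("stopwords")

    from nltk.corpus import stopwords

    return stopwords.words(language)
```

Calling `ensure_datasets_dir()` once before copying the data in, and `ensure_stopwords()` once before running the code, should cover both steps. The stopword download only happens on the first run, when the lookup fails.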