- Python 3
- Jupyter Notebook
- scikit-learn (sklearn)
- NumPy
- pandas
- NLTK
- statistics
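
As a quick check before running the exercises, the following minimal sketch reports which of the libraries listed above cannot be imported. The helper name `missing_dependencies` is hypothetical, and the import names (`sklearn`, `numpy`, `pandas`, `nltk`) are assumed to be the usual ones for these packages; `statistics` ships with the Python 3 standard library.

```python
import importlib.util

# Import names for the dependencies listed above (assumed, not taken from the
# exercise files). `statistics` is part of the Python 3 standard library.
REQUIRED = ["sklearn", "numpy", "pandas", "nltk", "statistics"]


def missing_dependencies(names=REQUIRED):
    """Return the names from `names` that cannot be found on the import path."""
    return [name for name in names if importlib.util.find_spec(name) is None]


if __name__ == "__main__":
    missing = missing_dependencies()
    if missing:
        print("Missing libraries:", ", ".join(missing))
    else:
        print("All dependencies are available.")
```

If anything is reported as missing, install it with your usual package manager before running the exercise files.
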
- datafile.txt: The dataset. It should be placed in the same folder as the Python files.
- Exercise_I_10_models.py: The comparison of 10 models for Exercise I. After installing all the dependencies, run it with the command
    python Exercise_I_10_models.py
- Exercise_I.ipynb: The stratified cross-validation (CV) comparison for Exercise I. After installing the Jupyter Notebook server, run
    jupyter notebook
  on the command line to start the server, then open and execute the file in the Jupyter web interface. See http://jupyter.org/install for detailed installation instructions.
- Exercise_II.ipynb: The document retrieval for Exercise II. The file can be opened and executed using a Jupyter Notebook server.