Use a bigram HMM to tag the parts of speech in a sentence.
There are two methods to handle unknown words:
- Replace them with an Unknown symbol
- Replace them with a pseudoword symbol
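The two methods above can be sketched as follows. This is a minimal illustration and is not taken from main.py: the function names, the `<UNK>` symbol, and the pseudoword classes (`<twoDigitNum>`, `<allCaps>`, and so on) are assumptions chosen for the example.

```python
import re

def pseudoword(token):
    """Map an unknown token to a coarse pseudoword class (hypothetical classes)."""
    if re.fullmatch(r"\d{2}", token):
        return "<twoDigitNum>"
    if re.fullmatch(r"\d{4}", token):
        return "<fourDigitNum>"
    if re.fullmatch(r"[\d.,]+", token) and any(c.isdigit() for c in token):
        return "<otherNum>"
    if token.isupper():
        return "<allCaps>"
    if token[:1].isupper():
        return "<initCap>"
    if token.islower():
        return "<lowercase>"
    return "<other>"

def replace_unknowns(tokens, vocab, use_pseudowords=True):
    """Keep in-vocabulary tokens; replace the rest using method 1 or method 2."""
    replaced = []
    for token in tokens:
        if token in vocab:
            replaced.append(token)
        elif use_pseudowords:
            replaced.append(pseudoword(token))  # method 2: pseudoword symbol
        else:
            replaced.append("<UNK>")            # method 1: single Unknown symbol
    return replaced
```

A pseudoword keeps some surface information about the unknown word (capitalization, digits), which a single Unknown symbol discards.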
The following commands use method 2 (the pseudoword symbol).
Run them from inside the src directory:
python main.py -a vocab -i path/to/input/data/ -w path/to/word/vocab/file -t path/to/tag/vocab/file
python main.py -a train -i path/to/train/file/ -d path/to/dev/file -w path/to/word/vocab/file -t path/to/tag/vocab/file
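For readers unfamiliar with how a bigram HMM assigns tags, decoding is commonly done with the Viterbi algorithm. The sketch below is a generic illustration, not the code in main.py: the function name, the dictionary layout of `trans` and `emit`, the `<s>` start symbol, and the probability floor for missing entries are all assumptions, and no end-of-sentence transition is modeled.

```python
import math

def viterbi(words, tags, trans, emit, start="<s>"):
    """Return the most likely tag sequence for words under a bigram HMM.

    trans[(prev_tag, tag)] and emit[(tag, word)] hold probabilities.
    Missing entries fall back to a tiny floor value (an assumption of this sketch).
    """
    if not words:
        return []
    floor = 1e-12

    def logp(table, key):
        return math.log(table.get(key, floor))

    # best[i][t]: best log-probability of any tag path ending in tag t at position i
    best = [{t: logp(trans, (start, t)) + logp(emit, (t, words[0])) for t in tags}]
    back = []  # back[i][t]: best previous tag for tag t at position i + 1
    for word in words[1:]:
        scores, pointers = {}, {}
        for t in tags:
            prev = max(tags, key=lambda p: best[-1][p] + logp(trans, (p, t)))
            scores[t] = best[-1][prev] + logp(trans, (prev, t)) + logp(emit, (t, word))
            pointers[t] = prev
        best.append(scores)
        back.append(pointers)

    # Follow the back-pointers from the best final tag.
    path = [max(tags, key=lambda t: best[-1][t])]
    for pointers in reversed(back):
        path.append(pointers[path[-1]])
    return path[::-1]
```

Working in log space avoids numerical underflow on long sentences, since the path score is a product of many small probabilities.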