Use a Bigram HMM to tag the parts of speech in a sentence

Shoumik-Gandre/parts-of-speech

parts-of-speech

Use a Bigram HMM to tag the parts of speech in a sentence
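For orientation: a bigram HMM scores a tagged sentence as the product of transition probabilities P(tag_i | tag_{i-1}) and emission probabilities P(word_i | tag_i), and the best tag sequence is commonly found with the Viterbi algorithm. The sketch below illustrates that decoding step on hand-written toy tables. The function name `viterbi`, the `<s>` start tag, and the nested-dict table layout are assumptions made for illustration, not necessarily how this repository implements it.

```python
import math


def viterbi(words, tags, trans, emit, start="<s>"):
    """Return the most probable tag sequence under a bigram HMM.

    trans[prev_tag][tag] and emit[tag][word] hold probabilities;
    missing entries are treated as probability zero (illustrative choice).
    """
    if not words:
        return []

    def logp(table, ctx, key):
        p = table.get(ctx, {}).get(key, 0.0)
        return math.log(p) if p > 0 else float("-inf")

    # best[i][t]: best log-probability of a tag sequence for words[:i+1] ending in t
    best = [{t: logp(trans, start, t) + logp(emit, t, words[0]) for t in tags}]
    back = []  # back[i][t]: best previous tag for tag t at position i + 1

    for word in words[1:]:
        scores, pointers = {}, {}
        for t in tags:
            prev = max(tags, key=lambda p: best[-1][p] + logp(trans, p, t))
            scores[t] = best[-1][prev] + logp(trans, prev, t) + logp(emit, t, word)
            pointers[t] = prev
        best.append(scores)
        back.append(pointers)

    # Follow the back-pointers from the best final tag.
    path = [max(tags, key=lambda t: best[-1][t])]
    for pointers in reversed(back):
        path.append(pointers[path[-1]])
    return path[::-1]


# Toy tables, invented for this example only.
toy_tags = ["DT", "NN", "VB"]
toy_trans = {
    "<s>": {"DT": 0.8, "NN": 0.2},
    "DT": {"NN": 1.0},
    "NN": {"VB": 0.7, "NN": 0.3},
    "VB": {"DT": 1.0},
}
toy_emit = {
    "DT": {"the": 1.0},
    "NN": {"dog": 0.6, "barks": 0.4},
    "VB": {"barks": 1.0},
}
print(viterbi(["the", "dog", "barks"], toy_tags, toy_trans, toy_emit))
```

Working in log space avoids underflow on long sentences, which is why the sketch adds log-probabilities instead of multiplying raw ones.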

About Preprocessing

There are two methods to handle unknown words:

  1. Replace every unknown word with a single unknown-word symbol.
  2. Replace each unknown word with a pseudoword symbol.

The commands below use method 2 (pseudowords).
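As a rough illustration of method 2, a pseudoword mapping typically buckets an out-of-vocabulary word by surface features such as digits, capitalization, or suffixes. The class names and rules below are hypothetical and were chosen only for this sketch; the classes this repository actually uses are not documented here.

```python
import re


def pseudoword(word: str) -> str:
    """Map an out-of-vocabulary word to a coarse pseudoword class.

    All class names here are hypothetical examples.
    """
    if re.fullmatch(r"\d+([.,]\d+)*", word):
        return "<num>"
    if any(ch.isdigit() for ch in word):
        return "<contains-digit>"
    if word.isupper():
        return "<all-caps>"
    if word[:1].isupper():
        return "<init-cap>"
    for suffix in ("ing", "ed", "ly"):
        if word.endswith(suffix):
            return f"<suffix-{suffix}>"
    return "<unk>"  # fallback, equivalent to method 1


def replace_unknowns(tokens, vocab):
    """Keep in-vocabulary tokens; replace the rest with their pseudoword."""
    return [tok if tok in vocab else pseudoword(tok) for tok in tokens]


print(replace_unknowns(["the", "Zorblax", "ran", "quickly"], {"the", "ran"}))
```

Compared with a single unknown symbol, this keeps some evidence about the word's likely tag, for example that a capitalized unknown word is more likely to be a proper noun.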

Commands

Run the following commands from inside the src directory:

Generate Vocabulary:

python main.py -a vocab -i path/to/input/data/ -w path/to/word/vocab/file -t path/to/tag/vocab/file
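The input and output file formats for this command are not described here, so the following is only a sketch of what vocabulary generation usually involves: counting words and tags in the training data and keeping words that occur often enough. The function name `build_vocab`, the `min_count` threshold, and the in-memory sentence format (lists of `(word, tag)` pairs) are all assumptions.

```python
from collections import Counter


def build_vocab(sentences, min_count=2):
    """Return (words, tags) from sentences given as lists of (word, tag) pairs.

    Words seen fewer than min_count times are left out, so they are
    treated as unknown later (the threshold is a hypothetical default).
    """
    word_counts = Counter(word for sent in sentences for word, _ in sent)
    tag_set = {tag for sent in sentences for _, tag in sent}
    words = sorted(w for w, c in word_counts.items() if c >= min_count)
    return words, sorted(tag_set)


example = [
    [("the", "DT"), ("dog", "NN")],
    [("the", "DT"), ("cat", "NN")],
]
words, tags = build_vocab(example)
print(words, tags)
```

Dropping rare words from the vocabulary means the model sees unknown-word symbols during training as well as at test time, which is what makes either unknown-word method usable.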

Train HMM:

python main.py -a train -i path/to/train/file/ -d path/to/dev/file -w path/to/word/vocab/file -t path/to/tag/vocab/file
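Training a bigram HMM generally comes down to counting tag-to-tag transitions and tag-to-word emissions, then normalizing the counts into probabilities. The sketch below shows unsmoothed maximum-likelihood estimates. The function name, the `<s>` start tag, and the data layout are assumptions; the repository's own training code may differ, for instance by applying smoothing or by using the dev file for evaluation.

```python
from collections import Counter, defaultdict

START = "<s>"  # hypothetical start-of-sentence tag


def train_bigram_hmm(sentences):
    """Estimate transition and emission tables from (word, tag) sentences."""
    trans_counts = defaultdict(Counter)  # trans_counts[prev_tag][tag]
    emit_counts = defaultdict(Counter)   # emit_counts[tag][word]

    for sent in sentences:
        prev = START
        for word, tag in sent:
            trans_counts[prev][tag] += 1
            emit_counts[tag][word] += 1
            prev = tag

    def normalize(counts):
        # Turn each row of counts into a probability distribution.
        return {
            ctx: {key: c / sum(row.values()) for key, c in row.items()}
            for ctx, row in counts.items()
        }

    return normalize(trans_counts), normalize(emit_counts)


train_data = [
    [("the", "DT"), ("dog", "NN")],
    [("the", "DT"), ("cat", "NN")],
]
trans, emit = train_bigram_hmm(train_data)
print(trans["DT"], emit["NN"])
```

Unsmoothed estimates assign zero probability to any transition or emission not seen in training, which is one reason unknown words are replaced before counting.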
