DirichletModel1WithTopics

Bayesian implementation of IBM Model 1 with conditioning on latent topics.

To run: python model1withtopics.py --num_iterations 100 --num_topics 2 --alpha 0.1 --beta0 0.00002 --beta1 1.0 --nonull fake_data.txt fakeOutput > output-fake.txt

To analyze results: python analyze.py fake_data.txt fakeOutput/ 100

The fake_data.txt file contains example sentence pairs in English and pseudo-French. Note that the 'French' is simplified to exaggerate the effect of grammatical gender. When applying our model to this data, we expect to recover two "topics", one representing masculine words and one representing feminine words.
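To make the expected gender effect concrete, here is a minimal sketch of a topic-conditioned translation table with a symmetric Dirichlet prior. It is not taken from model1withtopics.py: the class name, the translation direction (French word given English word and topic), the toy counts, and the single `beta` concentration parameter are all assumptions made for illustration, and `beta` is not claimed to correspond to the script's --beta0 or --beta1 flags.

```python
class TopicTranslationTable:
    """Hypothetical table of (topic, English word, French word) counts.

    Probabilities are the standard Dirichlet-multinomial posterior
    predictive: (count + beta) / (total + beta * vocabulary size).
    """

    def __init__(self, french_vocab_size, beta):
        self.french_vocab_size = french_vocab_size
        self.beta = beta
        self.pair_counts = {}     # (topic, e, f) -> count
        self.english_counts = {}  # (topic, e) -> count

    def add(self, topic, e, f, n=1):
        """Record n alignments of French word f to English word e under topic."""
        self.pair_counts[(topic, e, f)] = self.pair_counts.get((topic, e, f), 0) + n
        self.english_counts[(topic, e)] = self.english_counts.get((topic, e), 0) + n

    def prob(self, f, e, topic):
        """Smoothed estimate of p(f | e, topic)."""
        numerator = self.pair_counts.get((topic, e, f), 0) + self.beta
        denominator = (self.english_counts.get((topic, e), 0)
                       + self.beta * self.french_vocab_size)
        return numerator / denominator


# Toy counts (invented for illustration): topic 0 plays the role of the
# "masculine" topic and topic 1 the "feminine" one.
french_vocab = ["le", "la", "chat", "maison"]
table = TopicTranslationTable(french_vocab_size=len(french_vocab), beta=0.1)
table.add(0, "the", "le", 5)
table.add(1, "the", "la", 5)

for topic in (0, 1):
    for f in ("le", "la"):
        print(topic, f, round(table.prob(f, "the", topic), 4))
```

Under these toy counts, "the" translates mostly to "le" in topic 0 and mostly to "la" in topic 1, which is the kind of separation the fake data is designed to expose. A small `beta` keeps each distribution sparse, while a large one pulls both topics toward a uniform distribution over the French vocabulary.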


cmu-mtlab/dirichlet-model1-with-topics
