mkdir data data/raw data/master pred ckpt log
download the parallel corpora from europarl and untar them into the data/raw directory
cd src
./data.py
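The directory setup and the untar step above can also be scripted. A minimal sketch in Python, assuming the europarl downloads are .tgz archives collected in one folder; the function name prepare and its downloads argument are hypothetical and not part of this repo:

```python
import tarfile
from pathlib import Path

# working directories named in the mkdir step above
DIRS = ["data/raw", "data/master", "pred", "ckpt", "log"]

def prepare(root=".", downloads=None):
    """Create the working directories and untar archives into data/raw.

    `downloads` is a hypothetical folder holding the downloaded *.tgz
    archives; nothing is extracted when it is None.
    """
    root = Path(root)
    for d in DIRS:
        # parents=True also creates data/, exist_ok makes reruns harmless
        (root / d).mkdir(parents=True, exist_ok=True)
    extracted = []
    if downloads is not None:
        for archive in sorted(Path(downloads).glob("*.tgz")):
            with tarfile.open(archive) as tar:
                tar.extractall(root / "data" / "raw")
            extracted.append(archive.name)
    return extracted
```

Calling prepare() with no arguments only creates the directories; the return value lists the archives that were unpacked.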
- run scripts train[0-5].py in succession, where [0-5] is the trial number
  - the checkpoints will be saved in ckpt
  - the tensorboard summaries will be saved in log
  - both are named by the pattern m[0-5]_
  - a checkpoint number is also appended for the checkpoints
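To see which checkpoint numbers exist for a trial, the naming pattern above can be parsed. A sketch under assumptions: it supposes the file names in ckpt start with m, the trial number, an underscore, and the checkpoint number, possibly followed by a suffix. The exact file names are not stated here, and checkpoint_numbers is a hypothetical helper, not part of this repo:

```python
import re
from pathlib import Path

def checkpoint_numbers(trial, ckpt_dir="ckpt"):
    """Return the sorted checkpoint numbers found for one trial."""
    # assumed name shape: m<trial>_<number>, optionally followed by a suffix
    pattern = re.compile(rf"^m{trial}_(\d+)(?:\D|$)")
    numbers = set()
    for path in Path(ckpt_dir).iterdir():
        match = pattern.match(path.name)
        if match:
            numbers.add(int(match.group(1)))
    return sorted(numbers)
```

The largest number returned is one candidate for the checkpoint number that has to be set before evaluation.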
- run eval_all.py for translating between all language pairs
  - run eval_nl_da.py for translating only between dutch and danish
  - the trial number (C.trial) and the checkpoint number (C.ckpt or ckpt) need to be set first
  - the translations for the evaluation set will be saved in pred
- run sacrebleu --force -tok intl -b -i with the path to the predicted translation
  - and the path to the reference translation (saved as data/master/eval_*.txt by data.py)
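The scoring step above can be wrapped for repeated use. A minimal sketch, assuming sacrebleu is installed and on the PATH; sacrebleu_command and score are hypothetical helpers, and the file names passed to them are up to the caller since the names under pred are not specified here:

```python
import subprocess

def sacrebleu_command(pred_path, ref_path):
    """Build the sacrebleu call from the step above as an argument list."""
    # -i takes the predicted translation, the reference follows as a positional argument
    return ["sacrebleu", "--force", "-tok", "intl", "-b", "-i", pred_path, ref_path]

def score(pred_path, ref_path):
    """Run sacrebleu and return the score it prints with -b."""
    result = subprocess.run(
        sacrebleu_command(pred_path, ref_path),
        capture_output=True, text=True, check=True,
    )
    # with -b only the score is printed, so the output parses as a number
    return float(result.stdout.strip())
```

Building the argument list separately keeps the command easy to inspect before it is run.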