Character-level machine translation Transformer sequence-to-sequence in PyTorch

Translate text with a Transformer sequence-to-sequence architecture using only characters, implemented in PyTorch.

Lightweight text translation

Most implementations of text translation use word-level or byte-pair tokens with vocabularies of 1,000+ entries, which take a long time to train. This implementation uses only the characters in the text data (fewer than 300) as the vocabulary, so everything can be trained from scratch in under 30 minutes on a GTX 1060. In this implementation you will find:

  • A Transformer and self-attention re-implementation in PyTorch in encode_decode_transformer.py
  • A compact, fast greedy beam search implementation at the end of the model
  • Training boilerplate in trainer.py
  • A Jupyter notebook to run the model step by step
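The character-level vocabulary idea above can be sketched in a few lines. This is an illustrative example, not the repo's actual code: the special-token names (`<pad>`, `<sos>`, `<eos>`) and function names are assumptions, though the `<sos>`/`<eos>` markers do appear in the sample outputs below.

```python
# Hypothetical sketch of a character-level vocabulary for translation.
# Special tokens and function names are illustrative, not from the repo.
def build_char_vocab(texts):
    specials = ["<pad>", "<sos>", "<eos>"]
    chars = sorted(set("".join(texts)))          # every distinct character
    itos = specials + chars                      # index -> string
    stoi = {s: i for i, s in enumerate(itos)}    # string -> index
    return stoi, itos

def encode(text, stoi):
    # Wrap the character ids in start/end-of-sequence markers.
    return [stoi["<sos>"]] + [stoi[c] for c in text] + [stoi["<eos>"]]

def decode(ids, itos):
    # Drop special tokens and join the remaining characters.
    return "".join(itos[i] for i in ids
                   if itos[i] not in ("<pad>", "<sos>", "<eos>"))
```

Because the vocabulary is just the set of characters seen in the corpus, it stays small (<300 entries here) regardless of corpus size, which is what keeps training cheap.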

Sample translation results

thời tiết hôm nay thật đẹp! | <sos>the weather is concerned! it's beautiful!<eos>
xin chào | <sos>please come along<eos>
bạn đã ăn sáng chưa? | <sos>have you eaten the morning, didn't you?<eos>
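The decoding loop that produces outputs like those above can be sketched as follows. This is a minimal greedy decoder (beam search with beam width 1); `step_fn` is a stand-in for the model's decoder step, which in the real implementation would also condition on the encoded source sentence. Token ids and names here are assumptions for illustration.

```python
# Illustrative greedy decoding loop. SOS/EOS ids and step_fn are
# placeholders, not the repo's actual interface.
SOS, EOS = 1, 2

def greedy_decode(step_fn, max_len=20):
    ids = [SOS]
    for _ in range(max_len):
        logits = step_fn(ids)                              # scores for next token
        nxt = max(range(len(logits)), key=logits.__getitem__)  # argmax
        ids.append(nxt)
        if nxt == EOS:                                     # stop at <eos>
            break
    return ids
```

Generation starts from `<sos>` and appends the highest-scoring token at each step until `<eos>` is emitted or a length limit is hit, which matches the `<sos>...<eos>` framing in the samples above.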

This implementation is inspired by Andrej Karpathy's minGPT.