Translate text with a Transformer sequence-to-sequence architecture using only characters, implemented in PyTorch
Lightweight text translation
Most implementations of text translation use word-level or byte-pair tokenization with vocabularies of 1,000+ tokens as input, which takes ages to train. This implementation uses only the characters in the text data (fewer than 300 characters) as the vocabulary, so I can train everything from scratch in under 30 minutes on a GTX 1060 (a minimal vocabulary sketch follows the list below). In this implementation you will find:
- A Transformer and self-attention re-implementation in PyTorch in `encode_decode_transformer.py` (see the attention sketch below)
- A compact, fast, greedy beam search implementation at the end of the model (see the decoding sketch below)
- Boilerplate for training in `trainer.py`
- A Jupyter notebook to run the model step by step
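
To make the character-level idea above concrete, here is a minimal sketch of what such a vocabulary could look like; the class and method names are hypothetical and not the repo's actual code:

```python
# Hypothetical sketch of a character-level vocabulary; not the repo's code.
class CharVocab:
    def __init__(self, texts):
        specials = ["<pad>", "<sos>", "<eos>"]
        chars = sorted(set("".join(texts)))   # every distinct character seen
        self.itos = specials + chars          # index -> token
        self.stoi = {t: i for i, t in enumerate(self.itos)}

    def encode(self, text):
        # Bracket each sentence with <sos>/<eos>, one id per character.
        return [self.stoi["<sos>"]] + [self.stoi[c] for c in text] + [self.stoi["<eos>"]]

    def decode(self, ids):
        # Drop the three special tokens when turning ids back into text.
        return "".join(self.itos[i] for i in ids if i > 2)


vocab = CharVocab(["xin chào", "hello world"])
print(len(vocab.itos))                        # tens of symbols instead of 1000+ tokens
print(vocab.decode(vocab.encode("xin chào")))
```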
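
The self-attention core in `encode_decode_transformer.py` boils down to scaled dot-product attention; a textbook sketch (not the repo's exact code) is:

```python
import math
import torch
import torch.nn.functional as F

def scaled_dot_product_attention(q, k, v, mask=None):
    # softmax(Q K^T / sqrt(d)) V, the standard formulation.
    d = q.size(-1)
    scores = q @ k.transpose(-2, -1) / math.sqrt(d)   # [..., q_len, k_len]
    if mask is not None:
        scores = scores.masked_fill(mask == 0, float("-inf"))
    return F.softmax(scores, dim=-1) @ v

q = k = v = torch.randn(1, 4, 16)                     # [batch, seq_len, d_model]
out = scaled_dot_product_attention(q, k, v)           # same shape as v
```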
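
For decoding, the simplest case of the search at the end of the model is greedy (beam width 1) decoding. This is a hedged sketch assuming a `model(src, tgt) -> logits` interface with output shape `[batch, tgt_len, vocab]`; the repo's actual API may differ:

```python
import torch

@torch.no_grad()
def greedy_decode(model, src_ids, sos_id, eos_id, max_len=128):
    # Assumed interface: model(src, tgt) -> [batch, tgt_len, vocab] logits.
    model.eval()
    src = torch.tensor([src_ids])                 # [1, src_len]
    out = torch.tensor([[sos_id]])                # start from <sos>
    for _ in range(max_len):
        logits = model(src, out)
        next_id = logits[0, -1].argmax().item()   # most likely next character
        out = torch.cat([out, torch.tensor([[next_id]])], dim=1)
        if next_id == eos_id:                     # stop once <eos> is emitted
            break
    return out[0].tolist()
```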
Sample translation results
| Input (Vietnamese) | Model output |
| --- | --- |
| thời tiết hôm nay thật đẹp! | `<sos>the weather is concerned! it's beautiful!<eos>` |
| xin chào | `<sos>please come along<eos>` |
| bạn đã ăn sáng chưa? | `<sos>have you eaten the morning, didn't you?<eos>` |
This implementation is inspired by Andrej Karpathy's minGPT.