5aurabhpathak/hi_en_hybrid_machine_translation

Master's thesis

Title: Improving Accuracy in Hindi to English Machine Translation with a Hybrid Approach

Abstract

Machine Translation remains a largely unsolved problem. Many approaches have been proposed over the past few decades, but all of them fall short of the accuracy required to ‘precisely’ translate an arbitrary sentence. The work presented in this thesis addresses the translation problem for Hindi to English. Only a few attempts are known for this direction of the language pair. The most recent of them suggested a hybrid approach, which is also the focus of this thesis. The proposed system aims to improve the accuracy of Hindi to English Machine Translation on long as well as short sentences. The hybrid method presented in this work is corpus-guided: it uses Statistical Machine Translation as the core component, augmented by example-based and rule-based sub-systems. The end results demonstrate that the proposed system achieves high standards of accuracy in translating simple, compound, and complex sentences from Hindi to English.

THESIS SUBMITTED IN 2017 AND ACCEPTED.

Current status - Complete

  • Moses baseline complete.
  • EBMT complete.
  • RBMT complete.
  • Transliteration complete.
  • Web UI for Hindi-to-English translation complete.

External dependencies

Make sure Moses is installed, along with all of its dependencies.
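A quick way to check for the dependency before starting is sketched below. It is not part of this repository: the helper name `find_binary` is hypothetical, and it assumes the Moses decoder executable is called `moses` and is either on `PATH` or in a directory you pass in. Many Moses builds leave the binary under the build tree's `bin` directory rather than on `PATH`, so a negative result here does not prove Moses is missing.

```python
import os
import shutil
from typing import Optional, Sequence


def find_binary(name: str = "moses", extra_dirs: Sequence[str] = ()) -> Optional[str]:
    """Return the full path of an executable called `name`, or None.

    Looks on PATH first, then in any extra directories supplied by the caller.
    """
    found = shutil.which(name)
    if found:
        return found
    if extra_dirs:
        # shutil.which accepts a search path as an os.pathsep-separated string.
        return shutil.which(name, path=os.pathsep.join(extra_dirs))
    return None


if __name__ == "__main__":
    path = find_binary()
    if path:
        print(f"moses found at {path}")
    else:
        print("moses not found on PATH; pass its directory via extra_dirs")
```

If Moses was built from source, pass the directory that holds the decoder binary through `extra_dirs`.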

Instructions

Run server.py, then open localhost:5000 in a browser to use the GUI.
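If the page does not load, it can help to confirm that something is actually listening on port 5000. The sketch below is an illustration only and is not part of this repository; the helper name `is_port_open` is hypothetical, and it checks only that a TCP connection succeeds, not that the translation GUI itself is working.

```python
import socket


def is_port_open(host: str = "localhost", port: int = 5000, timeout: float = 1.0) -> bool:
    """Return True if a TCP connection to host:port can be established."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers connection refused, timeouts, and name resolution failures.
        return False


if __name__ == "__main__":
    if is_port_open():
        print("Something is listening on localhost:5000")
    else:
        print("Nothing is listening on localhost:5000; is server.py running?")
```

Run it in a second terminal after starting server.py.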

Contact me

[email protected]
