Often, what is done when translating something to be dubbed is adjusting the translation - not the speed - to fit the audio constraints. It's usually possible to say the same thing in multiple ways - meme for reference - while keeping the original meaning or something close to it.
There are articles about controlling the output length of a machine translation, such as this one (I remember reading others but could not find them again; a better one is in the Edit below), which can be researched for that.
One idea I had and tested is using the fact that many libraries - such as Hugging Face's Transformers - support returning multiple candidate translations. An example: I used the "Helsinki-NLP/opus-mt-tc-big-itc-itc" model, set num_return_sequences=5 so it would return 5 translations, and translated "Ok" (like the meme) to Spanish. It returned "De acuerdo.", "Está bien.", "Bien.", "Muy bien." and "De acuerdo", which are mostly correct translations (well, at least from what I remember of studying Spanish a long time ago; by the way, the last one is just the first without the period).
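For reference, the call I used looks roughly like this (a sketch, assuming the transformers and sentencepiece packages are installed; this multilingual model picks the target language via a prefix token such as >>spa<< for Spanish):

```python
from transformers import pipeline

# Load the multilingual Romance-to-Romance model mentioned above.
translator = pipeline(
    "translation",
    model="Helsinki-NLP/opus-mt-tc-big-itc-itc",
)

# num_return_sequences asks for 5 candidate translations; beam search
# needs at least as many beams as sequences to return.
candidates = translator(
    ">>spa<< Ok",
    num_beams=5,
    num_return_sequences=5,
)

for c in candidates:
    print(c["translation_text"])
```

The candidates are ranked by beam score, so the first entry is the model's preferred translation and the rest are alternatives to choose from.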
One downside of this idea is that it restricts you to models supported by the library, and someone might prefer a proprietary translation model instead. In that case, one possibility is using a summarization model afterwards, to at least avoid having to speed up the dub voice to fit an overly long translation. Note that I haven't tried this yet, and there is a chance those models won't work well when summarizing short sentences.
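To sketch how the multiple candidates could feed into dubbing: a hypothetical helper that picks, among the returned translations, the one whose estimated spoken length best matches the original segment's duration. The function name and the characters-per-second speaking rate are made up for illustration, not from any library:

```python
# Rough, made-up speaking-rate constant used as a duration proxy.
CHARS_PER_SECOND = 15.0

def pick_best_fit(candidates: list[str], target_seconds: float) -> str:
    """Return the candidate whose estimated speaking time is closest
    to the duration of the original audio segment."""
    def estimated_seconds(text: str) -> float:
        return len(text) / CHARS_PER_SECOND
    return min(candidates, key=lambda t: abs(estimated_seconds(t) - target_seconds))

# Example with the translations returned above for "Ok":
options = ["De acuerdo.", "Está bien.", "Bien.", "Muy bien.", "De acuerdo"]
print(pick_best_fit(options, target_seconds=0.4))  # → "Bien."
```

A real implementation would probably estimate duration with the TTS engine itself rather than a character count, but the selection logic would be the same.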
Edit: this 2021 paper from Amazon AI addresses a lot of things related to this project. Its references are quite good too.