Thank you for your work and sharing!
It seems Matcha-TTS and VoiceFlow-TTS (https://github.com/X-LANCE/VoiceFlow-TTS) are very similar?
What are the main differences between these two methods?
And how do they compare on voice quality (for example, prosody) and on inference speed?
They are! I met the author @cantabile-kwok just last week (super nice guy). It is interesting that we both made design decisions to improve speed relative to plain conditional flow matching. Their way of speeding things up is to "rectify" the paths learned by flow matching, which is a two-step approach and quite effective. We felt that the same speedup could be achieved by improving the architecture instead, so we improved the U-Net architecture and got a similar speedup.
The two methods are solving a similar problem in different ways. You can certainly "rectify" the paths on top of Matcha-TTS's architecture as well, for an even greater speedup :)
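For readers unfamiliar with the "rectify" step, here is a rough, illustrative sketch in plain Python. It is not code from either repository. The function names (`cfm_pair`, `euler_sample`, `rectified_pairs`) are hypothetical, the velocity field is a toy stand-in for a trained network, and the straight-line path below is the simplest form of conditional flow matching (it ignores any minimum-noise term a real implementation may use).

```python
def cfm_pair(x0, x1, t):
    """Straight-line flow matching: point on the path and its regression target."""
    # Interpolate between noise x0 and data x1 at time t in [0, 1].
    xt = [(1.0 - t) * a + t * b for a, b in zip(x0, x1)]
    # The velocity along a straight path is constant: x1 - x0.
    target = [b - a for a, b in zip(x0, x1)]
    return xt, target


def euler_sample(velocity, x0, n_steps):
    """Integrate dx/dt = velocity(x, t) from t=0 to t=1 with fixed-step Euler."""
    x = list(x0)
    dt = 1.0 / n_steps
    for i in range(n_steps):
        v = velocity(x, i * dt)
        x = [xi + dt * vi for xi, vi in zip(x, v)]
    return x


def rectified_pairs(velocity, noise_batch, n_steps=100):
    """Step two of rectification (sketch): pair each noise sample with the
    output the first-stage model maps it to. A second model would then be
    trained with cfm_pair on these coupled (noise, output) pairs."""
    return [(z, euler_sample(velocity, z, n_steps)) for z in noise_batch]


# Toy stand-in for a trained network (assumption): a constant velocity field,
# i.e. perfectly straight paths, for which one Euler step is already exact.
toy_velocity = lambda x, t: [1.0, 2.0]
one_step = euler_sample(toy_velocity, [0.0, 0.0], 1)
many_steps = euler_sample(toy_velocity, [0.0, 0.0], 4)
```

The point of the toy example is that when the learned paths are straight, a single Euler step lands in the same place as many steps, which is why straightening the paths reduces the number of solver steps needed at inference.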