Other datasets that could be useful include those used for disfluency detection, in particular Disfl-QA, which pairs each original question (derived from SQuADv2) with a human-altered version containing added disfluencies.
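As a rough sketch of how such original/disfluent pairs could be inspected, the snippet below aligns the two versions token by token and lists what the disfluent version adds. The `added_tokens` helper, the `original`/`disfluent` key names, and the example pair are all illustrative assumptions, not taken from the Disfl-QA distribution.

```python
import difflib


def added_tokens(original: str, disfluent: str) -> list[str]:
    """Return lowercased tokens that appear in the disfluent text
    but are not aligned to any token of the original text."""
    orig_tokens = original.lower().split()
    disf_tokens = disfluent.lower().split()
    matcher = difflib.SequenceMatcher(a=orig_tokens, b=disf_tokens, autojunk=False)
    added = []
    for tag, _i1, _i2, j1, j2 in matcher.get_opcodes():
        # "insert" and "replace" spans cover tokens on the disfluent side
        # that have no exact counterpart in the original.
        if tag in ("insert", "replace"):
            added.extend(disf_tokens[j1:j2])
    return added


# Made-up pair for illustration only; key names are an assumption.
pair = {
    "original": "When was the bridge built?",
    "disfluent": "Who built no wait when was the bridge built?",
}

print(added_tokens(pair["original"], pair["disfluent"]))
```

Whitespace tokenization is deliberately crude here; a real comparison would probably want a proper tokenizer and some handling of punctuation.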
I've tried to summarize the existing datasets from the literature here.
None of these fulfil all of our requirements. The WMT dataset provides a "gold standard" human score but no reference translation.
DISCO could provide an easy way to show the effect of different disfluency types. The dataset as distributed does not provide translations of the original imperfect speech, only fluent English translations, so we would need to pick a translation model to produce these.
Are there any existing datasets derived from human speech we could use?