
Skip silence around hallucinations #646

Conversation

trungkienbkhn
Collaborator

@trungkienbkhn trungkienbkhn force-pushed the skip-silence-around-hallucinations branch from 9efdffb to 5e94811 Compare January 16, 2024 12:24
@makaveli10
Contributor

@trungkienbkhn looks like this slows down the inference. Have you noticed the increase in latency?

@Purfview
Contributor

looks like this slows down the inference

Original author mentioned:

...since this also requires extra processing time, we only do this when a probable hallucination is detected.

@trungkienbkhn trungkienbkhn force-pushed the skip-silence-around-hallucinations branch from 5e94811 to beeb467 Compare January 25, 2024 07:03
@trungkienbkhn
Collaborator Author

trungkienbkhn commented Jan 25, 2024

@makaveli10 Hello, sorry for the late reply.
I tested a noisy mp3 audio file (192 seconds) with the tiny model on a CUDA device.
With hallucination_silence_threshold=2, the average total execution time is 5.38s.
With the original code (feature disabled), it is 4.87s.
My code:

from faster_whisper import WhisperModel

model = WhisperModel('tiny', device='cuda')
segments, info = model.transcribe(audio_path, word_timestamps=True, hallucination_silence_threshold=2)
segments = list(segments)  # transcribe() returns a lazy generator; iterate to actually run inference

=> Latency increased a bit, but I found that transcription quality also improved, so I think it's an acceptable trade-off and not too impactful.
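For context, the timings reported above work out to roughly a 10% relative overhead. A quick sanity check of the numbers (assuming the two averages were measured under comparable conditions):

```python
# Average total execution times reported above (tiny model, CUDA, 192 s mp3)
with_threshold = 5.38  # hallucination_silence_threshold=2
baseline = 4.87        # original code, feature disabled

overhead = (with_threshold - baseline) / baseline
print(f"Relative overhead: {overhead:.1%}")  # → Relative overhead: 10.5%
```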

@nguyendc-systran nguyendc-systran merged commit 0920672 into SYSTRAN:master Feb 20, 2024
3 checks passed