
Token/words confidence #1092

Closed
Zeiny96 opened this issue May 24, 2023 · 5 comments

Zeiny96 commented May 24, 2023

Is there any available way to get each token's/word's confidence while using the modified beam search decoding method?


Zeiny96 commented May 28, 2023

I successfully added it, but the confidences for out-of-domain sets are very low even though the result itself is fair. Is this normal?

csukuangfj (Collaborator) commented:
Could you describe how you get the scores?


Zeiny96 commented May 29, 2023

I use the log_probs in the modified beam search and apply the exponential function to convert them out of log scale, as follows:

for k in range(len(topk_hyp_indexes)):
    ...
    # Carry over the confidences accumulated so far for this hypothesis.
    new_confidences = hyp.confidences[:]
    if new_token not in (blank_id, unk_id):
        ...
        new_confidences.append(torch.exp(topk_log_probs[k]).item())

    new_log_prob = topk_log_probs[k]
    new_hyp = Hypothesis(
        ys=new_ys,
        log_prob=new_log_prob,
        timestamp=new_timestamp,
        confidences=new_confidences,
    )
    B[i].add(new_hyp)

...
sorted_confidences = [h.confidences for h in best_hyps]
...
ans_confidences = []
unsorted_indices = packed_encoder_out.unsorted_indices.tolist()
for i in range(N):
    ans.append(sorted_ans[unsorted_indices[i]])
    ans_timestamps.append(sorted_timestamps[unsorted_indices[i]])
    ans_confidences.append(sorted_confidences[unsorted_indices[i]])

return DecodingResults(
    tokens=ans,
    timestamps=ans_timestamps,
    confidences=ans_confidences,
)

Then, to get each word's confidence, I average the probability values of all of its tokens:

confidence = sum(tokens_per_word_probs) / len(tokens_per_word_probs)
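As a minimal sketch of that averaging step (the function and variable names here, e.g. `word_confidences` and `word_ids`, are hypothetical and not part of icefall):

```python
from collections import defaultdict


def word_confidences(token_probs, word_ids):
    """Average per-token probabilities into per-word confidences.

    token_probs: per-token probabilities (already exp'd out of log scale).
    word_ids:    same length as token_probs, mapping each token to its word index.
    """
    grouped = defaultdict(list)
    for prob, wid in zip(token_probs, word_ids):
        grouped[wid].append(prob)
    # Simple arithmetic mean per word, as described above.
    return [sum(ps) / len(ps) for _, ps in sorted(grouped.items())]


# Two words: tokens 0-1 belong to word 0, token 2 to word 1.
print(word_confidences([0.9, 0.7, 0.5], [0, 0, 1]))  # ≈ [0.8, 0.5]
```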

csukuangfj (Collaborator) commented:

> new_confidences.append(torch.exp(topk_log_probs[k]).item())

topk_log_probs is cumulative. You have to subtract ys_log_probs from it to get the log_prob of the current frame.
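A small numeric sketch of this point, using plain floats in place of the tensors (the variable names `prev_log_prob` and `cum_log_prob` are illustrative, not from the icefall code):

```python
import math

# Suppose the hypothesis had cumulative log-prob -1.0, and after appending
# one more token the top-k cumulative log-prob is -1.5.
prev_log_prob = -1.0  # hyp.log_prob (the ys_log_probs entry) before this token
cum_log_prob = -1.5   # topk_log_probs[k], cumulative over the whole hypothesis

# Exponentiating the cumulative value shrinks toward 0 as the hypothesis
# grows, which would explain the very low confidences reported above.
wrong = math.exp(cum_log_prob)

# Subtracting the previous cumulative value first isolates this token.
token_log_prob = cum_log_prob - prev_log_prob  # -0.5
confidence = math.exp(token_log_prob)          # ≈ 0.607
```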


Zeiny96 commented May 30, 2023

Yeah, that was my bad, it is fixed now:

new_confidences.append(torch.exp(topk_log_probs[k] - hyp.log_prob).item())

Thanks
