According to the paper, for Self-BLEU each generated sentence should be compared against all the *other* generations as references.
The current Self-BLEU implementation, however, includes the selected hypothesis in the list of references. This risks inflating the Self-BLEU scores, because there will always be an exact match between the hypothesis and one of the references. For reference, here is the current `get_bleu` method:
```python
def get_bleu(self):
    ngram = self.gram
    bleu = list()
    reference = self.get_reference()
    weight = tuple((1. / ngram for _ in range(ngram)))
    with open(self.test_data) as test_data:
        for hypothesis in test_data:
            hypothesis = nltk.word_tokenize(hypothesis)
            bleu.append(nltk.translate.bleu_score.sentence_bleu(reference, hypothesis, weight,
                                                                smoothing_function=SmoothingFunction().method1))
    return sum(bleu) / len(bleu)
```
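To make the inflation concrete, here is a toy example (the sentences are made up, and I use plain `str.split()` instead of `nltk.word_tokenize` just to keep it self-contained). When the hypothesis is also in the reference list, every modified n-gram precision is 1.0 and the brevity penalty is 1, so the sentence gets a perfect score regardless of how diverse the set actually is:

```python
from nltk.translate.bleu_score import sentence_bleu, SmoothingFunction

# made-up toy sentences, tokenized with a plain split
sentences = [
    "the cat sat on the mat",
    "a dog ran in the park",
    "birds fly over the hills",
]
tokenized = [s.split() for s in sentences]

hypothesis = tokenized[0]
weights = (1. / 3, 1. / 3, 1. / 3)
smooth = SmoothingFunction().method1

with_self = sentence_bleu(tokenized, hypothesis, weights, smoothing_function=smooth)
without_self = sentence_bleu(tokenized[1:], hypothesis, weights, smoothing_function=smooth)

print(with_self)     # 1.0 -- the hypothesis matches itself exactly
print(without_self)  # much lower -- only overlap with the other sentences counts
```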
Should we remove the target hypothesis from the set of references, or am I missing something here? A leave-one-out sketch of what I mean is below.
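For what it's worth, this is a minimal sketch of the leave-one-out variant I have in mind. The function name and the in-memory `sentences` argument are mine, not the repo's API (the repo reads hypotheses from `self.test_data`); it just shows each sentence being scored only against the others:

```python
import nltk
from nltk.translate.bleu_score import sentence_bleu, SmoothingFunction


def leave_one_out_self_bleu(sentences, ngram=3):
    """Average BLEU of each sentence against all *other* sentences."""
    weights = tuple(1. / ngram for _ in range(ngram))
    smooth = SmoothingFunction().method1
    tokenized = [nltk.word_tokenize(s) for s in sentences]  # needs the nltk 'punkt' data

    scores = []
    for i, hypothesis in enumerate(tokenized):
        # exclude the hypothesis itself from the reference set
        references = tokenized[:i] + tokenized[i + 1:]
        scores.append(sentence_bleu(references, hypothesis, weights,
                                    smoothing_function=smooth))
    return sum(scores) / len(scores)
```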
Thanks in advance for the help!