Update src\llamafactory\train\sft\metric.py #4877

Open
wants to merge 1 commit into
base: main

@01WarpDrive commented Jul 18, 2024

What does this PR do?

This PR improves the inputs passed to the ROUGE and BLEU metrics and adds support for evaluating English data.
ROUGE and BLEU scores can now be computed more accurately, and the word segmentation method is selected automatically depending on whether the data set is Chinese or English.

For Chinese data

The ComputeSimilarity class in metric.py appears to be designed specifically for Chinese data sets.
In my opinion, the current evaluation code has two problems with Chinese data:

  1. Special characters such as punctuation are not removed.
  2. The argument currently passed to sentence_bleu is a list of individual Chinese characters. In practice it is better to pass Chinese words.

For example:

label = "你好!世界。"
reference = ['你好', '!', '世界', '。']
list(label) = ['你', '好', '!', '世', '界', '。']
  • Punctuation tokens are counted as matches, which distorts the ROUGE/BLEU scores.
  • The argument passed to sentence_bleu should preferably be ['你好', '世界'], i.e. whole words with punctuation removed.

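The punctuation point above can be illustrated with a small sketch. This is not the code in this PR; `is_punctuation` and `strip_punctuation` are hypothetical helper names, and the sketch assumes that "punctuation" means any character in a Unicode `P*` category (so currency or math symbols would not be removed).

```python
import unicodedata
from typing import List


def is_punctuation(token: str) -> bool:
    """Return True if the token consists only of punctuation or whitespace."""
    return all(
        unicodedata.category(ch).startswith("P") or ch.isspace() for ch in token
    )


def strip_punctuation(tokens: List[str]) -> List[str]:
    """Drop empty tokens and tokens made up entirely of punctuation."""
    return [tok for tok in tokens if tok and not is_punctuation(tok)]


# Word-level tokens, as in the `reference` example above.
word_level = ['你好', '!', '世界', '。']
# Character-level tokens, as produced by list(label).
char_level = list("你好!世界。")

print(strip_punctuation(word_level))  # ['你好', '世界']
print(strip_punctuation(char_level))  # ['你', '好', '世', '界']
```

With punctuation removed, the word-level list matches the preferred argument described above, while the character-level list still splits each word into single characters.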
For English data

In addition, the current code has problems when evaluating English data, caused in part by jieba word segmentation, among other reasons.

For example, the argument currently passed to sentence_bleu is a list of individual English letters rather than words, which does not match the documented usage of nltk/translate/bleu_score.

So I added code to support evaluating English data.
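A rough sketch of how segmentation could be selected per language is shown below. It is an illustration under stated assumptions, not the implementation in this PR: `contains_chinese` and `tokenize` are hypothetical names, Chinese is detected by the presence of any character in the basic CJK Unified Ideographs range, and the Chinese segmenter is passed in as a callable (with jieba installed, `jieba.lcut` could be supplied; the default here falls back to single characters so the sketch runs without extra dependencies).

```python
import re
from typing import Callable, List

# Assumption: a character in U+4E00..U+9FFF marks the text as Chinese.
CJK_PATTERN = re.compile(r"[\u4e00-\u9fff]")
# English words: runs of letters/digits, with an optional apostrophe suffix.
WORD_PATTERN = re.compile(r"[A-Za-z0-9]+(?:'[A-Za-z]+)?")
# Used to keep only tokens that contain at least one word character.
HAS_WORD_CHAR = re.compile(r"\w")


def contains_chinese(text: str) -> bool:
    """Return True if the text contains at least one CJK ideograph."""
    return CJK_PATTERN.search(text) is not None


def tokenize(text: str, zh_segmenter: Callable[[str], List[str]] = list) -> List[str]:
    """Tokenize Chinese text with zh_segmenter and English text into words.

    Punctuation-only tokens are dropped in both cases.
    """
    if contains_chinese(text):
        return [tok for tok in zh_segmenter(text) if HAS_WORD_CHAR.search(tok)]
    return WORD_PATTERN.findall(text)


print(tokenize("Hello, world!"))   # ['Hello', 'world']
print(len(list("Hello, world!")))  # 13 single-character tokens
print(tokenize("你好!世界。"))       # ['你', '好', '世', '界'] with the default segmenter
```

The resulting token lists have the shape that sentence_bleu expects: a list of reference token lists and one hypothesis token list, each made of words rather than letters.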

Fixes # (issue)

Before submitting

The input parameters of bleu are optimized. Added the ability to evaluate English data.
@hiyouga added the "pending" label (This problem is yet to be addressed) Jul 18, 2024