Commit a689ec4

Bump up the latex2sympy2_extended version + more tests (#510)
* extract matching
* better docstring
* lazy imports
* bump up math
* Update src/lighteval/metrics/dynamic_metrics.py

  Co-authored-by: Clémentine Fourrier <[email protected]>
* fix pr comments
* Apply suggestions from code review

  Co-authored-by: Clémentine Fourrier <[email protected]>
* rename comparisson -> comparison
* fix expr numbers extraction with currency or units
* add test for correct extraction of failed answer
* bump of latex2sympy2 version, add new tests for extract metric

---------

Co-authored-by: Clémentine Fourrier <[email protected]>
1 parent 64a9925 commit a689ec4

2 files changed: +28 -1 lines changed

pyproject.toml

Lines changed: 1 addition & 1 deletion
@@ -109,7 +109,7 @@ multilingual = [
     "jieba",  # for chinese tokenizer
     "pyvi",  # for vietnamese tokenizer
 ]
-math = ["latex2sympy2_extended>=0.9.0"]
+math = ["latex2sympy2_extended>=0.9.1"]
 
 [project.urls]
 Homepage = "https://github.com/huggingface/lighteval"

tests/metrics/test_extractive_match.py

Lines changed: 27 additions & 0 deletions
@@ -949,7 +949,34 @@ def test_math_extraction_edge_cases(gold, pred, expected):
             r"To find the product \( ab \) where \( a = 2012_3 \) and \( b = 201_3 \), we first convert these base-three numbers to base ten. For \( a = 2012_3 \): \[ a = 2 \cdot 3^3 + 0 \cdot 3^2 + 1 \cdot 3^1 + 2 \cdot 3^0 = 2 \cdot 27 + 0 \cdot 9 + 1 \cdot 3 + 2 \cdot 1 = 54 + 0 + 3 + 2 = 59_{10} \] For \( b = 201_3 \): \[ b = 2 \cdot 3^2 + 0 \cdot 3^1 + 1 \cdot 3^0 = 2 \cdot 9 + 0 \cdot 3 + 1 \cdot 1 = 18 + 0 + 1 = 19_{10} \] Now, calculate the product in base ten: \[ ab = 59 \times 19 \] Perform the multiplication: \[ 59 \times 19 = 59 \times (20 - 1) = 59 \times 20 - 59 \times 1 = 1180 - 59 = 1121 \] Next, convert \( 1121_{10} \) to base three. We do this by dividing by 3 and recording the remainders: \[ 1121 \div 3 = 373 \quad \text{remainder } 2 \] \[ 373 \div 3 = 124 \quad \text{remainder } 1 \] \[ 124 \div 3 = 41 \quad \text{remainder } 1 \] \[ 41 \div 3 = 13 \quad \text{remainder } 2 \] \[ 13 \div 3 = 4 \quad \text{remainder } 1 \] \[ 4 \div 3 = 1 \quad \text{remainder } 1 \] \[ 1 \div 3 = 0 \quad \text{remainder } 1 \] Reading the remainders from last to first, we find: \[ 1121_{10} = 1112122_3 \] Thus, the product \( ab \) expressed in the base-three number system is \(\boxed{1112122_3}\).",
             0,
         ),
+        (
+            r"\(\boxed{\text{C}}\).",
+            r"$\boxed{\text{(C)}}.$",
+            1,
+        ),
+        (
+            r" So the answer is: \[ \boxed{11111111100} \]",
+            r"is $\boxed{11,\! 111,\! 111,\! 100}$",
+            1,
+        ),
+        (
+            r" So the answer is: \[ \boxed{32349} \]",
+            r"is $\boxed{32,\! 349}$",
+            1,
+        ),
+        (
+            r"Thus, the domain of the function \( f(x) \) is: \[ \boxed{(2, 12) \cup (12, 102)} \]",
+            r"Thus, the answer is $x \in \boxed{(2,12) \cup (12,102)}$",
+            1,
+        ),
     ],
 )
 def test_math_extraction_additional_cases(gold, pred, expected):
     assert compare_strings(gold, pred, match_types=["latex", "expr"]) == expected
+
+
+# text{C} Qwen correct
+# 11111111100 Qwen correct
+# Interval(2, oo) qwen incorrect
+# text{west} qwen incorrect
+# 32349, 32,\!348 qwen incorrect
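
Two of the new cases expect a plain boxed integer such as `\boxed{32349}` to match the same number written with LaTeX thousands separators, `\boxed{32,\! 349}`. The sketch below illustrates that idea with the standard library only. It is not the code used by lighteval or `latex2sympy2_extended`: the helper names `last_boxed`, `strip_latex_thousands` and `boxed_numbers_match` are hypothetical, and the sketch assumes separators are always written as `,\!` so that interval notation like `(12,102)` is left untouched.

```python
import re
from typing import Optional

BOXED = r"\boxed{"


def last_boxed(text: str) -> Optional[str]:
    """Return the contents of the last \\boxed{...} in text, or None."""
    start = text.rfind(BOXED)
    if start == -1:
        return None
    body_start = start + len(BOXED)
    depth = 1  # we are inside the opening brace of \boxed{
    for i in range(body_start, len(text)):
        if text[i] == "{":
            depth += 1
        elif text[i] == "}":
            depth -= 1
            if depth == 0:
                return text[body_start:i]
    return None  # braces never balanced


def strip_latex_thousands(s: str) -> str:
    """Drop ',\\!' (plus trailing spaces) when it sits between two digits.

    Assumption: a bare ',' is NOT treated as a separator, so '(12,102)'
    keeps its comma.
    """
    return re.sub(r"(?<=\d),\\!\s*(?=\d)", "", s)


def boxed_numbers_match(gold: str, pred: str) -> bool:
    """Compare the last boxed answers after removing ',\\!' separators."""
    g, p = last_boxed(gold), last_boxed(pred)
    if g is None or p is None:
        return False
    return strip_latex_thousands(g).strip() == strip_latex_thousands(p).strip()


if __name__ == "__main__":
    print(boxed_numbers_match(
        r" So the answer is: \[ \boxed{32349} \]",
        r"is $\boxed{32,\! 349}$",
    ))
```

String comparison after normalization is enough for these integer cases only. The `\text{C}` versus `\text{(C)}` and interval cases in the diff need real parsing, which is what the `latex` and `expr` match types in `compare_strings` are exercising.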
