Closed
Description
System Info
- transformers version: 4.22.0.dev0
- Platform: macOS-12.4-x86_64-i386-64bit
- Python version: 3.9.13
- Huggingface_hub version: 0.8.1
- PyTorch version (GPU?): 1.11.0 (False)
- Tensorflow version (GPU?): 2.9.1 (False)
- Flax version (CPU?/GPU?/TPU?): 0.5.2 (cpu)
- Jax version: 0.3.6
- JaxLib version: 0.3.5
- Using GPU in script?: n
- Using distributed or parallel set-up in script?: n
Who can help?
Information
- The official example scripts
- My own modified scripts
Tasks
- An officially supported task in the examples folder (such as GLUE/SQuAD, ...)
- My own task or dataset (give details below)
Reproduction
RUN_SLOW=1 RUN_PIPELINE_TESTS=yes pytest tests/pipelines/test_pipelines_token_classification.py::TokenClassificationPipelineTests::test_dbmdz_english
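Fails with two notable diffs: the "UN" entity offsets expected by the assertion (start 22, end 24) don't match the position of "UN" in the input string itself, which is what the pipeline returns (start 18, end 20), so the expected values are off by four characters; and the index doesn't match (6 returned, 7 expected). Output: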
======================================= FAILURES ========================================
__________________ TokenClassificationPipelineTests.test_dbmdz_english __________________
self = <tests.pipelines.test_pipelines_token_classification.TokenClassificationPipelineTests testMethod=test_dbmdz_english>
    @require_torch
    @slow
    def test_dbmdz_english(self):
        # Other sentence
        NER_MODEL = "dbmdz/bert-large-cased-finetuned-conll03-english"
        model = AutoModelForTokenClassification.from_pretrained(NER_MODEL)
        tokenizer = AutoTokenizer.from_pretrained(NER_MODEL, use_fast=True)
        sentence = """Enzo works at the UN"""
        token_classifier = pipeline("ner", model=model, tokenizer=tokenizer)
        output = token_classifier(sentence)
>       self.assertEqual(
            nested_simplify(output),
            [
                {"entity": "I-PER", "score": 0.997, "word": "En", "start": 0, "end": 2, "index": 1},
                {"entity": "I-PER", "score": 0.996, "word": "##zo", "start": 2, "end": 4, "index": 2},
                {"entity": "I-ORG", "score": 0.999, "word": "UN", "start": 22, "end": 24, "index": 7},
            ],
        )
E AssertionError: Lists differ: [{'en[24 chars] 0.998, 'index': 1, 'word': 'En', 'start': 0, [179 chars] 20}] != [{'en[24 chars] 0.997, 'word': 'En', 'start': 0, 'end': 2, 'i[179 chars]: 7}]
E
E First differing element 0:
E {'ent[15 chars]'score': 0.998, 'index': 1, 'word': 'En', 'start': 0, 'end': 2}
E {'ent[15 chars]'score': 0.997, 'word': 'En', 'start': 0, 'end': 2, 'index': 1}
E
E [{'end': 2,
E 'entity': 'I-PER',
E 'index': 1,
E - 'score': 0.998,
E ? ^
E
E + 'score': 0.997,
E ? ^
E
E 'start': 0,
E 'word': 'En'},
E {'end': 4,
E 'entity': 'I-PER',
E 'index': 2,
E - 'score': 0.997,
E ? ^
E
E + 'score': 0.996,
E ? ^
E
E 'start': 2,
E 'word': '##zo'},
E - {'end': 20,
E ? ^
E
E + {'end': 24,
E ? ^
E
E 'entity': 'I-ORG',
E - 'index': 6,
E ? ^
E
E + 'index': 7,
E ? ^
E
E 'score': 0.999,
E - 'start': 18,
E ? ^^
E
E + 'start': 22,
E ? ^^
E
E 'word': 'UN'}]
tests/pipelines/test_pipelines_token_classification.py:284: AssertionError
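The offsets can be sanity-checked without loading the model, using plain string operations on the test sentence. This is only a minimal sketch to show which side of the diff agrees with the input string; it does not touch the pipeline or the tokenizer.

```python
# Sanity check of the character offsets, independent of the model.
sentence = "Enzo works at the UN"

start = sentence.index("UN")
end = start + len("UN")

print(start, end)             # 18 20
print(sentence[start:end])    # UN

# The span the assertion expects (22, 24) lies past the end of the string.
print(len(sentence))          # 20
print(repr(sentence[22:24]))  # ''
```

So the start/end values returned by the pipeline (18, 20) match the sentence, and the values hard-coded in the assertion (22, 24) do not.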
Expected behavior
[a green dot]
Narsil