Training core only model and testing on test stories with entity annotations causes evaluation to crash. #8386

Closed
kedz opened this issue Apr 7, 2021 · 10 comments
Labels: area:rasa-oss 🎡 · area:rasa-oss/model-testing · effort:atom-squad/2 · type:bug 🐛

Comments

kedz (Contributor) commented Apr 7, 2021

Rasa version: 2.4.2

Rasa SDK version (if used & relevant):

Rasa X version (if used & relevant):

Python version: 3.7.6

Operating system (windows, osx, ...): macos, Darwin-20.2.0-x86_64-i386-64bit

Issue:
Training a non-e2e, core-only model and evaluating it on test stories with entity annotations causes the evaluation to crash. We would like to be able to add starter-pack datasets like the insurance-demo to the model regression tests, but this bug prevents us from doing regression tests on core policies in isolation.

Error (including full traceback):

git clone https://github.com/RasaHQ/insurance-demo.git
cd insurance-demo
rasa train core
rasa test core -s tests/
2021-04-07 15:28:36 INFO     rasa.model  - Loading model models/core-20210407-152812.tar.gz...
/Users/kedz/projects2021/carbonbot/rasa/rasa/utils/train_utils.py:565: UserWarning: constrain_similarities is set to `False`. It is recommended to set it to `True` when using cross-entropy loss. It will be set to `True` by default, Rasa Open Source 3.0.0 onwards.
  category=UserWarning,
/Users/kedz/projects2021/carbonbot/rasa/rasa/utils/train_utils.py:537: UserWarning: model_confidence is set to `softmax`. It is recommended to try using `model_confidence=linear_norm` to make it easier to tune fallback thresholds.
  category=UserWarning,
No NLU model found. Using default 'RegexInterpreter' for end-to-end evaluation. If you added actual user messages to your test stories this will likely lead to the tests failing. In that case, you need to train a NLU model first, e.g. using `rasa train`.
/Users/kedz/projects2021/carbonbot/rasa/rasa/shared/utils/io.py:96: UserWarning: Issue found in 'tests/test_stories.yml':
Found intent 'get_a_quote' in stories which is not part of the domain.
  More info at https://rasa.com/docs/rasa/stories
/Users/kedz/projects2021/carbonbot/rasa/rasa/shared/utils/io.py:96: UserWarning: Issue found in 'tests/test_stories.yml':
Found intent 'make_payment' in stories which is not part of the domain.
  More info at https://rasa.com/docs/rasa/stories
/Users/kedz/projects2021/carbonbot/rasa/rasa/shared/utils/io.py:96: UserWarning: Issue found in 'tests/test_stories.yml':
Found intent 'greet' in stories which is not part of the domain.
  More info at https://rasa.com/docs/rasa/stories
/Users/kedz/projects2021/carbonbot/rasa/rasa/shared/utils/io.py:96: UserWarning: Issue found in 'tests/test_stories.yml':
Found intent 'inform' in stories which is not part of the domain.
  More info at https://rasa.com/docs/rasa/stories
/Users/kedz/projects2021/carbonbot/rasa/rasa/shared/utils/io.py:96: UserWarning: Issue found in 'tests/test_stories.yml':
Found intent 'deny' in stories which is not part of the domain.
  More info at https://rasa.com/docs/rasa/stories
/Users/kedz/projects2021/carbonbot/rasa/rasa/shared/utils/io.py:96: UserWarning: Issue found in 'tests/test_stories.yml':
Found intent 'new_id_card' in stories which is not part of the domain.
  More info at https://rasa.com/docs/rasa/stories
/Users/kedz/projects2021/carbonbot/rasa/rasa/shared/utils/io.py:96: UserWarning: Issue found in 'tests/test_stories.yml':
Found intent 'affirm' in stories which is not part of the domain.
  More info at https://rasa.com/docs/rasa/stories
Processed story blocks: 100%|██████████████████████████████████████████████████████████████████████████████████████| 6/6 [00:00<00:00, 975.31it/s, # trackers=1]
2021-04-07 15:28:48 INFO     rasa.core.test  - Evaluating 6 stories
Progress:
 17%|████████████████████▊                                                                                                        | 1/6 [00:00<00:00, 32.55it/s]
Traceback (most recent call last):
  File "/usr/local/anaconda3/envs/carbon/bin/rasa", line 5, in <module>
    main()
  File "/Users/kedz/projects2021/carbonbot/rasa/rasa/__main__.py", line 117, in main
    cmdline_arguments.func(cmdline_arguments)
  File "/Users/kedz/projects2021/carbonbot/rasa/rasa/cli/test.py", line 112, in run_core_test
    use_conversation_test_files=args.e2e,
  File "/Users/kedz/projects2021/carbonbot/rasa/rasa/model_testing.py", line 191, in test_core
    **kwargs,
  File "/Users/kedz/projects2021/carbonbot/rasa/rasa/utils/common.py", line 309, in run_in_loop
    result = loop.run_until_complete(f)
  File "uvloop/loop.pyx", line 1456, in uvloop.loop.Loop.run_until_complete
  File "/Users/kedz/projects2021/carbonbot/rasa/rasa/core/test.py", line 845, in test
    completed_trackers, agent, fail_on_prediction_errors, e2e
  File "/Users/kedz/projects2021/carbonbot/rasa/rasa/core/test.py", line 753, in _collect_story_predictions
    tracker, agent, fail_on_prediction_errors, use_e2e
  File "/Users/kedz/projects2021/carbonbot/rasa/rasa/core/test.py", line 666, in _predict_tracker_actions
    circuit_breaker_tripped,
  File "/Users/kedz/projects2021/carbonbot/rasa/rasa/core/test.py", line 551, in _collect_action_executed_predictions
    processor, partial_tracker, prediction
  File "/Users/kedz/projects2021/carbonbot/rasa/rasa/core/test.py", line 520, in _get_e2e_entity_evaluation_result
    tokens = parsed_message.get(TOKENS_NAMES[TEXT])
AttributeError: 'NoneType' object has no attribute 'get'

@kedz kedz added type:bug 🐛 Inconsistencies or issues which will cause an issue or problem for users or implementors. area:rasa-oss 🎡 Anything related to the open source Rasa framework labels Apr 7, 2021
@TyDunn TyDunn added the area:rasa-oss/model-testing Issues focused around testing models (e.g. via `rasa test`) label May 5, 2021
@kedz kedz changed the title from "Training core only model and testing on a test stories with entity annotations causes evaluation to crash." to "Training core only model and testing on test stories with entity annotations causes evaluation to crash." May 12, 2021
kedz (Contributor, Author) commented May 12, 2021

This also happens if you run on data with both the intent label and the user text annotated in the test stories. For example, in the retail-demo starter pack there are stories like this:

- story: faq
  steps:
  - intent: greet
    user: |-
      hi
  - action: utter_greet
  - intent: faq
    user: |
      what kind of payment you take?
  - action: utter_faq

If you remove the user fields, the tests run fine.

samsucik (Contributor) commented May 18, 2021

@TyDunn I'd like to lobby for this one because the cost/benefit ratio is, I think, very favourable. Hacking around the bug took me ~3 lines of code (though a clean solution can still be complex, who knows). Once fixed, this would unblock our Core regression testing on the big Sara test set (which is the best Core dataset that we've got right now).

JEM-Mosig (Contributor) commented:

I also just ran into this issue while working on a bot for IntentTED evaluation.

koernerfelicia (Contributor) commented:

Would also be helpful for sanity-checking carbon bot (not crucial, but nice to have).

@TyDunn TyDunn added the effort:atom-squad/2 Label which is used by the Rasa Atom squad to do internal estimation of task sizes. label May 21, 2021
TyDunn (Contributor) commented May 21, 2021

We have estimated this and are picking it up next sprint.

ancalita (Member) commented:

Hi @wochinge, I started having a look into this issue - I think this has to do with the fact that processor.interpreter.featurize_message is implemented neither here nor in RegexInterpreter.

The question I have is whether the wrong method was used here, or whether an implementation is required for featurize_message. If the latter, I'd need some guidance on what this method does.

wochinge (Contributor) commented:

@joejuzl implemented that to allow testing the extraction of entities by end-to-end policies (TEDPolicy is currently the only one). It doesn't make much sense to do the entity evaluation with Core-only models at the moment. As the type annotation suggests, featurize_message is allowed to return None. I'd just add handling for None values here and not add an EntityEvaluationResult in that case.
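
A minimal sketch of the kind of None guard being suggested. The helper name, the EntityEvaluationResult fields, and the TOKENS_NAMES constant below are stand-ins inferred from the traceback, not the actual Rasa source:

from dataclasses import dataclass
from typing import Any, Dict, List, Optional

# Hypothetical stand-ins so the sketch is self-contained; the real definitions live in Rasa.
TEXT = "text"
TOKENS_NAMES = {TEXT: "text_tokens"}


@dataclass
class EntityEvaluationResult:
    entity_targets: List[Dict]
    entity_predictions: List[Dict]
    tokens: List[Any]
    message: str


def entity_evaluation_or_none(
    parsed_message: Optional[Dict[str, Any]],
    entity_targets: List[Dict],
    entity_predictions: List[Dict],
) -> Optional[EntityEvaluationResult]:
    """Skip the entity evaluation entirely when the message could not be featurized."""
    if parsed_message is None:
        # Core-only model: featurize_message() returned None, so there are no
        # tokens to align entities against - don't add an EntityEvaluationResult.
        return None
    tokens = parsed_message.get(TOKENS_NAMES[TEXT], [])
    return EntityEvaluationResult(
        entity_targets, entity_predictions, tokens, parsed_message.get(TEXT, "")
    )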

ancalita (Member) commented:

@joejuzl what is still unclear to me: if the body of featurize_message is pass, it always returns None, and therefore L555-L558 would never execute. When would featurize_message return a Message? 🤔

joejuzl (Contributor) commented May 25, 2021

@joejuzl what is still unclear to me: if the body of featurize_message is pass, it always returns None, and therefore L555-L558 would never execute. When would featurize_message return a Message? 🤔

The body is only pass in NaturalLanguageInterpreter, which is a kind of abstract base class for the interpreter. RegexInterpreter is a subclass, but it does not override this method because it's a simple default interpreter that is mainly used for testing.
However, in Interpreter in rasa/nlu/model.py you can see that featurize_message is implemented: it loops through all the components in the NLU pipeline, adding details to the message along the way. As the tokenizer will be present in the pipeline, the tokens will be added.
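
Roughly, the shape being described: a base interpreter whose featurize_message body is just pass (so it returns None), versus a pipeline interpreter that runs each component over the message, with a tokenizer adding the tokens. This is a simplified, self-contained sketch with made-up component and message classes, not the actual code from rasa/nlu/model.py:

from typing import List, Optional


class Message:
    """Simplified message container; real Rasa messages carry many more fields."""

    def __init__(self, text: str) -> None:
        self.text = text
        self.data: dict = {}

    def set(self, key: str, value: object) -> None:
        self.data[key] = value

    def get(self, key: str, default=None):
        return self.data.get(key, default)


class NaturalLanguageInterpreter:
    """Base class: no pipeline, so nothing can be featurized."""

    def featurize_message(self, message: Message) -> Optional[Message]:
        pass  # implicitly returns None, which is what the core-only evaluation runs into


class WhitespaceTokenizer:
    """Toy component: adds tokens to the message, like a real tokenizer in the pipeline."""

    def process(self, message: Message) -> None:
        message.set("text_tokens", message.text.split())


class Interpreter(NaturalLanguageInterpreter):
    """Pipeline interpreter: each component adds details to the message in turn."""

    def __init__(self, pipeline: List) -> None:
        self.pipeline = pipeline

    def featurize_message(self, message: Message) -> Optional[Message]:
        for component in self.pipeline:
            component.process(message)
        return message


# With a tokenizer in the pipeline, the tokens end up on the message.
parsed = Interpreter([WhitespaceTokenizer()]).featurize_message(Message("what kind of payment do you take?"))
print(parsed.get("text_tokens"))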

ancalita (Member) commented:

Bugfix PR merged 😃
