
ResponseSelector training fails with use_text_as_label and simple pipeline #8046

Closed
2 tasks done
samsucik opened this issue Feb 25, 2021 · 13 comments · Fixed by #8652
Labels
area:rasa-oss 🎡 Anything related to the open source Rasa framework area:rasa-oss/ml/nlu-components Issues focused around rasa's NLU components type:bug 🐛 Inconsistencies or issues which will cause an issue or problem for users or implementors.

Comments

@samsucik
Contributor

samsucik commented Feb 25, 2021

Rasa version: main (2.3.1)

Rasa SDK version (if used & relevant):

Rasa X version (if used & relevant):

Python version: 3.8.5

Operating system (windows, osx, ...): MacOS 10.15.6

Issue: NLU training fails when I have a ResponseSelector that expects sequence-level features but my pipeline doesn't include any featurisers that would create such features (in this case, specifically for the label attribute). I would've expected at least some warning when using use_text_as_label without providing any sequence-level featurisers.

Error (including full traceback):

Traceback (most recent call last):
  File "/Users/sams/.miniconda/envs/rasa1/bin/rasa", line 5, in <module>
    main()
  File "/Users/sams/rasa1/rasa/__main__.py", line 116, in main
    cmdline_arguments.func(cmdline_arguments)
  File "/Users/sams/rasa1/rasa/cli/train.py", line 197, in train_nlu
    return train_nlu(
  File "/Users/sams/rasa1/rasa/train.py", line 700, in train_nlu
    return rasa.utils.common.run_in_loop(
  File "/Users/sams/rasa1/rasa/utils/common.py", line 308, in run_in_loop
    result = loop.run_until_complete(f)
  File "uvloop/loop.pyx", line 1456, in uvloop.loop.Loop.run_until_complete
  File "/Users/sams/rasa1/rasa/train.py", line 750, in train_nlu_async
    return await _train_nlu_with_validated_data(
  File "/Users/sams/rasa1/rasa/train.py", line 812, in _train_nlu_with_validated_data
    await rasa.nlu.train(
  File "/Users/sams/rasa1/rasa/nlu/train.py", line 116, in train
    interpreter = trainer.train(training_data, **kwargs)
  File "/Users/sams/rasa1/rasa/nlu/model.py", line 209, in train
    updates = component.train(working_data, self.config, **context)
  File "/Users/sams/rasa1/rasa/nlu/classifiers/diet_classifier.py", line 832, in train
    self.model.fit(
  File "/Users/sams/rasa1/rasa/utils/tensorflow/models.py", line 222, in fit
    ) = self._get_tf_train_functions(eager, model_data, batch_strategy)
  File "/Users/sams/rasa1/rasa/utils/tensorflow/models.py", line 483, in _get_tf_train_functions
    self._get_tf_call_model_function(
  File "/Users/sams/rasa1/rasa/utils/tensorflow/models.py", line 466, in _get_tf_call_model_function
    tf_call_model_function(next(iter(init_dataset)))
  File "/Users/sams/.miniconda/envs/rasa1/lib/python3.8/site-packages/tensorflow/python/eager/def_function.py", line 780, in __call__
    result = self._call(*args, **kwds)
  File "/Users/sams/.miniconda/envs/rasa1/lib/python3.8/site-packages/tensorflow/python/eager/def_function.py", line 823, in _call
    self._initialize(args, kwds, add_initializers_to=initializers)
  File "/Users/sams/.miniconda/envs/rasa1/lib/python3.8/site-packages/tensorflow/python/eager/def_function.py", line 696, in _initialize
    self._stateful_fn._get_concrete_function_internal_garbage_collected(  # pylint: disable=protected-access
  File "/Users/sams/.miniconda/envs/rasa1/lib/python3.8/site-packages/tensorflow/python/eager/function.py", line 2855, in _get_concrete_function_internal_garbage_collected
    graph_function, _, _ = self._maybe_define_function(args, kwargs)
  File "/Users/sams/.miniconda/envs/rasa1/lib/python3.8/site-packages/tensorflow/python/eager/function.py", line 3213, in _maybe_define_function
    graph_function = self._create_graph_function(args, kwargs)
  File "/Users/sams/.miniconda/envs/rasa1/lib/python3.8/site-packages/tensorflow/python/eager/function.py", line 3065, in _create_graph_function
    func_graph_module.func_graph_from_py_func(
  File "/Users/sams/.miniconda/envs/rasa1/lib/python3.8/site-packages/tensorflow/python/framework/func_graph.py", line 986, in func_graph_from_py_func
    func_outputs = python_func(*func_args, **func_kwargs)
  File "/Users/sams/.miniconda/envs/rasa1/lib/python3.8/site-packages/tensorflow/python/eager/def_function.py", line 600, in wrapped_fn
    return weak_wrapped_fn().__wrapped__(*args, **kwds)
  File "/Users/sams/.miniconda/envs/rasa1/lib/python3.8/site-packages/tensorflow/python/framework/func_graph.py", line 973, in wrapper
    raise e.ag_error_metadata.to_exception(e)
tensorflow.python.autograph.impl.api.StagingError: in user code:

    /Users/sams/rasa1/rasa/utils/tensorflow/models.py:295 train_on_batch  *
        prediction_loss = self.batch_loss(batch_in)
    /Users/sams/rasa1/rasa/nlu/selectors/response_selector.py:737 batch_loss  *
        loss, acc = self._calculate_label_loss(
    /Users/sams/rasa1/rasa/nlu/classifiers/diet_classifier.py:1419 _calculate_label_loss  *
        all_label_ids, all_labels_embed = self._create_all_labels()
    /Users/sams/rasa1/rasa/nlu/selectors/response_selector.py:651 _create_all_labels  *
        sequence_lengths_label = self._get_sequence_lengths(
    /Users/sams/rasa1/rasa/utils/tensorflow/models.py:1126 _get_sequence_lengths  *
        return tf.cast(tf_batch_data[key][sub_key][0], dtype=tf.int32) + 1

    IndexError: list index out of range
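The IndexError at the bottom of the traceback arises because the per-attribute feature lists are empty when no sequence-level label features were produced. A minimal sketch of the failure mode (the names and the nested-dict shape here are illustrative, not Rasa's actual data structures):

```python
# Sketch: model data maps attribute -> feature type -> list of feature
# arrays. If no featuriser produced sequence-level features for the
# label attribute, that list is empty and indexing [0] raises
# IndexError, as in the traceback above.
tf_batch_data = {
    "label": {
        "sentence": [[0.1, 0.2, 0.3]],  # sentence-level features exist
        "sequence": [],                 # but no sequence-level features
    }
}

def get_sequence_lengths(batch_data, key, sub_key):
    # Mirrors the tf_batch_data[key][sub_key][0] access in the traceback.
    return batch_data[key][sub_key][0]

try:
    get_sequence_lengths(tf_batch_data, "label", "sequence")
except IndexError as e:
    print(f"IndexError: {e}")
```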

Command or request that led to error:

rasa train nlu

Content of configuration file (config.yml) (if relevant):

language: en
pipeline:
  - name: WhitespaceTokenizer
  - name: CountVectorsFeaturizer
  - name: DIETClassifier
    epochs: 1
  - name: ResponseSelector
    epochs: 1
    use_text_as_label: true

Content of domain file (domain.yml) (if relevant):

version: "2.0"
intents:
  - greet
  - goodbye

Content of NLU data file:

version: "2.0"
nlu:
- intent: greet/hi
  examples: |
    - hey
    - hello

- intent: greet/bye
  examples: |
    - bye
    - ciao

- intent: goodbye
  examples: |
    - see you
    - good night

responses:
  utter_goodbye:
  - text: "Bye"

  utter_greet/hi:
  - text: hey there

  utter_greet/bye:
  - text: bye

Definition of done

  • Add new warning for cases where user hasn't specified transformer_size but has enabled the transformer
  • Set a default transformer_size in the above case so that ResponseSelector doesn't break
@samsucik samsucik added type:bug 🐛 Inconsistencies or issues which will cause an issue or problem for users or implementors. area:rasa-oss 🎡 Anything related to the open source Rasa framework area:rasa-oss/ml/nlu-components Issues focused around rasa's NLU components labels Feb 25, 2021
@Dustyposa

Dustyposa commented Feb 26, 2021

The following intents seem to be missing from domain.yml:

greet/hi
greet/bye

@samsucik
Contributor Author

@Dustyposa in this case greet is a retrieval intent, so it's enough to list greet in the domain file, without enumerating all the variations with different response keys (in this case the keys being hi and bye).

@Dustyposa

Dustyposa commented Feb 26, 2021

@samsucik
in your config.yml, the greet intent isn't declared as a retrieval intent. You need this configuration:

  - name: ResponseSelector
    retrieval_intent: greet

@samsucik
Contributor Author

@Dustyposa strictly speaking, that's not needed. By not specifying a retrieval_intent there, I effectively tell the ResponseSelector to be used for all retrieval intents.

@samsucik
Contributor Author

samsucik commented Mar 4, 2021

@dakshvar22 I think I found the "bug". Even though I specified use_text_as_label: true, I didn't specify the number of transformer layers, and the default is, unfortunately, 0. Subsequently, here in label data preparation, sequence-level features are discarded if the number of transformer layers is 0. And so the DIET2DIET code always expects sequence-level features to exist when use_text_as_label: true, but in reality they may be missing, and things break.

The easiest (but only temporary) fix would be to validate the config and exit if use_text_as_label: true and num_transformer_layers is 0 or not specified. However, we really need to change the DIET2DIET code to work OK just with sentence-level label features. And since that takes us exactly to the new rasa_layers that I'm creating over in #7589, I think that fixing it as part of that PR might be the easiest way actually...
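The temporary fix described above could be sketched roughly like this (a hypothetical helper using the parameter names from this thread; Rasa's real config keys and validation code may differ):

```python
# Hypothetical sketch of the proposed temporary fix: refuse to train
# when use_text_as_label is enabled but the transformer is effectively
# disabled (0 layers), since the DIET2DIET code would then expect
# sequence-level label features that have been discarded.
def validate_response_selector_config(config: dict) -> None:
    if config.get("use_text_as_label") and config.get("num_transformer_layers", 0) == 0:
        raise ValueError(
            "ResponseSelector with 'use_text_as_label: true' requires "
            "'num_transformer_layers' > 0."
        )
```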

@dakshvar22
Contributor

@samsucik Just checking, are you still planning to fix this in #7589 ?

@samsucik
Contributor Author

Short answer: No 😜

@TyDunn
Contributor

TyDunn commented Apr 29, 2021

@samsucik Since you are no longer taking care of this in #7589, what are our options for tackling this? Is this a common problem we expect users to run into or more of an edge case?

@alopez alopez added the type:discussion 👨‍👧‍👦 Early stage of an idea or validation of thoughts. Should NOT be closed by PR. label Apr 29, 2021
@dakshvar22
Contributor

@samsucik Looking at your configuration, CountVectorsFeaturizer should add both sentence and sequence level features for response attribute of the message. I suspect either of the two:

  1. Creation of model data object from all training messages is wrong
  2. The function batch_to_model_data_format is doing something unexpected here.

@samsucik
Contributor Author

@dakshvar22 have you read my comment above? I think it's in agreement with your comment and explains what's going on (though perhaps I should've made the explanation more detailed).

@samsucik
Contributor Author

samsucik commented May 5, 2021

@TyDunn sorry, I must've overlooked your comment.

what are our options for tackling this?

If you're asking what a fix should look like, then I outlined it above -- we can (for now) just check the config and either not proceed with ResponseSelector training or proceed with some reasonable default transformer parameters. But to really fix this properly, we should change the code so that a transformer isn't needed and everything works fine without it, using just sentence-level features for labels. Which fix to go for right now is open to discussion, I'd say.

I don't know how eagerly we recommend the use_text_as_label mode to users. But if we do, I think that users are very likely to run into this with their ResponseSelectors. As a matter of fact, though, it is the non-default mode, so using ResponseSelector with the default settings will work fine.

Edit: Additionally, one also has to specify transformer_size in the config when using use_text_as_label: True. So, basically, one more config parameter for us to check and perhaps give a sensible default value -- the current default is None, which leads to an error if a non-zero-layer transformer is specified.
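The defaulting hinted at here could be sketched as follows (parameter names follow this thread, and the fallback value is illustrative, not Rasa's actual default):

```python
# Sketch: fall back to a sensible transformer_size when the user has
# enabled a transformer (num_transformer_layers > 0) but left
# transformer_size at its default of None, which currently errors out.
ASSUMED_DEFAULT_TRANSFORMER_SIZE = 256  # illustrative value only

def resolve_transformer_size(config: dict) -> dict:
    config = dict(config)  # don't mutate the caller's config
    if (
        config.get("num_transformer_layers", 0) > 0
        and config.get("transformer_size") is None
    ):
        config["transformer_size"] = ASSUMED_DEFAULT_TRANSFORMER_SIZE
    return config
```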

@TyDunn TyDunn added effort:research/1 and removed type:discussion 👨‍👧‍👦 Early stage of an idea or validation of thoughts. Should NOT be closed by PR. labels May 7, 2021
@samsucik samsucik linked a pull request May 11, 2021 that will close this issue
6 tasks
@samsucik
Contributor Author

samsucik commented May 14, 2021

While investigating this, I came across a deeper issue as described here. As a result, even if I change the code to set a reasonable transformer_size value, the code still won't work when the transformer has 0 layers (which is the default). Hence, for now, I'm gonna:

Note: Previously, I thought we wanted to set num_transformer_layers to a positive default value whenever use_text_as_label: true. Daksh reminded me that we want to keep the default (0) unless the user explicitly provides a positive value (indicating that they want to use a transformer).

@samsucik
Contributor Author

Closing this one now because #8705 addresses all that is left here.
