
ResponseSelector training fails with use_text_as_label and simple pipeline #8046

Closed
2 tasks done
samsucik opened this issue Feb 25, 2021 · 13 comments · Fixed by #8652
Labels
area:rasa-oss 🎡 Anything related to the open source Rasa framework area:rasa-oss/ml/nlu-components Issues focused around rasa's NLU components type:bug 🐛 Inconsistencies or issues which will cause an issue or problem for users or implementors.

Comments

@samsucik
Contributor

samsucik commented Feb 25, 2021

Rasa version: main (2.3.1)

Rasa SDK version (if used & relevant):

Rasa X version (if used & relevant):

Python version: 3.8.5

Operating system (windows, osx, ...): MacOS 10.15.6

Issue: NLU training fails when I have a ResponseSelector that expects sequence-level features but my pipeline doesn't include any featurisers that would create such features (in this case, specifically for the label attribute). I would've expected at least some warning when using use_text_as_label without providing any sequence-level featurisers.

Error (including full traceback):

Traceback (most recent call last):
  File "/Users/sams/.miniconda/envs/rasa1/bin/rasa", line 5, in <module>
    main()
  File "/Users/sams/rasa1/rasa/__main__.py", line 116, in main
    cmdline_arguments.func(cmdline_arguments)
  File "/Users/sams/rasa1/rasa/cli/train.py", line 197, in train_nlu
    return train_nlu(
  File "/Users/sams/rasa1/rasa/train.py", line 700, in train_nlu
    return rasa.utils.common.run_in_loop(
  File "/Users/sams/rasa1/rasa/utils/common.py", line 308, in run_in_loop
    result = loop.run_until_complete(f)
  File "uvloop/loop.pyx", line 1456, in uvloop.loop.Loop.run_until_complete
  File "/Users/sams/rasa1/rasa/train.py", line 750, in train_nlu_async
    return await _train_nlu_with_validated_data(
  File "/Users/sams/rasa1/rasa/train.py", line 812, in _train_nlu_with_validated_data
    await rasa.nlu.train(
  File "/Users/sams/rasa1/rasa/nlu/train.py", line 116, in train
    interpreter = trainer.train(training_data, **kwargs)
  File "/Users/sams/rasa1/rasa/nlu/model.py", line 209, in train
    updates = component.train(working_data, self.config, **context)
  File "/Users/sams/rasa1/rasa/nlu/classifiers/diet_classifier.py", line 832, in train
    self.model.fit(
  File "/Users/sams/rasa1/rasa/utils/tensorflow/models.py", line 222, in fit
    ) = self._get_tf_train_functions(eager, model_data, batch_strategy)
  File "/Users/sams/rasa1/rasa/utils/tensorflow/models.py", line 483, in _get_tf_train_functions
    self._get_tf_call_model_function(
  File "/Users/sams/rasa1/rasa/utils/tensorflow/models.py", line 466, in _get_tf_call_model_function
    tf_call_model_function(next(iter(init_dataset)))
  File "/Users/sams/.miniconda/envs/rasa1/lib/python3.8/site-packages/tensorflow/python/eager/def_function.py", line 780, in __call__
    result = self._call(*args, **kwds)
  File "/Users/sams/.miniconda/envs/rasa1/lib/python3.8/site-packages/tensorflow/python/eager/def_function.py", line 823, in _call
    self._initialize(args, kwds, add_initializers_to=initializers)
  File "/Users/sams/.miniconda/envs/rasa1/lib/python3.8/site-packages/tensorflow/python/eager/def_function.py", line 696, in _initialize
    self._stateful_fn._get_concrete_function_internal_garbage_collected(  # pylint: disable=protected-access
  File "/Users/sams/.miniconda/envs/rasa1/lib/python3.8/site-packages/tensorflow/python/eager/function.py", line 2855, in _get_concrete_function_internal_garbage_collected
    graph_function, _, _ = self._maybe_define_function(args, kwargs)
  File "/Users/sams/.miniconda/envs/rasa1/lib/python3.8/site-packages/tensorflow/python/eager/function.py", line 3213, in _maybe_define_function
    graph_function = self._create_graph_function(args, kwargs)
  File "/Users/sams/.miniconda/envs/rasa1/lib/python3.8/site-packages/tensorflow/python/eager/function.py", line 3065, in _create_graph_function
    func_graph_module.func_graph_from_py_func(
  File "/Users/sams/.miniconda/envs/rasa1/lib/python3.8/site-packages/tensorflow/python/framework/func_graph.py", line 986, in func_graph_from_py_func
    func_outputs = python_func(*func_args, **func_kwargs)
  File "/Users/sams/.miniconda/envs/rasa1/lib/python3.8/site-packages/tensorflow/python/eager/def_function.py", line 600, in wrapped_fn
    return weak_wrapped_fn().__wrapped__(*args, **kwds)
  File "/Users/sams/.miniconda/envs/rasa1/lib/python3.8/site-packages/tensorflow/python/framework/func_graph.py", line 973, in wrapper
    raise e.ag_error_metadata.to_exception(e)
tensorflow.python.autograph.impl.api.StagingError: in user code:

    /Users/sams/rasa1/rasa/utils/tensorflow/models.py:295 train_on_batch  *
        prediction_loss = self.batch_loss(batch_in)
    /Users/sams/rasa1/rasa/nlu/selectors/response_selector.py:737 batch_loss  *
        loss, acc = self._calculate_label_loss(
    /Users/sams/rasa1/rasa/nlu/classifiers/diet_classifier.py:1419 _calculate_label_loss  *
        all_label_ids, all_labels_embed = self._create_all_labels()
    /Users/sams/rasa1/rasa/nlu/selectors/response_selector.py:651 _create_all_labels  *
        sequence_lengths_label = self._get_sequence_lengths(
    /Users/sams/rasa1/rasa/utils/tensorflow/models.py:1126 _get_sequence_lengths  *
        return tf.cast(tf_batch_data[key][sub_key][0], dtype=tf.int32) + 1

    IndexError: list index out of range
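The IndexError at the bottom of the traceback arises because the per-attribute feature lists are empty when no sequence-level label features were produced. A minimal sketch of the failure mode (the names and the nested-dict shape here are illustrative, not Rasa's actual data structures):

```python
# Sketch: model data maps attribute -> feature type -> list of feature
# arrays. If no featuriser produced sequence-level features for the
# label attribute, that list is empty and indexing [0] raises
# IndexError, as in the traceback above.
tf_batch_data = {
    "label": {
        "sentence": [[0.1, 0.2, 0.3]],  # sentence-level features exist
        "sequence": [],                 # but no sequence-level features
    }
}

def get_sequence_lengths(batch_data, key, sub_key):
    # Mirrors the tf_batch_data[key][sub_key][0] access in the traceback.
    return batch_data[key][sub_key][0]

try:
    get_sequence_lengths(tf_batch_data, "label", "sequence")
except IndexError as e:
    print(f"IndexError: {e}")
```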

Command or request that led to error:

rasa train nlu

Content of configuration file (config.yml) (if relevant):

language: en
pipeline:
  - name: WhitespaceTokenizer
  - name: CountVectorsFeaturizer
  - name: DIETClassifier
    epochs: 1
  - name: ResponseSelector
    epochs: 1
    use_text_as_label: true

Content of domain file (domain.yml) (if relevant):

version: "2.0"
intents:
  - greet
  - goodbye

Content of NLU data file:

version: "2.0"
nlu:
- intent: greet/hi
  examples: |
    - hey
    - hello

- intent: greet/bye
  examples: |
    - bye
    - ciao

- intent: goodbye
  examples: |
    - see you
    - good night

responses:
  utter_goodbye:
  - text: "Bye"

  utter_greet/hi:
  - text: hey there

  utter_greet/bye:
  - text: bye

Definition of done

  • Add new warning for cases where user hasn't specified transformer_size but has enabled the transformer
  • Set a default transformer_size in the above case so that ResponseSelector doesn't break
@samsucik samsucik added type:bug 🐛 Inconsistencies or issues which will cause an issue or problem for users or implementors. area:rasa-oss 🎡 Anything related to the open source Rasa framework area:rasa-oss/ml/nlu-components Issues focused around rasa's NLU components labels Feb 25, 2021
@Dustyposa

Dustyposa commented Feb 26, 2021

The following intents seem to be missing from domain.yml:

greet/hi
greet/bye

@samsucik
Contributor Author

@Dustyposa in this case greet is a retrieval intent, so it's enough to list greet in the domain file, without enumerating all the variations with different response keys (in this case the keys being hi and bye).

@Dustyposa

Dustyposa commented Feb 26, 2021

@samsucik
in your config.yml, the greet intent isn't declared as a retrieval intent. You need this configuration:

  - name: ResponseSelector
    retrieval_intent: greet

@samsucik
Contributor Author

@Dustyposa strictly speaking, that's not needed. By not specifying a retrieval_intent there, I effectively tell the ResponseSelector to be used for all retrieval intents.

@samsucik
Contributor Author

samsucik commented Mar 4, 2021

@dakshvar22 I think I found the "bug". Even though I specified use_text_as_label: true, I didn't specify the number of transformer layers, and the default is, unfortunately, 0. Subsequently, here in label data preparation, sequence-level features are discarded if the number of transformer layers is 0. And so the DIET2DIET code always expects sequence-level features to exist when use_text_as_label: true, but in reality they may be missing, and things break.

The easiest (but only temporary) fix would be to validate the config and exit if use_text_as_label: true and num_transformer_layers is 0 or not specified. However, we really need to change the DIET2DIET code to work OK just with sentence-level label features. And since that takes us exactly to the new rasa_layers that I'm creating over in #7589, I think that fixing it as part of that PR might be the easiest way actually...
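The temporary fix described above could be sketched roughly like this (a hypothetical helper using the parameter names from this thread; Rasa's real config keys and validation code may differ):

```python
# Hypothetical sketch of the proposed temporary fix: refuse to train
# when use_text_as_label is enabled but the transformer is effectively
# disabled (0 layers), since the DIET2DIET code would then expect
# sequence-level label features that have been discarded.
def validate_response_selector_config(config: dict) -> None:
    if config.get("use_text_as_label") and config.get("num_transformer_layers", 0) == 0:
        raise ValueError(
            "ResponseSelector with 'use_text_as_label: true' requires "
            "'num_transformer_layers' > 0."
        )
```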

@dakshvar22
Contributor

@samsucik Just checking, are you still planning to fix this in #7589 ?

@samsucik
Contributor Author

Short answer: No 😜

@TyDunn
Contributor

TyDunn commented Apr 29, 2021

@samsucik Since you are no longer taking care of this in #7589, what are our options for tackling this? Is this a common problem we expect users to run into or more of an edge case?

@alopez alopez added the type:discussion 👨‍👧‍👦 Early stage of an idea or validation of thoughts. Should NOT be closed by PR. label Apr 29, 2021
@dakshvar22
Contributor

@samsucik Looking at your configuration, CountVectorsFeaturizer should add both sentence and sequence level features for response attribute of the message. I suspect either of the two:

  1. Creation of model data object from all training messages is wrong
  2. The function batch_to_model_data_format is doing something unexpected here.

@samsucik
Contributor Author

@dakshvar22 have you read my comment above? I think it's in agreement with your comment and explains what's going on (though perhaps I should've made the explanation more detailed).

@samsucik
Contributor Author

samsucik commented May 5, 2021

@TyDunn sorry, I must've overlooked your comment.

what are our options for tackling this?

If you're asking what a fix should look like, then I outlined it above -- we can (for now) just check the config and either not proceed with ResponseSelector training or proceed with some reasonable default transformer parameters. But to really fix this properly, we should change the code so that a transformer isn't needed and everything works fine without it, using just sentence-level features for labels. Which fix to go for right now is open to discussion, I'd say.

I don't know how eagerly we recommend the use_text_as_label mode to users. But if we do, I think that users are very likely to run into this with their ResponseSelectors. As a matter of fact, though, it is the non-default mode, so using ResponseSelector with the default settings will work fine.

Edit: Additionally, one also has to specify transformer_size in the config when using use_text_as_label: True. So, basically, one more config parameter for us to check and perhaps give a sensible default value -- the current default is None, which leads to an error if a non-zero-layer transformer is specified.
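The defaulting hinted at here could be sketched as follows (parameter names follow this thread, and the fallback value is illustrative, not Rasa's actual default):

```python
# Sketch: fall back to a sensible transformer_size when the user has
# enabled a transformer (num_transformer_layers > 0) but left
# transformer_size at its default of None, which currently errors out.
ASSUMED_DEFAULT_TRANSFORMER_SIZE = 256  # illustrative value only

def resolve_transformer_size(config: dict) -> dict:
    config = dict(config)  # don't mutate the caller's config
    if (
        config.get("num_transformer_layers", 0) > 0
        and config.get("transformer_size") is None
    ):
        config["transformer_size"] = ASSUMED_DEFAULT_TRANSFORMER_SIZE
    return config
```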

@TyDunn TyDunn added effort:research/1 and removed type:discussion 👨‍👧‍👦 Early stage of an idea or validation of thoughts. Should NOT be closed by PR. labels May 7, 2021
@samsucik samsucik linked a pull request May 11, 2021 that will close this issue
6 tasks
@samsucik
Contributor Author

samsucik commented May 14, 2021

While investigating this, I came across a deeper issue as described here. As a result, even if I change the code to set a reasonable transformer_size value, the code still won't work when the transformer has 0 layers (which is the default). Hence, for now, I'm gonna:

Note: Previously, I thought we wanted to set num_transformer_layers to a positive default value whenever use_text_as_label: true. Daksh reminded me that we want to keep the default (0) unless the user explicitly provides a positive value (indicating that they want to use a transformer).

@samsucik
Contributor Author

Closing this one now because #8705 addresses all that is left here.
