Implement interface for bulk inferencing in TF models #8560

dakshvar22 · 2021-04-27T20:31:21Z

Proposed changes:

All models now use a single `run_inference` method to generate predictions. `run_inference` also handles batch inference: it splits the input into batches and combines the outputs of the individual batches. Specific TF models such as TED and DIET therefore do not need to know the implementation details of batch inference.
Needed for future features like IntentTEDPolicy.
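
As a rough illustration of the batching-and-combining idea described above, here is a minimal sketch in plain Python. It is not the code from this PR: `merge_batch_outputs`, `run_inference_sketch`, `toy_predict`, and the `Outputs` alias are hypothetical names, and plain lists stand in for tensors.

```python
from typing import Callable, Dict, List, Sequence

# Hypothetical alias: an output maps a name to one value per input example.
Outputs = Dict[str, List[float]]


def merge_batch_outputs(batch_outputs: Sequence[Outputs]) -> Outputs:
    """Combine per-batch outputs by concatenating the values under each key."""
    merged: Outputs = {}
    for outputs in batch_outputs:
        for key, values in outputs.items():
            merged.setdefault(key, []).extend(values)
    return merged


def run_inference_sketch(
    predict_batch: Callable[[List[float]], Outputs],
    examples: List[float],
    batch_size: int = 2,
) -> Outputs:
    """Split `examples` into batches, predict each batch, merge the results."""
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    batch_outputs = [
        predict_batch(examples[start : start + batch_size])
        for start in range(0, len(examples), batch_size)
    ]
    return merge_batch_outputs(batch_outputs)


def toy_predict(batch: List[float]) -> Outputs:
    # Stand-in for a model-specific prediction step (assumption for the demo).
    return {"scores": [2 * x for x in batch]}


result = run_inference_sketch(toy_predict, [1.0, 2.0, 3.0, 4.0, 5.0], batch_size=2)
```

The caller only supplies a per-batch prediction function; how the data is split and how the outputs are stitched back together stays in one place.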

Status (please check what you already did):

added some tests for the functionality
updated the documentation
updated the changelog (please check changelog for instructions)
reformat files using black (please check Readme for instructions)

github-actions · 2021-04-28T04:42:57Z

Commit: ba17484, The full report is available as an artifact.

Dataset: Carbon Bot, Dataset repository branch: main

| Configuration | Intent Classification Micro F1 | Entity Recognition Micro F1 | Response Selection Micro F1 |
|---|---|---|---|
| `BERT + DIET(bow) + ResponseSelector(bow)` test: `1m32s`, train: `4m2s`, total: `5m34s` | 0.7942 (0.00) | 0.7529 (0.00) | 0.5563 (0.00) |
| `BERT + DIET(seq) + ResponseSelector(t2t)` test: `1m55s`, train: `4m29s`, total: `6m24s` | 0.8019 (0.00) | 0.7896 (0.00) | 0.5485 (0.00) |
| `Sparse + BERT + DIET(bow) + ResponseSelector(bow)` test: `1m39s`, train: `4m34s`, total: `6m12s` | 0.7961 (-0.00) | 0.7529 (0.00) | 0.5762 (0.01) |
| `Sparse + BERT + DIET(seq) + ResponseSelector(t2t)` test: `2m0s`, train: `5m6s`, total: `7m6s` | 0.7961 (-0.01) | 0.7880 (-0.00) | 0.5714 (-0.02) |
| `Sparse + DIET(bow) + ResponseSelector(bow)` test: `42s`, train: `2m52s`, total: `3m34s` | 0.7243 (-0.00) | 0.7529 (0.00) | 0.4901 (0.01) |
| `Sparse + DIET(seq) + ResponseSelector(t2t)` test: `1m2s`, train: `4m19s`, total: `5m21s` | 0.7476 (0.02) | 0.7000 (0.01) | 0.5430 (0.02) |

Dataset: Hermit, Dataset repository branch: main

| Configuration | Intent Classification Micro F1 | Entity Recognition Micro F1 | Response Selection Micro F1 |
|---|---|---|---|
| `BERT + DIET(bow) + ResponseSelector(bow)` test: `2m54s`, train: `20m46s`, total: `23m39s` | 0.8866 (-0.00) | 0.7504 (0.00) | `no data` |
| `BERT + DIET(seq) + ResponseSelector(t2t)` test: `3m12s`, train: `12m54s`, total: `16m5s` | 0.8922 (0.00) | 0.7981 (0.00) | `no data` |
| `Sparse + BERT + DIET(bow) + ResponseSelector(bow)` test: `3m1s`, train: `24m17s`, total: `27m18s` | 0.8857 (0.02) | 0.7504 (0.00) | `no data` |
| `Sparse + BERT + DIET(seq) + ResponseSelector(t2t)` test: `3m16s`, train: `13m55s`, total: `17m11s` | 0.8829 (0.02) | 0.7991 (-0.02) | `no data` |
| `Sparse + DIET(bow) + ResponseSelector(bow)` test: `1m11s`, train: `21m14s`, total: `22m26s` | 0.8309 (-0.00) | 0.7504 (0.00) | `no data` |
| `Sparse + DIET(seq) + ResponseSelector(t2t)` test: `1m28s`, train: `12m44s`, total: `14m12s` | 0.8513 (0.02) | 0.7582 (-0.00) | `no data` |

Dataset: Private 1, Dataset repository branch: main

| Configuration | Intent Classification Micro F1 | Entity Recognition Micro F1 | Response Selection Micro F1 |
|---|---|---|---|
| `BERT + DIET(bow) + ResponseSelector(bow)` test: `2m0s`, train: `3m31s`, total: `5m30s` | 0.9075 (0.00) | 0.9612 (0.00) | `no data` |
| `BERT + DIET(seq) + ResponseSelector(t2t)` test: `2m21s`, train: `3m20s`, total: `5m40s` | 0.9116 (0.00) | 0.9726 (0.00) | `no data` |
| `Spacy + DIET(bow) + ResponseSelector(bow)` test: `34s`, train: `2m45s`, total: `3m18s` | 0.8503 (0.00) | 0.9574 (0.00) | `no data` |
| `Spacy + DIET(seq) + ResponseSelector(t2t)` test: `57s`, train: `3m16s`, total: `4m13s` | 0.8565 (0.00) | 0.9377 (0.00) | `no data` |
| `Sparse + DIET(bow) + ResponseSelector(bow)` test: `29s`, train: `3m16s`, total: `3m45s` | 0.9012 (0.01) | 0.9612 (0.00) | `no data` |
| `Sparse + DIET(seq) + ResponseSelector(t2t)` test: `50s`, train: `3m10s`, total: `3m59s` | 0.9023 (-0.00) | 0.9735 (0.00) | `no data` |
| `Sparse + Spacy + DIET(bow) + ResponseSelector(bow)` test: `39s`, train: `3m57s`, total: `4m35s` | 0.9033 (0.01) | 0.9574 (0.00) | `no data` |
| `Sparse + Spacy + DIET(seq) + ResponseSelector(t2t)` test: `1m1s`, train: `3m40s`, total: `4m41s` | 0.8867 (-0.01) | 0.9711 (-0.00) | `no data` |

Dataset: Private 2, Dataset repository branch: main

| Configuration | Intent Classification Micro F1 | Entity Recognition Micro F1 | Response Selection Micro F1 |
|---|---|---|---|
| `BERT + DIET(bow) + ResponseSelector(bow)` test: `2m4s`, train: `11m19s`, total: `13m23s` | 0.8734 (0.00) | `no data` | `no data` |
| `Spacy + DIET(bow) + ResponseSelector(bow)` test: `42s`, train: `5m37s`, total: `6m18s` | 0.7275 (0.00) | `no data` | `no data` |
| `Spacy + DIET(seq) + ResponseSelector(t2t)` test: `49s`, train: `5m33s`, total: `6m22s` | 0.7897 (0.00) | `no data` | `no data` |
| `Sparse + DIET(bow) + ResponseSelector(bow)` test: `38s`, train: `5m4s`, total: `5m42s` | 0.8519 (0.00) | `no data` | `no data` |
| `Sparse + DIET(seq) + ResponseSelector(t2t)` test: `44s`, train: `4m54s`, total: `5m38s` | 0.8519 (0.00) | `no data` | `no data` |
| `Sparse + Spacy + DIET(bow) + ResponseSelector(bow)` test: `49s`, train: `7m43s`, total: `8m32s` | 0.8637 (0.02) | `no data` | `no data` |
| `Sparse + Spacy + DIET(seq) + ResponseSelector(t2t)` test: `54s`, train: `6m8s`, total: `7m2s` | 0.8755 (0.02) | `no data` | `no data` |

Dataset: Private 3, Dataset repository branch: main

| Configuration | Intent Classification Micro F1 | Entity Recognition Micro F1 | Response Selection Micro F1 |
|---|---|---|---|
| `BERT + DIET(bow) + ResponseSelector(bow)` test: `1m3s`, train: `1m4s`, total: `2m6s` | 0.9136 (0.00) | `no data` | `no data` |
| `BERT + DIET(seq) + ResponseSelector(t2t)` test: `1m6s`, train: `47s`, total: `1m53s` | 0.8560 (0.00) | `no data` | `no data` |
| `Spacy + DIET(bow) + ResponseSelector(bow)` test: `39s`, train: `52s`, total: `1m31s` | 0.6049 (0.00) | `no data` | `no data` |
| `Spacy + DIET(seq) + ResponseSelector(t2t)` test: `43s`, train: `42s`, total: `1m25s` | 0.5967 (0.00) | `no data` | `no data` |
| `Sparse + DIET(bow) + ResponseSelector(bow)` test: `35s`, train: `1m2s`, total: `1m37s` | 0.8477 (0.01) | `no data` | `no data` |
| `Sparse + DIET(seq) + ResponseSelector(t2t)` test: `39s`, train: `43s`, total: `1m21s` | 0.8560 (0.02) | `no data` | `no data` |
| `Sparse + Spacy + DIET(bow) + ResponseSelector(bow)` test: `40s`, train: `1m13s`, total: `1m52s` | 0.8807 (0.01) | `no data` | `no data` |
| `Sparse + Spacy + DIET(seq) + ResponseSelector(t2t)` test: `44s`, train: `49s`, total: `1m33s` | 0.8560 (-0.02) | `no data` | `no data` |

Dataset: Sara, Dataset repository branch: main

| Configuration | Intent Classification Micro F1 | Entity Recognition Micro F1 | Response Selection Micro F1 |
|---|---|---|---|
| `BERT + DIET(bow) + ResponseSelector(bow)` test: `2m30s`, train: `4m42s`, total: `7m11s` | 0.8492 (0.00) | 0.8683 (0.00) | 0.8652 (-0.00) |
| `BERT + DIET(seq) + ResponseSelector(t2t)` test: `2m50s`, train: `3m48s`, total: `6m38s` | 0.8531 (0.00) | 0.8774 (0.00) | 0.8696 (0.00) |
| `Sparse + BERT + DIET(bow) + ResponseSelector(bow)` test: `2m41s`, train: `7m9s`, total: `9m50s` | 0.8737 (0.01) | 0.8683 (0.00) | 0.8957 (0.00) |
| `Sparse + BERT + DIET(seq) + ResponseSelector(t2t)` test: `3m1s`, train: `5m2s`, total: `8m3s` | 0.8737 (0.01) | 0.9019 (-0.00) | 0.8957 (-0.01) |
| `Sparse + DIET(bow) + ResponseSelector(bow)` test: `55s`, train: `5m11s`, total: `6m6s` | 0.8355 (0.01) | 0.8683 (0.00) | 0.8609 (-0.00) |

samsucik · 2021-04-28T07:34:03Z

@dakshvar22 I can see a failed training run on Sara with:

    ...
    ...
    class RasaTrainingLogger(tf.keras.callbacks.Callback):
    AttributeError: module 'tensorflow' has no attribute 'keras'

Is this something to be fixed in this PR, or is it unrelated?

dakshvar22 · 2021-04-28T07:37:00Z

I don't think that's related, but it's weird that it fails specifically on that dataset/config pair.
Let me rerun that configuration once more.

github-actions · 2021-04-28T07:40:57Z

Hey @dakshvar22! 👋 To run model regression tests, comment with the /modeltest command and a configuration.

Tips 💡: The model regression tests run on push events. You can re-run them by re-adding the status:model-regression-tests label or by using the Re-run jobs button in the GitHub Actions workflow.

Tips 💡: Whenever you want to change the configuration, edit the comment that contains the previous configuration.

You can copy this in your comment and customize:

/modeltest

```yml
##########
## Available datasets
##########
# - "Carbon Bot"
# - "Hermit"
# - "Private 1"
# - "Private 2"
# - "Private 3"
# - "Sara"

##########
## Available configurations
##########
# - "BERT + DIET(bow) + ResponseSelector(bow)"
# - "BERT + DIET(seq) + ResponseSelector(t2t)"
# - "Spacy + DIET(bow) + ResponseSelector(bow)"
# - "Spacy + DIET(seq) + ResponseSelector(t2t)"
# - "Sparse + BERT + DIET(bow) + ResponseSelector(bow)"
# - "Sparse + BERT + DIET(seq) + ResponseSelector(t2t)"
# - "Sparse + DIET(bow) + ResponseSelector(bow)"
# - "Sparse + DIET(seq) + ResponseSelector(t2t)"
# - "Sparse + Spacy + DIET(bow) + ResponseSelector(bow)"
# - "Sparse + Spacy + DIET(seq) + ResponseSelector(t2t)"

## Example configuration
#################### syntax #################
## include:
##   - dataset: ["<dataset_name>"]
##     config: ["<configuration_name>"]
#
## Example:
## include:
##  - dataset: ["Carbon Bot"]
##    config: ["Sparse + DIET(bow) + ResponseSelector(bow)"]
#
## Shortcut:
## You can use the "all" shortcut to include all available configurations or datasets
#
## Example: Use the "Sparse + DIET(bow) + ResponseSelector(bow)" configuration
## for all available datasets
## include:
##  - dataset: ["all"]
##    config: ["Sparse + DIET(bow) + ResponseSelector(bow)"]
#
## Example: Use all available configurations for the "Carbon Bot" and "Sara" datasets
## and for the "Hermit" dataset use the "Sparse + DIET(seq) + ResponseSelector(t2t)" and
## "BERT + DIET(seq) + ResponseSelector(t2t)" configurations:
## include:
##  - dataset: ["Carbon Bot", "Sara"]
##    config: ["all"]
##  - dataset: ["Hermit"]
##    config: ["Sparse + DIET(seq) + ResponseSelector(t2t)", "BERT + DIET(seq) + ResponseSelector(t2t)"]
#
## Example: Define a branch name to check-out for a dataset repository. Default branch is 'main'
## dataset_branch: "test-branch"
## include:
##  - dataset: ["Carbon Bot", "Sara"]
##    config: ["all"]


include:
 - dataset: ["Carbon Bot"]
   config: ["Sparse + DIET(bow) + ResponseSelector(bow)"]

```

github-actions · 2021-04-28T07:40:59Z

/modeltest

include:
 - dataset: ["Sara"]
   config: ["Sparse + DIET(seq) + ResponseSelector(t2t)"]

github-actions · 2021-04-28T07:41:02Z

The model regression tests have started. It might take a while, please be patient.
As soon as results are ready you'll see a new comment with the results.

Used configuration can be found in the comment.

github-actions · 2021-04-28T07:53:04Z

Commit: ba17484, The full report is available as an artifact.

Dataset: Sara, Dataset repository branch: main

| Configuration | Intent Classification Micro F1 | Entity Recognition Micro F1 | Response Selection Micro F1 |
|---|---|---|---|
| `Sparse + DIET(seq) + ResponseSelector(t2t)` test: `1m6s`, train: `3m49s`, total: `4m55s` | 0.8521 (0.01) | 0.8470 (0.03) | 0.8652 (0.01) |

dakshvar22 · 2021-04-28T07:54:15Z

@samsucik Okay it ran successfully this time.

samsucik

Thanks, Daksh, especially for adding the tests! A few tiny things may require changes, but nothing serious as far as I can see 🙂

rasa/utils/tensorflow/models.py

tests/utils/tensorflow/test_models.py

Co-authored-by: Sam Sucik <[email protected]>

dakshvar22 · 2021-04-28T11:17:06Z

@samsucik I also made `rasa_predict` private, because ideally there should be only one public method for running inference, and that is now `run_inference`.
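
To make the design choice concrete, here is a small sketch of a base class with one public entry point and a private per-batch hook. It is only an illustration under assumptions: `BaseModelSketch`, `DoublingModel`, and `_predict_batch` are hypothetical names, not the classes or methods in `rasa/utils/tensorflow/models.py`.

```python
from typing import Dict, List


class BaseModelSketch:
    """Hypothetical base class exposing a single public inference method."""

    def run_inference(
        self, examples: List[float], batch_size: int = 2
    ) -> Dict[str, List[float]]:
        """Public entry point: batch the input and merge per-batch outputs."""
        merged: Dict[str, List[float]] = {}
        for start in range(0, len(examples), batch_size):
            batch_output = self._predict_batch(examples[start : start + batch_size])
            for key, values in batch_output.items():
                merged.setdefault(key, []).extend(values)
        return merged

    def _predict_batch(self, batch: List[float]) -> Dict[str, List[float]]:
        """Private hook: subclasses implement the model-specific prediction."""
        raise NotImplementedError


class DoublingModel(BaseModelSketch):
    """Toy subclass standing in for a concrete model (assumption for the demo)."""

    def _predict_batch(self, batch: List[float]) -> Dict[str, List[float]]:
        return {"scores": [2 * x for x in batch]}


model = DoublingModel()
predictions = model.run_inference([1.0, 2.0, 3.0])
```

With the per-batch step marked private by convention, callers have exactly one method to reach for, and batching behaviour cannot be bypassed by accident.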

samsucik

Looks good 🚀

rasa/utils/tensorflow/models.py

tests/utils/tensorflow/test_models.py

Co-authored-by: Sam Sucik <[email protected]>

dakshvar22 · 2021-04-28T11:49:45Z

Thanks for the good discussion @samsucik 🙌

dakshvar22 added 2 commits April 27, 2021 22:22

use data generator for inference

e2705c3

refactor to create the generator inside to avoid duplication

6fd3fff

dakshvar22 changed the title ~~Refactor inference to use data generators instead of pre-prepared batches~~ Implement interface for bulk inferencing in TF models Apr 27, 2021

added tests and changelog

ac764ec

dakshvar22 requested review from samsucik and JEM-Mosig April 27, 2021 22:06

dakshvar22 added runner:gpu status:model-regression-tests and removed status:model-regression-tests labels Apr 27, 2021

github-actions bot deleted a comment from dakshvar22 Apr 27, 2021

dakshvar22 self-assigned this Apr 27, 2021

github-actions bot removed status:model-regression-tests runner:gpu labels Apr 28, 2021

dakshvar22 added runner:gpu status:model-regression-tests labels Apr 28, 2021

github-actions bot removed status:model-regression-tests runner:gpu labels Apr 28, 2021

samsucik suggested changes Apr 28, 2021

dakshvar22 and others added 3 commits April 28, 2021 11:37

Apply suggestions from code review

9eb8a2b

Co-authored-by: Sam Sucik <[email protected]>

review comments

0c5ae99

fix docstring

e08e649

place merge method inside RasaModel. Make rasa_predict private

0517008

dakshvar22 requested a review from samsucik April 28, 2021 11:14

satisfy linter

a092ba7

samsucik approved these changes Apr 28, 2021

dakshvar22 and others added 2 commits April 28, 2021 13:48

Update rasa/utils/tensorflow/models.py

0bd1f5c

Co-authored-by: Sam Sucik <[email protected]>

change annotation type

82f90e5

dakshvar22 merged commit cd62d41 into main Apr 28, 2021

dakshvar22 deleted the predict_generator branch April 28, 2021 12:11
