
Refactor feature combining layers in DIET and TED #7589

Merged
merged 144 commits into main from e2e-feature-combining-layers-2.2 on Apr 27, 2021

Conversation

@samsucik (Contributor) commented Dec 17, 2020

Proposed changes:

  • Move the feature-processing logic used by TED, DIET and ResponseSelector into separate layer components (rasa_layers) as part of the e2e leftovers (Bring e2e in *for real* left-overs #6670); closes Turn feature combining in TED/DIET into re-usable layers #7187. Unlike the layers in layers.py, these layers are meant to be used only with Rasa models (hence they also consume and utilise model configs). They sit between the model-specific code, such as ted_policy.py or models.py, and the model-agnostic code in layers.py. The newly introduced layers form a pipeline, from the bottom level to the top:
    • ConcatenateSparseDenseFeatures combines multiple sparse and dense feature tensors into one.
    • RasaFeatureCombiningLayer additionally combines sequence-level and sentence-level features.
    • RasaSequenceLayer (only used for attributes with sequence-level features) additionally embeds the combined features with fully connected layers and a transformer, and facilitates masked language modeling.
  • Add unit tests for the new layers, see this comment for an overview.
  • Fix a small bug in the masked language modeling implementation (which likely wasn't causing any serious degradation in model performance).
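The three layer names above come from this PR, but their internals are not shown here. As a rough illustration only, the pipeline's data flow can be sketched in NumPy: sparse feature matrices are densified and concatenated with dense ones (ConcatenateSparseDenseFeatures), and the single sentence-level feature vector is then combined with the token sequence (RasaFeatureCombiningLayer). All shapes, the random projection, and the append-to-sequence strategy are illustrative assumptions, not the actual Rasa implementation:

```python
import numpy as np

def concatenate_sparse_dense(sparse_feats, dense_feats, dense_dim=4):
    """Sketch of ConcatenateSparseDenseFeatures: densify each sparse feature
    matrix (here via a fixed random projection standing in for a trained
    dense-for-sparse embedding), then concatenate along the feature axis."""
    rng = np.random.default_rng(0)
    densified = [f @ rng.standard_normal((f.shape[-1], dense_dim))
                 for f in sparse_feats]
    return np.concatenate(densified + dense_feats, axis=-1)

def combine_sequence_sentence(sequence_feats, sentence_feats):
    """Sketch of RasaFeatureCombiningLayer: treat the sentence-level feature
    vector as one extra position appended after the token sequence."""
    return np.concatenate([sequence_feats, sentence_feats], axis=0)

# Toy input: 5 tokens with one sparse (10-dim) and one dense (3-dim)
# feature set, plus a 1 x 7 sentence-level feature vector.
rng = np.random.default_rng(42)
seq_sparse = (rng.random((5, 10)) > 0.8).astype(float)
seq_dense = rng.random((5, 3))
sentence = rng.random((1, 7))

seq_combined = concatenate_sparse_dense([seq_sparse], [seq_dense])  # (5, 7)
combined = combine_sequence_sentence(seq_combined, sentence)        # (6, 7)
print(combined.shape)
```

In the real code, RasaSequenceLayer would then run the combined tensor through fully connected layers and a transformer; that step is omitted here.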

To-do:

  • with ideas/help from engineering:
    • improve the naming around _prepare_sequence_sentence_concat (see this comment for context)
    • get the new unit tests (testing the ML layer classes) to a state where we agree that they're good examples worth following
    • agree on using constant names vs their values in docstrings (see this Slack message). Decision: either is acceptable, staying with constants' values.
  • add type annotations to added unit tests
  • update MINIMUM_COMPATIBLE_VERSION (due to model-breaking changes): moved to Increase MINIMUM_COMPATIBLE_VERSION before releasing 2.6.0 #8498

Status (please check what you already did):

  • added some tests for the functionality
  • updated the documentation
  • updated the changelog (please check changelog for instructions)
  • reformat files using black (please check Readme for instructions)

@Ghostvv Ghostvv mentioned this pull request Jan 12, 2021
7 tasks
@samsucik samsucik added tools:clear-poetry-cache-unit-tests Clear poetry cache for the unit tests and removed unit-tests:clear-poetry-cache labels Jan 19, 2021
@twerkmeister (Contributor) left a comment


👍

@github-actions

Hey @samsucik! 👋 To run model regression tests, comment with the /modeltest command and a configuration.

Tips 💡: The model regression tests are run on push events. You can re-run the tests by re-adding the status:model-regression-tests label or by using the "Re-run jobs" button in the GitHub Actions workflow.

Tips 💡: Whenever you want to change the configuration, edit the comment containing the previous configuration.

You can copy the following into your comment and customize it:

/modeltest

```yml
##########
## Available datasets
##########
# - "Carbon Bot"
# - "Hermit"
# - "Private 1"
# - "Private 2"
# - "Private 3"
# - "Sara"

##########
## Available configurations
##########
# - "BERT + DIET(bow) + ResponseSelector(bow)"
# - "BERT + DIET(seq) + ResponseSelector(t2t)"
# - "Spacy + DIET(bow) + ResponseSelector(bow)"
# - "Spacy + DIET(seq) + ResponseSelector(t2t)"
# - "Sparse + BERT + DIET(bow) + ResponseSelector(bow)"
# - "Sparse + BERT + DIET(seq) + ResponseSelector(t2t)"
# - "Sparse + DIET(bow) + ResponseSelector(bow)"
# - "Sparse + DIET(seq) + ResponseSelector(t2t)"
# - "Sparse + Spacy + DIET(bow) + ResponseSelector(bow)"
# - "Sparse + Spacy + DIET(seq) + ResponseSelector(t2t)"

## Example configuration
#################### syntax #################
## include:
##   - dataset: ["<dataset_name>"]
##     config: ["<configuration_name>"]
#
## Example:
## include:
##  - dataset: ["Carbon Bot"]
##    config: ["Sparse + DIET(bow) + ResponseSelector(bow)"]
#
## Shortcut:
## You can use the "all" shortcut to include all available configurations or datasets
#
## Example: Use the "Sparse + DIET(bow) + ResponseSelector(bow)" configuration
## for all available datasets
## include:
##  - dataset: ["all"]
##    config: ["Sparse + DIET(bow) + ResponseSelector(bow)"]
#
## Example: Use all available configurations for the "Carbon Bot" and "Sara" datasets
## and for the "Hermit" dataset use the "Sparse + DIET + ResponseSelector(T2T)" and
## "BERT + DIET + ResponseSelector(T2T)" configurations:
## include:
##  - dataset: ["Carbon Bot", "Sara"]
##    config: ["all"]
##  - dataset: ["Hermit"]
##    config: ["Sparse + DIET(seq) + ResponseSelector(t2t)", "BERT + DIET(seq) + ResponseSelector(t2t)"]
#
## Example: Define a branch name to check-out for a dataset repository. Default branch is 'main'
## dataset_branch: "test-branch"
## include:
##  - dataset: ["Carbon Bot", "Sara"]
##    config: ["all"]


include:
 - dataset: ["Carbon Bot"]
   config: ["Sparse + DIET(bow) + ResponseSelector(bow)"]

```

1 similar comment

@github-actions

/modeltest

```yml
# dataset_branch: "no_random_seed"
include:
  - dataset: ["all"]
    config: ["all"]
```

1 similar comment

@github-actions

The model regression tests have started. It might take a while, please be patient.
As soon as results are ready you'll see a new comment with the results.

Used configuration can be found in the comment.

1 similar comment

@github-actions

Commit: 5f43e5a. The full report is available as an artifact.

Dataset: Carbon Bot, Dataset repository branch: main

| Configuration | Intent Classification Micro F1 | Entity Recognition Micro F1 | Response Selection Micro F1 |
|---|---|---|---|
| BERT + DIET(bow) + ResponseSelector(bow)<br>test: 1m23s, train: 3m51s, total: 5m14s | 0.7942 (0.00) | 0.7529 (0.00) | 0.5563 (0.00) |
| BERT + DIET(seq) + ResponseSelector(t2t)<br>test: 1m42s, train: 4m8s, total: 5m50s | 0.8019 (0.00) | 0.7896 (0.00) | 0.5485 (0.00) |
| Sparse + BERT + DIET(bow) + ResponseSelector(bow)<br>test: 1m30s, train: 4m42s, total: 6m11s | 0.7903 (-0.01) | 0.7529 (0.00) | 0.5497 (-0.02) |
| Sparse + BERT + DIET(seq) + ResponseSelector(t2t)<br>test: 1m48s, train: 4m42s, total: 6m29s | 0.7961 (-0.01) | 0.7880 (-0.00) | 0.5960 (0.00) |
| Sparse + DIET(bow) + ResponseSelector(bow)<br>test: 38s, train: 2m40s, total: 3m17s | 0.7184 (-0.01) | 0.7529 (0.00) | 0.4967 (0.01) |
| Sparse + DIET(seq) + ResponseSelector(t2t)<br>test: 57s, train: 3m55s, total: 4m51s | 0.7476 (0.02) | 0.7000 (0.01) | 0.5033 (-0.02) |

Dataset: Hermit, Dataset repository branch: main

| Configuration | Intent Classification Micro F1 | Entity Recognition Micro F1 | Response Selection Micro F1 |
|---|---|---|---|
| BERT + DIET(bow) + ResponseSelector(bow)<br>test: 2m39s, train: 19m2s, total: 21m41s | 0.8857 (-0.00) | 0.7504 (0.00) | no data |
| BERT + DIET(seq) + ResponseSelector(t2t)<br>test: 2m46s, train: 12m6s, total: 14m52s | 0.8922 (0.00) | 0.7981 (0.00) | no data |
| Sparse + BERT + DIET(bow) + ResponseSelector(bow)<br>test: 2m34s, train: 21m33s, total: 24m7s | 0.8857 (0.02) | 0.7504 (0.00) | no data |
| Sparse + BERT + DIET(seq) + ResponseSelector(t2t)<br>test: 2m49s, train: 13m3s, total: 15m51s | 0.8838 (0.02) | 0.7984 (-0.02) | no data |
| Sparse + DIET(bow) + ResponseSelector(bow)<br>test: 1m2s, train: 18m48s, total: 19m49s | 0.8318 (-0.00) | 0.7504 (0.00) | no data |
| Sparse + DIET(seq) + ResponseSelector(t2t)<br>test: 1m18s, train: 12m9s, total: 13m26s | 0.8513 (0.02) | 0.7575 (-0.00) | no data |

Dataset: Private 1, Dataset repository branch: main

| Configuration | Intent Classification Micro F1 | Entity Recognition Micro F1 | Response Selection Micro F1 |
|---|---|---|---|
| BERT + DIET(bow) + ResponseSelector(bow)<br>test: 1m47s, train: 3m12s, total: 4m59s | 0.9075 (0.00) | 0.9612 (0.00) | no data |
| BERT + DIET(seq) + ResponseSelector(t2t)<br>test: 2m7s, train: 3m3s, total: 5m10s | 0.9116 (0.00) | 0.9726 (0.00) | no data |
| Spacy + DIET(bow) + ResponseSelector(bow)<br>test: 31s, train: 2m28s, total: 2m58s | 0.8503 (0.00) | 0.9574 (0.00) | no data |
| Spacy + DIET(seq) + ResponseSelector(t2t)<br>test: 51s, train: 2m58s, total: 3m49s | 0.8565 (0.00) | 0.9377 (0.00) | no data |
| Sparse + DIET(bow) + ResponseSelector(bow)<br>test: 26s, train: 2m58s, total: 3m23s | 0.8992 (0.01) | 0.9612 (0.00) | no data |
| Sparse + DIET(seq) + ResponseSelector(t2t)<br>test: 45s, train: 2m51s, total: 3m35s | 0.9023 (-0.00) | 0.9735 (0.00) | no data |
| Sparse + Spacy + DIET(bow) + ResponseSelector(bow)<br>test: 33s, train: 3m35s, total: 4m8s | 0.8971 (0.00) | 0.9574 (0.00) | no data |
| Sparse + Spacy + DIET(seq) + ResponseSelector(t2t)<br>test: 55s, train: 3m21s, total: 4m15s | 0.8898 (-0.00) | 0.9693 (-0.00) | no data |

Dataset: Private 2, Dataset repository branch: main

| Configuration | Intent Classification Micro F1 | Entity Recognition Micro F1 | Response Selection Micro F1 |
|---|---|---|---|
| BERT + DIET(bow) + ResponseSelector(bow)<br>test: 1m51s, train: 10m45s, total: 12m36s | 0.8734 (0.00) | no data | no data |
| Spacy + DIET(bow) + ResponseSelector(bow)<br>test: 37s, train: 5m23s, total: 6m0s | 0.7275 (0.00) | no data | no data |
| Spacy + DIET(seq) + ResponseSelector(t2t)<br>test: 43s, train: 5m27s, total: 6m9s | 0.7897 (0.00) | no data | no data |
| Sparse + DIET(bow) + ResponseSelector(bow)<br>test: 33s, train: 4m45s, total: 5m17s | 0.8552 (0.00) | no data | no data |
| Sparse + DIET(seq) + ResponseSelector(t2t)<br>test: 38s, train: 4m45s, total: 5m23s | 0.8519 (0.00) | no data | no data |
| Sparse + Spacy + DIET(bow) + ResponseSelector(bow)<br>test: 42s, train: 7m6s, total: 7m47s | 0.8659 (0.02) | no data | no data |
| Sparse + Spacy + DIET(seq) + ResponseSelector(t2t)<br>test: 46s, train: 5m56s, total: 6m42s | 0.8755 (0.02) | no data | no data |

Dataset: Private 3, Dataset repository branch: main

| Configuration | Intent Classification Micro F1 | Entity Recognition Micro F1 | Response Selection Micro F1 |
|---|---|---|---|
| BERT + DIET(bow) + ResponseSelector(bow)<br>test: 56s, train: 58s, total: 1m53s | 0.9136 (0.00) | no data | no data |
| BERT + DIET(seq) + ResponseSelector(t2t)<br>test: 59s, train: 42s, total: 1m41s | 0.8560 (0.00) | no data | no data |
| Spacy + DIET(bow) + ResponseSelector(bow)<br>test: 34s, train: 47s, total: 1m20s | 0.6049 (0.00) | no data | no data |
| Spacy + DIET(seq) + ResponseSelector(t2t)<br>test: 37s, train: 38s, total: 1m15s | 0.5967 (0.00) | no data | no data |
| Sparse + DIET(bow) + ResponseSelector(bow)<br>test: 30s, train: 55s, total: 1m25s | 0.8477 (0.01) | no data | no data |
| Sparse + DIET(seq) + ResponseSelector(t2t)<br>test: 34s, train: 39s, total: 1m12s | 0.8560 (0.02) | no data | no data |
| Sparse + Spacy + DIET(bow) + ResponseSelector(bow)<br>test: 34s, train: 1m3s, total: 1m37s | 0.8807 (0.01) | no data | no data |
| Sparse + Spacy + DIET(seq) + ResponseSelector(t2t)<br>test: 38s, train: 43s, total: 1m21s | 0.8560 (-0.02) | no data | no data |

Dataset: Sara, Dataset repository branch: main

| Configuration | Intent Classification Micro F1 | Entity Recognition Micro F1 | Response Selection Micro F1 |
|---|---|---|---|
| BERT + DIET(bow) + ResponseSelector(bow)<br>test: 2m10s, train: 4m10s, total: 6m20s | 0.8492 (0.00) | 0.8683 (0.00) | 0.8696 (0.00) |
| BERT + DIET(seq) + ResponseSelector(t2t)<br>test: 2m27s, train: 3m23s, total: 5m50s | 0.8531 (0.00) | 0.8774 (0.00) | 0.8696 (0.00) |
| Sparse + BERT + DIET(bow) + ResponseSelector(bow)<br>test: 2m20s, train: 6m28s, total: 8m48s | 0.8756 (0.01) | 0.8683 (0.00) | 0.8913 (0.00) |
| Sparse + BERT + DIET(seq) + ResponseSelector(t2t)<br>test: 2m37s, train: 4m31s, total: 7m8s | 0.8737 (0.01) | 0.9019 (-0.00) | 0.9000 (-0.01) |
| Sparse + DIET(bow) + ResponseSelector(bow)<br>test: 49s, train: 4m50s, total: 5m38s | 0.8345 (0.00) | 0.8683 (0.00) | 0.8609 (-0.00) |
| Sparse + DIET(seq) + ResponseSelector(t2t)<br>test: 1m7s, train: 3m42s, total: 4m49s | 0.8521 (0.01) | 0.8470 (0.03) | 0.8630 (0.00) |

@github-actions

Commit: 5f43e5a. The full report is available as an artifact.

Dataset: Carbon Bot, Dataset repository branch: main

| Configuration | Intent Classification Micro F1 | Entity Recognition Micro F1 | Response Selection Micro F1 |
|---|---|---|---|
| BERT + DIET(bow) + ResponseSelector(bow)<br>test: 1m22s, train: 3m39s, total: 5m0s | 0.7942 (0.00) | 0.7529 (0.00) | 0.5563 (0.00) |
| BERT + DIET(seq) + ResponseSelector(t2t)<br>test: 1m41s, train: 4m1s, total: 5m42s | 0.8019 (0.00) | 0.7896 (0.00) | 0.5485 (0.00) |
| Sparse + BERT + DIET(bow) + ResponseSelector(bow)<br>test: 1m28s, train: 4m19s, total: 5m46s | 0.7961 (-0.00) | 0.7529 (0.00) | 0.5364 (-0.03) |
| Sparse + BERT + DIET(seq) + ResponseSelector(t2t)<br>test: 1m45s, train: 4m35s, total: 6m20s | 0.7961 (-0.01) | 0.7880 (-0.00) | 0.5695 (-0.03) |
| Sparse + DIET(bow) + ResponseSelector(bow)<br>test: 36s, train: 2m36s, total: 3m12s | 0.7262 (-0.00) | 0.7529 (0.00) | 0.5050 (0.02) |
| Sparse + DIET(seq) + ResponseSelector(t2t)<br>test: 56s, train: 3m49s, total: 4m44s | 0.7476 (0.02) | 0.7000 (0.01) | 0.5033 (-0.02) |

Dataset: Hermit, Dataset repository branch: main

| Configuration | Intent Classification Micro F1 | Entity Recognition Micro F1 | Response Selection Micro F1 |
|---|---|---|---|
| BERT + DIET(bow) + ResponseSelector(bow)<br>test: 2m30s, train: 18m6s, total: 20m36s | 0.8866 (-0.00) | 0.7504 (0.00) | no data |
| BERT + DIET(seq) + ResponseSelector(t2t)<br>test: 2m55s, train: 12m24s, total: 15m18s | 0.8922 (0.00) | 0.7981 (0.00) | no data |
| Sparse + BERT + DIET(bow) + ResponseSelector(bow)<br>test: 2m43s, train: 22m33s, total: 25m16s | 0.8866 (0.02) | 0.7504 (0.00) | no data |
| Sparse + BERT + DIET(seq) + ResponseSelector(t2t)<br>test: 3m2s, train: 13m30s, total: 16m31s | 0.8838 (0.02) | 0.8000 (-0.01) | no data |
| Sparse + DIET(bow) + ResponseSelector(bow)<br>test: 1m3s, train: 19m21s, total: 20m24s | 0.8346 (-0.00) | 0.7504 (0.00) | no data |
| Sparse + DIET(seq) + ResponseSelector(t2t)<br>test: 1m22s, train: 12m23s, total: 13m44s | 0.8513 (0.02) | 0.7577 (-0.00) | no data |

Dataset: Private 1, Dataset repository branch: main

| Configuration | Intent Classification Micro F1 | Entity Recognition Micro F1 | Response Selection Micro F1 |
|---|---|---|---|
| BERT + DIET(bow) + ResponseSelector(bow)<br>test: 1m51s, train: 3m23s, total: 5m14s | 0.9075 (0.00) | 0.9612 (0.00) | no data |
| BERT + DIET(seq) + ResponseSelector(t2t)<br>test: 2m13s, train: 3m12s, total: 5m24s | 0.9116 (0.00) | 0.9726 (0.00) | no data |
| Spacy + DIET(bow) + ResponseSelector(bow)<br>test: 33s, train: 2m39s, total: 3m11s | 0.8503 (0.00) | 0.9574 (0.00) | no data |
| Spacy + DIET(seq) + ResponseSelector(t2t)<br>test: 56s, train: 3m8s, total: 4m3s | 0.8565 (0.00) | 0.9377 (0.00) | no data |
| Sparse + DIET(bow) + ResponseSelector(bow)<br>test: 27s, train: 3m8s, total: 3m34s | 0.8950 (0.00) | 0.9612 (0.00) | no data |
| Sparse + DIET(seq) + ResponseSelector(t2t)<br>test: 49s, train: 3m2s, total: 3m50s | 0.9012 (-0.00) | 0.9717 (0.00) | no data |
| Sparse + Spacy + DIET(bow) + ResponseSelector(bow)<br>test: 37s, train: 3m45s, total: 4m22s | 0.9023 (0.01) | 0.9574 (0.00) | no data |

Dataset: Private 2, Dataset repository branch: main

| Configuration | Intent Classification Micro F1 | Entity Recognition Micro F1 | Response Selection Micro F1 |
|---|---|---|---|
| BERT + DIET(bow) + ResponseSelector(bow)<br>test: 1m56s, train: 10m38s, total: 12m34s | 0.8734 (0.00) | no data | no data |
| Spacy + DIET(bow) + ResponseSelector(bow)<br>test: 41s, train: 5m45s, total: 6m26s | 0.7275 (0.00) | no data | no data |
| Spacy + DIET(seq) + ResponseSelector(t2t)<br>test: 48s, train: 5m33s, total: 6m20s | 0.7897 (0.00) | no data | no data |
| Sparse + DIET(bow) + ResponseSelector(bow)<br>test: 36s, train: 4m55s, total: 5m31s | 0.8616 (0.01) | no data | no data |
| Sparse + DIET(seq) + ResponseSelector(t2t)<br>test: 41s, train: 4m50s, total: 5m30s | 0.8519 (0.00) | no data | no data |
| Sparse + Spacy + DIET(bow) + ResponseSelector(bow)<br>test: 47s, train: 7m36s, total: 8m22s | 0.8637 (0.02) | no data | no data |
| Sparse + Spacy + DIET(seq) + ResponseSelector(t2t)<br>test: 52s, train: 6m5s, total: 6m57s | 0.8766 (0.02) | no data | no data |

Dataset: Private 3, Dataset repository branch: main

| Configuration | Intent Classification Micro F1 | Entity Recognition Micro F1 | Response Selection Micro F1 |
|---|---|---|---|
| BERT + DIET(bow) + ResponseSelector(bow)<br>test: 1m1s, train: 1m2s, total: 2m3s | 0.9136 (0.00) | no data | no data |
| BERT + DIET(seq) + ResponseSelector(t2t)<br>test: 1m5s, train: 45s, total: 1m50s | 0.8560 (0.00) | no data | no data |
| Spacy + DIET(bow) + ResponseSelector(bow)<br>test: 38s, train: 51s, total: 1m28s | 0.6049 (0.00) | no data | no data |
| Spacy + DIET(seq) + ResponseSelector(t2t)<br>test: 42s, train: 41s, total: 1m23s | 0.5967 (0.00) | no data | no data |
| Sparse + DIET(bow) + ResponseSelector(bow)<br>test: 34s, train: 1m0s, total: 1m33s | 0.8477 (0.01) | no data | no data |
| Sparse + DIET(seq) + ResponseSelector(t2t)<br>test: 37s, train: 41s, total: 1m18s | 0.8560 (0.02) | no data | no data |
| Sparse + Spacy + DIET(bow) + ResponseSelector(bow)<br>test: 38s, train: 1m10s, total: 1m48s | 0.8807 (0.01) | no data | no data |
| Sparse + Spacy + DIET(seq) + ResponseSelector(t2t)<br>test: 43s, train: 48s, total: 1m30s | 0.8560 (-0.02) | no data | no data |

Dataset: Sara, Dataset repository branch: main

| Configuration | Intent Classification Micro F1 | Entity Recognition Micro F1 | Response Selection Micro F1 |
|---|---|---|---|
| BERT + DIET(bow) + ResponseSelector(bow)<br>test: 2m23s, train: 4m32s, total: 6m54s | 0.8492 (0.00) | 0.8683 (0.00) | 0.8674 (0.00) |
| BERT + DIET(seq) + ResponseSelector(t2t)<br>test: 2m44s, train: 3m40s, total: 6m23s | 0.8531 (0.00) | 0.8774 (0.00) | 0.8696 (0.00) |
| Sparse + BERT + DIET(bow) + ResponseSelector(bow)<br>test: 2m36s, train: 6m54s, total: 9m29s | 0.8786 (0.01) | 0.8683 (0.00) | 0.8913 (0.00) |
| Sparse + BERT + DIET(seq) + ResponseSelector(t2t)<br>test: 2m54s, train: 4m50s, total: 7m44s | 0.8737 (0.01) | 0.9019 (-0.00) | 0.8891 (-0.02) |
| Sparse + DIET(bow) + ResponseSelector(bow)<br>test: 54s, train: 5m12s, total: 6m5s | 0.8306 (0.00) | 0.8683 (0.00) | 0.8609 (-0.00) |
| Sparse + DIET(seq) + ResponseSelector(t2t)<br>test: 1m6s, train: 3m43s, total: 4m49s | 0.8521 (0.01) | 0.8470 (0.03) | 0.8717 (0.01) |

@samsucik samsucik enabled auto-merge (squash) April 27, 2021 08:00
@samsucik samsucik merged commit ca66e34 into main Apr 27, 2021
@samsucik samsucik deleted the e2e-feature-combining-layers-2.2 branch April 27, 2021 10:33
Labels
tools:clear-poetry-cache-unit-tests Clear poetry cache for the unit tests
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Turn feature combining in TED/DIET into re-usable layers
6 participants