Model Checkpointing for DIET, TED and ResponseSelector #5985

tttthomasssss · 2020-06-10T08:58:53Z

Proposed changes:

Implemented model checkpointing for DIET. The best model seen during training is stored instead of just the last one. The model is evaluated at the interval given by evaluate_every_number_of_epochs. Checkpointing is enabled if and only if both of the following are set for the DIETClassifier, ResponseSelector and TEDPolicy in the config.yml file:
* checkpoint_model: True
* evaluate_on_number_of_examples > 0
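
As a minimal sketch of the enabling condition described above: the option names are taken from this description, but the helper `checkpointing_enabled` and the example values are hypothetical and are not part of Rasa's API. The dict stands in for one component entry from config.yml.

```python
from typing import Any, Dict

# Option names as they appear in the PR description.
CHECKPOINT_MODEL = "checkpoint_model"
EVAL_NUM_EXAMPLES = "evaluate_on_number_of_examples"


def checkpointing_enabled(component_config: Dict[str, Any]) -> bool:
    """Hypothetical helper: True only if both conditions from the description hold."""
    wants_checkpoint = bool(component_config.get(CHECKPOINT_MODEL, False))
    has_eval_examples = component_config.get(EVAL_NUM_EXAMPLES, 0) > 0
    return wants_checkpoint and has_eval_examples


# Stand-in for a DIETClassifier entry in config.yml (values are illustrative).
diet_config = {
    "name": "DIETClassifier",
    "epochs": 100,
    "checkpoint_model": True,
    "evaluate_on_number_of_examples": 200,
    "evaluate_every_number_of_epochs": 5,
}

print(checkpointing_enabled(diet_config))
```

Without hold-out examples there is nothing to score the model on, which is presumably why both options are required together.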

The model is stored to whatever location has been specified with the --out parameter when calling rasa train nlu ...

Status (please check what you already did):

* added some tests for the functionality
* updated the documentation
* updated the changelog (please check changelog for instructions)
* reformat files using black (please check Readme for instructions)

tabergma

Great start 🚀 Left some comments.

My main concern is that we don't need the MODEL_CHECKPOINT_DIR; I think we can simply store the intermediate best model in a tmp directory. Also, this solution should be independent of DIET: all Rasa models should be able to store the best model, e.g. TEDPolicy and ResponseSelector. I guess you also need the option checkpoint_model over there.
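
The idea of keeping the intermediate best model in a temporary directory can be sketched roughly as follows. This is not the code from this PR: the function `train_with_checkpointing`, its callback parameters, the single scalar score, and the "strictly greater score wins" rule are all assumptions made for illustration.

```python
import shutil
import tempfile
from pathlib import Path
from typing import Callable, Optional


def train_with_checkpointing(
    epochs: int,
    evaluate_every: int,
    train_one_epoch: Callable[[int], None],
    evaluate: Callable[[int], float],
    save_model: Callable[[Path], None],
    out_dir: Path,
) -> Optional[int]:
    """Train, keep the best-scoring model in a temp dir, then copy it to out_dir.

    Returns the 1-based epoch of the best model, or None if no evaluation ran
    (in that case the last model is saved instead).
    """
    best_score = float("-inf")
    best_epoch: Optional[int] = None
    with tempfile.TemporaryDirectory() as tmp:
        best_dir = Path(tmp) / "best_model"
        for epoch in range(1, epochs + 1):
            train_one_epoch(epoch)
            if epoch % evaluate_every != 0:
                continue
            score = evaluate(epoch)
            if score > best_score:  # assumption: higher is better, ties keep the earlier model
                best_score, best_epoch = score, epoch
                if best_dir.exists():
                    shutil.rmtree(best_dir)
                best_dir.mkdir(parents=True)
                save_model(best_dir)
        out_dir.mkdir(parents=True, exist_ok=True)
        if best_epoch is None:
            save_model(out_dir)  # no checkpoint was taken: fall back to the last model
        else:
            shutil.copytree(best_dir, out_dir, dirs_exist_ok=True)  # needs Python 3.8+
    return best_epoch


# Toy stand-ins for a real model: the "model" is just the current epoch number.
fake_scores = {2: 0.6, 4: 0.9, 6: 0.7}
state = {"epoch": 0}


def fake_train(epoch: int) -> None:
    state["epoch"] = epoch


def fake_eval(epoch: int) -> float:
    return fake_scores[epoch]


def fake_save(directory: Path) -> None:
    (directory / "model.txt").write_text(str(state["epoch"]))


with tempfile.TemporaryDirectory() as out:
    best = train_with_checkpointing(6, 2, fake_train, fake_eval, fake_save, Path(out))
    print(best, (Path(out) / "model.txt").read_text())
```

Because the function only depends on the callbacks passed in, the same loop could serve any model type, which is the independence from DIET asked for above.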

rasa/nlu/classifiers/diet_classifier.py

rasa/utils/tensorflow/models.py


tttthomasssss · 2020-06-12T09:38:15Z

Thanks very much for the review comments!

Checkpointing for TEDPolicy and the ResponseSelector is yet to be added.


tttthomasssss · 2020-09-09T08:30:53Z

@tabergma I...ahem...finally managed to get back to this issue. Could you kindly review the latest changes? Checkpointing is now also added to the response_selector (though likely the old version 😱 ) and TEDPolicy.

rasa/utils/tensorflow/models.py

tests/core/test_policies.py

tabergma

Left a few more comments. Thanks for tackling so many already. Once those comments are resolved and all checks pass, I guess we are good to go 💯

rasa/utils/tensorflow/models.py


tabergma

Looks great! Thanks for resolving all my comments 💯


Ghostvv · 2020-09-14T09:28:22Z

@tttthomasssss is it only for DIET or for TED as well?

tttthomasssss · 2020-09-14T09:34:43Z

@Ghostvv it covers TED and the ResponseSelector as well

parangitis · 2022-10-25T07:29:48Z

What parameter/metric do you use to determine the best model? Is it a combination of accuracy and loss, or only one of them?

add model checkpointing for DIET during training

4ca4fbb

tttthomasssss requested a review from tabergma June 10, 2020 08:58

tttthomasssss added 2 commits June 10, 2020 10:59

renamed changelog file to new pr number

027c201

please the linter

66a8d83

tabergma requested changes Jun 10, 2020

tttthomasssss and others added 8 commits June 12, 2020 10:59

remove model_checkpoint_dir and store model checkpoints at a temporary directory

8943786

adds log output to inform user at which epoch the best model occurred

9b5ace4

replace tempfile.mkdtemp with rasa.utils.io.create_temporary_directory()

178817c

Update rasa/utils/tensorflow/models.py

2f2e4f6

simplify conditional Co-authored-by: Tanja <[email protected]>

rename vars to sth more explicity

60b38d1

Update rasa/utils/tensorflow/models.py

fa97ba0

better var name Co-authored-by: Tanja <[email protected]>

Update rasa/utils/tensorflow/models.py

d8eeb04

Co-authored-by: Tanja <[email protected]>

Update rasa/utils/tensorflow/models.py

54bed8a

Co-authored-by: Tanja <[email protected]>

tttthomasssss added 3 commits June 12, 2020 11:39

Merge github.com:RasaHQ/rasa into model-checkpointing-diet

0496a79

Merge branch 'model-checkpointing-diet' of github.com:RasaHQ/rasa into model-checkpointing-diet

a4ad4de

erm, merge fail or so?!

e826422

tttthomasssss self-assigned this Jun 25, 2020

tttthomasssss force-pushed the model-checkpointing-diet branch from 32fe373 to a4ad4de Compare July 15, 2020 13:51

tttthomasssss added 10 commits July 15, 2020 15:52

add model checkpointing to response selector

4dabfd1

move the CHECKPOINT_MODEL const from rasa/nlu/constants to rasa/utils/tensorflow/constants

3853b2c

add checkpointing to TEDPolicy

0c2a23a

add checkpointing to docs

b1eee1e

intermediary checkin with lots of debugs etc

366fc2b

removed print

aaea96a

removed prints

32cb58a

removed print

726c966

removed prints

c7edcd7

prints the correct best epoch now :)

16ac725

tabergma reviewed Sep 10, 2020

rasa/utils/tensorflow/models.py (outdated)

tabergma reviewed Sep 10, 2020

tests/core/test_policies.py (outdated)

tabergma reviewed Sep 10, 2020

tttthomasssss added 5 commits September 11, 2020 09:26

updated improvement log

a649123

fixes epoch counter

f34559d

added type information to tests

1b0b95a

removes unnecessary call to eval after the training has finished

56b5cea

black formatting

2bba85e

tabergma reviewed Sep 11, 2020

rasa/utils/tensorflow/models.py (outdated)

tttthomasssss and others added 6 commits September 11, 2020 11:08

Update rasa/utils/tensorflow/models.py

2e48659

clearer check Co-authored-by: Tanja <[email protected]>

improves check for model checkpointing

aaef532

Merge branch 'model-checkpointing-diet' of github.com:RasaHQ/rasa into model-checkpointing-diet

70a63ed

improves check for model checkpointing

9a60c2b

black formatting

06f6328

Merge branch 'master' into model-checkpointing-diet

046dd89

tabergma approved these changes Sep 11, 2020

tabergma added the status:ready-to-merge label Sep 11, 2020

tttthomasssss added 6 commits September 11, 2020 13:40

changes format of stories test file from md to yml due to a failing check

0701bb8

move data

d05d0e4

Merge branch 'master' into model-checkpointing-diet

dcb9b56

fixes the tests

cb98688

changed names back to original names - and, well, thank you very much pycharm refactor

624c373

Merge branch 'master' into model-checkpointing-diet

2b78c6a

tttthomasssss merged commit 1d43d67 into master Sep 14, 2020

tttthomasssss deleted the model-checkpointing-diet branch September 14, 2020 09:11

Ghostvv changed the title ~~Model Checkpointing for DIET~~ Model Checkpointing for DIET, TED and ResponseSelector Sep 14, 2020

tttthomasssss mentioned this pull request Sep 17, 2020

Is there a plan to implement early stopping in model training? #6693

Closed

tttthomasssss commented Jun 10, 2020 (edited by Ghostvv)
