Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Model Checkpointing for DIET, TED and ResponseSelector #5985

Merged
merged 68 commits into from
Sep 14, 2020

Conversation

tttthomasssss
Copy link
Contributor

@tttthomasssss tttthomasssss commented Jun 10, 2020

Proposed changes:

Implemented model checkpointing for DIET. The best DIET model during training will be stored instead of just the last model. The model is evaluated on the basis of evaluate_every_number_of_epochs. Checkpointing is enabled iff the following is set for the DIETClassifier, ResponseSelector and TEDPolicy in the config.yml file:
* checkpoint_model: True
* evaluate_on_number_of_examples > 0

The model is stored to whatever location has been specified with the --out parameter when calling rasa train nlu ...

Status (please check what you already did):

  • added some tests for the functionality
  • updated the documentation
  • updated the changelog (please check changelog for instructions)
  • reformat files using black (please check Readme for instructions)

Copy link
Contributor

@tabergma tabergma left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Great start 🚀 Left some comments.

Main concern is that we don't need the MODEL_CHECKPOINT_DIR, I think we can simple store the intermediate best model in a tmp directory. Also this solution should be independent from DIET. All Rasa Models should be able to store the best model, e.g. TEDPolicy and ResponseSelector. I guess you also need the option checkpoint_model over there.

rasa/nlu/classifiers/diet_classifier.py Outdated Show resolved Hide resolved
rasa/nlu/classifiers/diet_classifier.py Outdated Show resolved Hide resolved
rasa/nlu/classifiers/diet_classifier.py Outdated Show resolved Hide resolved
rasa/utils/tensorflow/models.py Outdated Show resolved Hide resolved
rasa/utils/tensorflow/models.py Outdated Show resolved Hide resolved
rasa/utils/tensorflow/models.py Outdated Show resolved Hide resolved
rasa/utils/tensorflow/models.py Outdated Show resolved Hide resolved
rasa/utils/tensorflow/models.py Outdated Show resolved Hide resolved
rasa/utils/tensorflow/models.py Outdated Show resolved Hide resolved
rasa/utils/tensorflow/models.py Outdated Show resolved Hide resolved
@tttthomasssss
Copy link
Contributor Author

Thanks very much for the review comments!

Checkpointing for TEDPolicy and the ResponseSelector is yet to be added.

@tttthomasssss
Copy link
Contributor Author

@tabergma I...ahem...finally managed to get back to this issue. Could you kindly review the latest changes? Checkpointing is now also added to the response_selector (though likely the old version 😱 ) and TEDPolicy.

Copy link
Contributor

@tabergma tabergma left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Left a few more comments. Thanks for tackling so many already. Once those comments are resolved and all checks pass, I guess we are good to go 💯

Copy link
Contributor

@tabergma tabergma left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks great! Thanks for resolving all my comments 💯

@tttthomasssss tttthomasssss merged commit 1d43d67 into master Sep 14, 2020
@tttthomasssss tttthomasssss deleted the model-checkpointing-diet branch September 14, 2020 09:11
@Ghostvv
Copy link
Contributor

Ghostvv commented Sep 14, 2020

@tttthomasssss is it only for DIET or for TED as well?

@tttthomasssss
Copy link
Contributor Author

@Ghostvv it covers TED and the ResponseSelector as well

@Ghostvv Ghostvv changed the title Model Checkpointing for DIET Model Checkpointing for DIET, TED and ResponseSelector Sep 14, 2020
@parangitis
Copy link

what parameter/metrics do you use to determine the best model, is it combination of acc and loss or only one of them?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants