Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

refactor dialogue state tracking for modelling/dataset interoperability #3526

Merged
merged 35 commits into from
Feb 11, 2022

Conversation

Zhilin123
Copy link
Collaborator

@Zhilin123 Zhilin123 commented Jan 26, 2022

Signed-off-by: Zhilin Wang [email protected]

Design based on attached diagram
UML proposed dialogue1.pdf

Currently supports SGD-QA (Bert-based) and Huggingface GPT2 models, and planning to support other models.

@Zhilin123
Copy link
Collaborator Author

Zhilin123 commented Jan 26, 2022

Integration testing done by comparing output of 1. modified Dataset/DataProcessor with original SGDQA model and 2. original Dataset/DataProcessor with original SGDQA model. Both give identical output

CUDA_VISIBLE_DEVICES=0 python ~/NeMo/examples/nlp/dialogue_state_tracking_generative/sgd_gen.py do_training=False trainer.gpus=1 model.dataset.data_dir=/home/zhilinw/dstc8-schema-guided-dialogue/ model.dataset.dialogues_example_dir=/home/zhilinw/dstc8-schema-guided-dialogue/ model.dataset.use_cache=False

and

CUDA_VISIBLE_DEVICES=0 python ~/NeMo/examples/nlp/dialogue_state_tracking/sgd_qa.py do_training=False trainer.gpus=1 model.dataset.data_dir=/home/zhilinw/dstc8-schema-guided-dialogue/ model.dataset.dialogues_example_dir=/home/zhilinw/dstc8-schema-guided-dialogue/ model.dataset.use_cache=False

[NeMo I 2022-01-25 16:32:03 evaluate:284] Dialog metrics for #SEEN_SERVICES : [('active_intent_accuracy', 38.36), ('average_cat_accuracy', 13.69), ('average_cat_status_accuracy', 100.0), ('average_cat_value_accuracy', 13.69), ('average_goal_accuracy', 25.8), ('average_noncat_accuracy', 30.83), ('average_noncat_status_accuracy', 98.83), ('average_noncat_value_accuracy', 31.19), ('joint_cat_accuracy', 4.19), ('joint_cat_status_accuracy', 47.69), ('joint_cat_value_accuracy', 9.39), ('joint_goal_accuracy', 0.25), ('joint_noncat_accuracy', 4.59), ('joint_noncat_status_accuracy', 22.48), ('joint_noncat_value_accuracy', 13.44), ('requested_slots_f1', 2.7), ('requested_slots_precision', 3.05), ('requested_slots_recall', 90.75)] [NeMo I 2022-01-25 16:32:03 evaluate:286] Dialog metrics for #UNSEEN_SERVICES: [('active_intent_accuracy', 33.82), ('average_cat_accuracy', 18.37), ('average_cat_status_accuracy', 100.0), ('average_cat_value_accuracy', 18.37), ('average_goal_accuracy', 25.22), ('average_noncat_accuracy', 27.62), ('average_noncat_status_accuracy', 97.85), ('average_noncat_value_accuracy', 28.22), ('joint_cat_accuracy', 1.18), ('joint_cat_status_accuracy', 25.6), ('joint_cat_value_accuracy', 11.14), ('joint_goal_accuracy', 0.51), ('joint_noncat_accuracy', 2.22), ('joint_noncat_status_accuracy', 42.34), ('joint_noncat_value_accuracy', 9.29), ('requested_slots_f1', 4.21), ('requested_slots_precision', 4.25), ('requested_slots_recall', 91.42)] [NeMo I 2022-01-25 16:32:03 evaluate:288] Dialog metrics for #ALL_SERVICES : [('active_intent_accuracy', 34.85), ('average_cat_accuracy', 17.2), ('average_cat_status_accuracy', 100.0), ('average_cat_value_accuracy', 17.2), ('average_goal_accuracy', 25.35), ('average_noncat_accuracy', 28.36), ('average_noncat_status_accuracy', 98.08), ('average_noncat_value_accuracy', 28.91), ('joint_cat_accuracy', 1.92), ('joint_cat_status_accuracy', 31.04), ('joint_cat_value_accuracy', 10.7), ('joint_goal_accuracy', 0.45), ('joint_noncat_accuracy', 2.76), ('joint_noncat_status_accuracy', 37.84), ('joint_noncat_value_accuracy', 10.25), ('requested_slots_f1', 3.87), ('requested_slots_precision', 3.98), ('requested_slots_recall', 91.27)]

@lgtm-com
Copy link

lgtm-com bot commented Jan 26, 2022

This pull request introduces 51 alerts when merging b287422 into 6b51350 - view on LGTM.com

new alerts:

  • 25 for Unused import
  • 16 for Unused local variable
  • 4 for Modification of parameter with default
  • 2 for Superclass attribute shadows subclass method
  • 1 for Unnecessary pass
  • 1 for Except block handles 'BaseException'
  • 1 for Unreachable code
  • 1 for Explicit export is not defined

@lgtm-com
Copy link

lgtm-com bot commented Jan 28, 2022

This pull request introduces 51 alerts when merging 7c12ab6 into 101977e - view on LGTM.com

new alerts:

  • 25 for Unused import
  • 16 for Unused local variable
  • 4 for Modification of parameter with default
  • 2 for Superclass attribute shadows subclass method
  • 1 for Unnecessary pass
  • 1 for Except block handles 'BaseException'
  • 1 for Unreachable code
  • 1 for Explicit export is not defined

Signed-off-by: Zhilin Wang <[email protected]>
Signed-off-by: Zhilin Wang <[email protected]>
@lgtm-com
Copy link

lgtm-com bot commented Jan 31, 2022

This pull request introduces 13 alerts when merging 255294d into 9a1cc36 - view on LGTM.com

new alerts:

  • 4 for Unused import
  • 4 for Modification of parameter with default
  • 2 for Superclass attribute shadows subclass method
  • 1 for Unnecessary pass
  • 1 for Unused local variable
  • 1 for Except block handles 'BaseException'

Signed-off-by: Zhilin Wang <[email protected]>
@lgtm-com
Copy link

lgtm-com bot commented Jan 31, 2022

This pull request introduces 13 alerts when merging 5f6dbd9 into ddcc2a6 - view on LGTM.com

new alerts:

  • 4 for Unused import
  • 4 for Modification of parameter with default
  • 2 for Superclass attribute shadows subclass method
  • 1 for Unnecessary pass
  • 1 for Unused local variable
  • 1 for Except block handles 'BaseException'

@lgtm-com
Copy link

lgtm-com bot commented Jan 31, 2022

This pull request introduces 13 alerts when merging deeeaec into ddcc2a6 - view on LGTM.com

new alerts:

  • 4 for Unused import
  • 4 for Modification of parameter with default
  • 2 for Superclass attribute shadows subclass method
  • 1 for Unnecessary pass
  • 1 for Unused local variable
  • 1 for Except block handles 'BaseException'

@okuchaiev
Copy link
Member

/blossom-ci

@okuchaiev
Copy link
Member

ping @cparisien and @yzhang123

cparisien
cparisien previously approved these changes Feb 1, 2022
Copy link
Collaborator

@cparisien cparisien left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This looks good to me from a design perspective. We should still get a review from someone who is a regular nemo contributor, since I'm not.

@lgtm-com
Copy link

lgtm-com bot commented Feb 1, 2022

This pull request introduces 13 alerts when merging 7d96cf2 into 7a8a5b3 - view on LGTM.com

new alerts:

  • 4 for Unused import
  • 4 for Modification of parameter with default
  • 2 for Superclass attribute shadows subclass method
  • 1 for Unnecessary pass
  • 1 for Unused local variable
  • 1 for Except block handles 'BaseException'

@lgtm-com
Copy link

lgtm-com bot commented Feb 2, 2022

This pull request introduces 13 alerts when merging 0b1bc6c into 7a8a5b3 - view on LGTM.com

new alerts:

  • 4 for Unused import
  • 4 for Modification of parameter with default
  • 2 for Superclass attribute shadows subclass method
  • 1 for Unnecessary pass
  • 1 for Unused local variable
  • 1 for Except block handles 'BaseException'

@lgtm-com
Copy link

lgtm-com bot commented Feb 2, 2022

This pull request introduces 13 alerts when merging f5f3cf8 into fe37d3f - view on LGTM.com

new alerts:

  • 4 for Unused import
  • 4 for Modification of parameter with default
  • 2 for Superclass attribute shadows subclass method
  • 1 for Unnecessary pass
  • 1 for Unused local variable
  • 1 for Except block handles 'BaseException'

@lgtm-com
Copy link

lgtm-com bot commented Feb 11, 2022

This pull request introduces 6 alerts when merging b665b10 into 64eb620 - view on LGTM.com

new alerts:

  • 4 for Modification of parameter with default
  • 1 for Unused local variable
  • 1 for Unused import

@lgtm-com
Copy link

lgtm-com bot commented Feb 11, 2022

This pull request introduces 6 alerts when merging 67ab0d2 into 64eb620 - view on LGTM.com

new alerts:

  • 4 for Modification of parameter with default
  • 1 for Unused local variable
  • 1 for Unused import

@lgtm-com
Copy link

lgtm-com bot commented Feb 11, 2022

This pull request introduces 6 alerts when merging e4418ce into 64eb620 - view on LGTM.com

new alerts:

  • 4 for Modification of parameter with default
  • 1 for Unused local variable
  • 1 for Unused import

@lgtm-com
Copy link

lgtm-com bot commented Feb 11, 2022

This pull request introduces 6 alerts when merging 012004d into 64eb620 - view on LGTM.com

new alerts:

  • 4 for Modification of parameter with default
  • 1 for Unused local variable
  • 1 for Unused import

@lgtm-com
Copy link

lgtm-com bot commented Feb 11, 2022

This pull request introduces 6 alerts when merging 8241502 into 64eb620 - view on LGTM.com

new alerts:

  • 4 for Modification of parameter with default
  • 1 for Unused local variable
  • 1 for Unused import

@lgtm-com
Copy link

lgtm-com bot commented Feb 11, 2022

This pull request introduces 6 alerts when merging 1a1bc0d into 64eb620 - view on LGTM.com

new alerts:

  • 4 for Modification of parameter with default
  • 1 for Unused local variable
  • 1 for Unused import

@lgtm-com
Copy link

lgtm-com bot commented Feb 11, 2022

This pull request introduces 6 alerts when merging f7a7e9c into 9aef14f - view on LGTM.com

new alerts:

  • 4 for Modification of parameter with default
  • 1 for Unused local variable
  • 1 for Unused import

yzhang123
yzhang123 previously approved these changes Feb 11, 2022
self.data_dir = data_dir
self._tokenizer = tokenizer

def open_file(self, filename):
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

could you add doc strings for class functions?

@lgtm-com
Copy link

lgtm-com bot commented Feb 11, 2022

This pull request introduces 6 alerts when merging c63b2a9 into 9aef14f - view on LGTM.com

new alerts:

  • 4 for Modification of parameter with default
  • 1 for Unused local variable
  • 1 for Unused import

@lgtm-com
Copy link

lgtm-com bot commented Feb 11, 2022

This pull request introduces 6 alerts when merging b400571 into 9aef14f - view on LGTM.com

new alerts:

  • 4 for Modification of parameter with default
  • 1 for Unused local variable
  • 1 for Unused import

@lgtm-com
Copy link

lgtm-com bot commented Feb 11, 2022

This pull request introduces 6 alerts when merging bd46660 into 9aef14f - view on LGTM.com

new alerts:

  • 4 for Modification of parameter with default
  • 1 for Unused local variable
  • 1 for Unused import

@lgtm-com
Copy link

lgtm-com bot commented Feb 11, 2022

This pull request introduces 6 alerts when merging d100676 into 298b686 - view on LGTM.com

new alerts:

  • 4 for Modification of parameter with default
  • 1 for Unused local variable
  • 1 for Unused import

@yzhang123 yzhang123 merged commit 058fa38 into main Feb 11, 2022
@yzhang123 yzhang123 deleted the dialogue_state_tracking_refactor branch February 11, 2022 20:46
@Zhilin123 Zhilin123 restored the dialogue_state_tracking_refactor branch February 14, 2022 22:36
fayejf pushed a commit that referenced this pull request Mar 2, 2022
…ty (#3526)

* refactor dialogue state tracking for modelling/dataset interoperability

Signed-off-by: Zhilin Wang <[email protected]>

* fix style changes

Signed-off-by: Zhilin Wang <[email protected]>

* fix typo

Signed-off-by: Zhilin Wang <[email protected]>

* fix style raised by lgtm

Signed-off-by: Zhilin Wang <[email protected]>

* fix style formatting

Signed-off-by: Zhilin Wang <[email protected]>

* update template to include description of intent

Signed-off-by: Zhilin Wang <[email protected]>

* update Jenkinsfile

Signed-off-by: Zhilin Wang <[email protected]>

* changes based on requests in review

Signed-off-by: Zhilin Wang <[email protected]>

* add compatibility with assistant dataset

Signed-off-by: Zhilin Wang <[email protected]>

* update Jenkins

Signed-off-by: Zhilin Wang <[email protected]>

* remove dialogue_state_tracking

Signed-off-by: Zhilin Wang <[email protected]>

* update huggingface utils for dialogue

Signed-off-by: Zhilin Wang <[email protected]>

* rename dialogue_state_tracking_hybrid to dialogue_state_tracking_sgdqa

Signed-off-by: Zhilin Wang <[email protected]>

* style fix

Signed-off-by: Zhilin Wang <[email protected]>

* fix style

Signed-off-by: Zhilin Wang <[email protected]>

* style fix nemo/collections/nlp/models/dialogue_state_tracking_sgdqa/__init__.py

Signed-off-by: Zhilin Wang <[email protected]>

* update Jenkinsfile for SGDGEN

Signed-off-by: Zhilin Wang <[email protected]>

* update Jenkinsfile for SGDGEN

Signed-off-by: Zhilin Wang <[email protected]>

* update Jenkinsfile for SGDGEN

Signed-off-by: Zhilin Wang <[email protected]>

* update Jenkinsfile for SGDGEN

Signed-off-by: Zhilin Wang <[email protected]>

* update Jenkinsfile for SGDGEN

Signed-off-by: Zhilin Wang <[email protected]>

* fix typo

Signed-off-by: Zhilin Wang <[email protected]>

* add docstrings for assistant data processsor

Signed-off-by: Zhilin Wang <[email protected]>

Co-authored-by: Zhilin Wang <[email protected]>
Co-authored-by: Oleksii Kuchaiev <[email protected]>
Co-authored-by: Yang Zhang <[email protected]>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

4 participants