Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Attention weight logging #5673

Merged
merged 145 commits into from
Jan 26, 2021
Merged

Attention weight logging #5673

merged 145 commits into from
Jan 26, 2021

Conversation

JEM-Mosig
Copy link
Contributor

@JEM-Mosig JEM-Mosig commented Apr 20, 2020

Proposed changes:

Status (please check what you already did):

  • added some tests for the functionality
  • updated the changelog (please check changelog for instructions)
  • reformat files using black (please check Readme for instructions)
  • List this as experimental feature here

@tmbo
Copy link
Member

tmbo commented May 26, 2020

@JEM-Mosig what's up with this PR?

@JEM-Mosig
Copy link
Contributor Author

@JEM-Mosig what's up with this PR?

The corresponding project is on hold due to deadline for dataset collection. Should continue mid June.

@tmbo
Copy link
Member

tmbo commented Sep 26, 2020

how is the implementation going, is this still relevant @JEM-Mosig ?

@JEM-Mosig
Copy link
Contributor Author

how is the implementation going, is this still relevant @JEM-Mosig ?

I'm still on this. It's a side project and might take a bit more time. Not critical for 2.0 or anything. I only created a draft PR so its easier for others to see the code changes when we discuss.

@JEM-Mosig JEM-Mosig changed the title Early implementation of attention weight logging Attention weight logging Oct 19, 2020
@Ghostvv Ghostvv self-requested a review October 20, 2020 08:58
@koaning
Copy link
Contributor

koaning commented Oct 20, 2020

@JEM-Mosig I just had a look and it's making me wonder what the easiest way might to go about visualising this. Your PR opens up the attention weights but there's also some other vectors that I'd love to get a hold on too.

image

I'm wondering what is practical here. I'd love to make DIET more "peekable" but doing so might add unwanted complexity.

@Ghostvv
Copy link
Contributor

Ghostvv commented Oct 21, 2020

@JEM-Mosig why do we need special ..._with_diagnostic methods? I don't see where it is called in processor. Can we just reuse normal predict and process methods?

@JEM-Mosig
Copy link
Contributor Author

@JEM-Mosig why do we need special ..._with_diagnostic methods? I don't see where it is called in processor. Can we just reuse normal predict and process methods?

If we output the attention weights (and maybe other data) with the predict / process functions, then messages get a lot bigger. Though we could add a parameter that decides to add or not to add the diagnostic data. I think I'll change it to be parameters @koaning .
Right now, the _with_diagnostic functions are only called by rasalit.

@Ghostvv
Copy link
Contributor

Ghostvv commented Oct 21, 2020

I see. @wochinge what do you think would be the best approach here?

rasa/shared/constants.py Outdated Show resolved Hide resolved
Base automatically changed from master to main January 22, 2021 11:15
tests/conftest.py Outdated Show resolved Hide resolved
tests/utils/tensorflow/test_numpy.py Outdated Show resolved Hide resolved
@rasabot rasabot merged commit 8d7e71e into main Jan 26, 2021
@rasabot rasabot deleted the johannes-73 branch January 26, 2021 14:03
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

8 participants