Generalized translator and inference #429
Conversation
…; some refactoring to outputs and SequenceInference
Thanks! I'm happy to have the code separated better. A few thoughts:
…utoRegressiveMLELoss
This should be ready for review. All in all, I fortunately didn't have to make any major changes; most are renamings, more consistent method signatures, and a bit of rearranged code.
One final question would be whether …
@msperber Are you currently looking at my comment about separating the MLP and softmax classes? If not, I can take a look.
@neubig No, I'm not working on this currently, so feel free to go ahead.
…into generalize-translator
* Started separating out softmax
* Started fixing tests
* Fixed more tests
* Fixed remainder of running tests
* Fixed the rest of tests
* Added AuxNonLinear
* Updated examples (many were already broken?)
* Fixed recipes
* Removed MLP class
* Added some doc
* fix problem when calling a super constructor that is wrapped in serialized_init
* Added some doc
* fix / clean up sequence labeler
* fix using scorer
* document how to run test configs directly
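For readers following the thread, here is a minimal sketch of what the MLP/softmax separation in these commits might look like. Only the `AuxNonLinear` name is taken from the commit log above; the `Transform`/`Softmax` class names and all signatures are illustrative assumptions, not the actual XNMT API:

```python
import numpy as np

class Transform:
    """Hypothetical base class for projections applied before output scoring."""
    def transform(self, *inputs: np.ndarray) -> np.ndarray:
        raise NotImplementedError

class AuxNonLinear(Transform):
    """Non-linearity over a main and an auxiliary input (e.g. decoder state
    plus attention context), which the former MLP class combined internally."""
    def __init__(self, w_main: np.ndarray, w_aux: np.ndarray, b: np.ndarray):
        self.w_main, self.w_aux, self.b = w_main, w_aux, b

    def transform(self, main: np.ndarray, aux: np.ndarray) -> np.ndarray:
        return np.tanh(self.w_main @ main + self.w_aux @ aux + self.b)

class Softmax:
    """Scorer mapping a hidden vector to a distribution over the vocabulary,
    kept separate from the transform so decoders can mix and match the two."""
    def __init__(self, w_out: np.ndarray, b_out: np.ndarray):
        self.w_out, self.b_out = w_out, b_out

    def calc_probs(self, h: np.ndarray) -> np.ndarray:
        logits = self.w_out @ h + self.b_out
        e = np.exp(logits - logits.max())  # numerically stabilized softmax
        return e / e.sum()
```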
Alright, this should be ready.
I made a few changes and LGTM! But quite bizarrely, tests are now failing. The only changes I made (just deleting an unused yaml file and changing some documentation) shouldn't be able to cause this, so maybe it's due to a difference in the Travis environment?
P.S. also, tests are passing on my machine.
It seems that this is due to a PyYAML upgrade that introduced major changes related to safe loading, which I don't fully understand and which don't seem to be well documented. I'm downgrading to the previous version for now.
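For anyone who hits the same problem: newer PyYAML releases changed the default loading behavior toward safe loading. Two hedged workarounds follow; the exact version bound is an assumption about what "the previous version" was, and `config.yaml` is just a placeholder:

```python
import yaml

# Option 1: pin the dependency, e.g. in requirements.txt (bound is assumed):
#   pyyaml==3.13

# Option 2: pass an explicit Loader instead of relying on yaml.load's default,
# which is what changed across releases. Note that SafeLoader refuses to
# construct arbitrary Python objects, so it may be too restrictive for
# configs that rely on custom YAML tags.
with open("config.yaml") as f:
    data = yaml.load(f, Loader=yaml.SafeLoader)
```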
Hey @msperber, can you also refactor my code (LexiconDecoder)? I don't have time to read through the whole refactoring, and my code no longer works. I don't think you can remove it from the master branch simply because you aren't using it. If you have any questions about it, you can always ask me. Thank you.
@philip30 Actually, I did this. The reason is that it shouldn't be a decoder but rather a softmax, and I think it should be reimplemented from scratch. Perhaps we should make an issue, though.
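To make the design point concrete: a lexicon model biases the output distribution, so it fits more naturally as a softmax variant than as a decoder. Here is a rough sketch of bias-style lexicon integration in the spirit of Arthur et al. (2016); the function name and signature are hypothetical, not the removed LexiconDecoder API:

```python
import numpy as np

def lexicon_biased_logits(logits: np.ndarray,
                          lex_probs: np.ndarray,
                          alpha: float = 0.001) -> np.ndarray:
    """Add a lexicon bias to the model's output logits.

    logits:    model scores over the target vocabulary
    lex_probs: attention-weighted lexical translation probabilities,
               one entry per target word
    alpha:     smoothing constant so that log(0) never occurs
    """
    return logits + np.log(lex_probs + alpha)
```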
@neubig Alright, I'll make an issue!
The current design of the translator, decoder, and inference classes is quite specific to RNN-based attentional encoder-decoder models, which makes it difficult to reuse much of the code when implementing alternatives. One place this problem becomes evident is the current divergence between transformer- and RNN-based models.
This PR does some preliminary refactoring, but is mainly meant to initiate discussions about how to properly design the interfaces.
Description of changes so far:
* Add an `Inference` base class, add some missing content to the `Decoder` base class, and add type annotations in some places
* Rename `SimpleInference` to `SequenceInference`
Things to do:
* Create `ClassifierInference` and `SequenceInference`, which both derive from the `Inference` base class (a rough sketch follows this list). A few problems I noticed so far are that (1) `SequenceInference` assumes the use of a search strategy, which does not make sense for non-autoregressive models (e.g. the sequence labeler), and (2) having a separate `ClassifierInference` is probably reasonable, but some things like forced decoding apply here as well and should be reflected in the base class.
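To make the proposed hierarchy concrete, here is a minimal sketch; the method names and signatures are illustrative assumptions, not the final interface:

```python
from abc import ABC, abstractmethod

class Inference(ABC):
    """Shared base class. Cross-cutting features such as forced decoding
    would belong here, since they apply to both subclasses."""
    @abstractmethod
    def perform_inference(self, generator, src_file: str, trg_file: str) -> None:
        ...

class SequenceInference(Inference):
    """Inference producing output sequences. The search strategy is optional
    because non-autoregressive models (e.g. the sequence labeler) don't need one."""
    def __init__(self, search_strategy=None):
        self.search_strategy = search_strategy

    def perform_inference(self, generator, src_file: str, trg_file: str) -> None:
        raise NotImplementedError  # decode src_file, write hypotheses to trg_file

class ClassifierInference(Inference):
    """Inference producing a single label per input."""
    def perform_inference(self, generator, src_file: str, trg_file: str) -> None:
        raise NotImplementedError  # classify each input, write labels to trg_file
```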