This repository has been archived by the owner on Jan 15, 2024. It is now read-only.

[FEATURE] [WIP] Use softmax with length in attention cells #910

Closed
@ptrendx wants to merge 5 commits

Contributor

@ptrendx ptrendx commented Aug 29, 2019

Description

MXNet added support for a length parameter in softmax (apache/mxnet#15169), and this PR attempts to use it in the attention cells. Work done by me and @blchu.
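
For readers unfamiliar with the idea, here is a minimal plain-Python sketch of what a softmax with a length argument computes. It is not the MXNet operator and not the code in this PR: the helper name `softmax_with_length` is hypothetical, and the sketch assumes that positions past the valid length receive a weight of exactly zero.

```python
import math


def softmax_with_length(scores, length):
    """Softmax over the first `length` entries of `scores`; the rest get 0.0.

    Hypothetical helper for illustration only, not an MXNet or GluonNLP API.
    """
    if not 0 <= length <= len(scores):
        raise ValueError("length must be between 0 and len(scores)")
    if length == 0:
        return [0.0] * len(scores)
    valid = scores[:length]
    peak = max(valid)  # subtract the max for numerical stability
    exps = [math.exp(s - peak) for s in valid]
    total = sum(exps)
    # Padded positions are filled with zeros so they receive no attention weight.
    return [e / total for e in exps] + [0.0] * (len(scores) - length)


# One row of attention scores per query; the second row has two padded positions.
scores = [[1.0, 2.0, 3.0, 4.0], [1.0, 1.0, 5.0, 5.0]]
lengths = [4, 2]
weights = [softmax_with_length(row, n) for row, n in zip(scores, lengths)]
```

In the second row the large scores at the padded positions are ignored, so the two valid positions split the weight evenly.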

@eric-haibin-lin Could you help make sure this works for all models? We tested only BERT and the Transformer decoder.

Checklist

Essentials

  • PR's title starts with a category (e.g. [BUGFIX], [MODEL], [TUTORIAL], [FEATURE], [DOC], etc)
  • Changes are complete (i.e. I finished coding on this PR)
  • All changes have test coverage
  • Code is well-documented

@ptrendx ptrendx requested a review from szha as a code owner August 29, 2019 20:15

codecov bot commented Aug 29, 2019

Codecov Report

❗ No coverage uploaded for pull request head (pr_softmax_with_length@c7ae5b4).
The diff coverage is n/a.


codecov bot commented Aug 29, 2019

Codecov Report

Merging #910 into master will decrease coverage by 61.26%.
The diff coverage is 7.69%.

@@             Coverage Diff             @@
##           master     #910       +/-   ##
===========================================
- Coverage   90.48%   29.21%   -61.27%     
===========================================
  Files          66       66               
  Lines        6400     6380       -20     
===========================================
- Hits         5791     1864     -3927     
- Misses        609     4516     +3907

Impacted Files                                Coverage Δ
src/gluonnlp/model/transformer.py             14.5% <0%> (-76.71%) ⬇️
src/gluonnlp/model/attention_cell.py          21.64% <25%> (-72.99%) ⬇️
src/gluonnlp/model/bilm_encoder.py            15.25% <0%> (-84.75%) ⬇️
src/gluonnlp/model/train/language_model.py    16.47% <0%> (-80.69%) ⬇️
src/gluonnlp/data/batchify/embedding.py       17.96% <0%> (-79.69%) ⬇️
src/gluonnlp/model/sequence_sampler.py        12.11% <0%> (-79.59%) ⬇️
src/gluonnlp/data/sampler.py                  18.59% <0%> (-77.89%) ⬇️
src/gluonnlp/data/dataset.py                  22.22% <0%> (-76.99%) ⬇️
src/gluonnlp/model/lstmpcellwithclip.py       23.07% <0%> (-76.93%) ⬇️
src/gluonnlp/model/language_model.py          23.25% <0%> (-76.75%) ⬇️
... and 48 more

@eric-haibin-lin
Member

moved to #1091 for BERT
