This repository was archived by the owner on Jan 15, 2024. It is now read-only.
[BUGFIX] Fix Vocab with unknown_token remapped to != 0 via token_to_idx arg #862
Merged
leezu merged 3 commits into dmlc:master on Aug 4, 2019
Job PR-862/1 is complete.
Confirmed that vocab[vocab.unknown_token] still == 0 for all models created prior to the flexible vocab PR, i.e.:
- book_corpus_wiki_en_uncased
- wiki_multilingual_uncased
- openwebtext_book_corpus_wiki_en_uncased
- wiki_multilingual_cased
- wiki_cn_cased
5723c73 to 112e133
Job PR-862/2 is complete.
Codecov Report
@@ Coverage Diff @@
## master #862 +/- ##
==========================================
+ Coverage 90.17% 90.17% +<.01%
==========================================
Files 66 66
Lines 6342 6344 +2
==========================================
+ Hits 5719 5721 +2
Misses 623 623
szha approved these changes Aug 4, 2019
for token in special_tokens:
    assert token in vocab, "Token %s not found in the vocab" % token
assert vocab['RandomWordByHaibin'] == 0
assert vocab['RandomWordByHaibin'] == vocab[vocab.unknown_token]
this is not my random word 😂
eric-haibin-lin approved these changes Aug 4, 2019
leezu added a commit that referenced this pull request Aug 5, 2019
* Fix Vocab with unknown_token remapped to != 0 via token_to_idx arg
* Add test
* Update test_pretrained_bert_models

  Confirmed that vocab[vocab.unknown_token] still == 0 for all models created prior to the flexible vocab PR, i.e.:
  - book_corpus_wiki_en_uncased
  - wiki_multilingual_uncased
  - openwebtext_book_corpus_wiki_en_uncased
  - wiki_multilingual_cased
  - wiki_cn_cased
* Use cuda 10.1 on CI
* Disable false positive pylint warnings

  Fixed in master branch by #838. Fix does not apply here as it would require dropping py2.
* Don't test against v1.6 beta and v1.4.1 at the same time.

  Test v0.7x branch only on v1.4 due to incompatible doctest output on both mxnet versions
* Disable test_finetune_chinese_inference due to DNS error
Description
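#732 remapped only the idx_to_token / token_to_idx data structures and forgot to remap the default index specified in DefaultLookupDict. As a result, when unknown_token is remapped to an index != 0 via the token_to_idx arg, out-of-vocabulary tokens are still looked up as index 0.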
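The failure mode described above can be reproduced with a rough sketch. Note that the `DefaultLookupDict` below is a simplified stand-in written for illustration, not the library's actual class, and `build_token_to_idx`, its `remap_default` flag, and the example vocabulary are all hypothetical.

```python
class DefaultLookupDict(dict):
    """Simplified stand-in (an assumption, not the library class):
    a dict that returns a fixed default value for missing keys."""

    def __init__(self, default, d=None):
        super().__init__(d or {})
        self._default = default

    def __getitem__(self, key):
        # dict.get does not call __getitem__, so this does not recurse.
        return self.get(key, self._default)


def build_token_to_idx(idx_to_token, unknown_token, remap_default):
    """Hypothetical helper: build the token -> index table.

    With remap_default=False the default stays at 0 (the behaviour this
    PR describes as a bug); with remap_default=True the default follows
    the index of the unknown token.
    """
    unk_idx = idx_to_token.index(unknown_token)
    default = unk_idx if remap_default else 0
    mapping = {token: idx for idx, token in enumerate(idx_to_token)}
    return DefaultLookupDict(default, mapping)


# Hypothetical vocabulary in which the unknown token sits at index 2.
idx_to_token = ['<pad>', 'hello', '<unk>', 'world']

buggy = build_token_to_idx(idx_to_token, '<unk>', remap_default=False)
fixed = build_token_to_idx(idx_to_token, '<unk>', remap_default=True)

oov = 'some_out_of_vocabulary_word'
print(buggy[oov], buggy['<unk>'])  # 0 2: OOV words collide with '<pad>'
print(fixed[oov], fixed['<unk>'])  # 2 2: OOV words map to the unknown token
```

This is also why the updated test compares against `vocab[vocab.unknown_token]` instead of the literal 0: the comparison then holds regardless of which index the unknown token is mapped to.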