This repository has been archived by the owner on Jan 15, 2024. It is now read-only.

[enhancement] refactor bert finetuning script (#692)
* refactor finetune script

* fix test with inference_only

* enhance data preprocessing

* fix label in bert transform

* fix lint

* fix lint

* Update dataset.py

* fix test

* fix test

* do not use bert-adam on mxnet 1.4

* use sys.executable

* fix tutorial

* parameter test

* fix typo

* commit a missing line
eric-haibin-lin authored and Aston Zhang committed May 10, 2019
1 parent 97baac9 commit 882a117
Showing 7 changed files with 522 additions and 595 deletions.
12 changes: 7 additions & 5 deletions docs/examples/sentence_embedding/bert.md
@@ -211,18 +211,20 @@ max_len = 128
 all_labels = ["0", "1"]
 # whether to transform the data as sentence pairs.
 # for single sentence classification, set pair=False
+# for regression task, set class_labels=None
+# for inference without label available, set has_label=False
 pair = True
 transform = dataset.BERTDatasetTransform(bert_tokenizer, max_len,
-                                         labels=all_labels,
-                                         label_dtype='int32',
+                                         class_labels=all_labels,
+                                         has_label=True,
                                          pad=True,
                                          pair=pair)
 data_train = data_train_raw.transform(transform)
 print('vocabulary used for tokenization = \n%s'%vocabulary)
-print('[PAD] token id = %s'%(vocabulary['[PAD]']))
-print('[CLS] token id = %s'%(vocabulary['[CLS]']))
-print('[SEP] token id = %s'%(vocabulary['[SEP]']))
+print('%s token id = %s'%('[PAD]', vocabulary[vocabulary.padding_token]))
+print('%s token id = %s'%(vocabulary.cls_token, vocabulary[vocabulary.cls_token]))
+print('%s token id = %s'%(vocabulary.sep_token, vocabulary[vocabulary.sep_token]))
 print('token ids = \n%s'%data_train[sample_id][0])
 print('valid length = \n%s'%data_train[sample_id][1])
 print('segment ids = \n%s'%data_train[sample_id][2])
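For reference, the two new keyword arguments map onto the use cases called out in the added comments. Below is a minimal sketch, not code from this commit: it reuses the dataset, bert_tokenizer, and max_len names from the snippet above and assumes the post-change BERTDatasetTransform signature (class_labels, has_label, pad, pair).

# Hypothetical sketch based on the comments in the diff above, not part of the commit.

# Inference without labels available: has_label=False, so the transform yields
# only token ids, valid length, and segment ids for each example.
inference_transform = dataset.BERTDatasetTransform(bert_tokenizer, max_len,
                                                   class_labels=None,
                                                   has_label=False,
                                                   pad=True,
                                                   pair=True)

# Regression task (real-valued labels): class_labels=None while keeping
# has_label=True, so the label is passed through as a float score rather
# than mapped to a class index.
regression_transform = dataset.BERTDatasetTransform(bert_tokenizer, max_len,
                                                    class_labels=None,
                                                    has_label=True,
                                                    pad=True,
                                                    pair=True)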
