TensorFlow code and pre-trained models for DDWR
- The model is simple: it uses only a feed-forward neural network with an attention mechanism.
- Training is fast: only a few epochs are needed. The initialization parameters come from Google's pre-trained BERT model.
- The model performs well: in most cases it matches the current (2018-11-13) state-of-the-art results, and sometimes exceeds them. The best reported results can be seen on the GLUE benchmark leaderboard (gluebenchmark.com).
This model, Deep Dynamic Word Representation (DDWR), combines the BERT model with ELMo's deep contextualized word representations; a minimal sketch of the layer-mixing idea is given below.
BERT comes from *BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding*; ELMo comes from *Deep contextualized word representations*.
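The core of the combination is ELMo-style layer mixing applied to BERT's encoder layers: a softmax-normalized, learned scalar weight per layer, plus a global scale. Below is a minimal TensorFlow 1.x sketch of that idea; `scalar_mix` is a hypothetical helper name, and the layer list would come from `BertModel.get_all_encoder_layers()` in google-research/bert. The actual `run_classifier_elmo.py` may differ in details.

```python
import tensorflow as tf

def scalar_mix(layer_outputs, name="elmo_mix"):
    """ELMo-style weighted sum over encoder layers (hypothetical helper).

    layer_outputs: list of [batch, seq_len, hidden] tensors, one per layer,
    e.g. the 12 tensors returned by BertModel.get_all_encoder_layers().
    Returns a single [batch, seq_len, hidden] tensor.
    """
    num_layers = len(layer_outputs)
    with tf.variable_scope(name):
        # One learnable scalar per layer, softmax-normalized, as in ELMo.
        weights = tf.get_variable("layer_weights", shape=[num_layers],
                                  initializer=tf.zeros_initializer())
        gamma = tf.get_variable("gamma", shape=[],
                                initializer=tf.ones_initializer())
        norm_weights = tf.nn.softmax(weights)
        stacked = tf.stack(layer_outputs, axis=0)                # [L, B, T, H]
        mixed = tf.tensordot(norm_weights, stacked, [[0], [0]])  # [B, T, H]
        return gamma * mixed
```

The mixed representation can then feed the feed-forward classification head in place of a single top-layer output.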
Download GLUE data
Before running these examples, download the GLUE data using this script and unpack it to some directory `$GLUE_DIR`.
Difference from google-research/bert: fine-tuning uses `run_classifier_elmo.py` instead of `run_classifier.py`:
```shell
export BERT_BASE_DIR=/path/to/bert/uncased_L-12_H-768_A-12
export GLUE_DIR=/path/to/glue

python run_classifier_elmo.py \
  --task_name=MRPC \
  --do_train=true \
  --do_eval=true \
  --data_dir=$GLUE_DIR/MRPC \
  --vocab_file=$BERT_BASE_DIR/vocab.txt \
  --bert_config_file=$BERT_BASE_DIR/bert_config.json \
  --init_checkpoint=$BERT_BASE_DIR/bert_model.ckpt \
  --max_seq_length=128 \
  --train_batch_size=32 \
  --learning_rate=2e-5 \
  --num_train_epochs=3.0 \
  --output_dir=/tmp/mrpc_output/
```
Prediction is the same as in https://github.com/google-research/bert:
```shell
export BERT_BASE_DIR=/path/to/bert/uncased_L-12_H-768_A-12
export GLUE_DIR=/path/to/glue
export TRAINED_CLASSIFIER=/path/to/fine/tuned/classifier

python run_classifier_elmo.py \
  --task_name=MRPC \
  --do_predict=true \
  --data_dir=$GLUE_DIR/MRPC \
  --vocab_file=$BERT_BASE_DIR/vocab.txt \
  --bert_config_file=$BERT_BASE_DIR/bert_config.json \
  --init_checkpoint=$TRAINED_CLASSIFIER \
  --max_seq_length=128 \
  --output_dir=/tmp/mrpc_output/
```
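As in google-research/bert, prediction writes a `test_results.tsv` to `--output_dir` with one row of tab-separated class probabilities per input example; `run_classifier_elmo.py` presumably follows the same convention. A small sketch for turning it into predicted labels (the `["0", "1"]` label order matches that repo's MRPC processor):

```python
import csv

labels = ["0", "1"]  # MRPC label order from the processor's get_labels()

with open("/tmp/mrpc_output/test_results.tsv") as f:
    for i, row in enumerate(csv.reader(f, delimiter="\t")):
        probs = [float(p) for p in row]  # one probability per class
        best = max(range(len(probs)), key=probs.__getitem__)
        print(i, labels[best], probs[best])
```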
For more tasks and options, see https://github.com/google-research/bert.
Solving SQuAD 1.1
The workflow is the same as in https://github.com/google-research/bert; the difference is that training and prediction use `run_squad_elmo.py`:
```shell
python run_squad_elmo.py \
  --vocab_file=$BERT_BASE_DIR/vocab.txt \
  --bert_config_file=$BERT_BASE_DIR/bert_config.json \
  --init_checkpoint=$BERT_BASE_DIR/bert_model.ckpt \
  --do_train=True \
  --train_file=$SQUAD_DIR/train-v1.1.json \
  --do_predict=True \
  --predict_file=$SQUAD_DIR/dev-v1.1.json \
  --train_batch_size=12 \
  --learning_rate=3e-5 \
  --num_train_epochs=2.0 \
  --max_seq_length=384 \
  --doc_stride=128 \
  --output_dir=./tmp/elmo_squad_base/
```
Dev set results from `run_squad_elmo.py`:

```
{"exact_match": 81.20151371807, "f1": 88.56178500169332}
```
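For reference, `exact_match` is the share of questions whose normalized prediction string equals a gold answer, and `f1` is token-overlap F1 between prediction and gold answer. A simplified sketch of the logic in the official SQuAD `evaluate-v1.1.py` (the real script also takes the max over multiple gold answers and averages over the dataset):

```python
import collections
import re
import string

def normalize(s):
    """Lowercase, drop punctuation and articles, collapse whitespace."""
    s = "".join(ch for ch in s.lower() if ch not in set(string.punctuation))
    s = re.sub(r"\b(a|an|the)\b", " ", s)
    return " ".join(s.split())

def exact_match(pred, gold):
    return float(normalize(pred) == normalize(gold))

def f1(pred, gold):
    pred_toks, gold_toks = normalize(pred).split(), normalize(gold).split()
    common = collections.Counter(pred_toks) & collections.Counter(gold_toks)
    overlap = sum(common.values())
    if overlap == 0:
        return 0.0
    precision = overlap / len(pred_toks)
    recall = overlap / len(gold_toks)
    return 2 * precision * recall / (precision + recall)
```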