BioPre is a supervised neural network model that uses bag-of-words embeddings to predict the entity mentions in the body of a scientific article from the corresponding metadata, including the abstract and author information.

You can use:

train.py

to train a model on abstract-body pairs (author information is optional);

run "train.py -h" to see the usage.

You can use:

predict.py

to predict the entity mentions in an article body with a model trained using train.py;

run "predict.py -h" to see the usage.

All articles should be annotated as entity lists in *.csv format, and all vocabularies should be in *.json format.
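As a rough illustration of how a CSV entity list and a JSON vocabulary could be turned into a bag-of-words vector, here is a minimal sketch. The file layouts are assumptions, not the formats train.py actually expects: the CSV is assumed to hold one entity mention per row in its first column, and the vocabulary is assumed to map each entity string to an integer index from 0 to n-1. The function names load_vocabulary, load_entities and bag_of_words are hypothetical and do not come from this repository.

```python
import csv
import io
import json


def load_vocabulary(json_text):
    """Parse a vocabulary (assumed layout: entity string -> integer index)."""
    return json.loads(json_text)


def load_entities(csv_text):
    """Read an entity list (assumed layout: one mention per row, first column)."""
    reader = csv.reader(io.StringIO(csv_text))
    return [row[0] for row in reader if row]  # blank rows are skipped


def bag_of_words(entities, vocabulary):
    """Count occurrences of each vocabulary entity; unknown entities are ignored."""
    vector = [0] * len(vocabulary)
    for entity in entities:
        index = vocabulary.get(entity)
        if index is not None:
            vector[index] += 1
    return vector


# Inline strings stand in for the contents of a *.json and a *.csv file.
vocab = load_vocabulary('{"BRCA1": 0, "TP53": 1, "EGFR": 2}')
abstract_entities = load_entities("BRCA1\nTP53\nBRCA1\nMYC\n")
print(bag_of_words(abstract_entities, vocab))  # prints [2, 1, 0]
```

"MYC" is not in the example vocabulary, so it is dropped. Check "train.py -h" and the code itself for the real column and key layout.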

Failed attempts are in the failed_attempts folder, which contains a binary classifier, an LSTM model and some data making-up scripts.

Having problems? Take a look at the code yourself!

Still stuck, or want the training data or the optimized models? Contact me at zhengyl940425@gmail.com!
