
First version of Zero-Shot Sequence Labeler #2260

Merged: 36 commits merged into master from tars_tagger on May 1, 2021
Conversation

@alanakbik (Collaborator) commented on May 1, 2021

First version of the TARS few-shot sequence tagger. Train it like this:

from flair.datasets import WNUT_17
from flair.models import TARSTagger
from flair.trainers import ModelTrainer

# init corpus and map each label name to a natural-language description
corpus = WNUT_17(label_name_map={
    "location": "location name",
    "corporation": "corporation name",
    "person": "person name",
    "creative-work": "name of song, movie, book or other creative work",
    "product": "name of product or consumer good",
    "group": "name of music band, sports team or non-corporate organization",
})

# make the label dictionary for the 'ner' tag type
dictionary = corpus.make_label_dictionary('ner')
print(dictionary)

# init the TARS sequence tagger
tars_tagger = TARSTagger(
    'ner_wnut',
    dictionary,
    tag_type='ner',
    embeddings='bert-base-uncased',
    num_negative_labels_to_sample=1,
    prefix=True,
)

# train the model
trainer = ModelTrainer(tars_tagger, corpus)

trainer.train('resources/taggers/few-shot-sequence-tagger',
              learning_rate=0.02,
              mini_batch_size=16,
              mini_batch_chunk_size=1,
              max_epochs=20,
              monitor_test=True,
              embeddings_storage_mode="none",
              )
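
For completeness (prediction is not shown in the PR description), a minimal usage sketch assuming Flair's standard Sentence/predict API; the example sentence is made up:

from flair.data import Sentence

# annotate a made-up sentence in place with the trained tagger
sentence = Sentence("ACME Corp released a new smartphone in New York.")
tars_tagger.predict(sentence)
print(sentence.to_tagged_string())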

This PR also makes a number of smaller changes:

  • Changes the best-model logic so that the best-model.pt file is saved again. The epoch that produced the best model is no longer encoded in the checkpoint filename but is instead written explicitly to the logs (a sketch follows this list).
  • Changes loss averaging for consistency between training and testing, particularly for TARS-like approaches: the sequence tagger no longer returns a loss averaged over all words in the mini-batch, but a summed loss together with the number of words over which it was summed (see the second sketch after this list).
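
The best-model change amounts to checkpointing under a fixed filename and logging the best epoch. A hedged sketch with hypothetical names (checkpoint_if_best is not the PR's actual code):

import logging
from pathlib import Path

log = logging.getLogger("flair")

def checkpoint_if_best(model, dev_score, best_score, epoch, base_path: Path):
    # save under the fixed name best-model.pt; record the epoch in the log, not the filename
    if dev_score > best_score:
        model.save(base_path / "best-model.pt")
        log.info(f"saving best model: epoch {epoch} (dev score {dev_score:.4f})")
        return dev_score
    return best_score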
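
And a sketch of the summed-loss convention (average_loss is a hypothetical helper, not the PR's actual code): each forward pass returns a (summed_loss, word_count) pair, and the average is taken once over the whole mini-batch, so splitting a batch into chunks via mini_batch_chunk_size yields the same loss:

def average_loss(chunk_results):
    # chunk_results: list of (summed_loss, word_count) pairs, one per chunk
    total_loss, total_words = 0.0, 0
    for summed_loss, word_count in chunk_results:
        total_loss += summed_loss
        total_words += word_count
    return total_loss / max(total_words, 1)

# one chunk of 8 words averages the same as the batch split into two chunks
print(average_loss([(16.0, 8)]))            # -> 2.0
print(average_loss([(12.0, 6), (4.0, 2)]))  # -> 2.0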

@alanakbik alanakbik merged commit d5dd0a2 into master May 1, 2021
@alanakbik alanakbik deleted the tars_tagger branch May 1, 2021 09:53