madisonmay commented Mar 29, 2020

Still a work in progress, but the contextual embeddings line up with the PyTorch version, so this is roughly at parity with jax-bert.
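
For context, the parity check I mean is along these lines. This is a minimal sketch: the Haiku-side names (`bert_fn`, `params`) are hypothetical stand-ins, since this branch's entry points aren't shown here; only the PyTorch reference side uses real `transformers` APIs.

```python
import numpy as np
import torch
from transformers import BertModel, BertTokenizer

# Reference contextual embeddings from the PyTorch implementation.
tokenizer = BertTokenizer.from_pretrained("bert-base-uncased")
model = BertModel.from_pretrained("bert-base-uncased")
model.eval()

inputs = tokenizer("hello world", return_tensors="pt")
with torch.no_grad():
    reference = model(**inputs)[0].numpy()  # last hidden states

# Haiku side, with hypothetical entry points from this branch:
# jax_embeddings = bert_fn.apply(params, None, inputs["input_ids"].numpy())
# np.testing.assert_allclose(reference, jax_embeddings, atol=1e-4)
```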

TODO (mostly notes to myself):

  • Add save_pretrained
  • Make from_pretrained work with names
  • Add dropout at training time, pass a training flag through (see the sketch after this list)
  • Make sure weight initializations line up when pre-trained state isn't passed
  • Gradually work towards parity with the pytorch version if desired? (target models, BERT variants, etc.)
  • Write HaikuPretrainedModel to take advantage of archive resolution / make saving + loading compatible with pytorch bins?
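
For the dropout item, here's a sketch of what threading a training flag through a Haiku module could look like. The class and argument names are illustrative stand-ins, not the modules defined in this PR:

```python
import haiku as hk

class BertOutput(hk.Module):
    """Illustrative only: apply dropout when (and only when) is_training
    is True, drawing the mask from Haiku's per-call RNG."""

    def __init__(self, hidden_size: int, dropout_rate: float = 0.1, name=None):
        super().__init__(name=name)
        self.dense = hk.Linear(hidden_size)
        self.dropout_rate = dropout_rate

    def __call__(self, hidden_states, is_training: bool):
        hidden_states = self.dense(hidden_states)
        if is_training:
            # hk.dropout needs an explicit key; hk.next_rng_key() pulls one
            # from the rng passed to apply().
            hidden_states = hk.dropout(
                hk.next_rng_key(), self.dropout_rate, hidden_states
            )
        return hidden_states
```

One nice property of gating on the flag: at inference time no RNG key is consumed, so `apply()` stays usable with `rng=None`.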

To use the pre-trained weights cleanly, I ended up subclassing hk.Module. I'm not sure how I feel about this decision, but I couldn't think of a better approach at the time; feel free to suggest an alternative if you have ideas.
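
For discussion, one shape the subclassing approach could take is below. This is a minimal sketch under my own assumptions: the flat `pretrained_state` dict and the helper names are hypothetical, not the code in this PR.

```python
import haiku as hk
import jax.numpy as jnp

class PretrainedModule(hk.Module):
    """Hypothetical base class: hk.Module plus a helper that resolves
    parameters from a flat dict of pre-trained arrays when available,
    falling back to a normal initializer otherwise."""

    def __init__(self, pretrained_state=None, name=None):
        super().__init__(name=name)
        self._pretrained_state = pretrained_state or {}

    def pretrained_parameter(self, name, shape, fallback_init):
        def init(shape, dtype):
            if name in self._pretrained_state:
                return jnp.asarray(self._pretrained_state[name], dtype=dtype)
            return fallback_init(shape, dtype)
        return hk.get_parameter(name, shape, init=init)


# Example subclass: an embedding table seeded from the pre-trained state
# (the vocab/hidden sizes and the key name are illustrative).
class WordEmbeddings(PretrainedModule):
    def __call__(self, token_ids):
        table = self.pretrained_parameter(
            "word_embeddings",
            [30522, 768],
            hk.initializers.TruncatedNormal(stddev=0.02),
        )
        return table[token_ids]
```

An alternative that avoids subclassing entirely would be to build the params tree with `hk.transform(...).init` and then overwrite leaves with the pre-trained arrays before `apply()`; that might also help with the "make sure weight initializations line up" item above.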

mfuntowicz self-requested a review April 7, 2020 12:25

stale bot commented Jul 27, 2020

This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions.

stale bot added the wontfix label Jul 27, 2020
madisonmay closed this Jul 28, 2020