Conversation

@tirthasheshpatel (Contributor) commented on Feb 14, 2024

Closes #1418

Adds `mistral_7b_en` and `mistral_instruct_7b_en` presets for Mistral, along with preset tests for all the Mistral components.
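For reference, a minimal usage sketch of the new presets (the prompt and `max_length` here are illustrative):

```python
import keras_nlp

# Load the instruction-tuned preset added in this PR and run generation.
mistral_lm = keras_nlp.models.MistralCausalLM.from_preset(
    "mistral_instruct_7b_en"
)
mistral_lm.generate("What is Keras?", max_length=64)
```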

@mattdangerw (Member)

Can you show a full generation run with the causal LM class in a Colab, loaded from Kaggle?

@tirthasheshpatel (Contributor, Author)

> Can you show a full generation run with the causal LM class in a Colab, loaded from Kaggle?

Here: https://colab.research.google.com/drive/1508c8IY_nQQIsF33nUc_87l0mLo2BFw0?usp=sharing

@mattdangerw added the kokoro:force-run (Runs Tests on GPU) label on Feb 14, 2024
@kokoro-team removed the kokoro:force-run (Runs Tests on GPU) label on Feb 14, 2024
@mattdangerw (Member)

This looks great! Just made the Kaggle model public so we can re-run CI.

I think there is one small thing to fix here. With the Kaggle rewrite, our tokenizers can be created without a vocabulary, which is loaded after the fact. Basically, loading will look like `MistralTokenizer.from_config(config); MistralTokenizer.load_assets(path)`. So we have to accommodate a tokenizer not knowing its vocabulary on creation.
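To make that two-step flow concrete, here's a hedged sketch (`config` and `path` stand in for whatever the saving machinery actually passes):

```python
# Step 1: reconstruct the tokenizer from its config. At this point it
# has no vocabulary, so special token ids are not yet available.
tokenizer = MistralTokenizer.from_config(config)

# Step 2: load the vocabulary assets after the fact. Only now can
# special tokens be looked up safely.
tokenizer.load_assets(path)
```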

To deal with this, we only grab the special tokens from the tokenizer the first time a preprocessing layer is built, not when it is constructed. Basically, we want this...

https://github.com/keras-team/keras-nlp/blob/bc3852f9fc3e9edaab7bc6c5d7ee40e6d3618ac2/keras_nlp/models/bert/bert_preprocessor.py#L141-L156

Instead of this...

https://github.com/keras-team/keras-nlp/blob/bc3852f9fc3e9edaab7bc6c5d7ee40e6d3618ac2/keras_nlp/models/mistral/mistral_preprocessor.py#L122-L132
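As a hedged sketch of the difference (the attribute names mirror the linked files, but treat the exact signatures as illustrative), the packer should be constructed in `build()`, after tokenizer assets are guaranteed to be loaded, rather than in `__init__`:

```python
from keras_nlp.layers import StartEndPacker

class MistralPreprocessorSketch:
    """Illustrates deferring special-token lookup to `build()`."""

    def __init__(self, tokenizer, sequence_length=1024):
        self.tokenizer = tokenizer
        self.sequence_length = sequence_length
        # Don't read `tokenizer.start_token_id` here: the tokenizer may
        # have been created via `from_config()` with no vocabulary yet.
        self.packer = None
        self.built = False

    def build(self, input_shape):
        # By build time, `load_assets()` has run, so special token ids
        # can be read from the tokenizer safely.
        self.packer = StartEndPacker(
            start_value=self.tokenizer.start_token_id,
            end_value=self.tokenizer.end_token_id,
            sequence_length=self.sequence_length,
            return_padding_mask=True,
        )
        self.built = True
```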

You might be able to trigger a bug if you save a `MistralCausalLM` in the `.keras` format, then load the model back and check `causal_lm.preprocessor.packer.start_value`. Not sure, but we should probably check.
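A hedged sketch of that check (file names are arbitrary; the preset name is from this PR):

```python
import keras
import keras_nlp

# Save a preset-loaded causal LM in the .keras format...
causal_lm = keras_nlp.models.MistralCausalLM.from_preset("mistral_7b_en")
causal_lm.save("mistral_causal_lm.keras")

# ...then load it back and inspect the packer's start token. With an
# `__init__`-time lookup, the restored packer could hold a stale or
# unset `start_value`, since it was built before assets were loaded.
restored = keras.models.load_model("mistral_causal_lm.keras")
print(restored.preprocessor.packer.start_value)
print(restored.preprocessor.tokenizer.start_token_id)  # should match
```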

@tirthasheshpatel (Contributor, Author)

Thanks for noticing this @mattdangerw! Should be fixed now. Let me know if it looks good to you!

@mattdangerw (Member) left a comment

lgtm!

@mattdangerw merged commit 70c57cc into keras-team:master on Feb 15, 2024
@tirthasheshpatel deleted the mistral-presets branch on Feb 16, 2024 at 07:58

Labels

type:feature New feature or request

Development

Successfully merging this pull request may close these issues:

Preset and doc for Mistral (multilingual) (#1418)
