
Conversation


@vpj vpj commented Mar 27, 2023

What does this PR do?

This PR adds the 9B parameter GeoV language model trained by Georges Harik.

@vpj vpj marked this pull request as draft March 27, 2023 16:30
@vpj vpj marked this pull request as ready for review March 28, 2023 09:59
Collaborator

sgugger commented Mar 28, 2023

cc @ArthurZucker

Author

vpj commented Mar 30, 2023

@ArthurZucker
Collaborator

Hey @vpj, feel free to ping me for any guidance! 😉 Also, if you need a review, tell me.

Author

vpj commented Mar 30, 2023

Yeah, I need a review. I'm new to Hugging Face transformers; I just went by the tutorials. Let me know what else needs to be done in order to merge this. Thanks

Author

vpj commented Mar 31, 2023

@ArthurZucker

@ArthurZucker
Collaborator

Sure! Reviewing now!

@ArthurZucker (Collaborator) left a comment


Good work! It's already pretty clean.
My main comments are about the naming and the missing "copied from" statements.
Plus, I am pretty sure we don't need a new tokenizer.

Collaborator


Let's change the order of the classes. General functions should be at the beginning. It's a nit, but it's a convention that makes it easier to read the entire file! Follow, for example, what you can see in T5 or any other model!

Author


I used what was in gpt-neox. I moved the PreTrainedModel class down; let me know if it's ok.

model.to(torch_device)

inputs = tokenizer("My favorite food is", return_tensors="pt").to(torch_device)
expected_output = "My favorite food is pizza. I love pizza. I love pizza. I"
Collaborator


Suggested change:
-expected_output = "My favorite food is pizza. I love pizza. I love pizza. I"
+EXPECTED_OUTPUT = "My favorite food is pizza. I love pizza. I love pizza. I"

Also do you mind adding a test with the logits of the model?
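For context, an "expected logits" integration test usually hard-codes a small slice of the model's output and compares it against reference values with a tolerance. A minimal sketch of the pattern (the numbers below are made-up stand-ins, not real GeoV logits; in the actual test the slice would come from model(input_ids).logits):

```python
import math

# Sketch of the "expected logits slice" pattern used in transformers
# integration tests. In the real test, logits_slice would be computed from
# model(input_ids).logits, and EXPECTED_SLICE would hold values recorded
# from a trusted run of the model. Values here are illustrative stand-ins.
logits_slice = [0.1093, -1.2458, 2.3311]    # pretend: logits[0, 0, :3]
EXPECTED_SLICE = [0.1093, -1.2458, 2.3311]  # hard-coded reference values

def allclose(a, b, atol=1e-4):
    """Element-wise comparison with an absolute tolerance."""
    return all(math.isclose(x, y, abs_tol=atol) for x, y in zip(a, b))

assert allclose(logits_slice, EXPECTED_SLICE)
print("logits match")
```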

Author


Sorry, I didn't get what you meant about the test with the logits of the model.

Collaborator


This model should be almost entirely the same as Reformer or GPTNeoX, so let's add "copied from" statements wherever we can! Also, we should rename every GeoV to Geov in the names of the classes; it's gonna be more convenient.

Author


Why rename GeoV to Geov?

Collaborator


Pretty sure we don't need a new tokenizer for this. We just have to add \n as a special token and it will not be processed by the spm model (consider looking at the reformer or big_bird tokenizers, which should be usable).

Author


Looking into it

Author


I previously tried adding \n as a special token. But from tokenization_utils.py and tokenization_utils_base.py, it looked to me like it assigns the id len(self) + 1 to new tokens, whereas we need to add a special token with a specific id. I went through reformer but couldn't figure out how to do that. Can you help?

Collaborator


Sure. Gimme a bit of time; I'll try to find the best solution to deal with this.

Author

vpj commented Apr 3, 2023

FAILED tests/models/whisper/test_modeling_flax_whisper.py::FlaxWhisperModelTest::test_equivalence_pt_to_flax - AssertionError: 1.1205673e-05 not less than or equal to 1e-05 : outputs.encoder_last_hidden_state: Difference between PyTorch and Flax is 1.1205673217773438e-05 (>= 1e-05).

This is why the torch_and_flax test is failing.

Author

vpj commented Apr 3, 2023

@ArthurZucker I very much appreciate the help so far. Can you please help me get this PR ready by tomorrow, since I won't be available for a week after that? Thank you

@HuggingFaceDocBuilderDev

The docs for this PR live here. All of your documentation changes will be reflected on that endpoint.

Author

vpj commented Apr 3, 2023

FAILED tests/extended/test_trainer_ext.py::TestTrainerExt::test_run_seq2seq_no_dist - TypeError: unsupported operand type(s) for +: 'NoneType' and 'int'
FAILED tests/models/pix2struct/test_image_processing_pix2struct.py::Pix2StructImageProcessingTest::test_expected_patches - PIL.UnidentifiedImageError: cannot identify image file <_io.BytesIO object at 0x7f2bc8801950>

These errors are for tests_torch

@ArthurZucker
Collaborator

These two tests seem unrelated to your PR; pull from main and normally they should disappear

@ArthurZucker
Collaborator

Ok, the history of the PR got a little bit messed up 😅 It's alright, it can happen from time to time! You can either rebase on main starting from [4bd65f3](https://github.com/huggingface/transformers/pull/22403/commits/4bd65f354c8366feccae95278c1f0b3a85110b3b) (for example, as it is one commit that is unaffected), or you can do a soft reset to your first commit, add only your modifications, and force push. And then pull from main
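The soft-reset route can be sketched as follows. This is a hedged illustration played out in a throwaway repository so the commands are safe to run as-is; on the real PR the branch name, commit messages, and push target all differ:

```shell
# Demonstration of "soft reset to your first commit, then force push",
# in a disposable repo (commit messages here are illustrative):
set -e
tmp=$(mktemp -d) && cd "$tmp"
git init -q demo && cd demo
git -c user.email=demo@example.com -c user.name=demo \
    commit -q --allow-empty -m "add GeoV model"
git -c user.email=demo@example.com -c user.name=demo \
    commit -q --allow-empty -m "accidental merge noise"
# Undo the later commit(s) but keep all file changes staged:
git reset -q --soft HEAD~1
git rev-list --count HEAD   # prints 1: only the first commit remains
# On the real PR branch you would then re-commit your changes and run:
#   git push --force origin <your-branch>
```

Merging main afterwards (rather than rebasing again) keeps the already-reviewed history intact.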

@ArthurZucker
Collaborator

The styling depends on the version of black that you are using. It seems like most of the tests are now good; the last one should pass with a pip install black==23.1

Author

vpj commented Apr 3, 2023

Yeah, I messed it up by doing a rebase instead of a merge

Author

vpj commented Apr 3, 2023

I am not sure what the check means by imports order/format; to me it looks quite similar to other files.

@ArthurZucker
Collaborator

Ok, this is ruff acting up. I use ruff 0.0.258 (we recently pinned the correct one)

Author

vpj commented Apr 3, 2023

What should I do? Do I have to install ruff 0.0.258 and run make style?

@ArthurZucker (Collaborator) left a comment


Okay, thanks a lot for all the cleanup! With this I can better see the actual changes. Given this, I am not really sure that we have to go through all the trouble of adding everything to transformers!
The easiest way to share the model is to put it on the hub using this tutorial.

I am sorry if this is more work, as you must have created the dev env and so on! But the way the code is now looks good! And it should blend in properly as a Custom Model.

Especially for the tokenization (you would have had to add a testing file), you can just do something like

class GeovTokenizer(RoformerTokenizer):
    def __init__(self):
        super().__init__()

    def convert_tokens_to_ids(self):
        ...

    def convert_ids_to_tokens(self):
        ...

which would be the only things you would have to rewrite! I can also help you as much as I can on adding this to the hub directly!


return outputs

@classmethod
Collaborator


I don't see it on the attention class, but yes, if the whole attention class is wrapped, you don't need to put it everywhere.

Author

vpj commented Apr 3, 2023

Oh thanks, didn't know that. I saw this https://huggingface.co/docs/transformers/add_new_model and thought I had to create a pull request.

So, just to make sure I'm clear, should I close this PR and share the model according to https://huggingface.co/docs/transformers/custom_models?

@ArthurZucker
Collaborator

Yes! It would be the best 😉 Thanks for your understanding!

Author

vpj commented Apr 4, 2023

Just out of curiosity, how do you choose which models to add to the repo and what goes to the hub?

Author

vpj commented Apr 4, 2023

I added it to the hub, but it doesn't work with pipelines (text-generation).

How do I register the model for text-generation?

This is what I'm doing now:

  GeoVConfig.register_for_auto_class()
  GeoVModel.register_for_auto_class("AutoModel")
  GeoVForCausalLM.register_for_auto_class("AutoModelForCausalLM")
  GeoVTokenizer.register_for_auto_class()

Trying to load the pipeline with

generator = pipeline(model="GeoV/GeoV-9b", trust_remote_code=True)

gives the error

The model 'GeoVForCausalLM' is not supported for text-generation. Supported models are ...

Thanks

Collaborator

ArthurZucker commented Apr 4, 2023

Just out of curiosity, how do you choose which models to add to the repo and what goes to the hub?

The more we grow, the more we are trying to add models to the hub! Especially if the model does not have a lot of changes compared to a model that we already support!

For the issue, I think you have to update the mapping MODEL_FOR_MASKED_LM_MAPPING by adding your model

Author

vpj commented Apr 4, 2023

How can I change MODEL_FOR_MASKED_LM_MAPPING if I'm adding to the hub?

@ArthurZucker
Collaborator

The same way you did for the AUTO_CONFIG_MAPPING.
An example from here:

config.json:
...
"auto_map": {
  "AutoConfig": "configuration_glm.GLMConfig",
  "AutoModel": "modeling_glm.GLMModel",
  "AutoModelForSeq2SeqLM": "modeling_glm.GLMForConditionalGeneration",
  "AutoModelForMultipleChoice": "modeling_glm.GLMForMultipleChoice",
  "AutoModelForSequenceClassification": "modeling_glm.GLMForSequenceClassification"
  },
...
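Adapted to this model, the corresponding entry would presumably look something like the fragment below. This is a sketch: the module file names configuration_geov.py and modeling_geov.py are assumptions about how the hub repo is laid out, while the class names are the ones already used in this PR. Note also that the text-generation pipeline generally resolves models through the causal-LM mapping:

```json
"auto_map": {
  "AutoConfig": "configuration_geov.GeoVConfig",
  "AutoModel": "modeling_geov.GeoVModel",
  "AutoModelForCausalLM": "modeling_geov.GeoVForCausalLM"
}
```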

@ArthurZucker
Collaborator

So, my bad, you just need to add AutoModelForMaskedLM!

Author

vpj commented Apr 5, 2023

This is a causal LM; is it OK to add it to masked LM?

@ArthurZucker
Collaborator

Ah, sorry; for a causal LM it should be AutoModelForSeq2SeqLM

Contributor

github-actions bot commented May 5, 2023

This issue has been automatically marked as stale because it has not had recent activity. If you think this still needs to be addressed please comment on this thread.

Please note that issues that do not follow the contributing guidelines are likely to be ignored.

@github-actions github-actions bot closed this May 13, 2023

4 participants