
Issue instantiating a keras_nlp.models.Backbone from a model preset of Hugging Face handles #1574

Closed
ghost opened this issue Apr 12, 2024 · 5 comments

ghost commented Apr 12, 2024

Describe the bug

I am unable to instantiate a keras_nlp.models.Backbone from a model preset given as a Hugging Face handle, and I get the following error:

/usr/local/lib/python3.10/dist-packages/huggingface_hub/utils/_token.py:88: UserWarning: 
The secret `HF_TOKEN` does not exist in your Colab secrets.
To authenticate with the Hugging Face Hub, create a token in your settings tab (https://huggingface.co/settings/tokens), set it as secret in your Google Colab and restart your session.
You will be able to reuse this secret in all of your notebooks.
Please note that authentication is recommended but still optional to access public models or datasets.
  warnings.warn(
config.json: 100% 462/462 [00:00<00:00, 18.6kB/s]
---------------------------------------------------------------------------
KeyError                                  Traceback (most recent call last)
[<ipython-input-2-085dc66f3fb4>](https://localhost:8080/#) in <cell line: 1>()
----> 1 backbone = keras_nlp.models.Backbone.from_preset("hf://dmis-lab/biobert-v1.1")

1 frames
[/usr/local/lib/python3.10/dist-packages/keras_nlp/src/utils/preset_utils.py](https://localhost:8080/#) in check_config_class(preset, config_file)
    409     with open(config_path) as config_file:
    410         config = json.load(config_file)
--> 411     return keras.saving.get_registered_object(config["registered_name"])
    412 

KeyError: 'registered_name'

To Reproduce
https://colab.research.google.com/drive/1sL3dEM8ZOLCbQ5RM1P6usrrzfqk5aTbQ?usp=sharing

Expected behavior
This line should probably be changed?
https://github.com/keras-team/keras-nlp/blob/master/keras_nlp/utils/preset_utils.py#L411

E.g., a config.json from HF: https://huggingface.co/google-bert/bert-base-uncased/resolve/main/config.json
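
For context, the KeyError happens because a transformers-style config.json (like the one linked above) has no "registered_name" key; it carries keys such as "model_type" and "architectures" instead. A quick way to confirm this from the linked file (nothing KerasNLP-specific below):

import json
from urllib.request import urlopen

# Inspect the Hugging Face config.json linked above; note there is no "registered_name".
url = "https://huggingface.co/google-bert/bert-base-uncased/resolve/main/config.json"
with urlopen(url) as response:
    config = json.load(response)
print(sorted(config))               # includes "architectures", "model_type", ...
print("registered_name" in config)  # False -> the KeyError above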

Would you like to help us fix it?

@mattdangerw
Member

I believe this is because you are attempting to load a huggingface/transformers model as a KerasNLP model. The config and weights are in a fundamentally different format. See #1294 for a bit more of a deep dive here.

I think there are two actionable things here:

  • Long term, we want to offer easy ways to convert transformers models to keras_nlp models, and vice versa. TBD on how exactly that should work. A set of tools shipped with keras-nlp? Suggestions welcome.
  • More immediately, we might want to make sure we can detect this case and give a good error message. Especially because both Transformers and KerasNLP use a config.json. It's confusing!

@Wauplin @SamanehSaadat cc'ing for thoughts.
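
A minimal sketch of what such a detection could look like, modeled on the check_config_class frame in the traceback above (the signature, wording, and placement here are hypothetical, not the actual KerasNLP code):

import json
import keras

def check_config_class(preset, config_path):
    # Hypothetical defensive variant of the function in preset_utils.py.
    with open(config_path) as config_file:
        config = json.load(config_file)
    if "registered_name" not in config:
        # A transformers config.json typically has "model_type"/"architectures"
        # but no "registered_name"; raise a readable error instead of a KeyError.
        raise ValueError(
            f"The config.json in preset '{preset}' is not a KerasNLP preset config "
            "(missing 'registered_name'). It looks like a huggingface/transformers "
            "checkpoint, which cannot be loaded directly as a KerasNLP model."
        )
    return keras.saving.get_registered_object(config["registered_name"])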

@Wauplin
Contributor

Wauplin commented Apr 15, 2024

More immediately, we might want to make sure we can detect this case and give a good error message. Especially because both Transformers and KerasNLP use a config.json. It's confusing!

I see two places in the code where we could add a check: either when we download from the Hub (check the metadata.json file, for instance?) or when we load the model from a local preset. Both are valid IMO. If we add the check when downloading the files from HF, it would be HF-specific and therefore would not apply if a user manually downloads files and tries to load them (you'll have such usage for sure!). If we add the check when loading the files, it would not be specific to HF, which is somewhat better IMO. It would mean checking the consistency of the config.json against what's expected. WDYT?

Agree on the longer-term goal as well, but I'll defer that topic to @Rocketknight1, who's more knowledgeable on the transformers side (and it's most likely to be discussed in a separate issue?).

@Rocketknight1

@mattdangerw I think a short-term check that raises a sensible error makes sense as the first step. Longer term, conversion should be possible: once we detect a transformers checkpoint in KerasNLP, as long as KerasNLP already supports that architecture, we could just have a mapping of config attributes, tokenizer vocab, and layer names to convert the checkpoint.

In theory, simple. In practice, I'm sure there are lots of painful edge cases to worry about, but I suspect we'd get most of the benefit just from supporting a few of the most popular architectures (e.g. Gemma/Llama/Mistral/Mixtral), and maybe that wouldn't be so bad!
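
As one concrete illustration of the config-attribute mapping idea, for BERT only (the left-hand names are from the public bert-base-uncased config.json, the right-hand names are keras_nlp.models.BertBackbone constructor arguments as I understand them; a sketch, not a planned converter):

# Hypothetical transformers -> KerasNLP config mapping, BERT only.
HF_TO_KERAS_NLP_BERT = {
    "vocab_size": "vocabulary_size",
    "num_hidden_layers": "num_layers",
    "num_attention_heads": "num_heads",
    "hidden_size": "hidden_dim",
    "intermediate_size": "intermediate_dim",
    "hidden_dropout_prob": "dropout",
    "max_position_embeddings": "max_sequence_length",
}

def convert_bert_config(hf_config: dict) -> dict:
    # Translate only the overlapping fields; the weights and tokenizer vocab
    # would need their own layer-name and file mappings.
    return {
        keras_name: hf_config[hf_name]
        for hf_name, keras_name in HF_TO_KERAS_NLP_BERT.items()
        if hf_name in hf_config
    }

Something like keras_nlp.models.BertBackbone(**convert_bert_config(hf_config)) could then build the architecture, with the weight copy handled separately.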

@SamanehSaadat
Member

@Wauplin I agree that we should add a check when loading the model so that both local and remote presets are covered. I believe the check should be done at the beginning of our from_preset(). I'll add the check.
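
A rough sketch of that placement, assuming a guard along the lines sketched earlier in the thread (load_preset_config below is a made-up stand-in for however the preset's config.json is actually resolved):

import json

def load_preset_config(preset):
    # Made-up stand-in: resolve and parse the preset's config.json.
    with open(f"{preset}/config.json") as f:
        return json.load(f)

def from_preset(preset, **kwargs):
    # Hypothetical: validate the config first, before any class lookup or weight loading.
    config = load_preset_config(preset)
    if "registered_name" not in config:
        raise ValueError(
            f"Preset '{preset}' does not contain a KerasNLP config.json; "
            "it may be a huggingface/transformers checkpoint."
        )
    # ...continue with the normal KerasNLP loading path...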

@SamanehSaadat
Member

The long-term approach has been addressed in #1662, so I'll close this issue.
