feat: Add .from_hf_hub() and similar methods to ModelMeta #3737
# Conflicts: # mteb/models/model_meta.py
    license=model_license,
    framework=frameworks,
    training_datasets=None,
    similarity_fn_name=None,
We can fetch similarity function from: https://huggingface.co/sentence-transformers/embeddinggemma-300m-medical/blob/main/config_sentence_transformers.json#L24
but if not I would be ok with assuming cosine? (I suspect it is also in the model data).
Feel free to make this a separate issue.
might be easier to just fetch from the model - but if we could avoid loading/downloading the model that would be great
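A minimal sketch of the idea discussed above: read `similarity_fn_name` from `config_sentence_transformers.json` alone, so the model weights are never downloaded, and fall back to cosine when the file or key is missing. The function names `similarity_fn_from_config` and `fetch_similarity_fn` are hypothetical and not part of this PR; `hf_hub_download` and `EntryNotFoundError` come from `huggingface_hub`.

```python
import json
from pathlib import Path
from typing import Any, Optional


def similarity_fn_from_config(
    config: dict[str, Any], default: Optional[str] = "cosine"
) -> Optional[str]:
    """Return the similarity function named in a parsed config, else `default`.

    Hypothetical helper; the cosine fallback is the assumption suggested in the thread.
    """
    value = config.get("similarity_fn_name")
    return value if value else default


def fetch_similarity_fn(repo_id: str, revision: Optional[str] = None) -> Optional[str]:
    """Download only the small config file (not the weights) and read the field."""
    # Imported lazily so the parsing helper works without huggingface_hub installed.
    from huggingface_hub import hf_hub_download
    from huggingface_hub.utils import EntryNotFoundError

    try:
        path = hf_hub_download(
            repo_id, "config_sentence_transformers.json", revision=revision
        )
    except EntryNotFoundError:
        # The repository has no sentence-transformers config: assume cosine.
        return "cosine"
    return similarity_fn_from_config(json.loads(Path(path).read_text()))
```

Keeping the parsing step separate from the download step means the fallback logic can be unit-tested without network access.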
Added from_hub_for_sentence_transformer; maybe not the best naming.
Hmm I don't think that is what we want. Why can't that just be the default for from_hub?
mteb/models/model_meta.py
    """
    from mteb.models import CrossEncoderWrapper

    meta = cls.from_hf_hub(model.model.name_or_path, revision, compute_metadata)
Maybe worth splitting this into an inner _fetch_metadata_from_hub and a public fetch_from_hub.
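One way the suggested split could look, as a sketch only: a private helper does the Hub call and returns raw fields, and a public classmethod builds the object from them. `ModelMetaSketch`, its fields, and the `fetcher` parameter are hypothetical stand-ins for illustration and do not reflect the real `ModelMeta` signature; `model_info` is the `huggingface_hub` call.

```python
from dataclasses import dataclass
from typing import Any, Callable, Optional


@dataclass
class ModelMetaSketch:
    # Hypothetical, heavily reduced stand-in for ModelMeta.
    name: str
    revision: Optional[str] = None
    license: Optional[str] = None

    @staticmethod
    def _fetch_metadata_from_hub(name: str, revision: Optional[str]) -> dict[str, Any]:
        """Inner helper: the only place that talks to the Hub."""
        from huggingface_hub import model_info  # lazy import, network needed here

        info = model_info(name, revision=revision)
        card = info.card_data
        return {"revision": info.sha, "license": getattr(card, "license", None)}

    @classmethod
    def fetch_from_hub(
        cls,
        name: str,
        revision: Optional[str] = None,
        fetcher: Optional[Callable[[str, Optional[str]], dict[str, Any]]] = None,
    ) -> "ModelMetaSketch":
        """Public entry point: builds the object from whatever the helper returns."""
        fetch = fetcher or cls._fetch_metadata_from_hub
        raw = fetch(name, revision)
        return cls(name=name, revision=raw.get("revision"), license=raw.get("license"))
```

With this shape, the sentence-transformer and cross-encoder constructors could reuse the inner helper, and tests can pass a fake `fetcher` instead of hitting the Hub.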
    if "API" in self.framework or self.name is None:
        return None
- Let us add a warning here
- Shouldn't this be based on the number of parameters? It could have API tag while also having public weights
Currently, we have the API tag only for private models, so I'm not sure what to do here.
But if it is private, isn't the number of parameters None? (We have a few private models where the number of parameters is public, but then I suppose we could also estimate the memory usage?) Though I see that might be a bit odd.
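A sketch of the alternative raised in this thread: base the early return on whether a parameter count is known rather than on the "API" tag, and emit a warning instead of returning None silently. The function name, the warning text, and the 4-bytes-per-parameter (fp32) estimate are assumptions for illustration, not the behaviour merged in this PR.

```python
import warnings
from typing import Optional

# Assumption: weights stored as fp32, i.e. 4 bytes per parameter.
BYTES_PER_FP32_PARAM = 4


def estimate_memory_usage_mb(
    name: Optional[str], n_parameters: Optional[int]
) -> Optional[float]:
    """Estimate memory from the parameter count; warn when it cannot be computed."""
    if name is None or n_parameters is None:
        warnings.warn(
            f"Cannot estimate memory usage for {name!r}: "
            "model name or number of parameters is unknown.",
            stacklevel=2,
        )
        return None
    return n_parameters * BYTES_PER_FP32_PARAM / 1024**2
```

Under this variant, a model tagged "API" that still publishes its parameter count would get an estimate, which is the case the reviewer asks about.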
# Conflicts: # mteb/models/get_model_meta.py # mteb/models/model_meta.py
Problem with HF limits again, or HF is having some problems, but there is nothing on their status page (AWS seems fine too).
KennethEnevoldsen
left a comment
I think we are at a solid state!

Close #3735
Close #3695
Close #3734
Move _model_meta_from_hf_hub, _model_meta_from_cross_encoder, _model_meta_from_sentence_transformer to ModelMeta class