Conversation

@alvarobartt alvarobartt commented May 27, 2025

What does this PR do?

This PR adds a missing DistilBERT weight-naming variant so that those checkpoints can also be loaded in Text Embeddings Inference (TEI). The weight-naming mismatch affects all the DistilBERT models under https://huggingface.co/sentence-transformers, so this PR should resolve it and enable every Sentence Transformers model with the DistilBERT architecture (note that those models already appear as supported, since the architecture itself is supported, but without this patch the deployment would fail). See e.g. https://huggingface.co/sentence-transformers/distiluse-base-multilingual-cased-v2.
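The fix boils down to accepting more than one naming convention when looking up a tensor in a checkpoint. Below is a minimal, language-agnostic sketch of that idea in Python; the prefix strings and the `resolve_tensor` helper are illustrative assumptions, not TEI's actual identifiers (TEI itself is written in Rust):

```python
# Hypothetical sketch: some checkpoints store DistilBERT tensors under a
# prefixed key (e.g. "distilbert.embeddings...") while others use the bare
# name. The loader tries each known naming variant before giving up.
# The prefixes below are illustrative, not TEI's real variant list.
KNOWN_PREFIXES = ["", "distilbert."]

def resolve_tensor(weights: dict, canonical_name: str):
    """Return the tensor stored under any known naming variant of canonical_name."""
    for prefix in KNOWN_PREFIXES:
        key = prefix + canonical_name
        if key in weights:
            return weights[key]
    raise KeyError(f"no naming variant of {canonical_name!r} found in checkpoint")
```

With this kind of fallback, a checkpoint keyed as `distilbert.embeddings.word_embeddings.weight` and one keyed as plain `embeddings.word_embeddings.weight` both load through the same code path, which is why the patch unblocks the whole family of models rather than a single repository.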

Fixes #600

Before submitting

  • This PR fixes a typo or improves the docs (you can dismiss the other checks if that's the case).
  • Did you read the contributor guideline, Pull Request section?
  • Was this discussed/approved via a GitHub issue or the forum? Please add a link to it if that's the case.
  • Did you make sure to update the documentation with your changes? Here are the documentation guidelines, and here are tips on formatting docstrings.
  • Did you write any new necessary tests?

Who can review?

@Narsil

@alvarobartt alvarobartt marked this pull request as ready for review May 27, 2025 10:15
@alvarobartt alvarobartt requested a review from Narsil May 28, 2025 09:40
@Narsil Narsil merged commit cf423d1 into main Jun 2, 2025
14 checks passed
@Narsil Narsil deleted the patch-distilbert-variants branch June 2, 2025 13:57

Development

Successfully merging this pull request may close these issues.

distiluse-base-multilingual-cased-v2 error when start