
Conversation

@nguyenhoan1988

Categorize text into unseen labels by leveraging pre-trained Sentence Transformers.
Input sentences and candidate labels are encoded into dense vectors, and the label whose embedding has the highest cosine similarity to the sentence's embedding is predicted.
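A minimal sketch of the idea, using only the public `SentenceTransformer` API; the model name and label template below are illustrative choices, not part of this PR:

```python
from sentence_transformers import SentenceTransformer, util

# Illustrative model choice; any Sentence Transformer model would work here.
model = SentenceTransformer("all-MiniLM-L6-v2")

sentences = ["one day I will see the world"]
labels = ["travel", "cooking", "dancing"]

# Wrapping bare labels in a template can make them more sentence-like
# (an assumed template, not a fixed part of this proposal).
label_texts = [f"This example is about {label}." for label in labels]

sentence_embeddings = model.encode(sentences, convert_to_tensor=True)
label_embeddings = model.encode(label_texts, convert_to_tensor=True)

# For each sentence, predict the label whose embedding is most similar.
scores = util.cos_sim(sentence_embeddings, label_embeddings)  # (num_sentences, num_labels)
for sentence, row in zip(sentences, scores):
    print(sentence, "->", labels[int(row.argmax())])
```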

@tomaarsen
Member

Hello!

This is really cool; it reminds me a bit of SetFit as well. However, I'm a little unsure whether it fits into Sentence Transformers like this currently. Normally, we implement this kind of functionality in utility functions, e.g.: https://sbert.net/docs/package_reference/util.html#sentence_transformers.util.paraphrase_mining

Or in a third party project that extends Sentence Transformers with training and inference of strong zero-shot classification models.

Beyond that, although Sentence Transformer models are definitely good out of the box for zero-shot classification, they're often not trained specifically for it (especially with the label_template), and people might be better off using a zero-shot-classification model on Hugging Face.

In short, I'm not very sure what to do with this yet.

  • Tom Aarsen

@nguyenhoan1988
Author

Hi,

I have developed a private zero-shot classification benchmark, and off-the-shelf Sentence Transformer models perform quite well on it.
When I tested the NLI approach of the zero-shot-classification models on Hugging Face, it was very compute intensive: one sentence requires N predictions, where N is the number of labels (see the sketch below).
SetFit, meanwhile, is not truly zero-shot in that sense, since it requires a few labeled examples per class.
I will probably turn this functionality into a separate library that extends Sentence Transformers for zero-shot classification.
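
For reference, a minimal sketch of the NLI-based approach via the Hugging Face `transformers` pipeline (the model choice is illustrative), which shows where the N-predictions-per-sentence cost comes from:

```python
from transformers import pipeline

# Illustrative model; any NLI-trained zero-shot-classification model works.
classifier = pipeline("zero-shot-classification", model="facebook/bart-large-mnli")

result = classifier(
    "one day I will see the world",
    candidate_labels=["travel", "cooking", "dancing"],
)
# Under the hood, each (sentence, label) pair is one NLI forward pass,
# so N candidate labels cost N predictions per sentence.
print(result["labels"][0])
```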

Thanks a lot for your feedback,
Hoan
