add qwen2 classification #1817
ranzhejiang
commented
Mar 5, 2025
- To meet customer demand, we need to add a classification interface for Qwen2 on Gaudi.
- You can test it with the following code
LGTM!

Ok, thanks @Luca-Calabria. Waiting for @regisss then.

@regisss Would you please review this PR? Thank you.

@ranzhejiang There are merge conflicts to solve following #1698. Can you also add a test for it? It can be the code snippet you posted above 🙂

Ok, I will resolve the conflicts later and add this demo to the examples. Thanks for the review.
97ea0ca to 6dedcbb

@ranzhejiang @regisss any update here?

My comment above with
should be addressed

Sorry for the late reply; I was busy with other work. Regarding the first comment, I have made sure my code is up to date with Transformers v4.49, and there are no conflicts now. Regarding the second comment, I had not found suitable tests for this text-classification example, so I will add the code snippet to https://github.com/huggingface/optimum-habana/tree/main/examples/text-classification before 5.23

Hi @ranzhejiang! I hope my question is not self-evident, but why do we need to implement those custom task classes? My understanding is that if they all use the same base model
    def __init__(self, config):
        super().__init__(config)
        self.num_labels = config.num_labels
        self.model = GaudiQwen2Model(config)
My point is that the line in adapt_transformers_to_gaudi, transformers.models.qwen2.modeling_qwen2.Qwen2Model = GaudiQwen2Model, already makes these qwen2 models with task heads compatible with and optimized for Gaudi.
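The mechanism behind this point can be illustrated with a minimal, library-free sketch. It assumes the task-head class builds its backbone by referring to the base class by name inside `__init__`, so that rebinding that name on the module changes what later gets instantiated. The module and the class names `BaseModel`, `ForSequenceClassification`, and `OptimizedBaseModel` are hypothetical stand-ins, not the actual Transformers or optimum-habana classes.

```python
import types

# Hypothetical stand-in for a modeling module such as
# transformers.models.qwen2.modeling_qwen2 (not the real module).
modeling = types.ModuleType("fake_modeling")
exec(
    """
class BaseModel:
    def __init__(self, config):
        self.config = config

class ForSequenceClassification:
    def __init__(self, config):
        self.num_labels = config["num_labels"]
        # BaseModel is looked up in the module globals when __init__ runs,
        # not when the class is defined.
        self.model = BaseModel(config)
""",
    modeling.__dict__,
)


class OptimizedBaseModel(modeling.BaseModel):
    """Hypothetical optimized drop-in replacement for BaseModel."""


# Same pattern as the assignment quoted above: rebind the module attribute.
modeling.BaseModel = OptimizedBaseModel

# The unmodified task-head class now builds the optimized backbone.
head = modeling.ForSequenceClassification({"num_labels": 2})
patched_type_name = type(head.model).__name__
print(patched_type_name)
```

Under that assumption, the task-head class needs no subclass of its own to pick up the replacement backbone. Whether this also holds in a multi-card setup is a separate question that the sketch does not address.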
What you said makes sense. When I first opened this PR, that approach did not seem to work well with auto TP. It now works well on a single card, but I still need to confirm what you said in an 8-card environment.
What you said makes sense. I have changed my code so that it no longer uses GaudiQwen2Model for initialization, and the new PR is #2062.