What should I do if I want to improve performance on HellaSwag? #2154

Open
mathCrazyy opened this issue Dec 12, 2024 · 3 comments

@mathCrazyy

[image]

I have tried a few approaches, for example training on a dataset such as OpenO1, knowledge distillation (KD) from 14B to 3B, and LoRA, but the results are poor:
[image]
The KD result only reaches 96.8% of the original Qwen2.5 3B model's score.
What should I do? Thanks.

@joecummings
Contributor

Did you fine-tune the 14B model on your desired dataset first? That's an important pre-step to knowledge distillation.
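
To illustrate why the teacher's state matters, here is a minimal, framework-free sketch of a typical KD objective: cross-entropy on the labels blended with a forward KL term that pulls the student toward the teacher's token distribution. If the teacher has not been fine-tuned on the target dataset, that KL term likely carries a weaker task-specific signal. The function names and the `kd_ratio` parameter are illustrative assumptions for this sketch, not a copy of any library's implementation.

```python
import math

def log_softmax(logits):
    # Numerically stable log-softmax over one vocabulary-sized list of logits.
    m = max(logits)
    log_z = m + math.log(sum(math.exp(x - m) for x in logits))
    return [x - log_z for x in logits]

def forward_kl(teacher_logits, student_logits):
    # KL(teacher || student) at a single token position.
    t_logp = log_softmax(teacher_logits)
    s_logp = log_softmax(student_logits)
    return sum(math.exp(t) * (t - s) for t, s in zip(t_logp, s_logp))

def kd_loss(teacher_logits, student_logits, labels, kd_ratio=0.5):
    # Blend label cross-entropy with the KL term, averaged over positions.
    # kd_ratio is a hypothetical mixing weight: 0 means labels only, 1 means teacher only.
    ce_total = 0.0
    kl_total = 0.0
    for t, s, y in zip(teacher_logits, student_logits, labels):
        ce_total += -log_softmax(s)[y]
        kl_total += forward_kl(t, s)
    n = len(labels)
    return (1 - kd_ratio) * ce_total / n + kd_ratio * kl_total / n

# Toy example: two token positions, vocabulary of size 2.
teacher = [[2.0, 0.0], [0.0, 1.0]]
student = [[0.0, 0.0], [0.0, 0.0]]
print(kd_loss(teacher, student, labels=[0, 1], kd_ratio=0.5))
```

The KL term is zero only when the student matches the teacher exactly, so the quality of the teacher's distribution on the target data bounds what the student can learn from it.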

@joecummings added the discussion label on Dec 12, 2024
@mathCrazyy
Author

Sorry, I didn't. I mistakenly thought it was not important.

@joecummings
Contributor

> Sorry, I didn't. I mistakenly thought it was not important.

All good - give that a go and LMK how it works after re-evaluating
