Skip to content

Conversation

clefourrier
Copy link
Member

@clefourrier clefourrier commented Apr 16, 2024

Still missing:

  • merge greedy sampling with the basic greedy evals, to avoid doing generations too many times - we can do sampled generation + normal generations in one step and split after on the num samples needed when we apply the metrics - will divide time taken by 2
  • update all the other launchers - inference endpoints, nanotron, etc - they don't cover it XD

NathanHB
NathanHB previously approved these changes Apr 21, 2024
Copy link
Member

@NathanHB NathanHB left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM ! Do we have metric for gsm8k to compare with other implementations ?

@clefourrier
Copy link
Member Author

I'll check it - however, please don't merge yet, I need to propagate the changes to the other lauchers + simplify greedy :)

@NathanHB
Copy link
Member

Oh mb I thought it ready for review !

@NathanHB NathanHB dismissed their stale review April 22, 2024 09:40

not ready

@clefourrier
Copy link
Member Author

I hope I'll finish it today :)
It'll be a bit bigger because I'm simplifying part of the current system

@clefourrier clefourrier requested a review from NathanHB April 22, 2024 13:09
@clefourrier
Copy link
Member Author

Tests failing because of the hub problems ^^"""

@clefourrier clefourrier mentioned this pull request Apr 23, 2024
@NathanHB
Copy link
Member

Is this good to be merged ? I saw on slack you did not get the same results for mistral models. (do we even know how they ran their tests ?)

@clefourrier
Copy link
Member Author

I'll do more tests today - we could merge it now and adjust later if needed though.

@NathanHB
Copy link
Member

Alright i'm merging it so that we can merge the other PRs

@NathanHB NathanHB merged commit 0a455c4 into main Apr 30, 2024
NathanHB added a commit that referenced this pull request Jul 11, 2024
This PR needs #158 to be merged first.

The main problem is that splits are no longer by size, but they are however now making sure that all batched generative evals are launched with similar evals (same sampling, same eos tokens)

---------

Co-authored-by: Nathan Habib <[email protected]>
hynky1999 pushed a commit that referenced this pull request May 22, 2025

Co-authored-by: Nathan Habib <[email protected]>

* added review change

---------

Co-authored-by: Nathan Habib <[email protected]>
hynky1999 pushed a commit that referenced this pull request May 22, 2025
This PR needs #158 to be merged first.

The main problem is that splits are no longer by size, but they are however now making sure that all batched generative evals are launched with similar evals (same sampling, same eos tokens)

---------

Co-authored-by: Nathan Habib <[email protected]>
NathanHB added a commit that referenced this pull request Sep 19, 2025
Co-authored-by: Nathan Habib <[email protected]>

* added review change

---------

Co-authored-by: Nathan Habib <[email protected]>
NathanHB added a commit that referenced this pull request Sep 19, 2025
This PR needs #158 to be merged first.

The main problem is that splits are no longer by size, but they are however now making sure that all batched generative evals are launched with similar evals (same sampling, same eos tokens)

---------

Co-authored-by: Nathan Habib <[email protected]>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants