@garrett361
Owner

Merge upstream/main into this repository's main branch. Also uses the upstream version of the dataset stats printing.

finbarrtimbers and others added 27 commits July 16, 2025 10:21
* Added scripts to run benchmarks.

* Removed install script.

* Added install script back.
* first pass remap verifier

* make judge json parsing a little more robust

* typoooooooo

* typoooooooo

* fix logic...

* clean logging naming up
* add punk tokenizer

* fix up command
* Made changes.

* Switched to use ray.util.queue.Queue instead of a custom RayQueue class.

* Now, only handles new version.

* Updated benchmark_generators.py and test_grpo_fast.py.

* Cleaned up code from Claude.

* training_step defaults to None.

* Added an info dataclass to replace the tuple.

* Removes assumption that queries_prompt_Q and inference_results_Q are in sync by moving queries_prompt_Q to be a map.

* Cleaned up benchmark

* Added code to split batch sizes.

* Removed benchmark scripts, which are now in a separate PR.

* Now, we create all Ray queues in main, and pass them in as appropriate.

* Removed changes

* Test changes.

* Linter passes

* Added tests.

* Now, we index with the dataset indices.

* Checks and tests pass.

* Ran linter

* Added benchmark scripts back. Whoops.
* Set new default value for num_samples

* Now run N batches at once

* different batch size

* Fix pack length

* Fix pack length

* Fix wasted compute % (was accidentally multiplying by 100), and fix num rollouts (was referencing the wrong variable).

* Now, we save benchmark results to CSV.

* Now show a percentage for time spent generating.

* Updated benchmark saving code.

* Fixed syntax error.

* Fixed benchmark

* Fixed timing code.

* Removed changes to vllm_utils3.py.

* Now, we actually write the data to disk.

* Bigger batch

* Modified benchmark

* Undid changes to benchmark script.

* Temp change

* Undid changes to benchmark script.
it was only being installed in regular Dockerfile

Co-authored-by: Michael Noukhovitch <[email protected]>
Co-authored-by: Saurabh Shah <[email protected]>
* binary reward for code

* style

* binary code reward flag -> pass rate reward threshold
* Now, we run individual prompts through the queue.

* Fixed issues.

* Ran linter

* Fixed linter errors.

* Code lints.

* Test passes.

* Ran linter.

* Ensures that we send single prompts as requests.

* Now, code lints.

* Cleaned up code.

* Fixes test.

* Linter passes.

* Cleaned test up.

* Removed redundant comments.
* Adds flashinfer dep.

* Now, open_instruct builds even on mac.

* Updated install instructions to add flash-infer.

* Now, we set flashinfer as the default attention backend.

* Added flashinfer to the base dockerfile.

* Ran linter.

* Removed extra changes to mason.py.

* Undid changes to uv.lock.

* Updated requirements.txt

* Updated flash-attn version.

---------

Co-authored-by: Hamish Ivison <[email protected]>
* delete function

Signed-off-by: Yu Chin Fabian Lim <[email protected]>

* Update open_instruct/dataset_transformation.py

---------

Signed-off-by: Yu Chin Fabian Lim <[email protected]>
Co-authored-by: Hamish Ivison <[email protected]>
prev-branch: padding-free-squashing-7

Co-authored-by: Hamish Ivison <[email protected]>
* Fix misnamed variables.

* Ran linter.
…lenai#765)

Adds new olmo-core-compatible chat templates. Includes:
* New olmo template with support for function-calling. Includes a basic hard-coded system prompt, and appends "You do not have access to any functions" to any SFT examples that do not include functions.
* Thinker version of the above template, with <think> included in the generation prompt
* R1-style thinker template
These 3 templates mirror our current Tulu templates.

Also includes some necessary changes to the --add_bos logic, to handle the new chat template which does not have a bos token.

Includes a few other QoL fixes:
* Fixes a bug in the olmocore tokenization script re: label mask
* Logs dataset-level statistics during data mixing and tokenization
* Supports easy upsampling during data mixing
* fix up my (jacob's) slightly broken pr

---------

Co-authored-by: jacob-morrison <[email protected]>
* remove moar things

* create on pr

* dont create on pr
@garrett361 garrett361 marked this pull request as ready for review July 23, 2025 16:06
@garrett361 garrett361 requested review from dangxuanhong and removed request for dangxuanhong July 23, 2025 16:06
@garrett361 garrett361 requested a review from fabianlim July 23, 2025 16:06
@garrett361 garrett361 changed the title from "Main merge" to "upstream/main merge and dataset stats" Jul 23, 2025
Collaborator

@dangxuanhong dangxuanhong left a comment

Thanks @garrett361. The changes made to dataset_transformation.py look good to me.

@garrett361 garrett361 merged commit 9be3034 into main Jul 23, 2025
2 checks passed

9 participants