Add new beaker names #803

hamishivi · 2025-07-21T19:55:26Z

Update mason to support the new aliases in https://docs.google.com/document/d/1aNShU0lywa9vjhnVQV_XWx9J3LB2_Z2NF7zCSHIHCJI/edit?tab=t.0

Also remove old cirrascale clusters, since they no longer exist.

In future, we should also use the tag system to get weka/interconnect information if possible.
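As a rough illustration of the tag-based idea above, here is a minimal sketch of alias resolution combined with a tag lookup. Everything in it is hypothetical: the alias names, the cluster names, the tag strings, and the helper functions are placeholders for illustration. They are not the actual Beaker aliases from the linked document, and they are not taken from `mason.py`.

```python
from dataclasses import dataclass, field

# Hypothetical alias table: placeholder names, NOT the real Beaker aliases.
CLUSTER_ALIASES = {
    "example-alias-a": "org/example-cluster-a",
    "example-alias-b": "org/example-cluster-b",
}


@dataclass(frozen=True)
class ClusterInfo:
    """Hypothetical record of a cluster and the tags attached to it."""
    name: str
    tags: frozenset = field(default_factory=frozenset)


# Hypothetical tag data; in practice this would come from the cluster tag system.
CLUSTERS = {
    "org/example-cluster-a": ClusterInfo(
        "org/example-cluster-a", frozenset({"weka", "interconnect"})
    ),
    "org/example-cluster-b": ClusterInfo(
        "org/example-cluster-b", frozenset({"weka"})
    ),
}


def resolve_cluster(name):
    """Map an alias to its full cluster name; unknown names pass through unchanged."""
    return CLUSTER_ALIASES.get(name, name)


def clusters_with_tags(names, required):
    """Return the resolved clusters (in input order) that carry every required tag."""
    required = frozenset(required)
    matched = []
    for name in names:
        full_name = resolve_cluster(name)
        info = CLUSTERS.get(full_name)
        if info is not None and required <= info.tags:
            matched.append(full_name)
    return matched
```

With a lookup of this shape, weka or interconnect support would be derived from tags rather than from hard-coded cluster lists, so a renamed cluster would only require updating the alias table.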

* Update oe-eval.sh to set a default timeout of 48h. (allenai#789)
* Updated configs to support changes. (allenai#790)
* Add benchmark scripts (allenai#786)
  * Added scripts to run benchmarks.
  * Removed install script.
  * Added install script back.
* Add remap verifier (allenai#773)
  * first pass remap verifier
  * make judge json parsing a little more robust
  * typoooooooo
  * typoooooooo
  * fix logic...
  * clean logging naming up
* Ran the linter. (allenai#792)
* fix the URL for code api setup (allenai#791)
  Co-authored-by: Michael Noukhovitch <[email protected]>
* Add nltk setup to uv dockerfile (allenai#785)
  * add punk tokenizer
  * fix up command
* Switches the actors to use the Ray queue. (allenai#784)
  * Made changes.
  * Switched to use ray.util.queue.Queue instead of a custom RayQueue class.
  * Now, only handles new version.
  * Updated benchmark_generators.py and test_grpo_fast.py.
  * CLeaned up code from Claude.
  * training_step defaults to None.
  * Added an info dataclass to replace the tuple.
  * Removes assumption that queries_prompt_Q and inference_results_Q are in sync by moving queries_prompt_Q to be a map.
  * CLeaned up benchmark
  * Added code to split batch sizes.
  * Removed benchmark scripts, which are now in a separate PR.
  * Now, we create all Ray queues in main, and pass them in as appropriate.
  * Removed changes
  * Test changes.
  * Linter passes
  * Added tests.
  * Now, we index with the dataset indices.
  * Checks and tests pass.
  * Ran linter
  * Added benchmark scripts back. Whoops.
  * Set new default value for num_samples
* Updates the benchmark script (allenai#795)
  * Set new default value for num_samples
  * Now run N batches at once
  * different batch size
  * Fix pack length
  * Fix pack length
  * Fix wasted compute % (was accidentally multiplying by 100), and fix num rollouts (was referencing the wrong variable).
  * Now, we save benchmark results to CSV.
  * Now show a percentage for time spent generating.
  * Updated benchmark saving code.
  * Fixed syntax error.
  * Fixed benchmark
  * Fixed timing code.
  * Removed changes to vllm_utils3.py.
  * Now, we actually write the data to disk>
  * Bigger batch
  * Modified benchmark
  * Undid changes to benchmark script.
  * Temp change
  * Undid changes to benchmark script.
* install nginx in uv (allenai#793)
  it was only being installed in regular Dockerfile
  Co-authored-by: Michael Noukhovitch <[email protected]>
  Co-authored-by: Saurabh Shah <[email protected]>
* allow passing local models, bubble up dataset cache errors (allenai#797)
  Co-authored-by: Michael Noukhovitch <[email protected]>
* binary reward for code (allenai#798)
  * binary reward for code
  * style
  * binary code reward flag -> pass rate reward threshold
* Now, we run individual prompts through the queue. (allenai#796)
  * Now, we run individual prompts through the queue.
  * Fixed issues.
  * Ran linter
  * Fixed linter errors.
  * COde lints.
  * Test passes.
  * Ran linter.
  * Ensures that we send single prompts as requests.
  * Now, code lints.
  * Cleaned up code.
  * Fixes test.
  * Linter passes.
  * Cleaned test up.
  * Removed redundant comments.
* Adds flashinfer dep. (allenai#800)
  * Adds flashinfer dep.
  * Now, open_instruct builds even on mac.
  * Updated install instructions to add flash-infer.
  * Now, we set flashinfer as the default attention backend.
  * Added flashinfer to the base dockerfile.
  * Ran linter.
  * Removed extra changes to mason.py.
  * Undid changes to uv.lock.
  * Updated requirements.txt
  * Updated flash-attn version.
  ---------
  Co-authored-by: Hamish Ivison <[email protected]>
* new beaker names (allenai#803)
* Remove Unused DPO Function (allenai#794)
  * delete function
    Signed-off-by: Yu Chin Fabian Lim <[email protected]>
  * Update open_instruct/dataset_transformation.py
  ---------
  Signed-off-by: Yu Chin Fabian Lim <[email protected]>
  Co-authored-by: Hamish Ivison <[email protected]>
* extra reporting (allenai#799)
  prev-branch: padding-free-squashing-7
  Co-authored-by: Hamish Ivison <[email protected]>
* Revert "Now, we run individual prompts through the queue. (allenai#796)" (allenai#804)
  This reverts commit 541058c.
* Fix misnamed variables. (allenai#808)
  * Fix misnamed variables.
  * Ran linter.
* Fix broken syntax. (allenai#809)
  Co-authored-by: Hamish Ivison <[email protected]>
* Add new olmo chat templates, and improve data mixing/tokenization (allenai#765)
  Adds new olmo-core-compatible chat templates. Includes:
  * New olmo template with support for function-calling. Includes a basic hard-coded system prompt, and appends "You do not have access to any functions" to any SFT examples that do not include functions.
  * Thinker version of the above template, has <think> included in the generation prompt
  * R1-style thinker template

  These 3 templates mirror our current Tulu templates
  Also includes some necessary changes to the --add_bos logic, to handle the new chat template which does not have a bos token.
  Includes a few other QoL fixes:
  * Fixes a bug in the olmocore tokenization script re: label mask
  * Logs dataset-level statistics during data mixing and tokenization
  * Supports easy upsampling during data mixing
* Fixes from last PR (allenai#810)
  * fix up my (jacob's) slightly broken pr
  ---------
  Co-authored-by: jacob-morrison <[email protected]>
* Delete run_repro.sh (allenai#813)
* Fix disk space error on image creation (allenai#814)
  * remove moar things
  * create on pr
  * dont create on pr
  * use upstream stats

---------
Signed-off-by: Yu Chin Fabian Lim <[email protected]>
Co-authored-by: Finbarr Timbers <[email protected]>
Co-authored-by: Hamish Ivison <[email protected]>
Co-authored-by: Michael <[email protected]>
Co-authored-by: Michael Noukhovitch <[email protected]>
Co-authored-by: Saurabh Shah <[email protected]>
Co-authored-by: Yu Chin Fabian Lim <[email protected]>
Co-authored-by: Jacob Morrison <[email protected]>

new beaker names

1dbde0d

saurabh111233212 approved these changes Jul 21, 2025

hamishivi merged commit b3e8e70 into main Jul 21, 2025
3 checks passed
