Add e2e dev scripts (#549)
* Add e2e dev scripts

* Add some examples of end to end model training

* quick push
vwxyzjn authored Feb 4, 2025
1 parent 6d97283 commit 0c2aa44
Showing 3 changed files with 1,832 additions and 0 deletions.
15 changes: 15 additions & 0 deletions docs/ai2_internal.md
@@ -306,6 +306,21 @@ done
```
### End-to-end Model Training
For post-training, we often need to train models through all three stages (SFT, DPO, and RLVR). The rough steps are as follows:
1. Run a sweep of SFT training and use the internal leaderboard https://huggingface.co/spaces/allenai/oe-eval-leaderboard to select the best model.
2. Run a sweep of DPO training and select the best model.
3. Train a reward model (RM) on the dataset used by the best DPO model.
4. Use the best DPO model (and the RM) to train an RLVR model.
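
The selection logic in the steps above can be sketched in a few lines of Python. This is only an illustration, not the tooling used in this repository: the `Run` record, the `select_best` and `plan_pipeline` helpers, and all run names, dataset names, and scores are hypothetical, and the sketch assumes each sweep run has already been reduced to a single leaderboard score.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass(frozen=True)
class Run:
    """One run from a sweep. All fields are hypothetical placeholders."""
    name: str      # run or checkpoint identifier
    dataset: str   # dataset the run was trained on
    score: float   # aggregate evaluation score (higher is better)


def select_best(sweep: List[Run]) -> Run:
    """Return the highest-scoring run of a sweep."""
    if not sweep:
        raise ValueError("sweep produced no runs")
    return max(sweep, key=lambda run: run.score)


def plan_pipeline(sft_sweep: List[Run], dpo_sweep: List[Run]) -> Dict[str, str]:
    """Map the four steps onto concrete choices.

    Assumes the DPO sweep was launched from the selected SFT checkpoint.
    """
    best_sft = select_best(sft_sweep)   # step 1
    best_dpo = select_best(dpo_sweep)   # step 2
    return {
        "best_sft": best_sft.name,
        "best_dpo": best_dpo.name,
        "rm_dataset": best_dpo.dataset,  # step 3: RM reuses the best DPO run's dataset
        "rlvr_init": best_dpo.name,      # step 4: RLVR starts from the best DPO model
    }


if __name__ == "__main__":
    # Made-up sweeps, for illustration only.
    sft_sweep = [Run("sft_lr1", "sft_mix", 0.61), Run("sft_lr2", "sft_mix", 0.64)]
    dpo_sweep = [Run("dpo_a", "pref_x", 0.66), Run("dpo_b", "pref_y", 0.70)]
    print(plan_pipeline(sft_sweep, dpo_sweep))
```

Keeping the plan as a plain dictionary makes it easy to log alongside the sweep results, so the choice made at each stage stays traceable.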
Example dev scripts covering the whole process are in the `docs/archived_dev_scripts` directory. Note that these scripts are not as cleaned up as [docs/tulu3.md](docs/tulu3.md), but they are useful for reference.
* docs/archived_dev_scripts/olmo2_1124.sh (the commands used to produce [OLMo 2 1124](https://huggingface.co/collections/allenai/olmo-2-674117b93ab84e98afc72edc))
* docs/archived_dev_scripts/olmoe_0125.sh (the commands used to produce [OLMoE 0125](https://huggingface.co/collections/allenai/olmoe-0125-67992134f9ebea0a941706ca))
### Ai2 Internal Evaluation