Description
This PR (#1506) introduces a new executable script, run_grpo.py.
The main purpose is to consolidate and generalize functionality that is currently split across run_grpo_math.py and run_grpo_rm.py.
Once #1506 is merged and confirmed to be complete, the two obsolete scripts will be removed in a separate PR.
Related PRs
#1506