[GRPOTrainer bug fix] a little bug in completions with bootstrap#4442

Closed
SolarWindRider wants to merge 6 commits into huggingface:main from SolarWindRider:feature

@SolarWindRider SolarWindRider commented Nov 3, 2025

Contributor

1. Fix a small bug in GRPOTrainer.

When inputs are conversational, bootstrap is a list rather than a str, so concatenating it with a str raises an error.

I added a type check on bootstrap to fix the bug while preserving the original behavior.

2. Add a new GRPO-family trainer.
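
For illustration, the kind of type check described in item 1 might look like the sketch below. This is not the code from this PR or from TRL: `merge_bootstrap` is a hypothetical helper, and the shape of the list (content parts such as `{"type": "text", "text": ...}`) is an assumption about what a non-str bootstrap contains.

```python
def merge_bootstrap(bootstrap, completion):
    """Hypothetical helper: prepend `bootstrap` to a generated text `completion`.

    `bootstrap` is the content of a trailing assistant message, if any.
    """
    if isinstance(bootstrap, str):
        # Original behavior: plain string concatenation.
        return bootstrap + completion
    if isinstance(bootstrap, list):
        # Assumption: the list holds content parts, so the completion is
        # appended as one more text part instead of being concatenated.
        return list(bootstrap) + [{"type": "text", "text": completion}]
    raise TypeError(f"unsupported bootstrap type: {type(bootstrap).__name__}")


# A str bootstrap keeps the existing code path.
print(merge_bootstrap("Let me think. ", "The answer is 4."))
# A list bootstrap no longer triggers a str concatenation error.
print(merge_bootstrap([{"type": "text", "text": "Let me think. "}], "The answer is 4."))
```

Branching on the type keeps the str path unchanged, so plain-text conversational inputs should behave as before.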

@SolarWindRider SolarWindRider changed the title from "fix a little bug in completions with bootstrap" to "[GRPOTrainer bug fix]fix a little bug in completions with bootstrap" Nov 3, 2025
@SolarWindRider SolarWindRider changed the title from "[GRPOTrainer bug fix]fix a little bug in completions with bootstrap" to "[GRPOTrainer bug fix] a little bug in completions with bootstrap" Nov 3, 2025
@qgallouedec

Member

Thanks, can you split this PR into two PRs instead, please?

@SolarWindRider

Contributor Author

> thanks, can you split this PR into two PRs instead please?

Sure, I split the GRPOTrainer bug fix out into its own PR: #4452

@SolarWindRider SolarWindRider deleted the feature branch November 5, 2025 07:36