fix: entropy in DAPO #1652

Merged
vermouth1992 merged 1 commit into verl-project:main from tongyx361:tyx/fix/entropy-in-dapo
May 23, 2025

Conversation

@tongyx361
Collaborator

Checklist Before Starting

  • Search for similar PR(s).

What does this PR do?

This PR adds entropy computation and logging to the DAPO trainer, aligning it with the other trainers.
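
As a rough illustration of what "entropy computation and logging" involves, the sketch below computes per-token Shannon entropy from logits and averages it over response tokens only. It is a framework-free approximation, not the code from this PR: the function names, the token-mean aggregation, and the `actor/entropy` metric key are all assumptions made for the example.

```python
import math


def token_entropy(logits):
    """Shannon entropy (in nats) of softmax(logits) at one token position."""
    m = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(x - m) for x in logits]
    z = sum(exps)
    probs = [e / z for e in exps]
    return -sum(p * math.log(p) for p in probs if p > 0.0)


def masked_mean_entropy(batch_logits, response_mask):
    """Average token entropy over positions where response_mask is 1."""
    total, count = 0.0, 0
    for seq_logits, seq_mask in zip(batch_logits, response_mask):
        for logits, keep in zip(seq_logits, seq_mask):
            if keep:
                total += token_entropy(logits)
                count += 1
    # Guard against an all-zero mask rather than dividing by zero.
    return total / count if count else 0.0


# Two sequences, two response positions each, vocabulary of size 2.
batch_logits = [
    [[0.0, 0.0], [0.0, 0.0]],
    [[0.0, 0.0], [5.0, -5.0]],
]
# The last position of the second sequence is padding and is masked out.
response_mask = [[1, 1], [1, 0]]

# Hypothetical metric key; every unmasked position is uniform, so the
# logged value is ln(2), roughly 0.693.
metrics = {"actor/entropy": masked_mean_entropy(batch_logits, response_mask)}
```

Masking matters here because padded positions would otherwise pull the logged value toward whatever the model predicts on padding.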

Additional Info.

Checklist Before Submitting

  • Read the Contribute Guide.
  • Apply pre-commit checks.
  • Add [BREAKING] to the PR title if it breaks any API.
  • Update the documentation about your changes in the docs.
  • Add CI test(s) if necessary.

@vermouth1992 merged commit a7b2e29 into verl-project:main on May 23, 2025
12 checks passed
@tongyx361 mentioned this pull request on May 24, 2025
7 tasks
ETOgaosion pushed a commit that referenced this pull request May 24, 2025
### Checklist Before Starting

- [x] Search for similar PR(s).

### What does this PR do?

This PR fixes:

- DAPO CI trigger path patterns that have been outdated since #1392
- The `response_mask` computation missing from #1652, which went uncaught because the DAPO CI test was skipped
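
For context on the second item, a response mask can be derived by slicing the attention mask. The sketch below is a minimal illustration, assuming each row is laid out as prompt tokens followed by response tokens, with the response occupying the last `response_length` columns. The helper name and the layout are assumptions for the example and are not taken from this PR.

```python
def response_mask_from_attention(attention_mask, response_length):
    """Keep the last `response_length` columns of every attention-mask row.

    Assumes each row is [prompt tokens | response tokens], so the trailing
    columns mark which response positions are real tokens (1) or padding (0).
    """
    if response_length <= 0:
        # row[-0:] would return the whole row, so handle this case explicitly.
        return [[] for _ in attention_mask]
    return [row[-response_length:] for row in attention_mask]


attention_mask = [
    [0, 1, 1, 1, 1, 0],  # left-padded prompt, right-padded response
    [1, 1, 1, 1, 1, 1],  # no padding
]
response_mask = response_mask_from_attention(attention_mask, response_length=3)
# response_mask == [[1, 1, 0], [1, 1, 1]]
```

Any metric averaged over response tokens, such as entropy, depends on this mask being present before the metric is computed.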

### Tests

- [x] DAPO CI is correctly triggered and passed, e.g.,
https://github.com/volcengine/verl/actions/runs/15223958183/job/42823610223?pr=1666

### Additional Info.

- **Issue Number**: #1392 , #1652 
- **Training**: none
- **Inference**: none

### Checklist Before Submitting

- [x] Read the [Contribute
Guide](https://github.com/volcengine/verl?tab=readme-ov-file#contribution-guide).
- [x] Apply [pre-commit
checks](https://github.com/volcengine/verl?tab=readme-ov-file#code-linting-and-formatting).
- [x] Add `[BREAKING]` to the PR title if it breaks any API.
- [x] Update the documentation about your changes in the
[docs](https://github.com/volcengine/verl/tree/main/docs).
- [x] Add CI test(s) if necessary.
ETOgaosion pushed a commit to Jianbing-D/verl that referenced this pull request Jun 8, 2025
### Checklist Before Starting

- [x] Search for similar PR(s).

### What does this PR do?

This PR adds entropy computation and logging to the DAPO trainer, aligning
it with the other trainers.

### Additional Info.

- **Issue Number**: verl-project#1455
- **Training**: none
- **Inference**: none

### Checklist Before Submitting

- [x] Read the [Contribute
Guide](https://github.com/volcengine/verl?tab=readme-ov-file#contribution-guide).
- [x] Apply [pre-commit
checks](https://github.com/volcengine/verl?tab=readme-ov-file#code-linting-and-formatting).
- [x] Add `[BREAKING]` to the PR title if it breaks any API.
- [x] Update the documentation about your changes in the
[docs](https://github.com/volcengine/verl/tree/main/docs).
- [x] Add CI test(s) if necessary.
ETOgaosion pushed a commit to Jianbing-D/verl that referenced this pull request Jun 8, 2025
wwwjn pushed a commit to wwwjn/verl that referenced this pull request Jun 10, 2025
wwwjn pushed a commit to wwwjn/verl that referenced this pull request Jun 10, 2025
chenjiaoAngel added a commit to chenjiaoAngel/verl that referenced this pull request Nov 14, 2025
chenjiaoAngel added a commit to chenjiaoAngel/verl that referenced this pull request Nov 14, 2025
TimurTaepov pushed a commit to giorgossideris/verl that referenced this pull request Dec 20, 2025
TimurTaepov pushed a commit to giorgossideris/verl that referenced this pull request Dec 20, 2025
vyomakesh0728 added a commit to vyomakesh0728/verl that referenced this pull request Jan 22, 2026
vyomakesh0728 added a commit to vyomakesh0728/verl that referenced this pull request Jan 22, 2026