fix sequence parallelism conflict in kimiVL #1899
vermouth1992 merged 7 commits into verl-project:main from
Conversation
Signed-off-by: ShareLer <ShareLe@163.com>
eric-haibin-lin
left a comment
thanks! would u mind adding a unit test in tests/models/test_transformers_ulysses.py that reproduces this error?
### Checklist Before Starting

- [x] Search for similar PR(s).

### What does this PR do?

Fix a sequence parallelism conflict in the KimiVL patch.

Background: a recent VLM-related PR (verl-project#1739) changed the sequence parallelism logic for VLMs: `inputs_embeds` is now split after the model's embedding layer, instead of splitting `input_ids` and `position_ids` before the forward pass. However, the SP logic I implemented in the KimiVL PR (verl-project#1639) still followed the old approach, and split image tokens at the boundary between image_token and text_token to avoid the "image features and image tokens do not match" error. Because the two PRs were developed in parallel, merging both introduced a logical conflict.

### High-Level Design

> Demonstrate the high-level design if this PR is complex.

### Specific Changes

- Delete the patch for `_merge_with_image_features`, which assigned image tokens to the corresponding SP rank.
- Adjust the handling of `position_ids` in `_ulysses_flash_attn_forward`.

### API

> Demonstrate how the API changes if any.

### Usage Example

> Provide usage example(s) for easier usage.

```python
# Add code snippet or script demonstrating how to use this
```

### Test



### Additional Info.

- **Issue Number**: Fixes issue # or discussion # if any.
- **Training**: [Note which backend this PR will affect: FSDP, Megatron, both, or none]
- **Inference**: [Note which backend this PR will affect: vLLM, SGLang, both, or none]

### Checklist Before Submitting

- [ ] Read the [Contribute Guide](https://github.com/volcengine/verl?tab=readme-ov-file#contribution-guide).
- [ ] Apply [pre-commit checks](https://github.com/volcengine/verl?tab=readme-ov-file#code-linting-and-formatting).
- [ ] Add `[BREAKING]` to the PR title if it breaks any API.
- [ ] Update the documentation about your changes in the [docs](https://github.com/volcengine/verl/tree/main/docs).
- [ ] New CI unit test(s) are added to cover the code path.
- [ ] Rely on existing unit tests on CI that covers the code path.

--------- Signed-off-by: ShareLer <ShareLe@163.com>
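To make the described change concrete: under the new logic, the sequence is sliced after the embedding layer, once image and text embeddings are already merged, so no per-rank bookkeeping of image-token boundaries is needed. The following is a minimal illustrative sketch of that slicing pattern, not verl's actual implementation; the function names are hypothetical, and a plain Python list stands in for a `[seq_len, hidden]` tensor.

```python
def sp_slice_bounds(seq_len: int, sp_size: int, rank: int) -> tuple[int, int]:
    """Contiguous [start, end) chunk of the sequence owned by one SP rank."""
    assert seq_len % sp_size == 0, "pad the sequence to a multiple of sp_size"
    chunk = seq_len // sp_size
    return rank * chunk, (rank + 1) * chunk


def split_after_embedding(inputs_embeds, sp_size: int, rank: int):
    # inputs_embeds: per-token embedding rows (stand-in for a tensor of shape
    # [seq_len, hidden]). Image features were already merged in by the
    # embedding step, so a plain slice along the sequence dimension is safe --
    # this is why the old _merge_with_image_features patch is no longer needed.
    start, end = sp_slice_bounds(len(inputs_embeds), sp_size, rank)
    return inputs_embeds[start:end]
```

Concatenating the per-rank slices in rank order recovers the full sequence, which is the invariant the all-gather after attention relies on.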
It might not be possible to add a unit test for KimiVL, because the unit tests in tests/models/test_transformers_ulysses.py rely on importing the model's config from transformers (i.e. 'KimiVLConfig') to create a mock model. It is necessary to add ulysses-related unit tests for VLMs; perhaps I can first try to add a ulysses-related unit test for QwenVL.
@ShareLer The unit test for qwen2vl ulysses already exists: https://github.com/volcengine/verl/blob/85fef90d518577b44abca3015249c6cde52b0be9/.github/workflows/e2e_ppo_trainer.yml#L200-L208 I think the current state is sufficient.
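For reference, the usual shape of such an ulysses parity test is: run a forward pass once without SP, once with per-rank slices gathered back along the sequence dimension, and assert the two outputs match. Below is a self-contained sketch of that pattern with no transformers dependency; `toy_model` and `sp_forward` are hypothetical stand-ins, and the toy model is position-wise so chunked execution needs no cross-rank communication.

```python
def toy_model(embeds):
    # Stand-in for a per-token (position-wise) forward pass.
    return [2.0 * x + 1.0 for x in embeds]


def sp_forward(embeds, sp_size):
    # Simulate SP: each "rank" runs the model on its contiguous chunk,
    # then the chunks are gathered back in rank order.
    assert len(embeds) % sp_size == 0, "pad to a multiple of sp_size"
    chunk = len(embeds) // sp_size
    shards = [toy_model(embeds[r * chunk:(r + 1) * chunk]) for r in range(sp_size)]
    return sum(shards, [])  # stand-in for all-gather along the sequence dim


def test_sp_parity():
    embeds = [float(i) for i in range(8)]
    assert sp_forward(embeds, 2) == toy_model(embeds)
    assert sp_forward(embeds, 4) == toy_model(embeds)
```

A real test replaces `toy_model` with the patched HF model and `sp_forward` with a multi-process run, but the assertion is the same: SP output equals single-rank output.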