
[HETERO] Support splitting new graph pattern for pipeline parallel and correct the number of submodels #25224

Conversation

@WeldonWangwang WeldonWangwang commented Jun 26, 2024

Details:

  • Fix qwen1.5-14b-chat with HETERO pipeline parallelism.
    Add support for splitting the pattern (a construction sketch follows below):

    ```
    ReadValue->Gather->Concat
                    |------>ShapeOf (fused with a node of a different affinity) -> ....
    ```

  • Correct the value of HETERO_NUMBER_OF_SUBMODELS by subtracting the number of independent submodels, to reduce confusion (see the query sketch after the Tickets list).
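
For illustration, a minimal sketch of how such a subgraph could be built with the OpenVINO C++ API (not code from this PR; the shapes, the variable id "kv_cache", and the helper name make_pattern_model are placeholders):

```
// Illustrative only: builds the ReadValue -> Gather -> Concat pattern with an
// extra ShapeOf branch on the Gather output, similar to the KV-cache subgraphs
// in qwen1.5-14b-chat. All shapes and names are made up for this sketch.
#include <openvino/openvino.hpp>
#include <openvino/opsets/opset8.hpp>

std::shared_ptr<ov::Model> make_pattern_model() {
    using namespace ov;
    using namespace ov::opset8;

    auto new_tokens = std::make_shared<Parameter>(element::f32, PartialShape{-1, 4, 64});

    // Stateful cache read/write pair.
    auto variable = std::make_shared<op::util::Variable>(
        op::util::VariableInfo{PartialShape{-1, 4, 64}, element::f32, "kv_cache"});
    auto read_value = std::make_shared<ReadValue>(new_tokens, variable);

    // Gather from the cache (indices/axis values are placeholders).
    auto indices = Constant::create(element::i32, Shape{1}, {0});
    auto axis = Constant::create(element::i32, Shape{}, {0});
    auto gather = std::make_shared<Gather>(read_value, indices, axis);

    // Branch 1: concatenate the gathered cache with the new input.
    auto concat = std::make_shared<Concat>(OutputVector{gather, new_tokens}, 0);

    // Branch 2: ShapeOf on the same Gather output; under pipeline parallelism this
    // node can end up fused into a submodel with a different affinity.
    auto shape_of = std::make_shared<ShapeOf>(gather);

    auto assign = std::make_shared<Assign>(concat, variable);
    return std::make_shared<Model>(OutputVector{concat, shape_of},
                                   SinkVector{assign},
                                   ParameterVector{new_tokens});
}
```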

Tickets:

  • ticket-id
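
For illustration, a minimal sketch of how the corrected value could be read back after HETERO compilation (assuming the property is queryable on the compiled model under the name HETERO_NUMBER_OF_SUBMODELS; the model path and device list are placeholders):

```
#include <iostream>
#include <openvino/openvino.hpp>

int main() {
    ov::Core core;
    // Placeholder model path; any model compiled through the HETERO plugin would do.
    auto model = core.read_model("qwen1.5-14b-chat.xml");
    auto compiled = core.compile_model(model, "HETERO:GPU,CPU");

    // With this change, the reported count no longer includes the independent submodels.
    auto n_submodels = compiled.get_property("HETERO_NUMBER_OF_SUBMODELS").as<std::string>();
    std::cout << "HETERO_NUMBER_OF_SUBMODELS: " << n_submodels << std::endl;
    return 0;
}
```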

@github-actions github-actions bot added category: inference OpenVINO Runtime library - Inference category: HETERO OpenVINO HETERO plugin labels Jun 26, 2024
@WeldonWangwang WeldonWangwang force-pushed the wangwang/Fix_qwen1.5-14b-chat branch from aa1e1c0 to 09c2109 on June 28, 2024 02:44
@peterchen-intel peterchen-intel added this to the 2024.3 milestone Jul 1, 2024
@WeldonWangwang WeldonWangwang force-pushed the wangwang/Fix_qwen1.5-14b-chat branch from 67c96a4 to 7d1b5bc on July 1, 2024 03:25
@WeldonWangwang WeldonWangwang marked this pull request as ready for review July 1, 2024 03:26
@WeldonWangwang WeldonWangwang requested review from a team as code owners July 1, 2024 03:26
@WeldonWangwang WeldonWangwang changed the title Fix qwen1.5-14b-chat [HETERO] Fix qwen1.5-14b-chat with pipeline parallel and the number of submodels Jul 1, 2024
@peterchen-intel peterchen-intel changed the title [HETERO] Fix qwen1.5-14b-chat with pipeline parallel and the number of submodels [HETERO] Support splitting new graph pattern for pipeline parallel and correct the number of submodels Jul 3, 2024
```
bool is_shapeof = ov::is_type<op::util::ShapeOfBase>(op);
if (((fused_model_op_map.find(name) != fused_model_op_map.end()) || is_shapeof) &&
    supported.count(name)) {
    if ((!supported.count(fused_model_op_map[name]) || is_shapeof) &&
```
Contributor

It seems this is a special case (ShapeOf) of the code at lines 401 to 408?
Maybe we can consider optimizing it in the future?

Contributor Author

It is a problem that occurs when recursively looping through the entire graph; moving the ShapeOf case to L401-L408 cannot fix this issue. We will try a simpler way to optimize this API.

Contributor

Please create a ticket to follow up on whether there is a way to make this solution more general.

@peterchen-intel peterchen-intel enabled auto-merge July 5, 2024 02:19
@peterchen-intel peterchen-intel added this pull request to the merge queue Jul 5, 2024
Merged via the queue into openvinotoolkit:master with commit a5c0d67 Jul 5, 2024
122 checks passed
ieliz pushed a commit to ieliz/openvino that referenced this pull request Jul 5, 2024
…d correct the number of submodels (openvinotoolkit#25224)
