Plugin EP: Fix bug that incorrectly assigned duplicate MetaDef IDs to fused nodes in different GraphViews #27666

Merged
adrianlizarraga merged 1 commit into main from
adrianl/PluginEp_CompiledSubgraph_UniqueMetaDefName_Fix
Mar 16, 2026

adrianlizarraga commented Mar 15, 2026

Description

Fixes a bug where PluginExecutionProvider::GetCapability() incorrectly assigned duplicate MetaDef IDs to fused nodes that live in different GraphViewer instances (e.g., the then/else branches of an If node).

The root cause was that GetCapability() created a new ModelMetadefIdGenerator on every invocation. Since the graph partitioner calls GetCapability() once per subgraph, the generator's monotonic counter reset each time, producing colliding IDs across subgraphs. This caused session creation to fail with:

Failed to add kernel for example_ep_9433721956998717990_0 example_ep example_ep: Conflicting with a registered kernel with op versions. the since version is: 1
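
The collision can be reproduced in miniature. The sketch below is a simplified model, not the ONNX Runtime code: `ToyMetadefIdGenerator`, `BuggyGetCapability`, and `NamesForSubgraphs` are hypothetical names, and the real `ModelMetadefIdGenerator` interface may differ. It assumes only what the description states, namely that a name combines the EP name, a model hash, and a monotonic counter.

```cpp
#include <cstdint>
#include <string>
#include <vector>

// Hypothetical stand-in for ModelMetadefIdGenerator (assumption: the real
// class has a different interface). It pairs a model hash with a counter.
class ToyMetadefIdGenerator {
 public:
  explicit ToyMetadefIdGenerator(uint64_t model_hash) : model_hash_(model_hash) {}

  // Returns "<ep_name>_<model_hash>_<counter>" and advances the counter.
  std::string NextName(const std::string& ep_name) {
    return ep_name + "_" + std::to_string(model_hash_) + "_" +
           std::to_string(next_id_++);
  }

 private:
  uint64_t model_hash_;
  int next_id_ = 0;
};

// Buggy pattern: a fresh generator is built on every call, so the counter
// always starts again at 0.
std::string BuggyGetCapability(const std::string& ep_name, uint64_t model_hash) {
  ToyMetadefIdGenerator generator(model_hash);
  return generator.NextName(ep_name);
}

// Simulates a partitioner that calls GetCapability() once per subgraph of
// the same model (same hash). The hash value 42 is arbitrary.
std::vector<std::string> NamesForSubgraphs(int num_subgraphs) {
  std::vector<std::string> names;
  for (int i = 0; i < num_subgraphs; ++i) {
    names.push_back(BuggyGetCapability("example_ep", 42));
  }
  return names;
}
```

Calling `NamesForSubgraphs(2)` yields the same name twice, which mirrors the two `If` branches both asking to register a kernel under one name.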

Fix

  • Promoted ModelMetadefIdGenerator to an instance member of PluginExecutionProvider so the same generator is reused across all GetCapability() calls, ensuring unique MetaDef IDs.
    • This is also consistent with how existing provider-bridge EPs create and use a single generator instance.
    • Bonus perf improvement: No longer recomputes the entire model's hash on every call to GetCapability().
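
A minimal sketch of the member-generator pattern follows. `ToyIdGenerator`, `ToyPluginEp`, and `AllUnique` are hypothetical names for illustration; the actual member added by this PR is `metadef_id_generator_` on `PluginExecutionProvider`, and its real type and call sites are not reproduced here.

```cpp
#include <cstdint>
#include <string>
#include <unordered_set>
#include <utility>
#include <vector>

// Hypothetical simplified generator: a model hash plus a monotonic counter.
class ToyIdGenerator {
 public:
  explicit ToyIdGenerator(uint64_t model_hash) : model_hash_(model_hash) {}

  std::string NextName(const std::string& ep_name) {
    return ep_name + "_" + std::to_string(model_hash_) + "_" +
           std::to_string(next_id_++);
  }

 private:
  uint64_t model_hash_;
  int next_id_ = 0;
};

// Hypothetical EP that owns one generator for its whole lifetime, so the
// counter keeps advancing across GetCapability() calls.
class ToyPluginEp {
 public:
  ToyPluginEp(std::string name, uint64_t model_hash)
      : name_(std::move(name)), generator_(model_hash) {}

  // Stands in for one GetCapability() call that fuses a single node.
  std::string GetCapability() { return generator_.NextName(name_); }

 private:
  std::string name_;
  ToyIdGenerator generator_;  // instance member, constructed once
};

// Returns true when no name appears more than once.
bool AllUnique(const std::vector<std::string>& names) {
  std::unordered_set<std::string> seen(names.begin(), names.end());
  return seen.size() == names.size();
}
```

Because the generator is constructed once with the EP, any per-model work done in its constructor (in this toy, storing the hash) also happens once rather than on every call.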

Testing

Example EP changes:

  • Refactored SaveConstantInitializers() → TrySaveConstantInitializer() to save initializers per-node-input instead of via graph.GetInitializers(), which doesn't return initializers defined in parent or sibling subgraphs.
  • Extracted CopiesConstantInitializers() helper to deduplicate the condition for drop_constant_initializers.
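
The difference between a graph-level initializer list and per-input lookup can be illustrated with a toy scope model. Everything below is an assumption made for illustration: `ToyGraph`, `FindInitializer`, and `SaveInitializersForInputs` are hypothetical, the sketch covers only the parent (outer-scope) case, and the example EP's real implementation queries each node input through the ONNX Runtime API rather than walking scopes like this. It requires C++17.

```cpp
#include <map>
#include <optional>
#include <string>
#include <utility>
#include <vector>

// Hypothetical graph scope: it lists only the initializers defined locally
// and points at its enclosing graph, if any.
struct ToyGraph {
  std::map<std::string, std::vector<float>> initializers;
  const ToyGraph* parent = nullptr;
};

// Resolves one input name by searching the local scope first and then each
// enclosing scope. Returns std::nullopt when the input is not an initializer.
std::optional<std::vector<float>> FindInitializer(const ToyGraph& graph,
                                                  const std::string& input_name) {
  for (const ToyGraph* g = &graph; g != nullptr; g = g->parent) {
    auto it = g->initializers.find(input_name);
    if (it != g->initializers.end()) {
      return it->second;
    }
  }
  return std::nullopt;
}

// Saves a copy of every node input that resolves to an initializer,
// regardless of which scope defines it.
std::map<std::string, std::vector<float>> SaveInitializersForInputs(
    const ToyGraph& graph, const std::vector<std::string>& node_inputs) {
  std::map<std::string, std::vector<float>> saved;
  for (const std::string& name : node_inputs) {
    if (auto value = FindInitializer(graph, name)) {
      saved.emplace(name, std::move(*value));
    }
  }
  return saved;
}
```

In this model, a branch subgraph whose node consumes a weight defined in the outer graph would miss that weight if it only read its own `initializers` map, while the per-input lookup finds it.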

Unit testing:

  • Added unit test called CompilingPluginEp_MultiSubgraphs_DuplicateMetaDefIdBug — runs an If model with Mul nodes in both branches, verifying that both fused nodes receive unique MetaDef IDs and the session creates/runs successfully.

Credit to @apwojcik for finding the bug.

…fused nodes that live in different GraphViews (e.g., different branch of an If node)

Copilot AI left a comment

Pull request overview

Fixes a bug where PluginExecutionProvider::GetCapability() created a new ModelMetadefIdGenerator per invocation, causing duplicate MetaDef IDs for fused nodes across different subgraphs (e.g., If branches). Also refactors the example plugin EP's initializer saving logic.

Changes:

  • Promoted ModelMetadefIdGenerator from a local variable in GetCapability() to an instance member of PluginExecutionProvider to ensure unique IDs across calls.
  • Refactored example EP's SaveConstantInitializers() into CopiesConstantInitializers() and TrySaveConstantInitializer() to handle initializers from parent/sibling subgraphs.
  • Added unit test CompilingPluginEp_MultiSubgraphs_DuplicateMetaDefIdBug verifying the fix with an If model.

Reviewed changes

Copilot reviewed 5 out of 5 changed files in this pull request and generated no comments.

| File | Description |
|------|-------------|
| onnxruntime/core/session/plugin_ep/ep_plugin_provider_interfaces.h | Added metadef_id_generator_ as instance member |
| onnxruntime/core/session/plugin_ep/ep_plugin_provider_interfaces.cc | Use instance member instead of local generator; moved include to header |
| onnxruntime/test/autoep/library/example_plugin_ep/ep.h | Replaced SaveConstantInitializers with CopiesConstantInitializers and TrySaveConstantInitializer |
| onnxruntime/test/autoep/library/example_plugin_ep/ep.cc | Implemented refactored methods; moved initializer saving to CompileImpl per-input |
| onnxruntime/test/autoep/test_execution.cc | Added regression test for duplicate MetaDef ID bug |

adrianlizarraga merged commit 929f73e into main Mar 16, 2026
95 checks passed
adrianlizarraga deleted the adrianl/PluginEp_CompiledSubgraph_UniqueMetaDefName_Fix branch March 16, 2026 16:05
tianleiwu pushed a commit that referenced this pull request Mar 16, 2026
…fused nodes in different GraphViews (#27666)

Fixes a bug where `PluginExecutionProvider::GetCapability()` incorrectly
assigned duplicate MetaDef IDs to fused nodes that live in different
GraphViewer instances (e.g., the then/else branches of an If node).

The root cause was that `GetCapability()` created a new
`ModelMetadefIdGenerator` on every invocation. Since the graph
partitioner calls `GetCapability()` once per subgraph, the generator's
monotonic counter reset each time, producing colliding IDs across
subgraphs. This caused session creation to fail with:

> Failed to add kernel for example_ep_9433721956998717990_0 example_ep
example_ep: Conflicting with a registered kernel with op versions. the
since version is: 1

- Promoted `ModelMetadefIdGenerator` to an instance member of
`PluginExecutionProvider` so the same generator is reused across all
`GetCapability()` calls, ensuring unique MetaDef IDs.
- This is also consistent with how existing provider-bridge EPs create
and use a single generator instance.
- **Bonus perf improvement**: No longer recomputes the entire model's
hash on every call to `GetCapability()`.

Example EP changes:
- Refactored `SaveConstantInitializers()` →
`TrySaveConstantInitializer()` to save initializers per-node-input
instead of via `graph.GetInitializers()`, which doesn't return
initializers defined in parent or sibling subgraphs.
- Extracted `CopiesConstantInitializers()` helper to deduplicate the
condition for drop_constant_initializers.

Unit testing:
- Added unit test called
`CompilingPluginEp_MultiSubgraphs_DuplicateMetaDefIdBug` — runs an If
model with Mul nodes in both branches, verifying that both fused nodes
receive unique MetaDef IDs and the session creates/runs successfully.

Credit to @apwojcik for [finding the
bug.](#27608)
tianleiwu added a commit that referenced this pull request Mar 16, 2026
This cherry-picks the following commits for the release:

| Commit ID | PR Number | Commit Title |
|-----------|-----------|-------------|
| eb23be8 | #27354 | Update python_requires |
| d626b56 | #27479 | [QNN EP] Enable offline x64 compilation with memhandle IO type |
| 60ce0e6 | #27607 | Use `_tpause` instead of `__builtin_ia32_tpause` |
| 69feb84 | #27591 | Add PCI bus fallback for Linux GPU device discovery in containerized environments |
| de92668 | #27650 | Revert "[QNN EP] Fix error messages being logged as VERBOSE instead o… |
| 0f66526 | #27644 | [Plugin EP] Check for nullptr before dereferencing |
| 929f73e | #27666 | Plugin EP: Fix bug that incorrectly assigned duplicate MetaDef IDs to fused nodes in different GraphViews |

---------

Co-authored-by: XXXXRT666 <157766680+XXXXRT666@users.noreply.github.com>
Co-authored-by: derdeljan-msft <derdeljan@microsoft.com>
Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
Co-authored-by: Shogo Yamazaki <f9ifphmiz7i8akhowc8l5t1x9qp0lfu4@mocknen.net>
Co-authored-by: Copilot <198982749+Copilot@users.noreply.github.com>
Co-authored-by: baijumeswani <12852605+baijumeswani@users.noreply.github.com>
Co-authored-by: edgchen1 <18449977+edgchen1@users.noreply.github.com>
Co-authored-by: Baiju Meswani <bmeswani@microsoft.com>
Co-authored-by: Artur Wojcik <artur.wojcik@amd.com>
Co-authored-by: Adrian Lizarraga <adlizarraga@microsoft.com>