whisper: Support a variant of the whisper pipeline where encoder / decoder are stateful. #1857
Merged
kunal-vaishnavi merged 5 commits on Dec 4, 2025
9c5203a to fdb7fae
kunal-vaishnavi approved these changes on Nov 30, 2025
Contributor
Author
Thanks @kunal-vaishnavi. Can you trigger a re-run of the checks? I don't think this macOS failure is related to this changeset.
a55d04b to acbbd95
Contributor
Author
@kunal-vaishnavi, any insight into why this macOS build is failing? I also see similar failures in another PR, #1900. I don't think they are related to these changesets, so perhaps the macOS builds are currently unstable?
Contributor
I am still investigating. The CI failure also occurs on the latest commit on the main branch and on an older commit.
acbbd95 to 3496b55
…derState::HasCrossKVCacheOutputs()
3496b55 to 7633be2
Contributor
Author
@kunal-vaishnavi, I rebased onto the latest main, which includes the macOS fixes, and the pipeline is passing now.
kunal-vaishnavi pushed a commit that referenced this pull request on Dec 5, 2025
…coder are stateful. (#1857)
This will mainly be used to support EPCtx-wrapped OpenVINO IRs, the same set of models that are compatible with OpenVINO GenAI. In this case, unlike in the default pipeline, the encoder doesn't include the cross-attention KV projection; it only outputs encoder_hidden_states. For the decoder portion of the pipeline, the cross and self KV cache tensors are managed internally by the model (attached to the session).
This will mainly be used to support EPCtx-wrapped OpenVINO IRs, the same set of models that are compatible with OpenVINO GenAI.
In this case, unlike in the default pipeline, the encoder doesn't include the cross-attention KV projection; it only outputs encoder_hidden_states.
For the decoder portion of the pipeline, the cross and self KV cache tensors are managed internally by the model (attached to the session).
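
For readers skimming the thread, here is a minimal sketch of how the two variants could be told apart from the encoder's output names alone. It is not the code from this PR: the helper names, the `"default"` / `"stateful"` labels, and the `present_key_cross_N` / `present_value_cross_N` naming pattern are assumptions made for illustration, and only `encoder_hidden_states` comes from the description above.

```python
import re

# Assumed naming pattern for cross-attention KV cache outputs. Real models
# may name these tensors differently, so treat this as a placeholder.
CROSS_KV_PATTERN = re.compile(r"^present_(key|value)_cross_\d+$")


def has_cross_kv_cache_outputs(encoder_output_names):
    """Return True if any encoder output looks like a cross KV cache tensor."""
    return any(CROSS_KV_PATTERN.match(name) for name in encoder_output_names)


def select_pipeline_variant(encoder_output_names):
    """Pick a pipeline variant from the encoder's output names.

    Hypothetical helper: "default" means the encoder emits the cross KV
    projections itself; "stateful" means it emits no cross KV outputs and
    the decoder is expected to keep its KV cache internally.
    """
    if has_cross_kv_cache_outputs(encoder_output_names):
        return "default"
    return "stateful"


# Example output lists (illustrative, not taken from a real model).
default_outputs = [
    "encoder_hidden_states",
    "present_key_cross_0",
    "present_value_cross_0",
]
stateful_outputs = ["encoder_hidden_states"]

print(select_pipeline_variant(default_outputs))
print(select_pipeline_variant(stateful_outputs))
```

Checking output names like this would keep the choice data-driven rather than requiring a separate configuration flag, though whether the PR takes that approach is not stated in this thread.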