refactor: stop sending request bodies twice to extproc#636
Merged
wengyao04
approved these changes
May 23, 2025
mathetake
added a commit
that referenced
this pull request
Jun 10, 2025
**Commit Message**
Previously, AWS signing included the "content-length" header. However, Envoy's extproc filter strips that header from the request because we use the CONTINUE_AND_REPLACE option to reduce memory overhead. We still do not know why AWS doesn't complain when the request body is small, but excluding content-length from the signed headers makes the tests pass with both small and large bodies.
**Related Issues/PRs (if applicable)**
CONTINUE_AND_REPLACE was introduced in #636 to avoid sending a request body twice between Envoy and the ExtProc.
---------
Signed-off-by: Takeshi Yoneda <t.y.mathetake@gmail.com>
yuzisun
pushed a commit
to yuzisun/ai-gateway
that referenced
this pull request
Jun 10, 2025
**Commit Message**
Previously, AWS signing included the "content-length" header. However, Envoy's extproc filter strips that header from the request because we use the CONTINUE_AND_REPLACE option to reduce memory overhead. We still do not know why AWS doesn't complain when the request body is small, but excluding content-length from the signed headers makes the tests pass with both small and large bodies.
**Related Issues/PRs (if applicable)**
CONTINUE_AND_REPLACE was introduced in envoyproxy#636 to avoid sending a request body twice between Envoy and the ExtProc.
---------
Signed-off-by: Takeshi Yoneda <t.y.mathetake@gmail.com>
yuzisun
added a commit
that referenced
this pull request
Jun 10, 2025
**Commit Message**
Previously, AWS signing included the "content-length" header. However, Envoy's extproc filter strips that header from the request because we use the CONTINUE_AND_REPLACE option to reduce memory overhead. We still do not know why AWS doesn't complain when the request body is small, but excluding content-length from the signed headers makes the tests pass with both small and large bodies.
**Related Issues/PRs (if applicable)**
CONTINUE_AND_REPLACE was introduced in #636 to avoid sending a request body twice between Envoy and the ExtProc.
Signed-off-by: Takeshi Yoneda <t.y.mathetake@gmail.com>
Co-authored-by: Takeshi Yoneda <t.y.mathetake@gmail.com>
mathetake
added a commit
that referenced
this pull request
Jun 27, 2025
**Description**
This reverts 65ca02a, which was introduced to avoid sending request bodies twice to the extproc. However, the current extproc implementation on the Envoy side ALWAYS removes the content-length header, which forces chunked transfer-encoding. That causes issues with some AI providers, as described in #721 for example.
**Related Issues/PRs (if applicable)**
Reverts #636
Fixes #721
---------
Signed-off-by: Takeshi Yoneda <t.y.mathetake@gmail.com>
Commit Message
Previously, the request body phase was unnecessarily invoked for the upstream filter extproc, so the entire request body traveled from Envoy to the extproc at least twice (plus retries). This is a small refactoring of the extproc code so that it uses the CONTINUE_AND_REPLACE status in the request headers phase, since it can access the original body saved in memory at the routing phase. This reduces memory pressure.
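The flow described above can be sketched roughly as follows. This is a simplified model, not the code in this PR: `responseStatus`, `headersResponse`, and `upstreamProcessor` are hypothetical stand-ins for Envoy's ext_proc response types, and the sketch assumes the router-level filter has already stored the original body on the processor. In ext_proc, a CONTINUE_AND_REPLACE reply to the headers message replaces the body with the one supplied and tells Envoy not to send further messages for that request, which is what avoids the second body transfer.

```go
package main

// responseStatus mirrors the two status values of an ext_proc common
// response. Hypothetical stand-in type, not the generated protobuf enum.
type responseStatus int

const (
	statusContinue responseStatus = iota
	statusContinueAndReplace
)

// headersResponse is a simplified stand-in for the reply to the
// request headers message.
type headersResponse struct {
	Status responseStatus
	Body   []byte // replacement body; only used with statusContinueAndReplace
}

// upstreamProcessor is a hypothetical per-request processor that holds
// the body saved in memory at the routing phase.
type upstreamProcessor struct {
	savedBody []byte
}

// onRequestHeaders answers the headers phase. When a saved body is
// available it replies with CONTINUE_AND_REPLACE and supplies that body,
// so the body phase is never invoked for this request.
func (p *upstreamProcessor) onRequestHeaders() headersResponse {
	if p.savedBody == nil {
		return headersResponse{Status: statusContinue}
	}
	return headersResponse{Status: statusContinueAndReplace, Body: p.savedBody}
}
```

A real implementation would build the corresponding ext_proc response messages rather than these local structs, and would also carry any header mutations in the same reply.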
Related Issues/PRs (if applicable)