http_client: Add Ability to Process Http Chunked Stream#8316
Merged
edsiper merged 2 commits intofluent:masterfrom Mar 15, 2024
Merged
http_client: Add Ability to Process Http Chunked Stream#8316edsiper merged 2 commits intofluent:masterfrom
edsiper merged 2 commits intofluent:masterfrom
Conversation
Contributor
Author
ryanohnemus
commented
Dec 21, 2023
| int chunked_encoding; /* Chunked transfer encoding ? */ | ||
| int connection_close; /* connection: close ? */ | ||
| long chunked_cur_size; | ||
| long chunked_exp_size; /* expected chunked size */ |
Contributor
Author
There was a problem hiding this comment.
these were completely unused excluding in a mocked aws test, so I cleaned this up.
d1fb005 to
b5f2267
Compare
Current flb_http_do processes chunked streams, but requires that all chunks are received before allowing interaction with the response payload. This change allows a user to only initiate the http request with flb_http_do_request and then process the live stream of data by fetching available chunks with flb_http_get_available_chunks. Signed-off-by: ryanohnemus <ryanohnemus@gmail.com>
b5f2267 to
fc0ebc8
Compare
6 tasks
Member
|
thanks for this. I suggest merging this for 3.0 release (March) due to API changes. |
Signed-off-by: Eduardo Silva <eduardo@calyptia.com>
Member
|
fixed minor conflict, waiting for CI |
Member
|
note CI issue is a timing thing in macos, not real issue |
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Currently
flb_http_doprocesses chunked encoded responses, but requires that all chunks are received before allowing interaction with the response payload.This change allows a user to only initiate the http request with
flb_http_do_requestand then process the stream of data with calls toflb_http_get_response_data.This is needed to be able to process HTTP/1.1 chunk-encoded streams that do not normally end (or are very long lived). Ex: Kubernetes watches use http chunk-encoded streams that continuously send updates as chunks until the session expires.
A new flb_http internal code
FLB_HTTP_CHUNK_AVAILABLEwas added to signify that there are received chunks but there is more data to be received from upstream. The current data received is in the client.resp.payload but it is not guaranteed to be a full application message, so users implementing streamed chunk processing will need to account for it by checking their own end of message semantics and accounting for it with the bytes_consumed logic highlighted in the usage section below.The end of stream's "end chunk" will continue to return
FLB_HTTP_OKwhich signifies the end of the chunked transfer.Details on usage:
flb_http_do_request- new method added with it's only purpose as sending the http request, does not process any response dataflb_http_get_response_data- this will handle the data recv & chunk processing that was originally a part offlb_http_do, the difference is it allows you to pass in a bytes_consumed param. The idea behind this is that any bytes_consumed will be removed from the current http payload portion. This is extremely useful as it allows you to process chunked streams using the internal http buffer without having to copy chunks outside of the http connection. It also allows you to leave max buffer size logic within the http client code, and have your connection error out if you've let the buffer grow too far. Standardflb_http_dowill loop through flb_http_get_response_data always passing bytes_consumed=0 as it does not consume any of the chunked data until the end chunk(end of stream) when the entire response is available within theclient.resp.payload. Future users (see mocked usage example below), will be able to process parts of the payload as chunks become available and inform the http_client to clean up those bytes withbytes_consumedbefore it attempts to read more chunks from upstream.I included documentation in the code and found this to be a fairly clear way to understand how to use this method:
flb_http_do- will work the same as today, the internals of the call were updated to useflb_http_do_requestwithflb_http_get_response_datauntil the end chunk is received. I did however change when an http_client can not increase buffer because it would go past max size, now returns aFLB_HTTP_ERRORinstead of 0.Code Example:
This is pseudocode-ish but working with my minikube k8s config included later.
Testing
Before we can approve your change; please submit the following in a comment:
Debug log output from testing the change
Attached Valgrind output that shows no leaks or memory corruption was found
I ran
run_code_analysis.shwith:If this is a change to packaging of containers or native binaries then please confirm it works for all targets.
ok-package-testlabel to test for all targets (requires maintainer to do).Documentation
Backporting
Fluent Bit is licensed under Apache 2.0, by submitting this pull request I understand that this code will be released under the terms of that license.