-
Notifications
You must be signed in to change notification settings - Fork 71
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
core: Add support for user-defined HTTP headers in NetworkReader
.
#568
core: Add support for user-defined HTTP headers in NetworkReader
.
#568
Conversation
WalkthroughThe changes involve modifications to the Changes
Thank you for using CodeRabbit. We offer it for free to the OSS community and would appreciate your support in helping us grow. If you find it useful, would you consider giving us a shout-out on your favorite social media? 🪧 TipsChatThere are 3 ways to chat with CodeRabbit:
Note: Be mindful of the bot's finite context window. It's strongly recommended to break down tasks such as reading entire modules into smaller chunks. For a focused discussion, use review comments to chat about specific files and their changes, instead of using the PR comments. CodeRabbit Commands (Invoked using PR comments)
Other keywords and placeholders
CodeRabbit Configuration File (
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
A few more comments to apply before we merge. Most of the comments in the round can be directly committed through GitHub if you want.
Also, can you mention in the PR description that this PR is the same as #564 and we had discussion in the previous PR?
Co-authored-by: Lin Zhihao <[email protected]>
Co-authored-by: Lin Zhihao <[email protected]>
Co-authored-by: Lin Zhihao <[email protected]>
Co-authored-by: Lin Zhihao <[email protected]>
Co-authored-by: Lin Zhihao <[email protected]>
Co-authored-by: Lin Zhihao <[email protected]>
…clp into xiaochong-add-custom-headers-v2
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 5
🧹 Outside diff range and nitpick comments (7)
components/core/tests/test-NetworkReader.cpp (1)
195-222
: Consider addressing external test dependency and magic number.While the test is well-structured, there are two areas for improvement:
The test depends on an external service (httpbin.org). Consider:
- Adding retry logic for network failures
- Documenting the external dependency
- Adding a mock server for offline testing
The magic number
10
should be defined as a named constant with clear intent.Here's a suggested improvement:
+ // Number of test headers to generate for comprehensive coverage + static constexpr int cNumTestHeaders{10}; - constexpr int cNumHttpHeaderKeyValuePairs{10}; - for (size_t i{0}; i < cNumHttpHeaderKeyValuePairs; ++i) { + for (size_t i{0}; i < cNumTestHeaders; ++i) {components/core/src/clp/NetworkReader.hpp (1)
Line range hint
99-112
: Consider enhancing header parameter documentationWhile the documentation is accurate and references the CURL documentation appropriately, consider adding:
- Example usage with common headers
- Any header name restrictions or formatting requirements
- Impact of invalid header formats
Example addition:
* @param http_header_kv_pairs Key-value pairs representing HTTP headers to pass to the server - * in the download request. Doc: https://curl.se/libcurl/c/CURLOPT_HTTPHEADER.html + * in the download request. Headers should follow HTTP/1.1 format (e.g., {"Authorization": + * "Bearer token"}). Invalid header formats may cause the request to fail. + * Doc: https://curl.se/libcurl/c/CURLOPT_HTTPHEADER.htmlcomponents/core/src/clp/CurlDownloadHandler.cpp (5)
83-93
: Simplify the lambda function instd::transform
for clarityThe lambda function used in
std::transform
can be simplified by removing the explicit return type, as it can be inferred by the compiler. This enhances readability without affecting functionality.Apply this diff to simplify the lambda:
std::transform( lower_key.begin(), lower_key.end(), lower_key.begin(), - [](unsigned char c) -> char { + [](unsigned char c) { // Implicitly cast the input character into `unsigned char` to avoid UB: // https://en.cppreference.com/w/cpp/string/byte/tolower return static_cast<char>(std::tolower(c)); } );
76-110
: Handle potential exceptions from header processingWhen processing user-provided headers, consider wrapping the loop with a try-catch block to handle any unexpected exceptions that may arise, ensuring robustness of the constructor.
Example:
try { for (auto const& [key, value] : http_header_kv_pairs.value()) { // Existing header processing code } } catch (const std::exception& e) { // Handle exception, possibly by rethrowing or logging }
78-82
: Clarify the comment on duplicate header keysThe comment mentions that duplicate handling is left to the server. However, since
std::unordered_map
does not allow duplicate keys, consider revising the comment for accuracy.Apply this diff to update the comment:
// NOTE: We rely on the unordered_map to prevent duplicate keys. - // handling to the server.
83-83
: Prefer emplace over insert for efficiencyWhen initializing
lower_key
, consider usingstd::string lower_key{key};
directly without the braces for clarity.Apply this diff:
- auto lower_key{key}; + std::string lower_key = key;
61-68
: Use consistent naming for constantsConsider renaming the constants
cRangeHeaderName
,cCacheControlHeaderName
, andcPragmaHeaderName
to use uppercase letters and underscores, following the convention for constant expressions.Apply this diff to rename the constants:
- constexpr std::string_view cRangeHeaderName{"range"}; + constexpr std::string_view RANGE_HEADER_NAME{"range"};Repeat for the other constants.
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
📒 Files selected for processing (5)
components/core/src/clp/CurlDownloadHandler.cpp
(3 hunks)components/core/src/clp/CurlDownloadHandler.hpp
(3 hunks)components/core/src/clp/NetworkReader.cpp
(4 hunks)components/core/src/clp/NetworkReader.hpp
(5 hunks)components/core/tests/test-NetworkReader.cpp
(2 hunks)
🔇 Additional comments (10)
components/core/src/clp/CurlDownloadHandler.hpp (3)
8-8
: LGTM! Required includes for custom headers support
The addition of <optional>
and <unordered_map>
headers is necessary and properly placed in the include section.
Also applies to: 11-11
59-61
: LGTM! Well-documented parameter
The documentation for http_header_kv_pairs
is clear and includes a helpful reference to the libcurl documentation.
72-74
: Consider validating header names and values
While the implementation looks good, consider adding validation for header names and values to prevent injection of malformed headers.
Let's check if there's any validation in the implementation:
Consider implementing these validations:
- Header names should be valid HTTP header names
- Header values should not contain control characters
- Certain headers (like Host, Content-Length) should be restricted
components/core/tests/test-NetworkReader.cpp (1)
9-9
: LGTM! The new includes are properly organized and necessary.
The additions of unordered_map
, fmt
, and json
headers are well-placed and essential for the new test cases.
Also applies to: 14-15
components/core/src/clp/NetworkReader.cpp (4)
124-125
: LGTM: Constructor parameter addition is well-designed
The new optional parameter for HTTP headers is appropriately typed and follows good practices:
- Uses std::optional for optional headers
- Uses std::unordered_map for efficient key-value lookup
- Maintains backward compatibility by adding at the end
137-142
: LGTM: Proper parameter forwarding to DownloaderThread
The initialization correctly moves the headers map to avoid unnecessary copying while maintaining the existing initialization pattern.
227-228
: LGTM: Clean parameter propagation to CurlDownloadHandler
The headers are properly forwarded to the CurlDownloadHandler constructor.
Line range hint 226-237
: Verify thread-safety in error handling
The error handling block involves shared state modifications. Let's verify there are no potential race conditions in error reporting between threads.
✅ Verification successful
Let me gather more context about the error handling implementation to verify thread safety.
Thread safety is properly implemented in error handling
The error handling implementation is thread-safe due to:
- Error state variables (
m_state
andm_curl_ret_code
) are atomic - The error message buffer is protected by shared pointer semantics
- The ordering of state updates in
set_download_completion_status
is documented and properly synchronized - Buffer resource access is protected by mutex locks
- Condition variable notifications occur under mutex protection
🏁 Scripts executed
The following scripts were executed for the analysis:
Script:
#!/bin/bash
# Search for all error handling patterns in the codebase
# to verify thread-safety in error reporting
# Look for error state modifications
rg -p "set_download_completion_status|m_curl_error_msg_buf" --type cpp
# Look for mutex usage around error handling
rg -p "std::mutex|std::lock_guard|std::unique_lock" --type cpp
Length of output: 1480
Script:
#!/bin/bash
# Look for the implementation of set_download_completion_status
ast-grep --pattern 'auto set_download_completion_status(CURLcode $_) -> void { $$$ }'
# Look for any other methods that might access the error state
rg -p "m_download_completed|m_curl_error_code" --type cpp -A 2
Length of output: 152
Script:
#!/bin/bash
# Look for the implementation details in NetworkReader.hpp and NetworkReader.cpp
cat components/core/src/clp/NetworkReader.hpp | rg -A 10 "set_download_completion_status"
# Check for any concurrent access patterns to the error state
cat components/core/src/clp/NetworkReader.cpp | rg -A 5 "set_download_completion_status"
Length of output: 1442
components/core/src/clp/NetworkReader.hpp (2)
16-16
: LGTM: Include statement properly placed
The addition of <unordered_map> is necessary for the HTTP headers map type and follows the project's include ordering convention.
Line range hint 251-272
: Verify consistent header handling in implementation
The header parameter addition looks good. However, we should verify:
- Header validation in the implementation
- Proper cleanup of CURL header structures
- Thread-safety of header handling
✅ Verification successful
Let me gather more information about the CURL header handling implementation.
Let me check one more aspect about the header validation and setup in the CURL handler.
HTTP header handling implementation is properly managed and thread-safe
The implementation shows robust header handling with:
- Proper RAII cleanup through
CurlStringList
wrapper that handlescurl_slist
memory management - Case-insensitive header key validation
- Thread-safe header handling as headers are stored per-instance
- Correct CURL option setup using
CURLOPT_HTTPHEADER
🏁 Scripts executed
The following scripts were executed for the analysis:
Script:
#!/bin/bash
# Search for header-related validation and cleanup in the implementation
rg -A 5 "http_header_kv_pairs" components/core/src/clp/
Length of output: 6901
Script:
#!/bin/bash
# Search for CURL header setup and cleanup
ast-grep --pattern 'struct CurlDownloadHandler {
$$$
~CurlDownloadHandler() {
$$$
}
$$$
}'
# Look for header list creation and cleanup
rg -A 10 "curl_slist" components/core/src/clp/
Length of output: 3596
Script:
#!/bin/bash
# Search for header setup in CurlDownloadHandler
rg -A 15 "if \(http_header_kv_pairs\.has_value\(\)\)" components/core/src/clp/CurlDownloadHandler.cpp
# Look for CURL header option setup
rg "CURLOPT_HTTPHEADER" components/core/src/clp/
Length of output: 1652
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 1
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
📒 Files selected for processing (1)
components/core/src/clp/CurlDownloadHandler.cpp
(3 hunks)
🔇 Additional comments (3)
components/core/src/clp/CurlDownloadHandler.cpp (3)
31-32
: Consider reordering constructor parameters for better logical grouping
The http_header_kv_pairs
parameter would be more logically placed before the timeout parameters since it's related to the request configuration rather than timing constraints.
61-68
: 🛠️ Refactor suggestion
Optimize reserved headers set initialization
Consider making the reserved_headers
set static to prevent reconstruction on each constructor call.
94-107
:
Enhance header injection prevention
The current implementation only checks for \r\n
at the end of header values. To prevent HTTP header injection attacks, consider checking for \r
or \n
anywhere in the value.
auto lower_key{key}; | ||
std::transform( | ||
lower_key.begin(), | ||
lower_key.end(), | ||
lower_key.begin(), | ||
[](unsigned char c) -> char { | ||
// Implicitly cast the input character into `unsigned char` to avoid UB: | ||
// https://en.cppreference.com/w/cpp/string/byte/tolower | ||
return static_cast<char>(std::tolower(c)); | ||
} | ||
); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
🛠️ Refactor suggestion
Consider caching the lowercase key transformation
The lowercase transformation of the key is used only for comparison but discarded afterwards. If the same key needs to be processed multiple times (e.g., in error messages), consider caching the transformed value.
Apply this diff to optimize the key transformation:
- auto lower_key{key};
+ std::string lower_key;
+ lower_key.reserve(key.size());
+ std::transform(
key.begin(),
key.end(),
std::back_inserter(lower_key),
[](unsigned char c) -> char {
return static_cast<char>(std::tolower(c));
}
);
📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
auto lower_key{key}; | |
std::transform( | |
lower_key.begin(), | |
lower_key.end(), | |
lower_key.begin(), | |
[](unsigned char c) -> char { | |
// Implicitly cast the input character into `unsigned char` to avoid UB: | |
// https://en.cppreference.com/w/cpp/string/byte/tolower | |
return static_cast<char>(std::tolower(c)); | |
} | |
); | |
std::string lower_key; | |
lower_key.reserve(key.size()); | |
std::transform( | |
key.begin(), | |
key.end(), | |
std::back_inserter(lower_key), | |
[](unsigned char c) -> char { | |
// Implicitly cast the input character into `unsigned char` to avoid UB: | |
// https://en.cppreference.com/w/cpp/string/byte/tolower | |
return static_cast<char>(std::tolower(c)); | |
} | |
); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
For PR title, how about:
core: Add support for user-defined HTTP headers in `NetworkReader`.
NetworkReader
.
NetworkReader
.NetworkReader
.
* ffi: Add support for serializing/deserializing auto-generated and user-generated schema tree node IDs. (y-scope#557) Co-authored-by: kirkrodrigues <[email protected]> * clp: Add missing C++ standard library includes in IR parsing files. (y-scope#561) Co-authored-by: kirkrodrigues <[email protected]> * log-viewer-webui: Update `yscope-log-viewer` to the latest version (which uses `clp-ffi-js`). (y-scope#562) * package: Upgrade dependencies to resolve security issues. (y-scope#536) * clp-s: Implement table packing (y-scope#466) Co-authored-by: wraymo <[email protected]> Co-authored-by: Kirk Rodrigues <[email protected]> Co-authored-by: wraymo <[email protected]> * log-viewer-webui: Update `yscope-log-viewer` to the latest version. (y-scope#565) * ci: Switch GitHub macOS build workflow to use macos-13 (x86) and macos-14 (ARM) runners. (y-scope#566) * core: Add support for user-defined HTTP headers in `NetworkReader`. (y-scope#568) Co-authored-by: Lin Zhihao <[email protected]> Co-authored-by: Xiaochong Wei <[email protected]> * chore: Update to the latest version of yscope-dev-utils. (y-scope#574) * build(core): Upgrade msgpack to v7.0.0. (y-scope#575) * feat(ffi): Update IR stream protocol version handling in preparation for releasing the kv-pair IR stream format: (y-scope#573) - Bump the IR stream protocol version to 0.1.0 for the kv-pair IR stream format. - Treat the previous IR stream format's versions as backwards compatible. - Differentiate between backwards-compatible and supported versions during validation. Co-authored-by: kirkrodrigues <[email protected]> * fix(taskfiles): Trim trailing slash from URL prefix in `download-and-extract-tar` (fixes y-scope#577). (y-scope#578) * fix(ffi): Correct `clp::ffi::ir_stream::Deserializer::deserialize_next_ir_unit`'s return value when failing to read the next IR unit's type tag. (y-scope#579) * fix(taskfiles): Update `yscope-log-viewer` sources in `log-viewer-webui-clients` sources list (fixes y-scope#576). (y-scope#580) * fix(cmake): Add Homebrew path detection for `mariadb-connector-c` to fix macOS build failure. (y-scope#582) Co-authored-by: kirkrodrigues <[email protected]> * refactor(ffi): Make `get_schema_subtree_bitmap` a public method of `KeyValuePairLogEvent`. (y-scope#581) * ci: Schedule GitHub workflows to daily run to detect failures due to upgraded dependencies or environments. (y-scope#583) * docs: Update the required version of task. (y-scope#567) * Add pr check workflow --------- Co-authored-by: kirkrodrigues <[email protected]> Co-authored-by: Junhao Liao <[email protected]> Co-authored-by: Henry8192 <[email protected]> Co-authored-by: Devin Gibson <[email protected]> Co-authored-by: wraymo <[email protected]> Co-authored-by: wraymo <[email protected]> Co-authored-by: Xiaochong(Eddy) Wei <[email protected]> Co-authored-by: Xiaochong Wei <[email protected]> Co-authored-by: haiqi96 <[email protected]>
…-scope#568) Co-authored-by: Lin Zhihao <[email protected]> Co-authored-by: Xiaochong Wei <[email protected]>
Description
Modified constructors of
NetworkReader
,DownloaderThread
andCurlDownloadHandler
to support custom headers passed by users.Note that this PR is the same as #564 , which was messed by a newbie contributor.
Validation performed
Added the Authorization header with a Bearer token to fetch a file from remote, validating the response.
Summary by CodeRabbit
New Features
CurlDownloadHandler
andNetworkReader
classes to support custom HTTP headers in download requests.Bug Fixes
Tests