[extension/awslogsencodingextension] Add source_region field (27th field) to S3 server access log parser and skip unknown future fields gracefully.#47149
Conversation
|
Welcome, contributor! Thank you for your contribution to opentelemetry-collector-contrib. Important reminders:
A maintainer will review your pull request soon. Thank you for helping make OpenTelemetry better! |
source_region field (27th field) to S3 server access log parser and skip unknown future fields gracefully.
d1ba73f to
1d47539
Compare
…region field and forward-compat for unknown fields AWS added source_region as the 27th field in S3 server access logs. The parser previously failed with "values in log line exceed the number of available fields" when it encountered 27+ fields. - Add fieldIndexSourceRegion (26) mapped to aws.s3.source_region - Skip fields beyond the known schema instead of erroring, so future AWS additions do not break the parser - Update test data and expected output accordingly
1d47539 to
5730900
Compare
axw
left a comment
There was a problem hiding this comment.
Thanks @raulmartinezr! Just a question about the attribute, otherwise LGTM.
| attributeAWSS3ObjectSize = "aws.s3.object.size" | ||
| attributeAWSS3TurnAroundTime = "aws.s3.turn_around_time" | ||
| attributeAWSS3AclRequired = "aws.s3.acl_required" | ||
| attributeAWSS3SourceRegion = "aws.s3.source_region" |
There was a problem hiding this comment.
Would the standard cloud.region attribute be appropriate here? Not as a resource attribute (that would more appropriate for identifying the S3 bucket region), but as a log record attribute.
There was a problem hiding this comment.
Per AWS documentation, source_region is defined as "the AWS Region from which the request originated" — i.e. the caller's region, not the resource's region. Mapping this to cloud.region (whose OTel semantics describe where a resource lives) would invert the meaning. aws.s3.source_region keeps the intent unambiguous.
There was a problem hiding this comment.
Mapping this to cloud.region (whose OTel semantics describe where a resource lives) would invert the meaning.
What it says is:
"When associated with a resource, this attribute specifies the region where the resource operates. When calling services or APIs deployed on a cloud, this attribute identifies the region where the called destination is deployed."
So it's not just where the resource (in the OTel data model sense) lives, but may also refer to where the target of an operation lives. What's missing is describing where the source of an operation lives.
Anyway, I'm OK with starting with aws.s3.source_region, but we may want to revisit this later on.
|
Thank you for your contribution @raulmartinezr! 🎉 We would like to hear from you about your experience contributing to OpenTelemetry by taking a few minutes to fill out this survey. If you are getting started contributing, you can also join the CNCF Slack channel #opentelemetry-new-contributors to ask for guidance and get help. |
…field) to S3 server access log parser and skip unknown future fields gracefully. (open-telemetry#47149) #### Description **Bug fix**: Add `source_region` field (27th field) to S3 server access log parser and skip unknown future fields gracefully. **Details:** AWS added a `source_region` field to the S3 server access log format. The parser previously returned an error ("values in log line exceed the number of available fields") when it encountered log lines with more fields than defined. This fix: - Adds `source_region` as field index 26 mapped to `aws.s3.source_region`. - Makes the parser skip any fields beyond the known schema instead of failing, providing forward compatibility with future AWS S3 access log additions. #### Link to tracking issue #### Testing #### Documentation --- *This PR description was auto-generated from the changelog entry.* --------- Co-authored-by: Andrew Wilkins <axwalk@gmail.com>
Description
Bug fix: Add
source_regionfield (27th field) to S3 server access log parser and skip unknown future fields gracefully.Details:
AWS added a
source_regionfield to the S3 server access log format. The parserpreviously returned an error ("values in log line exceed the number of available fields")
when it encountered log lines with more fields than defined. This fix:
source_regionas field index 26 mapped toaws.s3.source_region.providing forward compatibility with future AWS S3 access log additions.
Link to tracking issue
Testing
Documentation
This PR description was auto-generated from the changelog entry.