Skip to content

[extension/awslogsencodingextension] Add source_region field (27th field) to S3 server access log parser and skip unknown future fields gracefully.#47149

Merged
edmocosta merged 6 commits into
open-telemetry:mainfrom
raulmartinezr:fix/s3-access-log-source-region-field
Apr 2, 2026
Merged

[extension/awslogsencodingextension] Add source_region field (27th field) to S3 server access log parser and skip unknown future fields gracefully.#47149
edmocosta merged 6 commits into
open-telemetry:mainfrom
raulmartinezr:fix/s3-access-log-source-region-field

Conversation

@raulmartinezr

@raulmartinezr raulmartinezr commented Mar 24, 2026

Copy link
Copy Markdown
Contributor

Description

Bug fix: Add source_region field (27th field) to S3 server access log parser and skip unknown future fields gracefully.

Details:

AWS added a source_region field to the S3 server access log format. The parser
previously returned an error ("values in log line exceed the number of available fields")
when it encountered log lines with more fields than defined. This fix:

  • Adds source_region as field index 26 mapped to aws.s3.source_region.
  • Makes the parser skip any fields beyond the known schema instead of failing,
    providing forward compatibility with future AWS S3 access log additions.

Link to tracking issue

Testing

Documentation


This PR description was auto-generated from the changelog entry.

@raulmartinezr raulmartinezr requested review from a team and axw as code owners March 24, 2026 11:39
@linux-foundation-easycla

linux-foundation-easycla Bot commented Mar 24, 2026

Copy link
Copy Markdown

CLA Signed

The committers listed above are authorized under a signed CLA.

@github-actions github-actions Bot added the first-time contributor PRs made by new contributors label Mar 24, 2026
@github-actions

Copy link
Copy Markdown
Contributor

Welcome, contributor! Thank you for your contribution to opentelemetry-collector-contrib.

Important reminders:

A maintainer will review your pull request soon. Thank you for helping make OpenTelemetry better!

@github-actions github-actions Bot changed the title as per changelog [extension/awslogsencodingextension] Add source_region field (27th field) to S3 server access log parser and skip unknown future fields gracefully. Mar 24, 2026
@raulmartinezr raulmartinezr force-pushed the fix/s3-access-log-source-region-field branch from d1ba73f to 1d47539 Compare March 24, 2026 11:41
…region field and forward-compat for unknown fields

AWS added source_region as the 27th field in S3 server access logs.
The parser previously failed with "values in log line exceed the number
of available fields" when it encountered 27+ fields.

- Add fieldIndexSourceRegion (26) mapped to aws.s3.source_region
- Skip fields beyond the known schema instead of erroring, so future
  AWS additions do not break the parser
- Update test data and expected output accordingly
@raulmartinezr raulmartinezr force-pushed the fix/s3-access-log-source-region-field branch from 1d47539 to 5730900 Compare March 24, 2026 11:41

@axw axw left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks @raulmartinezr! Just a question about the attribute, otherwise LGTM.

attributeAWSS3ObjectSize = "aws.s3.object.size"
attributeAWSS3TurnAroundTime = "aws.s3.turn_around_time"
attributeAWSS3AclRequired = "aws.s3.acl_required"
attributeAWSS3SourceRegion = "aws.s3.source_region"

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Would the standard cloud.region attribute be appropriate here? Not as a resource attribute (that would more appropriate for identifying the S3 bucket region), but as a log record attribute.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Per AWS documentation, source_region is defined as "the AWS Region from which the request originated" — i.e. the caller's region, not the resource's region. Mapping this to cloud.region (whose OTel semantics describe where a resource lives) would invert the meaning. aws.s3.source_region keeps the intent unambiguous.

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Mapping this to cloud.region (whose OTel semantics describe where a resource lives) would invert the meaning.

What it says is:

"When associated with a resource, this attribute specifies the region where the resource operates. When calling services or APIs deployed on a cloud, this attribute identifies the region where the called destination is deployed."

So it's not just where the resource (in the OTel data model sense) lives, but may also refer to where the target of an operation lives. What's missing is describing where the source of an operation lives.

Anyway, I'm OK with starting with aws.s3.source_region, but we may want to revisit this later on.

Comment thread .chloggen/fix-s3-access-log-source-region-field.yaml Outdated
@axw axw added the ready to merge Code review completed; ready to merge by maintainers label Apr 2, 2026
@edmocosta edmocosta merged commit c61aa3f into open-telemetry:main Apr 2, 2026
207 checks passed
@otelbot

otelbot Bot commented Apr 2, 2026

Copy link
Copy Markdown
Contributor

Thank you for your contribution @raulmartinezr! 🎉 We would like to hear from you about your experience contributing to OpenTelemetry by taking a few minutes to fill out this survey. If you are getting started contributing, you can also join the CNCF Slack channel #opentelemetry-new-contributors to ask for guidance and get help.

araiu pushed a commit to araiu/opentelemetry-collector-contrib that referenced this pull request Apr 6, 2026
…field) to S3 server access log parser and skip unknown future fields gracefully. (open-telemetry#47149)

#### Description

**Bug fix**: Add `source_region` field (27th field) to S3 server access
log parser and skip unknown future fields gracefully.

**Details:**

AWS added a `source_region` field to the S3 server access log format.
The parser
previously returned an error ("values in log line exceed the number of
available fields")
when it encountered log lines with more fields than defined. This fix:
- Adds `source_region` as field index 26 mapped to
`aws.s3.source_region`.
- Makes the parser skip any fields beyond the known schema instead of
failing,
providing forward compatibility with future AWS S3 access log additions.

#### Link to tracking issue

#### Testing

#### Documentation

---
*This PR description was auto-generated from the changelog entry.*

---------

Co-authored-by: Andrew Wilkins <axwalk@gmail.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

extension/encoding/awslogsencoding first-time contributor PRs made by new contributors ready to merge Code review completed; ready to merge by maintainers

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants