Skip to content
Merged
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
17 changes: 17 additions & 0 deletions .chloggen/fix-s3-access-log-source-region-field.yaml
Original file line number Diff line number Diff line change
@@ -0,0 +1,17 @@
change_type: bug_fix

component: extension/aws_logs_encoding

note: Add `source_region` field (27th field) to S3 server access log parser and skip unknown future fields gracefully.

issues: [47149]

subtext: |
AWS added a `source_region` field to the S3 server access log format. The parser
previously returned an error ("values in log line exceed the number of available fields")
when it encountered log lines with more fields than defined. This fix:
- Adds `source_region` as field index 26 mapped to `aws.s3.source_region`.
- Makes the parser skip any fields beyond the known schema instead of failing,
providing forward compatibility with future AWS S3 access log additions.

change_logs: [user]
Original file line number Diff line number Diff line change
Expand Up @@ -14,6 +14,7 @@ const (
attributeAWSS3ObjectSize = "aws.s3.object.size"
attributeAWSS3TurnAroundTime = "aws.s3.turn_around_time"
attributeAWSS3AclRequired = "aws.s3.acl_required"
attributeAWSS3SourceRegion = "aws.s3.source_region"

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Would the standard cloud.region attribute be appropriate here? Not as a resource attribute (that would more appropriate for identifying the S3 bucket region), but as a log record attribute.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Per AWS documentation, source_region is defined as "the AWS Region from which the request originated" — i.e. the caller's region, not the resource's region. Mapping this to cloud.region (whose OTel semantics describe where a resource lives) would invert the meaning. aws.s3.source_region keeps the intent unambiguous.

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Mapping this to cloud.region (whose OTel semantics describe where a resource lives) would invert the meaning.

What it says is:

"When associated with a resource, this attribute specifies the region where the resource operates. When calling services or APIs deployed on a cloud, this attribute identifies the region where the called destination is deployed."

So it's not just where the resource (in the OTel data model sense) lives, but may also refer to where the target of an operation lives. What's missing is describing where the source of an operation lives.

Anyway, I'm OK with starting with aws.s3.source_region, but we may want to revisit this later on.


fieldIndexS3BucketOwner = 0
fieldIndexS3BucketName = 1
Expand Down Expand Up @@ -41,6 +42,7 @@ const (
fieldIndexTLSVersion = 23
fieldIndexAccessPointARN = 24
fieldIndexACLRequired = 25
fieldIndexSourceRegion = 26
)

// Some of the attribute names are based on semantic conventions for AWS S3.
Expand Down Expand Up @@ -75,4 +77,5 @@ var attributeNames = [...]string{
fieldIndexTLSVersion: string(conventions.TLSProtocolVersionKey), // TLS version
fieldIndexAccessPointARN: "aws.s3.access_point.arn", // access point ARN
fieldIndexACLRequired: attributeAWSS3AclRequired, // acl required
fieldIndexSourceRegion: attributeAWSS3SourceRegion, // source region
}
Original file line number Diff line number Diff line change
@@ -1 +1 @@
79a59df900b949e55d96a1e698fbacedfd6e09d98eacf8f8d5218e7cd47ef2be amzn-s3-demo-bucket1 [06/Feb/2019:00:00:38 +0000] 192.0.2.3 79a59df900b949e55d96a1e698fbacedfd6e09d98eacf8f8d5218e7cd47ef2be 3E57427F3EXAMPLE REST.GET.VERSIONING - "GET /amzn-s3-demo-bucket1?versioning HTTP/1.1" 200 - 113 - 7 - "-" "S3Console/0.4" - s9lzHYrFp76ZVxRcpX9+5cjAnEH2ROuNkd2BHfIa6UkFVdtjf5mKR3/eTPFvsiP/XV/VLi31234= SigV4 ECDHE-RSA-AES128-GCM-SHA256 AuthHeader amzn-s3-demo-bucket1.s3.us-west-1.amazonaws.com TLSV1.2 arn:aws:s3:us-west-1:123456789012:accesspoint/example-AP Yes TooMany
79a59df900b949e55d96a1e698fbacedfd6e09d98eacf8f8d5218e7cd47ef2be amzn-s3-demo-bucket1 [06/Feb/2019:00:00:38 +0000] 192.0.2.3 79a59df900b949e55d96a1e698fbacedfd6e09d98eacf8f8d5218e7cd47ef2be 3E57427F3EXAMPLE REST.GET.VERSIONING - "GET /amzn-s3-demo-bucket1?versioning HTTP/1.1" 200 - 113 - 7 - "-" "S3Console/0.4" - s9lzHYrFp76ZVxRcpX9+5cjAnEH2ROuNkd2BHfIa6UkFVdtjf5mKR3/eTPFvsiP/XV/VLi31234= SigV4 ECDHE-RSA-AES128-GCM-SHA256 AuthHeader amzn-s3-demo-bucket1.s3.us-west-1.amazonaws.com TLSV1.2 arn:aws:s3:us-west-1:123456789012:accesspoint/example-AP Yes us-east-1 ExtraUnknownField
Original file line number Diff line number Diff line change
@@ -0,0 +1,89 @@
resourceLogs:
- resource:
attributes:
- key: cloud.provider
value:
stringValue: aws
- key: aws.s3.bucket
value:
stringValue: amzn-s3-demo-bucket1
- key: aws.s3.owner
value:
stringValue: 79a59df900b949e55d96a1e698fbacedfd6e09d98eacf8f8d5218e7cd47ef2be
scopeLogs:
- logRecords:
- attributes:
- key: source.address
value:
stringValue: 192.0.2.3
- key: user.id
value:
stringValue: 79a59df900b949e55d96a1e698fbacedfd6e09d98eacf8f8d5218e7cd47ef2be
- key: aws.request_id
value:
stringValue: 3E57427F3EXAMPLE
- key: rpc.method
value:
stringValue: REST.GET.VERSIONING
- key: http.request.method
value:
stringValue: GET
- key: url.path
value:
stringValue: /amzn-s3-demo-bucket1
- key: url.query
value:
stringValue: versioning
- key: network.protocol.name
value:
stringValue: http
- key: network.protocol.version
value:
stringValue: "1.1"
- key: http.response.status_code
value:
intValue: "200"
- key: http.response.body.size
value:
intValue: "113"
- key: duration
value:
intValue: "7"
- key: user_agent.original
value:
stringValue: S3Console/0.4
- key: aws.extended_request_id
value:
stringValue: s9lzHYrFp76ZVxRcpX9+5cjAnEH2ROuNkd2BHfIa6UkFVdtjf5mKR3/eTPFvsiP/XV/VLi31234=
- key: aws.signature.version
value:
stringValue: SigV4
- key: tls.cipher
value:
stringValue: ECDHE-RSA-AES128-GCM-SHA256
- key: aws.s3.auth_type
value:
stringValue: AuthHeader
- key: http.request.header.host
value:
stringValue: amzn-s3-demo-bucket1.s3.us-west-1.amazonaws.com
- key: tls.protocol.version
value:
stringValue: "1.2"
- key: aws.s3.access_point.arn
value:
stringValue: arn:aws:s3:us-west-1:123456789012:accesspoint/example-AP
- key: aws.s3.acl_required
value:
boolValue: true
- key: aws.s3.source_region
value:
stringValue: us-east-1
body: {}
timeUnixNano: "1549411238000000000"
scope:
attributes:
- key: encoding.format
value:
stringValue: aws.s3access
name: github.com/open-telemetry/opentelemetry-collector-contrib/extension/encoding/awslogsencodingextension
Original file line number Diff line number Diff line change
@@ -1 +1 @@
79a59df900b949e55d96a1e698fbacedfd6e09d98eacf8f8d5218e7cd47ef2be amzn-s3-demo-bucket1 [06/Feb/2019:00:00:38 +0000] 192.0.2.3 79a59df900b949e55d96a1e698fbacedfd6e09d98eacf8f8d5218e7cd47ef2be 3E57427F3EXAMPLE REST.GET.VERSIONING - "GET /amzn-s3-demo-bucket1?versioning HTTP/1.1" 200 - 113 - 7 - "-" "S3Console/0.4" - s9lzHYrFp76ZVxRcpX9+5cjAnEH2ROuNkd2BHfIa6UkFVdtjf5mKR3/eTPFvsiP/XV/VLi31234= SigV4 ECDHE-RSA-AES128-GCM-SHA256 AuthHeader amzn-s3-demo-bucket1.s3.us-west-1.amazonaws.com TLSV1.2 arn:aws:s3:us-west-1:123456789012:accesspoint/example-AP Yes
79a59df900b949e55d96a1e698fbacedfd6e09d98eacf8f8d5218e7cd47ef2be amzn-s3-demo-bucket1 [06/Feb/2019:00:00:38 +0000] 192.0.2.3 79a59df900b949e55d96a1e698fbacedfd6e09d98eacf8f8d5218e7cd47ef2be 3E57427F3EXAMPLE REST.GET.VERSIONING - "GET /amzn-s3-demo-bucket1?versioning HTTP/1.1" 200 - 113 - 7 - "-" "S3Console/0.4" - s9lzHYrFp76ZVxRcpX9+5cjAnEH2ROuNkd2BHfIa6UkFVdtjf5mKR3/eTPFvsiP/XV/VLi31234= SigV4 ECDHE-RSA-AES128-GCM-SHA256 AuthHeader amzn-s3-demo-bucket1.s3.us-west-1.amazonaws.com TLSV1.2 arn:aws:s3:us-west-1:123456789012:accesspoint/example-AP Yes us-east-1
Original file line number Diff line number Diff line change
Expand Up @@ -76,6 +76,9 @@ resourceLogs:
- key: aws.s3.acl_required
value:
boolValue: true
- key: aws.s3.source_region
value:
stringValue: us-east-1
body: {}
timeUnixNano: "1549411238000000000"
scope:
Expand Down
Original file line number Diff line number Diff line change
Expand Up @@ -176,10 +176,6 @@ func handleLog(resourceAttr *resourceAttributes, scopeLogs plog.ScopeLogs, log s
var value string
var err error
for i = 0; remaining != ""; i++ {
if i >= len(attributeNames) {
return errors.New("values in log line exceed the number of available fields")
}

value, remaining, err = scanField(remaining)
if err != nil {
if errors.Is(err, io.EOF) {
Expand All @@ -188,6 +184,14 @@ func handleLog(resourceAttr *resourceAttributes, scopeLogs plog.ScopeLogs, log s
return err
}

if i >= len(attributeNames) {
// Skip unknown fields for forward compatibility with future AWS S3
// access log format additions. AWS may append new fields without a
// format version bump.
// See https://docs.aws.amazon.com/AmazonS3/latest/userguide/LogFormat.html.
continue
}

if value == unknownField && i != fieldIndexACLRequired {
// acl required field can be '-' to indicate that no ACL was required
continue
Expand All @@ -207,7 +211,7 @@ func handleLog(resourceAttr *resourceAttributes, scopeLogs plog.ScopeLogs, log s
}
}

if i != fieldIndexACLRequired+1 {
if i < fieldIndexACLRequired+1 {
return errors.New("values in log line are less than the number of available fields")
}

Expand Down
Original file line number Diff line number Diff line change
Expand Up @@ -200,8 +200,10 @@ func TestUnmarshalLogs(t *testing.T) {
expectedErr: "values in log line are less than the number of available fields",
},
"too_many_values": {
logFilename: "too_many_values.log",
expectedErr: "values in log line exceed the number of available fields",
// Extra fields beyond the known schema are silently skipped for
// forward compatibility with future AWS S3 access log additions.
logFilename: "too_many_values.log",
expectedFilename: "too_many_values_expected.yaml",
},
}

Expand Down
Loading