[filebeat] aws-s3 input falsely detects gzip file as a font #29968

@andrewkroh

The aws-s3 input can fail to detect gzip files correctly. We have a file that https://pkg.go.dev/net/http#DetectContentType falsely matches as application/vnd.ms-fontobject. I think we can replace that stdlib call with a direct check for the gzip magic number. This would avoid falsely matching any of the other signatures that are checked before gzip.

$ git diff s3_objects.go 
diff --git a/x-pack/filebeat/input/awss3/s3_objects.go b/x-pack/filebeat/input/awss3/s3_objects.go
index 7fe6b193fa..ebe1a5f082 100644
--- a/x-pack/filebeat/input/awss3/s3_objects.go
+++ b/x-pack/filebeat/input/awss3/s3_objects.go
@@ -15,7 +15,6 @@ import (
        "fmt"
        "io"
        "io/ioutil"
-       "net/http"
        "reflect"
        "strings"
        "time"
@@ -375,18 +374,13 @@ func s3ObjectHash(obj s3EventV2) string {
 // stream without consuming it. This makes it convenient for code executed after this function call
 // to consume the stream if it wants.
 func isStreamGzipped(r *bufio.Reader) (bool, error) {
-       // Why 512? See https://godoc.org/net/http#DetectContentType
-       buf, err := r.Peek(512)
+       buf, err := r.Peek(3)
        if err != nil && err != io.EOF {
                return false, err
        }
 
-       switch http.DetectContentType(buf) {
-       case "application/x-gzip", "application/zip":
-               return true, nil
-       default:
-               return false, nil
-       }
+       // gzip magic number (1f 8b) and the compression method (08 for DEFLATE).
+       return bytes.HasPrefix(buf, []byte{0x1F, 0x8B, 0x08}), nil
 }
 
 // s3Metadata returns a map containing the selected S3 object metadata keys.

Labels: Filebeat, bug, needs_team