Skip to content
Merged
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
42 changes: 36 additions & 6 deletions process/process.go
Original file line number Diff line number Diff line change
Expand Up @@ -12,6 +12,7 @@ import (
"io"
"os"
"strings"
"sync"

"golang.org/x/sys/unix"

Expand All @@ -34,6 +35,22 @@ type systemProcess struct {

var _ Process = &systemProcess{}

var bufPool sync.Pool

// mappingParseBufferSize defines the initial buffer size used to store lines from
// /proc/PID/maps during parsing of mappings.

const mappingParseBufferSize = 256

func init() {
bufPool = sync.Pool{
New: func() any {
buf := make([]byte, mappingParseBufferSize)
return &buf
},
}
}

// New returns an object with Process interface accessing it
func New(pid libpf.PID) Process {
return &systemProcess{
Expand Down Expand Up @@ -63,10 +80,20 @@ func trimMappingPath(path string) string {
}

func parseMappings(mapsFile io.Reader) ([]Mapping, error) {
mappings := make([]Mapping, 0)
mappings := make([]Mapping, 0, 32)
scanner := bufio.NewScanner(mapsFile)
buf := make([]byte, 512)
scanner.Buffer(buf, 8192)
scanBuf := bufPool.Get().(*[]byte)
if scanBuf == nil {
return mappings, errors.New("failed to get memory from sync pool")
}
defer func() {
// Reset memory and return it for reuse.
for j := 0; j < len(*scanBuf); j++ {
Comment thread
rockdaboot marked this conversation as resolved.
(*scanBuf)[j] = 0x0
}
Comment thread
rockdaboot marked this conversation as resolved.
Comment on lines +90 to +93
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

On a second thought... why clear the memory at all here? parseMapping() does not rely on the contents of the buffer retrieved from the pool. The scanner read data and only acts on the read data in the buffer. It doesn't matter if we fill the bufferswith 0 bytes or random bytes or leave it as is.

Suggested change
// Reset memory and return it for reuse.
for j := 0; j < len(*scanBuf); j++ {
(*scanBuf)[j] = 0x0
}

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The scanner, that uses this buffer, uses the default split function ScanLines. By not clearing the buffer, we might introduce subtle bugs.

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

we might introduce subtle bugs

Possibly yes. Can you add a test case that triggers such a bug. That makes it obvious to if someone ever touches that code or wonders if it is required.

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hm, I can't see how bufio.Scanner.Scan() uses ScanLines().

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As we don't set a specific Split function, the buffer uses the default https://pkg.go.dev/bufio#NewScanner.

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I have reverted the applied suggestion, as I should not have applied in the first place. And I also reasked for review.
I have reverted the applied suggestion as resetting a buffer before reusing is something I will ask in a review, until it is shown that the operation can be done correctly with every state of the buffer. And this proposed change does not show that it will work as supposed with every state of the shared buffer.

Copy link
Copy Markdown
Contributor

@rockdaboot rockdaboot Feb 24, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I still didn't see any argument why or what exactly can fail. Without an explicit statement it is not possible to create a test case. Can't we please go with the facts and read the code instead of having some vague "fear"?

Copy link
Copy Markdown
Contributor

@rockdaboot rockdaboot Feb 24, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As I said, the std library looks sound to me and doesn't require resetting the buffer to 0. Additionally, we can prove that empirically with a comparison fuzzer:

diff --git a/process/process_test.go b/process/process_test.go
index 3cd2b7d..f64700c 100644
--- a/process/process_test.go
+++ b/process/process_test.go
@@ -4,6 +4,7 @@
 package process
 
 import (
+       "bytes"
        "debug/elf"
        "os"
        "strings"
@@ -105,3 +106,16 @@ func TestNewPIDOfSelf(t *testing.T) {
        require.NoError(t, err)
        assert.NotEmpty(t, mappings)
 }
+
+func FuzzParseMappings(f *testing.F) {
+       f.Fuzz(func(t *testing.T, input []byte) {
+               r1 := bytes.NewReader(input)
+               m1, err1 := parseMappings(r1)
+
+               r2 := bytes.NewReader(input)
+               m2, err2 := parseMappingsNoReset(r2)
+
+               require.Equal(t, err1, err2)
+               require.Equal(t, m1, m2)
+       })
+}

After ~30min, there is no indication that resetting the buffer makes any difference.

...
fuzz: elapsed: 31m45s, execs: 219880737 (97874/sec), new interesting: 77 (total: 149)
fuzz: elapsed: 31m48s, execs: 220278902 (132716/sec), new interesting: 77 (total: 149)
fuzz: elapsed: 31m51s, execs: 220549371 (90139/sec), new interesting: 77 (total: 149)

Unexpected result from running the fuzzer: The agent exits if it fails to parse fields to integer values, see #366

Copy link
Copy Markdown
Member Author

@florianl florianl Feb 24, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I will likely close this PR without merge, as there is no consensus.
For me, reused memory should always be cleared unless it can be shown that not clearing the memory does not interfere with expected behavior. And this PR does not show, that the function works as expected with every state of reused and not reseted memory.
Clearing memory before reusing it prevents a whole class of memory related bugs and I don't want to be the person, introducing such a change.

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we now have all the facts on the table. Why not wait for other reviews and then follow whatever the consensus is?

bufPool.Put(scanBuf)
}()
scanner.Buffer(*scanBuf, 8192)
for scanner.Scan() {
var fields [6]string
var addrs [2]string
Expand Down Expand Up @@ -150,12 +177,15 @@ func (sp *systemProcess) GetMappings() ([]Mapping, error) {

mappings, err := parseMappings(mapsFile)
if err == nil {
fileToMapping := make(map[string]*Mapping)
fileToMapping := make(map[string]*Mapping, len(mappings))
for idx := range mappings {
m := &mappings[idx]
if m.Inode != 0 {
fileToMapping[m.Path] = m
if m.Inode == 0 {
// Ignore mappings that are invalid,
// non-existent or are special pseudo-files.
continue
}
fileToMapping[m.Path] = m
}
sp.fileToMapping = fileToMapping
}
Expand Down