Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

improved fork handling #35

Merged
merged 16 commits into from
May 19, 2022
Merged

improved fork handling #35

merged 16 commits into from
May 19, 2022

Conversation

kkourt
Copy link
Contributor

@kkourt kkourt commented May 18, 2022

See commits.

@kkourt kkourt force-pushed the pr/kkourt/fork-handling branch 6 times, most recently from 302ace8 to 9dadf0e Compare May 18, 2022 13:40
@kkourt
Copy link
Contributor Author

kkourt commented May 18, 2022

static failures are:

Warning: WARNING:PREFER_DEFINED_ATTRIBUTE_MACRO: Prefer __always_inline over attribute((always_inline))

Let's keep those as is for now for consistency with existing code.

@kkourt kkourt requested a review from jrfastab May 18, 2022 13:50
@kkourt kkourt force-pushed the pr/kkourt/fork-handling branch from 5977972 to b02bae9 Compare May 19, 2022 08:53
kkourt and others added 16 commits May 19, 2022 11:20
We have a lot of different C programs under contrib/.  Move them all to
a single directory.

[commit: 10b6e9d]

Signed-off-by: Kornilios Kourtis <[email protected]>
Add a fork-tester program to illustrate issues for processes that fork()
but do not exec().

For these cases, we do not have process information because this is
created in the exec event.

So far, we seem to handle these cases by iterating the process tree and
finding the parent of the process (see event_find_curr()). After a
fork(), the parent has the same properties (other than the pid) as the
parent so the above approach more or less works.

Unless the parent dies. In this case, the process is inherited by pid 1,
which means that the process information we get is the init process
(e.g., systemd).

The fork-tester program implements the above scenario. First it spawns a
child (chlid1) which also spawns a child (child2). child1 then exits,
and child2 is passwed to init 1. Then child2 does a connect.

fork-tester output:
$ ./contrib/tester-progs/fork-tester
parent: (pid:146611, ppid:102949) starts
child 1 (pid:146612) exits
parent: (pid:146611) child (146612) exited with: 0
child 2 (pid:146613, ppid:1) connecting to 8.8.8.8:53
child 2 done

generated event:
{
  "process_connect": {
    "process": {
      "exec_id": "OjExMDAwMDAwMDox",
      "pid": 1,
      "uid": 0,
      "cwd": "/",
      "binary": "/usr/lib/systemd/systemd",
      "arguments": "splash",
      "flags": "procFS auid rootcwd",
      "start_time": "2022-05-03T06:41:48.244Z",
      "auid": 0,
      "refcnt": 138
    },
    "parent": {},
    "source_ip": "192.168.0.66",
    "source_port": 43546,
    "destination_ip": "8.8.8.8",
    "destination_port": 53,
    "protocol": "TCP"
  },
  "time": "2022-05-05T10:48:53.581Z"
}

This commit just illustrates the issue. Upcoming patches will attempt
to fix it.

[commit: 879e179]

Signed-off-by: Kornilios Kourtis <[email protected]>
We load the current pid, only to be overwritten in the loop.
Remove the unnecessary load.

[commit: 9296588]

Signed-off-by: Kornilios Kourtis <[email protected]>
factor out __event_find_parent from event_find_parent.
__event_find_parent is to be used in a subsequent patch.

[commit: d966aaa]

Signed-off-by: Kornilios Kourtis <[email protected]>
Preparation patch for subsequent changes.

[commit: 3e1419e]

Signed-off-by: Kornilios Kourtis <[email protected]>
This is in preparation for a subsequent commit that will add another
type of event to the cache.

[commit: 7c58724]

Signed-off-by: Kornilios Kourtis <[email protected]>
As discussed in a previous commit, processes that do not fork() might end
up with wrong information.

This patch introduces a CLONE event. The purpose of this event is to
allow the agent to have proper process information for new processes
(i.e., ones created with fork()) that to do not call execve().

The idea is to pass as much information as needed to user-space to
create a new Process based on the parent, Because fork() will just copy
everything from the parent process, we just need a way to find the
parent, the new pids, and the new ktime so that we can build a new exec
id.

Note that CLONE events do not generate gRPC events -- they only update
the internal state of the agent. The reason is that most processes will
do an exec() after a fork(). The exec() will completely overwrite the
process, so having a preceding CLONE event is useless.

We could also generate a gRPC event for the creation of a new process,
lazily if another event happens. For example, if there is a fork()
followed by a connect() (no exec), we could just generate a gRPC event
to before the connect (at the agent level). This was not implemented to
keep things as simple as possible, but we can revisit in the future if
needed.

fork-tester output:
$ ./contrib/tester-progs/fork-tester
parent: (pid:147500, ppid:102949) starts
child 1 (pid:147501) exits
parent: (pid:147500) child (147501) exited with: 0
child 2 (pid:147502, ppid:1) connecting to 8.8.8.8:53
child 2 done

generated event:
{
 "process_connect": {
    "process": {
      "exec_id": "OjE4NzgzNDY2Mzk0MjIzMzoxNDc1MDI=",
      "pid": 147502,
      "uid": 1000,
      "cwd": "/home/kkourt/go/src/github.com/isovalent/hubble-fgs/",
      "binary": "/home/kkourt/go/src/github.com/isovalent/hubble-fgs/contrib/tester-progs/fork-tester",
      "flags": "execve",
      "start_time": "2022-05-05T10:52:22.798Z",
      "auid": 1000,
      "parent_exec_id": "OjE4NzgzNDY2Mzc1MTY0ODoxNDc1MDE=",
      "refcnt": 2
    },
    "parent": {
      "exec_id": "OjE4NzgzNDY2Mzc1MTY0ODoxNDc1MDE=",
      "pid": 147501,
      "uid": 1000,
      "cwd": "/home/kkourt/go/src/github.com/isovalent/hubble-fgs/",
      "binary": "/home/kkourt/go/src/github.com/isovalent/hubble-fgs/contrib/tester-progs/fork-tester",
      "flags": "execve",
      "start_time": "2022-05-05T10:52:22.797Z",
      "auid": 1000,
      "parent_exec_id": "OjE4NzgzNDY2MzEzMjYyNDoxNDc1MDA=",
      "refcnt": 1
    },
    "source_ip": "192.168.0.66",
    "source_port": 43548,
    "destination_ip": "8.8.8.8",
    "destination_port": 53,
    "protocol": "TCP"
  },
  "time": "2022-05-05T10:52:23.798Z"
}

[commit: c0417f5]

Signed-off-by: Kornilios Kourtis <[email protected]>
To be used by later commits.

[commit: 8536cd9]

Signed-off-by: Kornilios Kourtis <[email protected]>
Add a helper for parsing output of commands, but also logging in using
t.Log().

[commit: ae8050a]

Signed-off-by: Kornilios Kourtis <[email protected]>
[commit: aec9167]

Signed-off-by: Kornilios Kourtis <[email protected]>
Introduce a test that tests that tetragon properly handles events that
come from processes that fork() but do not exec().

This patch just introduces the code that spawns the fork-tester program,
and parses its output. It does not start the agent to test that events
are properly created. See subsequent patch for that.

[commit: 072a24d]

Signed-off-by: Kornilios Kourtis <[email protected]>
Start the observer so that we actually do the test.

Note: The fork-tester test program issues a connect event. The test, however,
does not use the connect event, and instead uses an exit event and tests
that the proper PIDs (as retrieved from fork-tester) match. The reason
for not using the connect event is that this would require loading the tcp
sensor and we want to avoid such a dependency (at least for now). We
keep the connect() call in fork-tester, however, because it's useful for
interactive testing/debugging.

I've also verified manually that reverting the "introduce a clone event"
patch, breaks the test.

[commit: 2ed42b9]

Signed-off-by: Kornilios Kourtis <[email protected]>
Signed-off-by: Kornilios Kourtis <[email protected]>

[commit: 24f2f2d]
The "binary" field has at max MAXARGLENGTH size. Until now this is
100. Let's increase that to 256.

[commit: 382eaa8]

Signed-off-by: Anastasios Papagiannis <[email protected]>
Signed-off-by: Kornilios Kourtis <[email protected]>
Before that we use fd_install to track the correlation of file names
with file descriptors. This captures calls of open* syscalls. dup
syscall also use the same mechanism.

dup2 and dup3 syscalls use a different way to do that. This commit
introduces CopyFD as a complementary action to FollowFD to be used
in dup2/dup3 syscalls.

Example syntax is in crds/examples/dup_read.yaml. A smaller example:
  - call: "__x64_sys_dup2"
    syscall: true
    args:
    - index: 0
      type: "fd"
    - index: 1
      type: "int"
    selectors:
    - matchActions:
      - action: CopyFD
        argFd: 0
        argName: 1

[commit: ebe3c49]

Signed-off-by: Kornilios Kourtis <[email protected]>
Signed-off-by: Anastasios Papagiannis <[email protected]>
[commit: 5da0092]

Signed-off-by: Anastasios Papagiannis <[email protected]>
Signed-off-by: Kornilios Kourtis <[email protected]>
@kkourt kkourt force-pushed the pr/kkourt/fork-handling branch from b02bae9 to 5ebdd63 Compare May 19, 2022 09:23
@jrfastab jrfastab merged commit 9951b47 into main May 19, 2022
@jrfastab jrfastab deleted the pr/kkourt/fork-handling branch May 19, 2022 16:03
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants