-
Notifications
You must be signed in to change notification settings - Fork 380
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
feat: Username for process_exec events #2369
Conversation
✅ Deploy Preview for tetragon ready!
To edit notification comments on pull requests, go to your Netlify site configuration. |
5d48ea0
to
f997d95
Compare
@jrfastab, please, have a look. I think the PR is ready. |
hi 👋 sorry for the delay I'm going to look at it in more detail tomorrow quick scan looks good to me. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I agree on the feature since users want it ;-)
But some changes since it is not that easy and we have to predict things.
api/v1/tetragon/tetragon.proto
Outdated
@@ -259,6 +259,8 @@ message Process { | |||
ProcessCredentials process_credentials = 17; | |||
// Executed binary properties. This field is only available on ProcessExec events. | |||
BinaryProperties binary_properties = 18; | |||
// User name associated with process. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
After some thoughts and let's see what @jrfastab thinks too, but since username is something abstracted from a user space (name) services and not kernel related that could do network lookups in future, let's separate the things.
I suggest something like this:
// User records
message UserRecord {
// The UNIX username for this record. Corresponds to `pw_name` field of [struct passwd](https://man7.org/linux/man-pages/man3/getpwnam.3.html)
// and the `sp_namp` field of [struct spwd](https://man7.org/linux/man-pages/man3/getspnam.3.html).
name = 1;
}
// Unix user account information
message UserAccount {
// UserRecord user = 1;
}
Then inside message Process:
// Unix user account information. Available only on host deployments when Tetragon is running as
// a system service or directly on host. Accounts information is retrieved from traditional user database
// `/etc/passwd` and no name services lookups are performed.
UserAccount account = 19;
@jrfastab I don't like the 'account' name field, feel free to suggest.
All this is to separate things as username is something that is available depending on the workflow but also to be extensible in future with where things are going in the userspace stack:
Also eventually there are some fields there that we could add to records so we offer cool features to our users ;-)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thank you for such detailed review! Your proposed solution with UserAccont
struct looks great. It allows us to extend event with some other useful information in future. I could't find a better naming for 'Account'. So I keep this name for now.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What do we get by having this couple layers of structures here?
UserRecord {
UserName {
name string
}
}
I get having one struct any reason to not just embed name in UserRecord?
UserRecord {
name string
}
If user record grows fine.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeh having such layering may seem too much, but in the other hand the abstraction at userspace has always been separate users and groups, you have two files for /etc/passwd and /etc/groups, at systemd layer two separate records, glibc sgrp and spwd, NSS etc... and I'm trying to position tetragon be autocompatible as possible... ;-)
Since you point this, I did another review of those user space fields, systemd is the most advanced one. We won't do NSS lookups in tetragon. So taking current distros support:
- Arch Linux seems to have already adopted systemd-userdb service JSON User/Group Record Query Multiplexer/NSS Compatibility as part of the systemd core.
- Ubuntu has https://manpages.ubuntu.com/manpages/lunar/man8/systemd-userdbd.8.html but it is provided by another package. Anyway the end result is a socket on
/run/systemd/userdb/
that provides json records. - Debian issue closed by packaging systemd-userdb in its own package https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=976960
So I see two choices here, and obviously I go with json systemd compatible one as it will allow us to support NSS and all new stuff https://systemd.io/USER_RECORD/ and https://systemd.io/GROUP_RECORD/ by querying the socket automatically, then mapping fields to the tetragon fields automatically.
- JSON systemd-userdb compatible two structs: per user and one per group:
// User records
UserRecord {
name string
...
}
// Group records
GroupRecord {
...
}
message Process {
// Unix user account information. Available only on host deployments when Tetragon is running as
// a system service or directly on host. Accounts information is retrieved from traditional user database
// `/etc/passwd` and no name services lookups are performed.
UserRecord user = 19;
// Unix group account information....
GroupRecord group = xx;
}
- Or we do simple one struct and embed everything, and prefix fields with either user/group for ones that have same name and also same meaning we let it grow
// Unix user account information
AccountRecord {
username string // The UNIX user name for this record.
groupname string // The UNIX group name for this record.
}
message Process {
// Unix user account information. Available only on host deployments when Tetragon is running as
// a system service or directly on host. Accounts information is retrieved from traditional user database
// `/etc/passwd` and no name services lookups are performed.
AccountRecord user/account = 19
}
Again for obvious reasons, I prefer 1. WDYT @jrfastab @anfedotoff ?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I vote for the first one:). Btw, if we add group record, I think I can fill it the same way as user record.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I vote for the first one:). Btw, if we add group record, I think I can fill it the same way as user record.
Yes same way, just mention it in the code doc.
Let's wait for @jrfastab ack and I hope we have sorted out all stuff ;-)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
1 works for me. For simplicity we could start with just the User record in t his PR and follow up with Group?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
1 works for me. For simplicity we could start with just the User record in t his PR and follow up with Group?
It's no problem to get primary group name for the user using LookupGroupId. But primary group name is often the same as user name. So I'm not sure if it is useful to have only this info.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
So, I vote to keep only User record for now, but for me it is OK to add Group record as I described. @tixxdz, what do you think?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@anfedotoff Up to you if you can make it shorter, document it and covered with tests then go with it, otherwise we can merge this PR first, then you follow up.
ee6be8f
to
0906a0c
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
So LGTM in general, minor fixes again and I think we can merge it.
Thank you for doing it ;-)
pkg/sensors/exec/exec.go
Outdated
mode := cgroups.GetDeploymentMode() | ||
if (mode == cgroups.DEPLOY_SD_SERVICE || mode == cgroups.DEPLOY_SD_USER) && | ||
namespace.GetCurrentNamespace().Mnt.Inum == m.Unix.Msg.Namespaces.MntInum && | ||
namespace.GetCurrentNamespace().User.Inum == m.Unix.Msg.Namespaces.UserInum { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sorry here I missed that you should replace the namespace check with:
https://github.com/cilium/tetragon/blob/main/pkg/reader/namespace/namespace.go#L270 get host namespaces there and if no error (to make code look correct for static analysis) and that mnt and user ns of exec match host namespaces then you are good to go.
No need to log errors, since normally tetragon will fail at startup if InitHostNamespaces() fails, but let's check the code so go tools won't error
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hmm got confused here! Why do we check if the exec event is running in same host namespace as tetragon? do you have a particular reason for that?
Or do you think due to usernamespace mapping users will get confused?
From execve event in bpf we return the kernel global uid view anyway, so we will always match the host view in /etc/passwd if the uid exists there. Can you please add that in the proto API documentation?
Then from code we can just call here: namespace.GetCurrentNamespace()
that is tetragon and then check if User.IsHost==true and Mnt.IsHost==true
(also cache result in static global var there). This way we are running in host and resolving uids according to host original mapping of /etc/passwd
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hmm got confused here! Why do we check if the exec event is running in same host namespace as tetragon? do you have a particular reason for that?
I thought if there is a case, when tetragon is running not in host ns. For this case I check if exec and tetragon are in the same namespace. Yes, let's consider only host ns.
0906a0c
to
94042c3
Compare
@tixxdz , please, have a look. I think PR is ready:). |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
So LGTM still two things pointed out that we need to fix before ;-)
But really looks great!
pkg/sensors/exec/exec.go
Outdated
@@ -68,6 +71,19 @@ func msgToExecveKubeUnix(m *processapi.MsgExecveEvent, exec_id string, filename | |||
return kube | |||
} | |||
|
|||
func msgToExecveAccountUnix(m *exec.MsgExecveEventUnix) { | |||
mode := cgroups.GetDeploymentMode() | |||
if ns, err := namespace.GetMsgNamespaces(m.Unix.Msg.Namespaces); err == nil { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
So one last fix please here:
I mean to cache results of Tetragon and check if tetragon is running in host, not the target process that is received in unix message.
So can you please cherry pick this patch here: 69092c5 , the repo is here: https://github.com/cilium/tetragon/compare/pr/tixxdz/2024-05-check-current-namespaces , just the last patch, make it first patch then you add yours on top.
func msgToExecveAccountUnix(m *exec.MsgExecveEventUnix) {
mode := cgroups.GetDeploymentMode()
ns := namespace.GetCurrentNamespace()
if (mode == cgroups.DEPLOY_SD_SERVICE || mode == cgroups.DEPLOY_SD_USER) &&
(ns && ns.Mnt.IsHost && ns.User.IsHost) {
...
}
}
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@tixxdz, thanks for comments. Maybe I misunderstand something. But look at the case:
Tetragon is running on host. I ran a docker container. I this container I created a user with uid 1001. Than just called whoami
.
"process_exec": {
"process": {
"exec_id": "YW5mZWRvdG9mZi1uaXg6MTEwNzA5MDEyMjAyMjc2MjoyOTU2ODM=",
"pid": 295683,
"uid": 1001,
"cwd": "/",
"binary": "/usr/bin/whoami",
"flags": "execve rootcwd clone",
"start_time": "2024-05-08T14:13:12.359738563Z",
"auid": 4294967295,
"docker": "ead4b91477a0d8affbde3db2317d89b",
"parent_exec_id": "YW5mZWRvdG9mZi1uaXg6NDQ3NDYxOTUzMDAwMDAwMDoyOTQ1OTA=",
"tid": 295683
},
I think we don't want to resolve usernames for processes that are not from host NS.
This is /etc/passwd
file from container:
nginx:x:101:101:nginx user:/nonexistent:/bin/false
user:x:1000:1000::/home/user:/bin/sh
user2:x:1001:1001::/home/user2:/bin/sh
On host, for example uid 1000 is me, anfedotoff:). So, If I start a process from user (1000)
I will recieve my host user (anfedotoff
) in event... So that's why I check if process belongs to host NS. What do you think about such case?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hm so:
- You create a docker container that's running with uid 1001 right? since tetragon is reporting that. The uid is global in Tetragon view, even if you map it with user namespaces, we always report uids in the host user namespace.
- Then with the username lookup logic we will resolve it to host /etc/passwd, so inside host:/etc/passwd if uid 1001 is userA and inside container:/etc/passwd uid 1001 is userB , in tetragon we will report userA , I guess that's the part you find confusing right?
I agree that users could be confused, let's then do it one step, we go with resolve only if process is in host NS, while we think about it, and we may just document it.
So to merge this, please keep your current change, just fix the test skipped, and we are good ;-)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hm so:
- You create a docker container that's running with uid 1001 right? since tetragon is reporting that. The uid is global in Tetragon view, even if you map it with user namespaces, we always report uids in the host user namespace.
- Then with the username lookup logic we will resolve it to host /etc/passwd, so inside host:/etc/passwd if uid 1001 is userA and inside container:/etc/passwd uid 1001 is userB , in tetragon we will report userA , I guess that's the part you find confusing right?
Yes, exactly. Let's resolve username only if process is in host NS for now!
ns := namespace.GetCurrentNamespace() | ||
if (mode != cgroups.DEPLOY_SD_SERVICE && mode != cgroups.DEPLOY_SD_USER) || | ||
!ns.Mnt.IsHost || !ns.User.IsHost { | ||
t.Skip() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The test is being skipped here: https://github.com/cilium/tetragon/actions/runs/8984334822/job/24676304855?pr=2369#step:6:1345 so you probably need to initialize the deployment as pointed out.
Try to run the test on your local machine with the following command:
make
go test -exec "sudo" -p 1 -parallel 1 -gcflags= -timeout 20m -failfast -cover -run TestEventExecveWithUsername ./pkg/sensors/exec/... -v
It should not be skipped.
94042c3
to
a11c9fc
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks!
I have one change request.
pkg/sensors/exec/exec.go
Outdated
func msgToExecveAccountUnix(m *exec.MsgExecveEventUnix) { | ||
mode := cgroups.GetDeploymentMode() | ||
if ns, err := namespace.GetMsgNamespaces(m.Unix.Msg.Namespaces); err == nil { | ||
if (mode == cgroups.DEPLOY_SD_SERVICE || mode == cgroups.DEPLOY_SD_USER) && |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Instead of checking (mode == cgroups.DEPLOY_SD_SERVICE || mode == cgroups.DEPLOY_SD_USER)
can we introduce a user flag for this?
The flag will allow us to document the limitations of just reading /etc/passwd
. I would call the flag something like: --resolve-username=unix-basic
and document the limitations.
I would personally have the default be --resolve-username=disabled
, but I'm fine to set it to "unix-basic" if (mode == cgroups.DEPLOY_SD_SERVICE || mode == cgroups.DEPLOY_SD_USER)
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
But why introduce the flag? what's purpose it is serving? we are already document that in host deployments we are reading /etc/passwd?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
But why introduce the flag? what's purpose it is serving? we are already document that in host deployments we are reading /etc/passwd?
I agree with @tixxdz . I don't know is it reasonable to add a flag only for documentation purpose. Maybe we can find a place in docs to describe the restrictions of username resolving? It might help users to be not surprised when for some process event will not have a username
field. @kkourt what do you think?
P.S. For me it's ok to add on option if you think so.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ok after an offline discussion with @kkourt , indeed it could be problematic on setups that go with specific NSS, even if we ignore non existent entries in /etc/passwd, this falls short on duplicate or even if users want to override entries.
Instead of checking (mode == cgroups.DEPLOY_SD_SERVICE || mode == cgroups.DEPLOY_SD_USER) can we introduce a user flag for this?
The deployment is a guard, cause if in container deployment or something else, it does not make sense to check that container passwd file.
The flag will allow us to document the limitations of just reading /etc/passwd. I would call the flag something like: --resolve-username=unix-basic and document the limitations.
Makes sense, what if we make it: --metadata-username=unix
? to reflect on https://linux.die.net/man/8/pam_unix and metadata enrichment? calling it --resolve-username is fine too.
I would personally have the default be --resolve-username=disabled, but I'm fine to set it to "unix-basic" if (mode == cgroups.DEPLOY_SD_SERVICE || mode == cgroups.DEPLOY_SD_USER).
Yes that works, let's go with disabled meaning just that, we check deployment mode and we set it, and if in future we want to disable it even for host deployments we could do that.
BTW some notes as follow up:
- Cache the entries. Taking the sssd architecture https://sssd.io/docs/architecture.html it has a cache by default, I think entries could expire, but for entries below 1000 , and referencing systemd uids https://github.com/systemd/systemd/blob/main/docs/UIDS-GIDS.md#summary we can cache those in small map.
- If uid 0 resolves to root which is the convention we should not query the file again, but for later.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The primary motivation for having a flag is:
- allow users to turn off this feature (this is aimed at environments that rely on other methods. e.g., NSS)
- allow us to support other methods in the future
I prefer --username-metadata=unix
to --metdata-username
.
One of the extensions we might want to do is --username-metdata=unix:<file>
and then mount /etc/passwd
inside the tetragon's pod to location <file>
. Doing so will allow us to get host username events even if we deploy via a K8s daemonset.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes --username-metadata=unix
is good too.
One of the extensions we might want to do is --username-metdata=unix: and then mount /etc/passwd inside the tetragon's pod to location . Doing so will allow us to get host username events even if we deploy via a K8s daemonset.
Hmm I dont think we should do that, security scanners may flag tetragon. Would be better to ask users that want this, install systemd-userdb then mount the socket from /run/systemd/userdb/
then do json queries on the records. Also container dynamic userids IIRC are not registered somewhere or propagated so we can look them up IIRC.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Guys, I added --username-metadata
option. I also added a cache for username lookups by UID below 1000.
I looked at this table and I'm not sure that systemd UIDs greater than 1000 are not changing. Maybe cache only UIDs from [0; 1000) ?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Guys, I added --username-metadata option
Thanks @anfedotoff.
I'll let @tixxdz answer about the cache, but in my opinion we can merge the PR without it.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Guys, I added --username-metadata option
Thanks @anfedotoff. I'll let @tixxdz answer about the cache, but in my opinion we can merge the PR without it.
I agree, let's first merge what we have now, and you can add the optimization to not open every time /etc/passwd in follow up PR
a11c9fc
to
cf44dc4
Compare
86c385b
to
c478473
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM! Can you please add @kkourt suggestion about UserRecord doc
Also remove the cache let's please make it in separate PR.
Thanks for fixing the test, much appreciated ;-)
"deployment.mode": mode.String(), | ||
}).Warn("Username resolution is not available for given deployment mode") | ||
} | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ok checking this here is good, thanks
Signed-off-by: Andrey Fedotov <[email protected]>
Signed-off-by: Andrey Fedotov <[email protected]>
c478473
to
465f368
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thank you ;-)
The CI failures seem unrelated |
Username is useful when tetragon works on host. On different hosts the same username can have different UIDs.
Approach based on resolving username with pure Go
os/user
LGTM. But there is some restrictions:/etc/passwd
is availableTODO: