tetragon: refactor eventCache retry logic #310

jrfastab · 2022-08-11T00:43:47Z

Some objects should never hit retry logic for internal process lookups. One
example here is the exec entries which are the source of the internal
process. But, the exec events can still be missing pod info. Another
example is the test and ready events. These should not be bounced through
the cache and if they are we should get an error. Current implementation
though does not split up internal retries from podInfo retries. Also it
does not allow msg types to specify type specific handlers and instead
did some hard coded exception for exec.

This patch pulls the retry logic into two methods: one to catch missing
internal process and then a second to catch other state such as the
podInfo. Then now that we have a method we can throw exceptions if we
ever catch a cache retry from a test or other event.

Next up I'll add retries for parents, but this requires more specific
message handling so having methods here helps us to do better testing
and debugging as well as handle per message cases without the type
switch reflect logic we previously had.

Signed-off-by: John Fastabend [email protected]

Some objects should never hit retry logic for internal process lookups. One example here is the exec entries which are the source of the internal process. But, the exec events can still be missing pod info. Another example is the test and ready events. These should not be bounced through the cache and if they are we should get an error. Current implementation though does not split up internal retries from podInfo retries. Also it does not allow msg types to specify type specific handlers and instead did some hard coded exception for exec. This patch pulls the retry logic into two methods: one to catch missing internal process and then a second to catch other state such as the podInfo. Then now that we have a method we can throw exceptions if we ever catch a cache retry from a test or other event. Next up I'll add retries for parents, but this requires more specific message handling so having methods here helps us to do better testing and debugging as well as handle per message cases without the type switch reflect logic we previously had. Signed-off-by: John Fastabend <[email protected]>

jrfastab · 2022-08-11T05:40:06Z

pkg/grpc/exec/exec.go

+	args := p.Arguments
+	nspid := msg.Process.NSPID
+
+	if option.Config.EnableK8s && nspid > 0 {


@tpapagian this is interesting, the tests are failing because I added this 'nspid > 0' clause. My logic was if we don't have a nspid then we are in the host pidns and so probably don't need to do a pod lookup. But on 2nd though if hostpid=true (is that even a thing) then we probably should still get a pod. So, I'll revert and replace it with 'contaienrId != 0' I think this should be more accurate. wdyt?

I was trying to avoid bouncing through k8s lookups if it wasn't needed.

Yes, this sounds reasonable. contaienrId != "" is also consistent with what we use inside Needed().

jrfastab requested a review from a team as a code owner August 11, 2022 00:43

jrfastab requested a review from kevsecurity August 11, 2022 00:43

jrfastab commented Aug 11, 2022

View reviewed changes

jrfastab force-pushed the interfaceRefactorCache branch from d056e86 to 3942d0d Compare August 11, 2022 05:41

tpapagian approved these changes Aug 11, 2022

View reviewed changes

jrfastab merged commit db590a2 into main Aug 11, 2022

jrfastab deleted the interfaceRefactorCache branch August 11, 2022 16:57

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

tetragon: refactor eventCache retry logic #310

tetragon: refactor eventCache retry logic #310

jrfastab commented Aug 11, 2022

jrfastab Aug 11, 2022

jrfastab Aug 11, 2022

tpapagian Aug 11, 2022

tetragon: refactor eventCache retry logic #310

tetragon: refactor eventCache retry logic #310

Conversation

jrfastab commented Aug 11, 2022

jrfastab Aug 11, 2022

Choose a reason for hiding this comment

jrfastab Aug 11, 2022

Choose a reason for hiding this comment

tpapagian Aug 11, 2022

Choose a reason for hiding this comment