Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

tetragon: refactor eventCache retry logic #310

Merged
merged 1 commit into from
Aug 11, 2022
Merged

Conversation

jrfastab
Copy link
Contributor

Some objects should never hit retry logic for internal process lookups. One
example here is the exec entries which are the source of the internal
process. But, the exec events can still be missing pod info. Another
example is the test and ready events. These should not be bounced through
the cache and if they are we should get an error. Current implementation
though does not split up internal retries from podInfo retries. Also it
does not allow msg types to specify type specific handlers and instead
did some hard coded exception for exec.

This patch pulls the retry logic into two methods: one to catch missing
internal process and then a second to catch other state such as the
podInfo. Then now that we have a method we can throw exceptions if we
ever catch a cache retry from a test or other event.

Next up I'll add retries for parents, but this requires more specific
message handling so having methods here helps us to do better testing
and debugging as well as handle per message cases without the type
switch reflect logic we previously had.

Signed-off-by: John Fastabend [email protected]

Some objects should never hit retry logic for internal process lookups. One
example here is the exec entries which are the source of the internal
process. But, the exec events can still be missing pod info. Another
example is the test and ready events. These should not be bounced through
the cache and if they are we should get an error. Current implementation
though does not split up internal retries from podInfo retries. Also it
does not allow msg types to specify type specific handlers and instead
did some hard coded exception for exec.

This patch pulls the retry logic into two methods: one to catch missing
internal process and then a second to catch other state such as the
podInfo. Then now that we have a method we can throw exceptions if we
ever catch a cache retry from a test or other event.

Next up I'll add retries for parents, but this requires more specific
message handling so having methods here helps us to do better testing
and debugging as well as handle per message cases without the type
switch reflect logic we previously had.

Signed-off-by: John Fastabend <[email protected]>
@jrfastab jrfastab requested a review from a team as a code owner August 11, 2022 00:43
@jrfastab jrfastab requested a review from kevsecurity August 11, 2022 00:43
args := p.Arguments
nspid := msg.Process.NSPID

if option.Config.EnableK8s && nspid > 0 {
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@tpapagian this is interesting, the tests are failing because I added this 'nspid > 0' clause. My logic was if we don't have a nspid then we are in the host pidns and so probably don't need to do a pod lookup. But on 2nd though if hostpid=true (is that even a thing) then we probably should still get a pod. So, I'll revert and replace it with 'contaienrId != 0' I think this should be more accurate. wdyt?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I was trying to avoid bouncing through k8s lookups if it wasn't needed.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, this sounds reasonable. contaienrId != "" is also consistent with what we use inside Needed().

@jrfastab jrfastab force-pushed the interfaceRefactorCache branch from d056e86 to 3942d0d Compare August 11, 2022 05:41
@jrfastab jrfastab merged commit db590a2 into main Aug 11, 2022
@jrfastab jrfastab deleted the interfaceRefactorCache branch August 11, 2022 16:57
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants