-
Notifications
You must be signed in to change notification settings - Fork 380
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
allow for concurrent listing of tracing policies #2506
Conversation
✅ Deploy Preview for tetragon ready!
To edit notification comments on pull requests, go to your Netlify site configuration. |
4fb2f68
to
47690ce
Compare
00de621
to
56bd4bc
Compare
56bd4bc
to
8aa738c
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks, looks good to me, really like the testing with the fake delayed sensor.
As we discussed offline previously, this makes the code significantly more complex indeed, especially with the manual locking parts with the loading and unloading parts. Touching this code would require people to read the git history to understand, maybe we can add more immediate context by commenting a bit more. But the presence of explicit locks is already explicit comments in some ways.
Previously, the value type of the collection map in the handler was "collection", i.e., a non-pointer type. This patch changes it to *collection to avoid copies. This will allow for some code simplifications which will come in later patches. Signed-off-by: Kornilios Kourtis <[email protected]>
The previous patch modified the collection map to hold pointers. Hence, there is no need to update the map, we can just update the collection. Signed-off-by: Kornilios Kourtis <[email protected]>
This patch creates a new structure collectionMap which contains the map for the collections and a lock. The intention is to implement a read-only method to list tracig policies that does not have to go via the manager goroutine. Signed-off-by: Kornilios Kourtis <[email protected]>
Add a listPolicies method in collectionMap. This is meant to be used outside of the manager goroutine. Signed-off-by: Kornilios Kourtis <[email protected]>
This patch modifies the Manager ListTracingPolicies to not go via the manager goroutine, and, instead directly accesses the collection map. This will allow listing the policies without having to wait for BPF program loads to finish. Note that ListTracingPolicies is used both from the gRPC server and the metrics. Signed-off-by: Kornilios Kourtis <[email protected]>
Add a loading and unloading state for tracing policies. It will be used in subsequent patches for listing policies during loading/unloading. Signed-off-by: Kornilios Kourtis <[email protected]>
Signed-off-by: Kornilios Kourtis <[email protected]>
This patch modifies addTracingPolicy to release the lock to the collections while loading a sensor. Before releasing the lock, the state is changed to LoadingState. Hence, while loading the sensors of a policy, it is now possible to list the collections. Note that we assume that collections cannot be modified, i.e., the state of a collection cannot change while loading. Above is achieved by only alllowing read access outside of the goroutine. Signed-off-by: Kornilios Kourtis <[email protected]>
The unload (load) operation only makes sense for enabled (disabled) collections. Add the proper checks. Signed-off-by: Kornilios Kourtis <[email protected]>
Instead of destroying the sensors of a policy and then removing them from the collection, remove them first and then destroy them. Once we remove them, we can release the collection lock that allows listing of the policies. Signed-off-by: Kornilios Kourtis <[email protected]>
This patch modifies disableTracingPolicy to release the lock to the collections while unloading a sensor. Before releasing the lock, the state is changed to UnloadingState. Signed-off-by: Kornilios Kourtis <[email protected]>
This patch modifies disableTracingPolicy to release the lock to the collections while enabling a sensor. Before releasing the lock, the state is changed to LoadingState. Signed-off-by: Kornilios Kourtis <[email protected]>
current code use a struct Sensor, which makes testing the sensor manager code hard. This patch introduces a sensor interface. Subsequent patches will use the interface for testing. Signed-off-by: Kornilios Kourtis <[email protected]>
This patch tests the intermediate states that were introduced in the previous patches. Specifically, it tests whether ListTracingPolicies call observes loading/unloading states for the corresponding sensors. It does so by a TestDelayedSensor that implements the SensorIface interface. Signed-off-by: Kornilios Kourtis <[email protected]>
8aa738c
to
e5f3585
Compare
After merging #2506 it seems that there are cases where 10 * 1 sec is not enough when waiting for a tracing policy to load (mainly in e2e tests). This patch increases that to 20 * 1 sec to avoid timeouts. Signed-off-by: Anastasios Papagiannis <[email protected]>
After merging #2506 it seems that there are cases where 10 * 1 sec is not enough when waiting for a tracing policy to load (mainly in e2e tests). This patch increases that to 20 * 1 sec to avoid timeouts. Signed-off-by: Anastasios Papagiannis <[email protected]>
Fixes: #2429
This PR modifies the sensor manager to allow listing of tracing policies when sensors are loaded/unloaded.
Some BPF programs (and their corresponding sensors) take a long time to load. In the existing implementation, listing policies is serialized with loading/unloading so listing policies has to wait for any loading/unloading operations to complete.
Because tracing policies are listed as part of the metrics, 5ce0f71 introduced a timeout for listing tracing policies, which leads to this timeout being hit: #2429.
This PR addresses this issue by introducing new states for policies (loading/unloading), and modifying the sensor manager to allow for concurrent listing of policies with loading/unloading. To do so, we add a RW lock and execute listing outside of the manager goroutine.
The PR also introduces a sensor interface that allows for better testing. The PR uses this interface to test the newly added functionality.
Please see patches for more details.