Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Experimenter group cache #166

Draft
wants to merge 6 commits into
base: master
Choose a base branch
from
Draft

Conversation

kkoz
Copy link
Contributor

@kkoz kkoz commented Sep 13, 2023

Fixes #165.
Creates a simple in-memory cache which contains all the relevant data in the ExperimenterGroup and ExperimenterGroupMap tables to perform the isRelatedUser check.
Some things that still need to be implemented/fixed:

  • I'm not entirely sure what to do if the cache fails to update and becomes invalid. Some options I could think of:
    1. Log an error, mark the cache as "invalid" and fall back on using the DB for everything.
    2. Perform 1. But also try to do a full cache refresh every N minutes and if it succeeds mark the cache as valid again.
    3. Log an error but continue using the cache, and do a full cache refresh every N minutes
    4. Propagate the error
      The problem with 1 and 2 is that things will continue to "work" even though something has gone wrong. Without checking the logs it's possible nobody will ever notice. With 2, the performance might improve again after the refresh, but if that fails too it's unclear how anyone would determine that something needed to be fixed. The problem with 3 is that for a period the user will either get data they shouldn't have access to or fail to get data they should, but it's not essential data for OMERO to run. Similarly to 1 and 2 it's not clear how we would find out an error had happened on a system which entered such a state. 4 is probably a bad idea but we would learn about the error quickly. Is there some way within the server to basically relay a message to the QA endpoint?
  • I noticed that if you change a user to or from a group owner, it takes awhile for the event context to be updated and until then you may see incorrect responses when getting user data. (Still need to fix this)
  • Need to write javadocs for everything

Testing:

There are a few different scenarios which need to be tested:

  • Getting experimenter data as admin or group owner (should always succeed)
  • Getting experimenter data for a system user (should always succeed)
  • Getting experimenter data as a fellow group member (non-owner) of a group which is not private (should succeed)
  • Getting experimenter data as a fellow group member (non-owner) of a group which is private (should fail)
  • Getting experimenter data for a user with no groups in common (should fail)
  • Creating a group should update the cache
  • Creating a user should update the cache
  • Adding a user to a group should update the cache
  • Removing a user from a group should update the cache
  • Changing group permissions should update the cache

@kkoz kkoz requested review from chris-allan and sbesson September 13, 2023 21:02
@chris-allan
Copy link
Member

Updating a group's permissions (e.g. from Private to ReadOnly) does not seem to create an event with entity of type ExperimenterGroup as I expected. It does create one with an entity of Session, but I don't think we want to update the cache every time we get a Session event. I haven't figured out why this is or what to do about it.

This is because modifying a group's permissions is performed by the Chmod2 infrastructure 1 which does not use Hibernate entities to perform changes to the object graph. When a Chmod2 request completes processing of an object it publishes an EventLogMessage:

https://github.com/ome/omero-blitz/blob/master/src/main/java/omero/cmd/graphs/Chmod2I.java#L344-L352

https://github.com/ome/omero-blitz/blob/master/src/main/java/omero/cmd/graphs/GraphHelper.java#L232-L240

This can be picked up if the PermissionsCacheEventListener were to implement ApplicationListener<EventLogMessage>.

I'm not entirely sure what to do if the cache fails to update and becomes invalid. Currently, I just flag it as invalid and it is ignored until it updates successfully. It's probably better than the server going down, but it also means that it's possible your server will experience a sudden decrease in performance and you have to go to the logs to figure out why.

Thinking about this a little more I'm now of the opinion that we need to react to changes by making incremental updates that synchronize database state with in memory state. Invalidating the entire cache whenever any of these objects are modified is not a viable solution. If 50 users are added to a group we cannot be refreshing the entire cache 50 times; that's not even an extreme use case.

At least we have the entities in hand for everything but the permissions change so should be able to update things very quickly without roundtripping to the database.

I noticed that if you change a user to or from a group owner, it takes awhile for the event context to be updated and until then you may see incorrect responses when getting user data.

Based on this and what I've mentioned above I don't think the EventContext is a good source of information. It should likely not be trusted for any of this work and instead the correct queries performed in its place.

Footnotes

  1. https://omero.readthedocs.io/en/stable/developers/Server/GraphsMigration.html

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Checking Related User Performing Large Number of Queries
2 participants