Track notifications from the Auth service #560
Conversation
Causes failures in cluster-manager on initial install. The failures are caused by subscriptions to the ConfigDB being refused. I thought this was to do with permissions being cached inappropriately in RxClient.Auth, but this doesn't appear to be the cause. The failures go away once auth and configdb have been restarted.
The cause here is that adding a new principal UPN mapping is not causing the permissions for that principal to be correctly recalculated internally. This problem lies entirely inside the Auth service. The ConfigDB is performing a WATCH on […]. This can be reliably reproduced by: […]
This overrides the core acl-fetch method in the Auth interface with one which subscribes to notifications. This gives us immediate updates when permissions change. We subscribe to the ACL for a given principal for half an hour; if we have no more requests in that time the subscription will be dropped.
This gives push authentication.
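A minimal sketch of the shape of this, assuming an RxJS-style notify source; `watch_acl`, the class name and the cache layout are illustrative, not the real RxClient code:

```js
import { ReplaySubject, share, timer } from "rxjs";

/* Sketch only: one cached ACL observable per principal. `notify.watch_acl`
 * is a hypothetical method which emits the current ACL and then every
 * subsequent change pushed by the Auth service. */
class AuthNotify {
    constructor (notify) {
        this.notify = notify;
        this.acls = new Map();      // principal UUID -> Observable<ACL>
    }

    acl_for (principal) {
        let acl = this.acls.get(principal);
        if (!acl) {
            acl = this.notify.watch_acl(principal).pipe(
                share({
                    /* Replay the latest ACL to new subscribers. */
                    connector: () => new ReplaySubject(1),
                    /* Keep the upstream subscription alive for half an
                     * hour after the last consumer goes away; after that
                     * it is dropped and re-established on the next request. */
                    resetOnRefCountZero: () => timer(30 * 60 * 1000),
                }));
            this.acls.set(principal, acl);
        }
        return acl;
    }
}
```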
Code which was moved from acs-configdb to js-service-api was not moved across correctly.
Skip the Directory for now as that needs direct access to the raw ACL.
The Directory needs access to the raw ACL, so expose this as a public method. It's cleaner to override a public method in any case.
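A sketch of how that layering could look, building on the cache sketched above; the constructor arguments and the endpoint are assumptions, not the real interface:

```js
import { firstValueFrom } from "rxjs";

/* Sketch only: the base interface keeps the raw ACL fetch public so the
 * Directory can call it directly; the notify-based subclass overrides it
 * to answer from the tracked ACL rather than making an HTTP request. */
class Auth {
    constructor (fetch) { this.fetch = fetch; }                  // assumed HTTP helper

    async fetch_acl (principal) {
        const res = await this.fetch(`authz/acl/${principal}`);  // assumed endpoint
        return res.ok ? await res.json() : null;
    }
}

class AuthWithNotify extends Auth {
    constructor (fetch, acl_cache) {
        super(fetch);
        this.acl_cache = acl_cache;      // the AuthNotify cache from the earlier sketch
    }

    async fetch_acl (principal) {
        /* Resolve the current value of the tracked ACL observable. */
        return await firstValueFrom(this.acl_cache.acl_for(principal));
    }
}
```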
This should be more reliable as the Sparkplug notifications do not sync well with the HTTP caching.
The ServiceClient already had correct code to return a 503 when we got permission denied from the Auth service. Use this instead of overriding it. Switch to Response in place of Maybe as this preserves the HTTP response code.
I even created a sequence function especially for this purpose...
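Roughly the difference, with illustrative helper names rather than the real ServiceClient API:

```js
/* Sketch: a Maybe-style fetch collapses every failure to null, so the
 * caller cannot tell "this principal has no ACL" from "the Auth service
 * refused us or was unavailable". Returning the response keeps the status. */
async function fetch_acl_maybe (fetch, principal) {
    const res = await fetch(`authz/acl/${principal}`);
    return res.ok ? await res.json() : null;            // status code is lost
}

async function fetch_acl_response (fetch, principal) {
    const res = await fetch(`authz/acl/${principal}`);
    return {
        status: res.status,                              // status code preserved
        acl:    res.ok ? await res.json() : null,
    };
}

/* The caller can now tell the difference and answer its own client
 * appropriately: permission denied from Auth means we could not establish
 * the client's permissions, not that the client has none. */
async function acl_for_request (fetch, principal) {
    const { status, acl } = await fetch_acl_response(fetch, principal);
    if (status == 401 || status == 403 || status == 503)
        return { status: 503 };
    return { status: 200, acl: acl ?? [] };
}
```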
Rationalise our error returns in ServiceClient so that we only return ServiceError for a problem with the remote service. Ordinary parameter errors now throw other error types. This means that a ServiceError thrown during processing of a request should always result in a 503 response. Remove the hack in Auth.fetch_acl which threw a ServiceError with a 503 status.
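For example, with an Express-style error handler (the import location and the framework wiring here are assumptions, not the actual code):

```js
import express from "express";
import { ServiceError } from "@amrc-factoryplus/service-client";  // assumed export location

const app = express();
// ... routes registered here ...

/* Because ServiceError is now only thrown for a problem with a remote
 * service, any ServiceError escaping a handler maps cleanly onto 503;
 * parameter and programming errors fall through to the default handler. */
app.use((err, req, res, next) => {
    if (err instanceof ServiceError)
        return res.status(503).json({ message: err.message });
    next(err);
});
```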
The original logic here did not retry failed updates and left this to the 10-minute reconcile loop. Changing it to retry individual actions causes a retrying update to block all other updates, including an update to delete the cluster. Instead, group actions by cluster, attach retry logic to each action, and then switch through the actions for each cluster. This ensures we abandon a retrying action if we should be doing something else instead.
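In RxJS terms the shape is roughly this; the `cluster` and `apply` fields are placeholders for whatever the real reconcile actions carry:

```js
import { defer, groupBy, mergeMap, retry, switchMap, timer } from "rxjs";

/* Sketch: `actions` is an Observable of { cluster, apply } objects.
 * Grouping by cluster means a retrying action only holds up its own
 * cluster; switchMap means a newer action for the same cluster cancels
 * and replaces a retry that is still in progress. */
const apply_actions = actions => actions.pipe(
    groupBy(act => act.cluster),
    mergeMap(for_cluster => for_cluster.pipe(
        switchMap(act => defer(() => act.apply()).pipe(
            /* Retry this action after a delay until it succeeds, or until
             * switchMap abandons it in favour of a newer action. */
            retry({ delay: () => timer(30_000) }))))));
```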
Instead of duck-typing use a proper class to distinguish between 'a service we contacted returned an error', 'we need to return a specific error' and other failures which happen to result in an error object with a `status` property.
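Spelled out as a sketch; only `ServiceError` is a name taken from this PR, the rest is illustrative:

```js
/* A service we contacted returned an error or was unreachable. */
class ServiceError extends Error {
    constructor (service, message, status) {
        super(message);
        this.service = service;     // which remote service failed
        this.status = status;       // the status it gave us, if any
    }
}

/* We have decided to return a specific HTTP status to our own caller.
 * (Illustrative name; the real class may be called something else.) */
class ExplicitResponse extends Error {
    constructor (status, message) {
        super(message);
        this.status = status;
    }
}
```

Request handlers can then use `instanceof` checks rather than testing for a `status` property, so an unrelated error which happens to carry `status` is no longer treated specially.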
This is not fully tested yet.
Add an Auth interface to RxClient. This tracks ACLs via change-notify, caching and tracking the ACL for a given principal for half an hour. Calls to the existing Auth interface methods will now go via this cache instead of the HTTP request cache.
Update existing Auth clients to use RxClient. In most cases this is all that is required to support change-notify. In some cases additional updates to use the ConfigDB notify interface are also included.
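As a rough usage sketch (the package name, constructor options and placeholder UUID are assumptions):

```js
import { RxClient } from "@amrc-factoryplus/rx-client";    // assumed package name

const fplus = new RxClient({ env: process.env });          // constructor options assumed

/* A hypothetical principal UUID for the sketch. */
const principal = "c9b55acb-ca4b-4a39-8b14-d67fbe24c26e";

/* With the notify-backed interface this is answered from the tracked ACL,
 * which updates as soon as the Auth service pushes a change, rather than
 * from an HTTP cache entry which may be stale. */
const acl = await fplus.Auth.fetch_acl(principal);
```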