Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Support bulk checks #146

Merged
merged 5 commits into from
Jul 26, 2023
Merged

Conversation

mikemrm
Copy link
Contributor

@mikemrm mikemrm commented Jul 24, 2023

This adds support for requesting access to multiple resources and actions.

@mikemrm mikemrm marked this pull request as ready for review July 24, 2023 17:51
@mikemrm mikemrm requested review from a team as code owners July 24, 2023 17:51
Comment on lines 102 to 111
type checkStatus struct {
Resource types.Resource
Action string
Error error
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think I'd break this up into two structs: A checkRequest containing the resource/action pair and a checkResult containing the checkRequest object plus an error.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Updated

return echo.NewHTTPError(http.StatusBadRequest, "invalid check request").SetInternal(multierr.Combine(errs...))
}

checkCh := make(chan *checkStatus)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So looking at this code, a channel of pointers seems risky. IMO we have two options here:

  • Spin up goroutines that receive off a <-chan checkRequest (see above) and send to a chan<- checkResponse
  • Use sync.WaitGroup and coordinate writes to a slice

The first option is more idiomatic Go, and I think is generally more robust against introducing data races. I'd recommend we move towards that approach. With that in mind, I'd recommend we make this a buffered channel, like make(chan checkResponse, len(reqBody.Actions)) and pre-populate the channel and close it before doing any work. That way we have a queue already primed for workers to consume from and don't have to do further coordination of work.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

makes sense, updated.


checkCh := make(chan *checkStatus)

wg := new(sync.WaitGroup)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Rather than use a sync.WaitGroup here, we can instead provide a channel for workers to send results to and iterate over that until we get back as many results as we expect, then close the channel to free resources.

Additionally, we can set up a new context using context.WithCancelCause, then when receiving results cancel the entire operation if we get any errors. SpiceDB seems pretty good at handling context propagation and terminating operations early so we can have things fail faster that way.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Removed WaitGroup and handle waiting with channels.

Comment on lines 216 to 217
case <-time.After(maxCheckDuration):
return echo.NewHTTPError(http.StatusInternalServerError, "checks didn't complete in time")
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Setting up a context using context.WithTimeout is more idiomatic here and lets us handle all errors in one place.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good point, updated.

@mikemrm mikemrm force-pushed the support-bulk-checks branch 2 times, most recently from 8ef3f26 to 511e91e Compare July 25, 2023 20:44
@mikemrm mikemrm requested a review from jnschaeffer July 25, 2023 20:50
Copy link
Contributor

@jnschaeffer jnschaeffer left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looking good - more thoughts.

Comment on lines 189 to 210
for check := range requestsCh {
result := &checkResult{
Request: check,
}

// Check the permissions
err := r.engine.SubjectHasPermission(ctx, subjectResource, check.Action, check.Resource)
if err != nil {
result.Error = err
}

// Check if doneCh has been closed, if so, don't write to resultsCh.
select {
case <-doneCh:
return
default:
}

resultsCh <- *result
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We can probably write this using the context's Done channel. Something like this will work:

for {
  select {
    case check := <-requestsCh:
      // do the thing
    case <-ctx.Done():
      result.Error = ctx.Err()
  }
}

This lets us collect all errors instead of terminating early, which we can use as shown below.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ah yeah, took me a minute but I see what you mean. Thanks!

Comment on lines 190 to 192
result := &checkResult{
Request: check,
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Seems like there's not a compelling reason for us to do &checkResult here since we dereference the value at the end of the loop body.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ah good call, fixed!

Comment on lines 218 to 251
go func() {
var count int

for result := range resultsCh {
count++

if result.Error != nil {
if errors.Is(result.Error, query.ErrActionNotAssigned) {
err := fmt.Errorf("%w: subject '%s' does not have permission to perform action '%s' on resource '%s'",
ErrAccessDenied, subject, result.Request.Action, result.Request.Resource.ID.String())

unauthorizedErrors++

allErrors = append(allErrors, err)
} else {
err := fmt.Errorf("check %d: %w", result.Request.Index, result.Error)

internalErrors++

allErrors = append(allErrors, err)
}

close(doneCh)
close(resultsCh)

return
}

if count == len(reqBody.Actions) {
close(doneCh)
close(resultsCh)
}
}
}()
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Rather than ranging over the channel, if we know how many results we expect back we should be able to just read those. Something like this:

for i := 0; i < numChecks; i++ {
  select {
    case result := <-resultsCh:
      // check for errors
    case <-ctx.Done():
      // context error handling
  }
}

Doing this would let us take out doneCh completely and remove the need to process results in a separate goroutine from the rest of the handler.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ah yes, i see what you mean. it will just populate the errors with the same context error for all the ones which were cancelled. makes sense!

- Splits up the Request and Results
- Switch to using context.WithTimeout instead of time.After to ensure context is cancelled
- Replaces WaitGroup with looping through the known count and logging all errors

Signed-off-by: Mike Mason <[email protected]>
Copy link
Contributor

@jnschaeffer jnschaeffer left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good! Two small things - neither are blocking.

Comment on lines +193 to +196
// if channel is closed, quit the go routine.
if !ok {
return
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This makes sense but I'm not quite sure if it's necessary since we call defer cancel() above. Could be wrong though.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It is required, as we close the requests channel, which results in selecting the case but with the ok set to false. which resulted in empty results (default checkRequest value) being checked.

Comment on lines +248 to +256
if internalErrors != 0 {
return echo.NewHTTPError(http.StatusInternalServerError, "an error occurred checking permissions").SetInternal(multierr.Combine(allErrors...))
}

if unauthorizedErrors != 0 {
msg := fmt.Sprintf("subject '%s' does not have permission to the requested resource actions", subject)

return echo.NewHTTPError(http.StatusForbidden, msg).SetInternal(multierr.Combine(allErrors...))
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not sure if we need/want to also call RecordError or SetStatus here: https://pkg.go.dev/go.opentelemetry.io/otel/trace#Span

You'd need to check what the spans currently look like for that. Not a blocker though.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

will track it and dig into it.

@mikemrm mikemrm merged commit 53a5013 into infratographer:main Jul 26, 2023
@mikemrm mikemrm deleted the support-bulk-checks branch July 26, 2023 15:13
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants