-
Notifications
You must be signed in to change notification settings - Fork 12
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
add named role support #202
Conversation
3d57e8a
to
80dde1f
Compare
ed155e2
to
d9d8992
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Dropping some early comments here regarding the Helm chart. I'll get to the code next.
@@ -0,0 +1,70 @@ | |||
{{- if has .Values.config.crdb.migrateHook (list "pre-sync" "manual") }} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Do we anticipate needing or wanting to support these three kinds of migration workflows?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
manual is perhaps not needed. but pre-sync and init are useful.
pre-sync can't be used during the initial deployment as it will rely on configs and secrets that have to be synced first, and therefore won't be available. so you must use the init method. but after using the init method, you can switch to the pre-sync method which will ensure migrations only execute once per sync.
if having these options is not preferred we could get rid of it and just use init containers.
{{- else }} | ||
generateName: migrate-database- | ||
annotations: | ||
argocd.argoproj.io/hook: PreSync |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In general I think it's good to not assume any user of this is using Argo specifically.
# migrateHook sets when to run database migrations. one of: pre-sync, init, manual | ||
# - pre-sync: hook runs as a job before any other changes are synced. | ||
# - init: is run as an init container to the server deployment and may run multiple times if replica count is high. | ||
# - manual: a migrate-database job will be available to triggered manually | ||
migrateHook: "init" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Same comment re: Argo. I do like the concept, though!
# password is the auth password to the database | ||
password: "" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This seems like something we shouldn't even let users set manually, right?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I usually like to expose all available configuration options, which this is an option that can be set in the crdbx config which is why I added it here.
# uri is the raw uri connection string | ||
uri: "" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Same here re: sensitive data.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
re same as above
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Some thoughts.
internal/query/relations.go
Outdated
dbTx, err := e.db.DeleteRoleTransaction(ctx, roleResource.ID) | ||
if err != nil { | ||
// If the role doesn't exist, simply ignore. | ||
if !errors.Is(err, database.ErrNoRoleFound) { | ||
return err | ||
} | ||
} else { | ||
// Setup rollback in case an error occurs before we commit. | ||
defer dbTx.Rollback() //nolint:errcheck // error is logged in function | ||
} | ||
|
||
for _, filter := range filters { | ||
if err = e.deleteRelationships(ctx, filter); err != nil { | ||
return fmt.Errorf("failed to delete role action %s: %w", filter.OptionalResourceId, err) | ||
err = fmt.Errorf("failed to delete role action %s: %w", filter.OptionalResourceId, err) | ||
|
||
span.RecordError(err) | ||
span.SetStatus(codes.Error, err.Error()) | ||
|
||
return err | ||
} | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What happens if I attempt to update a role with new actions while it's being deleted? Reading this code, it looks possible that I wind up with relationships in SpiceDB but no corresponding CRDB entries. This might also, ironically, require a SELECT ... FOR UPDATE
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this is now handled properly as we always have a transaction open before on the database. if two changes came in at the same time, one would have to wait until the database released the row.
1f15a23
to
0a6b4ca
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I haven't looked at the tests yet but this looks broadly good. Some minor points below.
internal/api/roles.go
Outdated
// check on the role resource. | ||
resource, err := r.engine.GetRoleResource(ctx, roleResource) | ||
if err != nil { | ||
return echo.NewHTTPError(http.StatusBadRequest, "error getting resource").SetInternal(err) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Does this make sense as a 400? If the resource wasn't found, we probably want to return a 404, and if something else failed then to return a 500.
internal/api/types.go
Outdated
Actions []string `json:"actions"` | ||
|
||
ResourceID gidx.PrefixedID `json:"resource_id,omitempty"` | ||
Creator gidx.PrefixedID `json:"creator"` |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I saw the earlier comment around this got resolved but it looks like I missed this one. As a side note, we could also make this CreatedBy
/created_by
instead; Creator
/creator
just feels off.
We also probably want a field denoting who last updated the role too.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
whoops, good catch.
internal/query/mock/mock.go
Outdated
} | ||
|
||
return role, nil | ||
} | ||
|
||
// UpdateRole returns nothing but satisfies the Engine interface. | ||
func (e *Engine) UpdateRole(ctx context.Context, actor, roleResource types.Resource, newName string, newActions []string) (types.Role, error) { | ||
return types.Role{}, nil |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm increasingly wary of mock interfaces (even though I'm pretty sure I wrote this one), but for the sake of completeness it likely makes sense to return the updated fields here.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
yeah fair nuff. i'll update this to use the mock support for returning values.
internal/query/relations.go
Outdated
// If the name is the same, then clear the changed name. | ||
if role.Name == newName { | ||
newName = "" | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We can probably just keep the name and pass it right through. It saves us a little complexity and ultimately almost all of the time is going to be spent in the DB regardless.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
good point. i'll update it to instead, if no name is provided (for example, when just updating actions) it will populate the name from the existing record so we can keep the query simple.
internal/query/relations.go
Outdated
if err := e.store.CommitContext(dbCtx); err != nil { | ||
span.RecordError(err) | ||
span.SetStatus(codes.Error, err.Error()) | ||
|
||
logRollbackErr(e.logger, e.store.RollbackContext(dbCtx)) | ||
|
||
return types.Role{}, err | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Same question here re: rolling back SpiceDB relationships. I'm not saying we need to roll them back, but behavior here should be defined, preferably as comments.
internal/storage/errors.go
Outdated
|
||
func pqIsRoleAlreadyExistsError(err error) bool { | ||
if pqErr, ok := err.(*pq.Error); ok { | ||
return pqErr.Code.Name() == "unique_violation" && pqErr.Constraint == "roles_pkey" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm a little suspicious of tying our error detection logic to constraint names in the DB. Is there a better way to do this?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm not sure of a better way to handle error detection. Since we own the database and migrations, we know the names of the indexes. We could make these more configurable or make them discovered on start but I'm not sure that's necessary at this point.
Alternatively we could just return less descriptive errors but I would like to avoid that.
internal/storage/errors.go
Outdated
|
||
func pqIsRoleNameTakenError(err error) bool { | ||
if pqErr, ok := err.(*pq.Error); ok { | ||
return pqErr.Code.Name() == "unique_violation" && pqErr.Constraint == "roles_resource_id_name" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Same here re: constraints.
internal/storage/roles.go
Outdated
// If no name is provided, only update the updated_at timestamp. | ||
if name == "" { | ||
row = tx.QueryRowContext(ctx, ` | ||
UPDATE roles SET updated_at = now() WHERE id = $1 | ||
RETURNING id, name, resource_id, creator_id, created_at, updated_at | ||
`, roleID.String()) | ||
} else { | ||
row = tx.QueryRowContext(ctx, ` | ||
UPDATE roles SET name = $1, updated_at = now() WHERE id = $2 | ||
RETURNING id, name, resource_id, creator_id, created_at, updated_at | ||
`, name, roleID.String(), | ||
) | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
As before, we can probably make this less complex by just accepting whatever role name we get.
11fa1d6
to
291ad8e
Compare
internal/query/relations.go
Outdated
|
||
logRollbackErr(e.logger, e.store.RollbackContext(dbCtx)) | ||
|
||
// TODO: add spicedb rollback logic. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What happens without this?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The role actions are persisted (and will be visible to the user), but role metadata changes are not. While we could attempt to roll back the changes to SpiceDB, that operation itself could fail, meaning we would be in the same situation regardless.
We should probably spell that out here in the code, though.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for that clarification, I'll get a version of this added to the code.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Logic looks good - mostly some notes on documentation and tests.
internal/query/relations.go
Outdated
old := make(map[string]bool, len(oldActions)) | ||
new := make(map[string]bool, len(newActions)) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
No need to change this, but if you want to save a few bytes on the heap, struct{}
can be used here along with the _, ok := myMap["some_key"]
pattern.
internal/query/relations.go
Outdated
|
||
logRollbackErr(e.logger, e.store.RollbackContext(dbCtx)) | ||
|
||
// TODO: add spicedb rollback logic. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The role actions are persisted (and will be visible to the user), but role metadata changes are not. While we could attempt to roll back the changes to SpiceDB, that operation itself could fail, meaning we would be in the same situation regardless.
We should probably spell that out here in the code, though.
internal/query/relations.go
Outdated
relationships, err := e.readRelationships(ctx, filter) | ||
if err != nil { | ||
return nil, err | ||
} | ||
|
||
out := relationshipsToRoles(relationships) | ||
|
||
rolesByID := make(map[string]storage.Role, len(dbRoles)) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Maybe a silly question, but why not just have a map[gidx.PrefixedID]storage.Role
if we just convert every key to a string?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
🤔 that's a good question... i'm not sure why I did that... It definitely doesn't need to be converted. Thanks!
span.RecordError(err) | ||
span.SetStatus(codes.Error, err.Error()) | ||
|
||
logRollbackErr(e.logger, e.store.RollbackContext(dbCtx)) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We should probably document what happens here too if this operation fails. The delete from the DB will fail and be rolled back, but the role will have no associated actions, which is probably fine.
That said, it looks like maybe roles can be deleted even if they have assignments? If that's the case, that's not great, but let's solve that in a follow-up PR rather than make this one bigger since it's not new behavior.
internal/query/relations_test.go
Outdated
Name: "UpdateMissingRole", | ||
Input: gidx.MustNewID(RolePrefix), | ||
CheckFn: func(ctx context.Context, t *testing.T, res testingx.TestResult[types.Role]) { | ||
assert.Error(t, res.Err) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We probably want to check what the error is here, and not just that an error occurred.
internal/query/relations_test.go
Outdated
Input: role.ID, | ||
CheckFn: func(ctx context.Context, t *testing.T, res testingx.TestResult[types.Role]) { | ||
assert.NoError(t, res.Err) | ||
require.Equal(t, "test2", res.Success.Name) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It'll be good to also check the other attributes here: update time, actions, and so on.
if !testCase.Sync { | ||
t.Parallel() | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We almost certainly want to ensure our test cases are run concurrently. This is a good way to detect race conditions in Go code.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Would it make sense to store the role actions in CRDB and anoint that as a source of truth for role definitions? If I were the one on PagerDuty rotation for this system I'd be pretty nervous about the brittleness of coordinating transactions between different databases and the risk of things getting into an inconsistent state, and having a single source of truth at least gives you a starting point for restoring the system to a known-good state when the need arises.
-- create "roles" table | ||
CREATE TABLE "roles" ( | ||
"id" character varying NOT NULL, | ||
"name" character varying(64) NOT NULL, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What's the reasoning for setting this limit at the storage level? In my experience, doing this is just setting yourself up for running future database migrations to increase the limit - 64 characters is pretty low for something you intend to expose to users.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
64 characters seems to be a common limit. This field is part of a unique index and needs to have a limit to ensure performance.
@alexhall I missed this before, sorry. I might be wrong here, but if we have actions stored in CRDB it doesn't seem like that fixes the problems that come from coordinating transactions between both systems. If a commit fails in CRDB but we already persisted actions to SpiceDB, for example, we'll now be in an inconsistent state (i.e., SpiceDB will have a different view of actions in a role from what CRDB shows). As written, we at least keep the actions in only one place. |
In this situation, I wouldn't be writing to SpiceDB until after the write to CRDB succeeds. If the write to SpiceDB fails, that can be retried, and possibly even reconciled in the background, but at least I'd know that I have persisted in CRDB a consistent view of how the roles and actions should be. In other words, treat SpiceDB as a query-optimized view used for evaluating access checks, accept that data will get out of sync between CRDB and SpiceDB, and optimize for your ability to recover from that situation. I don't know how well that aligns with your existing architecture, but that's what jumps out to me at first glance when looking at this PR. |
efd9772
to
784860c
Compare
I think this idea has merit, though perhaps for different reasons: storing the actions in CRDB lets us have more flexibility around how we present them (i.e., changing the action names, adding descriptions, etc). That said, would it be worth putting this in a separate PR? This one is already at ~2k LOC and we probably want to give more consideration to how we define actions in general for reasons beyond DB transactions. |
Database package along with migrations and migrate command giving permissions-api it's own database to store details into. Initial support is for Roles Get, Create, Update and Delete. Signed-off-by: Mike Mason <[email protected]>
This integrates the new database which contains role metadata into the query engine as well as updates the http api to expose this new information. Signed-off-by: Mike Mason <[email protected]>
Signed-off-by: Mike Mason <[email protected]>
Signed-off-by: Mike Mason <[email protected]>
Updated CreateRole, UpdateRole and DeleteRole to CreateRoleTransaction, UpdateRoleTransaction and DeleteRoleTransaction to make it more clear that a transaction is being started. Additionally, the comments on these methods have been updated to include statements that Commit or Rollback must be called to ensure the database lifts all locks on rows which are affected. Previously, the method names made it appear as though the action was taken and completed. However this could lead to hung connections and rows. The new names make it clear that a new transaction is being started. The returning struct only has two methods Commit and Rollback in addition to the Record attribute resulting in a simple structure. Signed-off-by: Mike Mason <[email protected]>
Signed-off-by: Mike Mason <[email protected]>
Signed-off-by: Mike Mason <[email protected]>
Signed-off-by: Mike Mason <[email protected]>
784860c
to
ecd30fe
Compare
…roperly Signed-off-by: Mike Mason <[email protected]>
Since we're working with multiple backends, this allows us to place a lock early and ensure a separate request doesn't conflict with an in-flight change. Signed-off-by: Mike Mason <[email protected]>
de5f8a2
to
da03164
Compare
Signed-off-by: Mike Mason <[email protected]>
da03164
to
f2d105f
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks good - at this point I think we're at least in a good enough state to end-to-end test this and see how it shakes out before cutting a release.
This adds support for giving roles names, along with tracking additional data such as the creator, created at and updated at.
To support this, a small database package has been created to handle interacting with the backend database and simplifying changes.
In addition to the database package, the http api has changed to support these new features.
POST /api/v1/resources/:id/roles
now requires aname
to be provided.PATCH /api/v1/roles/:id
has been added with support for updating an existing roles name and actions.