Automatic Subject/Object encoding #792

g13013 · 2021-12-05T23:07:34Z

Preflight checklist

I could not find a solution in the existing issues, docs, nor discussions.
I agree to follow this project's Code of Conduct.
I have read and am following this repository's Contribution Guidelines.
This issue affects my Ory Cloud project.
I have joined the Ory Community Slack.
I am signed up to the Ory Security Patch Newsletter.

Describe your problem

We are willing to use Keto in our stack as we think it's a perfect fit for our projects. We've been following the progress of Keto since the earliest versions. Before version 0.6 there was an AWS-like implementation I believe, it was convenient for us as our implementations are also inspired by AWS techs, for example, we are using URNs in our applications to reference objects between services, the change in recent Keto specs are fine except one thing, Objects/Subjects naming limitations. forcing every application to translate objects/subjects usually referenced by ARN's to UUIDs or so is far from being ideal. We believe there is a little change in Keto that could remove this limitation without forcing applications to obscure their implementation to overcome this limitation.

Describe your ideal solution

In fact, this limitation could easily be addressed in Keto itself. Keto could do the encoding/decoding of the object/subject before hitting the database. The encoding could be done by using SHA256 or UUID-5 (with tuple's namespace).

Write: Request -> Encode -> Backend
Read: Request -> Encode -> Backend

Also, the object and subject type and format could be relaxed to include objects as well, which could be transformed to JSON and then encoded.

So taking the example from the Doc:

A request to keto like:

{
    "action": "insert",
    "relation_tuple": {
        "namespace": "network",
        "object": "tcp/22",
        "relation": "access",
        "subject_id":  {
            "subnet": "192.168.0.0/24",
            "office_hours": true
         }
}

Assuming the namespace network's uuid is 4fee2932-561d-11ec-8a6d-ff46757a867c the request could be translated using uuid-5 to the following before write/read:

{
    "action": "insert",
    "relation_tuple": {
        "namespace": "network",
        "object": "23c60557-883b-5032-a4ca-5ee40d408b75",
        "relation": "access",
        "subject_id":  "ae60432e-c611-5bf0-b222-b3d7dda64f54"
}

Also this could be an option enabled in the config

namespaces:
  - id: 0
    name: network
    auto_encode:  uuid5
    uuid5_namespace: 4fee2932-561d-11ec-8a6d-ff46757a867c

  - id: 0
    name: videos
    auto_encode:  sha256

Workarounds or alternatives

Write the encoding logic in the applications which makes it difficult to have consistent implementation across all applications especially using microservices with different teams.
implementing some sort of request rewriter in front of keto which could be a viable solution but with a risk of data corruption or bugs if done wrong.

Version

v0.7

Additional Context

No response

The text was updated successfully, but these errors were encountered:

zepatrik · 2021-12-07T10:04:55Z

I would like to know what limitations you are hitting. The URNs would be good to use directly IMO. Are some of them longer than 64 characters? If the use of the characters :#@ is the problem, that limitation should actually not be there anymore since #638.

g13013 · 2021-12-07T17:07:13Z

@zepatrik , basically we don't have any limitation for the URNs length except what the standard states, but even the smallest of the URN's could could easily hit the limit or 64.

Also In a scenario where 2 services (or more) read/write the same permission from Keto they need to share the same encoding logic which puts the application at risk of bugs or even security issues if any of the services changes the encoding without the others updating the same encoding in the exact same time.

it's a little addition to Keto that would make it's usage straight forward and safer.

If the use of the characters :#@ is the problem

this is good news !

zepatrik · 2021-12-09T18:03:51Z

Also In a scenario where 2 services (or more) read/write the same permission from Keto they need to share the same encoding logic which puts the application at risk of bugs or even security issues if any of the services changes the encoding without the others updating the same encoding in the exact same time.

Ideally you would not trans- or encode your internal identifiers but use them directly. The 64 char limit on them is mainly based on the fact that they will be potentially passed around quite a lot. I guess having a mapping would make sense in some cases. We might then use UUID column types instead of strings, and you can either supply your own, or have a string <-> UUID mapping done by Keto. What do you think @aeneasr ?

zepatrik · 2021-12-10T14:54:34Z

After a quick offline discussion, this sounds reasonable and will be added to the roadmap 👍

RichiCoder1 · 2021-12-26T00:34:04Z

So would the idea be a generic mechanism to define encoding raw ID inputs to a uuid?

For string ID's the conversion could be something like uuidv5(<config-defined-ns>, <id>), but the above also mentions an DTO as an ID. Would an object id then (as an example) be something like uuidv5(<config-defined-ns>, sha256(json.Marshal(<id>)))? Or entirely unsupported?

Edit: Reading the API Docs it sounds like the answer is no, so the client would still need to encode if they wanted to do an DTO subject like in the example.

Also how would this interact with API's like https://www.ory.sh/keto/docs/guides/list-api-display-objects#listing-objects ? This would effectively prevent consumers from groking the return of that no? Since encoding is a transparent process and non-reversible?

zepatrik · 2022-01-07T12:07:06Z

I think what makes most sense is to use only UUIDs internally in the relation tuple table. Then for all cases where you don't want to provide a UUID, but any string, we can use a mapping table that is basically only <string, UUIDv4>. This will allow us to have unbound strings.
All requests will then first resolve the string to UUID, use only UUIDs internally, and before response re-translate UUID to string.
We have to use a table for the mapping because UUIDv5 are only one-way (SHA1) and we can't inverse that to get back the external facing ID.
This will affect all APIs btw.

hperl · 2022-01-07T14:40:47Z

Hi, I'd like to take a stab at it :)

zepatrik · 2022-02-17T09:44:06Z

Ok, to sum up the current state:

This feature is developed on the feature branch https://github.com/ory/keto/tree/feat/uuid-mapping to not block releases from current master. Further progress can be tracked at #840.

In relation tuples and related API objects such as queries, subjects (including subject sets) and objects are now automatically mapped to and from UUIDs. The mapping is done in each handler and for each protocol (HTTP, gRPC) separately. Below the handler, all objects and subjects are now UUIDs, but still handled as strings, for the following reasons: * No duplication of datastructures (one for type `string`, one for type `uuid.UUID`). * No unnecessary copying, the mapping is done in-place and batched across multiple objects.

In relation tuples and related API objects such as queries, subjects (including subject sets) and objects are now automatically mapped to and from UUIDs. The mapping is done in each handler and for each protocol (HTTP, gRPC) separately. Below the handler, all objects and subjects are now UUIDs, but still handled as strings, for the following reasons: * No duplication of datastructures (one for type `string`, one for type `uuid.UUID`). * No unnecessary copying, the mapping is done in-place and batched across multiple objects. Additionally, the migration (adding the string values to the mapping table and replacing the values with UUIDs) is now performed automatically as part of the migrations. Tests are in `migratest`, which now also handles `GoMigrations` properly. This method replaces the dedicated command in `cmd/migrate/...` for the UUID mappings. XXX WIP XXX WIP XXX WIP migrations_test package XXX WIP migrations_test package XXX WIP migrations_test package XXX WIP migrations_test package XXX WIP migrations_test package XXX WIP migrations_test package WIP

In relation tuples and related API objects such as queries, subjects (including subject sets) and objects are now automatically mapped to and from UUIDs. The mapping is done in each handler and for each protocol (HTTP, gRPC) separately. Below the handler, all objects and subjects are now UUIDs, but still handled as strings, for the following reasons: * No duplication of datastructures (one for type `string`, one for type `uuid.UUID`). * No unnecessary copying, the mapping is done in-place and batched across multiple objects. Additionally, the migration (adding the string values to the mapping table and replacing the values with UUIDs) is now performed automatically as part of the migrations. Tests are in `migratest`, which now also handles `GoMigrations` properly. This method replaces the dedicated command in `cmd/migrate/...` for the UUID mappings.

g13013 added the feat New feature or request. label Dec 5, 2021

zepatrik added the help wanted We are looking for help on this one. label Dec 10, 2021

zepatrik added this to the v0.8.0 milestone Dec 10, 2021

hperl mentioned this issue Jan 9, 2022

feat: Add mapping table for encoding subject into UUIDs #809

Merged

9 tasks

zepatrik mentioned this issue Feb 17, 2022

feat: map strings to UUIDs (#809) #840

Merged

4 tasks

hperl self-assigned this May 6, 2022

hperl removed the help wanted We are looking for help on this one. label Jun 21, 2022

zepatrik closed this as completed in #840 Jul 27, 2022

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Automatic Subject/Object encoding #792

Automatic Subject/Object encoding #792

g13013 commented Dec 5, 2021 •

edited

Loading

zepatrik commented Dec 7, 2021

g13013 commented Dec 7, 2021 •

edited

Loading

zepatrik commented Dec 9, 2021 •

edited

Loading

zepatrik commented Dec 10, 2021

RichiCoder1 commented Dec 26, 2021 •

edited

Loading

zepatrik commented Jan 7, 2022 •

edited

Loading

hperl commented Jan 7, 2022

zepatrik commented Feb 17, 2022 •

edited

Loading

Automatic Subject/Object encoding #792

Automatic Subject/Object encoding #792

Comments

g13013 commented Dec 5, 2021 • edited Loading

Preflight checklist

Describe your problem

Describe your ideal solution

Workarounds or alternatives

Version

Additional Context

zepatrik commented Dec 7, 2021

g13013 commented Dec 7, 2021 • edited Loading

zepatrik commented Dec 9, 2021 • edited Loading

zepatrik commented Dec 10, 2021

RichiCoder1 commented Dec 26, 2021 • edited Loading

zepatrik commented Jan 7, 2022 • edited Loading

hperl commented Jan 7, 2022

zepatrik commented Feb 17, 2022 • edited Loading

g13013 commented Dec 5, 2021 •

edited

Loading

g13013 commented Dec 7, 2021 •

edited

Loading

zepatrik commented Dec 9, 2021 •

edited

Loading

RichiCoder1 commented Dec 26, 2021 •

edited

Loading

zepatrik commented Jan 7, 2022 •

edited

Loading

zepatrik commented Feb 17, 2022 •

edited

Loading