Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Recursive Queries Produce Non-plannable Query Outputs #430

Open
jon-whit opened this issue Feb 1, 2025 · 3 comments
Open

Recursive Queries Produce Non-plannable Query Outputs #430

jon-whit opened this issue Feb 1, 2025 · 3 comments

Comments

@jon-whit
Copy link

jon-whit commented Feb 1, 2025

👋 I'm working on a project to map Relationship-based Access Control (ReBAC) queries inspired by Google Zanzibar to Logica queries, but I'm running into an issue with what Logica produces. These queries are very recursive based queries as relationships in one subquery can reference relationships in others etc..

Here's an example of the Logica specification for the kinds of queries I am trying to express:

%%logica Relationships

# resource_type, resource_id, relation, subject_type, subject_id, subject_relation
tuple("document", "1", "editor", "user", "jon", "");
tuple("group", "eng", "member", "group", "ciam", "member");
tuple("group", "ciam", "member", "user", "bob", "");

# unary rules: resource_type, prereq_relation, derived_relation
unary("document", "editor", "viewer");

tuple("account", "t2_x", "blocks", "account", "t2_y", "");

# bidirectional rules: resource_type, source, inverse
bidirectional("account", "blocks", "blockedby");

# binary rules

tuple("folder", "x", "viewer", "user", "jill", "");
tuple("document", "readme", "parent", "folder", "x", "");

# (1) rel1(subject, resource1), rel2(resource1, resource2) :- rel3(subject, resource2)
binary("folder", "viewer", "document", "parent", "can_view");

tuple("document", "roadmap", "viewer", "user", "jack", "");
tuple("document", "roadmap", "allowed", "user", "jack", "");

# (2) rel1(subject, resource), rel2(subject, resource) :- rel3(subject, resource)
binary("document", "viewer", "document", "allowed", "can_view");

Relationships(resource_type, resource_id, relation, subject_type, subject_id, "") :-
  tuple(resource_type, resource_id, relation, subject_type, subject_id, "");

# unary relationships
#
# editor(user:jon, document:1) :- viewer(user:jon, document:1)
Relationships(resource_type, resource_id, derived_relation, subject_type, subject_id, "") :-
  unary(resource_type, prereq_relation, derived_relation),
  Relationships(resource_type, resource_id, prereq_relation, subject_type, subject_id, "");

# bidirectional relationships
#
# blocks(account:t2_x, account:t2_y) :- blockedby(account:t2_y, account:t2_x)
Relationships(resource_type, subject_id, inverse_relation, subject_type, resource_id, "") :-
  bidirectional(resource_type, source_relation, inverse_relation),
  Relationships(resource_type, resource_id, source_relation, subject_type, subject_id, "");

# hierarchical (binary) relationships
#
# viewer(user:jill, folder:x), parent(folder:x, document:readme) :- can_view(user:jill, document:readme)
Relationships(resource_type, resource_id, relation, subject_type, subject_id, "") :-
  binary(target_type, target_relation, resource_type, source_relation, relation),
  Relationships(resource_type, resource_id, source_relation, target_type, target_id, ""),
  Relationships(target_type, target_id, target_relation, subject_type, subject_id, "");

# intersection (binary) relationships
#
# viewer(user:jack, document:roadmap), allowed(user:jack, document:roadmap) :- can_view(user:jack, document:roadmap)
Relationships(resource_type, resource_id, relation, subject_type, subject_id, "") :-
  binary(resource_type, prereq1_relation, resource_type, prereq2_relation, relation),
  Relationships(resource_type, resource_id, prereq1_relation, subject_type, subject_id, ""),
  Relationships(resource_type, resource_id, prereq2_relation, subject_type, subject_id, "");

When I try to run this program against BigQuery in Colab, for example, I get an error:

BadRequest: 400 Resources exceeded during query execution: Not enough resources for query planning - query is too complex.; reason: resourcesExceeded, message: Resources exceeded during query execution: Not enough resources for query planning - query is too complex.

Looking at the query, it appears that the query output does not do a very good job at mapping the recursive semantics to CTE definitions and that these definitions get fully denormalized into intermediate temporary tables. See the attached output-sql.txt

I'm curious if ya'll have any thoughts on how this could be avoided and/or if you feel like this kind of definition is going to be a challenge for something like Logica to really support? I'd love some help with modeling the semantics and being able to produce some plans that can actually get executed.

@EvgSkv
Copy link
Owner

EvgSkv commented Feb 1, 2025

The problem arises because Logica is greedy about putting everything into one query. We need to tell it to break it up.

When we add @Ground(Relationships) the program runs.

Here is a CoLab where your program runs just fine:
https://colab.research.google.com/drive/1dhFL7sprkZcVFuBVlkM7UQi8fiLTeIte?authuser=1#scrollTo=1prW29dtf5s6

Logica is not using recursive CTE, instead it compiles to a SQL iterations. As far as I know no mainstream SQL engine supports aggregation in recursion, but in Logica its very useful, so CTEs are not enough and iteration is required.

In my experience anything that's at all practical can be acheive with SQL iteration the way Logica does it. So if you encounter any further challenges, please let me know, it's very interesting for me to see if Logica can fully support your usecase as-is.

If we uncover the true need for CTE we can work on that.

@jon-whit
Copy link
Author

jon-whit commented Feb 1, 2025

@EvgSkv thanks for the quick reply! That worked great for me.

I finished modeling some of the primitives of ReBAC models using Logica. I wanted to learn more about Logica and so I figured I'd model a real world use case. Google Zanzibar inspired models are becoming more popular, and there has been a lot of projects and literature built around the Zanzibar model. If you would be interested, it may be cool to write a blog post or something together on it. Would you be interested in doing something like that?

Here are some of the projects that have written about Relationship-based Access Control (ReBAC) or sometimes also referred to in literature as Fine-grained Authorization (FGA).

https://openfga.dev/
https://authzed.com/spicedb
https://www.feldera.com/blog/fine-grained-authorization
https://github.com/Permify/permify
https://www.osohq.com/docs/modeling-in-polar/relationship-based-access-control-rebac

Here is a link to the CoLab notebook that I have been working on.

In my experience anything that's at all practical can be acheive with SQL iteration the way Logica does it. So if you encounter any further challenges, please let me know, it's very interesting for me to see if Logica can fully support your usecase as-is.

If we uncover the true need for CTE we can work on that.

I may expose some interesting findings, because these ReBAC models can require mutually recursive CTE semantics and fixed point iteration mechanics, of which only some projects like Feldera and Materialize can offer today AFAIK. But yeah, to your point, I don't think any mainstream SQL engine provides mutually recursive CTEs with fixed point iteration. Only those two AFAIK.

@EvgSkv
Copy link
Owner

EvgSkv commented Feb 4, 2025

@jon-whit , I am happy to support your post if you need any advice on using Logica. I'll get in touch over email.

I didn't know there are any SQL engines supporting aggregation in recursion. This is very interesting. Thank you for links!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants