Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Runtime Dynamic Global Query Filters (centralized where clauses for entities) #34720

Closed
Danielku15 opened this issue Sep 20, 2024 · 9 comments
Closed
Labels
closed-out-of-scope This is not something that will be fixed/implemented and the issue is closed. customer-reported

Comments

@Danielku15
Copy link

Ask a question

Is there a hidden or official way to generate global query filters dynamically as part of query execution?

My data filters depend on some dynamic runtime information which can change the filters quite significantly. Creating an overall expression for all scenarios is quite a risky and challenging task.

In a normal query you would apply custom filters with .Where() on the DbSet but I also need to ensure any .Include() collections are also filtered according to the ruleset (OData style expands). AFAIK Global Query Filters have the benefit of being applied on the entities regardless of whether it is a main query or an include. If this assumption is wrong, I think we can close the discussion here.

It would be great if there is some service/callback/extension point where I can provide an additional .Where() for a given IQueryable<T> regardless whether it is a top-level or expand query.

Include your code

Assume a similar setup like the following project. You want to ensure that users can only see the projects they are members of and if the special access token is used, you can only see your own project. With this simplified setup you would want to generate a query filter matching your current state of authentication.

While adding filter clauses manually to all sets and expands might be technically possible but its risky and prone to errors. Global query filters would allow a central point of access control.

In my case I have an OData style API where users can specify the expands and the .Include clauses are dynamically added. Also here I could add a custom hook to apply where clauses. But if I extend my whole API and application framework to support this I was hoping to tackle this on the data access layer via EF directly.

class UserPrivilege { MasterAdmin, NormalUser }
class User 
{
    public UserPrivilege Privilege {get;set;}
}

class ProjectMember 
{
    public User User {get;set;}
    public Project Project {get;set;}
}

class ProjectGroup
{
    public IList<Project>? Projects {get;set;}
}
 
class Project 
{
    public ProjectGroup Group {get;set;}
    public string ProjectMasterToken {get;set;}
    public IList<ProjectMember>? Members {get;set;}
}

public class MyDbContext(IAuthInfo authInfo) : DbContext 
{
    protected override void OnModelCreating(ModelBuilder modelBuilder)
    {
        modelBuilder.Entity<Project>(e =>
        {
            // how to do something like this: 
            e.HasQueryFilter(() =>
            {
                if(authInfo.IsTokenAuth)
                {
                    return (Project p) => p.ProjectMasterToken == authInfo.ProjectMasterToken;
                }
                else 
                {
                    return authInfo.UserPrivilege switch
                    {
                        UserPrivilege.MasterAdmin => ((Project p) => true),
                        UserPrivilege.NormalUser => ((Project p) => p.Members.Any(m=>m.User.Id == authInfo.UserId),
                    };
                }
            });
        };
    }
}

var groupsWithProjects = myContext.Set<ProjectGroup>()
    .Include(g => g.Projects)
    .ToArray();

Diving deeper

As far I could see query filters are applied in the NavigationExpandingExpressionVisitor:

if (!_queryCompilationContext.IgnoreQueryFilters)
{
var sequenceType = navigationExpansionExpression.Type.GetSequenceType();
var rootEntityType = entityType.GetRootType();
var queryFilter = rootEntityType.GetQueryFilter();
if (queryFilter != null)

Seems it would not be a huge change to have GetQueryFilter() providing a dynamic value instead of pre-registered object in the model.

I was thinking if I add a custom Expression visitor via QueryTranslationPreprocessor and IQueryTranslationPreprocessorFactory

Would be great to get some feedback on this matter.

@roji
Copy link
Member

roji commented Sep 20, 2024

The global query filter is, by design, meant to apply the same filter(s) to all queries (except when IgnoreQueryFilters is specified), without variance across queries. It sounds like you want different filters to be applied to different queries, so they're no longer global at that point.

Now, if I understand correctly, you want to simply apply filtering - within a specific query - that would apply on an entity type within that query regardless of where/how it's being used; for example, the filtering would apply to the entity type when it's the root (DbSet), and when it's being Included.

If we're indeed within the scope a single query, then what's the problem with simply explicitly specifying the filter in that query wherever you reference the entity type? Most queries don't reference an entity type more than once (i.e. don't self-join), so in any case the filter has to only be specified once - why not just do it at that point?

@Danielku15
Copy link
Author

Danielku15 commented Sep 20, 2024

It sounds like you want different filters to be applied to different queries, so they're no longer global at that point.

It depends a bit on the definition/understanding of "global filter". I would still be registering one single "global" callback per entity defining the general business rules to access this entity. The callback might have multiple logic paths resulting in an expression. You're right that its maybe not stictly one single global filter expression anymore but the expressions are somehow dynamic.

Now, if I understand correctly, you want to simply apply filtering - within a specific query - that would apply on an entity type within that query regardless of where/how it's being used; for example, the filtering would apply to the entity type when it's the root (DbSet), and when it's being Included.

I'm not having really a specific query here but rather a general "query endpoint". Think of an API offering inputs where the end-user can specify what to expand on HTTP level (like a GraphQL or OData API). These expands internally result in .Include() calls being appended to the base query. Regardless of how the user might expand down the data graph the appropriate filtering rules need to be applied.

Most queries don't reference an entity type more than once (i.e. don't self-join), so in any case the filter has to only be specified once - why not just do it at that point?

I could theoretically extend my API query framework where I apply filtering on my level when translating the API request. Similar to the ASP.net core OData framework, I am passing a IQueryable<T> to my framework to apply any filters and expands.

From architecture standpoint I would prefer to define the rules and have the filtering on the business & data access layer than relying on the framework translating the user inputs. This mitigates the risk that though side-channels some data might be exposed.

@roji
Copy link
Member

roji commented Sep 20, 2024

It depends a bit on the definition/understanding of "global filter". I would still be registering one single "global" callback per entity defining the general business rules to access this entity. The callback might have multiple logic paths resulting in an expression. You're right that its maybe not stictly one single global filter expression anymore but the expressions are somehow dynamic.

Sure; definition-wise, I was referring to the "global query filter" feature which EF already has (doc); this is very different from what you're describing.

The callback might have multiple logic paths resulting in an expression. You're right that its maybe not stictly one single global filter expression anymore but the expressions are somehow dynamic.

I understand the need here; but the problem here is going to be what that callback will get as its input, and how that's going to be passed from the context of a specific query. In other words, I'm not sure how one could provide a completely generic piece of logic (the dynamic global callback), which then somehow gets some specific, per-query information that determines which actual filter(s) get added when.

I'd suggest trying to come up with a very concrete example of such a dynamic filter that you'd like to define globally; that could provide a basis for further conversation here.

@Danielku15
Copy link
Author

Sure; definition-wise, I was referring to the "global query filter" feature which EF already has (doc); this is very different from what you're describing.

As per current implementation you are right. What I meant with my statement is: Even if the expression is a bit more dynamic, the term "global query filter" is still valid. Same is true for the very first paragraph definition of the current "global query filter" definition.

I understand the need here; but the problem here is going to be what that callback will get as its input, and how that's going to be passed from the context of a specific query. I...

This is a valid point, in my PoC to apply the filters in my API query language parsing I faced the same problem. I think within EFCore the best approach is to pass in the DbContext which will execute the query. Any services required to determine the filter logic can be passed into there (e.g. via DI).

Staying on the same example as initially, a full example could look like this:

class UserPrivilege { MasterAdmin, NormalUser }
class User 
{
    public UserPrivilege Privilege {get;set;}
}

class ProjectMember 
{
    public User User {get;set;}
    public Project Project {get;set;}
}

class ProjectGroup
{
    public IList<Project>? Projects {get;set;}
}
 
class Project 
{
    public ProjectGroup Group {get;set;}
    public string ProjectMasterToken {get;set;}
    public IList<ProjectMember>? Members {get;set;}
}

interface IAuthInfo
{
    bool IsTokenAuth {get;}
    UserPrivilege? UserPrivilege {get;}
}

public class MyDbContext(IAuthInfo authInfo) : DbContext 
{
    public IAuthInfo AuthInfo { get; } = authInfo;
    
    protected override void OnModelCreating(ModelBuilder modelBuilder)
    {
        modelBuilder.Entity<Project>(e =>
        {
            // how to do something like this: 
            e.HasQueryFilter(dbContext =>
            {
                var authInfo = ((MyDbContext)dbContext).AuthInfo;
                if(authInfo.IsTokenAuth)
                {
                    return (Project p) => p.ProjectMasterToken == authInfo.ProjectMasterToken;
                }
                else 
                {
                    return authInfo.UserPrivilege.Value switch
                    {
                        UserPrivilege.MasterAdmin => ((Project p) => true),
                        UserPrivilege.NormalUser => ((Project p) => p.Members.Any(m=>m.User.Id == authInfo.UserId),
                    };
                }
            });
        };
    }
}

var groupsWithProjects = myContext.Set<ProjectGroup>()
    .Include(g => g.Projects)
    .ToArray();

@roji
Copy link
Member

roji commented Oct 1, 2024

I think within EFCore the best approach is to pass in the DbContext which will execute the query

First, EF's global query filters already allow you to reference your context instance, as the docs show. For your specific example above, it should be possible to write a plain old query filter today which does what you want: .HasQueryFilter(p => _isTokenAuth ? p.ProjectMasterToken == _projectMasterToken : ...). In other words, integrate accesses to specific fields on your context inside the query filter, and everything should work correctly.

Note that this isn't fully dynamic: query filters are limited - very much by design - to including specific values from DbContext fields/properties, and not varying the tree itself; there are very good reasons for that. Introducing full-on dynamicity in the middle of the query compilation process is problematic in various ways, and defeats query caching (at multiple levels); as long as the information accessed on the DbContext is a simple value, that value can be cleanly parameterized (via DbParameter), and everything works efficiently. The moment the query tree starts to change in arbitrary ways based on arbitrary things (and once you pass DbContext as a parameter to the query filter lambda, things become arbitrary), caching goes out the window pretty quickly.

This is why as a general rule, EF doesn't do full dynamic querying for you - it makes it too easy for users to shoot themselves in the foot; we try to keep things constrained so that queries work efficiently by default, and provide some advanced hooks for when users really need them. So I'd really recommend approaching this differently.

I'm going to go ahead and close this issue as I don't think we'll be going in the direction of allowing full dynamicity based on DbContext values in global query filters (for the reasons explained above). We also haven't seen users requesting anything like this, and I suspect that if they did, just having access to the DbContext may be insufficient (so your proposal very specifically fits your needs). But please feel free to post back here and continue the conversation - if relevant we can always reopen and revisit.

@roji roji closed this as not planned Won't fix, can't repro, duplicate, stale Oct 1, 2024
@roji roji added the closed-out-of-scope This is not something that will be fixed/implemented and the issue is closed. label Oct 1, 2024
@Danielku15
Copy link
Author

First, EF's global query filters already allow you to reference your context instance, as the docs show.

This is true, yes. Unfortunately the implication of using this is: the whole expression needs to be translateable to the DB or it fails. This also means the whole expression is executed on DB side (e.g. as SQL). In my proposal/example there is still part of the execution in .net and only a minimalistic expression might be applied on DB side. Thinking of execution plans this might have a non-neglectable impact to applications.

Depending on the complexity and abstractions you have in place. reusing common logic (maintaiability) is not easy and prone to errors (you need to use expression trees directly instead of using C# syntax). These filters are quite performance sensitive as they are applied on all queries. In a real setup I'd want to put this filter logic into another dependency where optimizations like caching are applied (and testing is possible). Having "optimal" queries for the different scenarios is important to scale.

Introducing full-on dynamicity in the middle of the query compilation process is problematic in various ways, and defeats query caching (at multiple levels); ...

I fully agree that performance and query caching are very important. But eventually it is not much different from normal .Where() clauses devs have to be aware of their performance implications when writing filter expressions. It is overall a delicate matter in EF to write efficient queries which improved a lot across releases. In my mindset we are moving the where clauses you'd place in your repository, to a central location being globally applied whenever a entity is queried (directly or through eager loading).

Luckily EF provides (will provide) hints like EF.Parameter and EF.Constant to ensure optimal parameterization. I'm not sure how the current DB context field capturing works and ensures that the "global" expression will use the "current" DB context values. But for the dynamic approach proposed here, likely things have to work different and are risky to do wrong.

To avoid problems in wrong usage (capturing of objects), a better (but more complex) approach for "dynamic central query filters" is to have a system like interceptors which can be registered. By not having lambdas its harder to access wrong values which are captured. Having "query filter" interceptors maybe also would fit better into the current architecture.

with DI support on interceptors
public class MyDbContext : DbContext 
{
    protected override void OnModelCreating(ModelBuilder modelBuilder)
    {
        modelBuilder.Entity<Project>(e =>
        {
            e.HasQueryFilterInterceptor<MyAuthFilterInterceptor>();
        };
    }
}

interface IQueryFilterInterceptor<TDbContext, TEntity> where TDbContext : DbContext
{
    Expression<Func<TEntity, bool>> BuildQueryFilter(TDbContext dbContext)
}

class MyAuthFilterInterceptor(IAuthInfo authInfo) : IQueryFilterInterceptor<MyDbContext, Project>
{
    public Expression<Func<Project, bool>> BuildQueryFilter(MyDbContext dbContext)
    {
        // DB context not directly needed anymore but still potentially useful to inspect some details. 
        if(authInfo.IsTokenAuth)
        {
            return (Project p) => p.ProjectMasterToken == authInfo.ProjectMasterToken;
        }
        else 
        {
            return authInfo.UserPrivilege.Value switch
            {
                UserPrivilege.MasterAdmin => ((Project p) => true),
                UserPrivilege.NormalUser => ((Project p) => p.Members.Any(m=>m.User.Id == authInfo.UserId),
            };
        }
    }
}
with DI support on interceptors
public class MyDbContext(IAuthInfo authInfo) : DbContext 
{
    public IAuthInfo AuthInfo { get; } = authInfo;

    protected override void OnModelCreating(ModelBuilder modelBuilder)
    {
        modelBuilder.Entity<Project>(e =>
        {
            // how to do something like this: 
            e.HasQueryFilterInterceptor(new MyAuthFilterInterceptor());
            HasQueryFilter(dbContext =>
            {
                var authInfo = ((MyDbContext)dbContext).AuthInfo;
                if(authInfo.IsTokenAuth)
                {
                    return (Project p) => p.ProjectMasterToken == authInfo.ProjectMasterToken;
                }
                else 
                {
                    return authInfo.UserPrivilege.Value switch
                    {
                        UserPrivilege.MasterAdmin => ((Project p) => true),
                        UserPrivilege.NormalUser => ((Project p) => p.Members.Any(m=>m.User.Id == authInfo.UserId),
                    };
                }
            });
        };
    }
}

interface IQueryFilterInterceptor<TDbContext, TEntity> where TDbContext : DbContext
{
    Expression<Func<TEntity, bool>> BuildQueryFilter(TDbContext dbContext)
}

class MyAuthFilterInterceptor : IQueryFilterInterceptor<MyDbContext, Project>
{
    public Expression<Func<Project, bool>> BuildQueryFilter(MyDbContext dbContext)
    {
        var authInfo = dbContext.AuthInfo;
        if(authInfo.IsTokenAuth)
        {
            return (Project p) => p.ProjectMasterToken == authInfo.ProjectMasterToken;
        }
        else 
        {
            return authInfo.UserPrivilege.Value switch
            {
                UserPrivilege.MasterAdmin => ((Project p) => true),
                UserPrivilege.NormalUser => ((Project p) => p.Members.Any(m=>m.User.Id == authInfo.UserId),
            };
        }
    }
}

We also haven't seen users requesting anything like this,

I think generally global query filters are a niche feature and not widely used, hence their simplicity. Depending on your architectural standpoint you might not want to have such logic on EF level but in some higher business layer. My use-cases lie within very CRUD-focused/data centric systems.

The proposal here also touches some similar topics around global query filters like multi-filters and named query filters. While this propsoal will not solve the other needs directly, I see opportunities to tackle following needs with a similar strategy:

These topics have quite a wide ranging interest.

@roji
Copy link
Member

roji commented Oct 1, 2024

Unfortunately the implication of using this is: the whole expression needs to be translateable to the DB or it fails. This also means the whole expression is executed on DB side (e.g. as SQL). In my proposal/example there is still part of the execution in .net and only a minimalistic expression might be applied on DB side. Thinking of execution plans this might have a non-neglectable impact to applications.

That's very true indeed - though I'd make sure there's a meaningful impact in your specific cases.

But eventually it is not much different from normal .Where() clauses devs have to be aware of their performance implications when writing filter expressions. It is overall a delicate matter in EF to write efficient queries which improved a lot across releases. In my mindset we are moving the where clauses you'd place in your repository, to a central location being globally applied whenever a entity is queried (directly or through eager loading).

I don't think that's quite true... When using regular LINQ queries, it isn't very straightforward to write fully dynamic queries (at least the kind that's really dangerous for performance/caching); in many cases, one needs to use the Expression builder APIs to do so, which is a very clear jump in terms of complexity, and in terms of the user understanding that they've gone past a certain border and are now in a different place.

Just as importantly, anything dynamic that the user does happens before EF sees the input query tree, and so before EF's internal caching (which is one of the first things that happen in the query pipeline). In your proposal, the dynamic transformations to the expression tree would happen much later in the query pipeline, after EF's query cache has already been examined. That alone would make this feature quite complicated to implement, and even then it would still be problematic efficiency-wise: no SQL would be cachable ever once this is used, since we have no way of knowing what happened to the query tree. Once again, you're proposing a totally arbitrary dynamic extensibility point within the query pipeline, after caching has already taken place.

So to summarize, what you're proposing is very different from whatever the user can do in terms of dynamic queries today, which happen before EF sees the tree rather than at some point inside the query pipeline after the initial caching.

I think generally global query filters are a niche feature and not widely used, hence their simplicity.

I'm not sure what that assumption is based on; we see lots of people using them, but the vast majority of peoples' needs are met with the current implementation (modulo a few minor feature requests of course).

The proposal here also touches some similar topics around global query filters like multi-filters and named query filters.

I don't see what your proposal has to do with the two issues you referenced - both seem doable without going into full-scale dynamic extensibility as you propose. Sure, it might be possible for users to achieve these things via your proposed dynamic extensibility, but I wouldn't want users to have to go through that just in order to name filters, disable them on a query-by-query basis, etc.

@erdalsivri
Copy link

We are also investigating the possibility of this exact feature because we'd like to configure our authz checks in a central place for all our entities (we have several hundreds). We had several issues in the past where a developer forgot to add a .Where(...) to filter entities accessible to the current user's so we decided to build something safe-by-default.

What we've found so far is that as Shay mentioned doing anything dynamic in QueryTranslationPreprocessor messes up everything. For example, you can't add new parameters to the expression (while adding a filter like entity.AuthorUserId = @currentUserId), and more importantly it can cause EFCore to cache incorrect stuff (due to dynamic nature of things). However, we can't add our filters before QueryTranslationPreprocessor either because query roots are not expanded at that point. We need NavigationExpandingExpressionVisitor to run before we can apply our filters. That's probably why global query filters are implemented in NavigationExpandingExpressionVisitor.

Currently, we are experimenting with the possibility of running QueryTranslationPreprocessor before EFCore caches the expression in QueryCompiler. Of course, this is not good for cacheability as it means QueryTranslationPreprocessor will run for all queries (no caching at all at this level). However, it is very easy to apply the filters this way because all query entity roots are expanded by EFCore:

We are basically thinking of intercepting QueryCompiler like this:

public class MyQueryCompiler : QueryCompiler {
  // Constructor with all the deps

  public override TResult Execute<TResult>(Expression query) =>
    ExecuteCore<TResult>(query, async: false);

  public override TResult ExecuteAsync<TResult>(
      Expression query,
      CancellationToken cancellationToken = default) => ExecuteCore<TResult>(query, async: true);

  private TResult ExecuteCore<TResult>(Expression query, bool async) {
    var queryCompilationContext = _queryCompilationContextFactory.Create(async);
    var queryTranslationPreprocessor =
      _queryCompilationContextDependencies.QueryTranslationPreprocessorFactory.Create(queryCompilationContext);
    query = queryTranslationPreprocessor.Process(query); // EFCore expands entity roots here.
    
    // We add `.Where` expressions to all query roots based on centrally-configured filter expressions.
    // We get the list of filter expressions from our DI container. Something like:
    //   `Container.GetInstance<QueryFilter<FooEntity>>()` for each entity root.
    // For now filters are simple `LambdaExpression`s like `e => e.AuthorUserId == currentUserId`
    //   (where `currentUserId` and bunch of other context is taken indirectly from `IHttpContextAccessor`).
    query = ApplyDynamicFilters(query); 

    var queryContext = _queryContextFactory.Create();
    query = ExtractParameters(query, queryContext, _logger); // (Any parameter we added before is extracted here properly)

    // Let EFCore compile the (expanded and filtered) query.
    // EFCore will run `QueryTranslationPreprocessor` again but I assume that'll be a no-op.
    var compiledQuery = _compiledQueryCache.GetOrAddQuery(
      _compiledQueryCacheKeyGenerator.GenerateCacheKey(query, async),
      () => CompileQueryCore<TResult>(_database, query, _model, async));

    return compiledQuery(queryContext);
  }
}

We don't know how resource-intensive QueryTranslationPreprocessor is so the plan is to run some benchmarks. We also have a rough idea about how to cache this using a custom cache key. We know which filters are going to be applied to the expression so we think we might be able to use "filter expressions" + "the expression itself" to generate a cache key but we have not experimented with this yet.

Disclaimer: Note this is still an experiment and it might end up being a bad idea so I wouldn't recommend anyone to just copy/paste and use that code but I wanted to share what we are doing in case someone has a feedback or is thinking of doing something similar.

@Li7ye
Copy link

Li7ye commented Oct 25, 2024

@Danielku15 if this feature is very important for you, maybe you can use Harmony to intercept the below method you mentioned before.

private Expression ApplyQueryFilter(IEntityType entityType, NavigationExpansionExpression navigationExpansionExpression)
{
if (!_queryCompilationContext.IgnoreQueryFilters)
{
var sequenceType = navigationExpansionExpression.Type.GetSequenceType();
var rootEntityType = entityType.GetRootType();
var queryFilter = rootEntityType.GetQueryFilter();
if (queryFilter != null)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
closed-out-of-scope This is not something that will be fixed/implemented and the issue is closed. customer-reported
Projects
None yet
Development

No branches or pull requests

4 participants