Composable filters and deeply nested queries #792

Merged on Aug 17, 2020 (14 commits)


bart-degreed
Contributor

@bart-degreed bart-degreed commented Jul 17, 2020

"Everybody knew it was impossible, until someone came along that didn't know."

Overview

I'm proud to present this PR, which brings the following enhancements:

1. Composable filter expressions
Examples:

  • /api/v1/users?filter=and(equals(lastName,'O''Brian'),greaterOrEqual(age,'25'),lessThan(age,'35'))
    Returns all users with last name O'Brian whose age is at least 25 and below 35.
  • /api/v1/users?filter=or(endsWith(lastName,'Conner'),any(lastName,'Smith','Williams','Miller'))
    Returns users whose last name either ends in Conner or equals Smith, Williams or Miller.
  • /api/v1/blogs?filter=has(articles)&include=articles&filter[articles]=not(equals(publishDate,null))
    Returns blogs that have at least one article. Only includes articles that have a publication date.
  • /api/v1/blogs/1/articles?filter=greaterThan(count(upvotes),count(downvotes))
    Returns articles from blog 1 where the number of upvotes exceeds the number of downvotes.
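
As a rough illustration of why prefix notation needs no operator precedence rules, here is a minimal recursive-descent sketch that turns filter text like the examples above into a tree. FilterNode and FilterSketchParser are hypothetical names invented for this sketch; this is not the parser from this PR and it performs no validation against resource attributes.

```csharp
using System;
using System.Collections.Generic;
using System.Text;

// Hypothetical node type for this sketch; not the library's FilterExpression.
public sealed class FilterNode
{
    public string Name { get; }       // function name, field name, or literal text
    public bool IsLiteral { get; }    // true for quoted values
    public List<FilterNode> Arguments { get; } = new List<FilterNode>();

    public FilterNode(string name, bool isLiteral = false)
    {
        Name = name;
        IsLiteral = isLiteral;
    }
}

public static class FilterSketchParser
{
    public static FilterNode Parse(string text)
    {
        int position = 0;
        FilterNode node = ParseNode(text, ref position);
        if (position != text.Length)
            throw new FormatException($"Unexpected input at position {position}.");
        return node;
    }

    private static FilterNode ParseNode(string text, ref int position)
    {
        if (position < text.Length && text[position] == '\'')
            return ParseLiteral(text, ref position);

        // Read an identifier such as and, lastName, author.firstName or null.
        int start = position;
        while (position < text.Length && (char.IsLetterOrDigit(text[position]) || text[position] == '.'))
            position++;
        if (position == start)
            throw new FormatException($"Identifier expected at position {position}.");

        var node = new FilterNode(text.Substring(start, position - start));

        // An opening parenthesis turns the identifier into a function call.
        if (position < text.Length && text[position] == '(')
        {
            do
            {
                position++; // skip '(' or ','
                node.Arguments.Add(ParseNode(text, ref position));
            }
            while (position < text.Length && text[position] == ',');

            if (position >= text.Length || text[position] != ')')
                throw new FormatException($"')' expected at position {position}.");
            position++; // skip ')'
        }
        return node;
    }

    private static FilterNode ParseLiteral(string text, ref int position)
    {
        var value = new StringBuilder();
        position++; // skip opening quote
        while (position < text.Length)
        {
            if (text[position] == '\'')
            {
                // A doubled quote ('') is an escaped quote inside the value.
                if (position + 1 < text.Length && text[position + 1] == '\'')
                {
                    value.Append('\'');
                    position += 2;
                    continue;
                }
                position++; // skip closing quote
                return new FilterNode(value.ToString(), isLiteral: true);
            }
            value.Append(text[position]);
            position++;
        }
        throw new FormatException("Unterminated quoted value.");
    }
}
```

Calling FilterSketchParser.Parse("and(equals(lastName,'O''Brian'),lessThan(age,'35'))") yields a root node named and with two function-call children. The nesting of the parentheses alone determines the tree shape.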

2. Deeply nested filtering, pagination, sorting, and sparse fieldsets
Examples:

  • /api/v1/blogs?include=articles.comments&page[number]=2,articles:3,articles.comments:4&page[size]=10,articles:5
    Returns at most 10 blogs at page 2. Per blog, returns at most 5 included articles, all at page 3. Per blog, per article, returns included comments at page 4 using default page size.
  • /api/v1/blogs?sort=title,-id&include=articles&sort[articles]=-publishDate
    Sorts blogs by title ascending, then by ID descending. Sorts included articles descending by publication date.
  • /api/v1/blogs/1/articles?fields=summary&include=comments&fields[comments]=title,userName
    Returns article summaries from blog with ID 1. Returns titles and usernames from included comments.

3. New extensibility points in ResourceDefinitions to adapt, replace or discard query string input

  • virtual IReadOnlyCollection<IncludeElementExpression> OnApplyIncludes(IReadOnlyCollection<IncludeElementExpression> existingIncludes)
  • virtual FilterExpression OnApplyFilter(FilterExpression existingFilter)
  • virtual PaginationExpression OnApplyPagination(PaginationExpression existingPagination)
  • virtual SortExpression OnApplySort(SortExpression existingSort)
  • virtual SparseFieldSetExpression OnApplySparseFieldSet(SparseFieldSetExpression existingSparseFieldSet)

But the best part is:

All of these get translated into SQL queries, nothing is evaluated in memory!
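
To give a feel for the OnApply... hooks listed under item 3, the sketch below shows the "adapt" case for OnApplyFilter. So that it compiles on its own, it uses stand-in types: FilterExpression here is an empty placeholder, RawFilter and ResourceDefinitionSketch are invented for this example, and the isDeleted attribute is hypothetical. The real expression types in this PR are structured objects rather than text, so treat this only as an outline of the control flow.

```csharp
using System;

// Stand-in for the library's FilterExpression, so the sketch is self-contained.
public abstract class FilterExpression { }

// Hypothetical filter that simply carries its textual form.
public sealed class RawFilter : FilterExpression
{
    public string Text { get; }
    public RawFilter(string text) => Text = text;
    public override string ToString() => Text;
}

// Stand-in for a resource definition base class with one of the new hooks.
public class ResourceDefinitionSketch
{
    // Default behavior: pass the filter from the query string through unchanged.
    public virtual FilterExpression OnApplyFilter(FilterExpression existingFilter) => existingFilter;
}

public sealed class ArticleDefinitionSketch : ResourceDefinitionSketch
{
    // Adapt: always exclude soft-deleted articles, on top of whatever the client requested.
    public override FilterExpression OnApplyFilter(FilterExpression existingFilter)
    {
        var notDeleted = new RawFilter("equals(isDeleted,'false')");

        return existingFilter == null
            ? notDeleted
            : new RawFilter($"and({existingFilter},{notDeleted})");
    }
}
```

With a client filter of has(comments), this override produces and(has(comments),equals(isDeleted,'false')). Returning a brand-new filter instead would correspond to "replace".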

4. Service/repository separation

  • IQueryable<T> is only used internally in repository, enabling service reuse for alternate data stores.

5. Additional minor enhancements

  • Use ~AttrCapabilities.AllowView to omit attribute from responses; if turned off, fails with HTTP 400 when requested in ?fields=
  • When deleting a resource, it no longer fetches the entity first
  • ICurrentRequest.IsReadOnly makes it possible to use a different connection string for read-only requests
  • Optimized field selection on nested endpoints and relationships: for the parent, only retrieve Id column
  • Improved integration tests: better isolation using temporary database per test class (IntegrationTestContext) and cascaded table truncate (ClearTableAsync)

So how does it work?

The query pipeline roughly looks like this:

HTTP --[ASP.NET Core]--> QueryString --[JADNC:QueryStringParameterReader]--> QueryExpression[] --[JADNC:ResourceService]--> QueryLayer --[JADNC:Repository]--> IQueryable --[EF Core]--> SQL

Processing a request involves the following steps:

  • JsonApiMiddleware collects resource info from routing data for the current request. (unchanged)
  • JsonApiReader transforms json request body into objects. (unchanged)
  • JsonApiController accepts get/post/patch/delete verb and delegates to service. (unchanged)
  • IQueryStringParameterReaders delegate to QueryParsers that transform query string text into QueryExpression objects. (new)
    • By using prefix notation in filters, we don't need users to remember operator precedence and associativity rules.
    • These validated expressions contain direct references to attributes and relationships.
    • The readers also implement IQueryConstraintProvider, which exposes expressions through ExpressionInScope objects.
  • QueryLayerComposer (used from JsonApiResourceService) collects all query constraints. (new)
    • It combines them with default options and ResourceDefinition overrides and composes a tree of QueryLayer objects.
    • It lifts the tree for nested endpoints like /blogs/1/articles and rewrites includes.
    • JsonApiResourceService contains no more usage of IQueryable.
  • EntityFrameworkCoreRepository delegates to QueryableBuilder to transform the QueryLayer tree into IQueryable expression trees. (new)
    • QueryableBuilder depends on QueryClauseBuilder implementations that visit the tree nodes, transforming them into System.Linq.Expressions equivalents.
    • The IQueryable expression trees are executed by EF Core, which translates them into SQL statements.
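
The last step above, turning a filter into a LINQ expression tree, can be sketched with the standard System.Linq.Expressions factory methods. The User class and WhereClauseSketch are assumptions made for this example; the actual clause builders in this PR visit a QueryLayer tree and are more general. The sketch hand-builds what and(equals(lastName,...),greaterOrEqual(age,...)) would become.

```csharp
using System;
using System.Linq;
using System.Linq.Expressions;

// Hypothetical resource class for this sketch.
public sealed class User
{
    public string LastName { get; set; }
    public int Age { get; set; }
}

public static class WhereClauseSketch
{
    // Builds the tree for: user => user.LastName == lastName && user.Age >= minAge
    public static Expression<Func<User, bool>> Build(string lastName, int minAge)
    {
        ParameterExpression user = Expression.Parameter(typeof(User), "user");

        // equals(lastName,'...')
        Expression lastNameEquals = Expression.Equal(
            Expression.Property(user, nameof(User.LastName)),
            Expression.Constant(lastName, typeof(string)));

        // greaterOrEqual(age,'...')
        Expression ageAtLeast = Expression.GreaterThanOrEqual(
            Expression.Property(user, nameof(User.Age)),
            Expression.Constant(minAge));

        // and(...,...)
        return Expression.Lambda<Func<User, bool>>(
            Expression.AndAlso(lastNameEquals, ageAtLeast), user);
    }
}
```

The result can be passed to Where on any IQueryable<User>, for example users.AsQueryable().Where(WhereClauseSketch.Build("O'Brian", 25)). Because it is an expression tree rather than a compiled delegate, a provider such as EF Core can inspect it and translate it instead of running it in memory.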

Example

To get a sense of what this all looks like, let's look at an example query string:

/api/v1/blogs?
  include=owner,articles.revisions&
  filter=has(articles)&
  sort=count(articles)&
  page[number]=3&
  fields=title&
    filter[articles]=and(not(equals(author.firstName,null)),has(revisions))&
    sort[articles]=author.lastName&
    fields[articles]=url&
      filter[articles.revisions]=and(greaterThan(publishTime,'2001-01-01'),startsWith(author.firstName,'J'))&
      sort[articles.revisions]=-publishTime,author.lastName&
      fields[articles.revisions]=publishTime

After parsing, the set of scoped expressions is transformed into the following tree by QueryLayerComposer:

QueryLayer<Blog>
{
  Include: owner,articles.revisions
  Filter: has(articles)
  Sort: count(articles)
  Pagination: Page number: 3, size: 5
  Projection
  {
    title
    id
    owner: QueryLayer<Author>
    {
      Sort: id
      Pagination: Page number: 1, size: 5
    }
    articles: QueryLayer<Article>
    {
      Filter: and(not(equals(author.firstName,null)),has(revisions))
      Sort: author.lastName
      Pagination: Page number: 1, size: 5
      Projection
      {
        url
        id
        revisions: QueryLayer<Revision>
        {
          Filter: and(greaterThan(publishTime,'2001-01-01'),startsWith(author.firstName,'J'))
          Sort: -publishTime,author.lastName
          Pagination: Page number: 1, size: 5
          Projection
          {
            publishTime
            id
          }
        }
      }
    }
  }
}

Next, the repository translates this into a LINQ query that the following C# code would represent:

var query = dbContext.Blogs
    .Include("Owner")
    .Include("Articles.Revisions")
    .Where(blog => blog.Articles.Any())
    .OrderBy(blog => blog.Articles.Count)
    .Skip(10)
    .Take(5)
    .Select(blog => new Blog
    {
        Title = blog.Title,
        Id = blog.Id,
        Owner = blog.Owner,
        Articles = new List<Article>(blog.Articles
            .Where(article => article.Author.FirstName != null && article.Revisions.Any())
            .OrderBy(article => article.Author.LastName)
            .Take(5)
            .Select(article => new Article
            {
                Url = article.Url,
                Id = article.Id,
                Revisions = new HashSet<Revision>(article.Revisions
                    .Where(revision => revision.PublishTime > DateTime.Parse("2001-01-01") && revision.Author.FirstName.StartsWith("J"))
                    .OrderByDescending(revision => revision.PublishTime)
                    .ThenBy(revision => revision.Author.LastName)
                    .Take(5)
                    .Select(revision => new Revision
                    {
                        PublishTime = revision.PublishTime,
                        Id = revision.Id
                    }))
            }))
    });
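
The Skip(10).Take(5) at the top level follows from page number 3 with page size 5, while the nested collections are at page 1 and therefore only need Take(5). A minimal sketch of that arithmetic, using a hypothetical PagingSketch helper that is not part of this PR:

```csharp
using System;
using System.Linq;

public static class PagingSketch
{
    // Page numbers are 1-based, so page N skips the (N - 1) full pages before it.
    public static IQueryable<T> ApplyPage<T>(IQueryable<T> source, int pageNumber, int pageSize)
    {
        if (pageNumber < 1) throw new ArgumentOutOfRangeException(nameof(pageNumber));
        if (pageSize < 1) throw new ArgumentOutOfRangeException(nameof(pageSize));

        return source.Skip((pageNumber - 1) * pageSize).Take(pageSize);
    }
}
```

For page 3 with size 5 this skips (3 - 1) * 5 = 10 items and takes the next 5.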

Breaking changes

  • New filter query string syntax. Set options.EnableLegacyFilterNotation to true to keep accepting legacy filters.
    While legacy notation is enabled, prefix a filter with "expr:" to use the new notation, for example: ?filter=expr:equals(lastName,'Smith')
  • Multiple filters in query string at same depth are combined using OR operator (used to be AND, which violates json:api recommendations)
  • Using a negative page number (to reverse order) is no longer possible
  • "total-records" in response meta has been renamed to "total-resources" (and casing convention is applied)
  • ResourceDefinition<T>.HideFields() has been replaced by ResourceDefinition<T>.OnApplySparseFieldSet()
  • ResourceDefinition<T>.GetQueryFilters() has been replaced by ResourceDefinition<T>.OnRegisterQueryableHandlersForQueryStringParameters
    These are no longer tied to only filters. For example: ?filter[isHighRisk]=true now uses: ?isHighRisk=true
  • When no sort is provided, resources are sorted ascending by ID
  • Notable renames:
    DefaultResourceService -> JsonApiResourceService
    DefaultResourceRepository -> EntityFrameworkCoreRepository
    BaseJsonApiController.GetRelationshipsAsync -> GetRelationshipAsync
    BaseJsonApiController.GetRelationshipAsync -> GetSecondaryAsync
    AttrCapabilities.AllowMutate -> AllowChange
  • Most occurrences of 'entity' were renamed to 'resource' and Default prefix was removed from various class names
  • Unified internal terminology: main=base (/blogs) -> primary, nested=relationship=deep (/blogs/1/owner) -> secondary

Legacy filter conversion table

Old                                   New
?filter[attribute]=value              ?filter=equals(attribute,'value')
?filter[attribute]=ne:value           ?filter=not(equals(attribute,'value'))
?filter[attribute]=lt:10              ?filter=lessThan(attribute,'10')
?filter[attribute]=gt:10              ?filter=greaterThan(attribute,'10')
?filter[attribute]=le:10              ?filter=lessOrEqual(attribute,'10')
?filter[attribute]=ge:10              ?filter=greaterOrEqual(attribute,'10')
?filter[attribute]=like:value         ?filter=contains(attribute,'value')
?filter[attribute]=in:value1,value2   ?filter=any(attribute,'value1','value2')
?filter[attribute]=nin:value1,value2  ?filter=not(any(attribute,'value1','value2'))
?filter[attribute]=isnull:            ?filter=equals(attribute,null)
?filter[attribute]=isnotnull:         ?filter=not(equals(attribute,null))
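
For anyone migrating stored links or client code, the table can be applied mechanically. The helper below is a hedged sketch, not something shipped in this PR: LegacyFilterSketch and ToNewNotation are invented names, it covers only the operators listed in the table, and it assumes single quotes inside values are escaped by doubling them, as in the 'O''Brian' example earlier.

```csharp
using System;
using System.Linq;

public static class LegacyFilterSketch
{
    // Converts a legacy pair such as ("age", "ge:10") into the new filter notation.
    public static string ToNewNotation(string attribute, string legacyValue)
    {
        int colon = legacyValue.IndexOf(':');
        string op = colon < 0 ? "" : legacyValue.Substring(0, colon);
        string value = colon < 0 ? legacyValue : legacyValue.Substring(colon + 1);

        switch (op)
        {
            case "ne": return $"not(equals({attribute},{Quote(value)}))";
            case "lt": return $"lessThan({attribute},{Quote(value)})";
            case "gt": return $"greaterThan({attribute},{Quote(value)})";
            case "le": return $"lessOrEqual({attribute},{Quote(value)})";
            case "ge": return $"greaterOrEqual({attribute},{Quote(value)})";
            case "like": return $"contains({attribute},{Quote(value)})";
            case "in": return $"any({attribute},{QuoteList(value)})";
            case "nin": return $"not(any({attribute},{QuoteList(value)}))";
            case "isnull": return $"equals({attribute},null)";
            case "isnotnull": return $"not(equals({attribute},null))";
            // No known operator prefix: treat the whole text as a value to compare for equality.
            default: return $"equals({attribute},{Quote(legacyValue)})";
        }
    }

    // Wraps a value in single quotes, doubling any embedded single quote.
    private static string Quote(string value) => "'" + value.Replace("'", "''") + "'";

    // Quotes each element of a comma-separated list individually.
    private static string QuoteList(string values) =>
        string.Join(",", values.Split(',').Select(Quote));
}
```

For example, ToNewNotation("lastName", "in:Smith,Miller") gives any(lastName,'Smith','Miller'). A value that happens to contain a colon but no known operator, such as a time of day, falls through to the equality case.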

Fixed issues

Fixes #788
Fixes #787
Fixes #764
Fixes #761
Fixes #758
Fixes #757
Fixes #751
Fixes #748
Fixes #747
Fixes #738
Fixes #551
Fixes #444
Fixes #217
Fixes #183
Fixes #176

This PR supersedes the following PRs: #705, #782.

Known limitations

These bugs in EF Core prevent some scenarios from working correctly. Please upvote them if they are important for you.

@bart-degreed
Contributor Author

bart-degreed commented Jul 17, 2020

I won't be available to respond in the coming weeks, but please feel free to try things out and provide feedback. Although still a draft, this PR is feature-complete, fully working and covered by tests.

@bart-degreed bart-degreed mentioned this pull request Jul 17, 2020
38 tasks
@bart-degreed bart-degreed requested a review from maurei July 17, 2020 12:24
@bart-degreed bart-degreed self-assigned this Jul 17, 2020
@ClintGood

Hi Bart

All this looks very clean and powerful.

The following may not be directly related to this PR but maybe a bigger issue. I can open up another issue on the main fork if that makes sense. Anyway...

I tested against my model which does not have a default "Id" field in the underlying table.
My model implements IIdentifiable
There is code that implicitly assumes there will be an Id field in the class and will fail if it is missing (eg. QueryLayerComposer.GetSort)

Attempting to work around this by creating a non mapped Id field will get past the above problem, but will incorrectly try and filter on an Id field in the database.

In short, there seems to be code that always expects Identifiable instead of IIdentifiable

@maurei maurei marked this pull request as ready for review August 3, 2020 10:19
@bart-degreed
Contributor Author

Hi @ClintGood thanks for your feedback. As far as I know, there has always been an implicit requirement for a typed Id property to exist. This is something the json:api specification depends on. I've added a comment at #797 that may solve your problem.

Bart Koelman added 2 commits August 10, 2020 13:41
…inition. This hides forced included fields from output, but still fetches them.
@bjornharrtell
Contributor

Impressive work @bart-degreed. 👍

@bart-degreed bart-degreed mentioned this pull request Aug 11, 2020
@maurei
Member

maurei commented Aug 13, 2020

General feedback item: all items in JsonApiDotNetCore.Internal.Queries.Expressions could use some explanatory docs 👍 A lot of new concepts are being introduced there, will be of great help for any dev to understand/maintain the internals

Bart Koelman added 3 commits August 13, 2020 15:09
…and now that it is integrated parsers can easily be reused from `ResourceDefinition`s without needing to know what type of relationships are allowed at various stages.
@bart-degreed
Contributor Author

bart-degreed commented Aug 13, 2020

@maurei I've moved several types out of Internal namespace and added doc-comments. Please let me know specific locations where you believe more explanation would help. I'll address overview documentation later.

… Renamed QueryParser to QueryExpressionParser, because that is the ultimate base expression type being produced.