Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Generate OPENJSON with WITH unless ordering is required #30983

Merged
merged 1 commit into from
Jun 7, 2023

Conversation

roji
Copy link
Member

@roji roji commented May 28, 2023

This PR based based on top of #30976 and #30956, review those first and ignore the first commits here.

We currently generate SQL Server OPENJSON expressions without WITH, applying a regular relational cast to applying the element store type to the output:

WHERE [t].[Id] IN (
    SELECT CAST([i].[value] AS uniqueidentifier) AS [value]
    FROM OPENJSON(@__ids_0) AS [i]

We use this form because it allows preserving the ordering of the JSON array (not needed in the above specific query), and that's not possible with the OPENJSON form that accepts WITH.

This PR switches to using OPENJSON with WITH in contexts where preserving the ordering isn't needed, such as the above query:

WHERE [t].[Id] IN (
    SELECT [i].[value]
    FROM OPENJSON(@__ids_0) WITH ([value] uniqueidentifier '$') AS [i]

Where ordering is important, we continue to use OPENJSON without WITH as today.

Compared to OPENJSON with WITH, OPENJSON without WITH has some performance drawbacks (see #13617 (comment), #13617 (comment)). In addition, some conversions from JSON are only possible when using the WITH variant (see #30727 (comment)).

/cc @yv989c, @pfritschi

@roji
Copy link
Member Author

roji commented May 29, 2023

Converting this back to a draft - there are probably patterns where ordering should be preserved that aren't detected by the current implementation (e.g. projecting out the entire collection as-is, applying an aggregate function that cares about ordering, such as string.Join). It may be a better approach to start with OPENWITH without WITH (and with the ORDER BY), and later convert to the more efficient form (with WITH) only if the ORDER BY has been removed (i.e. as a natural result of the subquery being inside IN/EXISTS).

Ended up starting with OPENJSON/WITH and an ordering by the (non-existing) key; this is an intermediate representation which is temporarily incorrect. In post-processing, a visitor finds SelectExpressions over OPENJSON where the ordering is still there, and removes the WITH (applying a cast on the projection instead). This leverages the existing logic in the query pipeline to elide ordering (e.g. when applying EXISTS/IN): if the ordering gets elided, we simply maintain the original OPENJSON/WITH; if the ordering is still there in post-processing, we convert it.

@roji roji marked this pull request as draft May 29, 2023 09:51
@roji roji marked this pull request as ready for review May 30, 2023 15:17
@roji roji force-pushed the OPENJSONWITH branch 4 times, most recently from be1f02d to 8c8cb6e Compare June 5, 2023 10:53
@roji
Copy link
Member Author

roji commented Jun 6, 2023

@maumar new version up, rebased on latest main.

I added two tests per your offline comments:

  • Column_collection_OrderByDescending_ElementAt applies an ordering to a (naturally ordered) column collection. This shows the natural ordering (by key) getting erased (e.g. x.OrderBy(a).OrderBy(b) removes the first ordering). But it did uncover a small bug in ElementAt, so thanks!
  • Inline_collection_Join_ordered_column_collection joins against an ordered collection; here too, the ordering gets elided since the ordering of joined rowsets has no meaning.

@roji roji merged commit f2a3da4 into dotnet:main Jun 7, 2023
@roji roji deleted the OPENJSONWITH branch June 7, 2023 12:24
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants