Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[Epic] Query aggregation functions on groups #28

Closed
17 of 18 tasks
jsimnz opened this issue Nov 7, 2021 · 5 comments
Closed
17 of 18 tasks

[Epic] Query aggregation functions on groups #28

jsimnz opened this issue Nov 7, 2021 · 5 comments
Labels
area/query Related to the query component epic Includes sub-issues feature New feature or request

Comments

@jsimnz jsimnz added area/query Related to the query component feature New feature or request labels Dec 2, 2021
@AndrewSisley AndrewSisley self-assigned this Dec 13, 2021
@AndrewSisley AndrewSisley added the epic Includes sub-issues label Dec 13, 2021
@AndrewSisley AndrewSisley changed the title Query aggregation functions on groups [Epic] Query aggregation functions on groups Dec 13, 2021
This was referenced Dec 13, 2021
@AndrewSisley
Copy link
Contributor

AndrewSisley commented Dec 13, 2021

Different aggregates should be able to be implemented separately - current tickets:

Notes:

  • There doesn't seem to be a reason for making these dependent on groupBy - any child items (e.g. joins, arrays) should be aggregatable
  • Don't worry about chaining child fields, this is not yet done for other areas such as groupBy and can be added later. Users can work around it by chaining aggregates - e.g. to achive foo: count(_group.child1.bar) they can do foo: sum(_group.barCount) for now

@jsimnz
Copy link
Member Author

jsimnz commented Dec 15, 2021

As per our last dev meeting on Friday Dec 10. Heres a breif breakdown of the intended syntax of the aggregate system on the GQL API.

ageAvg: _avg(field: "age")

Where:

  • ageAvg is the alias we are naming the result of the aggregate function to required
  • _avg is the aggregate function we are running. All aggregate functions (all system reserved query keywords in general) begin with a _ prefix.
  • field is an argument provided to the aggregate function. It is a parameter that expects a single argument, the name of the field to which to apply the aggregate function to.
  • age is the value of the field argument, ideally, this was implemented as an Enum of the object fields, but that wouldn't allow for aggregate functions to use other aggregates as inputs. This is because we can't create a union type on inputs, and we can't create a custom scalar that uses enums. So we are stuck with strings for now. <-- This item should be further discussed.

@AndrewSisley
Copy link
Contributor

What are your thoughts on allowing either a typesafe argument, and at a later date, a string argument capable of handling aliases? Would suggest the string prop should be done consistently with all other field references in our query language and should be done as part of another ticket/epic (like the child field chaining).

It would seem strange to me to allow aliases here, and not everywhere else, or if each item that handles aliases handled them differently.

@jsimnz
Copy link
Member Author

jsimnz commented Dec 21, 2021

Was tracking some of the commits on the aggregate branch, and saw the general strategy taken for the aggregate (specifically, the count func).

Currently you're creating a dedicated countNode planNode, just throwing it out there. Did you give any thought to a general aggregateNode and either sub nodes specific to the aggregate func in question (like countNode) similar to how the typeJoin approach. Or aggregateFuncs attach to the aggregateNode, that are executed together.

Just spit ballin.

Theres an example of how Cockroach implemented their aggregate system similar to the latter of the two above approaches that I can link.

The only downside I see to just using countNode or FUNCNode approach is that it will create alot of individual nodes that need to be directly managed by the multiNode. With the aggregateNode approach at least that complexity is handled by a node that specifically understand that complexity.

We can jump on call and throw some ideas around. Just wanted to bring it up before you got too far down the implementation.

@AndrewSisley
Copy link
Contributor

AndrewSisley commented Dec 21, 2021

I was going to keep them as independent as possible, probably with stuff like average composed of count and sum. I also like to avoid over thinking stuff like that until the adding the second/third etc to minimize the amount of thought required to implement any given item - should be really easy to change post merge if we want.

Can wrap them maybe for organisational reasons, or just move them into an aggregate package, but I largely prefer to keep types separate as opposed to code branching.

RE: your last paragraph - is already implemented, just been cleaning up the code 😂 But that shouldn't stop us from changing anything of course. Call could be good, but preferably not today as it is quite late here now and I'm full of food :)

@jsimnz jsimnz assigned AndrewSisley and unassigned AndrewSisley Jun 7, 2022
@AndrewSisley AndrewSisley modified the milestones: DefraDB v0.3.1, DefraDB v0.4, DefraDB v0.5 Sep 16, 2022
@jsimnz jsimnz modified the milestones: DefraDB v0.5, DefraDB v0.6 Mar 3, 2023
@fredcarle fredcarle modified the milestones: DefraDB v0.6, DefraDB v0.7 Aug 24, 2023
@fredcarle fredcarle removed this from the DefraDB v0.7 milestone Jan 4, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
area/query Related to the query component epic Includes sub-issues feature New feature or request
Projects
None yet
Development

No branches or pull requests

3 participants