Usage reporting: report referenced fields in addition to executed fields #5956

Merged: @glasser merged 6 commits into release-3.6.0 from glasser/referenced-field-reporting on Dec 17, 2021

@glasser glasser commented Dec 17, 2021

In Apollo Studio, the "Fields" page lets you see how often your fields
are executed by operations, i.e., how often their resolvers run. It
additionally lets you see which clients and operations executed those
fields.

However, knowing if an operation executed a field doesn't tell the
whole story about the relationship between operations and fields. You
might also be curious to learn if a field was textually referenced in
the operation itself.

It's possible for a field to be referenced without executing. In fact,
there are many reasons this can happen:

  • The field is nested under another field which evaluates to null
  • The field is nested under another field which evaluates to an empty
    list
  • The field is nested under a non-matching fragment
  • The field is nested under @include or @skip

If you're using the Fields page to determine "what are all the
operations that use this field", you probably want to know about these
usages too!
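
As a rough illustration of the first two cases, here is a minimal TypeScript sketch. It is not the plugin's implementation; the `Selection` shape and both helper functions are hypothetical and stand in for a real GraphQL AST and executor. It walks a toy selection set once statically and once against response data, and shows a field that is referenced but never executed because its parent resolved to null.

```typescript
// Hypothetical, simplified model of a selection set (not a real GraphQL AST):
// each selection names its parent type, the field, and any nested selections.
interface Selection {
  parentType: string;
  fieldName: string;
  children?: Selection[];
}

// Collect every "Type.field" that appears textually in the operation.
function referencedFields(
  selections: Selection[],
  out: Set<string> = new Set<string>(),
): Set<string> {
  for (const sel of selections) {
    out.add(`${sel.parentType}.${sel.fieldName}`);
    if (sel.children) referencedFields(sel.children, out);
  }
  return out;
}

// Collect every "Type.field" whose resolver would run against `data`.
// Nested selections never run when the parent value is null or an empty list.
function executedFields(
  selections: Selection[],
  data: Record<string, unknown>,
  out: Set<string> = new Set<string>(),
): Set<string> {
  for (const sel of selections) {
    out.add(`${sel.parentType}.${sel.fieldName}`);
    const value = data[sel.fieldName];
    if (!sel.children || value === null || value === undefined) continue;
    const items: unknown[] = Array.isArray(value) ? value : [value];
    for (const item of items) {
      executedFields(sel.children, item as Record<string, unknown>, out);
    }
  }
  return out;
}

// Operation: { author { name } }, where `author` resolves to null.
const operation: Selection[] = [
  {
    parentType: "Query",
    fieldName: "author",
    children: [{ parentType: "Author", fieldName: "name" }],
  },
];

const referenced = referencedFields(operation);
const executed = executedFields(operation, { author: null });

console.log([...referenced]); // Query.author and Author.name
console.log([...executed]); // only Query.author
```

Passing `{ author: [] }` as the data gives the same executed set, which corresponds to the empty-list case.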

It's also possible for a field to be executed without being referenced.
That's because when we track "executed fields", we are always tracking
the concrete object type that is being executed, not an interface type
it may have been resolved through. So with this schema:

interface Animal {
  legs: Int
}
type Dog implements Animal {
  legs: Int
}
type Query {
  myFavoriteAnimal: Animal  # In practice, always returns a Dog
}

the operation { myFavoriteAnimal { legs } } "references"
Animal.legs, not Dog.legs, but it is the field Dog.legs that is
executed.
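
The difference can be sketched in a few lines of TypeScript. This is only an illustration of the bookkeeping, assuming a hypothetical `record` helper and a hand-written runtime value; it does not use the real schema or executor.

```typescript
type FieldsByType = Map<string, Set<string>>;

// Hypothetical helper: note that `fieldName` was seen on `typeName`.
function record(map: FieldsByType, typeName: string, fieldName: string): void {
  const fields = map.get(typeName) ?? new Set<string>();
  fields.add(fieldName);
  map.set(typeName, fields);
}

const referencedAnimal: FieldsByType = new Map();
const executedAnimal: FieldsByType = new Map();

// Static view of { myFavoriteAnimal { legs } }: `myFavoriteAnimal` is declared
// to return the interface Animal, so `legs` is a reference to Animal.legs.
record(referencedAnimal, "Query", "myFavoriteAnimal");
record(referencedAnimal, "Animal", "legs");

// Runtime view: the resolver returned a Dog, so the field that runs is Dog.legs.
const resolvedAnimal = { __typename: "Dog", legs: 4 };
record(executedAnimal, "Query", "myFavoriteAnimal");
record(executedAnimal, resolvedAnimal.__typename, "legs");

console.log([...referencedAnimal.keys()]); // Query, Animal
console.log([...executedAnimal.keys()]); // Query, Dog
```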

(Additionally, when using federation, fields can be executed without
being referenced in the operation if the query plan requires them to be
executed to fulfill an @requires or @key directive.)

This PR extends Apollo Server's usage reporting plugin to provide a list
of "referenced fields" along with every operation. Note that these
fields depend only on the operation, not on the variables passed or
anything at execution time, so we don't need one set per trace, just one
set per operation.
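
To make the "one set per operation" point concrete, here is a hedged sketch of how a report could be organized. The `OperationEntry`, `TraceSummary`, and `addTrace` names are hypothetical and do not reflect the actual report format; the point is only that the referenced field list is computed once per operation and shared by all of its traces.

```typescript
// Hypothetical per-request data; varies with variables and execution.
interface TraceSummary {
  durationMs: number;
  variablesJson: string;
}

// Hypothetical per-operation entry; referenced fields are stored once.
interface OperationEntry {
  referencedFieldsByType: Record<string, string[]>;
  traces: TraceSummary[];
}

const reportByOperation = new Map<string, OperationEntry>();
let referencedFieldComputations = 0;

function addTrace(
  operationKey: string,
  computeReferencedFields: () => Record<string, string[]>,
  trace: TraceSummary,
): void {
  let entry = reportByOperation.get(operationKey);
  if (entry === undefined) {
    // First time we see this operation: compute its referenced fields once.
    referencedFieldComputations += 1;
    entry = { referencedFieldsByType: computeReferencedFields(), traces: [] };
    reportByOperation.set(operationKey, entry);
  }
  entry.traces.push(trace);
}

const computeForBookQuery = (): Record<string, string[]> => ({
  Query: ["book"],
  Book: ["title"],
});

// Two requests for the same operation with different variables.
addTrace("query Q($id: ID!)", computeForBookQuery, { durationMs: 12, variablesJson: '{"id":"1"}' });
addTrace("query Q($id: ID!)", computeForBookQuery, { durationMs: 9, variablesJson: '{"id":"2"}' });
```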

In addition to the fact that this statistic is an interesting one, this
will also mean that the Fields page can be useful without needing full
execution tracing. There is a real performance impact of full execution
tracing, especially in the federation context where the ftv1 protocol
involves sending subgraph traces to the gateway/router in-band in the
response. And not every GraphQL server supports full execution tracing
in the first place. With support for referenced fields on the Fields
page, you will be able to run a gateway/router in front of an arbitrary
subgraph without federated tracing (whether for performance reasons or
because the subgraph doesn't implement it) and still get some useful
data on the Fields page. (You can then perhaps run full tracing on a
sampled subset of queries to get reasonable approximate execution data
too.)

Note that the new Studio functionality may not be in general
availability when this PR merges into the Apollo Server release branch
or when the first alphas are created; it will be fully enabled
(including docs) by the time this PR is released in a non-prerelease.

As part of the implementation, we extend the existing "signature cache"
to cache the referenced field list as well. Note that the signature is a
pure function of the operation, whereas the referenced field list also
depends on the schema, so we add a little mechanism to throw out the
cache if the schema changes.
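
One way such a mechanism could look is sketched below. This is an assumption-laden illustration, not the code in this PR: the `OperationInfoCache` class and `CachedOperationInfo` shape are hypothetical, the cache is an unbounded `Map`, and a schema change is detected by object identity.

```typescript
// Hypothetical cached value: the signature depends only on the operation,
// while the referenced field list also depends on the schema.
interface CachedOperationInfo {
  signature: string;
  referencedFieldsByType: Record<string, string[]>;
}

class OperationInfoCache<Schema extends object> {
  private lastSchema: Schema | undefined = undefined;
  private entries = new Map<string, CachedOperationInfo>();

  get(
    schema: Schema,
    operationKey: string,
    compute: () => CachedOperationInfo,
  ): CachedOperationInfo {
    // A different schema object may change which types fields resolve
    // against, so every cached entry is discarded.
    if (schema !== this.lastSchema) {
      this.entries.clear();
      this.lastSchema = schema;
    }
    let entry = this.entries.get(operationKey);
    if (entry === undefined) {
      entry = compute();
      this.entries.set(operationKey, entry);
    }
    return entry;
  }
}

// Usage: the second lookup hits the cache; the new schema forces a recompute.
let cacheComputations = 0;
const computeInfo = (): CachedOperationInfo => {
  cacheComputations += 1;
  return { signature: "{myFavoriteAnimal{legs}}", referencedFieldsByType: { Animal: ["legs"] } };
};

const infoCache = new OperationInfoCache<{ version: number }>();
const schemaV1 = { version: 1 };
const schemaV2 = { version: 2 };

infoCache.get(schemaV1, "op", computeInfo);
infoCache.get(schemaV1, "op", computeInfo);
const computationsAfterSameSchema = cacheComputations;
infoCache.get(schemaV2, "op", computeInfo);
const computationsAfterNewSchema = cacheComputations;
```

Clearing the whole cache is the simplest option; it trades some recomputation after a schema change for not having to track which entries a given change affects.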

Fixes #5708.

@glasser added the size/large (Estimated to take MORE THAN A WEEK) and 2021-12 labels on Dec 17, 2021
@trevor-scheer trevor-scheer left a comment

Just a few minor comments, LGTM. Don't forget a changelog entry.

@glasser glasser changed the base branch from main to release-3.6.0 December 17, 2021 20:59
glasser commented Dec 17, 2021

CHANGELOG update: a66ad28

@trevor-scheer trevor-scheer left a comment

Awesome, thanks for all the good test additions 👍

@glasser glasser merged commit 17c6211 into release-3.6.0 Dec 17, 2021
@glasser glasser deleted the glasser/referenced-field-reporting branch December 17, 2021 21:41
@glasser glasser mentioned this pull request Dec 21, 2021
@github-actions github-actions bot locked as resolved and limited conversation to collaborators Mar 16, 2023
Labels
size/large Estimated to take MORE THAN A WEEK
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Extend Studio usage reporting to support referenced fields in addition to executed fields
2 participants