Plugin for connecting arbitrary GraphQL APIs to Gatsby's GraphQL. Remote schemas are stitched together by declaring an arbitrary type name that wraps the remote schema Query type (typeName
below), and putting the remote schema under a field of the Gatsby GraphQL query (fieldName
below).
npm install gatsby-source-graphql
If the remote GraphQL API needs authentication, you should pass environment variables to the build process, so credentials aren't committed to source control. We recommend using dotenv
, which will then expose environment variables. Read more about dotenv and using environment variables here. Then we can use these environment variables via process.env
and configure our plugin.
// In your gatsby-config.js
module.exports = {
plugins: [
// Simple config, passing URL
{
resolve: "gatsby-source-graphql",
options: {
// Arbitrary name for the remote schema Query type
typeName: "SWAPI",
// Field under which the remote schema will be accessible. You'll use this in your Gatsby query
fieldName: "swapi",
// Url to query from
url: "https://swapi-graphql.netlify.app/.netlify/functions/index",
},
},
// Advanced config, passing parameters to apollo-link
{
resolve: "gatsby-source-graphql",
options: {
typeName: "GitHub",
fieldName: "github",
url: "https://api.github.com/graphql",
// HTTP headers
headers: {
// Learn about environment variables: https://gatsby.dev/env-vars
Authorization: `Bearer ${process.env.GITHUB_TOKEN}`,
},
// HTTP headers alternatively accepts a function (allows async)
headers: async () => {
return {
Authorization: await getAuthorizationToken(),
}
},
// Additional options to pass to node-fetch
fetchOptions: {},
},
},
// Advanced config, using a custom fetch function
{
resolve: "gatsby-source-graphql",
options: {
typeName: "GitHub",
fieldName: "github",
url: "https://api.github.com/graphql",
// A `fetch`-compatible API to use when making requests.
fetch: (uri, options = {}) =>
fetch(uri, { ...options, headers: sign(options.headers) }),
},
},
// Complex situations: creating arbitrary Apollo Link
{
resolve: "gatsby-source-graphql",
options: {
typeName: "GitHub",
fieldName: "github",
// Create Apollo Link manually. Can return a Promise.
createLink: pluginOptions => {
return createHttpLink({
uri: "https://api.github.com/graphql",
headers: {
Authorization: `Bearer ${process.env.GITHUB_TOKEN}`,
},
fetch,
})
},
},
},
],
}
{
# This is the fieldName you've defined in the config
swapi {
allSpecies {
name
}
}
github {
viewer {
email
}
}
}
By default, the schema is introspected from the remote schema. The schema is cached in the .cache
directory, and refreshing the schema requires deleting the cache (e.g. by restarting gatsby develop
).
To control schema consumption, you can alternatively construct the schema definition by passing a createSchema
callback. This way you could, for example, read schema SDL or introspection JSON. When the createSchema
callback is used, the schema isn't cached. createSchema
can return a GraphQLSchema instance, or a Promise resolving to one.
const fs = require("fs")
const { buildSchema, buildClientSchema } = require("graphql")
module.exports = {
plugins: [
{
resolve: "gatsby-source-graphql",
options: {
typeName: "SWAPI",
fieldName: "swapi",
url: "https://api.graphcms.com/simple/v1/swapi",
createSchema: async () => {
const json = JSON.parse(
fs.readFileSync(`${__dirname}/introspection.json`)
)
return buildClientSchema(json.data)
},
},
},
{
resolve: "gatsby-source-graphql",
options: {
typeName: "SWAPI",
fieldName: "swapi",
url: "https://api.graphcms.com/simple/v1/swapi",
createSchema: async () => {
const sdl = fs.readFileSync(`${__dirname}/schema.sdl`).toString()
return buildSchema(sdl)
},
},
},
],
}
The plugin provides createLink
option that you can use to enhance your network stack:
add error handling, retries, timeouts, etc. Check out this article
for an excellent explanation of how it works.
Quick example of the http link with retries (using apollo-link-retry):
// gatsby-config.js
const { createHttpLink } = require(`apollo-link-http`)
const { RetryLink } = require(`apollo-link-retry`)
const retryLink = new RetryLink({
delay: {
initial: 100,
max: 2000,
jitter: true,
},
attempts: {
max: 5,
retryIf: (error, operation) =>
Boolean(error) && ![500, 400].includes(error.statusCode),
},
})
module.exports = {
plugins: [
{
resolve: "gatsby-source-graphql",
options: {
typeName: "SWAPI",
fieldName: "swapi",
url: "https://api.graphcms.com/simple/v1/swapi",
createLink: options =>
ApolloLink.from([retryLink, createHttpLink(options)]),
},
},
],
}
It's possible to modify the remote schema, via a transformSchema
option which customizes the way the default schema is transformed before it is merged on the Gatsby schema by the stitching process.
The transformSchema
function gets an object argument with the following fields:
- schema (introspected remote schema)
- link (default link)
- resolver (default resolver)
- defaultTransforms (an array with the default transforms)
- options (plugin options)
The return value is expected to be the final schema used for stitching.
Below an example configuration that uses the default implementation (equivalent to not using the transformSchema
option at all):
const { wrapSchema } = require(`@graphql-tools/wrap`)
const { linkToExecutor } = require(`@graphql-tools/links`)
module.exports = {
plugins: [
{
resolve: "gatsby-source-graphql",
options: {
typeName: "SWAPI",
fieldName: "swapi",
url: "https://api.graphcms.com/simple/v1/swapi",
transformSchema: ({
schema,
link,
resolver,
defaultTransforms,
options,
}) => {
return wrapSchema(
{
schema,
executor: linkToExecutor(link),
},
defaultTransforms
)
}
},
]
}
For details, refer to https://www.graphql-tools.com/docs/schema-wrapping.
An use case for this feature can be seen in this issue.
By default, gatsby-source-graphql
will only refetch the data once the server is restarted. It's also possible to configure the plugin to periodically refetch the data. The option is called refetchInterval
and specifies the timeout in seconds.
module.exports = {
plugins: [
// Simple config, passing URL
{
resolve: "gatsby-source-graphql",
options: {
// Arbitrary name for the remote schema Query type
typeName: "SWAPI",
// Field under which the remote schema will be accessible. You'll use this in your Gatsby query
fieldName: "swapi",
// Url to query from
url: "https://api.graphcms.com/simple/v1/swapi",
// refetch interval in seconds
refetchInterval: 60,
},
},
],
}
By default, gatsby-source-graphql
executes each query in a separate network request.
But the plugin also supports query batching to improve query performance.
Caveat: Batching is only possible for queries starting at approximately the same time. In other words it is bounded by the number of parallel GraphQL queries executed by Gatsby (by default it is 4).
Fortunately, we can increase the number of queries executed in parallel by setting the environment variable
GATSBY_EXPERIMENTAL_QUERY_CONCURRENCY
to a higher value and setting the batch
option of the plugin
to true
.
Example:
cross-env GATSBY_EXPERIMENTAL_QUERY_CONCURRENCY=20 gatsby develop
With plugin config:
const fs = require("fs")
const { buildSchema, buildClientSchema } = require("graphql")
module.exports = {
plugins: [
{
resolve: "gatsby-source-graphql",
options: {
typeName: "SWAPI",
fieldName: "swapi",
url: "https://api.graphcms.com/simple/v1/swapi",
batch: true,
},
},
],
}
By default, the plugin batches up to 5 queries. You can override this by passing
dataLoaderOptions
and set a maxBatchSize
:
const fs = require("fs")
const { buildSchema, buildClientSchema } = require("graphql")
module.exports = {
plugins: [
{
resolve: "gatsby-source-graphql",
options: {
typeName: "SWAPI",
fieldName: "swapi",
url: "https://api.graphcms.com/simple/v1/swapi",
batch: true,
// See https://github.com/graphql/dataloader#new-dataloaderbatchloadfn--options
// for a full list of DataLoader options
dataLoaderOptions: {
maxBatchSize: 10,
},
},
},
],
}
Having 20 parallel queries with 5 queries per batch means we are still running 4 batches in parallel.
Each project is unique so try tuning those two variables and see what works best for you. We've seen up to 5-10x speed-up for some setups.
Under the hood gatsby-source-graphql
uses DataLoader
for query batching. It merges all queries from a batch to a single query that gets sent to the
server in a single network request.
Consider the following example where both of these queries are run:
{
query: `query(id: Int!) {
node(id: $id) {
foo
}
}`,
variables: { id: 1 },
}
{
query: `query(id: Int!) {
node(id: $id) {
bar
}
}`,
variables: { id: 2 },
}
They will be merged into a single query:
{
query: `
query(
$gatsby0_id: Int!
$gatsby1_id: Int!
) {
gatsby0_node: node(id: $gatsby0_id) {
foo
}
gatsby1_node: node(id: $gatsby1_id) {
bar
}
}
`,
variables: {
gatsby0_id: 1,
gatsby1_id: 2,
}
}
Then gatsby-source-graphql
splits the result of this single query into multiple results
and delivers it back to Gatsby as if it executed multiple queries:
{
data: {
gatsby0_node: { foo: `foo` },
gatsby1_node: { bar: `bar` },
},
}
is transformed back to:
[
{ data { node: { foo: `foo` } } },
{ data { node: { bar: `bar` } } },
]
Note that if any query result contains errors the whole batch will fail.
If your server supports apollo-style query batching you can also try
HttpLinkDataLoader.
Pass it to the gatsby-source-graphql
plugin via the createLink
option.
This strategy is usually slower than query merging but provides better error reporting.