feat(router): enable using redis clusters for rate limiting and apq #1499

Merged
df-wg merged 15 commits into main from dave/eng-6144-verify-use-of-redis-cluster-mode-for-apq on Jan 31, 2025

df-wg commented Jan 8, 2025

Motivation and Context

Previously, users could provide only a single Redis instance to Cosmo for both APQ (automatic persisted queries) and rate limiting. This did not take advantage of Redis' built-in cluster mode, which handles horizontal scaling for the user.

This PR enables cluster mode. Users now provide a list of cluster URLs in the renamed urls field, replacing the single url they previously supplied. To opt in to cluster mode, users must also set cluster_enabled: true in their configuration for both APQ and rate limiting.

Warning

As part of the preparations for Cosmo V1, targeted for release in Q1 2025, this pull request introduces essential changes to enhance long-term stability and maintainability. While we strive to minimize breaking changes, they are sometimes necessary to lay the foundation for a more robust and scalable system.

Before:

rate_limit:
  enabled: true
  strategy: "simple"
  storage:
    url: "testuser:testpass@localhost:8000"
    key_prefix: "cosmo_rate_limit"  

storage_providers:
  redis:
    - id: "my_redis"
      url: "test:testpass@localhost:8000"

After:

rate_limit:
  enabled: true
  strategy: "simple"
  storage:
    cluster_enabled: true
    urls:
      - "testuser:testpass@localhost:8000"
      - "test2:testpass@localhost:8001"
    key_prefix: "cosmo_rate_limit"  

storage_providers:
  redis:
    - id: "my_redis"
      cluster_enabled: true
      urls:
        - "test:testpass@localhost:8000"
        - "test2:testpass@localhost:8001"

Migration Path:

[ ] Rename storage_providers.redis.url to storage_providers.redis.urls and rate_limit.storage.url to rate_limit.storage.urls, and move the existing URL into the new list as its first entry
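
As a rough sketch of that migration step for a deployment that stays on a single Redis instance, the configuration might look like the following. It assumes that cluster_enabled can be left out when cluster mode is not wanted, since the description above only requires it for opting in. The credentials and host are the placeholder values from the Before example.

```yaml
rate_limit:
  enabled: true
  strategy: "simple"
  storage:
    # Assumption: cluster_enabled is omitted, so cluster mode stays off.
    urls:
      - "testuser:testpass@localhost:8000"  # the former `url` value
    key_prefix: "cosmo_rate_limit"

storage_providers:
  redis:
    - id: "my_redis"
      # Assumption: cluster_enabled is omitted here as well.
      urls:
        - "test:testpass@localhost:8000"  # the former `url` value
```

Adding further entries to urls and setting cluster_enabled: true, as in the After example, would then be a separate, optional step.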

github-actions Bot commented Jan 8, 2025

Router image scan passed

✅ No security vulnerabilities found in image:

ghcr.io/wundergraph/cosmo/router:sha-eaafdace83a5d516c3369dac1d11b808af5646dd

df-wg force-pushed the dave/eng-6144-verify-use-of-redis-cluster-mode-for-apq branch from a152f50 to fbe58fc on January 22, 2025 07:13
df-wg force-pushed the dave/eng-6144-verify-use-of-redis-cluster-mode-for-apq branch from de3fe03 to 6c94921 on January 27, 2025 08:35
df-wg force-pushed the dave/eng-6144-verify-use-of-redis-cluster-mode-for-apq branch from d9388bb to 27b8c6d on January 30, 2025 08:12

StarpTech left a comment

LGTM

df-wg enabled auto-merge (squash) January 31, 2025 08:50
df-wg merged commit 7c5b3a7 into main Jan 31, 2025
df-wg deleted the dave/eng-6144-verify-use-of-redis-cluster-mode-for-apq branch January 31, 2025 09:04
james-braund-cabiri added a commit to cabiri-io/cosmo that referenced this pull request Jan 31, 2025
* feat: expose type data and record subgraphs for enums (wundergraph#1495)

* chore(release): Publish [skip ci]

 - cdn@0.12.0
 - wgc@0.71.5
 - @wundergraph/composition@0.34.0
 - controlplane@0.119.0
 - @wundergraph/cosmo-shared@0.33.3
 - studio@0.97.0

* feat: improve rate limit responses (add code, hide stats) (wundergraph#1497)

* chore(release): Publish [skip ci]

 - router@0.161.0

* fix: provider should be specified in the config.yaml (wundergraph#1397)

* fix: update the timeouts for clickhouse and platform service (wundergraph#1500)

* chore(release): Publish [skip ci]

 - wgc@0.71.6
 - controlplane@0.119.1
 - router@0.161.1

* fix: add edfs to the demo environment (wundergraph#1505)

* docs(CONTRIBUTING): fixup minor mistake in CONTRIBUTING.md under Go workspace (wundergraph#1502)

Co-authored-by: Dustin Deus <deusdustin@gmail.com>

* fix: full demo broken in main branch (wundergraph#1508)

* feat(router): optionally add jitter to config polling interval (wundergraph#1506)

Co-authored-by: Dustin Deus <deusdustin@gmail.com>

* chore(release): Publish [skip ci]

 - router@0.162.0

* fix(router): remove wildcard from router graphql path (wundergraph#1509)

* fix: use gauge for server.uptime metric (wundergraph#1510)

Co-authored-by: Ludwig <ludwig.bedacht@gmail.com>

* feat: cache warmer (wundergraph#1501)

Co-authored-by: Ludwig <ludwig.bedacht@gmail.com>
Co-authored-by: starptech <deusdustin@gmail.com>

* chore(release): Publish [skip ci]

 - cdn@0.13.0
 - @wundergraph/cosmo-cdn@0.8.0
 - wgc@0.72.0
 - @wundergraph/cosmo-connect@0.91.0
 - controlplane@0.120.0
 - graphqlmetrics@0.32.0
 - router@0.163.0
 - @wundergraph/cosmo-shared@0.33.4
 - studio@0.98.0

* fix(cache warmup): consider only po of the last 7 days (wundergraph#1513)

* chore(release): Publish [skip ci]

 - controlplane@0.120.1

* fix(cache operation): swallow cache errors and other improvements (wundergraph#1515)

* chore(release): Publish [skip ci]

 - controlplane@0.120.2
 - graphqlmetrics@0.32.1
 - router@0.163.1
 - studio@0.98.1

* feat: add variables remapping support (wundergraph#1516)

Co-authored-by: starptech <deusdustin@gmail.com>

* chore(release): Publish [skip ci]

 - router@0.164.0

* fix(router): write proper line endings and header for multipart (wundergraph#1517)

* chore(release): Publish [skip ci]

 - router@0.164.1

* feat(router): optimize playground delivery, add concurrency_limit to config (wundergraph#1519)

* fix(router): enable health checks during startup (wundergraph#1529)

* feat: improve cache warmer (wundergraph#1530)

Co-authored-by: Ludwig <ludwig.bedacht@gmail.com>

* chore(release): Publish [skip ci]

 - controlplane@0.121.0
 - router@0.165.0
 - studio@0.99.0

* fix: remove semaphore from ResolveGraphQLSubscription (wundergraph#1532)

* chore(release): Publish [skip ci]

 - router@0.165.1

* feat: add compatibility handshake between router and execution config (wundergraph#1534)

* chore(release): Publish [skip ci]

 - wgc@0.72.1
 - @wundergraph/composition@0.35.0
 - @wundergraph/cosmo-connect@0.92.0
 - controlplane@0.121.1
 - router@0.166.0
 - @wundergraph/cosmo-shared@0.34.0
 - studio@0.99.1

* feat: also add handshake for static execution configs (wundergraph#1535)

* chore(router): bump demo library to pickup subscription fix (wundergraph#1518)

* feat(router): add interface for trace propagation (wundergraph#1526)

* chore(release): Publish [skip ci]

 - router@0.167.0

* fix: adding/removing directive is not picked up by wgc subgraph check (wundergraph#1494)

* chore(deps): upgrade ristretto to v2 (wundergraph#1538)

* feat: add normalizedQuery to query plan and request info to trace (wundergraph#1536)

Co-authored-by: df-wg <dave@wundergraph.com>

* fix: add copy button to subgraph routing url (wundergraph#1543)

Co-authored-by: Dustin Deus <deusdustin@gmail.com>

* fix: webhooks shot when schema is unchanged (wundergraph#1542)

* fix: trim the inputs of group mappers (wundergraph#1541)

* fix: subgraphs search functionality (wundergraph#1540)

* chore(release): Publish [skip ci]

 - controlplane@0.121.2
 - graphqlmetrics@0.32.2
 - router@0.168.0
 - studio@0.99.2

* fix: increase max concurrent resolvers (wundergraph#1544)

* refactor(router): redesign JWK authentication logic (wundergraph#1498)

* chore(release): Publish [skip ci]

 - router@0.168.1

* fix: increase the test timeout value to prevent failures on slower machines (wundergraph#1547)

* fix: reduce the breaking change retention duration (wundergraph#1550)

* fix: change the defaults of breaking-change-retention (wundergraph#1551)

* feat(router): enable starting the router without subgraphs (wundergraph#1533)

* fix(router): parse accept header per rfc 9110 (wundergraph#1549)

* chore(release): Publish [skip ci]

 - controlplane@0.121.3
 - router@0.169.0
 - studio@0.99.3

* feat(router): enable using redis clusters for rate limiting and apq (wundergraph#1499)

* fix: json schema for traffic shaping subgraphs (wundergraph#1552)

* chore: Update aws-lambda-router customisation after upstream sync

---------

Co-authored-by: Nithin Kumar B <nithinkumar5353@gmail.com>
Co-authored-by: hardworker-bot <bot@wundergraph.com>
Co-authored-by: Jens Neuse <jens.neuse@gmx.de>
Co-authored-by: Alessandro Pagnin <ale@wundergraph.com>
Co-authored-by: Suvij Surya <suvijsurya76@gmail.com>
Co-authored-by: endigma <endigma@mailcat.ca>
Co-authored-by: Dustin Deus <deusdustin@gmail.com>
Co-authored-by: Ludwig <ludwig.bedacht@gmail.com>
Co-authored-by: Sergiy 🇺🇦 <818351+devsergiy@users.noreply.github.com>
Co-authored-by: df-wg <dave@wundergraph.com>
Co-authored-by: Aenimus <47415099+Aenimus@users.noreply.github.com>
james-braund-cabiri added a commit to cabiri-io/cosmo that referenced this pull request Feb 4, 2025
* fix: subgraph timeout can't be bigger than global timeout (wundergraph#1548)

* fix: error when graph token is not set when cache warmup is enabled (wundergraph#1554)

* chore(release): Publish [skip ci]

 - router@0.170.0

* fix: incorrect graphql endpoint in playground (wundergraph#1562)

* chore(release): Publish [skip ci]

 - @wundergraph/playground@0.8.3
 - router@0.170.1

* fix: update vulnerable packages (wundergraph#1560)

* fix: synchronize go mod versions (wundergraph#1564)

* chore: reduce verbose logging for failed tests (wundergraph#1565)

* fix: Add missing config mapping, bump aws-lambda-router version

* fix: Repair PNPM lockfile after merge

---------

Co-authored-by: Nithin Kumar B <nithinkumar5353@gmail.com>
Co-authored-by: hardworker-bot <bot@wundergraph.com>
Co-authored-by: Jens Neuse <jens.neuse@gmx.de>
Co-authored-by: Alessandro Pagnin <ale@wundergraph.com>
Co-authored-by: Suvij Surya <suvijsurya76@gmail.com>
Co-authored-by: endigma <endigma@mailcat.ca>
Co-authored-by: Dustin Deus <deusdustin@gmail.com>
Co-authored-by: Ludwig <ludwig.bedacht@gmail.com>
Co-authored-by: Sergiy 🇺🇦 <818351+devsergiy@users.noreply.github.com>
Co-authored-by: df-wg <dave@wundergraph.com>
Co-authored-by: Aenimus <47415099+Aenimus@users.noreply.github.com>
3 participants