Merged
2 changes: 1 addition & 1 deletion Index.md
@@ -30,7 +30,7 @@ A Kubernetes-native Go microservice framework for building production-grade gRPC
| **Distributed Tracing** | [OpenTelemetry] and [New Relic] support with automatic span creation in interceptors — traces can be sent to any OTLP-compatible backend including [Jaeger] |
| **Prometheus Metrics** | Built-in request latency, error rate, and circuit breaker metrics at `/metrics` |
| **Error Tracking** | Stack traces, gRPC status codes, and async notification to [Sentry], Rollbar, or Airbrake |
| **Resilience** | Client-side circuit breaking and retries via interceptors |
| **Rate Limiting** | Per-pod token bucket rate limiter — disabled by default, pluggable via custom `Limiter` interface for distributed or per-tenant rate limiting. Config: `RATE_LIMIT_PER_SECOND` |
| **Fast Serialization** | [vtprotobuf] codec enabled by default — faster gRPC marshalling with automatic fallback to standard protobuf |
| **Kubernetes-native** | Health/ready probes, graceful SIGTERM shutdown, structured JSON logs, Prometheus metrics — all wired automatically |
| **Swagger / OpenAPI** | Interactive API docs auto-served at `/swagger/` from your protobuf definitions |
17 changes: 10 additions & 7 deletions architecture.md
@@ -199,13 +199,16 @@ Interceptors are gRPC middleware that run on every request. ColdBrew chains them

| Order | Interceptor | Package | What It Does |
|-------|------------|---------|--------------|
| 1 | Response Time Logging | `interceptors` | Logs method name, duration, and status code |
| 2 | Trace ID | `interceptors` | Generates a trace ID (or reads it from the `x-trace-id` HTTP header or a `trace_id` proto field) and propagates it to structured logs, Sentry/Rollbar error reports, and OpenTelemetry spans (as the `coldbrew.trace_id` attribute) |
| 3 | Proto Validate | `interceptors` | Validates incoming messages using [protovalidate](https://github.com/bufbuild/protovalidate) annotations. Returns `InvalidArgument` on failure. Disable with `DISABLE_PROTO_VALIDATE` |
| 4 | Prometheus | `interceptors` | Records request count, latency histogram, and status codes |
| 5 | Error Notification | `interceptors` | Sends errors to Sentry/Rollbar/Airbrake asynchronously |
| 6 | New Relic | `interceptors` | Creates a New Relic transaction for APM |
| 7 | Panic Recovery | `interceptors` | Catches panics and converts them to gRPC errors |
| 1 | Default Timeout | `interceptors` | Applies a default 60s deadline to unary RPCs that arrive without one. This prevents resource exhaustion from clients that don't set deadlines. Config: `GRPC_SERVER_DEFAULT_TIMEOUT_IN_SECONDS` (set to `0` to disable) |
| 2 | Rate Limiting | `interceptors` | Per-pod token bucket rate limiter. Returns `ResourceExhausted` when the limit is exceeded. Disabled by default. Config: `RATE_LIMIT_PER_SECOND`, `RATE_LIMIT_BURST`, `DISABLE_RATE_LIMIT` |
| 3 | Response Time Logging | `interceptors` | Logs method name, duration, and status code |
| 4 | Trace ID | `interceptors` | Generates a trace ID (or reads it from the `x-trace-id` HTTP header or a `trace_id` proto field) and propagates it to structured logs, Sentry/Rollbar error reports, and OpenTelemetry spans (as the `coldbrew.trace_id` attribute) |
| 5 | Debug Log | `interceptors` | Enables a per-request log level override via a `bool debug` proto field or the `x-debug-log-level` metadata header. Config: `DISABLE_DEBUG_LOG_INTERCEPTOR`, `DEBUG_LOG_HEADER_NAME` |
| 6 | Proto Validate | `interceptors` | Validates incoming messages using [protovalidate](https://github.com/bufbuild/protovalidate) annotations. Returns `InvalidArgument` on failure. Config: `DISABLE_PROTO_VALIDATE` |
| 7 | Prometheus | `interceptors` | Records request count, latency histogram, and status codes |
| 8 | Error Notification | `interceptors` | Sends errors to Sentry/Rollbar/Airbrake asynchronously |
| 9 | New Relic | `interceptors` | Creates a New Relic transaction for APM |
| 10 | Panic Recovery | `interceptors` | Catches panics and converts them to gRPC errors |

{: .note }
OpenTelemetry tracing spans are created by the `otelgrpc` stats handler configured at the gRPC server/client level, not as an interceptor in the chain.
3 changes: 3 additions & 0 deletions config-reference.md
@@ -55,6 +55,9 @@ cfg := config.GetColdBrewConfig()
| `DISABLE_DEBUG_LOG_INTERCEPTOR` | bool | `false` | Disable the DebugLogInterceptor. When disabled, proto `debug`/`enable_debug` fields and `x-debug-log-level` headers will not trigger per-request debug logging |
| `DEBUG_LOG_HEADER_NAME` | string | `x-debug-log-level` | gRPC metadata / HTTP header name for per-request debug logging. The header value should be a valid log level (`debug`, `info`, `warn`, `error`). See [Log How-To](/howto/Log/#production-debugging-with-overrideloglevel--trace-id) |
| `GRPC_SERVER_DEFAULT_TIMEOUT_IN_SECONDS` | int | `60` | Default timeout for incoming unary gRPC requests without a deadline. Set to `0` to disable. Does not apply to stream RPCs |
| `RATE_LIMIT_PER_SECOND` | float64 | `0` | Maximum incoming requests per second for this pod (per-pod in-memory token bucket). Set to `0` to disable (default). With N pods, the effective cluster-wide limit is N × this value. For distributed rate limiting, use `interceptors.SetRateLimiter()` with a custom implementation |
| `RATE_LIMIT_BURST` | int | `1` | Maximum burst size for the token bucket rate limiter. Only takes effect when `RATE_LIMIT_PER_SECOND > 0` |
| `DISABLE_RATE_LIMIT` | bool | `false` | Disable the rate limiting interceptor entirely |
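
The "N × this value" relationship in the `RATE_LIMIT_PER_SECOND` row is simple arithmetic, sketched below. The helper names `clusterLimit` and `perPodLimit` are hypothetical and exist only for this example; they are not part of ColdBrew.

```go
package main

import (
	"errors"
	"fmt"
)

// clusterLimit returns the effective cluster-wide rate when every pod
// independently enforces perPod requests per second.
func clusterLimit(perPod float64, pods int) float64 {
	return perPod * float64(pods)
}

// perPodLimit returns the per-pod value that approximates a cluster-wide
// target across a fixed number of pods.
func perPodLimit(clusterTarget float64, pods int) (float64, error) {
	if pods <= 0 {
		return 0, errors.New("pods must be positive")
	}
	return clusterTarget / float64(pods), nil
}

func main() {
	fmt.Println(clusterLimit(100, 4)) // prints 400

	perPod, err := perPodLimit(1000, 5)
	if err != nil {
		fmt.Println(err)
		return
	}
	fmt.Println(perPod) // prints 200
}
```

Because the pod count changes under autoscaling, a value derived this way is only an approximation of a cluster-wide limit.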

## gRPC TLS

67 changes: 67 additions & 0 deletions howto/interceptors.md
@@ -171,6 +171,73 @@ func init() {

Set `DISABLE_PROTO_VALIDATE=true` to skip validation entirely.

## Rate limiting

ColdBrew includes a built-in per-pod token bucket rate limiter. It is **disabled by default** and must be explicitly enabled.

### Enabling via environment variables

```yaml
env:
  - name: RATE_LIMIT_PER_SECOND
    value: "100"   # 100 requests per second per pod
  - name: RATE_LIMIT_BURST
    value: "50"    # allow bursts up to 50
```

{: .important }
This is a **per-pod in-memory limit**. With N pods, the effective cluster-wide limit is N × `RATE_LIMIT_PER_SECOND`. For cluster-wide rate limiting, use a custom limiter (see below) or your load balancer.

When a request exceeds the rate limit, the interceptor returns a `ResourceExhausted` gRPC status code.

### Custom per-API rate limiter

To apply different rate limits per API method, implement the `ratelimit.Limiter` interface and register it during initialization:

```go
import (
	"context"
	"fmt"

	"github.com/go-coldbrew/interceptors"
	"golang.org/x/time/rate"
	"google.golang.org/grpc"
)

// perMethodLimiter applies a separate token bucket to each gRPC method and
// falls back to a shared bucket for methods without an explicit entry.
type perMethodLimiter struct {
	limiters map[string]*rate.Limiter
	fallback *rate.Limiter
}

// Limit returns a non-nil error when the request should be rejected.
func (l *perMethodLimiter) Limit(ctx context.Context) error {
	// grpc.Method returns the full method name, e.g. "/pkg.Service/Method".
	method, _ := grpc.Method(ctx)
	limiter, ok := l.limiters[method]
	if !ok {
		limiter = l.fallback
	}
	if !limiter.Allow() {
		return fmt.Errorf("rate limit exceeded for %s", method)
	}
	return nil
}

func init() {
	interceptors.SetRateLimiter(&perMethodLimiter{
		limiters: map[string]*rate.Limiter{
			"/myservice.v1.UserService/CreateUser": rate.NewLimiter(10, 5),   // 10 rps, burst 5
			"/myservice.v1.UserService/ListUsers":  rate.NewLimiter(100, 50), // 100 rps, burst 50
		},
		fallback: rate.NewLimiter(50, 25), // 50 rps, burst 25 for all other methods
	})
}
```

For distributed rate limiting (e.g., across pods or per-tenant), implement the same interface with a Redis-backed limiter.

### Disabling

Set `DISABLE_RATE_LIMIT=true` to remove the rate limiting interceptor from the chain entirely.

## Adding custom interceptors to Default interceptors

You can add your own interceptors to the [Default Interceptors] by appending to the list of interceptors.
8 changes: 7 additions & 1 deletion howto/production.md
@@ -473,6 +473,11 @@ env:
  # Never use debug level on public services; it may log request payloads
  - name: LOG_LEVEL
    value: "info"
  # Rate limit incoming requests (per-pod). Adjust to your service's capacity.
  - name: RATE_LIMIT_PER_SECOND
    value: "1000"
  - name: RATE_LIMIT_BURST
    value: "50"
  # GRPC_MAX_SEND_MSG_SIZE limits response size FROM your service (default ~2GB).
  # GRPC_MAX_RECV_MSG_SIZE limits request size TO your service (default 4MB).
  # Consider reducing send size for public APIs; use streaming for large payloads.
@@ -531,7 +536,7 @@ These are your responsibility to handle at the infrastructure level:

- **CORS** — ColdBrew does not handle CORS headers. Use a reverse proxy (Nginx, Envoy, Istio) or add CORS middleware to the HTTP gateway.
- **Authentication/authorization** — Admin endpoints (`/debug/pprof`, `/metrics`, `/swagger`) have no built-in auth. Disable them for public services or restrict access at the load balancer.
- **Rate limiting** — No built-in rate limiting on any endpoint. Use your load balancer, service mesh, or a rate-limiting proxy.
- **Cluster-wide rate limiting** — Built-in rate limiting (`RATE_LIMIT_PER_SECOND`) is per-pod only. For cluster-wide or per-tenant rate limiting, use `interceptors.SetRateLimiter()` with a custom implementation or your load balancer. See [Interceptors How-To](/howto/interceptors#rate-limiting).
- **HTTP header forwarding** — `HTTP_HEADER_PREFIXES` forwards matching HTTP headers to gRPC metadata. Never add `authorization`, `cookie`, or `x-api-key` prefixes unless you are intentionally doing header-based gRPC auth.

## Production checklist
@@ -557,6 +562,7 @@ These are your responsibility to handle at the infrastructure level:
- [ ] `DISABLE_SWAGGER=true` — disable API documentation
- [ ] `DISABLE_GRPC_REFLECTION=true` — disable service discovery
- [ ] `DISABLE_DEBUG_LOG_INTERCEPTOR=true` — disable header-based debug logging
- [ ] Enable rate limiting — `RATE_LIMIT_PER_SECOND` (per-pod, adjust to capacity)
- [ ] Consider reducing `GRPC_MAX_SEND_MSG_SIZE` from its ~2GB default if responses are small
- [ ] Restrict `/metrics` access at the load balancer
- [ ] `LOG_LEVEL=info` or higher (never `debug`)