Skip to content

reverseproxy: Track dynamic upstreams, enable passive healthchecking#7539

Merged
francislavoie merged 2 commits intomasterfrom
dynamic-upstreams-tracking
Mar 4, 2026
Merged

reverseproxy: Track dynamic upstreams, enable passive healthchecking#7539
francislavoie merged 2 commits intomasterfrom
dynamic-upstreams-tracking

Conversation

@francislavoie
Copy link
Member

@francislavoie francislavoie commented Mar 2, 2026

I was reviewing #7517 and thinking about how it would handle dynamic upstreams. I was thinking there might be a leak with dynamic upstreams not getting cleaned up since it doesn't happen in the proxy handler's Cleanup(), but that's not the case because we actually drop the upstreams immediately after every request. This also means we don't have support for passive health checks for dynamic upstreams.

This adds separate tracking for dynamic upstreams, which enables passive health checks to work (but not active health checks still, for obvious reasons) and the admin API endpoint can show upstream health as well (with inactive ones getting removed after an hour). This "one hour" lifetime is somewhat arbitrary, but I think it should serve most users well enough, ensures long-term leaks shouldn't happen while making most real-time usecases (checking health and such) still work fine consistently.

Also added tests for dynamic upstream tracking, the admin endpoint, and passive health checks.

AI's Summary of changes

Problem

Dynamic upstreams (from DynamicUpstreams.GetUpstreams) were provisioned into the same UsagePool as static upstreams, but with a defer hosts.Delete(...) at the end of each proxy loop iteration. This caused two bugs:

  1. Health state loss — when there was no concurrent traffic, the ref count hit 0 after each request, deleting the *Host entry. The next request created a fresh one, wiping passive fail counts. A backend could fail repeatedly and never be marked down.
  2. Churn — every request unnecessarily allocated, stored, and deleted pool entries.

Solution: separate tracking for dynamic hosts

hosts.go

  • Added fillDynamicHost() on *Upstream — stores the host in a plain map[string]dynamicHostEntry (protected by sync.RWMutex) instead of the reference-counted UsagePool. Each call refreshes the lastSeen timestamp.
  • Added a background cleanup goroutine (started exactly once via sync.Once) that sweeps the map every 5 minutes and evicts entries idle for more than an hour.

reverseproxy.go

  • provisionUpstream gains a dynamic bool parameter, routing to fillDynamicHost() vs fillHost() accordingly.
  • Static upstreams at Provision time: provisionUpstream(u, false) — unchanged behaviour.
  • Dynamic upstreams per-request: provisionUpstream(dUp, true) — no more defer hosts.Delete(...).

admin.go

  • The /reverse_proxy/upstreams API endpoint now ranges over both hosts (static) and dynamicHosts (dynamic) so the admin view is complete.

Why passive health checks now work

countFailure increments Host.countFail(1) and schedules a goroutine to decrement it after FailDuration. Since *Host now persists across requests in dynamicHosts, those counts survive between sequential requests. Healthy() reads Host.Fails() < healthCheckPolicy.MaxFails — both of which are correctly set on each fresh *Upstream struct from provisionUpstream — so a dynamic backend that has accumulated enough failures will correctly be skipped by the load balancer.

Assistance Disclosure

Used Github Copilot + Claude Sonnet 4.6

@francislavoie francislavoie added the feature ⚙️ New feature or request label Mar 2, 2026
@francislavoie francislavoie force-pushed the dynamic-upstreams-tracking branch from c5649f7 to 71f5131 Compare March 2, 2026 09:14
@francislavoie francislavoie added this to the v2.11.2 milestone Mar 3, 2026
@francislavoie francislavoie requested a review from mholt March 3, 2026 22:22
Copy link
Member

@mholt mholt left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks Francis, this is cool. Looks a little complicated but is probably a pretty decent way to do things. Let's try it?

@francislavoie francislavoie merged commit db29860 into master Mar 4, 2026
30 checks passed
@francislavoie francislavoie deleted the dynamic-upstreams-tracking branch March 4, 2026 20:05
@github-actions github-actions bot mentioned this pull request Mar 6, 2026
4 tasks
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

feature ⚙️ New feature or request

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants