-
Notifications
You must be signed in to change notification settings - Fork 9.8k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
*: support additional '/metrics' endpoints #8242
Conversation
Note: TODO: |
0dc9c80
to
aa9cfe8
Compare
etcdmain/help.go
Outdated
Set level of detail for exported metrics, specify 'extensive' to include histogram metrics. | ||
--expose-metrics 'false' | ||
Additionally expose '/metrics' endpoints via '/metrics-urls'. | ||
--metrics-urls 'http://localhost:9379' |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
--listen-metrics-urls
and default to "" so expose-metrics
isn't needed? blank list => no listen
embed/etcd.go
Outdated
// Start a client server goroutine for each listen address | ||
var h http.Handler | ||
if e.Config().EnableV2 { | ||
h = v2http.NewClientHandler(e.Server, e.Server.Cfg.ReqTimeout()) | ||
h = v2http.NewClientHandler(e.Server, metricsMux, e.Server.Cfg.ReqTimeout()) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
NewClientHandler
and HandleBasic
shouldn't need to change. I think all that's needed is:
var metricsMux *http.ServeMux
if len(e.cfg.MetricsUrls) > 0 {
metricsMux = http.NewServeMux()
metricsMux.Handle("metrics", prometheus.Handler()) // maybe etcdhttp.MetricsPath or get the path from the user-provided URL
}
etcdmain/grpc_proxy.go
Outdated
@@ -79,7 +82,8 @@ func newGRPCProxyStartCommand() *cobra.Command { | |||
} | |||
|
|||
cmd.Flags().StringVar(&grpcProxyListenAddr, "listen-addr", "127.0.0.1:23790", "listen address") | |||
cmd.Flags().StringVar(&grpcProxyDNSCluster, "discovery-srv", "", "DNS domain used to bootstrap initial cluster") | |||
cmd.Flags().BoolVar(&grpcProxyExposeMetrics, "expose-metrics", false, "Additionally expose '/metrics' endpoints via '/metrics-url'.") | |||
cmd.Flags().StringVar(&grpcProxyMetricsListenAddr, "metrics-url", "http://127.0.0.1:9378", "'/metrics' listen address (secured via client TLS)") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
metrics-addr
to be consistent with listen-addr
and default to nothing?
Documentation/op-guide/monitoring.md
Outdated
@@ -5,7 +5,8 @@ Each etcd server exports metrics under the `/metrics` path on its client port. | |||
The metrics can be fetched with `curl`: | |||
|
|||
```sh | |||
$ curl -L http://localhost:2379/metrics | grep -v debugging # ignore unstable debugging metrics | |||
# OR http://localhost:9379/metrics when etcd is run with '--expose-metrics' flag | |||
$ curl -L http://localhost:9379/metrics | grep -v debugging # ignore unstable debugging metrics |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should stick to the default 2379 here since if the user is changing the ports, they probably don't need this guide any more. Instead of hiding an explanation in a comment, the doc can directly say that optionally metrics can be exported to other listen addresses with --listen-metrics-urls
where it's talking about the client port
Procfile
Outdated
etcd1: bin/etcd --name infra1 --listen-client-urls http://127.0.0.1:2379 --advertise-client-urls http://127.0.0.1:2379 --listen-peer-urls http://127.0.0.1:12380 --initial-advertise-peer-urls http://127.0.0.1:12380 --initial-cluster-token etcd-cluster-1 --initial-cluster 'infra1=http://127.0.0.1:12380,infra2=http://127.0.0.1:22380,infra3=http://127.0.0.1:32380' --initial-cluster-state new --enable-pprof | ||
etcd2: bin/etcd --name infra2 --listen-client-urls http://127.0.0.1:22379 --advertise-client-urls http://127.0.0.1:22379 --listen-peer-urls http://127.0.0.1:22380 --initial-advertise-peer-urls http://127.0.0.1:22380 --initial-cluster-token etcd-cluster-1 --initial-cluster 'infra1=http://127.0.0.1:12380,infra2=http://127.0.0.1:22380,infra3=http://127.0.0.1:32380' --initial-cluster-state new --enable-pprof | ||
etcd3: bin/etcd --name infra3 --listen-client-urls http://127.0.0.1:32379 --advertise-client-urls http://127.0.0.1:32379 --listen-peer-urls http://127.0.0.1:32380 --initial-advertise-peer-urls http://127.0.0.1:32380 --initial-cluster-token etcd-cluster-1 --initial-cluster 'infra1=http://127.0.0.1:12380,infra2=http://127.0.0.1:22380,infra3=http://127.0.0.1:32380' --initial-cluster-state new --enable-pprof | ||
etcd1: bin/etcd --name infra1 --listen-client-urls http://127.0.0.1:2379 --advertise-client-urls http://127.0.0.1:2379 --listen-peer-urls http://127.0.0.1:12380 --initial-advertise-peer-urls http://127.0.0.1:12380 --initial-cluster-token etcd-cluster-1 --initial-cluster 'infra1=http://127.0.0.1:12380,infra2=http://127.0.0.1:22380,infra3=http://127.0.0.1:32380' --initial-cluster-state new --enable-pprof --expose-metrics --metrics-urls http://127.0.0.1:9379 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
don't change? would like to keep the procfile from being too cluttered
cd82e05
to
893a76a
Compare
embed/etcd.go
Outdated
@@ -400,6 +409,17 @@ func (e *Etcd) serve() (err error) { | |||
e.errHandler(s.serve(e.Server, &e.cfg.ClientTLSInfo, h, e.errHandler)) | |||
}(sctx) | |||
} | |||
for _, murl := range e.cfg.ListenMetricsUrls { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
how will Etcd.Close()
shut this down?
etcdmain/config.go
Outdated
@@ -131,6 +133,7 @@ func newConfig() *config { | |||
fs.StringVar(&cfg.WalDir, "wal-dir", cfg.WalDir, "Path to the dedicated wal directory.") | |||
fs.Var(flags.NewURLsValue(embed.DefaultListenPeerURLs), "listen-peer-urls", "List of URLs to listen on for peer traffic.") | |||
fs.Var(flags.NewURLsValue(embed.DefaultListenClientURLs), "listen-client-urls", "List of URLs to listen on for client traffic.") | |||
fs.StringVar(&cfg.ListenMetricsUrlsJSON, "listen-metrics-urls", "", "List of URLs to listen on for metrics (secured via client TLS if any).") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
is this smart about http/https schemes instead of defaulting to TLS? there may be cases where it's OK to expose the metrics as plaintext but access to the kv store is locked down with tls authentication
etcdmain/grpc_proxy.go
Outdated
@@ -79,7 +81,7 @@ func newGRPCProxyStartCommand() *cobra.Command { | |||
} | |||
|
|||
cmd.Flags().StringVar(&grpcProxyListenAddr, "listen-addr", "127.0.0.1:23790", "listen address") | |||
cmd.Flags().StringVar(&grpcProxyDNSCluster, "discovery-srv", "", "DNS domain used to bootstrap initial cluster") | |||
cmd.Flags().StringVar(&grpcProxyMetricsListenAddr, "metrics-addr", "", "Additional '/metrics' listen address (secured via client TLS)") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
listen for /metrics requests on an additional interface
?
4a4d627
to
21588e0
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
a few last nits
etcdmain/config.go
Outdated
@@ -131,6 +133,7 @@ func newConfig() *config { | |||
fs.StringVar(&cfg.WalDir, "wal-dir", cfg.WalDir, "Path to the dedicated wal directory.") | |||
fs.Var(flags.NewURLsValue(embed.DefaultListenPeerURLs), "listen-peer-urls", "List of URLs to listen on for peer traffic.") | |||
fs.Var(flags.NewURLsValue(embed.DefaultListenClientURLs), "listen-client-urls", "List of URLs to listen on for client traffic.") | |||
fs.StringVar(&cfg.ListenMetricsUrlsJSON, "listen-metrics-urls", "", "List of URLs to listen on for metrics (non-secured ignoring client TLS).") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
just List of URLs to listen on for metrics
? The parenthetical will probably cause confusion
etcdmain/grpc_proxy.go
Outdated
@@ -79,7 +81,7 @@ func newGRPCProxyStartCommand() *cobra.Command { | |||
} | |||
|
|||
cmd.Flags().StringVar(&grpcProxyListenAddr, "listen-addr", "127.0.0.1:23790", "listen address") | |||
cmd.Flags().StringVar(&grpcProxyDNSCluster, "discovery-srv", "", "DNS domain used to bootstrap initial cluster") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
keep this?
etcdmain/help.go
Outdated
Set level of detail for exported metrics, specify 'extensive' to include histogram metrics. | ||
Set level of detail for exported metrics, specify 'extensive' to include histogram metrics. | ||
--listen-metrics-urls '' | ||
List of URLs to listen on for metrics (non-secured ignoring client TLS). |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
remove parenthetical bit here too?
embed/etcd.go
Outdated
@@ -205,6 +208,11 @@ func (e *Etcd) Close() { | |||
e.Clients[i].Close() | |||
} | |||
} | |||
for i := range e.metricsListeners { | |||
if e.metricsListeners[i] != nil { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
no need for the nil check? metricsListeners is only appended with non-nil listeners
Signed-off-by: Gyu-Ho Lee <[email protected]>
Signed-off-by: Gyu-Ho Lee <[email protected]>
Signed-off-by: Gyu-Ho Lee <[email protected]>
The changes are not available in the latest stable release v3.2.9. When will a newer version be cut with |
This will be in v3.3, which is planned for late this year, or early next year. |
I think it may be a bit unfortunate to couple the
This means that both client port and metrics port must be reachable from the the same nodes. If Prometheus is given a TLS Client Certs to access the It is also generally a burden and a security risk to carry TLS Client Certs to the monitoring system, as it often run in a separate environment. For instance, in the case of Prometheus on Kubernetes:
I am of the opinion that if etcd listens over TLS, and the metrics URLs are Ideally tho, and as mentioned in kubernetes/kubernetes#53405, we could just have another TLS configuration for TLS Client Cert Auth with different certs, to the |
@Quentin-M we've already talked to @xiang90 about this and he mentioned it is possible to have a separate set of client certs for Prometheus that only has permission to access the /metrics endpoint of etcd, using etcd's authorization. I imagine that's what you are looking for, as then you are serving the metrics via https, and have them authorized, but not full access to etcd. |
etcd v3.3 introduced a new flag to allow serving `/metrics` and `/health` under a different port than e.g. `/v2/keys`. This allows us to protect etcd's data via firewall rules but still let monitoring tools to access the monitoring information. See feature request in etcd repo: etcd-io/etcd#8060. The implementation landed in v3.3: etcd-io/etcd#8242 This PR instructs etcd to serve metrics and health under the additonal port `2381` *unconditionally* **when the used etcd binary is** `>=v3.3.x`. However, if not explicitely set in the `senza.yaml` this port won't be mapped to the outside and therefore isn't accessible. It doesn't expose more information than anything under `2379` already does.
@gyuho Do you think it would be OK to cherrypick this PR back to v3.2 and v3.1? I took a quick look at the fix and it seems fine to backport to me. Would you please let me know if otherwise? If OK, I can do the cherry pick work. Thanks! |
Fix #8060.
https://github.com/prometheus/prometheus/wiki/Default-port-allocations