+ metrics_path: /prom
+ static_configs:
+ - targets:
+ - "127.0.0.1:9876"
+```
+
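+## Distributed tracing
+Distributed tracing helps identify performance bottlenecks by visualizing end-to-end performance.
+
+Ozone uses the [Jaeger](https://jaegertracing.io) tracing library to collect traces and can send the trace data to any compatible backend (Zipkin, …).
+
+Tracing is turned off by default. To turn it on, set the `hdds.tracing.enabled` property in `ozone-site.xml`: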
+
+```XML
+<property>
+   <name>hdds.tracing.enabled</name>
+   <value>true</value>
+</property>
+```
+
+The Jaeger client can be configured with environment variables, as described in [this document](https://github.com/jaegertracing/jaeger-client-java/blob/master/jaeger-core/README.md).
+
+For example:
+
+```shell
+JAEGER_SAMPLER_PARAM=0.01
+JAEGER_SAMPLER_TYPE=probabilistic
+JAEGER_AGENT_HOST=jaeger
+```
+
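+This configuration records 1% of the requests, which limits the performance overhead. For more information about Jaeger sampling, see the [documentation](https://www.jaegertracing.io/docs/1.18/sampling/#client-sampling-configuration).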
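+
+For contrast with the probabilistic setting, the Jaeger client also offers a `const` sampler. The fragment below is a minimal sketch that assumes a short debugging session where the cost of tracing every request is acceptable; the agent host name `jaeger` is simply reused from the example above.
+
+```shell
+# Assumption: debugging setup only, traces every request instead of 1%
+JAEGER_SAMPLER_TYPE=const
+JAEGER_SAMPLER_PARAM=1
+JAEGER_AGENT_HOST=jaeger
+```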
+
+## Ozone Insight
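+Ozone Insight is a tool for checking the current state of an Ozone cluster. It can show the logging, metrics, and configuration of a particular component.
+
+Use the `ozone insight list` command to list the available components: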
+
+```shell
+> ozone insight list
+
+Available insight points:
+
+ scm.node-manager SCM Datanode management related information.
+ scm.replica-manager SCM closed container replication manager
+ scm.event-queue Information about the internal async event delivery
+ scm.protocol.block-location SCM Block location protocol endpoint
+ scm.protocol.container-location SCM Container location protocol endpoint
+ scm.protocol.security SCM Block location protocol endpoint
+ om.key-manager OM Key Manager
+ om.protocol.client Ozone Manager RPC endpoint
+ datanode.pipeline More information about one ratis datanode ring.
+```
+
+## Configuration
+
+`ozone insight config` shows the configuration related to a specific component (supported only for selected components).
+
+```shell
+> ozone insight config scm.replica-manager
+
+Configuration for `scm.replica-manager` (SCM closed container replication manager)
+
+>>> hdds.scm.replication.thread.interval
+ default: 300s
+ current: 300s
+
+There is a replication monitor thread running inside SCM which takes care of replicating the containers in the cluster. This property is used to configure the interval in which that thread runs.
+
+
+>>> hdds.scm.replication.event.timeout
+ default: 30m
+ current: 30m
+
+Timeout for the container replication/deletion commands sent to datanodes. After this timeout the command will be retried.
+
+```
+
+## Metrics
+
+`ozone insight metrics` shows the metrics related to a specific component (supported only for selected components).
+
+```shell
+> ozone insight metrics scm.protocol.block-location
+Metrics for `scm.protocol.block-location` (SCM Block location protocol endpoint)
+
+RPC connections
+
+ Open connections: 0
+ Dropped connections: 0
+ Received bytes: 1267
+ Sent bytes: 2420
+
+
+RPC queue
+
+ RPC average queue time: 0.0
+ RPC call queue length: 0
+
+
+RPC performance
+
+ RPC processing time average: 0.0
+ Number of slow calls: 0
+
+
+Message type counters
+
+ Number of AllocateScmBlock: ???
+ Number of DeleteScmKeyBlocks: ???
+ Number of GetScmInfo: ???
+ Number of SortDatanodes: ???
+```
+
+## Logs
+
+`ozone insight logs` connects to the required service and shows the DEBUG/TRACE logs related to one specific component. For example, to display RPC messages:
+
+```shell
+> ozone insight logs om.protocol.client
+
+[OM] 2020-07-28 12:31:49,988 [DEBUG|org.apache.hadoop.ozone.protocolPB.OzoneManagerProtocolServerSideTranslatorPB|OzoneProtocolMessageDispatcher] OzoneProtocol ServiceList request is received
+[OM] 2020-07-28 12:31:50,095 [DEBUG|org.apache.hadoop.ozone.protocolPB.OzoneManagerProtocolServerSideTranslatorPB|OzoneProtocolMessageDispatcher] OzoneProtocol CreateVolume request is received
+```
+
+With the `-v` flag, the content of the protobuf messages is also displayed (TRACE-level logs):
+
+```shell
+> ozone insight logs -v om.protocol.client
+
+[OM] 2020-07-28 12:33:28,463 [TRACE|org.apache.hadoop.ozone.protocolPB.OzoneManagerProtocolServerSideTranslatorPB|OzoneProtocolMessageDispatcher] [service=OzoneProtocol] [type=CreateVolume] request is received:
+cmdType: CreateVolume
+traceID: ""
+clientId: "client-A31DF5C6ECF2"
+createVolumeRequest {
+ volumeInfo {
+ adminName: "hadoop"
+ ownerName: "hadoop"
+ volume: "vol1"
+ quotaInBytes: 1152921504606846976
+ volumeAcls {
+ type: USER
+ name: "hadoop"
+ rights: "200"
+ aclScope: ACCESS
+ }
+ volumeAcls {
+ type: GROUP
+ name: "users"
+ rights: "200"
+ aclScope: ACCESS
+ }
+ creationTime: 1595939608460
+ objectID: 0
+ updateID: 0
+ modificationTime: 0
+ }
+}
+
+[OM] 2020-07-28 12:33:28,474 [TRACE|org.apache.hadoop.ozone.protocolPB.OzoneManagerProtocolServerSideTranslatorPB|OzoneProtocolMessageDispatcher] [service=OzoneProtocol] [type=CreateVolume] request is processed. Response:
+cmdType: CreateVolume
+traceID: ""
+success: false
+message: "Volume already exists"
+status: VOLUME_ALREADY_EXISTS
+```
+
+
+
+Under the hood, `ozone insight` retrieves the required information through HTTP endpoints (`/conf`, `/prom`, and `/logLevel`). It is not yet supported in secure environments.
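+
+The same endpoints can be queried by hand, which is useful when checking what the tool would see. The commands below are a rough sketch, not the tool's actual implementation: they assume an unsecured cluster and reuse the `127.0.0.1:9876` address from the Prometheus target earlier in this page. The logger name passed to `/logLevel` is only an illustration.
+
+```shell
+# Assumption: web server reachable on 127.0.0.1:9876, security disabled
+curl -s http://127.0.0.1:9876/conf    # effective configuration
+curl -s http://127.0.0.1:9876/prom    # metrics in Prometheus format
+# Hadoop-style log level servlet; the logger name here is hypothetical
+curl -s "http://127.0.0.1:9876/logLevel?log=org.apache.hadoop.hdds.scm"
+```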
+
+
\ No newline at end of file