Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

feat: Create base config for observability PoC #80

Closed
wants to merge 3 commits into from
Closed
Show file tree
Hide file tree
Changes from 2 commits
Commits
File filter

Filter by extension

Filter by extension


Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
4 changes: 2 additions & 2 deletions crates/topos-tce/Cargo.toml
Original file line number Diff line number Diff line change
Expand Up @@ -25,8 +25,8 @@ topos-tce-broadcast = { path = "../topos-tce-broadcast" }
bincode = "1.3.3"
clap = { version = "3.0.10", features = ["derive", "env"] }
lazy_static = "1.4"
opentelemetry = { version = "0.18", default-features = false, features = ["trace", "rt-tokio"] }
opentelemetry-jaeger = { version = "0.17", features = ["rt-tokio"] }
opentelemetry = { version = "0.18", features = ["rt-tokio", "metrics"] }
opentelemetry-otlp = { version = " 0.11", features = ["tonic", "metrics"] }
pretty_env_logger = "0.4.0"
tracing-opentelemetry = "0.18.0"
tracing-attributes = "0.1.23"
Expand Down
40 changes: 33 additions & 7 deletions crates/topos-tce/src/main.rs
Original file line number Diff line number Diff line change
Expand Up @@ -5,9 +5,15 @@ mod storage;
use crate::app_context::AppContext;
use crate::cli::AppArgs;
use clap::Parser;
use opentelemetry::global;
use opentelemetry::runtime;
use opentelemetry::sdk::export::metrics::aggregation::cumulative_temporality_selector;
use opentelemetry::sdk::metrics::selectors;
use opentelemetry::sdk::trace::{self, RandomIdGenerator, Sampler};
use opentelemetry::sdk::Resource;
use opentelemetry::{global, KeyValue};
use opentelemetry::KeyValue;
use opentelemetry_otlp::{ExportConfig, Protocol, WithExportConfig};
use std::time::Duration;
use tce_store::{Store, StoreConfig};
use tokio::spawn;
use topos_p2p::{utils::local_key_pair, Multiaddr};
Expand All @@ -25,12 +31,13 @@ async fn main() {
let peer_id = key.public().to_peer_id();
tracing::Span::current().record("peer_id", &peer_id.to_string());

let tracer = opentelemetry_jaeger::new_agent_pipeline()
.with_endpoint(args.jaeger_agent.clone())
.with_service_name(args.jaeger_service_name.clone())
.with_max_packet_size(1500)
.with_auto_split_batch(true)
.with_instrumentation_library_tags(false)
let tracer = opentelemetry_otlp::new_pipeline()
.tracing()
.with_exporter(
opentelemetry_otlp::new_exporter()
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We're using the same exporter builder, maybe we can mutualise the code here

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@Freyskeyd, thanks for reviewing this PR!

Do you mind taking over the instrumentation bit? It would be great if you pushed some changes to this branch fixing the already existing minimal setup!

I'm probably not the most appropriate person to touch the code base. I made some changes on the main.rs and Cargo.toml files, but am not entirely sure if that's the optimal solution.

.tonic()
.with_endpoint("http://otel-collector:4317"),
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We would need to keep using a cli arg or an env variable (similarly to jaeger_agent).

)
.with_trace_config(
trace::config()
.with_sampler(Sampler::AlwaysOn)
Expand All @@ -47,6 +54,25 @@ async fn main() {
.install_batch(opentelemetry::runtime::Tokio)
.unwrap();

let export_config = ExportConfig {
endpoint: "http://otel-collector:4317".to_string(),
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We would need to keep using a cli arg or an env variable (similarly to jaeger_agent).

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes definitely. Specially for the cases where users would like to opt-out and do not send any telemetry data.

I don't have confidence in Rust, so kept the code minimal.

timeout: Duration::from_secs(3),
protocol: Protocol::Grpc,
};

let _meter = opentelemetry_otlp::new_pipeline()
.metrics(
selectors::simple::inexpensive(),
cumulative_temporality_selector(),
runtime::Tokio,
)
.with_exporter(
opentelemetry_otlp::new_exporter()
.tonic()
.with_export_config(export_config),
)
.build();

#[cfg(feature = "log-json")]
let formatting_layer = tracing_subscriber::fmt::layer().json();

Expand Down
3 changes: 3 additions & 0 deletions tools/.env
Original file line number Diff line number Diff line change
@@ -0,0 +1,3 @@
# TCE default args
TOOLCHAIN_VERSION=stable
GITHUB_TOKEN=${GITHUB_TOKEN}
49 changes: 37 additions & 12 deletions tools/docker-compose.yml
Original file line number Diff line number Diff line change
Expand Up @@ -4,21 +4,46 @@ services:
container_name: jaeger
restart: always
ports:
- "6831/udp"
- "6832/udp"
- "16685"
- "16686:16686"
- "14250:14250"
- "16685:16685"

otel-collector:
image: otel/opentelemetry-collector-contrib-dev:latest
restart: always
command: ["--config=/etc/otel-collector-config.yaml"]
volumes:
- ./otel-collector-config.yaml:/etc/otel-collector-config.yaml
ports:
- "1888:1888" # pprof extension
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Most ports don't need host bindings, no? Actually this is a thought I've had for a while: With docker-compose, having links created automatically between all containers on a common network, we don't event need to specify any port (neither host nor container), no?

Copy link
Contributor Author

@foo0x29a foo0x29a Dec 2, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@sebastiendan ,

Most ports don't need host bindings, no?

Yes you're correct. Will leave only the necessary for the PoC, thanks for spotting this!

With docker-compose, having links created automatically between all containers on a common network, we don't event need to specify any port (neither host nor container), no?

That's correct. The difference, however, is if we define the ports field but omit the ${HOST} value, docker compose will assign a random host port:

$ docker compose ps
NAME                     COMMAND                  SERVICE             STATUS              PORTS
boot                     "/bin/sh -c ./topos-…"   boot                running             0.0.0.0:57131->9090/tcp
jaeger                   "/go/bin/all-in-one-…"   jaeger              running             5775/udp, 5778/tcp, 0.0.0.0:14250->14250/tcp, 6831-6832/udp, 14268/tcp, 0.0.0.0:16685-16686->16685-16686/tcp
prometheus               "/bin/prometheus --c…"   prometheus          running             0.0.0.0:9090->9090/tcp
spam                     "./cert-spammer"         cert-spammer        running             
tools-otel-collector-1   "/otelcontribcol --c…"   otel-collector      running             0.0.0.0:1888->1888/tcp, 0.0.0.0:4317->4317/tcp, 0.0.0.0:8888-8889->8888-8889/tcp, 0.0.0.0:13133->13133/tcp, 0.0.0.0:55679->55679/tcp, 55680/tcp
tools-peer-1             "/bin/sh -c ./topos-…"   peer                running             0.0.0.0:57140->1340/tcp, 0.0.0.0:57141->9090/tcp
tools-peer-2             "/bin/sh -c ./topos-…"   peer                running             0.0.0.0:57132->1340/tcp, 0.0.0.0:57133->9090/tcp
tools-peer-3             "/bin/sh -c ./topos-…"   peer                running             0.0.0.0:57136->1340/tcp, 0.0.0.0:57137->9090/tcp
tools-peer-4             "/bin/sh -c ./topos-…"   peer                running             0.0.0.0:57138->1340/tcp, 0.0.0.0:57139->9090/tcp
tools-peer-5             "/bin/sh -c ./topos-…"   peer                running             0.0.0.0:57134->1340/tcp, 0.0.0.0:57135->9090/tcp

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The difference, however, is if we define the ports field but omit the ${HOST} value, docker compose will assign a random host port

Yep, but what I meant is that I think we tend to use random port assignation "by default" because we think that ports should be specified. Let me describe an example:

version: '3.5'

services:
  one:
    image: ...
    ports:
      - 3000
  two:
    image: ...
    env:
      - "endpoint=http://one:3000"

Here often we would want to specify the 3000 port on one because we want two to access it, but I believe this isn't needed. (this was my comment above)

- "8888:8888" # Prometheus metrics exposed by the collector
- "8889:8889" # Prometheus exporter metrics
- "13133:13133" # health_check extension
- "4317:4317" # OTLP gRPC receiver
- "55679:55679" # zpages extension
depends_on:
- jaeger

prometheus:
container_name: prometheus
image: prom/prometheus:latest
restart: always
volumes:
- ./prometheus.yaml:/etc/prometheus/prometheus.yml
ports:
- "9090:9090"

boot:
container_name: boot
image: ghcr.io/toposware/tce:pr-4
image: tce:develop
init: true
build:
context: ../
args:
- TOOLCHAIN_VERSION=stable
- GITHUB_TOKEN
- TOOLCHAIN_VERSION=${TOOLCHAIN_VERSION}
- GITHUB_TOKEN=${GITHUB_TOKEN}
depends_on:
- jaeger
- otel-collector
ports:
- "9090"
environment:
Expand All @@ -36,15 +61,15 @@ services:
- TCE_TRBP_DELIVERY_THRESHOLD=3

peer:
image: ghcr.io/toposware/tce:pr-4
image: tce:develop
init: true
build:
context: ../
args:
- TOOLCHAIN_VERSION=stable
- GITHUB_TOKEN
- TOOLCHAIN_VERSION=${TOOLCHAIN_VERSION}
- GITHUB_TOKEN=${GITHUB_TOKEN}
depends_on:
- jaeger
- otel-collector
- boot
ports:
- "9090"
Expand Down Expand Up @@ -72,7 +97,7 @@ services:
- RUST_LOG=debug
- TARGET_NODES_PATH=/nodes.json
depends_on:
- jaeger
- otel-collector
- peer
volumes:
- ./peer_nodes.json:/nodes.json
50 changes: 50 additions & 0 deletions tools/otel-collector-config.yaml
Original file line number Diff line number Diff line change
@@ -0,0 +1,50 @@
receivers:
otlp:
protocols:
grpc:

processors:
batch:

extensions:
health_check:
pprof:
endpoint: :1888
zpages:
endpoint: :55679

exporters:
prometheus:
endpoint: 0.0.0.0:8889
const_labels:
label1: value1
resource_to_telemetry_conversion:
enabled: true

jaeger:
endpoint: jaeger:14250
tls:
insecure: true

processors:
batch:

extensions:
health_check:
pprof:
endpoint: :1888
zpages:
endpoint: :55679

service:
extensions: [pprof, zpages, health_check]
pipelines:
traces:
receivers: [otlp]
processors: [batch]
exporters: [jaeger]
metrics:
receivers: [otlp]
processors: [batch]
exporters: [prometheus]

6 changes: 6 additions & 0 deletions tools/prometheus.yaml
Original file line number Diff line number Diff line change
@@ -0,0 +1,6 @@
scrape_configs:
- job_name: 'otel-collector'
scrape_interval: 10s
static_configs:
- targets: ['otel-collector:8889']
- targets: ['otel-collector:8888']