
🚀 Feature: Add docs for local grafana tempo integration #43

Open
henryxparker opened this issue Jul 30, 2024 · 10 comments

Comments

@henryxparker

Which component is this feature for?

Anthropic Instrumentation

🔖 Feature description

An addition to the Grafana Tempo docs that includes instructions on how to connect it to a local Grafana instance.

🎤 Why is this feature needed?

I'm evaluating Traceloop for my team. I don't have much Grafana experience, so trying to get this working with a local version of Grafana has honestly been an absolute nightmare (even though I know it should have been really simple).

✌️ How do you aim to achieve this?

Add a blurb under the "Without Grafana Agent" section:
If you are running Tempo locally, set the environment variable to point to Tempo's OTLP HTTP ingest port.

default: TRACELOOP_BASE_URL=0.0.0.0:4318

🔄️ Additional Information

No response

👀 Have you spent some time to check if this feature request has been raised before?

  • I checked and didn't find similar issue

Are you willing to submit PR?

None

@nirga
Member

nirga commented Jul 31, 2024

Hey @henryxparker! Thanks and sorry you had a bad experience with the grafana integration. We'll work with the Grafana team on making this work better.
In the meantime - can you verify that setting TRACELOOP_BASE_URL=0.0.0.0:4318 worked for you locally? We'll update the docs at https://github.com/traceloop/docs

@henryxparker
Author

Actually it required TRACELOOP_BASE_URL=http://0.0.0.0:4318, because the default local Tempo setup in the Grafana Tempo docs does not use HTTPS. But yes, I can confirm it worked locally.
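
For reference, a minimal sketch of how the SDK can be pointed at the local Tempo OTLP HTTP receiver from Python (the app name is just a placeholder; the variable can equally be exported in the shell before starting the app):

import os

# Local Tempo's OTLP HTTP receiver from the quickstart setup (no TLS).
os.environ["TRACELOOP_BASE_URL"] = "http://0.0.0.0:4318"

from traceloop.sdk import Traceloop

# Placeholder app name; anything descriptive works.
Traceloop.init(app_name="local-tempo-demo")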

@henryxparker
Author

If you would like to verify locally, here is a Docker Compose file and a config file for Tempo.
Put them in the same directory, create a subdirectory called tempo-data, then run docker compose up; you should be able to access Grafana at localhost:3000 and see the traces.

These were created by combining two examples from the Grafana docs: grafana-agent-example and tempo-local-quickstart.

docker-compose.yaml

version: '3'
services:
  # Tempo runs as user 10001, and docker compose creates the volume as root.
  # As such, we need to chown the volume in order for Tempo to start correctly.
  init:
    image: &tempoImage grafana/tempo:latest
    user: root
    entrypoint:
      - "chown"
      - "10001:10001"
      - "/var/tempo"
    volumes:
      - ./tempo-data:/var/tempo

  tempo:
    image: *tempoImage
    command: [ "-config.file=/etc/tempo.yaml" ]
    volumes:
      - ./tempo.yaml:/etc/tempo.yaml
      - ./tempo-data:/var/tempo
    ports:
      - "14268:14268"  # jaeger ingest
      - "3200:3200"   # tempo
      - "9095:9095" # tempo grpc
      - "4317:4317"  # otlp grpc
      - "4318:4318"  # otlp http
      - "9411:9411"   # zipkin
    depends_on:
      - init
  loki:
    image: grafana/loki:2.9.0
    ports:
      - "3100:3100"
    command: -config.file=/etc/loki/local-config.yaml
  prometheus:
    image: prom/prometheus:v2.47.0
    command:
      - --web.enable-remote-write-receiver
      - --config.file=/etc/prometheus/prometheus.yml
    ports:
      - "9090:9090"
  grafana:
    environment:
      - GF_PATHS_PROVISIONING=/etc/grafana/provisioning
      - GF_AUTH_ANONYMOUS_ENABLED=true
      - GF_AUTH_ANONYMOUS_ORG_ROLE=Admin
    entrypoint:
      - sh
      - -euc
      - |
        mkdir -p /etc/grafana/provisioning/datasources
        cat <<EOF > /etc/grafana/provisioning/datasources/ds.yaml
        apiVersion: 1
        datasources:
        - name: Loki
          type: loki
          access: proxy
          orgId: 1
          url: http://loki:3100
          basicAuth: false
          isDefault: false
          version: 1
          editable: false
        - name: Prometheus
          type: prometheus
          orgId: 1
          url: http://prometheus:9090
          basicAuth: false
          isDefault: false
          version: 1
          editable: false
        - name: Tempo
          type: tempo
          access: proxy
          orgId: 1
          url: http://tempo:3200
          basicAuth: false
          isDefault: true
          version: 1
          editable: false
          apiVersion: 1
          uid: tempo
          jsonData:
            httpMethod: GET
            serviceMap:
              datasourceUid: prometheus
        EOF
        /run.sh
    image: grafana/grafana:latest
    ports:
      - "3000:3000"

tempo.yaml

stream_over_http_enabled: true
server:
  http_listen_port: 3200
  log_level: info

query_frontend:
  search:
    duration_slo: 5s
    throughput_bytes_slo: 1.073741824e+09
  trace_by_id:
    duration_slo: 5s

distributor:
  receivers:                           # this configuration will listen on all ports and protocols that tempo is capable of.
    jaeger:                            # the receivers all come from the OpenTelemetry collector.  more configuration information can
      protocols:                       # be found there: https://github.com/open-telemetry/opentelemetry-collector/tree/main/receiver
        thrift_http:                   #
        grpc:                          # for a production deployment you should only enable the receivers you need!
        thrift_binary:
        thrift_compact:
    zipkin:
    otlp:
      protocols:
        http:
        grpc:
    opencensus:

ingester:
  max_block_duration: 5m               # cut the headblock when this much time passes. this is being set for demo purposes and should probably be left alone normally

compactor:
  compaction:
    block_retention: 1h                # overall Tempo trace retention. set for demo purposes

metrics_generator:
  registry:
    external_labels:
      source: tempo
      cluster: docker-compose
  storage:
    path: /var/tempo/generator/wal
    remote_write:
      - url: http://prometheus:9090/api/v1/write
        send_exemplars: true
  traces_storage:
    path: /var/tempo/generator/traces

storage:
  trace:
    backend: local                     # backend configuration to use
    wal:
      path: /var/tempo/wal             # where to store the wal locally
    local:
      path: /var/tempo/blocks

overrides:
  defaults:
    metrics_generator:
      processors: [service-graphs, span-metrics, local-blocks] # enables metrics generator
      generate_native_histograms: both
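
To generate a test trace once the stack is up, a short script like the one below should be enough (a sketch assuming the Python traceloop-sdk; the app and workflow names are placeholders, and any instrumented LLM call would show up the same way).

smoke_test.py

import os

# Send traces to the local Tempo OTLP HTTP receiver from the compose file above.
os.environ["TRACELOOP_BASE_URL"] = "http://0.0.0.0:4318"

from traceloop.sdk import Traceloop
from traceloop.sdk.decorators import workflow

# disable_batch sends spans immediately, which is handy for a quick check.
Traceloop.init(app_name="tempo-smoke-test", disable_batch=True)

@workflow(name="hello_tempo")
def hello():
    # A plain function is enough to produce a span and verify the pipeline.
    return "hello"

hello()

After running it, the trace should be searchable from the Tempo data source in Grafana at localhost:3000.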

@DSgUY

DSgUY commented Oct 23, 2024

@henryxparker did you solve this? Same nightmare here!

@zioproto

@DSgUY I extended this Azure Sample to have Traceloop send traces to a Grafana Tempo installation running in my Kubernetes cluster:

https://github.com/Azure-Samples/azure-openai-terraform-deployment-sample/

Here is how I install the Helm chart locally:
https://github.com/Azure-Samples/azure-openai-terraform-deployment-sample/blob/b5a113691e19f23667f2caf268c5d4916d370de6/infra/installation_script.tftpl#L7-L11

Here is how to point your application to send traces to the local Grafana Tempo distributor:
https://github.com/Azure-Samples/azure-openai-terraform-deployment-sample/blob/b5a113691e19f23667f2caf268c5d4916d370de6/sample-application/chatbot.py#L47
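
Conceptually, pointing the application at the in-cluster Tempo distributor looks something like this (a sketch only; the service DNS name below is hypothetical and depends on the Helm release and namespace, so check the linked line for the real value):

import os

# Hypothetical in-cluster endpoint for the Tempo distributor's OTLP HTTP port;
# the actual hostname depends on how the Helm chart was installed.
os.environ["TRACELOOP_BASE_URL"] = "http://tempo-distributor.tempo.svc.cluster.local:4318"

from traceloop.sdk import Traceloop

# Placeholder app name.
Traceloop.init(app_name="chatbot")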

@nirga
Member

nirga commented Oct 23, 2024

@zioproto @DSgUY @henryxparker if any of you are willing to update our docs at https://github.com/traceloop/docs with the things you learned here that would be tremendously helpful for the community! ❤️
I just can't seem to get enough time to test this myself so I can't be certain how to fix our current guide.

@nirga nirga transferred this issue from traceloop/openllmetry Oct 23, 2024
@DSgUY

DSgUY commented Oct 23, 2024

@zioproto @DSgUY @henryxparker if any of you are willing to update our docs at https://github.com/traceloop/docs with the things you learned here that would be tremendously helpful for the community! ❤️ I just can't seem to get enough time to test this myself so I can't be certain how to fix our current guide.

I'm still trying but sure...

@DSgUY

DSgUY commented Oct 23, 2024

I managed to get the traces. Can I configure metrics and logs too? Maybe using Prometheus, Promtail, and Loki?

@nirga
Member

nirga commented Oct 23, 2024

I think Grafana Agent can translate the OTel metrics format to Prometheus - https://grafana.com/docs/agent/latest/flow/tasks/opentelemetry-to-lgtm-stack/

@nirga
Member

nirga commented Oct 23, 2024

Btw @DSgUY, Traceloop as a platform can also integrate with Grafana if that helps.
