Running otel-go-instrumentation in the same container as the app errors with "process not found yet, trying again soon" (ARM host, x86_64 container) #1141

Open
mjneth opened this issue Sep 30, 2024 · 3 comments
Labels
bug Something isn't working

mjneth commented Sep 30, 2024

Describe the bug

When otel-go-instrumentation runs in the same container as the app, it fails with "process not found yet, trying again soon".

The sidecar approach requires a privileged container, which is a concern for us, so I wanted to see whether this works when otel-go-instrumentation runs in the same container as the app being instrumented.

Environment

  • OS: macOS Sonoma 14.5 on an Apple M2 Pro (ARM) host machine; Debian 11 bullseye x86_64 container
  • Go Version: 1.23.1
  • Version: v0.14.0-alpha

To Reproduce

Dockerfile:

FROM golang:1.23.1-bullseye AS base

RUN apt-get update && apt-get install -y curl vim net-tools procps wget clang gcc llvm make libbpf-dev

RUN wget https://github.com/open-telemetry/opentelemetry-go-instrumentation/archive/refs/tags/v0.14.0-alpha.tar.gz && \
  tar zxf v0.14.0-alpha.tar.gz && \
  cd opentelemetry-go-instrumentation-0.14.0-alpha/ && \
  make build && \
  cp otel-go-instrumentation /usr/local/bin

docker-compose.yaml:

version: '3.4'

services:
  app:
    platform: linux/amd64
    build:
      context: .
    #### See if these can be removed if it's running in the same container
    privileged: true
    pid: "host"
    cap_add:
      - SYS_PTRACE
    ####
    environment:
      OTEL_GO_AUTO_TARGET_EXE: /app/oteltest
      OTEL_SERVICE_NAME: otelgoautotest
      OTEL_EXPORTER_OTLP_ENDPOINT: http://localhost:4318
      OTEL_PROPAGATORS: tracecontext,baggage
    expose:
      - 8080
    ports:
      - 8080:8080
    volumes:
      - .:/app
    working_dir: /app

Test app to be instrumented:

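package main

import (
  "fmt"
  "log"
  "net/http"
)

// handler replies to every request with a fixed body.
func handler(w http.ResponseWriter, r *http.Request) {
  fmt.Fprint(w, "received request")
}

func main() {
  http.HandleFunc("/otelgotest", handler)

  fmt.Println("Starting...")
  // ListenAndServe only returns on failure, so surface the error.
  log.Fatal(http.ListenAndServe(":8080", nil))
}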

Run the app and otel-go-instrumentation in the same container:

  1. Run and exec into the container: docker-compose run app
  2. Build the test app: go build .
  3. Run the app in the background: /app/oteltest &
  4. Run otel-go-instrumentation: otel-go-instrumentation --log-level=debug

It fails to find the app process with the error:

{"level":"info","ts":1727709418.1301615,"logger":"go.opentelemetry.io/auto","caller":"cli/main.go:86","msg":"building OpenTelemetry Go instrumentation ...","globalImpl":false}
{"level":"debug","ts":1727709420.1934721,"logger":"Instrumentation.Analyzer","caller":"process/discover.go:71","msg":"process not found yet, trying again soon","exe_path":"/app/oteltest"}

However, the process is visible in ps -ef:
root 2394 2102 0 15:15 pts/0 00:00:00 /usr/bin/qemu-x86_64 /app/oteltest /app/oteltest

I tried setting OTEL_GO_AUTO_TARGET_EXE to the command as it appears in ps -ef, and to some variations of it, in case the problem is the ARM Mac host running an x86_64 container (the process output shows qemu-x86_64). I also tried running /app/oteltest in the container from one terminal and exec'ing into the container from a second terminal to run otel-go-instrumentation, and got the same error.

Expected behavior

The otel-go-instrumentation binary is able to find the test app process, instrument it, and send traces to OTEL_EXPORTER_OTLP_ENDPOINT.

Additional context

I'm not sure whether this is a supported flow, but the sidecar approach's need for a privileged container or additional capabilities is a concern. Is the intended stance to either use the privileged sidecar or instrument the Go app in code, with no way to auto-instrument without a privileged container?

mjneth added the bug label Sep 30, 2024

Contributor

RonFed commented Oct 2, 2024

@mjneth Thank you for opening this issue.
Can you try using an ARM image to see if the problem relates to qemu?

Author

mjneth commented Oct 2, 2024

> @mjneth Thank you for opening this issue. Can you try using an ARM image to see if the problem relates to qemu?

It does work with an ARM image, but our production images are x86_64, so we want to use the same architecture in our docker-compose setups for local development. Is it feasible to make this work under qemu?

mjneth changed the title Oct 2, 2024
  from: Running otel-go-instrumentation in the same container as the app errors with "process not found yet, trying again soon"
  to: Running otel-go-instrumentation in the same container as the app errors with "process not found yet, trying again soon" (ARM host, x86_64 container)

Contributor

RonFed commented Oct 3, 2024

@mjneth From the ps output you attached, the process actually running appears to be qemu, which emulates the Go binary. If that is the case, our current implementation won't support it, since we look for a running Go executable.
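
To illustrate the mismatch described above, here is a minimal Go sketch. It assumes that discovery compares each process's /proc/<pid>/exe link against OTEL_GO_AUTO_TARGET_EXE; that is an assumption about the general approach, not a copy of the project's code. It also assumes that under qemu user-mode emulation the exe link points at the emulator, as the ps output suggests. The names procInfo, splitCmdline, readProcs, matchesByExe, and wrappedByOtherExe are hypothetical helpers introduced only for this example.

```go
package main

import (
	"bytes"
	"fmt"
	"os"
	"path/filepath"
	"strconv"
)

// procInfo is what /proc reports for one process (hypothetical helper type).
type procInfo struct {
	PID  int
	Exe  string   // target of the /proc/<pid>/exe symlink
	Args []string // fields of /proc/<pid>/cmdline
}

// splitCmdline splits the NUL-separated contents of /proc/<pid>/cmdline.
func splitCmdline(raw []byte) []string {
	raw = bytes.TrimRight(raw, "\x00")
	if len(raw) == 0 {
		return nil
	}
	var args []string
	for _, f := range bytes.Split(raw, []byte{0}) {
		args = append(args, string(f))
	}
	return args
}

// readProcs scans procRoot (normally "/proc") for PID directories.
func readProcs(procRoot string) ([]procInfo, error) {
	entries, err := os.ReadDir(procRoot)
	if err != nil {
		return nil, err
	}
	var procs []procInfo
	for _, e := range entries {
		pid, err := strconv.Atoi(e.Name())
		if err != nil {
			continue // not a PID directory
		}
		dir := filepath.Join(procRoot, e.Name())
		exe, err := os.Readlink(filepath.Join(dir, "exe"))
		if err != nil {
			continue // process exited, kernel thread, or permission denied
		}
		raw, _ := os.ReadFile(filepath.Join(dir, "cmdline")) // best effort
		procs = append(procs, procInfo{PID: pid, Exe: exe, Args: splitCmdline(raw)})
	}
	return procs, nil
}

// matchesByExe is the exact path comparison assumed in the lead-in.
func matchesByExe(p procInfo, target string) bool {
	return p.Exe == target
}

// wrappedByOtherExe reports whether target appears only as an argument of a
// different executable, as in the qemu-x86_64 ps output from this issue.
func wrappedByOtherExe(p procInfo, target string) bool {
	if p.Exe == target {
		return false
	}
	for _, a := range p.Args {
		if a == target {
			return true
		}
	}
	return false
}

func main() {
	target := os.Getenv("OTEL_GO_AUTO_TARGET_EXE")
	if target == "" {
		fmt.Println("set OTEL_GO_AUTO_TARGET_EXE to the binary path to look for")
		return
	}
	procs, err := readProcs("/proc")
	if err != nil {
		fmt.Println("cannot read /proc:", err)
		return
	}
	for _, p := range procs {
		switch {
		case matchesByExe(p, target):
			fmt.Printf("pid %d: exe link matches %s\n", p.PID, target)
		case wrappedByOtherExe(p, target):
			fmt.Printf("pid %d: %s is only an argument of %s\n", p.PID, target, p.Exe)
		}
	}
}
```

Run inside the container with OTEL_GO_AUTO_TARGET_EXE=/app/oteltest, a process started through qemu would be expected to show up only in the second case, which would explain why an exact exe-path comparison never finds it. Note that the second check also matches any other wrapper that takes the path as an argument, so it is a diagnostic aid rather than a reliable way to detect emulation.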
