Failed to connect to the server: tls: first record does not look like a TLS handshake #310

Closed
Wudadada opened this issue Oct 25, 2024 · 12 comments


@Wudadada

Wudadada commented Oct 25, 2024

What happened:
My supervisor could not connect to the server

env: CentOS 8
opentelemetry-collector-contrib version: v0.112.0
opamp version: build from commit #307

My supervisor config:

server:
  endpoint: ws://127.0.0.1:4320/v1/opamp
  tls:
    # Disable verification to test locally.
    # Don't do this in production.
    #insecure_skip_verify: true
    insecure: true
    # For more TLS settings see config/configtls.ClientConfig

capabilities:
  reports_effective_config: true
  reports_own_metrics: true
  reports_health: true
  accepts_remote_config: true
  reports_remote_config: true

agent:
  executable: /usr/bin/otelcol-contrib

storage:
  directory: .

supervisor log:

[root@yptjkcshj-Linux-005 opamp]# ./opamp-supervisor --config supervisor.yaml 
2024/10/25 11:25:56 Supervisor starting, id=0192c1b5-990d-7695-b5a1-463fa45d5cab, type=io.opentelemetry.collector, version=1.0.0.
2024/10/25 11:25:56 Starting OpAMP client...
2024/10/25 11:25:56 OpAMP Client started.
2024/10/25 11:25:56 Starting agent /usr/bin/otelcol-contrib
2024/10/25 11:25:56 Failed to connect to the server: tls: first record does not look like a TLS handshake
2024/10/25 11:25:56 Connection failed (tls: first record does not look like a TLS handshake), will retry.
2024/10/25 11:25:56 Agent process started, PID=4147985
2024/10/25 11:25:56 Agent is not healthy: Get "http://localhost:13133": dial tcp [::1]:13133: connect: connection refused
2024/10/25 11:25:57 Agent is not healthy: Get "http://localhost:13133": dial tcp [::1]:13133: connect: connection refused
2024/10/25 11:25:57 Failed to connect to the server: tls: first record does not look like a TLS handshake
2024/10/25 11:25:57 Connection failed (tls: first record does not look like a TLS handshake), will retry.
2024/10/25 11:25:57 Agent is not healthy: Get "http://localhost:13133": dial tcp [::1]:13133: connect: connection refused
2024/10/25 11:25:58 Failed to connect to the server: tls: first record does not look like a TLS handshake
2024/10/25 11:25:58 Connection failed (tls: first record does not look like a TLS handshake), will retry.
2024/10/25 11:25:58 Agent is not healthy: Get "http://localhost:13133": dial tcp [::1]:13133: connect: connection refused
2024/10/25 11:25:59 Failed to connect to the server: tls: first record does not look like a TLS handshake
2024/10/25 11:25:59 Connection failed (tls: first record does not look like a TLS handshake), will retry.
2024/10/25 11:26:00 Agent is not healthy: Get "http://localhost:13133": dial tcp [::1]:13133: connect: connection refused
2024/10/25 11:26:01 Failed to connect to the server: tls: first record does not look like a TLS handshake
2024/10/25 11:26:01 Connection failed (tls: first record does not look like a TLS handshake), will retry.
@Wudadada
Author

Wudadada commented Oct 25, 2024

My otelcol-contrib config:

receivers:
  otlp:
    protocols:
      grpc:
        endpoint: 0.0.0.0:4317
      http:
        endpoint: 0.0.0.0:4318
  prometheus:
    config:
      scrape_configs:
        - job_name: 'otel-collector'
          scrape_interval: 5s
          static_configs:
            - targets: ['0.0.0.0:19100']

processors:
  batch:
    timeout: 5s
    send_batch_size: 100000
  transform:
    log_statements:
      - context: log
        statements:
          - set(severity_text, "TRACE") where severity_number == 1
          - set(severity_text, "DEBUG") where severity_number == 5
          - set(severity_text, "INFO") where severity_number == 9
          - set(severity_text, "WARN") where severity_number == 13
          - set(severity_text, "ERROR") where severity_number == 17
          - set(severity_text, "FATAL") where severity_number == 21

exporters:
  clickhouse:
    endpoint: tcp://10.105.212.248:9000?dial_timeout=10s
    create_schema: true
    database: otel
    async_insert: true
    ttl: 72h
    compress: lz4
    timeout: 5s
    retry_on_failure:
      enabled: true
      initial_interval: 5s
      max_interval: 30s
      max_elapsed_time: 300s
    username: "default"
    password: "123"
    cluster_name: cluster_3S_1R
    table_engine:
      name: "ReplicatedMergeTree"

    logs_table_name: otel_logs

    traces_table_name: otel_traces

    metrics_tables:
      gauge: 
        name: "otel_metrics_gauge"
      sum: 
        name: "otel_metrics_sum"
      summary: 
        name: "otel_metrics_summary"
      histogram: 
        name: "otel_metrics_histogram"
      exponential_histogram: 
        name: "otel_metrics_exp_histogram"
  debug:
    verbosity: detailed

extensions:
  opamp:
    server:
      ws:
        endpoint: ws://127.0.0.1:4320/v1/opamp
        tls: 
          insecure: true
    instance_uid: 01BX5ZZKBKACTAV9WEVGEMMVRZ

service:
  pipelines:
    traces:
      receivers: [otlp]
      processors: [batch]
      exporters: [clickhouse]
    logs:
      receivers: [otlp]
      processors: [batch, transform]
      exporters: [clickhouse]
    metrics:
      receivers: [otlp, prometheus]
      processors: [batch]
      exporters: [clickhouse]
  extensions: [opamp]
  telemetry:
    logs:
      level: "debug"

@srikanthccv
Member

With this change open-telemetry/opentelemetry-collector-contrib#35363, the tls section would be conditionally loaded. Is your opamp server configured to use TLS?

@Wudadada
Author

Wudadada commented Oct 27, 2024

> With this change open-telemetry/opentelemetry-collector-contrib#35363, the tls section would be conditionally loaded. Is your opamp server configured to use TLS?

Hi, could you let me know where I can view my server configuration? I couldn’t find it. I just built and ran the server binary on the server without making any configuration changes.

@srikanthccv
Member

Do you have your own server implementation, or do you use the example https://github.com/open-telemetry/opamp-go/tree/main/internal/examples/server as your OpAMP server?

@Wudadada
Author

Wudadada commented Oct 28, 2024

> Do you have your own server implementation, or do you use the example https://github.com/open-telemetry/opamp-go/tree/main/internal/examples/server as your OpAMP server?

I built the example server's code locally and ran the server binary on my CentOS server. The only change I made was to the UI port in /opamp-go/internal/examples/server/uisrv/ui.go, from 4321 to 9321.


@srikanthccv
Member

That explains it. If this is a production setup, please fix the tls config to use proper certs and update the endpoint to use the wss protocol. If you are just experimenting, you can set insecure_skip_verify: true and change the endpoint to wss://127.0.0.1:4320/v1/opamp
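
For readers landing on this issue, here is a minimal sketch of how the `server` section of the supervisor config from the original report might look after applying the suggestion above for local experimentation. It assumes the `insecure: true` line from the original config is dropped; the thread does not say whether it was kept.

```yaml
server:
  # wss:// instead of ws://, as suggested in the comment above
  endpoint: wss://127.0.0.1:4320/v1/opamp
  tls:
    # Skips certificate verification. For local experiments only,
    # do not use this in production.
    insecure_skip_verify: true
```

The rest of the supervisor config (capabilities, agent, storage) is left as in the original report.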

@Wudadada
Author

> That explains it. If this is a production setup, please fix the tls config to use proper certs and update the endpoint to use the wss protocol. If you are just experimenting, you can set insecure_skip_verify: true and change the endpoint to wss://127.0.0.1:4320/v1/opamp

It works, thanks a lot!

@priyeshsingh550

Hi @srikanthccv

I have added the same insecure_skip_verify: true to my config file, but I am still getting the error:

2024/11/12 10:05:23 Failed to connect to the server: tls: first record does not look like a TLS handshake
2024/11/12 10:05:23 Connection failed (tls: first record does not look like a TLS handshake), will retry.

My Supervisor file is

server:
  endpoint: wss://...:4321/v1/opamp
  tls:
    insecure_skip_verify: true

agent:
  executable: /usr/bin/otelcol-contrib

@srikanthccv
Member

@priyeshsingh550

Yes, I am using the same example server.

@priyeshsingh550

@srikanthccv is there anything I am missing here? My OpAMP server is running on an EC2 machine with the same example server as above.

@srikanthccv
Member

> My OpAMP server is running on an EC2 machine with the same example server as above.

@priyeshsingh550, you are probably trying to connect to the server using the EC2 instance's public IP/hostname, but the example server's self-signed certificate is only valid for 127.0.0.1 and localhost. This could be solved by regenerating the certs with the EC2 IP/hostname added to the alt_names section. Please remember that it is an example server and is not meant to be used for anything beyond simple testing.
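
As a rough sketch only: once the certs have been regenerated with the EC2 IP/hostname in alt_names, the supervisor's `server` section could verify the server instead of skipping verification. This assumes the supervisor accepts the `config/configtls.ClientConfig` fields mentioned in the original config comment, of which `ca_file` is one. The `<ec2-host>` placeholder and the CA path are hypothetical, and port 4320 is taken from the original report's OpAMP endpoint rather than from the config quoted above.

```yaml
server:
  # <ec2-host> is a placeholder for the EC2 public IP/hostname;
  # it has to match an entry in the regenerated cert's alt_names.
  endpoint: wss://<ec2-host>:4320/v1/opamp
  tls:
    # Hypothetical path to the CA cert that signed the regenerated
    # server cert, so the server can be verified normally.
    ca_file: /path/to/ca.cert.pem
```

With a cert that matches the host, `insecure_skip_verify` should no longer be needed for testing.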
