OpampSupervisor/OpampExtension is not restarting if the remote config changes applied to the processor pipeline #34377

Open
MSA0208 opened this issue Aug 1, 2024 · 12 comments
MSA0208 commented Aug 1, 2024

Component(s)

cmd/opampsupervisor, extension/opamp

Describe the issue you're reporting

Hi,

I have started an OpAMP server and, in parallel, the supervisor, whose configuration holds the extension and collector details needed to run the collector.

My collector includes the transform processor as part of the pipeline.

I am using OpAMP to push remote configuration to my collector and restart it. The collector restarts when I add new configuration to the pipeline, and that works fine.

However, when I update only the transform processor's configuration by changing my config.yaml remotely, the server accepts the change, but I see no restart of the collector for it.

Is a restart supported for any change in the config, or is it limited to changes in the service pipeline?

github-actions bot commented Aug 1, 2024

Pinging code owners:

See Adding Labels via Comments if you do not have permissions to add labels yourself.

crobert-1 added the "bug" label Aug 1, 2024
Frapschen commented

@MSA0208 Can you share your opampsupervisor and otel collector config?

MSA0208 commented Aug 6, 2024

@Frapschen I have not changed much from the existing opampsupervisor setup apart from bootstrap.yaml, which holds my own collector configuration for starting my collector binary.
For now I am also using bootstrap.yaml as my config.yaml.

MSA0208 commented Aug 6, 2024

I am now able to restart the collector when there is a change in the pipeline; otherwise it does not restart.

What I want to know is this: if I make a small change in a processor, for example in an OTTL statement, I want the collector to restart and pick up that change.

bacherfl commented Sep 4, 2024

Hi @MSA0208, is this issue still occurring for you? I just tried this out with the current state on main, and the agent is restarted when editing something in, e.g., the transform processor. For example, I started with the following additional configuration, which I set in the opamp server:

processors:
    transform:
        error_mode: ignore
        flatten_data: false
        log_statements: []
        metric_statements:
            - conditions: []
              context: metric
              statements: []
        trace_statements:
            - conditions: []
              context: resource
              statements:
                - keep_matching_keys(attributes, "^(aaa|bbb|c).*")

exporters:
  debug:
service:
  pipelines:
    metrics:
      exporters:
      - debug
      processors:
      - transform
      receivers:
      - prometheus/own_metrics

And the agent was restarted. After that, I changed one of the ottl statements and the agent was restarted again.

Can you maybe share an example for the config you were using so I can try to reproduce the issue?
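The behaviour described above, where a change to any part of the effective configuration (not only the pipeline layout) leads to a restart, can be pictured as a comparison of configuration contents. The following is a minimal sketch assuming a hash-based comparison; `configTracker` and `needsRestart` are hypothetical names for illustration and are not the Supervisor's actual implementation.

```go
package main

import (
	"bytes"
	"crypto/sha256"
	"fmt"
)

// configTracker remembers a hash of the last applied configuration.
// Hypothetical type, used only to illustrate change detection.
type configTracker struct {
	lastHash []byte
}

// needsRestart reports whether cfg differs from the last applied
// configuration, and records cfg as the applied one when it does.
func (t *configTracker) needsRestart(cfg []byte) bool {
	sum := sha256.Sum256(cfg)
	if bytes.Equal(t.lastHash, sum[:]) {
		return false
	}
	t.lastHash = sum[:]
	return true
}

func main() {
	tracker := &configTracker{}
	first := []byte(`statements: [keep_matching_keys(attributes, "^(aaa|bbb|c).*")]`)
	edited := []byte(`statements: [keep_matching_keys(attributes, "^(aaa|bbb).*")]`)

	fmt.Println(tracker.needsRestart(first))  // first config is always a change
	fmt.Println(tracker.needsRestart(first))  // identical config, nothing to do
	fmt.Println(tracker.needsRestart(edited)) // only an OTTL statement changed
}
```

With a content-based check like this, editing a single OTTL statement is treated the same as adding a component to the pipeline.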

MSA0208 commented Sep 4, 2024

Hi @bacherfl,

That issue is now resolved and the restart works as expected.
The issue I am facing now is with TLS certs: the connection always fails saying the first record does not look like TLS, possibly because of the supervisor.

Connecting the opamp server and the opamp agent client directly over TLS works, but with the supervisor in the middle I get the error mentioned above.

I am still debugging TLS with respect to the supervisor, so any input here would help.

bacherfl commented Sep 5, 2024

Thanks for the update @MSA0208 - Can you share the opampsupervisor config you are using? Then I will try to see if I can reproduce the issue you are having with TLS

MSA0208 commented Sep 5, 2024

server:
  endpoint: ws://127.0.0.1:4320/v1/opamp
  tls:
    insecure_skip_verify: true
    ca_file: "/root/OTEL98/opamp-go-main/internal/certs/certs/ca.cert.pem"
    cert_file: "/root/OTEL98/opamp-go-main/internal/certs/server_certs/server.cert.pem"
    key_file: "/root/OTEL98/opamp-go-main/internal/certs/server_certs/server.key.pem"

capabilities:
  # Keys with boolean true/false values that enable a particular
  # OpAMP capability.
  # The Supervisor will accept remote configuration from the Server.
  # If enabled the Supervisor will also report RemoteConfig status
  # to the Server.
  AcceptsRemoteConfig: true # false if unspecified
  accepts_remote_config: true
  reports_remote_config: true
  accepts_restart_command: true
  reports_effective_config: true
  reports_own_metrics: true
  reports_health: true
  accepts_opamp_connection_settings: true

storage:
agent:
  # executable: /root/PI40/aiopsx-platform-NGx_NorthBound/cmd/otelcol-ngx/ngx-connector
  executable: ../cmd/otelcol-ngx/ngx-connector

  args: --config
  env:
  #config_fil: ./config.yaml
  access_dirs:
    read:
      allow: [/var/log]
      deny: [/var/log/secret_logs]
    write:
      allow: [/var/otelcol]
This is the supervisor.yaml file. I use the same TLS settings when starting the opampserver, and I pass the same ones in my actual config.yaml file as well.

bacherfl commented Oct 3, 2024

Hi @MSA0208, sorry for the late reply. I have now looked into the issue you are having with TLS. Looking at the config, you are using the private key and certificate used by the opamp server, i.e. this one: https://github.com/open-telemetry/opamp-go/tree/main/internal/certs/server_certs. However, this certificate cannot be used for authenticating clients at the server, as it lacks the TLS Web Client Authentication key usage extension.
For this reason, the opamp agent client example creates its own key pair when connecting to the server (see https://github.com/open-telemetry/opamp-go/blob/ad5317009abb490ff5e57e564ac8e82f70f9f477/internal/examples/agent/agent/agent.go#L363). You can use that as a reference to create a key pair for the supervisor and then use the new key pair to connect to the opamp server.

MSA0208 commented Oct 3, 2024

Hi @bacherfl,
Thank you for your input.
I am using OpenSSL to generate these certs and using the same ones in the opamp extension, supervisor.yaml, and the opampserver.
So you're saying we need to create a key pair and use the same one in all three places mentioned above?
I am a bit confused here: should I use the new key pair in the supervisor config or in the opampextension config?

From the supervisor logs I see that it always falls back to HTTP:
settings.TLSConfig :
settings.httpMiddleware :
hs.TLSConfig from serverimpl.go
Falling back to http!!! with listenAddr : localhost:4322
Started startHttpServer with listenAddr: localhost:4322

In the agent (collector) log, the error is:
2024-10-03T12:20:21.673-0700 error [email protected]/opamp_agent.go:76 Failed to connect to the OpAMP server {"kind": "extension", "name": "opamp", "error": "tls: first record does not look like a TLS handshake"}
gitlab.otxlab.net/itom/opr/opsb-content/aiopsx-platform/extension/opampextension.(*opampAgent).Start.func2
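
For context, Go's crypto/tls reports a record header error of this kind when a TLS client starts a handshake and the peer answers with something other than TLS, for example a plain HTTP response. The sketch below reproduces that situation locally with a plain-HTTP test server; `handshakeAgainstPlainHTTP` is a hypothetical helper, and the setup is an assumption for illustration rather than the supervisor's or the extension's code.

```go
package main

import (
	"crypto/tls"
	"errors"
	"fmt"
	"net"
	"net/http"
	"net/http/httptest"
	"time"
)

// handshakeAgainstPlainHTTP starts a local plain-HTTP server and attempts a
// TLS handshake against it. It returns whether the resulting error is a
// tls.RecordHeaderError, together with the error itself.
func handshakeAgainstPlainHTTP() (bool, error) {
	srv := httptest.NewServer(http.NotFoundHandler())
	defer srv.Close()

	// The timeout bounds both the TCP connect and the handshake.
	dialer := &net.Dialer{Timeout: 5 * time.Second}
	conn, err := tls.DialWithDialer(dialer, "tcp", srv.Listener.Addr().String(),
		&tls.Config{InsecureSkipVerify: true}) // test only: no certificate is involved
	if err == nil {
		conn.Close()
		return false, nil
	}
	var rhe tls.RecordHeaderError
	return errors.As(err, &rhe), err
}

func main() {
	isRecordHeaderErr, err := handshakeAgainstPlainHTTP()
	fmt.Println("record header error:", isRecordHeaderErr)
	fmt.Println("error:", err)
}
```

Seen this way, the collector log and the supervisor's "Falling back to http" line are consistent with each other: one side speaks TLS while the other listens without it.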

bacherfl commented Oct 4, 2024

> Hi @bacherfl , Thank you for your inputs. Am using openSSL to generate these certs and using the same in opamp extension, supervisor.yaml and opampserver as well. so your saying we need to create keypair and use the same in all 3 mentioned above? am bit confused here, Should i use the new keypair as part of supervisor config , or opampextension config?

No, you can keep using the key pair you were using for the server, but you need to create a separate key pair (using the same certificate authority used for creating the server certificates, i.e. this one) with the TLS Web Client Authentication key usage extension enabled, and use that for the supervisor.
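
As an illustration of the point about key usage, the sketch below issues two leaf certificates from the same CA, one with the client-authentication extended key usage and one with only the server-authentication usage, and checks each for use as a TLS client certificate. The throwaway in-memory CA and the names `issue` and `issueAndVerify` are assumptions made for this example; a real setup would load the existing CA certificate and key instead of generating them.

```go
package main

import (
	"crypto/ecdsa"
	"crypto/elliptic"
	"crypto/rand"
	"crypto/x509"
	"crypto/x509/pkix"
	"fmt"
	"math/big"
	"time"
)

// issue signs a certificate from tmpl. When parent is nil the certificate is
// self-signed, which is how the throwaway CA below is created.
func issue(tmpl, parent *x509.Certificate, parentKey *ecdsa.PrivateKey) (*x509.Certificate, *ecdsa.PrivateKey, error) {
	key, err := ecdsa.GenerateKey(elliptic.P256(), rand.Reader)
	if err != nil {
		return nil, nil, err
	}
	if parent == nil {
		parent, parentKey = tmpl, key
	}
	tmpl.NotBefore = time.Now().Add(-time.Hour)
	tmpl.NotAfter = time.Now().Add(24 * time.Hour)
	der, err := x509.CreateCertificate(rand.Reader, tmpl, parent, &key.PublicKey, parentKey)
	if err != nil {
		return nil, nil, err
	}
	cert, err := x509.ParseCertificate(der)
	return cert, key, err
}

// issueAndVerify creates a throwaway CA, issues a leaf certificate with either
// the client-auth or the server-auth extended key usage, and checks whether
// the leaf is acceptable for TLS client authentication.
func issueAndVerify(clientAuth bool) error {
	ca, caKey, err := issue(&x509.Certificate{
		SerialNumber:          big.NewInt(1),
		Subject:               pkix.Name{CommonName: "example-ca"},
		IsCA:                  true,
		BasicConstraintsValid: true,
		KeyUsage:              x509.KeyUsageCertSign,
	}, nil, nil)
	if err != nil {
		return err
	}
	usage := x509.ExtKeyUsageServerAuth
	if clientAuth {
		usage = x509.ExtKeyUsageClientAuth
	}
	leaf, _, err := issue(&x509.Certificate{
		SerialNumber: big.NewInt(2),
		Subject:      pkix.Name{CommonName: "example-supervisor"},
		KeyUsage:     x509.KeyUsageDigitalSignature,
		ExtKeyUsage:  []x509.ExtKeyUsage{usage},
	}, ca, caKey)
	if err != nil {
		return err
	}
	roots := x509.NewCertPool()
	roots.AddCert(ca)
	// Ask the verifier to require the client-auth usage, as a server would.
	_, err = leaf.Verify(x509.VerifyOptions{
		Roots:     roots,
		KeyUsages: []x509.ExtKeyUsage{x509.ExtKeyUsageClientAuth},
	})
	return err
}

func main() {
	fmt.Println("client-auth certificate:", issueAndVerify(true))
	fmt.Println("server-auth-only certificate:", issueAndVerify(false))
}
```

The first check is expected to pass and the second to fail, which mirrors why a certificate issued only for server use is rejected when presented by a client.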

> From the Logs of supervisor i see that it always falling back to http settings.TLSConfig : settings.httpMiddleware : hs.TLSConfig from serverimpl.go Falling back to http!!! with listenAddr : localhost:4322 Started startHttpServer with listenAddr: localhost:4322
>
> and from the Agent log i.e, collector , the error is like 2024-10-03T12:20:21.673-0700 error [email protected]/opamp_agent.go:76 Failed to connect to the OpAMP server {"kind": "extension", "name": "opamp", "error": "tls: first record does not look like a TLS handshake"} gitlab.otxlab.net/itom/opr/opsb-content/aiopsx-platform/extension/opampextension.(*opampAgent).Start.func2

I noticed that the opamp server URL in the config you posted earlier starts with ws. Due to the change in #35363, the TLS settings are only applied if the server URL starts with https or wss.
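
The scheme rule can be sketched as a small check on the endpoint URL. This is an illustration of the rule as stated in this comment only; `usesTLS` is a hypothetical helper and does not reproduce the code changed in #35363.

```go
package main

import (
	"fmt"
	"net/url"
)

// usesTLS reports whether an OpAMP endpoint URL has a scheme for which TLS
// settings would be applied (https or wss). Hypothetical helper.
func usesTLS(endpoint string) (bool, error) {
	u, err := url.Parse(endpoint)
	if err != nil {
		return false, err
	}
	switch u.Scheme {
	case "https", "wss":
		return true, nil
	default:
		return false, nil
	}
}

func main() {
	for _, endpoint := range []string{
		"ws://127.0.0.1:4320/v1/opamp",  // scheme from the posted supervisor.yaml
		"wss://127.0.0.1:4320/v1/opamp", // same endpoint with a TLS scheme
	} {
		ok, err := usesTLS(endpoint)
		fmt.Println(endpoint, ok, err)
	}
}
```

Under this rule, the `tls:` block in the posted supervisor.yaml would have no effect while the endpoint keeps the ws scheme.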

github-actions bot commented Dec 4, 2024

This issue has been inactive for 60 days. It will be closed in 60 days if there is no activity. To ping code owners by adding a component label, see Adding Labels via Comments, or if you are unsure of which component this issue relates to, please ping @open-telemetry/collector-contrib-triagers. If this issue is still relevant, please ping the code owners or leave a comment explaining why it is still relevant. Otherwise, please close it.

github-actions bot added the Stale label Dec 4, 2024