-
Notifications
You must be signed in to change notification settings - Fork 1.2k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[jmxfetch] Add agent5-like JMXFetch helper commands #1208
Conversation
dcac445
to
eb77bd0
Compare
discussed this IRL: this would make the helpers work only if for example the Let's generalize this to at least pull the configs from the |
eb77bd0
to
fb75f5a
Compare
fb75f5a
to
faea963
Compare
@arbll coming back to reviewing this, have a few pre-review comments:
Had a quick look at the code and didn't notice any blocker, but I'll give this a thorough review once you've addressed these 2 points, thanks! :) |
faea963
to
9de5251
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm a bit concerned about the general approach of copying check configs to a temp directory and running jmxfetch against it. We have to be very careful about the permissions that get applied to these files, and if the command fails unexpectedly the copied files would remain there. These files can very well contain secrets that shouldn't be written in some unclear location on the filesystem, even if the ownership/perms are good.
If there's no other reasonable way to implement this, I think we should make the behavior very clear by asking users to provide a directory (as a command line option) where all the pulled configs would be written to.
Talking about alternative implementation approaches, have you looked into making this command use the same approach as a running agent (i.e. the agent provides an authenticated https endpoint that JMXFetch hits to pull the check configs)? Is JMXFetch able to run list_
commands and pull configs from that endpoint instead of the filesystem?
cmd/agent/app/jmx.go
Outdated
jmxListCmd.AddCommand(jmxListEverythingCmd, jmxListMatchingCmd, jmxListLimitedCmd, jmxListCollectedCmd, jmxListNotMatchingCmd) | ||
|
||
jmxListCmd.PersistentFlags().StringSliceVar(&checks, "checks", []string{"jmx"}, "JMX checks (ex: jmx,tomcat)") | ||
jmxCollectCmd.PersistentFlags().StringSliceVar(&checks, "checks", []string{"jmx"}, "JMX checks (ex: jmx,tomcat)") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It's a good idea to allow specifying checks, but the default should be that it collects from all configured jmx checks (unless it's very complex to implement)
cmd/agent/app/jmx.go
Outdated
if err != nil { | ||
log.Fatalln(err) | ||
} | ||
defer os.RemoveAll(dir) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
deferred calls won't be executed if os.Exit
is called, which is what log.Fatal
does.
Alternative: make the commands return an error
, use cobra.Command{RunE: [...]}
. Commands return an error whenever they run into "fatal" errors like this, and cobra takes cares of printing the error message.
cmd/agent/app/jmx.go
Outdated
for _, c := range configs { | ||
if strings.EqualFold(c.Name, checkName) { | ||
if c.IsJMX() { | ||
return &c, nil |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Unhandled edge case: unless I'm missing something, configs
may contain multiple configs for a given check name. This would only take into account the first one.
Removing the 6.1 milestone on this, we have to agree on an general implementation design and the 6.1 freeze is coming soon |
74da9cc
to
a2c398d
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nice! Mostly looks great, just a couple of nits and questions!
config.Datadog.Set("cmd_port", 0) | ||
|
||
// start the cmd HTTP server | ||
if err := api.StartServer(); err != nil { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If the agent is running won't this try to listen on the same port as the running agent and thus fail? Maybe I'm missing something because you would've surely encountered this while testing.... 🤔
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good question. I am setting the port to 0 to let the OS attribute one for me. The port is then given to jmxfetch as a command line parameter
pkg/jmxfetch/jmxfetch.go
Outdated
} | ||
|
||
// Run starts the JMXFetch process | ||
func (j *JMXFetch) Run() error { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can we call this Start()
instead? We're wrapping exec.Command
pretty closely, and I'd rather not be misled to think that this method will actually block until completion.
c.runner.LogLevel = config.Datadog.GetString("log_level") | ||
c.runner.JmxExitFile = jmxExitFile | ||
|
||
err := c.runner.Run() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: see my comment below regarding the JMXFetch
interface :)
return nil | ||
} | ||
|
||
func loadConfigs() { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Wondering if it would be more REST idiomatic to load all configs and use a GET query parameter to decide what we return. It feels a little more natural to me and a more predictable behavior. Your approach does minimize code changes on the JMXFetch side though.
pkg/jmxfetch/jmxfetch.go
Outdated
) | ||
|
||
if j.ConfDirectory != "" { | ||
subprocessArgs = append(subprocessArgs, "--check") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is actually kind of irrelevant now right? Looks like the check will just pull all configs available on the agent endpoint...
yamlBuff.Write([]byte(line)) | ||
yamlBuff.Write([]byte("\n")) | ||
} | ||
buffer, err := yaml.Marshal(&rawConfig) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
🍰
@@ -21,6 +21,15 @@ import ( | |||
|
|||
var JMXConfigCache = cache.NewBasicCache() | |||
|
|||
func AddJMXCachedConfig(config check.Config) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This function is exported so could you please add a comment for the godoc describing its (obvious) behavior and the fact it's a thread-safe call?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks good to me! 👍
new approach does not use files, relies on IPC via an https endpoint
What does this PR do?
Adds agent5-like JMXFetch helper commands.
https://docs.datadoghq.com/integrations/java/#troubleshooting
Motivation
Help with jmx troubleshooting.
Additional Notes
I would advice to review the second commit (656bb20) separately since it's a refactoring and could be merged separately from the rest.
Depends on DataDog/jmxfetch#171