[jmxfetch] Add agent5-like JMXFetch helper commands #1208

Merged: arbll merged 30 commits into master from arbll/jmx-helper-commands on Apr 19, 2018

@arbll arbll commented Feb 7, 2018

What does this PR do?

Adds agent5-like JMXFetch helper commands.

https://docs.datadoghq.com/integrations/java/#troubleshooting

Motivation

Helps with JMX troubleshooting.

Additional Notes

I would advise reviewing the second commit (656bb20) separately, since it's a refactoring and could be merged independently of the rest.

Depends on DataDog/jmxfetch#171

@arbll arbll requested a review from a team as a code owner February 7, 2018 19:04
@arbll arbll force-pushed the arbll/jmx-helper-commands branch 2 times, most recently from dcac445 to eb77bd0 Compare February 7, 2018 19:32
@olivielpeau
Member

Discussed this IRL: this would make the helpers work only if, for example, the foo check's config is defined in a file located at conf.d/foo.d/conf.yaml, which is not always the case (and, actually, in most cases it's not the only file where the config would be located).

Let's generalize this to at least pull the configs from the FileProvider config provider (and ideally from all valid config providers, depending on the amount of work needed on JMXFetch's side).

@arbll arbll force-pushed the arbll/jmx-helper-commands branch from eb77bd0 to fb75f5a Compare February 13, 2018 10:50
@arbll arbll requested a review from a team as a code owner February 13, 2018 10:50
@arbll arbll added this to the 6.0.0 milestone Feb 16, 2018
@olivielpeau olivielpeau modified the milestones: 6.0.0, 6.1.0 Feb 20, 2018
@arbll arbll force-pushed the arbll/jmx-helper-commands branch from fb75f5a to faea963 Compare March 1, 2018 11:20
@olivielpeau
Member

@arbll coming back to reviewing this, have a few pre-review comments:

  • could you update the docs to reflect the new commands, and could you also document their caveats if there are any? (for instance on the source of the yaml configs)
  • can you fix the merge conflict?

Had a quick look at the code and didn't notice any blocker, but I'll give this a thorough review once you've addressed these 2 points, thanks! :)

@arbll arbll force-pushed the arbll/jmx-helper-commands branch from faea963 to 9de5251 Compare March 6, 2018 16:56
@olivielpeau olivielpeau left a comment

I'm a bit concerned about the general approach of copying check configs to a temp directory and running jmxfetch against it. We have to be very careful about the permissions that get applied to these files, and if the command fails unexpectedly the copied files would remain there. These files can very well contain secrets that shouldn't be written in some unclear location on the filesystem, even if the ownership/perms are good.

If there's no other reasonable way to implement this, I think we should make the behavior very clear by asking users to provide a directory (as a command line option) where all the pulled configs would be written to.

Talking about alternative implementation approaches, have you looked into making this command use the same approach as a running agent (i.e. the agent provides an authenticated https endpoint that JMXFetch hits to pull the check configs)? Is JMXFetch able to run list_ commands and pull configs from that endpoint instead of the filesystem?

jmxListCmd.AddCommand(jmxListEverythingCmd, jmxListMatchingCmd, jmxListLimitedCmd, jmxListCollectedCmd, jmxListNotMatchingCmd)

jmxListCmd.PersistentFlags().StringSliceVar(&checks, "checks", []string{"jmx"}, "JMX checks (ex: jmx,tomcat)")
jmxCollectCmd.PersistentFlags().StringSliceVar(&checks, "checks", []string{"jmx"}, "JMX checks (ex: jmx,tomcat)")

It's a good idea to allow specifying checks, but the default should be that it collects from all configured jmx checks (unless it's very complex to implement)

if err != nil {
	log.Fatalln(err)
}
defer os.RemoveAll(dir)

Deferred calls won't be executed if os.Exit is called, which is what log.Fatal does.

Alternative: make the commands return an error and use cobra.Command{RunE: [...]}. Commands return an error whenever they run into "fatal" errors like this, and cobra takes care of printing the error message.

for _, c := range configs {
	if strings.EqualFold(c.Name, checkName) {
		if c.IsJMX() {
			return &c, nil

Unhandled edge case: unless I'm missing something, configs may contain multiple configs for a given check name. This would only take into account the first one.

@olivielpeau
Member

Removing the 6.1 milestone on this: we have to agree on a general implementation design, and the 6.1 freeze is coming soon.

@arbll arbll force-pushed the arbll/jmx-helper-commands branch from 74da9cc to a2c398d Compare April 16, 2018 12:11
@arbll arbll added [deprecated] team/agent-core Deprecated. Use metrics-logs / shared-components labels instead.. kind/feature and removed do-not-merge/WIP labels Apr 16, 2018
@truthbk truthbk self-requested a review April 16, 2018 14:56
@truthbk truthbk left a comment

Nice! Mostly looks great, just a couple of nits and questions!

config.Datadog.Set("cmd_port", 0)

// start the cmd HTTP server
if err := api.StartServer(); err != nil {

If the agent is running, won't this try to listen on the same port as the running agent and thus fail? Maybe I'm missing something, because you would've surely encountered this while testing... 🤔

Member Author

Good question. I am setting the port to 0 to let the OS assign one for me. The port is then passed to JMXFetch as a command-line parameter.
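
The port-0 behavior described in this reply can be sketched with the standard library alone. This is only an illustration of the OS assigning a free port, not the agent's `api.StartServer` implementation; `listenOnFreePort` is a hypothetical helper.

```go
package main

import (
	"fmt"
	"net"
)

// listenOnFreePort binds to port 0 on the loopback interface so the OS picks
// an unused port, then reports which port was chosen.
func listenOnFreePort() (net.Listener, int, error) {
	ln, err := net.Listen("tcp", "127.0.0.1:0")
	if err != nil {
		return nil, 0, err
	}
	port := ln.Addr().(*net.TCPAddr).Port
	return ln, port, nil
}

func main() {
	ln, port, err := listenOnFreePort()
	if err != nil {
		fmt.Println("listen failed:", err)
		return
	}
	defer ln.Close()
	// The chosen port could then be passed to a child process as an argument.
	fmt.Println("listening on port", port)
}
```

Because the port is read back from the listener, a second instance started the same way gets a different port and does not collide with the first.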

}

// Run starts the JMXFetch process
func (j *JMXFetch) Run() error {

Can we call this Start() instead? We're wrapping exec.Command pretty closely, and I'd rather not be misled to think that this method will actually block until completion.
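
The naming concern mirrors the split in `os/exec`, where `Cmd.Start` returns as soon as the process is launched and `Cmd.Run` blocks until it exits. A rough sketch of a wrapper that keeps that distinction visible follows. `Runner` and `NewRunner` are hypothetical names, not the PR's `JMXFetch` type, and the `java -version` command in `main` assumes a `java` binary on PATH.

```go
package main

import (
	"fmt"
	"os/exec"
)

// Runner is a hypothetical thin wrapper around exec.Cmd.
type Runner struct {
	cmd *exec.Cmd
}

// NewRunner prepares a command without launching it.
func NewRunner(name string, args ...string) *Runner {
	return &Runner{cmd: exec.Command(name, args...)}
}

// Start launches the process and returns immediately, like exec.Cmd.Start.
func (r *Runner) Start() error { return r.cmd.Start() }

// Wait blocks until the started process exits, like exec.Cmd.Wait.
func (r *Runner) Wait() error { return r.cmd.Wait() }

func main() {
	r := NewRunner("java", "-version") // assumes a java binary on PATH
	if err := r.Start(); err != nil {
		fmt.Println("start failed:", err)
		return
	}
	fmt.Println("process started; caller is not blocked")
	if err := r.Wait(); err != nil {
		fmt.Println("process exited with error:", err)
	}
}
```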

c.runner.LogLevel = config.Datadog.GetString("log_level")
c.runner.JmxExitFile = jmxExitFile

err := c.runner.Run()

nit: see my comment below regarding the JMXFetch interface :)

return nil
}

func loadConfigs() {
@truthbk truthbk Apr 16, 2018

Wondering if it would be more idiomatic REST to load all configs and use a GET query parameter to decide what we return. It feels a little more natural to me, and the behavior would be more predictable. Your approach does minimize code changes on the JMXFetch side, though.
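
As a rough sketch of the query-parameter idea, the handler below loads every config and filters on an optional parameter. Everything here is assumed for illustration: the `/configs` path, the `checks` parameter name, and the `allConfigs` data are hypothetical and do not reflect the agent's actual endpoint.

```go
package main

import (
	"encoding/json"
	"fmt"
	"net/http"
	"net/http/httptest"
	"strings"
)

// allConfigs is hypothetical data standing in for every loaded JMX config.
var allConfigs = []string{"jmx", "tomcat", "kafka"}

// configsHandler returns all configs, or only those named in the optional
// "checks" query parameter (comma-separated). The parameter name is assumed.
func configsHandler(w http.ResponseWriter, r *http.Request) {
	result := allConfigs
	if q := r.URL.Query().Get("checks"); q != "" {
		wanted := map[string]bool{}
		for _, name := range strings.Split(q, ",") {
			wanted[strings.ToLower(strings.TrimSpace(name))] = true
		}
		result = nil
		for _, name := range allConfigs {
			if wanted[name] {
				result = append(result, name)
			}
		}
	}
	w.Header().Set("Content-Type", "application/json")
	json.NewEncoder(w).Encode(result)
}

// query is a small helper that exercises the handler without a real server.
func query(target string) string {
	rec := httptest.NewRecorder()
	configsHandler(rec, httptest.NewRequest(http.MethodGet, target, nil))
	return strings.TrimSpace(rec.Body.String())
}

func main() {
	fmt.Println(query("/configs"))
	fmt.Println(query("/configs?checks=tomcat,kafka"))
}
```

With this shape the server always loads the same set of configs, and the client narrows the response per request.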

)

if j.ConfDirectory != "" {
subprocessArgs = append(subprocessArgs, "--check")

This is actually kind of irrelevant now, right? Looks like the check will just pull all configs available on the agent endpoint...

yamlBuff.Write([]byte(line))
yamlBuff.Write([]byte("\n"))
}
buffer, err := yaml.Marshal(&rawConfig)

🍰

@@ -21,6 +21,15 @@ import (

var JMXConfigCache = cache.NewBasicCache()

func AddJMXCachedConfig(config check.Config) {

This function is exported so could you please add a comment for the godoc describing its (obvious) behavior and the fact it's a thread-safe call?

truthbk previously approved these changes Apr 19, 2018
@truthbk truthbk left a comment

Looks good to me! 👍

@truthbk truthbk dismissed olivielpeau’s stale review April 19, 2018 13:11

new approach does not use files, relies on IPC via an https endpoint

@arbll arbll merged commit 7c492b6 into master Apr 19, 2018
@arbll arbll deleted the arbll/jmx-helper-commands branch April 19, 2018 13:38