
[Fleet] Enable log introspections #14390

Merged
michalpristas merged 20 commits into elastic:fleet from michalpristas:feat-fleet-logintro
Dec 5, 2019

Conversation

@michalpristas
Contributor

@michalpristas michalpristas commented Nov 5, 2019

Enables log introspection by introducing a monitoring component into the application.
Flow

  • A FLEET_MONITORING program is injected into the AST with
    • the monitoring configuration from the agent spec
    • the list of programs included in the configuration
    • FLEET_MONITORING appended as the last step of the configuration change
  • The operator detects FLEET_MONITORING and starts/stops the respective Beats to monitor the monitorable Beats.
  • Beats are made monitorable by
    • including a logging configuration
    • including http.enabled with a host targeting a Unix socket
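The last two bullets can be sketched as a small helper that injects the monitoring settings into a beat's configuration. This is a minimal illustration, not the PR's actual code: the helper name, the `logging.files` value, and the socket path are assumptions, while the `http.enabled`/`http.host` key names follow Beats conventions.

```go
package main

import "fmt"

// makeMonitorable sketches the two settings described above: a logging
// section and an HTTP endpoint bound to a Unix socket. The helper itself
// is hypothetical; only the http.* key names follow Beats conventions.
func makeMonitorable(cfg map[string]interface{}, socketPath string) map[string]interface{} {
	// include a logging configuration (illustrative value)
	cfg["logging.files"] = map[string]interface{}{"keepfiles": 7}
	// enable the HTTP endpoint on a Unix socket so a sidecar can scrape it
	cfg["http.enabled"] = true
	cfg["http.host"] = fmt.Sprintf("unix://%s", socketPath)
	return cfg
}

func main() {
	cfg := makeMonitorable(map[string]interface{}{}, "/var/run/elastic-agent/default/filebeat/filebeat.sock")
	fmt.Println(cfg["http.host"])
}
```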

Fixes #14437
Fixes #14452

@ph ph self-assigned this Nov 15, 2019
@ph ph removed the Team:Beats label Nov 19, 2019
@ph ph added the in progress Pull request is currently in progress. label Nov 22, 2019
@ph
Contributor

ph commented Nov 22, 2019

#14558 was merged into master and I've rebased the fleet branch with the commit.

@michalpristas michalpristas marked this pull request as ready for review November 26, 2019 09:18
@ph
Contributor

ph commented Nov 26, 2019

The issues on CI don't seem related to the changes in this PR.

ok  	github.com/elastic/beats/x-pack/agent/pkg/sorted	1.067s	coverage: 100.0% of statements

=== RUN   TestTokenBucket
--- FAIL: TestTokenBucket (0.13s)
    token_bucket_test.go:39:
        	Error Trace:	token_bucket_test.go:39
        	Error:      	Not equal:
        	            	expected: 5
        	            	actual  : 4
        	Test:       	TestTokenBucket

@ph ph self-requested a review November 26, 2019 17:09
@michalpristas
Contributor Author

No, it's not related; this is a timing issue. This test fails with TEST_ENVIRONMENT=0 quite often. I did some refactoring of the test, but timing on the flagged env is quite difficult.

Contributor

@ph ph left a comment


The strategy looks good. Could we add a few more tests concerning the injection and the addition to the operator?

return result
}

func (o *Operator) handleStopSidecar(s configrequest.Step) error {
Contributor


NIT: you can also use a named return so you don't have to initialize the variable.

Suggested change
func (o *Operator) handleStopSidecar(s configrequest.Step) error {
func (o *Operator) handleStopSidecar(s configrequest.Step) (result error) {
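For illustration, here is a minimal standalone example of the named-return pattern the reviewer suggests (note that Go requires parentheses around a named result). The function and its behavior are hypothetical, not taken from the PR:

```go
package main

import (
	"errors"
	"fmt"
)

// stopAll illustrates the named-return pattern: `err` is declared in the
// signature, so no explicit `var err error` is needed, and a bare `return`
// returns its current value.
func stopAll(steps []string) (err error) {
	for _, s := range steps {
		if s == "bad-step" {
			// assign to the named result and keep going
			err = errors.Join(err, fmt.Errorf("failed to stop %q", s))
		}
	}
	return // returns the named err
}

func main() {
	fmt.Println(stopAll([]string{"filebeat", "bad-step"}))
}
```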


outputIface, found := config[outputKey]
if !found {
o.logger.Errorf("operator.getMonitoringSteps: monitoring configuration not found for sidecar: %v", config)
Contributor


I think this can leak credentials in the log?

Contributor Author


very good catch


output, found := outputMap["elasticsearch"]
if !found {
o.logger.Error("operator.getMonitoringSteps: monitoring is missing an elasticsearch configuration from output: %v", outputMap)
Contributor


Credential leaks possible?
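One common way to address the leak the reviewer flags is to log a redacted copy of the config. This is only a sketch of that idea, not the fix the PR actually landed; the helper name and the key list are illustrative.

```go
package main

import "fmt"

// redact returns a copy of a config map that is safe for logging: values of
// credential-like keys are masked. Key list and helper name are assumptions.
func redact(cfg map[string]interface{}) map[string]interface{} {
	sensitive := map[string]bool{"password": true, "api_key": true, "token": true}
	out := make(map[string]interface{}, len(cfg))
	for k, v := range cfg {
		if sensitive[k] {
			out[k] = "<redacted>"
		} else {
			out[k] = v
		}
	}
	return out
}

func main() {
	cfg := map[string]interface{}{"hosts": "localhost:9200", "password": "s3cret"}
	// log the redacted copy instead of the raw config
	fmt.Println(redact(cfg))
}
```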

)

func emitter(log *logger.Logger, router *router) emitterFunc {
type decoratorFunc = func(*transpiler.AST, []program.Program) ([]program.Program, error)
Contributor


I like the idea of the decorators.
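The shape of `decoratorFunc` lends itself to chaining. As a rough sketch of how an emitter could run decorators over the program list before handing it to the operator (the `AST` and `Program` stand-ins and the `apply` helper are hypothetical; only the function type's shape matches the diff):

```go
package main

import "fmt"

// AST and Program stand in for the transpiler types; only the shape of
// decoratorFunc matches the code in the diff.
type AST struct{}
type Program struct{ Name string }

type decoratorFunc = func(*AST, []Program) ([]Program, error)

// apply runs a chain of decorators over the program list, stopping at the
// first error.
func apply(ast *AST, programs []Program, decorators ...decoratorFunc) ([]Program, error) {
	var err error
	for _, d := range decorators {
		if programs, err = d(ast, programs); err != nil {
			return nil, err
		}
	}
	return programs, nil
}

func main() {
	// a decorator that appends a monitoring program, as the PR flow describes
	monitoring := func(_ *AST, ps []Program) ([]Program, error) {
		return append(ps, Program{Name: "FLEET_MONITORING"}), nil
	}
	out, err := apply(&AST{}, []Program{{Name: "filebeat"}}, monitoring)
	fmt.Println(out, err)
}
```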

}

steps = append(steps, metricbeatStep)
}
Contributor


I wonder if we should extract the "making of the step" into two distinct methods so we can test them in isolation.


// NewMonitor creates a beats monitor.
func NewMonitor(process, pipelineID string, downloadConfig *artifact.Config, monitorLogs, monitorMetrics bool) *Monitor {
var monitoringEndpoint, loggingPath string
Contributor


LoggingPath is per pipeline ID + process, correct?

Contributor Author


The path is common to multiple processes; in the same directory there are multiple files, e.g. metricbeat, metricbeat.1, ... filebeat, ..., named based on the process.

}

return append(args, appendix...)
}
Contributor


++ for using arguments here for Beats; we might have to refactor that to support Endpoint. I hope they adopt the Beats way here, but I don't know the plan.


const (
// args: pipeline name, application name
logFileFormat = "/var/log/%s/%s"
Contributor


In the above case, is the first %s the process? I think we should encapsulate all the logs under an elastic-agent directory.

logFileFormatWin = "%s\\logs\\%s\\%s"

// args: pipeline name, application name
mbEndpointFileFormat = "unix:///var/run/fleet/%s/%s/%s.sock"
Contributor


s/fleet/elastic-agent
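To make the format strings concrete, here is how the pipeline and process names could fill the verbs, with the `fleet` path segment renamed to `elastic-agent` as the reviewer suggests. The `paths` helper and the assignment of the three socket-path verbs (pipeline, process, process) are assumptions for illustration:

```go
package main

import "fmt"

const (
	// formats taken from the diff; the socket path uses "elastic-agent"
	// per the review comment instead of the original "fleet"
	logFileFormat        = "/var/log/%s/%s"
	mbEndpointFileFormat = "unix:///var/run/elastic-agent/%s/%s/%s.sock"
)

// paths shows how pipeline and process names fill the format verbs
// (hypothetical helper; verb assignment is an assumption).
func paths(pipelineID, process string) (logPath, endpoint string) {
	logPath = fmt.Sprintf(logFileFormat, pipelineID, process)
	endpoint = fmt.Sprintf(mbEndpointFileFormat, pipelineID, process, process)
	return
}

func main() {
	l, e := paths("default", "filebeat")
	fmt.Println(l)
	fmt.Println(e)
}
```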


// so far we support only beats monitoring
return beats.NewMonitor(process, pipelineID, downloadConfig, monitorLogs, monitorMetrics)
}
Contributor


Oh I see what you did here, this will work for endpoint too :)

Contributor Author


With Agent we will either create a new monitor and modify this factory, or they will support the Beats interface and we do nothing. But this allows us to do both.
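The factory idea being discussed can be sketched as returning an interface, so a non-Beats implementation could be swapped in later. All names here are assumptions, a minimal sketch of the shape rather than the PR's actual types:

```go
package main

import "fmt"

// Monitor sketches the interface the factory comment implies: today only a
// Beats monitor exists, but another implementation (e.g. for Endpoint)
// could be returned later. All names here are assumptions.
type Monitor interface {
	EndpointPath() string
}

type beatsMonitor struct{ process, pipelineID string }

func (b beatsMonitor) EndpointPath() string {
	return fmt.Sprintf("unix:///var/run/elastic-agent/%s/%s/%s.sock", b.pipelineID, b.process, b.process)
}

// NewMonitor mirrors the factory shape in the diff: pick an implementation
// based on what is being monitored.
func NewMonitor(process, pipelineID string) Monitor {
	// so far we support only beats monitoring
	return beatsMonitor{process: process, pipelineID: pipelineID}
}

func main() {
	fmt.Println(NewMonitor("filebeat", "default").EndpointPath())
}
```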

@ph
Contributor

ph commented Nov 26, 2019

No, it's not related; this is a timing issue. This test fails with TEST_ENVIRONMENT=0 quite often. I did some refactoring of the test, but timing on the flagged env is quite difficult.

Let's create an issue to prioritize it.

@ph
Contributor

ph commented Dec 2, 2019

OK, code LGTM, I need to actually test/try this.

One thing we didn't consider is how we will monitor the Agent itself. Should we run the processes in special streams?

@michalpristas
Contributor Author

If the Agent is connected to Fleet, we have a check-in API which contains a subset of logs/events.

Log-wise we don't monitor so far. The simplest way would be to create a sidecar for the Agent, or have the Agent send logs directly, which I guess is related to your latest issue.

@ph
Contributor

ph commented Dec 3, 2019

Yes, we don't; let's create an issue to track it, but having a sidecar to monitor the Agent seems like what we should do. In the end it's just a matter of deciding in which streams the sidecars are created and injecting the appropriate configuration.

@ph
Contributor

ph commented Dec 3, 2019

@michalpristas code LGTM, can we rebase? I've created a conflict because of multiple outputs, sorry about that. :(

@michalpristas
Contributor Author

This solution creates monitoring for each output, so streams using output A are monitored by one set of sidecars, which then sends logs to output A, and streams using output B are monitored by a different set sending results to B.

Another option would be to create

  • a DEFAULT output which will always be there and will also be used for monitoring
  • a monitoring.output option which selects the output used for monitoring
    ^^ these would also mean that we monitor all streams using a single set of monitoring sidecars

cc @ph

@ph
Contributor

ph commented Dec 4, 2019

@michalpristas let's merge this as is and do a follow-up PR, but we will need to actually be able to specify the outputs using monitoring.elasticsearch. From what I've seen in the UI we can only specify a single monitoring output per policy. @mattapperson or @nchaulet, can you confirm that?

@nchaulet
Member

nchaulet commented Dec 4, 2019

@ph yes, you can configure just one monitoring output per policy.

@michalpristas
Contributor Author

I made the change to support monitoring.elasticsearch.
It looks like it does not make sense to have it the way I had it yesterday; that was a more complicated solution which does not comply with our contract.

@michalpristas michalpristas merged commit 1bc5895 into elastic:fleet Dec 5, 2019
leweafan pushed a commit to leweafan/beats that referenced this pull request Apr 28, 2023