[hadoop][application] Add application data stream for hadoop #2952
Conversation
Pinging @elastic/integrations (Team:Integrations)
This PR is a split of #2614, as discussed in the comment #2614 (comment).
packages/hadoop/manifest.yml
This will be updated to 8.2.0 after testing this integration on 8.2.0.
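For context, the stack constraint lives under `conditions` in `packages/hadoop/manifest.yml`; a minimal sketch of the intended bump (the exact constraint is an assumption until the 8.2.0 testing mentioned above is done) would be:

```yaml
# Hypothetical excerpt of packages/hadoop/manifest.yml; the final value
# depends on the outcome of testing against 8.2.0.
conditions:
  kibana.version: "^8.2.0"
```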
```yaml
services:
  hadoop:
    build: .
    hostname: hadoop_metrics
```
I guess that all our discussions from #2614 about the custom Dockerfile are still valid here, right?
With respect to our conversation on #2614, we have entirely revamped the system tests based on the JMX-based implementation. Let us know what you think of the new system test implementation.
We have used a custom Dockerfile here (which uses apache/hadoop:3 as the base image).
Yeah, I see, but it isn't the form I meant :)
I suggested following a similar pattern as here and here. As you can see, the namenode, datanode, and managers are defined as separate services on the same network instead of putting everything on the same node. Maybe we don't need to customize it at all.
I'm also fine with doing it as a follow-up, as it isn't critical.
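For illustration, a rough docker-compose sketch of that split-service layout (the file path, service names, commands, and exposed port below are assumptions, not taken from this PR) could look like:

```yaml
# Hypothetical _dev/deploy/docker/docker-compose.yml layout: each Hadoop
# daemon runs as its own service on the shared Compose network instead of
# everything living in a single container.
version: "2.3"
services:
  namenode:
    image: apache/hadoop:3          # assumed base image, as in the custom Dockerfile
    hostname: namenode
    command: ["hdfs", "namenode"]
  datanode:
    image: apache/hadoop:3
    command: ["hdfs", "datanode"]
  resourcemanager:
    image: apache/hadoop:3
    hostname: resourcemanager
    command: ["yarn", "resourcemanager"]
    ports:
      - "8088"                      # Resource Manager REST API (assumed default port)
  nodemanager:
    image: apache/hadoop:3
    command: ["yarn", "nodemanager"]
```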
Okay, we will change the system tests accordingly in a follow-up PR and get this merged, since it is not a blocker.
This integration uses the Resource Manager API and JMX API to collect the above metrics.
`## application_metrics`
I guess you can rename it to "application".
Makes sense.
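If the rename goes ahead, the stream would presumably live under `packages/hadoop/data_stream/application/` instead of `application_metrics`; a sketch of its manifest (assumed, not the final file):

```yaml
# Hypothetical data_stream/application/manifest.yml after the rename;
# the resulting dataset name would then be hadoop.application.
title: Hadoop application metrics
type: metrics
```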
packages/hadoop/docs/README.md
| Field | Description | Type |
|---|---|---|
| event.kind | This is one of four ECS Categorization Fields, and indicates the highest level in the ECS category hierarchy. `event.kind` gives high-level information about what type of information the event contains, without being specific to the contents of the event. For example, values of this field distinguish alert events from metric events. The value of this field can be used to inform how these kinds of events should be handled. They may warrant different retention, different access control; it may also help to understand whether the data is coming in at a regular interval or not. | keyword |
| event.module | Name of the module this data is coming from. If your monitoring agent supports the concept of modules or plugins to process events of a given source (e.g. Apache logs), `event.module` should contain the name of this module. | keyword |
| event.type | This is one of four ECS Categorization Fields, and indicates the third level in the ECS category hierarchy. `event.type` represents a categorization "sub-bucket" that, when used along with the `event.category` field values, enables filtering events down to a level appropriate for single visualization. This field is an array. This will allow proper categorization of some events that fall in multiple event types. | keyword |
| hadoop.application_metrics.allocated.mb | Total memory allocated to the application's running containers (MB) | long |
:+1: for filling in descriptions
packages/hadoop/manifest.yml
```yaml
title: "Hadoop"
version: 0.1.0
license: basic
description: "This Elastic integration collects metrics from hadoop."
```
nit: Hadoop
mtojek left a comment
LGTM!
What does this PR do?
Checklist

- Add an entry to the changelog.yml file.
- Update the manifest.yml file to point to the latest Elastic stack release (e.g. ^8.0.0).

How to test this PR locally