3 changes: 3 additions & 0 deletions packages/hadoop/_dev/build/build.yml
@@ -0,0 +1,3 @@
dependencies:
ecs:
reference: git@v8.0.0
37 changes: 37 additions & 0 deletions packages/hadoop/_dev/build/docs/README.md
@@ -0,0 +1,37 @@
# Hadoop

The Hadoop integration collects and parses data from the Hadoop Events APIs and via the Jolokia Metricbeat module.

## Compatibility

This integration has been tested against `Hadoop version 3.3.1`.

## Requirements

To ingest data from Hadoop, you must know the full host addresses for the NameNode, DataNode, Cluster Metrics, Node Manager, and Hadoop Events API endpoints.
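
For example, the application metrics data stream polls the YARN ResourceManager's Cluster Applications API. A quick reachability check (the host is a placeholder; 8088 is the ResourceManager's default HTTP port):

```sh
curl -s "http://resourcemanager:8088/ws/v1/cluster/apps"
```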

## Metrics

### Application Metrics

This is the `application_metrics` dataset.

{{event "application_metrics"}}

{{fields "application_metrics"}}

### Expanded Cluster Metrics

This is the `expanded_cluster_metrics` dataset.

{{event "expanded_cluster_metrics"}}

{{fields "expanded_cluster_metrics"}}

### Jolokia Metrics

This is the `jolokia_metrics` dataset.

{{event "jolokia_metrics"}}

{{fields "jolokia_metrics"}}
6 changes: 6 additions & 0 deletions packages/hadoop/_dev/deploy/docker/Dockerfiles/Dockerfile-namenode
@@ -0,0 +1,6 @@
ARG SERVICE_VERSION=${SERVICE_VERSION:-3}
FROM apache/hadoop:${SERVICE_VERSION}

ENV JOLOKIA_VERSION=1.6.0 JOLOKIA_ENABLED='yes' JOLOKIA_HOST=0.0.0.0
RUN wget "http://search.maven.org/remotecontent?filepath=org/jolokia/jolokia-jvm/${JOLOKIA_VERSION}/jolokia-jvm-${JOLOKIA_VERSION}-agent.jar" -O "jolokia-jvm-${JOLOKIA_VERSION}-agent.jar" && \
echo "export HDFS_NAMENODE_OPTS="-javaagent\:`echo /opt/hadoop/jolokia-jvm-${JOLOKIA_VERSION}-agent.jar=host=${JOLOKIA_HOST},port=7777`"" >> "/opt/hadoop/etc/hadoop/hadoop-env.sh"
30 changes: 30 additions & 0 deletions packages/hadoop/_dev/deploy/docker/config
@@ -0,0 +1,30 @@
CORE-SITE.XML_fs.default.name=hdfs://namenode:9000
CORE-SITE.XML_fs.defaultFS=hdfs://namenode:9000
HDFS-SITE.XML_dfs.namenode.rpc-address=namenode:9000
HDFS-SITE.XML_dfs.replication=1
LOG4J.PROPERTIES_log4j.rootLogger=INFO, stdout
LOG4J.PROPERTIES_log4j.appender.stdout=org.apache.log4j.ConsoleAppender
LOG4J.PROPERTIES_log4j.appender.stdout.layout=org.apache.log4j.PatternLayout
LOG4J.PROPERTIES_log4j.appender.stdout.layout.ConversionPattern=%d{yyyy-MM-dd HH:mm:ss} %-5p %c{1}:%L - %m%n
MAPRED-SITE.XML_mapreduce.framework.name=yarn
MAPRED-SITE.XML_yarn.app.mapreduce.am.env=HADOOP_MAPRED_HOME=/opt/hadoop
MAPRED-SITE.XML_mapreduce.map.env=HADOOP_MAPRED_HOME=/opt/hadoop
MAPRED-SITE.XML_mapreduce.reduce.env=HADOOP_MAPRED_HOME=/opt/hadoop
YARN-SITE.XML_yarn.resourcemanager.hostname=resourcemanager
YARN-SITE.XML_yarn.nodemanager.pmem-check-enabled=false
YARN-SITE.XML_yarn.nodemanager.delete.debug-delay-sec=600
YARN-SITE.XML_yarn.nodemanager.vmem-check-enabled=false
YARN-SITE.XML_yarn.nodemanager.aux-services=mapreduce_shuffle
CAPACITY-SCHEDULER.XML_yarn.scheduler.capacity.maximum-applications=10000
CAPACITY-SCHEDULER.XML_yarn.scheduler.capacity.maximum-am-resource-percent=0.1
CAPACITY-SCHEDULER.XML_yarn.scheduler.capacity.resource-calculator=org.apache.hadoop.yarn.util.resource.DefaultResourceCalculator
CAPACITY-SCHEDULER.XML_yarn.scheduler.capacity.root.queues=default
CAPACITY-SCHEDULER.XML_yarn.scheduler.capacity.root.default.capacity=100
CAPACITY-SCHEDULER.XML_yarn.scheduler.capacity.root.default.user-limit-factor=1
CAPACITY-SCHEDULER.XML_yarn.scheduler.capacity.root.default.maximum-capacity=100
CAPACITY-SCHEDULER.XML_yarn.scheduler.capacity.root.default.state=RUNNING
CAPACITY-SCHEDULER.XML_yarn.scheduler.capacity.root.default.acl_submit_applications=*
CAPACITY-SCHEDULER.XML_yarn.scheduler.capacity.root.default.acl_administer_queue=*
CAPACITY-SCHEDULER.XML_yarn.scheduler.capacity.node-locality-delay=40
CAPACITY-SCHEDULER.XML_yarn.scheduler.capacity.queue-mappings=
CAPACITY-SCHEDULER.XML_yarn.scheduler.capacity.queue-mappings-override.enable=false
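
As we understand the apache/hadoop image, its entrypoint translates each `FILE.XML_key=value` entry in this env file into a property in the corresponding Hadoop configuration file; an illustrative mapping (paths assume the image defaults):

```sh
# CORE-SITE.XML_fs.defaultFS=hdfs://namenode:9000 becomes, in
# /opt/hadoop/etc/hadoop/core-site.xml:
#   <property><name>fs.defaultFS</name><value>hdfs://namenode:9000</value></property>
```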
16 changes: 16 additions & 0 deletions packages/hadoop/_dev/deploy/docker/docker-compose.yml
@@ -0,0 +1,16 @@
version: "3"
services:
hadoop:
build:
context: ./Dockerfiles
dockerfile: Dockerfile-namenode
Contributor:

@yug-elastic Why do we need a custom Dockerfile? Is there something wrong with the original entry point? Correct me if I'm wrong, but JMX was already exposed there?

Contributor Author:

You're right, but we are using Jolokia, which, as per our understanding, wraps JMX. That is why we have added a custom Dockerfile here, inside which we set up Jolokia and configure Hadoop with it. Does that make sense?

Contributor:

Consider the docker-compose I provided you, without the custom Dockerfile.

If you go to http://localhost:9864/jmx, you can fetch all these metrics without having to install Jolokia, right? This is because the Metrics2 framework is exposed there.

Please check whether that API differs much from Jolokia. We shouldn't force users to install extensions when there is native support.

WDYT?
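
For comparison, the native endpoint mentioned above can be queried directly without any agent (a sketch assuming the DataNode's default HTTP port and the servlet's standard `qry` filter parameter):

```sh
curl -s "http://localhost:9864/jmx?qry=Hadoop:service=DataNode,name=*"
```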

hostname: namenode
command: ["hdfs", "namenode"]
ports:
- 7777
- 50070
- 9870
env_file:
- config
environment:
ENSURE_NAMENODE_DIR: "/tmp/hadoop-hadoop/dfs/name"
4 changes: 4 additions & 0 deletions packages/hadoop/_dev/deploy/variants.yml
@@ -0,0 +1,4 @@
variants:
v3:
SERVICE_VERSION: 3
default: v3
7 changes: 7 additions & 0 deletions packages/hadoop/changelog.yml
@@ -0,0 +1,7 @@
# newer versions go on top

- version: "0.1.0"
changes:
- description: Initial draft of the package.
type: enhancement
link: https://github.com/elastic/integrations/pull/2614
@@ -0,0 +1 @@
{"id":"application_1640863938020_0001","user":"root","name":"QuasiMonteCarlo","queue":"default","state":"FINISHED","finalStatus":"FAILED","progress":100.0,"trackingUI":"History","trackingUrl":"http://master-node1:8088/proxy/application_1640863938020_0001/","diagnostics":"Task failed task_1640863938020_0001_m_000003\nJob failed as tasks failed. failedMaps:1 failedReduces:0 killedMaps:0 killedReduces: 0\n","clusterId":1640863938020,"applicationType":"MAPREDUCE","applicationTags":"","priority":0,"startedTime":1640863969439,"launchTime":1640863971051,"finishedTime":1640863990010,"elapsedTime":20571,"amContainerLogs":"http://ip-10-0-13-217.cdsys.local:8042/node/containerlogs/container_1640863938020_0001_01_000001/root","amHostHttpAddress":"ip-10-0-13-217.cdsys.local:8042","amRPCAddress":"ip-10-0-13-217.cdsys.local:33989","masterNodeId":"ip-10-0-13-217.cdsys.local:44767","allocatedMB":-1,"allocatedVCores":-1,"reservedMB":-1,"reservedVCores":-1,"runningContainers":-1,"memorySeconds":106595,"vcoreSeconds":67,"queueUsagePercentage":0.0,"clusterUsagePercentage":0.0,"resourceSecondsMap":{"entry":{"key":"memory-mb","value":"106595"}},"preemptedResourceMB":0,"preemptedResourceVCores":0,"numNonAMContainerPreempted":0,"numAMContainerPreempted":0,"preemptedMemorySeconds":0,"preemptedVcoreSeconds":0,"preemptedResourceSecondsMap":{},"logAggregationStatus":"DISABLED","unmanagedApplication":false,"amNodeLabelExpression":"","timeouts":{"timeout":[{"type":"LIFETIME","expiryTime":"UNLIMITED","remainingTimeInSeconds":-1}]}}
@@ -0,0 +1,37 @@
{
"expected": [
{
"ecs": {
"version": "8.0.0"
},
"event": {
"category": "database",
"kind": "metric",
"original": "{\"id\":\"application_1640863938020_0001\",\"user\":\"root\",\"name\":\"QuasiMonteCarlo\",\"queue\":\"default\",\"state\":\"FINISHED\",\"finalStatus\":\"FAILED\",\"progress\":100.0,\"trackingUI\":\"History\",\"trackingUrl\":\"http://master-node1:8088/proxy/application_1640863938020_0001/\",\"diagnostics\":\"Task failed task_1640863938020_0001_m_000003\\nJob failed as tasks failed. failedMaps:1 failedReduces:0 killedMaps:0 killedReduces: 0\\n\",\"clusterId\":1640863938020,\"applicationType\":\"MAPREDUCE\",\"applicationTags\":\"\",\"priority\":0,\"startedTime\":1640863969439,\"launchTime\":1640863971051,\"finishedTime\":1640863990010,\"elapsedTime\":20571,\"amContainerLogs\":\"http://ip-10-0-13-217.cdsys.local:8042/node/containerlogs/container_1640863938020_0001_01_000001/root\",\"amHostHttpAddress\":\"ip-10-0-13-217.cdsys.local:8042\",\"amRPCAddress\":\"ip-10-0-13-217.cdsys.local:33989\",\"masterNodeId\":\"ip-10-0-13-217.cdsys.local:44767\",\"allocatedMB\":-1,\"allocatedVCores\":-1,\"reservedMB\":-1,\"reservedVCores\":-1,\"runningContainers\":-1,\"memorySeconds\":106595,\"vcoreSeconds\":67,\"queueUsagePercentage\":0.0,\"clusterUsagePercentage\":0.0,\"resourceSecondsMap\":{\"entry\":{\"key\":\"memory-mb\",\"value\":\"106595\"}},\"preemptedResourceMB\":0,\"preemptedResourceVCores\":0,\"numNonAMContainerPreempted\":0,\"numAMContainerPreempted\":0,\"preemptedMemorySeconds\":0,\"preemptedVcoreSeconds\":0,\"preemptedResourceSecondsMap\":{},\"logAggregationStatus\":\"DISABLED\",\"unmanagedApplication\":false,\"amNodeLabelExpression\":\"\",\"timeouts\":{\"timeout\":[{\"type\":\"LIFETIME\",\"expiryTime\":\"UNLIMITED\",\"remainingTimeInSeconds\":-1}]}}",
"type": "info"
},
"hadoop": {
"metrics": {
"application_metrics": {
"memory_seconds": 106595,
"progress": 100,
"resources_allocated": {
"mb": -1,
"vcores": -1
},
"running_containers": -1,
"time": {
"elapsed": 20571,
"finished": 1640863990010,
"started": 1640863969439
},
"vcore_seconds": 67
}
}
},
"tags": [
"preserve_original_event"
]
}
]
}
@@ -0,0 +1,3 @@
fields:
tags:
- preserve_original_event
@@ -0,0 +1,26 @@
config_version: 2
interval: {{period}}
request.method: GET
request.url: {{hostname}}/ws/v1/cluster/apps
{{#if proxy_url }}
request.proxy_url: {{proxy_url}}
{{/if}}
{{#if ssl}}
request.ssl: {{ssl}}
{{/if}}
response.split:
target: body.apps.app
tags:
{{#if preserve_original_event}}
- preserve_original_event
{{/if}}
{{#each tags as |tag i|}}
- {{tag}}
{{/each}}
{{#contains "forwarded" tags}}
publisher_pipeline.disable_host: true
{{/contains}}
{{#if processors}}
processors:
{{processors}}
{{/if}}
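
For context, `response.split` above matches the shape of the YARN Cluster Applications API response, which nests the application list under `apps.app`; a sketch with a placeholder host:

```sh
curl -s "http://resourcemanager:8088/ws/v1/cluster/apps"
# => {"apps":{"app":[{"id":"application_...","state":"FINISHED",...},{...}]}}
# One event is emitted per element of apps.app.
```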
@@ -0,0 +1,96 @@
---
description: Pipeline for parsing application metrics logs.
processors:
- set:
field: ecs.version
value: '8.0.0'
- rename:
field: message
target_field: event.original
ignore_missing: true
- json:
field: event.original
target_field: json
ignore_failure: true
- set:
field: event.type
value: info
- set:
field: event.kind
value: metric
- set:
field: event.category
value: database
- rename:
field: json.progress
target_field: hadoop.metrics.application_metrics.progress
ignore_missing: true
ignore_failure: true
- rename:
field: json.startedTime
target_field: hadoop.metrics.application_metrics.time.started
ignore_missing: true
ignore_failure: true
- rename:
field: json.finishedTime
target_field: hadoop.metrics.application_metrics.time.finished
ignore_missing: true
ignore_failure: true
- rename:
field: json.elapsedTime
target_field: hadoop.metrics.application_metrics.time.elapsed
ignore_missing: true
ignore_failure: true
- rename:
field: json.allocatedMB
target_field: hadoop.metrics.application_metrics.resources_allocated.mb
ignore_missing: true
ignore_failure: true
- rename:
field: json.allocatedVCores
target_field: hadoop.metrics.application_metrics.resources_allocated.vcores
ignore_missing: true
ignore_failure: true
- rename:
field: json.runningContainers
target_field: hadoop.metrics.application_metrics.running_containers
ignore_missing: true
ignore_failure: true
- rename:
field: json.memorySeconds
target_field: hadoop.metrics.application_metrics.memory_seconds
ignore_missing: true
ignore_failure: true
- rename:
field: json.vcoreSeconds
target_field: hadoop.metrics.application_metrics.vcore_seconds
ignore_missing: true
ignore_failure: true
- remove:
field: json
- script:
description: Drops null/empty values recursively.
lang: painless
source: |
boolean drop(Object o) {
if (o == null || o == "") {
return true;
} else if (o instanceof Map) {
((Map) o).values().removeIf(v -> drop(v));
return (((Map) o).size() == 0);
} else if (o instanceof List) {
((List) o).removeIf(v -> drop(v));
return (((List) o).length == 0);
}
return false;
}
drop(ctx);
- remove:
field: event.original
if: "ctx?.tags == null || !(ctx.tags.contains('preserve_original_event'))"
ignore_failure: true
ignore_missing: true
on_failure:
- set:
field: error.message
value: "{{{_ingest.on_failure_message}}}"
@@ -0,0 +1,22 @@
- name: data_stream.dataset
type: constant_keyword
description: Data stream dataset.
- name: data_stream.namespace
type: constant_keyword
description: Data stream namespace.
- name: data_stream.type
type: constant_keyword
description: Data stream type.
- name: event.dataset
type: constant_keyword
description: Event dataset.
value: hadoop.application_metrics
- name: event.module
type: constant_keyword
description: Event module.
value: hadoop
- name: '@timestamp'
type: date
description: Event timestamp.
- name: input.type
type: keyword
10 changes: 10 additions & 0 deletions packages/hadoop/data_stream/application_metrics/fields/ecs.yml
@@ -0,0 +1,10 @@
- external: ecs
name: event.category
- external: ecs
name: event.kind
- external: ecs
name: event.type
- external: ecs
name: ecs.version
- external: ecs
name: tags
@@ -0,0 +1,31 @@
- name: hadoop.metrics
type: group
release: beta
fields:
- name: application_metrics
Contributor:

I don't understand these metrics here. They look like random data points put in the same bucket. What exactly does memory_seconds mean? Why do we have time, running containers, vcore (?), and progress in the same group?

Contributor Author:

memory_seconds = the amount of memory the application has allocated.
We have put these fields in the same bucket with reference to the PRD. Please check this link.

type: group
fields:
- name: memory_seconds
type: long
- name: progress
type: long
- name: resources_allocated
type: group
fields:
- name: mb
type: long
- name: vcores
type: long
- name: running_containers
type: long
- name: time
type: group
fields:
- name: started
type: long
- name: elapsed
type: long
- name: finished
type: long
- name: vcore_seconds
type: long
39 changes: 39 additions & 0 deletions packages/hadoop/data_stream/application_metrics/manifest.yml
@@ -0,0 +1,39 @@
title: Application Metrics
type: logs
release: beta
streams:
- input: httpjson
vars:
- name: period
type: text
title: Period
default: 60s
- name: tags
type: text
title: Tags
multi: true
required: true
show_user: false
default:
- forwarded
- hadoop-application_metrics
- name: preserve_original_event
required: true
show_user: true
title: Preserve original event
description: Preserves a raw copy of the original event, added to the field `event.original`.
type: bool
multi: false
default: false
- name: processors
type: yaml
title: Processors
multi: false
required: false
show_user: false
description: >
Processors are used to reduce the number of fields in the exported event or to enhance the event with metadata. This executes in the agent before the logs are parsed. See [Processors](https://www.elastic.co/guide/en/beats/filebeat/current/filtering-and-enhancing-data.html) for details.

template_path: httpjson.yml.hbs
title: Hadoop application metrics
description: Collect Hadoop application metrics.