[receiver/akamaisiemreceiver] - Initial implementation of Akamai SIEM native OTEL receiver#1119

Merged
ShourieG merged 15 commits into
elastic:mainfrom
ShourieG:feature/akamai_receiver
May 12, 2026

Conversation

@ShourieG
Contributor

@ShourieG ShourieG commented Mar 27, 2026

Native Akamai SIEM Receiver

This is a dedicated OTel receiver for the Akamai SIEM API. It implements polling, the chain state machine, and NDJSON streaming directly with native OTel Collector interfaces. It targets the existing data stream and ingest pipeline in Elasticsearch for the Akamai integration.

Why

The CEL-based input works but uses a generic CEL execution engine, which adds overhead and limits control over Akamai-specific offset and chain details. This receiver communicates with the Akamai API directly through EdgeGrid authentication, NDJSON streaming, offset pagination, and chain recovery, without the CEL layer.

Related: elastic/security-integrations#733

Proposed Commit message

feat(receiver/akamaisiemreceiver): add native Akamai SIEM receiver

A dedicated receiver for the Akamai SIEM API.

- EdgeGrid HMAC-SHA256 auth. NDJSON responses are parsed line by line with a one-line delay. This keeps the trailing offset-context line out of the event stream.

- A three-branch chain state machine (offset drain, chain replay, new chain) handles 416 recovery, timestamp retries, clamping for too-old data, and offset TTL expiry.

- Bounded-memory streaming sets up a small channel between the scanner and the batched ConsumeLogs consumer. This limits peak memory usage regardless of page size.

- Optional cursor persistence is available through the OTel storage extension. Without this option, restarts will re-fetch from initial_lookback.

- Each record has a body map keyed "message" containing the raw JSON, plus additional data_stream information such as {type, dataset, namespace} on the body and resource. It also includes elastic.mapping.mode: bodymap on the scope. The pipeline can remain at [batch] since the Akamai integration's ingest pipeline handles ECS enrichment.

- There are 16 metrics, plus spans across Poll, FetchPage, ProcessPage, EmitEvents, and PersistCursor. FetchPage also carries the truncated API error body for non-200 responses to simplify debugging.

How it works

The receiver polls /siem/v1/configs/{id} using EdgeGrid HMAC-SHA256 signed requests. The responses are gzipped NDJSON. A scanner goroutine reads lines into a bounded channel (stream_buffer_size). It uses a one-line-delay pattern that separates events from the trailing offset-context line. A consumer goroutine batches these events and calls ConsumeLogs for each batch (batch_size). Peak memory is limited to stream_buffer_size + batch_size events, regardless of the page size.

The cursor state persists through the OTel storage extension interface (storage.Client) after each successful page. Storage is optional. Without it, chain state still tracks across poll cycles in memory, but every collector restart will re-fetch from initial_lookback.

The poller operates a three-branch state machine:

  • Offset drain: continues from a stored offset (steady state).
  • Chain replay: if the offset is expired or missing, it replays from a time window with overlap.
  • New chain: for the first run or if caught up; it starts a fresh time-based window.

Offset TTL detection, 416 recovery, invalid timestamp retries, and "from too old" clamping are all managed within the state machine.

Output shape

Each log record carries the raw Akamai JSON in LogRecord.Body as a map keyed by message, alongside data_stream.{type,dataset,namespace} body keys for Kibana filters. The same data_stream.* values are also set as resource attributes for the Elasticsearch exporter's dynamic routing. The scope carries elastic.mapping.mode: bodymap, which lets the ES exporter serialize the body map fields directly into the indexed document.

Scope attributes are part of the plog.Logs data structure, so they survive batch processor flushes and exporter queues without needing metadata_keys configuration. The ES exporter migration docs recommend this approach. Since the receiver writes the scope attribute itself, the user pipeline does not need a transform processor; the common case simplifies to [batch].

data_stream.* defaults to logs / akamai.siem / default, which can be configured on the receiver. The Akamai integration’s ingest pipeline handles ECS enrichment; the receiver does not parse or transform event content. Existing Kibana dashboards are immediately usable.

Observability

There are 16 metrics covering API health (requests, request_errors, request_duration, bytes_received), event flow (events_received, events_emitted, events_per_page, events_per_second), pagination (pages_processed, cursor_persists, offset_expired, offset_ttl_drops), recovery (recovery_attempts, invalid_timestamp_retries), and timing (page_processing_time, poll_duration).

Spans cover Poll, FetchPage, ProcessPage, EmitEvents, and PersistCursor. For non-200 responses, the FetchPage span includes akamai.api.status_code, akamai.api.detail, and akamai.api.body (truncated to 2 KB). This helps operators debug proxy errors and HTML error pages without needing debug logs.
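Truncating an error body for a span attribute can be sketched as below. This is an assumption about one reasonable approach, not the receiver's code; `truncateBody` and `maxBodyAttr` are hypothetical names. The sketch backs off to a UTF-8 boundary so the attribute value stays valid text.

```go
package main

import (
	"fmt"
	"unicode/utf8"
)

const maxBodyAttr = 2 * 1024 // 2 KB, matching the limit described above

// truncateBody returns at most limit bytes of body as a string, backing off
// so that a multi-byte UTF-8 sequence is not cut in half.
func truncateBody(body []byte, limit int) string {
	if len(body) <= limit {
		return string(body)
	}
	cut := limit
	// body[cut] is the first byte that would be dropped; if it is a
	// continuation byte, move back to the start of that rune.
	for cut > 0 && !utf8.RuneStart(body[cut]) {
		cut--
	}
	return string(body[:cut])
}

func main() {
	html := []byte("<html><body>502 Bad Gateway</body></html>")
	fmt.Println(truncateBody(html, maxBodyAttr))
	fmt.Println(truncateBody(html, 12))
}
```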

Benchmarks

Measured against a mock Akamai API with 100k events per page and a nop exporter, on an Apple M1 Max:

| Metric | Value |
| --- | --- |
| EPS | ~15.6k |
| CPU/event | 0.009 ms |
| KB/event | 1.4 |
| RSS | 105 MB |

The process is I/O-bound on gzip decompression and NDJSON streaming. Body-map construction effectively uses zero CPU per event.

What's in the PR

The PR contains 15 commits, organized for review:

| # | Commit | What |
| --- | --- | --- |
| 1 | feat: add module scaffold and telemetry definitions | go.mod, Makefile, metadata.yaml (16 metrics), generated metadata code, documentation.md. |
| 2 | feat: add EdgeGrid HMAC-SHA256 authentication | HMAC-SHA256 signing, http.RoundTripper transport wrapper. |
| 3 | feat: add HTTP client and NDJSON streaming | HTTP client, URL builder, APIError types, StreamEvents with one-line-delay pattern. |
| 4 | feat: add cursor persistence via storage extension | Cursor, CursorStore over storage.Client, TTL detection. |
| 5 | feat: add poller state machine and fetch loop | Three-branch state machine, page processing, error recovery, span/metric instrumentation. |
| 6 | feat: add config and validation | Config (squashed confighttp.ClientConfig), data_stream block, Validate. |
| 7 | feat: add factory wiring | OTel receiver.Factory, createLogsReceiver. |
| 8 | feat: add receiver lifecycle and bodymap emission | Start/Shutdown, pollLoop, emitEvents (body map + scope + resource attrs). |
| 9 | test: add receiver, integration, and tracing tests | Mock-API integration tests, scope/resource/body assertions, cursor persistence, tracing. |
| 10 | test: add benchmarks | emitEvents and full-poll benchmarks at multiple event counts. |
| 11 | docs: add README | Architecture, configuration reference, tuning guide, pipeline component reference, scenarios. |
| 12 | docs: add architecture diagrams | Two SVGs: overall architecture and bodymap data flow. |
| 13 | feat(distributions/elastic-components): register akamaisiemreceiver | Manifest gomod entry + replace directive. |
| 14 | chore: register akamaisiemreceiver ownership and ignore local artifacts | CODEOWNERS entry, .gitignore additions. |
| 15 | chore: regenerate metadata | mdatagen output refresh. |

Package structure

receiver/akamaisiemreceiver/
├── config.go, factory.go, receiver.go
└── internal/
    ├── akamaiclient/    HTTP client, NDJSON streaming, API errors.
    ├── auth/            EdgeGrid HMAC-SHA256 signing.
    ├── cursor/          Cursor state, storage extension persistence.
    ├── metadata/        Auto-generated telemetry.
    └── poller/          Three-branch state machine, fetch loop.

Tests

Integration tests run against a local httptest server with realistic NDJSON fixtures. They cover body map shape, scope and resource attribute injection, cursor persistence through a mock storage extension, severity defaults, and error recovery paths. Unit tests address config validation, EdgeGrid signing, NDJSON streaming, cursor operations, and poller state transitions. Benchmarks gauge emit throughput and full poll cycles.

End-to-end tests against a local Elasticsearch 8.17 stack with the Akamai integration installed confirm that documents reach logs-akamai.siem-default with full ECS enrichment from the ingest pipeline. This is true for both batch and non-batch pipeline configurations.

Quick start

receivers:
  akamai_siem:
    endpoint: "https://akab-xxxxx.luna.akamaiapis.net"
    config_ids: "12345"
    authentication:
      client_token: "${AKAMAI_CLIENT_TOKEN}"
      client_secret: "${AKAMAI_CLIENT_SECRET}"
      access_token: "${AKAMAI_ACCESS_TOKEN}"
    storage: file_storage   # optional — without it, every restart re-fetches from initial_lookback
    # data_stream defaults to logs / akamai.siem / default

extensions:
  file_storage:
    directory: /var/lib/otelcol/storage

processors:
  batch:                    # optional — recommended for amortizing flushes
    timeout: 10s

exporters:
  elasticsearch:
    endpoints: ["https://elasticsearch:9200"]
    api_key: "${ES_API_KEY}"

service:
  extensions: [file_storage]
  pipelines:
    logs:
      receivers: [akamai_siem]
      processors: [batch]
      exporters: [elasticsearch]

The minimum required setup is just the receiver and an exporter. Everything else (batch, file_storage, sending_queue, metrics endpoint) is optional. The README's scenarios cover these options, along with file export with rotation, console debug, and high-throughput production with persistent queueing.


Screenshots


ECS Discover Dashboard:

akamai_ecs_discover

Akamai SIEM Dashboard 1:

Akamai_dashboard_1

Akamai SIEM Dashboard 2:

Akamai_dashboard_2

@ShourieG ShourieG force-pushed the feature/akamai_receiver branch 2 times, most recently from 2547c30 to 2beae84 Compare March 28, 2026 17:37
@ShourieG ShourieG marked this pull request as ready for review March 28, 2026 18:01
@ShourieG ShourieG requested a review from a team as a code owner March 28, 2026 18:01
@coderabbitai

coderabbitai Bot commented Mar 28, 2026

Note

Reviews paused

It looks like this branch is under active development. To avoid overwhelming you with review comments due to an influx of new commits, CodeRabbit has automatically paused this review. You can configure this behavior by changing the reviews.auto_review.auto_pause_after_reviewed_commits setting.

Walkthrough

Introduces a new Akamai SIEM receiver module that polls the Akamai SIEM API using EdgeGrid-signed requests, streams NDJSON event pages, persists chain cursors to disk, and emits logs in three modes: raw (ECS passthrough), otel (semantic mapping), and dual (both outputs). Adds client, streaming, cursor store, poller state machine, mapping and telemetry code, shared-instance lifecycle, factory integration, metadata/manifest/module files, documentation and benchmarks, extensive unit/integration/tracing tests, testdata, and CODEOWNERS/.gitignore updates.


@coderabbitai coderabbitai Bot left a comment


Actionable comments posted: 2

🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@receiver/akamaisiemreceiver/factory.go`:
- Around line 77-103: The package-global map seenPartialKeys is accessed
concurrently in createLogsReceiver causing races; protect all accesses to
seenPartialKeys (both the check "if prev, seen := seenPartialKeys[pk]; seen &&
..." and the subsequent assignment seenPartialKeys[pk] = key) with a
package-level mutex (e.g., a sync.RWMutex or sync.Mutex such as
seenPartialKeysMu) so reads/writes are serialized; add the mutex variable next
to seenPartialKeys and wrap the read+write block in the appropriate Lock/Unlock
(or RLock for the read if using RWMutex and upgrade to Lock for the write) to
eliminate concurrent map read/write panics and race conditions involving
partialKey/connectionKey and createLogsReceiver.

In `@receiver/akamaisiemreceiver/go.mod`:
- Line 78: Update the grpc dependency from google.golang.org/grpc v1.79.2 to
v1.79.3 (or later) to address the authorization-bypass vulnerability; edit the
go.mod entry for google.golang.org/grpc to the newer version, then run go get
google.golang.org/grpc@v1.79.3 (or desired newer tag) and go mod tidy to update
go.sum and lock the change.
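The first finding (serializing access to a package-global map) can be sketched as follows. This is an illustration under assumptions, not the PR's fix: `registerKey` is a hypothetical helper, and the conflict condition `p != key` is a guess because the review comment elides it.

```go
package main

import (
	"fmt"
	"sync"
)

var (
	seenPartialKeysMu sync.Mutex
	seenPartialKeys   = map[string]string{} // partial key -> full connection key
)

// registerKey records key under pk. If pk is already registered with a
// different key (assumed conflict condition), it returns that key and false.
// The lock is held across the check and the write so both happen atomically.
func registerKey(pk, key string) (prev string, ok bool) {
	seenPartialKeysMu.Lock()
	defer seenPartialKeysMu.Unlock()
	if p, seen := seenPartialKeys[pk]; seen && p != key {
		return p, false
	}
	seenPartialKeys[pk] = key
	return "", true
}

func main() {
	var wg sync.WaitGroup
	for i := 0; i < 8; i++ {
		wg.Add(1)
		go func(i int) {
			defer wg.Done()
			// Concurrent callers no longer race on the map.
			registerKey("host-a", fmt.Sprintf("host-a|cfg-%d", i%2))
		}(i)
	}
	wg.Wait()

	seenPartialKeysMu.Lock()
	fmt.Println("registered partial keys:", len(seenPartialKeys))
	seenPartialKeysMu.Unlock()
}
```

Go's sync.RWMutex has no lock upgrade, so holding a plain sync.Mutex over both the check and the assignment is the simpler way to keep the two steps atomic.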

ℹ️ Review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: 3757fe26-f6b6-4c3b-bf5c-73352b3d90ce

📥 Commits

Reviewing files that changed from the base of the PR and between 5a7856c and 2beae84.

⛔ Files ignored due to path filters (13)
  • receiver/akamaisiemreceiver/go.sum is excluded by !**/*.sum
  • receiver/akamaisiemreceiver/img/01_eps_by_batch.png is excluded by !**/*.png
  • receiver/akamaisiemreceiver/img/02_eps_by_batch_and_buffer.png is excluded by !**/*.png
  • receiver/akamaisiemreceiver/img/03_rss_by_batch.png is excluded by !**/*.png
  • receiver/akamaisiemreceiver/img/04_alloc_per_event.png is excluded by !**/*.png
  • receiver/akamaisiemreceiver/img/05_cpu_per_event.png is excluded by !**/*.png
  • receiver/akamaisiemreceiver/img/06_sys_memory.png is excluded by !**/*.png
  • receiver/akamaisiemreceiver/img/07_page_processing_time.png is excluded by !**/*.png
  • receiver/akamaisiemreceiver/img/08_dual_mode_comparison.png is excluded by !**/*.png
  • receiver/akamaisiemreceiver/img/architecture_dual.svg is excluded by !**/*.svg
  • receiver/akamaisiemreceiver/img/architecture_ecs.svg is excluded by !**/*.svg
  • receiver/akamaisiemreceiver/img/architecture_otel.svg is excluded by !**/*.svg
  • receiver/akamaisiemreceiver/img/architecture_overview.svg is excluded by !**/*.svg
📒 Files selected for processing (40)
  • .github/CODEOWNERS
  • .gitignore
  • distributions/elastic-components/manifest.yaml
  • receiver/akamaisiemreceiver/BENCHMARKS.md
  • receiver/akamaisiemreceiver/Makefile
  • receiver/akamaisiemreceiver/README.md
  • receiver/akamaisiemreceiver/benchmark_test.go
  • receiver/akamaisiemreceiver/config.go
  • receiver/akamaisiemreceiver/config_test.go
  • receiver/akamaisiemreceiver/doc.go
  • receiver/akamaisiemreceiver/factory.go
  • receiver/akamaisiemreceiver/generated_component_test.go
  • receiver/akamaisiemreceiver/generated_package_test.go
  • receiver/akamaisiemreceiver/go.mod
  • receiver/akamaisiemreceiver/integration_test.go
  • receiver/akamaisiemreceiver/internal/auth/edgegrid.go
  • receiver/akamaisiemreceiver/internal/auth/edgegrid_test.go
  • receiver/akamaisiemreceiver/internal/mapper/benchmark_test.go
  • receiver/akamaisiemreceiver/internal/mapper/otel.go
  • receiver/akamaisiemreceiver/internal/mapper/otel_test.go
  • receiver/akamaisiemreceiver/internal/metadata/generated_status.go
  • receiver/akamaisiemreceiver/internal/metadata/generated_telemetry.go
  • receiver/akamaisiemreceiver/internal/sharedcomponent/benchmark_test.go
  • receiver/akamaisiemreceiver/internal/sharedcomponent/client.go
  • receiver/akamaisiemreceiver/internal/sharedcomponent/client_test.go
  • receiver/akamaisiemreceiver/internal/sharedcomponent/cursor.go
  • receiver/akamaisiemreceiver/internal/sharedcomponent/cursor_test.go
  • receiver/akamaisiemreceiver/internal/sharedcomponent/poller.go
  • receiver/akamaisiemreceiver/internal/sharedcomponent/poller_test.go
  • receiver/akamaisiemreceiver/internal/sharedcomponent/shared.go
  • receiver/akamaisiemreceiver/internal/sharedcomponent/shared_test.go
  • receiver/akamaisiemreceiver/metadata.yaml
  • receiver/akamaisiemreceiver/receiver.go
  • receiver/akamaisiemreceiver/receiver_test.go
  • receiver/akamaisiemreceiver/testdata/config.yaml
  • receiver/akamaisiemreceiver/testdata/siem_response.ndjson
  • receiver/akamaisiemreceiver/testdata/siem_response_empty.ndjson
  • receiver/akamaisiemreceiver/testdata/siem_response_full.ndjson
  • receiver/akamaisiemreceiver/testdata/siem_response_no_offset.ndjson
  • receiver/akamaisiemreceiver/tracing_test.go

Comment thread receiver/akamaisiemreceiver/factory.go Outdated
Comment thread receiver/akamaisiemreceiver/go.mod Outdated
@ShourieG ShourieG force-pushed the feature/akamai_receiver branch 2 times, most recently from 2583aff to 83725d8 Compare March 29, 2026 13:25

@coderabbitai coderabbitai Bot left a comment


Actionable comments posted: 2

🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@receiver/akamaisiemreceiver/benchmark_test.go`:
- Line 132: The test is doing a direct type assertion akRcv :=
rcv.(*akamaiReceiver) which will panic because CreateLogs returns a
receiver.Logs backed by a SharedComponent wrapper from LoadOrStore; replace the
factory-backed creation with the existing benchReceiver helper used by the
EmitEvents benchmarks (or refactor the test to call the emitEvents logic
directly) so you get a real *akamaiReceiver instance for testing; locate usages
of CreateLogs, rcv, akRcv and change the setup to use benchReceiver (or extract
emitEvents into a testable function) to avoid asserting the SharedComponent
wrapper to *akamaiReceiver.

In `@receiver/akamaisiemreceiver/internal/auth/edgegrid.go`:
- Around line 116-125: Transport.RoundTrip can panic when t.Base is nil because
http.Client allows a nil Transport; update RoundTrip to use
http.DefaultTransport as a fallback (e.g., set base := t.Base; if base == nil {
base = http.DefaultTransport.(http.RoundTripper) } ) before calling
base.RoundTrip(clone), ensuring you reference Transport.RoundTrip, t.Base, and
http.DefaultTransport when locating and modifying the code.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: 815fbd50-8a07-4fd8-94aa-82da5dedf0f0

📥 Commits

Reviewing files that changed from the base of the PR and between 5122d5e and 83725d8.

⛔ Files ignored due to path filters (13)
  • receiver/akamaisiemreceiver/go.sum is excluded by !**/*.sum
  • receiver/akamaisiemreceiver/img/01_eps_by_batch.png is excluded by !**/*.png
  • receiver/akamaisiemreceiver/img/02_eps_by_batch_and_buffer.png is excluded by !**/*.png
  • receiver/akamaisiemreceiver/img/03_rss_by_batch.png is excluded by !**/*.png
  • receiver/akamaisiemreceiver/img/04_alloc_per_event.png is excluded by !**/*.png
  • receiver/akamaisiemreceiver/img/05_cpu_per_event.png is excluded by !**/*.png
  • receiver/akamaisiemreceiver/img/06_sys_memory.png is excluded by !**/*.png
  • receiver/akamaisiemreceiver/img/07_page_processing_time.png is excluded by !**/*.png
  • receiver/akamaisiemreceiver/img/08_dual_mode_comparison.png is excluded by !**/*.png
  • receiver/akamaisiemreceiver/img/architecture_dual.svg is excluded by !**/*.svg
  • receiver/akamaisiemreceiver/img/architecture_otel.svg is excluded by !**/*.svg
  • receiver/akamaisiemreceiver/img/architecture_overview.svg is excluded by !**/*.svg
  • receiver/akamaisiemreceiver/img/architecture_raw.svg is excluded by !**/*.svg
📒 Files selected for processing (40)
  • .github/CODEOWNERS
  • .gitignore
  • distributions/elastic-components/manifest.yaml
  • receiver/akamaisiemreceiver/BENCHMARKS.md
  • receiver/akamaisiemreceiver/Makefile
  • receiver/akamaisiemreceiver/README.md
  • receiver/akamaisiemreceiver/benchmark_test.go
  • receiver/akamaisiemreceiver/config.go
  • receiver/akamaisiemreceiver/config_test.go
  • receiver/akamaisiemreceiver/doc.go
  • receiver/akamaisiemreceiver/factory.go
  • receiver/akamaisiemreceiver/generated_component_test.go
  • receiver/akamaisiemreceiver/generated_package_test.go
  • receiver/akamaisiemreceiver/go.mod
  • receiver/akamaisiemreceiver/integration_test.go
  • receiver/akamaisiemreceiver/internal/auth/edgegrid.go
  • receiver/akamaisiemreceiver/internal/auth/edgegrid_test.go
  • receiver/akamaisiemreceiver/internal/mapper/benchmark_test.go
  • receiver/akamaisiemreceiver/internal/mapper/otel.go
  • receiver/akamaisiemreceiver/internal/mapper/otel_test.go
  • receiver/akamaisiemreceiver/internal/metadata/generated_status.go
  • receiver/akamaisiemreceiver/internal/metadata/generated_telemetry.go
  • receiver/akamaisiemreceiver/internal/sharedcomponent/benchmark_test.go
  • receiver/akamaisiemreceiver/internal/sharedcomponent/client.go
  • receiver/akamaisiemreceiver/internal/sharedcomponent/client_test.go
  • receiver/akamaisiemreceiver/internal/sharedcomponent/cursor.go
  • receiver/akamaisiemreceiver/internal/sharedcomponent/cursor_test.go
  • receiver/akamaisiemreceiver/internal/sharedcomponent/poller.go
  • receiver/akamaisiemreceiver/internal/sharedcomponent/poller_test.go
  • receiver/akamaisiemreceiver/internal/sharedcomponent/shared.go
  • receiver/akamaisiemreceiver/internal/sharedcomponent/shared_test.go
  • receiver/akamaisiemreceiver/metadata.yaml
  • receiver/akamaisiemreceiver/receiver.go
  • receiver/akamaisiemreceiver/receiver_test.go
  • receiver/akamaisiemreceiver/testdata/config.yaml
  • receiver/akamaisiemreceiver/testdata/siem_response.ndjson
  • receiver/akamaisiemreceiver/testdata/siem_response_empty.ndjson
  • receiver/akamaisiemreceiver/testdata/siem_response_full.ndjson
  • receiver/akamaisiemreceiver/testdata/siem_response_no_offset.ndjson
  • receiver/akamaisiemreceiver/tracing_test.go
✅ Files skipped from review due to trivial changes (20)
  • .gitignore
  • receiver/akamaisiemreceiver/testdata/siem_response_empty.ndjson
  • .github/CODEOWNERS
  • receiver/akamaisiemreceiver/testdata/siem_response_no_offset.ndjson
  • distributions/elastic-components/manifest.yaml
  • receiver/akamaisiemreceiver/doc.go
  • receiver/akamaisiemreceiver/testdata/siem_response.ndjson
  • receiver/akamaisiemreceiver/internal/metadata/generated_status.go
  • receiver/akamaisiemreceiver/metadata.yaml
  • receiver/akamaisiemreceiver/go.mod
  • receiver/akamaisiemreceiver/generated_component_test.go
  • receiver/akamaisiemreceiver/testdata/siem_response_full.ndjson
  • receiver/akamaisiemreceiver/config_test.go
  • receiver/akamaisiemreceiver/internal/sharedcomponent/shared_test.go
  • receiver/akamaisiemreceiver/README.md
  • receiver/akamaisiemreceiver/internal/sharedcomponent/client_test.go
  • receiver/akamaisiemreceiver/internal/sharedcomponent/cursor.go
  • receiver/akamaisiemreceiver/integration_test.go
  • receiver/akamaisiemreceiver/internal/metadata/generated_telemetry.go
  • receiver/akamaisiemreceiver/internal/sharedcomponent/poller.go
🚧 Files skipped from review as they are similar to previous changes (4)
  • receiver/akamaisiemreceiver/generated_package_test.go
  • receiver/akamaisiemreceiver/testdata/config.yaml
  • receiver/akamaisiemreceiver/internal/mapper/benchmark_test.go
  • receiver/akamaisiemreceiver/config.go

Comment thread receiver/akamaisiemreceiver/benchmark_test.go
Comment thread receiver/akamaisiemreceiver/internal/auth/edgegrid.go
@axw
Member

axw commented Mar 30, 2026

@ShourieG this is only for EDOT Collector right? Should it be in the elastic-agent repo? See also #1046 (comment)

@ShourieG
Contributor Author

ShourieG commented Mar 30, 2026

#1046 (comment)

@axw, my idea was that this receiver should be agent-agnostic: we will create a component in the Elastic Agent and tie this in as a factory, but we also want it to function as an independent receiver that can work with any OTel backend in headless mode or with a different agent. It's built to work independently while staying compatible with Elastic Agent.

Member

@axw axw left a comment


I've only skimmed, since I was curious about the performance. There's some low-hanging fruit for optimising the full OTel SemConv configuration.

Comment thread receiver/akamaisiemreceiver/internal/mapper/otel.go Outdated
Comment thread receiver/akamaisiemreceiver/receiver.go Outdated

@coderabbitai coderabbitai Bot left a comment


Actionable comments posted: 1

🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@receiver/akamaisiemreceiver/README.md`:
- Around line 681-697: The example for receiver "akamai_siem" places the timeout
key at the receiver root but HTTP.Timeout must be under an http: mapping; move
the `timeout: 120s` entry into an `http:` block within the `akamai_siem:` config
(i.e., add an `http:` mapping and place `timeout: 120s` beneath it) so the
`HTTP.Timeout` setting is validated correctly for the akamai_siem receiver
(Scenario 8).

ℹ️ Review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: c0518efe-5138-4fe5-9d3e-69983ddfdc12

📥 Commits

Reviewing files that changed from the base of the PR and between bfbe7b4 and ab5f81e.

⛔ Files ignored due to path filters (2)
  • receiver/akamaisiemreceiver/img/04_alloc_per_event.png is excluded by !**/*.png
  • receiver/akamaisiemreceiver/img/08_dual_mode_comparison.png is excluded by !**/*.png
📒 Files selected for processing (4)
  • receiver/akamaisiemreceiver/BENCHMARKS.md
  • receiver/akamaisiemreceiver/README.md
  • receiver/akamaisiemreceiver/internal/mapper/otel.go
  • receiver/akamaisiemreceiver/internal/mapper/otel_test.go

Comment thread receiver/akamaisiemreceiver/README.md
@ShourieG
Contributor Author

Hey @axw, whom should I reach out to about starting the review process on this? I was wondering if it would be possible to get this in before 9.4, but if there are concerns I'm happy to delay and work on it as necessary.

@axw
Member

axw commented Apr 1, 2026

@ShourieG have you had a review from your team? Probably best to start there, if not.
@cmacknz what's the usual process for components that will make their way into Elastic Agent?

A few high level comments:

@ShourieG
Contributor Author

ShourieG commented Apr 1, 2026

Hey @axw, @andrewkroh wants to review this but he does not seem to have the permission to add himself as a reviewer.

@ShourieG
Contributor Author

ShourieG commented Apr 1, 2026

@ShourieG have you had a review from your team? Probably best to start there, if not. @cmacknz what's the usual process for components that will make their way into Elastic Agent?

A few high level comments:

@axw, thanks for the feedback. On the separate-PR issue, I can rewrite the commit history into clean, segregated commits so that a commit-by-commit review can be done instead. It's already like this to an extent, but I can make it more extensive and granular.

On the other points:

  • The scraperhelper model does not fit the custom Akamai poll logic: chain draining, log streaming, and calling emitEvents() per batch inside a page. It would not work with our streaming model and custom chaining logic.

  • Refactoring sharedcomponent into specialised packages makes sense, and I will do that.

  • The OTel storage extension makes sense if we want to donate this to upstream contrib. I'll look at this.

  • In terms of telemetry, only batches_emitted seems redundant at the moment; the rest is very Akamai-specific.

@ShourieG ShourieG force-pushed the feature/akamai_receiver branch from 88bb79e to 76ca9ac Compare April 1, 2026 17:15
@ShourieG
Contributor Author

ShourieG commented Apr 1, 2026

@axw, I've addressed all the main outstanding issues except two: scraperhelper, which is not compatible with our custom polling and streaming logic, and the multiple-PR approach. I believe clean, focused, granular commits should help the review process; multiple PRs would be messy because of the logic coupling between components and would increase the overall review turnaround time.

I re-did the commit history to make it more granular for easy reviews.

@ShourieG ShourieG force-pushed the feature/akamai_receiver branch from 847c3a2 to 56a2e5d Compare April 2, 2026 05:42
@axw
Member

axw commented Apr 9, 2026

@ShourieG sorry for the delay, this dropped off my radar after the long weekend.

Hey @axw, @andrewkroh wants to review this but he does not seem to have the permission to add himself as a reviewer.

I'll look into how we can fix that.

The scraperhelper model does not fit the custom akamai poll logic, how we use chain draining, streaming the logs and emitEvents() per batch inside a page. scraperhelper would not fit with the streaming model with our custom chaining logic here.

OK. FYI, the reason I ask is because we'll soon have the ability to trigger scrapes with alternative methods other than a simple in-process timer: open-telemetry/opentelemetry-collector#14469

@axw axw requested a review from andrewkroh April 9, 2026 07:35
Comment thread receiver/akamaisiemreceiver/receiver.go Outdated
@ShourieG
Contributor Author

ShourieG commented Apr 13, 2026

@axw I think we can revisit this in the future, but as it stands there are some bottlenecks beyond the custom polling logic:

  • emitEvents() calls both the raw and OTel consumers in parallel per batch. As far as I know, scraperhelper has no concept of per-request routing to different consumers at the moment.

  • The current scraper signature is func scrape(ctx context.Context) (plog.Logs, error). This is a limitation for us because of the streaming approach we use to reduce back pressure: our NDJSON streaming uses a bounded channel between the scanner and the consumer. If we had to buffer everything into a single plog.Logs return value, we'd need to hold 100k+ events in memory per page instead of streaming them in batches.

  • Currently a single poll cycle fetches N pages until the chain is drained. Each page triggers multiple ConsumeLogs calls (one per batch of 1000/batch_size events). scraperhelper expects one scrape → one return → one ConsumeLogs.

Will all these limitations change with the future updates?

@axw
Member

axw commented Apr 14, 2026

Will all these limitations change with the future updates?

All valid points. I don't think the dual-output emitEvents one would change, but I think the others will need to, and I'm planning to raise them in the Collector SIG tomorrow. If we need to support sending to two consumers concurrently, then I think that could possibly be done with sharedcomponent.

@ShourieG ShourieG self-assigned this Apr 15, 2026
@ShourieG ShourieG added the enhancement New feature or request label Apr 15, 2026
"fmt"
"time"

"go.opentelemetry.io/collector/extension/xextension/storage"
Member

Have you tested that this works with https://github.com/elastic/beats/tree/main/x-pack/otel/extension/elasticsearchstorage? That's what you would need for compatibility with agentless.

Outside of agentless, users will need to manually configure a filestorage extension, the input won't be stateful by default like it is in Filebeat.

Contributor Author

@cmacknz, currently it's not compatible with elasticsearchstorage because that extension lacks a GetClient() method. elasticsearchstorage needs to implement the storage.Extension interface on its end to be compatible. I could add storage-specific code and libraries and do something like

switch {
case isStorageExtension(ext):
    return ext.(storage.Extension).GetClient(...)
case isBeatsRegistry(ext):
    store, _ := ext.(backend.Registry).Access(...)
    return &adapter{store}, nil
default:
    return nil, fmt.Errorf("unsupported storage type %q", id)
}

but this seems to break the philosophy of OTel receivers being storage agnostic. If elasticsearchstorage implements storage.Extension, the current receiver implementation would work seamlessly without any changes or storage-specific code.

Contributor Author

I think the complexity comes from how the storage extension and the Beats statestore handle the actual data: the OTel storage extension works with raw []byte, while elasticsearchstorage works with interface{} and does custom encoding on the value. I had a brainstorming session with Claude and came up with a plan for handling this in Beats.

We could also break convention and put storage-specific handling inside the receiver itself. I think that's the general call to make here. If you find the plan somewhat viable, I can create a PR in Beats for it.

Member

Thanks for taking an initial look. The elasticsearch storage extension should be compatible with any receiver; we just didn't have that need when we originally adapted it to the storage interface, and it was done under some time pressure.

Probably easier to start with a PR if that is easy, then we can see exactly what the changes need to be instead of discussing what they might be in an issue. CC @VihasMakwana who did the initial port of the Beat ES registry storage to the collector storage interface.

@ShourieG
Contributor Author

I don't have to maintain this so it's ultimately your call on if you want the dual mode feature, I just wanted to make sure you had a confirmed use case for this because it is adding complexity and this is to my knowledge the first receiver that works this way. Everything else is either/or, not that this is an invalid thing to do on its own.

@cmacknz, just as an update after discussions with @andrewkroh, latest commits have removed dual mode operation for the moment and have simplified the receiver operation.

@ShourieG
Contributor Author

@andrewkroh, @axw, @cmacknz if this PR looks good at the moment, can we proceed with the approval?

Comment thread receiver/akamaisiemreceiver/receiver.go Outdated

// emitRaw sends events as raw JSON body maps. Each log record body is a map
// with key "message" containing the raw Akamai JSON string. The downstream
// pipeline should set elastic.mapping.mode: bodymap via a transform processor
Member

Why don't you set the mapping mode attribute here directly instead of requiring it to be set in a transform processor?

Contributor Author

This is mainly to stick to the OTel design philosophy. If we want to contribute this upstream, should the receiver explicitly define the mapping mode? Won't that tightly couple the receiver's behaviour to Elastic-specific implementations by default?

Member

This doesn't follow any upstream convention, the bodymap mapping mode is Elastic specific and we made it for ourselves to allow transporting non-OTel data in the OTLP logs signal.

The JSON document is ECS, is it not? This entire mode of execution (along with beats receivers) is vendor specific and that's on purpose. It lets us standardize on the collector as a data collection framework without having to migrate customers outside of Observability to OTLP+SemConv, which has no automatic solution.

Contributor Author

@ShourieG ShourieG May 4, 2026

The JSON document in output_mode:raw is the raw JSON event wrapped in the message field; it does not have any ECS-specific mapping yet. The entire ECS mapping happens in the ingest pipelines. In otel mode we do the SemConv mapping in the receiver itself.

Because this JSON is the raw event without any modifications, I thought it made sense in this receiver's case for the mapping mode to be set using the transform processor. But since bodymap is an Elastic convention, it probably makes sense to tie it into the receiver itself. I will make this change and test it. Please let me know if there are any other concerns about the current approach and I can address them in the next commit.

Member

Yes, it's less about ECS then, and more about what structure you want the document to have coming out of the exporter. bodymap will send only the JSON event without any of the containing OTLP logs structure; without bodymap you'd send the OTLP log as is, with no attributes or scope values set.

People who want the OTLP structure will just use the otel mode. I don't think there's a use for wrapping the raw JSON event in an OTLP log record with none of the attributes set properly.

Member

Thanks, the mapping LGTM

Comment thread receiver/akamaisiemreceiver/config.go Outdated
Comment thread receiver/akamaisiemreceiver/config.go Outdated
Comment thread receiver/akamaisiemreceiver/internal/mapper/otel.go Outdated
Comment thread receiver/akamaisiemreceiver/documentation.md Outdated
Comment thread receiver/akamaisiemreceiver/integration_test.go Outdated
Comment thread receiver/akamaisiemreceiver/receiver.go Outdated
Comment thread receiver/akamaisiemreceiver/internal/poller/poller.go Outdated
Comment thread receiver/akamaisiemreceiver/internal/poller/poller.go Outdated
Comment thread receiver/akamaisiemreceiver/internal/mapper/otel.go Outdated
Comment thread receiver/akamaisiemreceiver/internal/poller/poller.go Outdated
@ShourieG ShourieG force-pushed the feature/akamai_receiver branch from 736451f to 9626a9c Compare May 5, 2026 13:16
@ShourieG
Contributor Author

ShourieG commented May 5, 2026

@andrewkroh, I addressed all your suggestions and removed OTEL mode for now, which means we no longer require an explicit output_format.

@cmacknz, I implemented your suggestions and coupled bodymap into the log scope in the receiver, so there's no need for a separate transform processor anymore.

I had claude re-write the commit history and clean it up for easy review. The config is much simpler now and hopefully it will be easier to maintain as an initial release.

@ShourieG ShourieG force-pushed the feature/akamai_receiver branch from 9626a9c to 96b36c1 Compare May 5, 2026 13:38
@andrewkroh
Member

I had claude re-write the commit history and clean it up for easy review.

That might help if someone is starting a fresh review. But for reviewers that already started it means re-reviewing everything rather than doing a differential review of changes since last review.

@ShourieG
Contributor Author

@axw, @cmacknz could I get a sign-off on the PR so it's approved to merge?

Member

@axw axw left a comment

LGTM. I've still only skimmed it.

@ShourieG ShourieG merged commit cdd457b into elastic:main May 12, 2026
19 checks passed
@ShourieG ShourieG deleted the feature/akamai_receiver branch May 12, 2026 11:33
Labels

enhancement New feature or request

4 participants