[beatreceiver] - Add status reporting#44782

Merged
VihasMakwana merged 20 commits into main from wrap-runner-factory on Jun 26, 2025

Conversation

@VihasMakwana
Contributor

@VihasMakwana VihasMakwana commented Jun 12, 2025

Proposed commit message

This PR adds status reporting for beatreceivers. The status reporting is added while creating the runners. The first PR (#44528) was quite "hacky": it had to go deep down to inject status reporters.

This PR adds a runner factory wrapper that will:

  1. Call the parent factory to create the runner
  2. Inject the status reporter into the runner
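
A minimal sketch of the wrapper idea described above, not the actual libbeat code. All types here (`Runner`, `RunnerFactory`, `StatusReporter`, `WithStatusReporter`, `demoRunner`, `demoFactory`, `recorder`) are simplified, hypothetical stand-ins, and `StatusReporterFactory` only borrows its name from the snippet in this description; its real signature may differ.

```go
package main

import (
	"errors"
	"fmt"
)

// Status is a simplified stand-in for a component status value.
type Status string

const (
	Running  Status = "RUNNING"
	Degraded Status = "DEGRADED"
)

// StatusReporter is a hypothetical, minimal reporter interface.
type StatusReporter interface {
	UpdateStatus(s Status, msg string)
}

// Runner and RunnerFactory are hypothetical, minimal versions of the real interfaces.
type Runner interface{ Start() error }

type RunnerFactory interface {
	Create(cfg string) (Runner, error)
}

// WithStatusReporter is implemented by runners that accept a reporter.
type WithStatusReporter interface {
	SetStatusReporter(r StatusReporter)
}

// FactoryWrapper decorates a RunnerFactory with extra behavior.
type FactoryWrapper func(RunnerFactory) RunnerFactory

type reportingFactory struct {
	parent   RunnerFactory
	reporter StatusReporter
}

func (f *reportingFactory) Create(cfg string) (Runner, error) {
	r, err := f.parent.Create(cfg) // step 1: call the parent factory
	if err != nil {
		return nil, err
	}
	if w, ok := r.(WithStatusReporter); ok { // step 2: inject the reporter
		w.SetStatusReporter(f.reporter)
	}
	return r, nil
}

// StatusReporterFactory returns a wrapper that injects rep into every runner.
func StatusReporterFactory(rep StatusReporter) FactoryWrapper {
	return func(parent RunnerFactory) RunnerFactory {
		return &reportingFactory{parent: parent, reporter: rep}
	}
}

// demoRunner fails on Start when fail is set and reports its status.
type demoRunner struct {
	fail     bool
	reporter StatusReporter
}

func (r *demoRunner) SetStatusReporter(rep StatusReporter) { r.reporter = rep }

func (r *demoRunner) Start() error {
	if r.fail {
		err := errors.New("cannot read from file source")
		if r.reporter != nil {
			r.reporter.UpdateStatus(Degraded, err.Error())
		}
		return err
	}
	if r.reporter != nil {
		r.reporter.UpdateStatus(Running, "")
	}
	return nil
}

type demoFactory struct{}

func (demoFactory) Create(cfg string) (Runner, error) {
	return &demoRunner{fail: cfg == "fail"}, nil
}

// recorder remembers the last reported status.
type recorder struct {
	last Status
	msg  string
}

func (r *recorder) UpdateStatus(s Status, msg string) { r.last, r.msg = s, msg }

func main() {
	rec := &recorder{}
	factory := StatusReporterFactory(rec)(demoFactory{})
	runner, _ := factory.Create("fail")
	_ = runner.Start()
	fmt.Println(rec.last, rec.msg)
}
```

The parent factory stays unaware of status reporting; only callers that install the wrapper get reporters injected.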

The code responsible for the above tasks will live in libbeat, and we will only enable it for beatreceivers. At a high level, the beat receiver will do the following:

  1. The beater will be created in createReceiver
  2. We will add the factory wrapper
    if w, ok := br.beater.(cfgfile.WithFactoryWrapper); ok {
        groupReporter := status.NewGroupStatusReporter(host)
        w.WithFactoryWrapper(status.StatusReporterFactory(groupReporter))
    }
  3. The receiver will kick off the beater
    func (br *BeatReceiver) Start() error {
        if err := br.beater.Run(&br.beat.Beat); err != nil {
            return fmt.Errorf("beat receiver run error: %w", err)
        }
        return nil
    }

Note:

To accomplish the above steps, it is essential that we create the runners in beater.Run(...). Currently, metricbeat creates runners during the beater creation phase and only starts them in beater.Run(...), so a wrapper added after creation would never be applied. This PR moves the runner creation code into beater.Run(...) to closely align with filebeat's implementation.
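
To illustrate why the ordering matters, here is a small sketch under simplified assumptions. `Beater`, `Factory`, `Wrapper`, and `NewBeater` are hypothetical stand-ins (runners are modeled as plain strings), not the real metricbeat or filebeat types. Because runners are only built inside `Run`, a wrapper installed after construction still takes effect.

```go
package main

import "fmt"

// Factory builds a named runner; a hypothetical stand-in for a runner factory.
type Factory func(name string) string

// Wrapper decorates a Factory.
type Wrapper func(Factory) Factory

// Beater is a hypothetical beater that defers runner creation to Run.
type Beater struct {
	factory Factory
	wrapper Wrapper
	configs []string
}

// NewBeater only stores configuration; it does not create any runners.
func NewBeater(configs ...string) *Beater {
	base := func(name string) string { return "runner(" + name + ")" }
	return &Beater{factory: base, configs: configs}
}

// WithFactoryWrapper records a wrapper to apply when runners are created.
func (b *Beater) WithFactoryWrapper(w Wrapper) { b.wrapper = w }

// Run creates the runners, so a wrapper set after NewBeater still applies.
func (b *Beater) Run() []string {
	f := b.factory
	if b.wrapper != nil {
		f = b.wrapper(f)
	}
	runners := make([]string, 0, len(b.configs))
	for _, c := range b.configs {
		runners = append(runners, f(c))
	}
	return runners
}

func main() {
	b := NewBeater("system/metrics")
	// The receiver installs the wrapper between creation and Run.
	b.WithFactoryWrapper(func(parent Factory) Factory {
		return func(name string) string { return parent(name) + "+status" }
	})
	fmt.Println(b.Run())
}
```

Had `NewBeater` built the runners eagerly, the wrapper would arrive too late to influence them.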

Checklist

  • My code follows the style guidelines of this project
  • I have commented my code, particularly in hard-to-understand areas
  • I have made corresponding changes to the documentation
  • I have made corresponding changes to the default configuration files
  • I have added tests that prove my fix is effective or that my feature works
  • I have added an entry in CHANGELOG.next.asciidoc or CHANGELOG-developer.next.asciidoc.

Related issues

Screenshots

Screenshot 2025-05-29 at 8 14 07 PM

Output

Here's the output of running two degraded streams together:

┌─ fleet
│  └─ status: (STOPPED) Not enrolled into Fleet
└─ elastic-agent
   ├─ status: (DEGRADED) 1 or more components/units in a degraded state
   ├─ pipeline:logs/_agent-component/filestream-default
   │  ├─ status: StatusRecoverableError [error while running harvester: cannot read from file source: /var/log/elasticAgent-install-20240625_133733.log]
   │  ├─ exporter:elasticsearch/_agent-component/default
   │  │  └─ status: StatusOK
   │  └─ receiver:filebeatreceiver/_agent-component/filestream-default
   │     └─ status: StatusRecoverableError [error while running harvester: cannot read from file source: /var/log/elasticAgent-install-20240625_133733.log]
   └─ pipeline:logs/_agent-component/system/metrics-default
      ├─ status: StatusRecoverableError [Error fetching data for metricset system.process: error fetching process list: non fatal error; reporting partial metrics: error fetching PID metrics for 607 processes, most likely a "permission denied" error. Enable debug logging to determine the exact cause.]
      ├─ exporter:elasticsearch/_agent-component/default
      │  └─ status: StatusOK
      └─ receiver:metricbeatreceiver/_agent-component/system/metrics-default
         └─ status: StatusRecoverableError [Error fetching data for metricset system.process: error fetching process list: non fatal error; reporting partial metrics: error fetching PID metrics for 607 processes, most likely a "permission denied" error. Enable debug logging to determine the exact cause.]
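
One plausible way a group reporter could fold several runner statuses into the single receiver status shown above is "worst status wins". This is only a sketch of that assumption, not the behavior of `status.NewGroupStatusReporter`, which this description does not spell out. `GroupReporter`, `Host`, `lastHost`, and the `Status` constants are hypothetical names.

```go
package main

import (
	"fmt"
	"sync"
)

// Status is ordered so that a larger value means a worse state.
type Status int

const (
	StatusOK Status = iota
	StatusRecoverableError
	StatusPermanentError
)

func (s Status) String() string {
	switch s {
	case StatusOK:
		return "StatusOK"
	case StatusRecoverableError:
		return "StatusRecoverableError"
	default:
		return "StatusPermanentError"
	}
}

// Host receives the aggregated status; a hypothetical stand-in.
type Host interface {
	Report(s Status, msg string)
}

type state struct {
	status Status
	msg    string
}

// GroupReporter tracks one state per runner and reports the worst one.
type GroupReporter struct {
	mu     sync.Mutex
	host   Host
	states map[string]state
}

func NewGroupReporter(h Host) *GroupReporter {
	return &GroupReporter{host: h, states: make(map[string]state)}
}

// Update records a runner's status and forwards the worst status in the group.
// When several runners tie for worst, which message is forwarded is arbitrary.
func (g *GroupReporter) Update(runnerID string, s Status, msg string) {
	g.mu.Lock()
	defer g.mu.Unlock()
	g.states[runnerID] = state{status: s, msg: msg}
	worst := state{status: StatusOK}
	for _, st := range g.states {
		if st.status > worst.status {
			worst = st
		}
	}
	g.host.Report(worst.status, worst.msg)
}

// lastHost remembers the most recent report.
type lastHost struct {
	status Status
	msg    string
}

func (h *lastHost) Report(s Status, msg string) { h.status, h.msg = s, msg }

func main() {
	h := &lastHost{}
	g := NewGroupReporter(h)
	g.Update("input-1", StatusOK, "")
	g.Update("input-2", StatusRecoverableError, "error while running harvester")
	fmt.Printf("%s [%s]\n", h.status, h.msg)
}
```

With this scheme the receiver returns to `StatusOK` only once every runner in the group has recovered.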

Testing

  1. Check out this PR locally
  2. Go to elastic-agent and follow this guide to test local beats changes
  3. Package the agent with mage package
  4. Follow the steps in "Beat receivers do not correctly report status back to the Elastic Agent" (elastic-agent#8210) to install the agent and verify the status

Closes elastic/elastic-agent#8210

@botelastic botelastic Bot added the needs_team label (Indicates that the issue/PR needs a Team:* label) Jun 12, 2025
@VihasMakwana VihasMakwana self-assigned this Jun 12, 2025
@github-actions
Contributor

🤖 GitHub comments

Just comment with:

  • run docs-build : Re-trigger the docs validation. (use unformatted text in the comment!)

@VihasMakwana VihasMakwana added the Team:Elastic-Agent-Data-Plane label (Label for the Agent Data Plane team) Jun 12, 2025
@botelastic botelastic Bot removed the needs_team label (Indicates that the issue/PR needs a Team:* label) Jun 12, 2025
@mergify
Contributor

mergify Bot commented Jun 12, 2025

This pull request does not have a backport label.
If this is a bug or security fix, could you label this PR @VihasMakwana? 🙏.
For such, you'll need to label your PR with:

  • The upcoming major version of the Elastic Stack
  • The upcoming minor version of the Elastic Stack (if you're not pushing a breaking change)

To fixup this pull request, you need to add the backport labels for the needed
branches, such as:

  • backport-8./d is the label to automatically backport to the 8./d branch. /d is the digit
  • backport-active-all is the label that automatically backports to all active branches.
  • backport-active-8 is the label that automatically backports to all active minor branches for the 8 major.
  • backport-active-9 is the label that automatically backports to all active minor branches for the 9 major.

@VihasMakwana VihasMakwana force-pushed the wrap-runner-factory branch from f6e8835 to a296731 Compare June 12, 2025 12:38
@VihasMakwana VihasMakwana force-pushed the wrap-runner-factory branch from a296731 to f9b2f46 Compare June 12, 2025 12:39
@mauri870
Member

Quite an elegant solution for this problem!

@VihasMakwana VihasMakwana changed the title [WIP][fbreceiver] - Add status reporting [WIP][beatreceiver] - Add status reporting Jun 13, 2025
@VihasMakwana VihasMakwana force-pushed the wrap-runner-factory branch from bae3a80 to 344bbce Compare June 16, 2025 18:41
@VihasMakwana VihasMakwana marked this pull request as ready for review June 17, 2025 05:32
@VihasMakwana VihasMakwana requested review from a team as code owners June 17, 2025 05:32
@VihasMakwana VihasMakwana requested review from faec and leehinman June 17, 2025 05:32
@elasticmachine
Contributor

Pinging @elastic/elastic-agent-data-plane (Team:Elastic-Agent-Data-Plane)

@VihasMakwana VihasMakwana requested review from khushijain21 and mauri870 and removed request for faec June 17, 2025 05:40
@VihasMakwana VihasMakwana changed the title [WIP][beatreceiver] - Add status reporting [beatreceiver] - Add status reporting Jun 17, 2025
@mergify
Contributor

mergify Bot commented Jun 17, 2025

This pull request is now in conflicts. Could you fix it? 🙏
To fix up this pull request, you can check it out locally. See documentation: https://help.github.com/articles/checking-out-pull-requests-locally/

git fetch upstream
git checkout -b wrap-runner-factory upstream/wrap-runner-factory
git merge upstream/main
git push upstream wrap-runner-factory

@VihasMakwana
Contributor Author

VihasMakwana commented Jun 20, 2025

@mauri870 @khushijain21 I've added new test cases and changed the benchmark module for testing. We can now make the benchmark module return an error on demand to test status reporting.
Please take a look!

@VihasMakwana VihasMakwana force-pushed the wrap-runner-factory branch from df815c3 to c400227 Compare June 23, 2025 09:10
Member

@cmacknz cmacknz left a comment

This LGTM, my only remaining comment is that seeing the FactoryWrapper methods outside of the context of this PR makes it quite hard to figure out what they do.

Could we rename them to have OtelStatus (or Otel or similar) in the name so we know what they are injecting when we read them?

Alternatively, can you add comments to all of them if you want to leave the names alone?

Comment thread filebeat/beater/filebeat.go Outdated
Comment thread filebeat/beater/filebeat.go Outdated
Comment thread metricbeat/beater/metricbeat.go Outdated
Comment thread metricbeat/beater/metricbeat.go Outdated
@VihasMakwana VihasMakwana added the backport-9.0 label (Automated backport to the 9.0 branch) Jun 24, 2025
@VihasMakwana VihasMakwana requested a review from cmacknz June 24, 2025 13:05
@VihasMakwana VihasMakwana merged commit d71266c into main Jun 26, 2025
199 of 202 checks passed
@VihasMakwana VihasMakwana deleted the wrap-runner-factory branch June 26, 2025 08:30
@VihasMakwana VihasMakwana added the backport-8.19 label (Automated backport to the 8.19 branch) Jun 26, 2025
mergify Bot pushed a commit that referenced this pull request Jun 26, 2025
* initial commit

* todo

* implment reporter for mbreceiver

* notice and lint

* mbreceiver

* lint

* comments

* notice

* errors

* optimization

* add test case suite

* fix benchmark and tests

* rename to otel specific

* test log

(cherry picked from commit d71266c)

# Conflicts:
#	NOTICE.txt
#	filebeat/beater/filebeat.go
#	go.mod
#	metricbeat/beater/metricbeat.go
#	metricbeat/mb/module/factory.go
#	x-pack/filebeat/fbreceiver/receiver_test.go
#	x-pack/filebeat/input/benchmark/input.go
mergify Bot pushed a commit that referenced this pull request Jun 26, 2025
(cherry picked from commit d71266c)
VihasMakwana added a commit that referenced this pull request Jul 2, 2025
VihasMakwana added a commit that referenced this pull request Jul 2, 2025
VihasMakwana added a commit that referenced this pull request Jul 4, 2025
* [beatreceiver] - Add status reporting (#44782)

(cherry picked from commit d71266c)

# Conflicts:
#	NOTICE.txt
#	filebeat/beater/filebeat.go
#	go.mod
#	metricbeat/beater/metricbeat.go
#	metricbeat/mb/module/factory.go
#	x-pack/filebeat/fbreceiver/receiver_test.go
#	x-pack/filebeat/input/benchmark/input.go

* [beatreceiver] - Add status reporting (#44782)

* use ct

---------

Co-authored-by: Vihas Makwana <121151420+VihasMakwana@users.noreply.github.com>
Co-authored-by: Vihas Makwana <vihas.makwana@elastic.co>
VihasMakwana added a commit that referenced this pull request Jul 4, 2025
* [beatreceiver] - Add status reporting (#44782)

(cherry picked from commit d71266c)

* use ct

---------

Co-authored-by: Vihas Makwana <121151420+VihasMakwana@users.noreply.github.com>
Co-authored-by: Vihas Makwana <vihas.makwana@elastic.co>
Labels

  • backport-8.19: Automated backport to the 8.19 branch
  • backport-9.0: Automated backport to the 9.0 branch
  • Team:Elastic-Agent-Data-Plane: Label for the Agent Data Plane team

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Beat receivers do not correctly report status back to the Elastic Agent

6 participants