out_forward errors on shutdown with require_ack_response enabled #3962

Closed
fujimotos opened this issue Nov 18, 2022 · 1 comment · Fixed by #4030
Labels
bug Something isn't working

Comments

@fujimotos
Member

Describe the bug

When we stop the td-agent service, out_forward sometimes throws
an unhandled exception.

The exact exception varies, but it always occurs in ack_handler.rb.
Here are a few examples:

# case 1
unexpected error while receiving ack error_class=IOError
error="closed stream"

# case 2
unexpected error while receiving ack message error_class=IOError
error="stream closed in another thread"

# case 3
unexpected error while receiving ack error_class=NoMethodError
error="undefined method `disable!' for nil:NilClass"

This suggests that out_forward's shutdown sequence is racy, i.e.
multiple threads can process the same ack message concurrently.
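
To illustrate how this kind of race can surface as an IOError, here is a minimal Ruby sketch. It is not Fluentd's ack_handler.rb; the pipe, the `waiter` thread and the `errors` array are hypothetical stand-ins for a connection awaiting an ack and a shutdown path that closes it from another thread. On current CRuby versions the first error is typically reported as "stream closed in another thread" and the second as "closed stream", which resembles cases 2 and 1 above.

```ruby
# Illustrative sketch only; this is not Fluentd's ack_handler.rb.
reader, writer = IO.pipe

errors = []

# Stand-in for a thread waiting on an ack that never arrives.
waiter = Thread.new do
  begin
    reader.read(1)          # blocks because nothing is written to the pipe
  rescue IOError => e
    errors << e             # raised when another thread closes the stream
  end
end

# Wait until the thread is blocked (or about to block) in the read.
sleep 0.01 until waiter.status == "sleep"

# Stand-in for the shutdown path closing the connection.
reader.close
waiter.join

# Any later read on the same object hits the already-closed stream.
begin
  reader.read(1)
rescue IOError => e
  errors << e
end

errors.each { |e| puts "#{e.class}: #{e.message}" }
writer.close
```

Whether out_forward hits exactly this sequence is an assumption; the sketch only shows that closing a stream while another thread still uses it is enough to produce both kinds of IOError.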

To Reproduce

  • Run out_forward with require_ack_response enabled.
  • Stop Fluentd while it is actively transmitting records.

Expected behavior

No error.

Your Environment

- Fluentd version: v1.15.2
- TD Agent version: -
- Operating system: -
- Kernel version: -

Your Configuration

<match **>
  @type forward
  require_ack_response true
  ...
</match>

Your Error Log

See above.

Additional context

No response

@fujimotos fujimotos added the bug Something isn't working label Nov 18, 2022
@fujimotos fujimotos self-assigned this Nov 18, 2022
@fujimotos fujimotos moved this to To-Do in Fluentd Kanban Nov 18, 2022
@fujimotos fujimotos moved this from To-Do to Work-In-Progress in Fluentd Kanban Dec 27, 2022
@daipom daipom self-assigned this Jan 27, 2023
@daipom
Contributor

daipom commented Jan 27, 2023

Here is how to reproduce this.

  • Launch 2 Fluentd instances (Aggregator and Forwarder)

Config of Aggregator

<source>
  @type forward
</source>

<filter test.**>
  @type record_transformer
  enable_ruby
  <record>
    hoge ${ sleep(1); "hoge" } # Insert `sleep` to stack the ack check
  </record>
</filter>

<match test.**>
  @type stdout
</match>

Config of Forwarder

<source>
  @type sample
  tag test
</source>

<match test.**>
  @type forward
  require_ack_response true
  <buffer tag,time,file>
    @type file
    path /test/fluentd/forwarder/buffer
    timekey 24h
    flush_mode immediate
    flush_at_shutdown true
  </buffer>
  <server>
    host localhost
    port 24224
  </server>
</match>

  • Wait a while until some chunk files accumulate in the Forwarder's buffer.
  • Then stop the Forwarder and check the Fluentd log.

Reproducing this is a little tricky.
Accumulating many chunks does not seem to make it easier to reproduce.
Increasing the sleep time seems to make it harder.
It may depend on the machine specs, but stopping Fluentd when 2 chunk files have accumulated reproduces it easily.

@github-project-automation github-project-automation bot moved this from Work-In-Progress to Done in Fluentd Kanban Feb 10, 2023