Apply close_timeout also when output is blocked #3511
In a call with Steffen we came to the following conclusions:
UPDATE: Problem should be fixed. This causes a race condition on Windows. Not sure yet where exactly it is happening:
filebeat/harvester/harvester.go
@@ -39,6 +40,8 @@ type Harvester struct {
     fileReader *LogFile
     encodingFactory encoding.EncodingFactory
     encoding encoding.Encoding
+    once sync.Once
It should be clear that `once` is used to protect `prospectorDone` from multiple goroutines potentially doing `close(prospectorDone)`. Uhm... is it used for `prospectorDone`? Just seeing that it protects the `done` channel. Sync tools and the items they protect should always stand together.
Yes, `once` ensures that `done` is closed only once. Moved it one line down.
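The pattern discussed in this thread can be sketched as follows. This is a minimal illustration, not the code from the PR: the `worker` type, its `stopOnce` field, and `newWorker` are hypothetical names chosen for the example. It shows a `sync.Once` declared directly next to the channel it guards, so that concurrent calls to `stop` close the channel exactly once.

```go
package main

import (
	"fmt"
	"sync"
)

// worker is a hypothetical stand-in for the harvester; only the
// done channel and its guard are modeled here.
type worker struct {
	done     chan struct{}
	stopOnce sync.Once // guards close(done); kept next to the channel it protects
}

func newWorker() *worker {
	return &worker{done: make(chan struct{})}
}

// stop may be called from several goroutines; close(done) runs exactly once.
func (w *worker) stop() {
	w.stopOnce.Do(func() { close(w.done) })
}

func main() {
	w := newWorker()
	var wg sync.WaitGroup
	for i := 0; i < 4; i++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			w.stop() // a second plain close(w.done) would panic; Once prevents that
		}()
	}
	wg.Wait()
	<-w.done // returns immediately because the channel is closed
	fmt.Println("done closed once")
}
```

Placing the guard on the line directly below the channel makes the relationship visible without a comment, which is the point raised in the review.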
     if h.config.CloseTimeout > 0 {
         closeTimeout = time.After(h.config.CloseTimeout)
     }

     select {
     // Applies when timeout is reached
     case <-closeTimeout:
-        logp.Info("Closing harvester because close_timeout was reached: %s", h.state.Source)
+        logp.Info("Closing harvester because close_timeout was reached.")
I see how access to `h.state.Source` can race. Consider printing the last state in a defer statement when `Harvest` returns (but after the workers have finished).
There is already a log message in the closing part that includes `state.Source`, so I don't think it is necessary here.
Looks like `prospectorDone` isn't closed anywhere.
@andrewkroh prospectorDone is the |
@andrewkroh @urso new version pushed.
Currently `close_timeout` does not apply when the output is blocked. This PR changes the behavior of `close_timeout` so that it also closes the file handler when the output is blocked.
It is important to note that this closes the file handler but NOT the harvester. Closing the harvester requires a state update that sets `state.Finished=true`. If the harvester were closed without that update, processing would not continue when the output becomes available again.
Previously the internal state of a harvester was updated when the event was created. If the state update went through but the event was not sent, an event could go missing. This is now prevented by overwriting the internal state only when the event was successfully sent.
The done channels of the prospector and the harvester are renamed to make it more obvious which one belongs to what: h.done -> h.prospectorDone, h.harvestDone -> h.done. As the harvester channel is closed by the `stop` method in all cases, `h.done` is sufficient in most places.
This PR does not solve the problem related to reloading and stopping a harvester mentioned in elastic#3511 (comment). This will be done in a follow-up PR.
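The "overwrite the internal state only when the event was successfully sent" rule from the description can be illustrated with a short sketch. This is an assumption-laden simplification, not the PR's implementation: the `event` and `harvester` types, the `offset` field, and `sendEvent` are hypothetical names, and a plain channel stands in for the output.

```go
package main

import "fmt"

// event and harvester are hypothetical, simplified types for illustration.
type event struct {
	offset int64
}

type harvester struct {
	offset int64         // last offset known to have been sent
	done   chan struct{} // closed when the harvester is stopped
}

// sendEvent tries to hand ev to out and returns false if done is closed
// while the send is still blocked. The internal offset is overwritten
// only after the send succeeded, so an unsent event never advances it.
func (h *harvester) sendEvent(out chan<- event, ev event) bool {
	select {
	case <-h.done:
		return false
	case out <- ev:
		h.offset = ev.offset
		return true
	}
}

func main() {
	h := &harvester{done: make(chan struct{})}

	out := make(chan event, 1) // has room: models an available output
	fmt.Println(h.sendEvent(out, event{offset: 42}), h.offset) // true 42

	close(h.done)
	blocked := make(chan event) // no receiver: models a blocked output
	fmt.Println(h.sendEvent(blocked, event{offset: 99}), h.offset) // false 42
}
```

Because the offset stays at the last sent event, a harvester that resumes later would pick up from the unsent event instead of skipping it.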