Filebeat on Windows never recovers from a timeout or EOF #905
Would it be possible for you to enable the debug log level to get some more information on what goes wrong during publishing?
Please update your Filebeat. This might be related to #872, which was fixed after your build. Why not use Filebeat 1.1?
Because I've had this problem since pre-1.0, I've been trying nightlies ever since, on the assumption that 1.1 is newer than 1.0 and so would have more fixes in it. I'll try the latest nightly.
@Cylindric Thanks
I'm seeing the same here on Windows 2012R2 / Filebeat 1.1.0. Here's a log snippet (INFO level):

2016-02-04T12:21:37Z INFO Events sent: 333

The back-off is working... until suddenly it isn't. I'm restarting the service at 12:38 and all is good again.
We found a bug in error recovery that makes the logstash output hang in an infinite loop if too many errors occurred plus some bad timing. We're preparing a 1.1.1 release containing a fix for this particular error. Maybe you want to give it a try: https://download.elastic.co/beats/filebeat/filebeat-1.1.1-SNAPSHOT-darwin.tgz The PR for the master branch contains quite a few more refactorings/changes and will hopefully be merged into master soonish.
Thanks, I'm giving that a go on both my Windows IIS servers and my Debian Varnish servers, both of which seem to have similar issues. Then I'll just need to work out what my bottleneck is on the server side of things, so my logstash doesn't keep stalling :(
Seems to be a duplicate of #878. Also see my comment there about congestion_threshold in the beats plugin. If indexing occasionally requires >5s, raise congestion_threshold. Also consider increasing the timeout in the logstash output section of filebeat.yml. Not erroring too often may have positive effects on throughput.
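For reference, the two settings mentioned above live on opposite ends of the pipeline. A minimal sketch (values are illustrative, not recommendations; option names match the logstash-input-beats plugin and the Filebeat 1.x logstash output, but defaults vary by version):

```
# logstash pipeline config: beats input
input {
  beats {
    port => 5044
    # seconds the input waits on a blocked pipeline before
    # closing connections; raise it if indexing can take >5s
    congestion_threshold => 30
  }
}
```

```yaml
# filebeat.yml: logstash output section
output:
  logstash:
    hosts: ["logstash.example.com:5044"]   # hypothetical host
    timeout: 60   # network read/write timeout in seconds
```

The idea is that Filebeat's `timeout` should comfortably exceed the stalls logstash is allowed to tolerate, so a slow pipeline produces back-off rather than repeated connection errors.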
Hmm, it does look very similar, although I don't remember the window size, and my logs have now rotated out. I'll keep an eye open. The congestion_threshold tip looks interesting though; I'll check that out for sure.
I'm using Filebeat on a bunch of Windows web servers to ship IIS log files to logstash.
It looks like when my logstash/elastic servers get a bit backed up and stall the pipeline for a while, Filebeat on the webservers correctly fails to send some log entries, but then never tries again.
Normally, I see this in the Filebeat logs:
Eventually, the backend blocks for whatever reason, and I'll get this, followed by infinite "Run prospector" messages until I restart filebeat.
That's a particularly odd example, because there's even one last send of 10000 before it goes silent. Usually a bunch of EOF errors is the last attempt I see FB making.
My config is as follows. I added max_retries in an attempt to make FB retry forever, but that made no difference.
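(The original config did not survive this capture. As a hedged illustration only, not the reporter's actual file, a Filebeat 1.x logstash output with retries set to never give up might look like this; `max_retries` is a documented option, the host is hypothetical:)

```yaml
# filebeat.yml: illustrative logstash output fragment
output:
  logstash:
    hosts: ["logstash.example.com:5044"]   # hypothetical host
    max_retries: -1   # -1 = retry events indefinitely instead of dropping them
```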
I'm currently trying the nightly build filebeat-1.2.0-nightly160126172227-windows.