Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

in_tail: Fluentd v1.12.3 fails to handle file rotations on Windows #3365

Closed
fujimotos opened this issue May 11, 2021 · 10 comments
Closed

in_tail: Fluentd v1.12.3 fails to handle file rotations on Windows #3365

fujimotos opened this issue May 11, 2021 · 10 comments
Assignees
Labels
bug Something isn't working windows

Comments

@fujimotos
Copy link
Member

fujimotos commented May 11, 2021

Describe the bug

After a few file rotations, in_tail stops tracking log files.

To Reproduce

  1. Run Fluentd with the attached configuration.

  2. Rotate the file twice using cmd.exe

    % cd /opt/td-agent/
    % echo aiueo >> test.log
    % del test.log
    % echo aiueo >> test.log
    % del test.log
  3. Now Fluentd just gets stuck, keeping test.log in DeletePending status forever.

Expected behavior

File rotation should work properly.

Your Environment

  • Windows Server 2016
  • td-agent v4.1.1
  • Fluentd v1.12.3

Your Configuration

<source>
@type tail
 path /opt/td-agent/test.log
 tag test.log
 <parse>
   @type none
 </parse>
</source>

<match test**>
  @type stdout
</match>

Additional context

@fujimotos fujimotos added bug Something isn't working windows labels May 11, 2021
@fujimotos
Copy link
Member Author

I suspect that this issue is the root cause of #3340.

@ashie
Copy link
Member

ashie commented May 19, 2021

@cosmo0920 Could you take look this?
Feel free to reassign me if you can't.

@cosmo0920
Copy link
Contributor

Sure. I'll tackle on this in the next week. 👀

@cosmo0920
Copy link
Contributor

I'd confirmed that the current master (b6ba1c2) can be stucked with the attached configuration.
And I found that taking over 5 seconds to adding lines into rotated file can be reproduce this issue:

% del test.log
del Adding a line after long time (over 5 seconds)
% echo aiueo >> test.log

@cosmo0920
Copy link
Contributor

cosmo0920 commented May 24, 2021

I'd confirmed that this issue is also happened on Linux. Not for Windows specific.

$ echo "aiueo" >> test.log 
$ rm test.log && sleep 60 && echo "aiueo" >> test.log 

can be reproducible this issue on Linux. And also this issue does not happen with Fluentd v1.11.5.

@AmeerAssi
Copy link

thanks @cosmo0920.
@fujimotos mentioned that the root cause of this issue: #3340 might be the issue here.

we are using Fluentd v.1.11.2, do you know if its reproducing with this version?

@cosmo0920
Copy link
Contributor

cosmo0920 commented May 25, 2021

we are using Fluentd v.1.11.2, do you know if its reproducing with this version?

That version is not reproducible this issue. This issue is only for Fluentd v1.12.x.

@cosmo0920
Copy link
Contributor

I've sent a PR to fix this issue: #3391

And I have never reproduced this issue with #3391.

@ashie
Copy link
Member

ashie commented May 26, 2021

Note for the mechanism of this bug:
Since a hash key of TargetInfo is based on whole of it's contents, it doesn't match with existing TailWatcher in @tails on refresh_watchers when the inode is changed.

def refresh_watchers
target_paths_hash = expand_paths
existence_paths_hash = existence_path
log.debug { "tailing paths: target = #{target_paths.join(",")} | existing = #{existence_paths.join(",")}" }
unwatched_hash = existence_paths_hash.reject {|key, value| target_paths_hash.key?(key)}
added_hash = target_paths_hash.reject {|key, value| existence_paths_hash.key?(key)}
stop_watchers(unwatched_hash, immediate: false, unwatched: true) unless unwatched_hash.empty?
start_watchers(added_hash) unless added_hash.empty?
end

Thus TailWatcher for the path stays seated in @tails forever, so that refresh_watcher can't create a new TailWatcher for the path.
On the other hand, since refresh_watcher is called on every 60 seconds, rotation will be handled by on_rotate (not by refresh_watcher) when a file is create before next call of refresh_watcher, so following a new file is worked correctly in this case.

def on_rotate(stat)
if @io_handler.nil?
if stat
# first time
fsize = stat.size
inode = stat.ino
last_inode = @pe.read_inode
if inode == last_inode
# rotated file has the same inode number with the last file.
# assuming following situation:
# a) file was once renamed and backed, or
# b) symlink or hardlink to the same file is recreated
# in either case of a and b, seek to the saved position
# c) file was once renamed, truncated and then backed
# in this case, consider it truncated
@pe.update(inode, 0) if fsize < @pe.read_pos
elsif last_inode != 0
# this is FilePositionEntry and fluentd once started.
# read data from the head of the rotated file.
# logs never duplicate because this file is a rotated new file.
@pe.update(inode, 0)
else
# this is MemoryPositionEntry or this is the first time fluentd started.
# seek to the end of the any files.
# logs may duplicate without this seek because it's not sure the file is
# existent file or rotated new file.
pos = @read_from_head ? 0 : fsize
@pe.update(inode, pos)
end
@io_handler = io_handler
else
@io_handler = NullIOHandler.new
end
else
watcher_needs_update = false
if stat
inode = stat.ino
if inode == @pe.read_inode # truncated
@pe.update_pos(0)
@io_handler.close
elsif !@io_handler.opened? # There is no previous file. Reuse TailWatcher
@pe.update(inode, 0)
else # file is rotated and new file found
watcher_needs_update = true
# Handle the old log file before renewing TailWatcher [fluentd#1055]
@io_handler.on_notify
end
else # file is rotated and new file not found
# Clear RotateHandler to avoid duplicated file watch in same path.
@rotate_handler = nil
watcher_needs_update = true
end
if watcher_needs_update
if @follow_inodes
# No need to update a watcher if stat is nil (file not present), because moving to inodes will create
# new watcher, and old watcher will be closed by stop_watcher in refresh_watchers method
# don't want to swap state because we need latest read offset in pos file even after rotate_wait
if stat
target_info = TargetInfo.new(@path, stat)
@update_watcher.call(target_info, @pe)
end
else
# Permit to handle if stat is nil (file not present).
# If a file is mv-ed and a new file is created during
# calling `#refresh_watchers`s, and `#refresh_watchers` won't run `#start_watchers`
# and `#stop_watchers()` for the path because `target_paths_hash`
# always contains the path.
target_info = TargetInfo.new(@path, stat ? stat.ino : nil)
@update_watcher.call(target_info, swap_state(@pe))
end
else
@log.info "detected rotation of #{@path}"
@io_handler = io_handler
end
end
end

@ashie
Copy link
Member

ashie commented May 26, 2021

Fixed via #3395

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug Something isn't working windows
Projects
None yet
Development

No branches or pull requests

4 participants