Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix race condition of out_secondary_file #4081

Merged
merged 10 commits into from
Mar 9, 2023

Conversation

daipom
Copy link
Contributor

@daipom daipom commented Mar 3, 2023

Which issue(s) this PR fixes:
Fixes #4080

What this PR does / why we need it:
For solving race condition issues of out_secondary_file.

Docs Changes:
Not needed.

Release Note:
Same as the title.

@daipom
Copy link
Contributor Author

daipom commented Mar 3, 2023

I have confirmed this solves #4080.

@daipom
Copy link
Contributor Author

daipom commented Mar 3, 2023

I have one point of concern.

When confirming this solves #4080, I noticed that a large number of lock files remain during Fluentd running because the files are not deleted after exiting the block.

def acquire_worker_lock(name)
if @fluentd_lock_dir.nil?
raise InvalidLockDirectory, "can't acquire lock because FLUENTD_LOCK_DIR isn't set"
end
lock_path = get_lock_path(name)
File.open(lock_path, "w") do |f|
f.flock(File::LOCK_EX)
yield
end
# Update access time to prevent tmpwatch from deleting a lock file.
FileUtils.touch(lock_path);
end

Is this OK?

@daipom daipom marked this pull request as draft March 6, 2023 00:58
@daipom daipom changed the title Fix race condition of out_secondary_file on multiple workers Draft: Fix race condition of out_secondary_file on multiple workers Mar 6, 2023
@daipom daipom force-pushed the fix-out-secondary-file-race-conditiion branch from fe65b4f to c95b52a Compare March 6, 2023 11:22
@daipom
Copy link
Contributor Author

daipom commented Mar 6, 2023

Test for c95b52a.

Use the same way (append false) and the same environment as

Result:

workers flush_thread_count percentage (1st) percentage (2nd)
1 16 0.0% (0/111) 0.0% (0/42)
3 1 0.0% (0/27) 0.0% (0/27)
3 8 0.0% (0/267) 0.0% (0/29)

@daipom daipom changed the title Draft: Fix race condition of out_secondary_file on multiple workers Draft: Fix race condition of out_secondary_file Mar 6, 2023
@daipom daipom force-pushed the fix-out-secondary-file-race-conditiion branch from c95b52a to 6acf958 Compare March 6, 2023 11:45
@daipom daipom changed the title Draft: Fix race condition of out_secondary_file Fix race condition of out_secondary_file Mar 6, 2023
@daipom daipom marked this pull request as ready for review March 6, 2023 12:56
@daipom daipom requested a review from ashie March 6, 2023 12:56
lib/fluent/plugin/out_secondary_file.rb Outdated Show resolved Hide resolved
lib/fluent/plugin/out_secondary_file.rb Outdated Show resolved Hide resolved
lib/fluent/plugin/out_secondary_file.rb Outdated Show resolved Hide resolved
lib/fluent/plugin/out_secondary_file.rb Outdated Show resolved Hide resolved
test/command/test_cat.rb Outdated Show resolved Hide resolved
daipom and others added 2 commits March 7, 2023 11:44
Signed-off-by: Daijiro Fukuda <[email protected]>

Co-authored-by: Takuro Ashie <[email protected]>
so that other plugins can use this feature.

Signed-off-by: Daijiro Fukuda <[email protected]>
@daipom
Copy link
Contributor Author

daipom commented Mar 7, 2023

Test for ce32ac5.

Use the same way (append false) and the same environment as

Result:

workers flush_thread_count percentage
3 8 0.0% (0/21)

I plan to do more tests at the end of this PR.

@daipom
Copy link
Contributor Author

daipom commented Mar 7, 2023

I will add some unit tests for the lock feature, although it is hard to test race condition strictly.

@daipom
Copy link
Contributor Author

daipom commented Mar 7, 2023

@ashie Could you please check if the direction of this fix is OK?

lib/fluent/plugin/output.rb Outdated Show resolved Hide resolved
lib/fluent/plugin/output.rb Outdated Show resolved Hide resolved
test/command/test_cat.rb Outdated Show resolved Hide resolved
lib/fluent/plugin/output.rb Outdated Show resolved Hide resolved
daipom and others added 4 commits March 7, 2023 18:54
Signed-off-by: Daijiro Fukuda <[email protected]>

Co-authored-by: Takuro Ashie <[email protected]>
Signed-off-by: Daijiro Fukuda <[email protected]>
to unify the existing method: `acquire_worker_lock`.

Signed-off-by: Daijiro Fukuda <[email protected]>
Co-authored-by: Takuro Ashie <[email protected]>
Signed-off-by: Daijiro Fukuda <[email protected]>
@daipom daipom force-pushed the fix-out-secondary-file-race-conditiion branch from 488896b to edcbd4b Compare March 7, 2023 10:06
@daipom
Copy link
Contributor Author

daipom commented Mar 7, 2023

I will test this issue again, and add some tests for this new lock feature.

@daipom
Copy link
Contributor Author

daipom commented Mar 7, 2023

Test for edcbd4b.

Use the same way (append false) and the same environment as

Result:

workers flush_thread_count percentage (1st) percentage (2nd)
1 16 0.0% (0/24) 0.0% (0/39)
3 1 0.0% (0/27) 0.0% (0/21)
3 8 0.0% (0/36) 0.0% (0/33)

Signed-off-by: Daijiro Fukuda <[email protected]>
@daipom daipom force-pushed the fix-out-secondary-file-race-conditiion branch from f55838a to f23e7e6 Compare March 8, 2023 00:30
@daipom
Copy link
Contributor Author

daipom commented Mar 8, 2023

I will test this issue again, and add some tests for this new lock feature.

Done.

lib/fluent/plugin/output.rb Outdated Show resolved Hide resolved
Signed-off-by: Daijiro Fukuda <[email protected]>
@daipom
Copy link
Contributor Author

daipom commented Mar 8, 2023

By the way, might Squash merge be better when merging this?

test/plugin/test_output.rb Outdated Show resolved Hide resolved
Signed-off-by: Daijiro Fukuda <[email protected]>

Co-authored-by: Takuro Ashie <[email protected]>
@ashie ashie merged commit 59399c1 into fluent:master Mar 9, 2023
@ashie
Copy link
Member

ashie commented Mar 9, 2023

Thanks!

@daipom daipom deleted the fix-out-secondary-file-race-conditiion branch March 9, 2023 01:23
@daipom
Copy link
Contributor Author

daipom commented Mar 9, 2023

Thanks for your review! @ashie @kou !

@ashie ashie added this to the v1.16.0 milestone Mar 9, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

SecondaryFileOutput has race condition issues in multiple threads or workers
3 participants