This repository has been archived by the owner on Jan 23, 2023. It is now read-only.
-
Notifications
You must be signed in to change notification settings - Fork 4.9k
Improve FileStream WriteAsync buffering and performance #2929
Merged
stephentoub
merged 3 commits into
dotnet:master
from
stephentoub:filestream_async_buffering
Aug 26, 2015
Merged
Improve FileStream WriteAsync buffering and performance #2929
stephentoub
merged 3 commits into
dotnet:master
from
stephentoub:filestream_async_buffering
Aug 26, 2015
Conversation
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
stephentoub
force-pushed
the
filestream_async_buffering
branch
from
August 22, 2015 14:22
7669678
to
66912c2
Compare
return WriteInternalCoreAsync(array, offset, numBytes, cancellationToken); | ||
} | ||
// Simple/common case: | ||
// - The write is smaller than our buffer, such that it's worth considering buffering it.t |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: typo - extra 't' after period
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Will fix
A few minor comments about comments, but otherwise LGTM. |
…cing new cases: 1) Internal buffer overflows, but write can fit in next buffer - Fill the existing buffer, copy the remaining data into a new buffer, and flush the original to disk 2) Internal buffer is non-empty, overflows, and the remaining data won't fit in a second buffer - Chain the flush operation to a second write operation which writes the entire incoming block directly to disk. Additionally, adds a new unit test to ensure the different buffering cases are being covered.
- We now properly handle the case where multiple overlapped writes are issued while a flush is in progress (offsets were being handled incorrectly). - We now correctly optimize buffering for the common case where a write completely fills the buffer, such that we now do a single write for that buffer instead of doing a write for the almost full buffer and one for the additional data. - We now have some additional optimizations to avoid creating WhenAll tasks if it's not necessary. - The flush and write tasks are not appropriately coordinated in all cases.
- Mainly, I added several tests (with lots of theory-based parameterized inputs) to stress WriteAsync: many concurrent writes, long chains of writes, etc., for various size inputs and buffers, async vs not, etc. - There were multiple variations on Flush tests based on whether to use Flush(), Flush(false), or False(true). I combined these all into single theories. - I then added some more Flush tests for some missing cases highlighted by code coverage, e.g. flushing reads. - I then duplicated these tests for FlushAsync, and added some additional tests specific to FlushAsync, e.g. cancellation. - I added a few tests for pipes.
stephentoub
force-pushed
the
filestream_async_buffering
branch
from
August 24, 2015 21:01
e8bb742
to
bb34d5a
Compare
@ericstj, could you please review as well? |
stephentoub
added a commit
that referenced
this pull request
Aug 26, 2015
Improve FileStream WriteAsync buffering and performance
awesome 👍 |
picenka21
pushed a commit
to picenka21/runtime
that referenced
this pull request
Feb 18, 2022
…nc_buffering Improve FileStream WriteAsync buffering and performance Commit migrated from dotnet/corefx@6cffbc2
Sign up for free
to subscribe to this conversation on GitHub.
Already have an account?
Sign in.
Add this suggestion to a batch that can be applied as a single commit.
This suggestion is invalid because no changes were made to the code.
Suggestions cannot be applied while the pull request is closed.
Suggestions cannot be applied while viewing a subset of changes.
Only one suggestion per line can be applied in a batch.
Add this suggestion to a batch that can be applied as a single commit.
Applying suggestions on deleted lines is not supported.
You must change the existing code in this line in order to create a valid suggestion.
Outdated suggestions cannot be applied.
This suggestion has been applied or marked resolved.
Suggestions cannot be applied from pending reviews.
Suggestions cannot be applied on multi-line comments.
Suggestions cannot be applied while the pull request is queued to merge.
Suggestion cannot be applied right now. Please check back later.
Replaces PR #2802
Fixes #1531
This set of changes improves how FileStream WriteAsync handles buffering, with some significant performance benefits. Two primary issues addressed:
The changes in this PR:
I tested this out with a variety of small benchmarks that involve writing asynchronously, e.g.
with various
arr
sizes.As an example, on my laptop:
arr
length of 0x100,such a test took ~3.3 seconds and incurred ~27 gen0 GCs and ~2 gen1 GCs.In other words, for this test, this PR improved perf by ~10x. The biggest benefit comes when writing out smaller than the internal buffer size. When writing out >= the internal buffer size, throughput stayed about the same as before the change, but GCs dropped by ~6x.
The change should incur few behavioral differences. The primary observable difference is that if a WriteAsync occurs and we need to flush the buffer, we're now flushing asynchronously rather than synchronously, but because we need to ensure that Position is correct when WriteAsync returns to its synchronous caller, the subsequent asynchronous write may execute concurrently with the flush. That means that if the flush fails asynchronously, whereas before we wouldn't have issued the subsequent write and updated the position, now we will have. I believe this is a worthwhile change for the benefits.
(There are some subsequent improvements we could look at as well. For example, now that we're never allocating a secondary buffer, it becomes easier to consider using a PreAllocatedOverlapped; we would need to ensure that we only use it when there aren't concurrent writes, but that's not the common use case.)
Thanks to @tymlipari for kicking this off.
cc: @ericstj, @FiveTimesTheFun, @ianhays, @KrzysztofCwalina