
Open FileStream with FileOptions.Asynchronous when benchmarking async file I/O #1

Closed
feO2x opened this issue Mar 24, 2019 · 6 comments

Comments

@feO2x

feO2x commented Mar 24, 2019

Thanks for your efforts to benchmark the different methods of accessing the file system.

I think I found an issue with your async file stream benchmarks: when you open the file stream, you call File.Open(File, FileMode.Open, FileAccess.Read). What you should do instead is new FileStream(File, FileMode.Open, FileAccess.Read, FileShare.None, 4096, FileOptions.Asynchronous). The important part is the last parameter: on Windows, it opens the file handle for overlapped (asynchronous) I/O, so reads can actually be issued asynchronously to the disk controller. If it is not set, the "asynchronous" operation is actually performed on another CLR thread-pool thread, which blocks while the disk controller retrieves the data. I suspect this is why your async file benchmark methods are slower than the synchronous ones; I would actually expect them to be about as fast as the synchronous ones.
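For illustration, a minimal sketch of that change (filePath is a placeholder for whatever path field the benchmark actually uses):

```csharp
using System.IO;

static class AsyncStreamFactory
{
    // filePath is a stand-in for the benchmark's file path.
    public static FileStream OpenForAsyncRead(string filePath)
    {
        // Before: File.Open(filePath, FileMode.Open, FileAccess.Read) does not request
        // asynchronous I/O, so awaited reads end up blocking a thread-pool thread.

        // After: FileOptions.Asynchronous opens the handle for overlapped I/O on Windows.
        return new FileStream(
            filePath,
            FileMode.Open,
            FileAccess.Read,
            FileShare.None,
            4096,                      // FileStream's default internal buffer size
            FileOptions.Asynchronous); // or the (..., int bufferSize, bool useAsync) overload
    }
}
```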

Unfortunately, asynchronous file access is turned off by default; you have to explicitly enable it by setting FileOptions.Asynchronous (or useAsync: true) when creating a file stream. Furthermore, I don't know how this behaves on operating systems other than Windows.

By the way, the 4096 is the file stream's internal buffer size; it is the default value used by .NET. You could also vary it and measure how it affects the file load speed. I suspect that larger buffers will lead to lower loading times, but remember that arrays of 85,000 bytes or more are placed on the Large Object Heap (which one should generally avoid).
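A quick way to see the LOH threshold in action (a rough sketch; the exact cut-off also accounts for the array's object header):

```csharp
using System;

class LohThresholdDemo
{
    static void Main()
    {
        var small = new byte[80_000];  // stays on the small object heap
        var large = new byte[100_000]; // crosses the ~85,000-byte LOH threshold

        // LOH objects are reported as generation 2 right after allocation.
        Console.WriteLine(GC.GetGeneration(small)); // 0
        Console.WriteLine(GC.GetGeneration(large)); // 2
    }
}
```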

@atruskie
Owner

Thank you for the feedback! I'll make the changes and update the results tonight 😄

@feO2x
Author

feO2x commented Mar 25, 2019

👍
BTW, I just had a look at the Windows implementation of FileStream in corefx, and it seems that reads of less than 64,000 bytes are always performed synchronously on NTFS.

atruskie added a commit that referenced this issue Mar 26, 2019
atruskie added a commit that referenced this issue Mar 26, 2019
@atruskie
Owner

So I've re-run the benchmark. Additionally, I've made FileOptions.Asynchronous a benchmark parameter so we can compare the effect of this change directly.
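For context, this is roughly how such a parameter can be wired up with BenchmarkDotNet (a sketch only, not the actual benchmark code from this repo; the file path and buffer size are stand-ins):

```csharp
using System.IO;
using System.Threading.Tasks;
using BenchmarkDotNet.Attributes;

public class FileReadBenchmarks
{
    private const string FilePath = "testfile.bin"; // stand-in for the real test file
    private const int BufferSize = 4096;

    // BenchmarkDotNet repeats every benchmark once per listed value.
    [Params(FileOptions.None, FileOptions.Asynchronous)]
    public FileOptions Options { get; set; }

    [Benchmark]
    public async Task<long> ReadAsync()
    {
        using (var stream = new FileStream(
            FilePath, FileMode.Open, FileAccess.Read, FileShare.Read, BufferSize, Options))
        {
            var buffer = new byte[BufferSize];
            long total = 0;
            int read;
            while ((read = await stream.ReadAsync(buffer, 0, buffer.Length)) > 0)
                total += read;
            return total;
        }
    }
}
```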

In short, specifying FileOptions.Asynchronous resulted in universally worse performance.

You can see the full report in the readme (and the changes in results in the recent commits).

I'd be open to any further feedback before I close this issue. However, it takes a decent chunk of time to interpret these results and draw conclusions, so it will take a while to run all of this again.

@feO2x
Author

feO2x commented Mar 26, 2019

Hey @atruskie,
yesterday I ran some FileStream performance tests of my own and came to the same results. Synchronous file access is faster across the board, no matter the file size (I tested 10 KB up to 10 MB) or buffer size (512 bytes up to 84,975 bytes, the largest byte array that is not allocated on the Large Object Heap). Furthermore, using FileOptions.SequentialScan did not change the results significantly. Also: if the buffer size is larger than the file to be read, then ReadAsync completes synchronously (so there is no hard 64,000-byte limit as I claimed in my previous comment).
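One way to observe synchronous completion is to inspect the returned task before awaiting it; a rough sketch (the file name and the 1 MB buffer size are placeholders):

```csharp
using System;
using System.IO;

class SyncCompletionProbe
{
    static void Main()
    {
        // Pick a file that is smaller than the buffer to reproduce the scenario above.
        using (var stream = new FileStream(
            "data.bin", FileMode.Open, FileAccess.Read, FileShare.Read,
            1 << 20 /* 1 MB buffer */, FileOptions.Asynchronous))
        {
            var buffer = new byte[1 << 20];
            var task = stream.ReadAsync(buffer, 0, buffer.Length);

            // True here means the read finished before control returned to the caller,
            // i.e. it completed synchronously despite going through the async API.
            Console.WriteLine($"Completed synchronously: {task.IsCompleted}");
            Console.WriteLine($"Bytes read: {task.Result}");
        }
    }
}
```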

I then checked the official .NET performance repo and saw that they also tested FileStream. The results are similar to what we observed (see end of this comment).

What can we do to gain more insights?

  • We could benchmark the FileStream.ReadAsync overload that takes a Memory<byte> parameter. It returns a ValueTask<int> - maybe avoiding the allocation of Task instances brings sync and async closer together? (See the sketch after this list.)
  • I found a paper by Microsoft that explains how you could disable the internal buffering of FileStreams - maybe we can benchmark this, too?
  • What's the difference between SSDs and HDDs? I only tested on the former.
  • I would create an issue in the performance repo and ask about it. What I would have expected is that async file access is about as fast as sync file access, but that the calling thread does not block while the controller processes the data. I don't know them by heart, but there should be some ETW events we could use to measure how long a thread is blocked or spin-waits while the file is read.
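Regarding the first item, a sketch of what the Memory<byte>-based overload looks like in use (path and buffer size are placeholders; requires .NET Core 2.1 or later):

```csharp
using System;
using System.IO;
using System.Threading.Tasks;

static class MemoryOverloadSketch
{
    static async Task<long> ReadWholeFileAsync(string path, int bufferSize)
    {
        using (var stream = new FileStream(
            path, FileMode.Open, FileAccess.Read, FileShare.Read, bufferSize, FileOptions.Asynchronous))
        {
            Memory<byte> buffer = new byte[bufferSize];
            long total = 0;
            int read;

            // ReadAsync(Memory<byte>) returns a ValueTask<int>, so a synchronously
            // completed read does not allocate a Task instance.
            while ((read = await stream.ReadAsync(buffer)) > 0)
                total += read;
            return total;
        }
    }
}
```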

Results of the official benchmark:
BenchmarkDotNet=v0.11.3.1003-nightly, OS=Windows 10.0.17763.379 (1809/October2018Update/Redstone5)
Intel Core i7-8750H CPU 2.20GHz (Coffee Lake), 1 CPU, 12 logical and 6 physical cores
.NET Core SDK=3.0.100-preview3-010431
[Host] : .NET Core 2.2.0 (CoreCLR 4.6.27110.04, CoreFX 4.6.27110.04), 64bit RyuJIT
Job-CYVMRS : .NET Core 2.2.0 (CoreCLR 4.6.27110.04, CoreFX 4.6.27110.04), 64bit RyuJIT

IterationTime=250.0000 ms MaxIterationCount=20 MinIterationCount=15
WarmupCount=1

| Method      | BufferSize | TotalSize | Mean       | Error      | StdDev     | Median     | Min        | Max        | Gen 0/1k Op | Gen 1/1k Op | Gen 2/1k Op | Allocated Memory/Op |
|-------------|-----------:|----------:|-----------:|-----------:|-----------:|-----------:|-----------:|-----------:|------------:|------------:|------------:|--------------------:|
| ReadByte    | 512        | 200000    | 1,379.2 us | 2.5998 us  | 2.3046 us  | 1,379.4 us | 1,373.6 us | 1,382.6 us | -           | -           | -           | 4296 B               |
| Read        | 512        | 200000    | 267.3 us   | 0.4948 us  | 0.4628 us  | 267.2 us   | 266.8 us   | 268.5 us   | -           | -           | -           | 4296 B               |
| ReadAsync   | 512        | 200000    | 1,156.2 us | 3.2913 us  | 2.9177 us  | 1,155.3 us | 1,152.8 us | 1,161.9 us | -           | -           | -           | 11739 B              |
| CopyToAsync | 512        | 200000    | 186.9 us   | 0.3705 us  | 0.2893 us  | 187.0 us   | 186.4 us   | 187.2 us   | 0.7440      | -           | -           | 5160 B               |
| WriteByte   | 512        | 200000    | 1,618.6 us | 30.2628 us | 28.3078 us | 1,625.0 us | 1,577.2 us | 1,664.7 us | -           | -           | -           | 4296 B               |
| Write       | 512        | 200000    | 639.3 us   | 23.4834 us | 27.0436 us | 648.9 us   | 602.6 us   | 700.7 us   | -           | -           | -           | 4296 B               |
| WriteAsync  | 512        | 200000    | 1,815.6 us | 39.5792 us | 42.3493 us | 1,810.7 us | 1,759.7 us | 1,925.4 us | -           | -           | -           | 4824 B               |
| ReadByte    | 200000     | 200000    | 1,383.1 us | 1.9655 us  | 1.8385 us  | 1,382.5 us | 1,380.8 us | 1,387.6 us | -           | -           | -           | 4296 B               |
| Read        | 200000     | 200000    | 124.9 us   | 0.4619 us  | 0.4320 us  | 125.0 us   | 124.1 us   | 125.8 us   | -           | -           | -           | 176 B                |
| ReadAsync   | 200000     | 200000    | 176.9 us   | 1.0991 us  | 0.9743 us  | 176.9 us   | 175.3 us   | 178.9 us   | -           | -           | -           | 704 B                |
| CopyToAsync | 200000     | 200000    | 190.9 us   | 0.9149 us  | 0.8558 us  | 190.5 us   | 190.0 us   | 192.9 us   | 0.7530      | -           | -           | 5160 B               |
| WriteByte   | 200000     | 200000    | 1,638.4 us | 7.0907 us  | 5.9210 us  | 1,636.8 us | 1,629.9 us | 1,650.4 us | -           | -           | -           | 4296 B               |
| Write       | 200000     | 200000    | 319.6 us   | 10.0090 us | 8.8727 us  | 315.5 us   | 312.9 us   | 345.2 us   | -           | -           | -           | 176 B                |
| WriteAsync  | 200000     | 200000    | 424.5 us   | 8.2199 us  | 7.6889 us  | 422.7 us   | 413.6 us   | 439.6 us   | -           | -           | -           | 696 B                |

@feO2x
Author

feO2x commented Mar 26, 2019

I've tested some more, and Memory<T> as well as ValueTask<T> bring no benefits to the table. I will now create an issue in the .NET performance repo.

@atruskie
Owner

So I was testing with Memory<T> as well and saw no benefit, so I omitted those tests from my results.

ValueTask<T> would almost certainly be the correct design choice if System.IO.Pipelines adapters were written for file I/O, since the operations almost always complete synchronously.
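To make that concrete, a rough sketch of a file read going through System.IO.Pipelines (assuming the PipeReader.Create(Stream) helper from the System.IO.Pipelines package; not code from either of our repos):

```csharp
using System.IO;
using System.IO.Pipelines;
using System.Threading.Tasks;

static class PipelineFileReadSketch
{
    static async Task<long> CountBytesAsync(string path)
    {
        using (var stream = new FileStream(
            path, FileMode.Open, FileAccess.Read, FileShare.Read, 4096, FileOptions.Asynchronous))
        {
            PipeReader reader = PipeReader.Create(stream);
            long total = 0;

            while (true)
            {
                // ReadAsync returns ValueTask<ReadResult>; data that is already buffered
                // completes synchronously without a Task allocation.
                ReadResult result = await reader.ReadAsync();
                total += result.Buffer.Length;
                reader.AdvanceTo(result.Buffer.End);

                if (result.IsCompleted)
                    break;
            }

            reader.Complete();
            return total;
        }
    }
}
```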

That paper you linked to was super interesting.

I'm also glad you got similar results in your https://github.com/feO2x/InsightsOnFiles repo 😄

And thanks for tagging me in the dotnet/performance repo - that was some interesting discussion.
