Add io uring file writer #10105
Conversation
|
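Ideally we would just use file creator. But I do wonder how to best provide `std::io::Write` capability to file creator, and maybe that would deserve a separate module, but I hope not. BTW, `tiered_storage` is not used right now, and I don't know of any plans to use it. |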
|
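Basically we need to move on two efforts:
- implement the io-uring writer returned here: https://github.com/anza-xyz/agave/blob/da0d834e41eae3c0c3e2710a61a99b72e980c5e7/fs/src/buffered_writer.rs#L15
- use the above interface in more places
Currently, writing new accounts storage is done with agave/accounts-db/src/append_vec.rs, line 573 in da0d834 (https://github.com/anza-xyz/agave/blob/da0d834e41eae3c0c3e2710a61a99b72e980c5e7/accounts-db/src/append_vec.rs#L573). After the 4.0 cut we will remove the mmap variant for storage access, and then it should be easy to refactor the APIs to simply use `std::io::Write` and write the whole file from start to end instead of doing random-position writes.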
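For illustration, here is a minimal sketch of what "write the whole file from start to end" through `std::io::Write` could look like. The function name, the 8-byte `ENTRY_ALIGNMENT`, and the entry layout are assumptions made up for this example, not the actual `append_vec.rs` format.

```rust
use std::io::{self, Write};

/// Hypothetical alignment for each stored entry (an assumption for this sketch).
const ENTRY_ALIGNMENT: usize = 8;

/// Writes every entry front to back through any `Write` implementation,
/// zero-padding each entry up to `ENTRY_ALIGNMENT`.
/// Returns the total number of bytes written, padding included.
pub fn write_entries_sequentially<W: Write>(
    writer: &mut W,
    entries: &[&[u8]],
) -> io::Result<usize> {
    const PADDING: [u8; ENTRY_ALIGNMENT] = [0; ENTRY_ALIGNMENT];
    let mut written = 0;
    for entry in entries {
        writer.write_all(entry)?;
        written += entry.len();
        // Pad so the next entry starts at an aligned offset.
        let pad = (ENTRY_ALIGNMENT - entry.len() % ENTRY_ALIGNMENT) % ENTRY_ALIGNMENT;
        writer.write_all(&PADDING[..pad])?;
        written += pad;
    }
    writer.flush()?;
    Ok(written)
}
```

Because the function only needs `Write`, the same call works with a `Vec<u8>`, a `BufWriter<File>`, or an io_uring-backed writer, and no seeking is involved.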
|
|
Yeah I was looking at file creator and it seemed like a lot of work to refactor to make it not write and close in a single go. Seemed like it would be a complete rewrite.
On Monday, 01/19/26 at 18:44, Kamil Skalski wrote:
> Ideally we would just use file creator. But do wonder how to best provide std::io::Write capability to file creator and maybe that would deserve a separate module, but I hope not.
> BTW tiered_storage is not used right now I don't know of plans to use it.
|
|
Yep, was just thinking about how to integrate it into existing APIs. I will implement write and seek traits.
On Monday, 01/19/26 at 18:49, Kamil Skalski wrote:
> Basically we need to move on two efforts:
> - implement io-uring writer returned here https://github.com/anza-xyz/agave/blob/da0d834e41eae3c0c3e2710a61a99b72e980c5e7/fs/src/buffered_writer.rs#L15
> - use above interface in more places
> Currently writing new accounts storage is done with https://github.com/anza-xyz/agave/blob/da0d834e41eae3c0c3e2710a61a99b72e980c5e7/accounts-db/src/append_vec.rs#L573
> After 4.0 cut we will remove mmap variant for storage access and then it should be easy to refactor the APIs to simply use std::io::Write and write the whole file from start to end instead of doing random position writes.
|
|
Master Pr So... The PR does something. But unfortunately that something is making snapshot writing take almost twice as long. Will look into this. |
|
It's faster on sequential writes (~15% faster). Does the compressor for snapshots seek a lot? My current implementation always flushes on seek.
Bench for sequential writes:
```rust
#[test]
#[ignore]
fn bench_io_uring_vs_bufwriter() {
    // `Write` provides `write_all`/`flush`; drop it if the module already imports it.
    use std::{
        fs::File,
        io::{BufWriter, Write},
        time::Instant,
    };
    const FILE_SIZE: usize = 5000 * 1024 * 1024; // 5000 MiB
    const CHUNK_SIZE: usize = 4096;
    const DEFAULT_BUFFER_WRITE_SIZE: usize = 2 * 1024 * 1024;
    let data: Vec<u8> = (0..CHUNK_SIZE).map(|i| i as u8).collect();

    // io_uring writer
    let path = "/home/ra/test-garbage-data.dat";
    File::create(path).unwrap();
    let buf = PageAlignedMemory::new(DEFAULT_BUFFER_WRITE_SIZE).unwrap();
    let mut writer = IoUringFileWriterBuilder::new()
        .build_with_buffer(path, buf)
        .unwrap();
    let start = Instant::now();
    for _ in 0..FILE_SIZE / CHUNK_SIZE {
        writer.write_all(&data).unwrap();
    }
    writer.flush().unwrap();
    let io_uring_time = start.elapsed();
    std::fs::remove_file(path).unwrap();

    // BufWriter with the same buffer size, for comparison
    let path = "/home/ra/test-garbage-data.dat";
    let file = File::create(path).unwrap();
    let mut writer = BufWriter::with_capacity(DEFAULT_BUFFER_WRITE_SIZE, file);
    let start = Instant::now();
    for _ in 0..FILE_SIZE / CHUNK_SIZE {
        writer.write_all(&data).unwrap();
    }
    writer.flush().unwrap();
    let bufwriter_time = start.elapsed();
    std::fs::remove_file(path).unwrap();

    println!("io_uring: {:?}", io_uring_time);
    println!("BufWriter: {:?}", bufwriter_time);
    println!(
        "io_uring is {:.2}x faster",
        bufwriter_time.as_secs_f64() / io_uring_time.as_secs_f64()
    );
}
```
|
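To make the "always flushes on seek" behavior concrete, here is a small standalone sketch. `FlushOnSeekWriter` is a hypothetical type invented for this example and uses only `std`; it is not the PR's `IoUringFileWriter`. It shows why a consumer that seeks often would defeat the buffering: every seek forces the pending bytes out first.

```rust
use std::io::{self, Cursor, Seek, SeekFrom, Write};

/// Hypothetical buffered writer that drains its buffer before every seek.
pub struct FlushOnSeekWriter<W: Write + Seek> {
    inner: W,
    buf: Vec<u8>,
    capacity: usize,
    /// Number of times buffered bytes were pushed to `inner`.
    pub drains: usize,
}

impl<W: Write + Seek> FlushOnSeekWriter<W> {
    pub fn new(inner: W, capacity: usize) -> Self {
        Self { inner, buf: Vec::with_capacity(capacity), capacity, drains: 0 }
    }

    /// Pushes any buffered bytes to the inner writer.
    fn drain(&mut self) -> io::Result<()> {
        if !self.buf.is_empty() {
            self.inner.write_all(&self.buf)?;
            self.buf.clear();
            self.drains += 1;
        }
        Ok(())
    }

    /// Drains the buffer and returns the inner writer.
    pub fn into_inner(mut self) -> io::Result<W> {
        self.drain()?;
        Ok(self.inner)
    }
}

impl<W: Write + Seek> Write for FlushOnSeekWriter<W> {
    fn write(&mut self, data: &[u8]) -> io::Result<usize> {
        if self.buf.len() + data.len() > self.capacity {
            self.drain()?;
        }
        if data.len() >= self.capacity {
            // Too large to buffer: hand it straight to the inner writer.
            return self.inner.write(data);
        }
        self.buf.extend_from_slice(data);
        Ok(data.len())
    }

    fn flush(&mut self) -> io::Result<()> {
        self.drain()?;
        self.inner.flush()
    }
}

impl<W: Write + Seek> Seek for FlushOnSeekWriter<W> {
    fn seek(&mut self, pos: SeekFrom) -> io::Result<u64> {
        // Buffered bytes belong to the old position, so they must land first.
        self.drain()?;
        self.inner.seek(pos)
    }
}

/// Usage: write a payload, seek back, and overwrite the first byte.
/// Returns the drain count right after the seek and the final bytes.
pub fn demo() -> io::Result<(usize, Vec<u8>)> {
    let mut w = FlushOnSeekWriter::new(Cursor::new(Vec::new()), 1024);
    w.write_all(b"hello world")?;
    w.seek(SeekFrom::Start(0))?; // forces the 11 buffered bytes out
    let drains_after_seek = w.drains;
    w.write_all(b"J")?;
    Ok((drains_after_seek, w.into_inner()?.into_inner()))
}
```

With this design, a workload that alternates small writes and seeks drains on every seek, so the buffer never gets a chance to fill.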
|
Changing chunk size to be smaller makes IoUringWriter much slower. |
|
BufWriter + IoUringWriter is 60% faster on |
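One plausible reading of the two observations above is that small chunks cost one call into the writer each, and a `BufWriter` in front coalesces them. The sketch below illustrates only that coalescing effect with a hypothetical `CountingSink` that stands in for an expensive writer; it does not measure the PR's writer, and the chunk and buffer sizes are arbitrary.

```rust
use std::io::{self, BufWriter, Write};

/// Hypothetical stand-in for an expensive writer: it only counts calls.
#[derive(Default)]
pub struct CountingSink {
    pub calls: usize,
    pub bytes: usize,
}

impl Write for CountingSink {
    fn write(&mut self, data: &[u8]) -> io::Result<usize> {
        self.calls += 1;
        self.bytes += data.len();
        Ok(data.len())
    }

    fn flush(&mut self) -> io::Result<()> {
        Ok(())
    }
}

/// Writes `chunks` chunks of `chunk_size` bytes, optionally through a
/// `BufWriter` of the given capacity, and returns
/// (number of calls reaching the sink, total bytes reaching the sink).
pub fn count_inner_writes(
    chunk_size: usize,
    chunks: usize,
    buffer: Option<usize>,
) -> io::Result<(usize, usize)> {
    let data = vec![0u8; chunk_size];
    match buffer {
        Some(capacity) => {
            let mut w = BufWriter::with_capacity(capacity, CountingSink::default());
            for _ in 0..chunks {
                w.write_all(&data)?;
            }
            w.flush()?;
            let sink = w.get_ref();
            Ok((sink.calls, sink.bytes))
        }
        None => {
            let mut w = CountingSink::default();
            for _ in 0..chunks {
                w.write_all(&data)?;
            }
            Ok((w.calls, w.bytes))
        }
    }
}
```

For example, `count_inner_writes(800, 1000, None)` reaches the sink once per chunk, while `count_inner_writes(800, 1000, Some(64 * 1024))` delivers the same bytes in far fewer calls.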
Force-pushed from 3900e80 to df92c43
|
Added direct I/O and benched using the following. Direct I/O is slightly faster for the test data. Also changed the default write size and to
```rust
#[test]
#[ignore]
fn bench_io_uring_vs_bufwriter() {
    // `Write` provides `write_all`/`flush`; drop it if the module already imports it.
    use std::{
        fs::File,
        io::{BufWriter, Write},
        time::Instant,
    };
    const FILE_SIZE: usize = 5000 * 1024 * 1024; // 5000 MiB
    const CHUNK_SIZE: usize = 800;
    const DEFAULT_BUFFER_WRITE_SIZE: usize = 2 * 1024 * 1024;
    let data: Vec<u8> = (0..CHUNK_SIZE).map(|i| i as u8).collect();

    // io_uring writer behind a default-capacity BufWriter
    let path = "/home/ra/test-garbage-data.dat";
    File::create(path).unwrap();
    let buf = PageAlignedMemory::new(DEFAULT_BUFFER_WRITE_SIZE).unwrap();
    let mut writer = BufWriter::new(
        IoUringFileWriterBuilder::new()
            .build_with_buffer(path, buf)
            .unwrap(),
    );
    let start = Instant::now();
    for _ in 0..FILE_SIZE / CHUNK_SIZE {
        writer.write_all(&data).unwrap();
    }
    writer.flush().unwrap();
    let io_uring_time = start.elapsed();
    std::fs::remove_file(path).unwrap();

    // io_uring writer with direct io
    let path = "/home/ra/test-garbage-data.dat";
    File::create(path).unwrap();
    let buf = PageAlignedMemory::new(DEFAULT_BUFFER_WRITE_SIZE).unwrap();
    let mut writer = BufWriter::new(
        IoUringFileWriterBuilder::new()
            .use_direct_io(true)
            .build_with_buffer(path, buf)
            .unwrap(),
    );
    let start = Instant::now();
    for _ in 0..FILE_SIZE / CHUNK_SIZE {
        writer.write_all(&data).unwrap();
    }
    writer.flush().unwrap();
    let io_uring_direct_io_time = start.elapsed();
    std::fs::remove_file(path).unwrap();

    // Plain BufWriter over a File, for comparison
    let path = "/home/ra/test-garbage-data.dat";
    let file = File::create(path).unwrap();
    let mut writer = BufWriter::with_capacity(DEFAULT_BUFFER_WRITE_SIZE, file);
    let start = Instant::now();
    for _ in 0..FILE_SIZE / CHUNK_SIZE {
        writer.write_all(&data).unwrap();
    }
    writer.flush().unwrap();
    let bufwriter_time = start.elapsed();
    std::fs::remove_file(path).unwrap();

    println!("io_uring: {:?}", io_uring_time);
    println!("io_uring direct io: {:?}", io_uring_direct_io_time);
    println!("BufWriter: {:?}", bufwriter_time);
}
```
|
|
bench test output:
@kskalski I think this PR is ready for review. |
Pull request overview
This PR adds IoUringFileWriter, a new io_uring-based file writer for improved write performance, particularly for accounts storage operations. The writer supports asynchronous writes with configurable buffer sizes and includes optional direct I/O support.
Changes:
- Added new `IoUringFileWriter` and `IoUringFileWriterBuilder` with comprehensive test coverage
- Extended `IoBufferChunk` with `AsMut<[u8]>` trait implementation for buffer manipulation
- Integrated io_uring writer into `large_file_buf_writer()` replacing the previous `BufWriter<File>` implementation
Reviewed changes
Copilot reviewed 5 out of 5 changed files in this pull request and generated 11 comments.
| File | Description |
|---|---|
| fs/src/io_uring/file_writer.rs | New module implementing IoUringFileWriter with builder pattern, write/seek support, and comprehensive tests |
| fs/src/io_uring/memory.rs | Added AsMut<[u8]> implementation for IoBufferChunk to support mutable buffer access |
| fs/src/io_uring/file_creator.rs | Changed CHECK_PROGRESS_AFTER_SUBMIT_TIMEOUT visibility to pub(crate) for reuse in file_writer |
| fs/src/io_uring/mod.rs | Added file_writer module export |
| fs/src/buffered_writer.rs | Replaced direct file writing with IoUringFileWriter wrapped in BufWriter for improved performance |
```rust
/// Enabling requires the filesystem to support directio and subbuffers to be a multiple
/// of the fs block size
```
The documentation refers to "subbuffers" which is unclear. This should clarify that it refers to the write_size chunks, e.g., "Enabling requires the filesystem to support directio and the write_size to be a multiple of the fs block size".
```suggestion
/// Enabling requires the filesystem to support directio and the `write_size` to be a
/// multiple of the filesystem block size.
```
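The constraint in that doc comment can be checked with plain arithmetic. The helpers below are a hypothetical sketch: the names are invented for this example, and the 4096-byte block size is an assumption that real code would query from the filesystem rather than hard-code.

```rust
/// Assumed filesystem block size for illustration; real code should query it.
pub const ASSUMED_FS_BLOCK_SIZE: usize = 4096;

/// Returns true when `write_size` is a non-zero multiple of `block_size`.
pub fn is_direct_io_compatible(write_size: usize, block_size: usize) -> bool {
    block_size != 0 && write_size != 0 && write_size % block_size == 0
}

/// Rounds `len` up to the next multiple of `block_size`.
/// `block_size` must be non-zero.
pub fn round_up_to_block(len: usize, block_size: usize) -> usize {
    (len + block_size - 1) / block_size * block_size
}
```

Under that assumption, a 2 MiB write size passes the check, while an 800-byte write would have to be padded to 4096 bytes. Direct I/O generally also places alignment requirements on the buffer address and file offset, which this sketch does not cover.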
Force-pushed from 9ff1615 to d3b0fb6
kskalski left a comment
I was wondering if this could be implemented as a wrapper on `IoUringFileCreator` - the creator's internal representation should allow the required operations, e.g.:
- initialize the file state on construction (creator's `open`)
- obtain the buffer (`wait_free_buf`) and store it locally
- keep filling the current buffer as writes come
- once buffer is full perform the part of the code from `write_and_close` that schedules new op and replace the buffer
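The steps listed above can be sketched roughly as follows. Everything here is hypothetical: `BufferBackend`, `schedule_write`, `WrappingWriter`, and `MemoryBackend` are names invented for the example, and the backend is a synchronous in-memory stand-in rather than the real `IoUringFileCreator` internals, which complete operations asynchronously. Only the name `wait_free_buf` is borrowed from the comment.

```rust
use std::io::{self, Write};

/// Hypothetical stand-in for the creator's internals: hands out free
/// buffers and accepts filled ones for writing at a given file offset.
pub trait BufferBackend {
    fn wait_free_buf(&mut self) -> io::Result<Vec<u8>>;
    fn schedule_write(&mut self, buf: Vec<u8>, file_offset: u64) -> io::Result<()>;
}

pub struct WrappingWriter<B: BufferBackend> {
    backend: B,
    current: Vec<u8>,
    buf_size: usize,
    file_offset: u64,
}

impl<B: BufferBackend> WrappingWriter<B> {
    /// Obtains the first buffer on construction and stores it locally.
    pub fn new(mut backend: B, buf_size: usize) -> io::Result<Self> {
        assert!(buf_size > 0, "buf_size must be non-zero");
        let current = backend.wait_free_buf()?;
        Ok(Self { backend, current, buf_size, file_offset: 0 })
    }

    /// Schedules the current buffer and replaces it with a fresh one.
    fn submit_current(&mut self) -> io::Result<()> {
        if self.current.is_empty() {
            return Ok(());
        }
        let next = self.backend.wait_free_buf()?;
        let filled = std::mem::replace(&mut self.current, next);
        let len = filled.len() as u64;
        self.backend.schedule_write(filled, self.file_offset)?;
        self.file_offset += len;
        Ok(())
    }

    /// Submits whatever is buffered and returns the backend.
    pub fn finish(mut self) -> io::Result<B> {
        self.submit_current()?;
        Ok(self.backend)
    }
}

impl<B: BufferBackend> Write for WrappingWriter<B> {
    fn write(&mut self, data: &[u8]) -> io::Result<usize> {
        if self.current.len() == self.buf_size {
            self.submit_current()?;
        }
        // Keep filling the current buffer; the caller retries with the rest.
        let room = self.buf_size - self.current.len();
        let n = room.min(data.len());
        self.current.extend_from_slice(&data[..n]);
        Ok(n)
    }

    fn flush(&mut self) -> io::Result<()> {
        self.submit_current()
    }
}

/// In-memory backend used only to exercise the sketch.
#[derive(Default)]
pub struct MemoryBackend {
    pub file: Vec<u8>,
    pub ops: usize,
}

impl BufferBackend for MemoryBackend {
    fn wait_free_buf(&mut self) -> io::Result<Vec<u8>> {
        Ok(Vec::new())
    }

    fn schedule_write(&mut self, buf: Vec<u8>, file_offset: u64) -> io::Result<()> {
        let start = file_offset as usize;
        let end = start + buf.len();
        if self.file.len() < end {
            self.file.resize(end, 0);
        }
        self.file[start..end].copy_from_slice(&buf);
        self.ops += 1;
        Ok(())
    }
}
```

With a 4-byte buffer, writing `b"0123456789"` and calling `finish()` schedules three operations (4, 4, and 2 bytes) at offsets 0, 4, and 8.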
|
I think it's better to do it this way -- either completely rewrite file creator to support regular writes or just make a standalone. This feels like the type of project where it's a trap to merge both of them even though they're similar. totally vibes based take though |
|
File creator has everything that is needed for implementing I created a refactor PR (#10157) that should make this all pretty straight-forward. I foresee the writer to basically contain |
|
The refactor does make it quite easy. I'll do a rewrite using FileCreator's internals once that gets merged. |
Problem
Writing to file currently just uses `BufWriter` without `IoUring`.
Summary of Changes
Add `IoUringFileWriter`, which allows writing to a single file using io uring.
bench results:
cc @kskalski
I copied most of the code from `SequentialFileReader` and `IoUringFileCreator` and didn't touch `max_iowq_workers`, `ring_squeue_size`, `shared_sqpoll_fd`.