-
Notifications
You must be signed in to change notification settings - Fork 157
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add split function to create a new chunk #2118
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Exciting! I know this is WIP but left some comments for consideration.
@@ -833,3 +833,24 @@ module Concurrent_gc = struct | |||
tc "Test kill gc and close" test_kill_gc_and_close; | |||
] | |||
end | |||
|
|||
module Split = struct | |||
let two_splits () = |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
🎉 it works!
Codecov Report
@@ Coverage Diff @@
## main #2118 +/- ##
==========================================
+ Coverage 64.91% 68.20% +3.28%
==========================================
Files 134 134
Lines 15960 16031 +71
==========================================
+ Hits 10361 10934 +573
+ Misses 5599 5097 -502
📣 We’re building smart automated test selection to slash your CI/CD build times. Learn more |
2763598
to
8250ba3
Compare
c950da8
to
73df70b
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
More comments for consideration. I think this is really close to ready!
src/irmin-pack/unix/file_manager.ml
Outdated
Control.set_payload t.control pl | ||
in | ||
(* Step 4. Fsync the control file *) | ||
Control.fsync t.control |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm still wondering about this fsync
. We update the control file in other places without fsync
ing (for instance, in swap
) -- should we break that here?
In the event that the control file data does not make it to disk, the following things would happen:
- There would be a dangling chunk that is removed on next open by
cleanup
- A future GC would be non-optimal and a chunk would live an extra GC cycle before being deleted.
Point 1 is okay (I think) and point 2 is less great since it means more disk space usage. I don't have a strong opinion here but maybe we leave it and see what impact it has on benchmarks (?).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah, I'm not sure about the fsync
calls here neither
fsync
on one file might lead tofsync
for all files on some filesystems (see irmin-pack: inverstigate performance regression with GC #1959 (comment));- we might have to call
fsync
on the parent directory because, according to https://man7.org/linux/man-pages/man2/fdatasync.2.html:
Calling fsync() does not necessarily ensure that the entry in the
directory containing the file has also reached disk. For that an
explicit fsync() on a file descriptor for the directory is also
needed.
So I think we should remove the fsyncs
from the main process, and do an fsync
on the root at the end of each Gc (similar to what we have now). We only guarantee that data is not lost if it was added before the last gc.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Are you also saying that you want to remove the fsync
on the suffix before creating the new chunk? I still think that is a good idea (I haven't thought through all the consequences otherwise), but either way, I think we have identified a place that needs more careful consideration.
Also, you might be saying this for fsync
ing the directory but just to be on the same page, I think that only applies when you are adding new files since the language of the manual is ensure that the entry in the directory
. From Ensuring data reaches disk:
The more subtle usages deal with newly created files, or overwriting existing files. A newly created file may require an fsync() of not just the file itself, but also of the directory in which it was created (since this is where the file system looks to find your file). This behavior is actually file system (and mount option) dependent. You can either code specifically for each file system and mount option combination, or just perform fsync() calls on the directories to ensure that your code is portable.
This would apply for the new chunk but not the control file.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Indeed, I think we need to think a bit more about fsync
here. For now I removed it.
Are you also saying that you want to remove the
fsync
on the suffix before creating the new chunk? I still think that is a good idea (I haven't thought through all the consequences otherwise), but either way, I think we have identified a place that needs more careful consideration.
Yes, I think it's not a good idea to call fsync
at all in the main process. If we leave all fsync
to the gc process, we can guarantee durability for data up to the last gc. I think this is what we have now too?
Also, you might be saying this for
fsync
ing the directory but just to be on the same page, I think that only applies when you are adding new files since the language of the manual isensure that the entry in the directory
.
This would apply for the new chunk but not the control file.
Indeed, it only applies to new chunk files. But our API lets the user call split
several times before a call to gc
. If we do add fsync
to the control file here, we have to call fsync
for all the directory.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
👍 I definitely understand the hesitancy to introduce an fsync
in the main process. My main concern is that if we do not fsync
here, we have no control file field to help us after a crash. If the N-1
chunk didn't flush all the way to disk, it will open fine (will pass the consistency check bc we will just pass the length of the chunk as end_poff
) but actually not be correct. This has me thinking more about suffix_end_poff
and whether we should consider moving to storing the length of the suffix that has been persisted in order to check that for consistency on open. Let's keep this as an open discussion (particularly if we need to track more information in the control file).
3f6c87f
to
b91cc50
Compare
Rebased, this is ready for another round of reviews. |
let length_from_chunk suffix_off ao = | ||
let chunk_len = Ao.end_poff ao in | ||
Int63.Syntax.(suffix_off + chunk_len) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The length_from_chunk
name seem to suggest that this function returns the length of the chunk. What about
end_offset_of_chunk
? The first parameter could be named start_offset
(* TODO: remove this last check, kept for passing the tests. *) | ||
|| gen0 <> gen1 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why should this be removed? The generation may change independently from the chunks
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ah never mind. this branch doesn't modify the suffix/mappin
src/irmin-pack/unix/gc.ml
Outdated
@@ -235,8 +235,7 @@ module Make (Args : Gc_args.S) = struct | |||
| Error (`Msg error) -> Error (`Corrupted_gc_result_file error) | |||
| Ok ok -> ok |> Result.map_error gc_error | |||
|
|||
let clean_after_abort t = | |||
Fm.cleanup ~root:t.root ~generation:(t.generation - 1) | |||
let clean_after_abort t = Fm.cleanup t.fm ~generation:(t.generation - 1) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is the only place where Fn.cleanup
is called. IINM, the the generation inside fm
is t.generation - 1
, which means that passing generation
is not necessary.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
indeed, thanks
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
A couple of small documentation comments but this LGTM!
CHANGES.md
Outdated
- Add `split` to create a new append-only file on disk, for the following | ||
writes. (#2118, @icristescu) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
- Add `split` to create a new append-only file on disk, for the following | |
writes. (#2118, @icristescu) | |
- Add `split` to create a new suffix chunk. Subsequent writes will append to | |
this chunk until `split` is called again. (#2118, @icristescu) |
I'm not sure what the exact wording should be, but trying to coordinate with the previous entry I added to describe the on-disk format change. We can always revisit the change log in a future PR too.
src/irmin-pack/unix/s.ml
Outdated
stores that allow gc. Raises [RO_Not_Allowed] if called by a readonly | ||
instance. *) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should also mention Split_forbidden_during_batch
and Multiple_empty_chunks
.
…ils, irmin-test, irmin-pack, irmin-mirage, irmin-mirage-graphql, irmin-mirage-git, irmin-http, irmin-graphql, irmin-git, irmin-fs, irmin-containers, irmin-cli, irmin-chunk and irmin-bench (3.5.0) CHANGES: ### Added - **irmin-pack** - Add `Irmin_pack_unix.Stats.Latest_gc` which is now the parameter of GC's `finished` callback (mirage/irmin#2089, @Ngoguey42) - Add `Gc.oldest_live_commit` which returns the key of the commit on which the latest gc was called on. (mirage/irmin#2110, @icristescu) - Add `split` to create a new suffix chunk. Subsequent writes will append to this chunk until `split` is called again. (mirage/irmin#2118, @icristescu) - Add `create_one_commit_store` to create a new store from the existing one, containing only one commit. (mirage/irmin#2125, @icristescu) ### Changed - **irmin-pack** - Upgraded on-disk format to version 4. (mirage/irmin#2110, @icristescu) - Detecting control file corruption with a checksum (mirage/irmin#2119, @art-w) - Change on-disk layout of the suffix from a single file to a multiple, chunked file design (mirage/irmin#2115, @metanivek) - Modify GC to work with new chunked suffix. See `examples/gc.ml` for a demonstration of how it works with the new `split` function. (mirage/irmin#2126, @metanivek) ### Fixed
…ils, irmin-test, irmin-pack, irmin-mirage, irmin-mirage-graphql, irmin-mirage-git, irmin-http, irmin-graphql, irmin-git, irmin-fs, irmin-containers, irmin-cli, irmin-chunk and irmin-bench (3.5.0) CHANGES: ### Added - **irmin-pack** - Add `Irmin_pack_unix.Stats.Latest_gc` which is now the parameter of GC's `finished` callback (mirage/irmin#2089, @Ngoguey42) - Add `Gc.oldest_live_commit` which returns the key of the commit on which the latest gc was called on. (mirage/irmin#2110, @icristescu) - Add `split` to create a new suffix chunk. Subsequent writes will append to this chunk until `split` is called again. (mirage/irmin#2118, @icristescu) - Add `create_one_commit_store` to create a new store from the existing one, containing only one commit. (mirage/irmin#2125, @icristescu) ### Changed - **irmin-pack** - Upgraded on-disk format to version 4. (mirage/irmin#2110, @icristescu) - Detecting control file corruption with a checksum (mirage/irmin#2119, @art-w) - Change on-disk layout of the suffix from a single file to a multiple, chunked file design (mirage/irmin#2115, @metanivek) - Modify GC to work with new chunked suffix. See `examples/gc.ml` for a demonstration of how it works with the new `split` function. (mirage/irmin#2126, @metanivek) ### Fixed
Rebased on #2115.
I want to add more tests, but I will not change too much the code, apart from adapting to #2115 and #2117.