Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Integrate irmin-server #2031

Merged
merged 26 commits into from
Aug 9, 2023
Merged

Integrate irmin-server #2031

merged 26 commits into from
Aug 9, 2023

Conversation

zshipko
Copy link
Contributor

@zshipko zshipko commented Aug 1, 2022

This PR adds irmin-server and irmin-client packages in place of of irmin-http

TODO

  • Add copyright headers to new files
  • Decide between keeping optional subpackages or adding irmin-server-unix, irmin-client-unix and irmin-client-jsoo packages

@codecov-commenter
Copy link

codecov-commenter commented Aug 1, 2022

Codecov Report

Merging #2031 (6e70ff8) into main (020b52e) will increase coverage by 0.02%.
The diff coverage is 41.86%.

❗ Current head 6e70ff8 differs from pull request most recent head bd71dae. Consider uploading reports for the commit bd71dae to get more accurate results

❗ Your organization is not using the GitHub App Integration. As a result you may experience degraded service beginning May 15th. Please install the Github App Integration for your organization. Read more.

@@            Coverage Diff             @@
##             main    #2031      +/-   ##
==========================================
+ Coverage   68.84%   68.86%   +0.02%     
==========================================
  Files         132      133       +1     
  Lines       16165    16208      +43     
==========================================
+ Hits        11128    11162      +34     
- Misses       5037     5046       +9     
Files Changed Coverage Δ
src/irmin-cli/cli.ml 21.46% <ø> (ø)
src/irmin-cli/server.ml 41.86% <41.86%> (ø)

... and 6 files with indirect coverage changes

📣 We’re building smart automated test selection to slash your CI/CD build times. Learn more

let* tree =
Client.Batch.Tree.of_commit client (Client.Commit.hash head) >|= Option.get
in
let* tree = Client.Batch.Tree.add client tree [ "b"; "c" ] "123" in
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@zshipko could you describe the rational of Batch here? Does it mean that Client.Tree always connect to the server while Client.Batch.Tree is maintaining a local tree?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That's right, the Batch API allows you to manipulate a local tree and send it all at once since the Irmin Store API doesn't give much control over when data will be transferred over the network.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do you mind adding a bit of documentation about this somewhere? Maybe in the corresponding mli?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I am in the process of adding some more comments, I will push something up this week

@samoht
Copy link
Member

samoht commented Feb 16, 2023

@zshipko could you remove the Draft status on this PR? I've force-pushed a rebase - I'm happy to shepherd it until completion. It's a really nice, useful work :-)

Edit: I actually found how to remove the Draft status..

@samoht samoht changed the title Draft: Integrate irmin-server Integrate irmin-server Feb 16, 2023
@samoht samoht marked this pull request as ready for review February 16, 2023 18:35
@zshipko
Copy link
Contributor Author

zshipko commented Feb 16, 2023

Great! I'm happy to help answer any questions

@samoht
Copy link
Member

samoht commented Feb 20, 2023

I've added minimal documentation.

@zshipko I am not sure happy with the current ID management - that seems very error prone (if you forget to call cleanup on every operation you end up with lots of memory leaks). I reckon it's a good enough solution for now, but I was wondering if you had ideas on how to fix this? I have a few ideas but wanted to check if I haven't forgotten anything :-)

  • we can add a close operation on the client - that'll automatically call cleanup_all
  • we can be a bit more clever on the server-side: generate an ID only for new trees that contains in-memory data - otherwise if no change: keep the same ID - if the tree is already fully committed on disk use its' hash

Ideally it would also be nice if Batch would actually be the same as the Tree API, so you could decide to swap those out (if think that's actually a good default!).

@samoht
Copy link
Member

samoht commented Feb 20, 2023

Modulo my comments above, this is ready to be merged - the current API subsumes the low-level HTTP API (which should be faster and more reliable). So I plan to delete irmin-http once this is merged.

This PR adds an interesting (but still needs to complete) higher-level API for efficient tree batching. I'm tempted to either remove it and open another PR with just this to iterate or to mark it as highly experimental until we find the right way to manage remote resources.

@metanivek
Copy link
Member

@samoht I'm all for merging this and then following up with more improvements. Thanks for picking this up!

@zshipko
Copy link
Contributor Author

zshipko commented Feb 21, 2023

I had some issues with getting the Batch interface to match Tree - it definitely seems possible though. The Batch API is kind of a carry-over from before Irmin.Client.S implemented Irmin.S and the whole API was full of slightly different signatures than a typical Irmin store. The idea of a separate batch interface could be avoided by adding more control over when updates are sent to the server (not sure what this would look like exactly) or implementing batching behind the scenes in the Tree API.

It seems like adding a close call to the client would be a good start - another idea I considered was to make commands "consuming" which would exchange an ID for a tree and cleanup the ID.

That being said, removing it altogether and coming up with something more sound also seems like it could yield something interesting! Also, with OCaml 5 the way the ID tracking is done is not very nice (it uses a global Hashtbl and global ref) and probably needs some additional considerations anyway.

@metanivek metanivek self-assigned this Jun 6, 2023
Flushing causes multiple packets to send and trips over Nagle's
algorithm (+ maybe delayed acks). Previously TCP tests were
significantly slower than WebSocket. The WebSocket implementation uses
TCP_NODELAY, which also fixes the slow tests, but the issue is better
addressed by properly writing/flushing our requests as a single packet.
@metanivek metanivek merged commit 8f6f56f into mirage:main Aug 9, 2023
3 of 4 checks passed
art-w added a commit to art-w/opam-repository that referenced this pull request Oct 9, 2023
CHANGES:

### Added

- **irmin-server**
  - Added `irmin-server` package (mirage/irmin#2031, @zshipko)
- **irmin-client**
  - Added `irmin-client` package to connect to `irmin-server` instances (mirage/irmin#2031,
    @zshipko)
- **irmin**
  - Add pretty printers for `Commit`, `Tree`, `Info`, `Status`, `Branch` when
    using `utop` (@metanivek, mirage/irmin#1839)

### Fixed

- **irmin-pack**
  - Fix index integrity check for v3 stores (mirage/irmin#2267, @metanivek)

### Removed

- **irmin-http**
  - Removed `irmin-http` since it is not compatible with generic keys.
    `irmin-grapqhl` or `irmin-server` should be used instead. (mirage/irmin#1902, @zshipko)
- **irmin**
  - Removed stream proofs. We now only have Merkle tree proofs. This simplifies
    the maintenance of that part of the code, as ensuring the correct order of
    cached IO operations was tricky for stream proofs (mirage/irmin#2275, @samoht)

### Changed

- **irmin-git**
  - Moved lower bounds to `git.3.14.0` to use new function (mirage/irmin#2277, @metanivek)
nberth pushed a commit to nberth/opam-repository that referenced this pull request Jun 18, 2024
CHANGES:

### Added

- **irmin-server**
  - Added `irmin-server` package (mirage/irmin#2031, @zshipko)
- **irmin-client**
  - Added `irmin-client` package to connect to `irmin-server` instances (mirage/irmin#2031,
    @zshipko)
- **irmin**
  - Add pretty printers for `Commit`, `Tree`, `Info`, `Status`, `Branch` when
    using `utop` (@metanivek, mirage/irmin#1839)

### Fixed

- **irmin-pack**
  - Fix index integrity check for v3 stores (mirage/irmin#2267, @metanivek)

### Removed

- **irmin-http**
  - Removed `irmin-http` since it is not compatible with generic keys.
    `irmin-grapqhl` or `irmin-server` should be used instead. (mirage/irmin#1902, @zshipko)
- **irmin**
  - Removed stream proofs. We now only have Merkle tree proofs. This simplifies
    the maintenance of that part of the code, as ensuring the correct order of
    cached IO operations was tricky for stream proofs (mirage/irmin#2275, @samoht)

### Changed

- **irmin-git**
  - Moved lower bounds to `git.3.14.0` to use new function (mirage/irmin#2277, @metanivek)
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants