Upgrade unipi to paf-le-chien by dinosaure · Pull Request #4 · robur-coop/unipi

dinosaure · 2021-06-21T22:05:12Z

It's a draft. At least it compiles, but the TLS path seems buggy and it's hard to understand why. I will try to investigate this further soon. It's a proposal, so feel free to share your opinion on it and improve this unikernel 👍.

dinosaure · 2021-06-24T09:30:41Z

So an implementation of unipi with paf is running here: https://unipi.egar.im/ (with a debug let's encrypt certificate and small tweaks on paf available here: dinosaure/paf-le-chien#28). WDYT?

hannesm · 2021-06-24T09:54:24Z

I don't know. What is the goal? This PR adds quite some boilerplate to the codebase :/

Will this remove the cohttp and conduit dependency entirely? if not, can we get to this point (easily?) -- maybe if let's encrypt specifies their own module type for the HTTP client used (to avoid the cohttp dependency)?

dinosaure · 2021-06-24T10:10:56Z

> I don't know. What is the goal? This PR adds quite some boilerplate to the codebase :/

I think we can indeed do better here. The main problem is the mimic ritual needed to let the Let's Encrypt sub-module communicate safely with Let's Encrypt and complete the challenge. That part is provided by paf with paf.le (see this functor: https://github.com/dinosaure/paf-le-chien/blob/master/lib/lE.mli).

But I'm not sure that the sub-module Letsencrypt and paf.le are equivalent. I need to check.

> Will this remove the cohttp and conduit dependency entirely?

The only remaining piece is the Cohttp.Client.S signature, so conduit is definitely removed.

> maybe if let's encrypt specifies their own module type for the HTTP client used (to avoid the cohttp dependency)?

That could indeed be a nicer solution! We have already talked about mirage-http and the ability to provide such an interface without any dependencies (on cohttp or http/af). If you think that is the best way, I can dig in that direction 👍.

On the other hand, paf could handle ALPN and correctly dispatch HTTP/1.1 and HTTP/2 requests (this is not currently the case, but it should be easy to do), which could be interesting for us.

dinosaure · 2021-06-24T10:12:43Z

PS: I can try to run some benchmarks between this version and the cohttp version, to perhaps highlight an improvement 👍

dinosaure · 2021-06-29T21:14:22Z

So I did a large stress-test between cohttp and http/af and it seems that http/af can handle ~ 15 000 requests per sec when cohttp handles only ~ 2500 requests per sec for the same file. The magnitude is:

for http/af, we need 0.007 sec to respond to the client
for cohttp, we need 0.027 sec to respond to the client

Both tests were done over TLS. This is the raw output of the benchmark (http/af):

dinosaure@turbine:~$ hey -n 1000000 -c 24 https://unipi.egar.im/

Summary:
  Total:	73.8878 secs
  Slowest:	0.2575 secs
  Fastest:	0.0010 secs
  Average:	0.0018 secs
  Requests/sec:	13533.8195
  
  Total data:	22999632 bytes
  Size/request:	23 bytes

Response time histogram:
  0.001 [1]	|
  0.027 [999926]	|■■■■■■■■■■■■■■■■■■■■■■■■■■■■■■■■■■■■■■■■
  0.052 [47]	|
  0.078 [9]	|
  0.104 [0]	|
  0.129 [0]	|
  0.155 [0]	|
  0.181 [0]	|
  0.206 [0]	|
  0.232 [0]	|
  0.258 [1]	|


Latency distribution:
  10% in 0.0013 secs
  25% in 0.0014 secs
  50% in 0.0015 secs
  75% in 0.0017 secs
  90% in 0.0021 secs
  95% in 0.0024 secs
  99% in 0.0103 secs

Details (average, fastest, slowest):
  DNS+dialup:	0.0000 secs, 0.0010 secs, 0.2575 secs
  DNS-lookup:	0.0000 secs, 0.0000 secs, 0.0183 secs
  req write:	0.0000 secs, 0.0000 secs, 0.0021 secs
  resp wait:	0.0017 secs, 0.0009 secs, 0.2575 secs
  resp read:	0.0000 secs, 0.0000 secs, 0.0025 secs

Status code distribution:
  [200]	999984 responses

And cohttp:

dinosaure@turbine:~$ hey -n 1000000 -c 24 https://unipi.egar.im/

Summary:
  Total:	427.5950 secs
  Slowest:	0.2574 secs
  Fastest:	0.0011 secs
  Average:	0.0102 secs
  Requests/sec:	2338.6243
  
  Total data:	22999632 bytes
  Size/request:	23 bytes

Response time histogram:
  0.001 [1]	|
  0.027 [828281]	|■■■■■■■■■■■■■■■■■■■■■■■■■■■■■■■■■■■■■■■■
  0.052 [170993]	|■■■■■■■■
  0.078 [681]	|
  0.104 [4]	|
  0.129 [0]	|
  0.155 [0]	|
  0.180 [0]	|
  0.206 [0]	|
  0.232 [0]	|
  0.257 [24]	|


Latency distribution:
  10% in 0.0019 secs
  25% in 0.0024 secs
  50% in 0.0028 secs
  75% in 0.0040 secs
  90% in 0.0451 secs
  95% in 0.0465 secs
  99% in 0.0495 secs

Details (average, fastest, slowest):
  DNS+dialup:	0.0000 secs, 0.0011 secs, 0.2574 secs
  DNS-lookup:	0.0000 secs, 0.0000 secs, 0.0019 secs
  req write:	0.0000 secs, 0.0000 secs, 0.0011 secs
  resp wait:	0.0020 secs, 0.0010 secs, 0.0456 secs
  resp read:	0.0081 secs, 0.0000 secs, 0.0680 secs

Status code distribution:
  [200]	999984 responses

The hey tool should be readily available. The client is in a different geographic location than the server. It's a small benchmark and not really reproducible, but it shows nice results for http/af.
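To put the two runs side by side, the throughput can be recomputed from the totals that hey printed. This is only a small arithmetic sketch; the record type and function names are made up for illustration, and the figures are copied from the two outputs above.

```ocaml
(* Figures copied from the two hey summaries above. *)
type run = { name : string; responses : int; total_secs : float }

let httpaf = { name = "http/af"; responses = 999_984; total_secs = 73.8878 }
let cohttp = { name = "cohttp"; responses = 999_984; total_secs = 427.5950 }

(* Requests per second = completed responses / wall-clock time. *)
let requests_per_sec r = float_of_int r.responses /. r.total_secs

(* How many times faster the first run is than the second. *)
let speedup a b = requests_per_sec a /. requests_per_sec b

let () =
  List.iter
    (fun r -> Printf.printf "%-8s %.1f req/s\n" r.name (requests_per_sec r))
    [ httpaf; cohttp ];
  Printf.printf "speedup: %.2fx\n" (speedup httpaf cohttp)
```

With these inputs the ratio comes out a little under 6x, in line with the rough "~ 15 000 vs ~ 2500" estimate. The 999984 responses (rather than 1000000) equal 24 × 41666, which is what one would expect if each of the 24 workers sent the same whole number of requests.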

hannesm · 2021-06-30T06:13:02Z

Thanks @dinosaure, the benchmark looks convincing.

About let's encrypt: I don't know what the state of mirage-http is. I'd be fine to have a HTTP Client module type in letsencrypt directly.

talex5 · 2021-06-30T10:00:49Z

> So I did a large stress-test between cohttp and http/af and it seems that http/af can handle ~ 15 000 requests per sec when cohttp handles only ~ 2500 requests per sec for the same file.

That's a surprisingly large difference. There are some http benchmarks at https://github.com/ocaml-multicore/retro-httpaf-bench and there we see httpaf being "only" about twice as fast as cohttp:

That graph is from wrk2 with 1000 open connections repeatedly requesting a single static page (which fits in one packet). All servers were configured to use a single core for this test.

dinosaure · 2021-06-30T10:36:54Z

> That's a surprisingly large difference. There are some http benchmarks at https://github.com/ocaml-multicore/retro-httpaf-bench and there we see httpaf being "only" about twice as fast as cohttp:

Yeah, I think it's not a fair benchmark for many reasons:

I don't have any control over the flow between the server and the client (two different servers on the internet)
TLS is used in my context, and I'm not sure about its impact on performance (or how cohttp/conduit and http/af/paf differ on that point in detail)
mirage-tcpip is used here
and the unikernel is virtualized with KVM, though http/af and cohttp are deployed in the same context

These points are why I said that the "benchmark" is not reproducible; above all it shows (with unipi) a "real" usage of what we want (a simple MirageOS website synchronized with a Git repository). Maybe the context suits http/af better, but the huge difference between cohttp and http/af gives me some reasons to switch to http/af in the end 🙂.

dinosaure · 2021-07-08T09:48:37Z

https://unipi.egar.im/ has been up for a long time now, so I believe I'm ready to cut a release of paf (which makes it easier to get a Let's Encrypt certificate), and then this PR will be ready to merge!

hannesm · 2021-07-08T09:54:25Z

Thanks again @dinosaure -- there's still the outstanding question of the dependency cone with respect to letsencrypt (which currently has a hard dependency on cohttp, and paf has a dependency on letsencrypt). I'd appreciate it if we could conclude:

have a HTTP_client.S in letsencrypt (that is a module type which is trivially implemented by cohttp -- thus there's no need for any consumer change if you use a cohttp client)
have letsencrypt.httpaf implement that module type
remove (a) letsencrypt dependency from paf (or does it serve a good purpose there?) (b) have a unipi-without-cohttp [and then we can follow with other unikernels, removing the whole cohttp & conduit dependencies]
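The first two points could look roughly like this. It is only a sketch of the idea: the names HTTP_CLIENT, Acme, Fake_client and directory_ok are hypothetical, the signature is deliberately tiny, and it is not the interface that letsencrypt ships. It uses plain values instead of Lwt promises so that it stays self-contained.

```ocaml
(* A minimal, library-independent HTTP client signature (hypothetical). *)
module type HTTP_CLIENT = sig
  type response = { status : int; headers : (string * string) list; body : string }

  val get : ?headers:(string * string) list -> string -> response
  val post : ?headers:(string * string) list -> body:string -> string -> response
end

(* Code that needs HTTP is written as a functor over the signature, so it
   depends on neither cohttp nor http/af. *)
module Acme (C : HTTP_CLIENT) = struct
  (* [directory_ok url] is true when the directory endpoint answers 200. *)
  let directory_ok url = (C.get url).C.status = 200
end

(* An in-memory stand-in; a real backend would wrap cohttp or http/af. *)
module Fake_client : HTTP_CLIENT = struct
  type response = { status : int; headers : (string * string) list; body : string }

  let get ?headers:_ url =
    if url = "https://acme.example/directory"
    then { status = 200; headers = []; body = "{}" }
    else { status = 404; headers = []; body = "" }

  let post ?headers:_ ~body _url = { status = 201; headers = []; body }
end

module A = Acme (Fake_client)

let () =
  Printf.printf "directory reachable: %b\n"
    (A.directory_ok "https://acme.example/directory")
```

Because consumers only see the signature, an existing cohttp-based client would keep working as long as it is wrapped to match it, and an http/af-based one could be swapped in without touching the functor.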

WDYT?

dinosaure · 2021-07-08T10:04:40Z

Yes, give me some time to shape all of that. I still need to decide whether to revive mirage-http or just remove letsencrypt's hard dependency on cohttp 👍.

hannesm · 2021-07-21T09:17:19Z

So, now that letsencrypt 0.3.0 is in opam-repository, we could proceed with (a) adapting paf and (b) removing cohttp from unipi. This would clean up the dependency cone drastically :) (and reduce the binary size).

dinosaure · 2021-09-16T16:36:23Z

I updated the PR with the latest changes to:

ocaml-git (we use git-mirage.3.5.0 now)
letsencrypt (we use letsencrypt.0.3.0)
and paf & paf-le (we use paf.0.0.5 & paf-le.0.0.5)

However, we still need to pin irmin to the right version. It seems that irmin is going to break its API, and the updates for it do not yet reflect what is going on in irmin~master (the API will change again). At least unipi compiles (I believe).

dinosaure · 2021-10-20T16:00:46Z

Just waiting for the release of git.3.6.0 and this PR will be ready, but as far as I can tell, we can start reviewing it!

hannesm · 2021-10-26T17:12:09Z

Thanks for your PR. I pushed a cleanup commit on top. I'll raise some questions via the code comment / review system.

hannesm · 2021-10-26T17:12:58Z

        begin
-          hookf () >>= function
-          | Ok data -> Http.respond ~status:`OK ~body:(`String data) ()
+          Lwt.async @@ fun () -> hookf () >>= function


I'm not really comfortable with this Lwt.async -- what is the resource / connection / task story for Httpaf here?

By type, http/af requires the request_handler (here, dispatch) to return unit instead of unit Lwt.t. httpaf then has an internal queue that takes care of callbacks and executes these functions (unit -> unit), which can emit Read/Write or Error requests to the socket (via respond_with_*) through the given reqd.

Concurrently, paf processes these tasks with mirage-tcpip on the other side via a server connection. So resources (such as buffers, internal state, etc.) are shared between the server connection and the given reqd. As long as the reqd exists, the internal resources exist.

In this case, as in the other branch, we need resources that are only available via Lwt. If you are worried about what can happen inside the Lwt.async (such as an exception), maybe we can/should add an Lwt.catch inside and then call Reqd.report_exn, to be sure that the resource is freed whatever happens.
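The Lwt.catch idea can be modelled without any library. In this sketch everything is a stand-in: reqd is a toy record, respond and report_exn only mimic the role of the Httpaf.Reqd functions, and a plain queue of thunks plays the part of Lwt.async. The point it illustrates is just that the handler returns unit immediately, while the deferred work always ends in either a response or an error report.

```ocaml
(* Toy request descriptor: records what was eventually sent (hypothetical). *)
type reqd = { mutable outcome : [ `Pending | `Responded of string | `Failed of string ] }

let respond reqd body = reqd.outcome <- `Responded body
let report_exn reqd exn = reqd.outcome <- `Failed (Printexc.to_string exn)

(* Stand-in for Lwt.async: work is queued and run later by [run_pending]. *)
let pending : (unit -> unit) Queue.t = Queue.create ()
let async task = Queue.add task pending
let run_pending () =
  while not (Queue.is_empty pending) do (Queue.pop pending) () done

(* The handler returns unit right away; the wrapped task cannot leak an
   exception, so the request is always completed one way or the other. *)
let dispatch hookf reqd =
  async (fun () ->
      match hookf () with
      | data -> respond reqd data
      | exception exn -> report_exn reqd exn)

let () =
  let ok = { outcome = `Pending } and bad = { outcome = `Pending } in
  dispatch (fun () -> "updated") ok;
  dispatch (fun () -> failwith "git pull failed") bad;
  run_pending ();
  assert (ok.outcome = `Responded "updated");
  assert (bad.outcome <> `Pending)
```

In the real handler the same shape would presumably be Lwt.async wrapping Lwt.catch, with Reqd.report_exn in the exception branch, as suggested above.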

hannesm · 2021-10-26T17:13:46Z

+            let headers = Httpaf.Headers.of_list
+              [ "content-length", string_of_int (String.length data) ] in
+            let resp = Httpaf.Response.create ~headers `OK in
+            Httpaf.Reqd.respond_with_string reqd resp data ;


This one does actually send out data, does it not? but it does not seem to be in the Lwt monad -- what is the story here?

As I said above, respond_with_string just fills an internal buffer in the reqd, which is shared with a server connection owned by paf (for instance). Concurrently, paf has launched a unit Lwt.t that takes care of the Read/Write operations.

httpaf then has an internal queue of callbacks, which it executes one by one: on one side it emits syscall actions (via the server connection), and on the other side it consumes what the user provides (via the given reqd).

A question may remain about data races (the internal queue is mutated), but the computation model of lwt and the global GC lock ensure that everything is safe (I believe).
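A toy model of that split, using only the standard library. None of this is httpaf's or paf's actual code: conn, respond_with_string and writer_step are hypothetical names that mirror the roles described above. Responding only appends to a shared buffer; a separate step, standing in for the unit Lwt.t launched by paf, is the only place where output reaches the socket.

```ocaml
(* Shared state between the request side and the connection side. *)
type conn = { outgoing : Buffer.t; mutable sent : string list }

(* Request side: no I/O, just fill the shared buffer. *)
let respond_with_string conn body = Buffer.add_string conn.outgoing body

(* Connection side: drain whatever is buffered and write it out.
   In this model, writing means recording the chunk in [sent]. *)
let writer_step conn =
  if Buffer.length conn.outgoing > 0 then begin
    conn.sent <- Buffer.contents conn.outgoing :: conn.sent;
    Buffer.clear conn.outgoing
  end

let () =
  let c = { outgoing = Buffer.create 64; sent = [] } in
  respond_with_string c "hello";
  (* Nothing has been written yet: the data only sits in the buffer. *)
  assert (c.sent = []);
  writer_step c;
  assert (c.sent = [ "hello" ])
```

Since both sides run cooperatively and only yield at well-defined points, the buffer is never observed half-updated in this model, which is the property the safety argument above relies on.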

hannesm · 2021-10-26T17:14:14Z

-            Http.respond ~status:`Internal_server_error ~body:(`String msg) ()
+            let headers = Httpaf.Headers.of_list
+              [ "content-length", string_of_int (String.length msg) ] in
+            let resp = Httpaf.Response.create ~headers `Internal_server_error in


the above three lines can be factored out together with 109 - 111.
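One possible shape for that factoring, sketched without linking httpaf so that it runs on its own. The helper name string_response is hypothetical; it only computes the pieces that the duplicated lines build (the content-length header for the body), and the comment shows where the Httpaf calls from the diff would go.

```ocaml
(* [string_response status body] gathers what both branches need:
   the status, a content-length header for [body], and the body itself. *)
let string_response status body =
  let headers = [ "content-length", string_of_int (String.length body) ] in
  (status, headers, body)

(* With httpaf, the triple would be consumed in one place, for example:
     let status, headers, body = string_response `OK data in
     let headers = Httpaf.Headers.of_list headers in
     let resp = Httpaf.Response.create ~headers status in
     Httpaf.Reqd.respond_with_string reqd resp body *)

let () =
  let _, headers, _ = string_response `Internal_server_error "boom" in
  assert (List.assoc "content-length" headers = "4")
```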

hannesm · 2021-10-26T17:15:13Z

+            Httpaf.Reqd.respond_with_string reqd resp data ;
+            Lwt.return_unit
+
+    let redirect port _ reqd =


could the second argument now be removed? what is provided here?

The second argument is the Ipaddr.t * int of the peer (the client). You cannot remove it because paf passes it in.

hannesm · 2021-10-26T17:15:27Z

-    let dispatch store hookf hook_url request _body =
-      let p = Uri.path (Cohttp.Request.uri request) in
-      let path = if String.equal p "/" then "index.html" else p in
+    let dispatch store hookf hook_url _conn reqd =


should _conn be removed?

_conn is the Ipaddr.t * int too. You cannot remove it (because the caller passes it to the callback).

hannesm · 2021-10-26T17:16:09Z

      let port = if port = 443 then None else Some port in
-      let new_uri = Uri.with_port new_uri port in
+      let path = request.Httpaf.Request.target in
+      let new_uri = Uri.make ~scheme:"https" ?host:(Key_gen.hostname ()) ?port ~path () in


is this the only remaining use of uri? can we drop that dependency? (I'm fine working out the string stuff to get the "new url" ;)

I don't have a strong opinion about that, feel free to remove it 👍.

hannesm · 2021-10-26T17:17:34Z

+               ; LE.account_seed = Key_gen.account_seed ()
+               ; LE.account_key_type = `ED25519
+               ; LE.account_key_bits = Some 4096
+               ; LE.hostname = Key_gen.hostname () |> Option.get |> Domain_name.of_string_exn |> Domain_name.host_exn }


the previous defaults for account key type and certificate key type should be used here -- also there should be command line arguments for the key types (and bit sizes)

hannesm · 2021-11-04T12:50:47Z

with this branch, @hb9cwp reported successful builds (on Unix and hvt) on OpenBSD. But the hvt unikernel does not serve web pages -- maybe related to the httpaf/mimic/paf semantics of Lwt.async that were discussed above? @dinosaure did you do functional tests with a hvt unikernel (that it actually serves web pages)?

hb9cwp · 2021-11-07T20:08:30Z

@hannesm @dinosaure Good news: with this branch, I have got both unix as well as hvt targets working now on both OpenBSD 6.9 and 7.0 after opam update & upgrade to latest, using mirage v3.10.6 and ocaml 4.10.2 :-)
Further, on OpenBSD 7.0, I had to raise the stack size from 4 to 32 MB using ulimit -s 32768 (the value is in kilobytes), otherwise make depend/build fails.
Finally, in my start script for the hvt unikernel, I corrected the netmask from /32 to /24 on the unikernel's tap interface which is bridged at layer-2 to the physical NICs re or em of my OpenBSD hosts.
Tomorrow, I will try to apply the lessons learned to dns-primary-git.
Thank you for all the spontaneous support that got me so far!

hannesm · 2021-11-07T23:09:39Z

@hb9cwp great! This means your unix and hvt unikernels deliver data via HTTP(S) to clients that send requests (in contrast to your earlier comment that there's no content being delivered)?

hb9cwp · 2021-11-08T10:05:15Z

@hannesm Yes, exactly. So far, I tested serving simple .html pages and .png images with HTTP though, HTTPS clients to come. Also, Unipi's hook works and triggers it to refresh from Github using HTTPS, SSH to come.

P.S. Also, Unipi unikernels answer requests on both their IPv4 and IPv6 addresses of their tap interfaces in dual-stack OpenBSD hosts, and uses IPv6 for DNS resolution as well as HTTPS to fetch the selected branch from Github, if available.

hannesm · 2021-11-08T12:31:13Z

@hb9cwp great, thanks for your confirmation.

> Unipi unikernels answer requests on both their IPv4 and IPv6 addresses of their tap interfaces in dual-stack OpenBSD hosts,

indeed :)

> and uses IPv6 for DNS resolution

yes :) also using DNS-over-TLS by default (to anycast.uncensoreddns.org)

> as well as HTTPS to fetch the selected branch from Github, if available.

sadly not yet AFAICT (lack of using happy-eyeballs in the git client code)

hannesm · 2021-11-10T13:09:43Z

merged manually into main, thanks for the PR!

hannesm · 2022-04-28T18:17:34Z

It has been some months since this was merged, but there are some regressions:

If tls/https was configured, a redirect from http to https (port 80 to 443) was previously in place; this PR removed it (re-added in commit f2825c3).
The redirect function was wrong. Previously it used "let uri = Cohttp.Request.uri request", which is a full uri. Now it uses "let path = request.Httpaf.Request.target", which is only the path part of the uri. The result is that the redirect of http://10.0.42.2/foo.html sets the location to https:/foo.html, i.e. the host is missing. (fixed in commits 6ab8f1f and 2851323)
The KV lookup used let p = Uri.path (Cohttp.Request.uri request) in, which is the path and only the path (i.e. no query parameters). Now let path = request.Httpaf.Request.target in is used, which is the path including query parameters, so https://10.0.42.2/foo.html?v=23 leads to a not found instead of delivering foo.html. (fixed in 91d0260)
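The last two points come down to how the request target is handled. The sketch below uses only the standard library; path_of_target and redirect_location are hypothetical helpers written for illustration, not the code from the fixing commits, and they assume an origin-form target (path plus optional query) and a host that needs no bracketing.

```ocaml
(* An origin-form request target is the path followed by an optional
   ?query part; the KV lookup only wants the path. *)
let path_of_target target =
  match String.index_opt target '?' with
  | Some i -> String.sub target 0 i
  | None -> target

(* Build the https Location from an explicit host, since the target alone
   carries no host. The port is omitted when it is the https default. *)
let redirect_location ~host ~port target =
  let authority =
    if port = 443 then host else Printf.sprintf "%s:%d" host port
  in
  Printf.sprintf "https://%s%s" authority target

let () =
  assert (path_of_target "/foo.html?v=23" = "/foo.html");
  assert (path_of_target "/foo.html" = "/foo.html");
  assert (redirect_location ~host:"10.0.42.2" ~port:443 "/foo.html"
          = "https://10.0.42.2/foo.html");
  assert (redirect_location ~host:"10.0.42.2" ~port:4433 "/foo.html?v=23"
          = "https://10.0.42.2:4433/foo.html?v=23")
```

Keeping the query string in the redirect but dropping it for the KV lookup matches the two behaviours described above.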

Upgrade unipi to paf-le-chien

2cac72a

.

6e85aea

mseri mentioned this pull request Jun 30, 2021

conduit-lwt-unix: se accept_n on the server mirage/ocaml-conduit#387

Closed

dinosaure mentioned this pull request Sep 16, 2021

Unhelpful conflict messages ocaml/opam#4373

Closed

dinosaure force-pushed the with-paf branch from 66a4520 to dec5db8 Compare September 16, 2021 16:33

dinosaure force-pushed the with-paf branch from dec5db8 to 32bd59b Compare October 20, 2021 16:00

dinosaure marked this pull request as ready for review October 20, 2021 16:00

dinosaure force-pushed the with-paf branch from 32bd59b to 777de4c Compare October 21, 2021 08:45

Update unipi with the last version of git-paf/paf

0ab8cc8

dinosaure force-pushed the with-paf branch from 777de4c to 0ab8cc8 Compare October 21, 2021 13:35

minor adjustments

6a5ccac

hannesm reviewed Oct 26, 2021

hannesm mentioned this pull request Oct 27, 2021

Builds on OpenBSD and builds.robur.coop now fail which were ok before #5

Closed

hb9cwp mentioned this pull request Oct 29, 2021

update to recent paf and irmin robur-coop/dns-primary-git#9

Closed

hannesm closed this Nov 10, 2021

dinosaure deleted the with-paf branch November 11, 2021 15:03

dinosaure mentioned this pull request Nov 22, 2021

Update TLS certificates in running server mirage/ocaml-conduit#409

Open

dinosaure mentioned this pull request Jan 7, 2022

Move to paf and use paf.le instead of inner Le module yomimono/url-shortener#2

Open

dinosaure mentioned this pull request Apr 25, 2022

Upgrade the codebase to use paf-le-chien instead of CoHTTP yomimono/url-shortener#3

Open
