dnsvizor: init at 0-unstable-2025-12-15 #1907
The remaining thing to do is to upstream my patches. Other than that, this PR is ready for review.
bad: eval failure by IFD or build failure
imincik
left a comment
Please investigate if we can solve IFDs somehow. Thanks.
inputs.opam-nix.inputs.opam-overlays.follows = "opam-overlays";
inputs.opam-nix.inputs.mirage-opam-overlays.follows = "mirage-opam-overlays";
# update opam-nix to fix eval error of new Nixpkgs: attribute 'overrideScope'' missing
inputs.opam-nix.url = "github:tweag/opam-nix";
This one uses IFD, which will prevent us from merging this PR. Is there any workaround (for example, including some generated file in git)?
When you say this will block the merge, do you mean
- that there is a no-IFD policy in ngipkgs, or
- that CI currently fails on aarch64?
Indeed, opam-nix, used by hillingar, can avoid IFD by adding generated files. However, hillingar does not expose that functionality from opam-nix yet. More work is needed to implement that in hillingar.
I changed the buildbot-nix CI config as a workaround to make CI pass in bf173c8.
See the commit message for more details and possible alternative methods.
Avoiding IFD is of low priority (to me) now that CI passes, so I'll focus on projects(dnsvizor): init next. I am also fine with working on avoiding IFD before projects(dnsvizor): init if you think that is better.
WDYT?
My patches have been upstreamed.
My patches are applied using nix.
By default, buildbot-nix looks at "checks", which consists of "checks.x86_64-linux" and "checks.aarch64-linux". Our buildbot-nix CI runs on an x86_64-linux[1] machine, so buildbot-nix errors out for IFDs that need to build on aarch64-linux systems.

This patch fixes that error by letting buildbot-nix only look at "checks.x86_64-linux", which was made possible by [2]. Compared to the previous state, the only disadvantage is that we no longer catch eval errors on aarch64-linux in the buildbot-nix CI.

There are 3 possible alternative fixes:
1. ban IFD in ngipkgs
2. exclude "aarch64-linux" from "checks"
3. emulate aarch64-linux on an x86_64-linux machine using boot.binfmt.emulatedSystems = [ "aarch64-linux" ]

The 1st alternative fix usually needs extra work to implement and usually means we have to commit generated lock files to the ngipkgs repo.
The 2nd alternative fix affects more than just the buildbot-nix CI, such as "nix flake check", which may not be desirable.
The 3rd alternative fix will slow down the buildbot-nix CI since emulating another system is slow.

[1]: https://github.com/ngi-nix/ngipkgs/blob/dfab738d4a1d00f6c1b958be29163d672badf05f/infra/makemake/default.nix#L3
[2]: nix-community/buildbot-nix#318
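The idea of restricting evaluation to a single system can be tried locally. The following is a minimal shell sketch, not the buildbot-nix configuration itself: it assumes a flake in the current directory that exposes `checks.<system>`, and the function names `nix_system` and `list_native_checks` are made up for illustration. It only lists the check names for the machine's own system.

```shell
#!/bin/sh
# Hypothetical helpers; not part of ngipkgs or buildbot-nix.

# Map `uname -m` / `uname -s` values to a Nix system string (common cases only).
nix_system() {
  ns_machine="$1"
  ns_kernel="$2"
  case "$ns_machine" in
    arm64) ns_machine="aarch64" ;;  # macOS reports arm64; Nix calls it aarch64
  esac
  printf '%s-%s\n' "$ns_machine" "$(printf '%s' "$ns_kernel" | tr '[:upper:]' '[:lower:]')"
}

# Print the attribute names under checks.<native system> of the flake in
# the current directory, as JSON. Requires the `nix` CLI with flakes enabled.
list_native_checks() {
  lc_system="$(nix_system "$(uname -m)" "$(uname -s)")"
  nix eval ".#checks.${lc_system}" --apply builtins.attrNames --json
}
```

On an x86_64-linux machine, `list_native_checks` queries `.#checks.x86_64-linux` only, so it should not need to evaluate anything under `checks.aarch64-linux`.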
While working on #1944, I find that the built
Fortunately, upstream provides binary releases of unikernels (the same unikernel binary works on many arch/platform/systems), and claims them to be reproducible.
As @ju1m mentioned in the review meeting:
Could we use materialization?
To be clear, I already know that opam-nix supports materialization. I just described it in different words, as "adding generated files".
The complex part of avoiding IFD for MirageOS unikernels lies in the (cross-)building process of the unikernel.
Oh noes, unlike the prebuilt
Reproducer
A quicker reproducer
$ nix shell nixpkgs#solo5
$ sudo ip tuntap add tap-unikernel mode tap
$ sudo ip link set dev tap-unikernel up
$ solo5-hvt --mem=512 --net:service=tap-unikernel -- $(nix build --print-out-paths --no-link -f. dnsvizor.hvt --allow-import-from-derivation)/dnsvizor.hvt
            |      ___|
  __|  _ \  |  _ \ __ \
\__ \ (   | | (   |  ) |
____/\___/ _|\___/____/
Solo5: Bindings version v0.10.0
Solo5: Memory map: 512 MB addressable:
Solo5: reserved @ (0x0 - 0xfffff)
Solo5: text @ (0x100000 - 0x50bfff)
Solo5: rodata @ (0x50c000 - 0x5c8fff)
Solo5: data @ (0x5c9000 - 0xa16fff)
Solo5: heap >= 0xa17000 < stack < 0x20000000
Solo5: trap: type=#PF ec=0x0 rip=0x466a86 rsp=0x1ffffc10 rflags=0x10002
Solo5: trap: cr2=0x28
Solo5: ABORT: cpu_x86_64.c:181: Fatal trap
Expecting:
$ aria2c https://builds.robur.coop/job/dnsvizor/build/dd2ac462-f0d4-4866-8439-e4fbbc1e97ae/f/bin/dnsvizor.hvt
$ solo5-hvt --mem=512 --net:service=tap-unikernel -- ./dnsvizor.hvt
            |      ___|
  __|  _ \  |  _ \ __ \
\__ \ (   | | (   |  ) |
____/\___/ _|\___/____/
Solo5: Bindings version v0.10.0
Solo5: Memory map: 512 MB addressable:
Solo5: reserved @ (0x0 - 0xfffff)
Solo5: text @ (0x100000 - 0x4ddfff)
Solo5: rodata @ (0x4de000 - 0x66afff)
Solo5: data @ (0x66b000 - 0xa02fff)
Solo5: heap >= 0xa03000 < stack < 0x20000000
2026-02-04T02:58:00-00:00: [INFO] [netif] Plugging into service with mac 42:55:0f:61:29:30 mtu 1500
2026-02-04T02:58:00-00:00: [INFO] [ethernet] Connected Ethernet interface 42:55:0f:61:29:30
2026-02-04T02:58:00-00:00: [INFO] [ARP] Sending gratuitous ARP for 10.0.0.2 (42:55:0f:61:29:30)
2026-02-04T02:58:00-00:00: [INFO] [ARP] Sending gratuitous ARP for 10.0.0.2 (42:55:0f:61:29:30)
2026-02-04T02:58:00-00:00: [INFO] [ipv6] IP6: Starting
2026-02-04T02:58:00-00:00: [INFO] [ndpc6] IP6: Processing unknown option, MSB 5
2026-02-04T02:58:00-00:00: [INFO] [ndpc6] ICMP6: Unknown packet type: ty=143
2026-02-04T02:58:01-00:00: [INFO] [ndpc6] IP6: Processing unknown option, MSB 5
2026-02-04T02:58:01-00:00: [INFO] [ndpc6] ICMP6: Unknown packet type: ty=143
2026-02-04T02:58:01-00:00: [INFO] [ipv6] IP6: Started with fe80::4055:fff:fe61:2930
2026-02-04T02:58:01-00:00: [INFO] [udp] UDP layer connected on 10.0.0.2/24, fe80::4055:fff:fe61:2930/64
2026-02-04T02:58:01-00:00: [INFO] [tcp.pcb] TCP layer connected on 10.0.0.2/24, fe80::4055:fff:fe61:2930/64
2026-02-04T02:58:01-00:00: [INFO] [tcpip-stack-direct] Dual TCP/IP stack assembled: mac=42:55:0f:61:29:30,ip=10.0.0.2/24, fe80::4055:fff:fe61:2930/64
2026-02-04T02:58:01-00:00: [WARNING] [happy-eyeballs.mirage] inject was called the 2 times
2026-02-04T02:58:01-00:00: [WARNING] [application] No password specified, endpoints requiring authentication won't be accessible.
2026-02-04T02:58:01-00:00: [ERROR] [application] Neither --no-tls nor --ca-seed specified. The seed (base64 encoded) used to generate the private key for the certificate. The seed can be prepended by the type of the key (rsa or ed25519) plus a colon. For a RSA key, the user can also specify bits: "rsa:4096:foo=".
Solo5: solo5_exit(64) called
No better luck with
$ solo5-spt --mem=512 --net:service=tap-unikernel -- $(nix build --print-out-paths --no-link -f. dnsvizor.spt --allow-import-from-derivation)/dnsvizor.spt
            |      ___|
  __|  _ \  |  _ \ __ \
\__ \ (   | | (   |  ) |
____/\___/ _|\___/____/
Solo5: Bindings version v0.10.0
Solo5: Memory map: 512 MB addressable:
Solo5: reserved @ (0x0 - 0xfffff)
Solo5: text @ (0x100000 - 0x509fff)
Solo5: rodata @ (0x50a000 - 0x5c6fff)
Solo5: data @ (0x5c7000 - 0xa10fff)
Solo5: heap >= 0xa11000 < stack < 0x20000000
Segmentation fault (core dumped) solo5-spt --mem=512 --net:service=tap-unikernel -- $(nix build --print-out-paths --no-link -f. dnsvizor.spt --allow-import-from-derivation)/dnsvizor.spt
Debug
$ mkdir dump
$ solo5-hvt-debug --dumpcore=dump --mem=512 --net:service=tap-unikernel -- $(nix build --print-out-paths --no-link -f. dnsvizor.hvt --allow-import-from-derivation)/dnsvizor.hvt
$ gdb --core=dump/core.solo5-hvt.1970565 -- result/dnsvizor.hvt
Reading symbols from result/dnsvizor.hvt...
warning: found thread with pid 0, assigned replacement Target Id: LWP 1
[New LWP 1]
#0 0x0000000000466a86 in __gmpn_cpuvec_init ()
(gdb) bt
#0 0x0000000000466a86 in __gmpn_cpuvec_init ()
#1 0x00000000001013bc in _abort ()
#2 0x0000000000100b93 in cpu_trap_handler ()
#3 0x0000000000100f01 in cpu_trap_14 ()
#4 0x0000000000000001 in ?? ()
#5 0x0000000000cdfb10 in ?? ()
#6 0x0000000000000038 in ?? ()
#7 0x000000001ffffdf8 in ?? ()
#8 0x0000000000000038 in ?? ()
#9 0x0000000000000001 in ?? ()
#10 0x0000000000000000 in ?? ()
There's surely a way to load the missing symbol, but we get back to
$ gdb $(nix build --print-out-paths --no-link -f. dnsvizor.hvt --allow-import-from-derivation)/dnsvizor.hvt
Reading symbols from /nix/store/zz0pbbh42440mzcbrbl3nmbgm6xncnbn-mirage-dnsvizor-hvt-0-unstable-2025-12-17/dnsvizor.hvt...
(gdb) disassemble 0x466a86
Dump of assembler code for function __gmpn_cpuvec_init:
0x0000000000466a70 <+0>: push %rbp
0x0000000000466a71 <+1>: xor %esi,%esi
0x0000000000466a73 <+3>: mov %rsp,%rbp
0x0000000000466a76 <+6>: push %r15
0x0000000000466a78 <+8>: push %r14
0x0000000000466a7a <+10>: push %r13
0x0000000000466a7c <+12>: push %r12
0x0000000000466a7e <+14>: push %rbx
0x0000000000466a7f <+15>: sub $0x138,%rsp
0x0000000000466a86 <+22>: mov %fs:0x28,%r12
(gdb) info registers
rax 0xa0cf28 10538792
rbx 0xa0cfe8 10538984
rcx 0x0 0
rdx 0x508 1288
rsi 0xa0cfe8 10538984
rdi 0xff 255
rbp 0xa0cf40 0xa0cf40 <cpu_trap_stack+3872>
rsp 0x3ffffc10 0x3ffffc10
r8 0x0 0
r9 0x0 0
r10 0x0 0
r11 0x0 0
r12 0x527602 5404162
r13 0x52760d 5404173
r14 0x527611 5404177
r15 0x0 0
rip 0x466a86 0x466a86 <__gmpn_cpuvec_init+22>
eflags 0x10002 [ RF ]
cs 0x8 8
ss 0x10 16
ds 0x10 16
es 0x10 16
fs 0x10 16
gs 0x10 16
fs_base 0x0 0
gs_base 0x0 0
Solo5/solo5#331 explains that AFAIU,
The disassembly suggests it's happening before the calls to
Maybe in
This seems to match the fact that the perfectly working prebuilt binary on https://builds.robur.coop is built on FreeBSD for x86-64:
$ file dnsvizor.hvt
dnsvizor.hvt: ELF 64-bit LSB executable, x86-64, version 1 (FreeBSD), statically linked, interpreter /nonexistent/solo5/, for OpenBSD, stripped
a platform on which TLS is not supported by the building toolchain:
Which gives:
$ readelf -l dnsvizor.hvt | grep TLS -A1
TLS 0x0000000000000000 0x0000000000000000 0x0000000000000000
0x0000000000000000 0x0000000000000000 0x0
Whereas
It may be possible to tell the assembler or linker to disable TLS, but I have not yet found a flag to do it, only
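The `readelf` check above can be wrapped in a small helper to compare binaries quickly. This is a rough sketch under stated assumptions: the function names `tls_memsiz` and `has_nonempty_tls` are hypothetical, and the parsing assumes the default two-line program header layout that `readelf -l` prints for 64-bit ELF files (FileSiz and MemSiz on the line after the TLS entry). It reads `readelf -l` output on stdin rather than running `readelf` itself.

```shell
#!/bin/sh
# Hypothetical helpers for inspecting the TLS program header.

# Print the MemSiz field of the TLS program header found in `readelf -l`
# output on stdin, or "none" if there is no TLS header.
tls_memsiz() {
  awk '
    $1 == "TLS" {
      # The continuation line holds: FileSiz MemSiz [Flags] Align
      if ((getline line) > 0) {
        split(line, f, " ")
        print f[2]
        found = 1
        exit
      }
    }
    END { if (!found) print "none" }
  '
}

# Succeed only if a TLS header exists and its MemSiz is not all zeros.
has_nonempty_tls() {
  ht_size="$(tls_memsiz)"
  case "$ht_size" in
    none) return 1 ;;
    0x*[!0]*) return 0 ;;  # at least one non-zero hex digit after 0x
    *) return 1 ;;
  esac
}
```

For example, `readelf -l dnsvizor.hvt | has_nonempty_tls && echo "TLS segment in use"` prints nothing for a header like the all-zero one shown above.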
This is the same error I already mentioned. Not sure why you seem surprised😅. I did not post the error details because I did not expect others' help. I guess I probably should have posted the error itself and maybe my investigation. FWIW, my investigation stopped at solo5 and did not go down to gmp TLS. Great progress on the investigation! 👍
merged via #2018, closing
To build this locally, run commands like these:
nix build -f . dnsvizor.unix
nix build -f . dnsvizor.hvt
Progress
- unix target
- hvt target
- spt target
- xen target
- qubes target
- virtio target
- muen target
- version for nix, currently version = "dev"
The macosx and genode targets fail to build. They seem to be niche targets, so I do not plan to fix them. They are therefore not exposed, to avoid CI errors.
Optional TODO list
- unix
Closes #1906