Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

hermit-sys: Network regression #128

Closed
mkroening opened this issue May 16, 2021 · 5 comments · Fixed by #135
Closed

hermit-sys: Network regression #128

mkroening opened this issue May 16, 2021 · 5 comments · Fixed by #135
Assignees

Comments

@mkroening
Copy link
Member

mkroening commented May 16, 2021

I noticed this issue while rebasing #124.

While all updates on master since opening the PR bc3db32...fc6de5c seem to be unrelated to virtio, I can't get the netbench to work anymore.

fc6de5c

In both debug and release I see two ICMPv6 Multicast Listener Report Messages v2 (143) with Wireshark, but no ARP request for 10.0.5.1.
Strange that there are different outputs on debug and release...

debug

[0][INFO] Feature set wanted by network driver are in conformance with specification.
[0][ERROR] Device features set, does not satisfy minimal features needed. Aborting!
[0][ERROR] Virtio networkd driver could not be initialized with device: 1041
[INFO] Spawn network thread with id 2
[WARN] Ethernet interface not available
Connecting to the server 10.0.5.1...
Couldn't connect to server...

release

[0][INFO] Feature set wanted by network driver are in conformance with specification.
[0][INFO] Feature set wanted by network driver are in conformance with specification.
[0][INFO] Driver found a subset of features for virtio device 1041. Features are: [VIRTIO_NET_F_MAC, VIRTIO_NET_F_STATUS, VIRTIO_F_RING_INDIRECT_DESC, VIRTIO_F_VERSION_1, VIRTIO_F_RING_PACKED]
[0][INFO] Features have been negotiated between virtio network device 1041 and driver.
[0][INFO] Created PackedVq: idx=0, size=256
[0][INFO] Created PackedVq: idx=1, size=256
[0][INFO] Network driver successfully initialized virtqueues.
[0][INFO] Device specific initialization for Virtio network defice 1041 finished
[0][INFO] Network device with id 1041, has been initialized by driver!
[0][INFO] Virtio-net link is up after initialization.
[0][INFO] Virtio network driver initialized with Virtio network device.
[0][INFO] Install virtio interrupt handler at line 11
[INFO] Spawn network thread with id 2
[INFO] MAC address 52-54-00-12-34-56
[INFO] Configure network interface with address 10.0.5.3/24
[INFO] Configure gateway with address 10.0.5.1
[INFO] MTU: 1500 bytes
Connecting to the server 10.0.5.1...
Couldn't connect to server...

bc3db32 (debug/release)

[0][INFO] Feature set wanted by network driver are in conformance with specification.
[0][INFO] Feature set wanted by network driver are in conformance with specification.
[0][INFO] Driver found a subset of features for virtio device 1041. Features are: [VIRTIO_NET_F_MAC, VIRTIO_NET_F_STATUS, VIRTIO_F_RING_INDIRECT_DESC, VIRTIO_F_VERSION_1]
[0][INFO] Features have been negotiated between virtio network device 1041 and driver.
[0][INFO] Created SplitVq: idx=0, size=256
[0][INFO] Created SplitVq: idx=1, size=256
[0][INFO] Network driver successfully initialized virtqueues.
[0][INFO] Device specific initialization for Virtio network defice 1041 finished
[0][INFO] Network device with id 1041, has been initialized by driver!
[0][INFO] Virtio-net link is up after initialization.
[0][INFO] Virtio network driver initialized with Virtio network device.
[0][INFO] Install virtio interrupt handler at line 11
[INFO] Spawn network thread with id 2
[INFO] MAC address 52-54-00-12-34-56
[INFO] Configure network interface with address 10.0.5.3/24
[INFO] Configure gateway with address 10.0.5.1
[INFO] MTU: 1500 bytes
Connecting to the server 10.0.5.1...
Connection established! Ready to send...
Sent everything!
@mkroening
Copy link
Member Author

mkroening commented May 16, 2021

Ah, indeed code related to network changed, in hermit-os/kernel@5f2d3b1...44eaa9f, but GitHub does not show the diff nicely.

Edit: Nevermind, I was just confused by the force-push in libhermit and git diff... I have no idea, what could be the cause of this issue. 🤔

@mkroening
Copy link
Member Author

mkroening commented May 26, 2021

Bisecting nightly releases shows that the issue appears when upgrading RustyHermit from nightly-2021-03-22 to nightly-2021-03-23 as demonstrated in #134. I'll have to investigate, what happened in that night (rust-lang/rust@f826641...5d04957).

@mkroening
Copy link
Member Author

mkroening commented May 27, 2021

I found the cause. Mutable noalias has been enabled by default in rust-lang/rust@39ed643. Reverting to the old behavior via RUSTFLAGS="-Zmutable-noalias=no" works around the issue and network works again, but can't be considered a proper fix.

Mutable noalias has enabled further optimizations of the reads and writes in ComCfg::dev_features, even in debug mode. Making them volatile, as they should be, solves the issue. Inserting debugging prints of self.com_cfg.device_feature_select does too, making this a prime example of a Heisenbug.

I'll look into proper abstractions for volatile memory access, so we can prevent this from happening in other places as well.

@mkroening mkroening self-assigned this May 27, 2021
@stlankes
Copy link
Contributor

May be gnoliyil/fuchsia@914bdaf helps to understand the problem.

@mkroening
Copy link
Member Author

Thanks for the Link!

I think, we are not facing a miscompilation here though. A fix for that specific issue is in place since nightly-2021-05-14 (rust-lang/rust#84958 (comment), we are on nightly-2021-05-19). Also, our issue also exists in debug.

In the long term we should make sure to make all memory-mapped I/O accesses volatile, but to get things working again as soon as possible, we might want to disable mutable-noalias until we have a proper fix. I'll create tracking issues and PRs.

bors bot added a commit that referenced this issue May 28, 2021
135: Disable `mutable-noalias` r=stlankes a=mkroening

Fixes #128.

What do you think?

Co-authored-by: Martin Kröning <[email protected]>
@bors bors bot closed this as completed in af41820 May 28, 2021
bors bot added a commit that referenced this issue Sep 16, 2021
159: build.rs: Migrate to CARGO_ENCODED_RUSTFLAGS r=stlankes a=mkroening

Fixes #158.

Cargo introduced `CARGO_ENCODED_RUSTFLAGS` in rust-lang/cargo#9601 to make setting flags less error prone – it encodes arguments separated by `0x1f` (ASCII Unit Separator), instead of white spaces (old `RUSTFLAGS`).

`CARGO_ENCODED_RUSTFLAGS` are preferred over the old `RUSTFLAGS` in cargo. For build scripts, cargo converts its `RUSTFLAGS` to `CARGO_ENCODED_RUSTFLAGS`. For unset `RUSTFLAGS` it is set to an empty string. Thus our build script would call cargo for building libhermit-rs with a set – but empty – `CARGO_ENCODED_RUSTFLAGS`, which takes precedence over our prepared `RUSTFLAGS`. Specifically this caused our `-Zmutable-noalias=no` flag to be ignored, causing the same network issue as #128 again.

This PR adjusts our build script to make direct use of `CARGO_ENCODED_RUSTFLAGS`, the new and preferred way of handling flags in build scripts. This causes our flags to be correctly handled again.

Co-authored-by: Martin Kröning <[email protected]>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging a pull request may close this issue.

2 participants