Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Netlink support for FreeBSD 13.2 #3201

Open
wants to merge 2 commits into
base: main
Choose a base branch
from

Conversation

ydirson
Copy link

@ydirson ydirson commented Apr 14, 2023

Closes #3194, to the point that Xen Guest Agent can works on FreeBSD.

Requires breaking-change #3367.

Downstream support in rust-netlink:

Also see #3555 for a backport to libc-0.2 branch using a feature flag to avoid the breaking change by default.

@rustbot
Copy link
Collaborator

rustbot commented Apr 14, 2023

Thanks for the pull request, and welcome! The Rust team is excited to review your changes, and you should hear from @JohnTitor (or someone else) soon.

Please see the contribution instructions for more information. Namely, in order to ensure the minimum review times lag, PR authors and assigned reviewers should ensure that the review label (S-waiting-on-review and S-waiting-on-author) stays updated, invoking these commands when appropriate:

  • @rustbot author: the review is finished, PR author should check the comments and take action accordingly
  • @rustbot review: the author is ready for a review, this PR will be queued again in the reviewer's queue

@ydirson
Copy link
Author

ydirson commented Apr 17, 2023

I can't find a description of what the failing test does; obviously it's compiling some C code (I infer it's to check that the Rust-level definition is compatible with the C definition) and was never told to use the proper C header. This seems to hint that we don't want those definitions exposed for versions which don't implement them, that is FreeBSD 13.1 and earlier.
This seems to imply we want to extend build.rs to define freebsd13_2 (and then likely other ones). A possiblity would be to change which_freebsd() to return (major, minor) instead of just major. What would be the strategy then, adding submodules of freebsd13 for minor versions, and duplicating the netlink definitions for freebsd13_2 and freebsd14 ?

For the tests themselves:

  • I already get 126 errors when running the tests on FreeBSD 13.2 without any change
  • if I add netlink/netlink.h to the list in test_freebsd that seems to otherwise address the problem exposed by the test, likely it should then be made conditional with a [freebsd13_2 || freebsd14] config predicate?
  • about the previous item, I note an existing [freebsd13] item, does that magically includes freebsd14 or would that be a bug? Edit: found the magic ;)

This also seems to hint that more CI configs will be needed.

@ydirson ydirson force-pushed the freebsd-netlink branch 4 times, most recently from 4714e89 to f64ef18 Compare April 18, 2023 09:34
@ydirson
Copy link
Author

ydirson commented Apr 18, 2023

The freebsd14 warning about NETLINK_GENERIC not being used is likely about its shadowing. Should we add a cfg restriction to the one in freensd/mod.rs? I'm not even sure it has any usefulness, and is not listed in the semver list for freebsd, so maybe we should just nuke it instead?

@JohnTitor
Copy link
Member

The freebsd14 warning about NETLINK_GENERIC not being used is likely about its shadowing. Should we add a cfg restriction to the one in freensd/mod.rs? I'm not even sure it has any usefulness, and is not listed in the semver list for freebsd, so maybe we should just nuke it instead?

Because it's already defined:

pub const NETLINK_GENERIC: ::c_int = 0;

@ydirson
Copy link
Author

ydirson commented May 6, 2023

I see it's already defined. The problem is, FreeBSD has 2 defines with same name and different values, each for a different API and defined in their own header file. The separate header files allow this to work for C by creating a poor man's namespacing, but how are we to handle that in the Rust case?

Some options I can see include:

  • change the definition of the other FreeBSD API providing the clashing define (breaking API for them, no clue how widespread its use is in Rust or even if it is used at all - can we know that?)
  • use a different name for the FreeBSD Netlink API's const: that make the FreeBSD Rust Netlink API incompatible with that of Linux-like OS, which does not look clean and has to be handled specially in crates building on this to provide the rustified API (or really on any crate using the libc Netlink API - that raises some documentation issues to make sure this shape-changing API would be properly used)
  • change the definition of the Netlink API in libc crate, possibly with a compatibility layer on linux-like for transition (which would make the Netlink API more special than the rest of libc)

What do you think?

@asomers
Copy link
Contributor

asomers commented Aug 12, 2023

Regarding NETLINK_GENERIC, ask @GuillaumeGomez . He's the one who added the definition from if_mib.h . Also, grep.app shows that his sysinfo crate is the only one that's using it as he intended. There are many more that try to use your version. Including https://github.com/bytecodealliance/rustix/blob/db143034d80dbe2dc62368b732378c89f16e7569/src/net/types.rs#L800 , which seems to be confused by the definition in libc. One version or the other will have to change. I suspect that the least overall pain to the community would come by removing or renaming the definition from if_mib.h

Also, you should move your definitions out of the freebsd13 and freebsd14 modules and into the freebsd/mod.rs file. The former two should only be used for stuff that changes between versions. Symbols that newly appear in a newer version of FreeBSD can safely be defined by libc in for all OS versions. For example, ino_t changed between versions, so it must be defined in freebsd11, freebsd12, etc.

@GuillaumeGomez
Copy link
Member

FreeBSD and its handling of APIs is a nightmare. They don't care about compatibility, they change types/constants instead of creating new ones. So if NETLINK_GENERIC exists in two different places with different values, to prevent breaking libc's API, here's what I recommend: adding docs to the current NETLINK_GENERIC telling where to look for which values and then create both constants in two submodules (which maybe mention each others?).

@asomers
Copy link
Contributor

asomers commented Aug 13, 2023

What do you mean @GuillaumeGomez ? This problem is caused by if anything too much backwards-compatibility, not too little. The NETLINK_GENERIC mib that you're using was added in 1996, and it still works. The NETLINK_GENERIC that @ydirson wants, by contrast, is based on an RFC written in 2003. By then it was too late to change the name of the old constant, but changing the name of the new constant would break programs' cross-platform compatibility. Hence isolating them in separate headers.

Is there any precedent for adding submodules to libc?

@GuillaumeGomez
Copy link
Member

I mean that if they wanted to change the value of NETLINK_GENERIC, they should have created a new constant (called NETLINK_GENERIC2 for example) instead of just changing the value. They often break API, the worst being in structs where they change the fields order or even remove/add some.

Is there any precedent for adding submodules to libc?

No clue but if there is a good enough reason, I think it could be ok (up to them though!).

@asomers
Copy link
Contributor

asomers commented Aug 13, 2023

I mean that if they wanted to change the value of NETLINK_GENERIC, they should have created a new constant (called NETLINK_GENERIC2 for example) instead of just changing the value. They often break API, the worst being in structs where they change the fields order or even remove/add some.

Are you talking about stuff like struct stat? That, I think, is really cool. FreeBSD uses ELF symbol versioning to change structs like that. That means zero problems for C code; old code will still compile, and old binaries will still run. And yet new code can take advantage of new features without introducing new syscalls like Linux's stat64. The only problems come when using FFI, like Rust does. Rust basically chose to be easily cross-compilable at the expense of not being able to take advantage of ELF symbol versions.

Is there any precedent for adding submodules to libc?

No clue but if there is a good enough reason, I think it could be ok (up to them though!).

I don't see why it couldn't be done. I think it's preferable to move the NETLINK_GENERIC mib into a submodule, because unlink the NETLINK_GENERIC protocol family it is platform-specific. If we do that, what do you think the submodule should be named, and what other symbols should go with it?

@GuillaumeGomez
Copy link
Member

Are you talking about stuff like struct stat? That, I think, is really cool. FreeBSD uses ELF symbol versioning to change structs like that. That means zero problems for C code; old code will still compile, and old binaries will still run. And yet new code can take advantage of new features without introducing new syscalls like Linux's stat64. The only problems come when using FFI, like Rust does. Rust basically chose to be easily cross-compilable at the expense of not being able to take advantage of ELF symbol versions.

I don't know enough about this to comment. From my understanding, having multiple types depending on which minimum version you support is much better and simpler. Anyway, there is no point debating over whether or not FreeBSD took the right decision here.

@ydirson
Copy link
Author

ydirson commented Aug 18, 2023

Also, you should move your definitions out of the freebsd13 and freebsd14 modules and into the freebsd/mod.rs file.

Pushed this, but:

  • it won't compile anywhere before the name clash gets solved, so I stacked a temporary commit to comment the if_mib definition out to keep the branch usable
  • libc-test now reports problems on freebsd12, which was the reason why I had moved the code in version-specific modules

@asomers
Copy link
Contributor

asomers commented Aug 18, 2023

* libc-test now reports problems on `freebsd12`, which was the reason why I had moved the code in version-specific modules

You will have to add an exception in libc-test. See copy_file_range for an example.

@ydirson
Copy link
Author

ydirson commented Aug 24, 2023

Starting to introduce namespaces in libc will have an impact on those generated tests, we'll likely need someone mastering this part?

@bors
Copy link
Contributor

bors commented Sep 29, 2023

☔ The latest upstream changes (presumably #3341) made this pull request unmergeable. Please resolve the merge conflicts.

// sys/net/if_mib.h

/// non-interface-specific
pub const IFMIB_SYSTEM: ::c_int = 1;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is a breaking change btw (moving a constant to a different module).

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, this is handled separately as #3367, on which this PR is based

@ydirson
Copy link
Author

ydirson commented Dec 18, 2023

Rebased onto 0.2.151.

Now forks for real in xen-guest-agent.
Can't wait for v0.3 really 😃

@ydirson
Copy link
Author

ydirson commented Jan 15, 2024

Since 0.2 was branched out I cannot really rebase this branch any more to keep it usable (rustix just pushed a new version demanding libc 0.2.152), so I pushed the new rebased version to https://github.com/xcp-ng/rust-libc/tree/freebsd-netlink-0.2

@ydirson
Copy link
Author

ydirson commented Jan 23, 2024

This last push removes the pcap/nflog part, which is indeed not directly related, and not necessary for our use case.

That removed part is at https://github.com/xcp-ng/rust-libc/tree/freebsd-pcap-nflog in case it is useful for a new PR.

@ydirson
Copy link
Author

ydirson commented Jan 23, 2024

... and rebased to avoid failure of obsolete freebsd12 build job

@bors
Copy link
Contributor

bors commented Feb 17, 2024

☔ The latest upstream changes (presumably #3587) made this pull request unmergeable. Please resolve the merge conflicts.

ydirson added a commit to xcp-ng/rust-libc that referenced this pull request Feb 19, 2024
To allow the "Netlink on FreeBSD" feature to be usable before 0.3 gets
released, this builds on top of both the netlink feature patch from rust-lang#3201
if explicitely requested through a feature flag, but by default
provides the ifmib constants where they are located in previous 0.2
releases.

Since this relies on creating the netlink constants in a separate
module, and it seems we cannot check those automatically, avoids
testing them.

Signed-off-by: Yann Dirson <[email protected]>
@ydirson ydirson marked this pull request as ready for review April 5, 2024 08:37
ydirson added a commit to xcp-ng/rust-libc that referenced this pull request May 22, 2024
To allow the "Netlink on FreeBSD" feature to be usable before 0.3 gets
released, this builds on top of both the netlink feature patch from rust-lang#3201
if explicitely requested through a feature flag, but by default
provides the ifmib constants where they are located in previous 0.2
releases.

Since this relies on creating the netlink constants in a separate
module, and it seems we cannot check those automatically, avoids
testing them.

Signed-off-by: Yann Dirson <[email protected]>
@tgross35 tgross35 added this to the 1.0 milestone Aug 16, 2024
@ydirson
Copy link
Author

ydirson commented Aug 28, 2024

@tgross35 What does this S-blocked tag imply? Anything that could be added to CONTRIBUTING.md?

@tgross35
Copy link
Contributor

tgross35 commented Aug 28, 2024

It just means that something needs to happen, like prerequisite work or a policy decision, before this can move forward. In this case I believe that is #3367.

There is a conflict of NETLINK_GENERIC definitions between
net/if_mib.h and netlink/netlink.h.  netlink.h is already exported in
the crate root for Linux (and those definitions are already used by at
least crates neli and netlink-packet-route), and if_mib is not much
used yet, so this moves if_mib contents into its own namespace to
leave place for netlink support on FreeBSD (rust-lang#3194).

Module definition moved to the end of file to avoid cryptic style.rs
error "constant found after module when it belongs before".

ctest as of 0.22 cannot be told a given header's symbols live in a
submodule, so let the tests ignore all of them.

Signed-off-by: Yann Dirson <[email protected]>
This is an early subset of the Netlink interface, but it
proves sufficient for monitoring changes in IP addresses,
the coverage can be extended later as needed.

Signed-off-by: Yann Dirson <[email protected]>
@ydirson
Copy link
Author

ydirson commented Aug 28, 2024

Makes sense, and now I realize that #3367 is itself blocked, and indeed needs a rebase (and maybe it should be added to the 1.0 milestone too) - but I only saw it by chance because of your post in #3248, received no notification of the tag change, and @bors did not report the need to rebase either 🤷

@tgross35
Copy link
Contributor

No need to rebase just yet, I need to read into things better and figure out what series of events needs to happen. I'll post an update when I figure that out.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Netlink support on FreeBSD 13.2+
7 participants