Check x86 features even in no_std #469

Merged
oconnor663 merged 1 commit into BLAKE3-team:master from nazar-pc:no_std-x86-feature-check
Apr 24, 2025

@nazar-pc
Contributor

@nazar-pc nazar-pc commented Apr 9, 2025

This makes it possible to use the accelerated implementations with runtime feature detection in a no_std environment.

This is nice in projects where a bare-metal implementation is needed, or where almost all of the code is already no_std and requiring std just to get a faster version of blake3 is inconvenient.

I went with the well-known and well-maintained cpufeatures crate, though eventually the previously used macros should become usable from ::core: rust-lang/rfcs#2725

Note that cpufeatures already handles cases where features are enabled at compile time, so there is no need for explicit #[cfg(target_feature = "X")] blocks anymore.
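To illustrate the idea of combining a compile-time check with runtime detection, here is a minimal sketch that uses only `core`. It is not the cpufeatures implementation and not the code in this PR; the helper name `has_sse41` is hypothetical, and SSE4.1 was chosen only because its CPUID check is simple.

```rust
// Sketch only: combines a compile-time check with a runtime CPUID query,
// using nothing outside `core`. Not the cpufeatures implementation.

/// Returns true if SSE4.1 can be used on the current CPU.
/// `has_sse41` is a hypothetical helper name used for illustration.
#[cfg(target_arch = "x86_64")]
#[allow(unused_unsafe)] // `__cpuid` is a safe fn on newer toolchains
pub fn has_sse41() -> bool {
    // If the build already enables the feature (for example via
    // `-C target-feature=+sse4.1`), no runtime query is needed.
    if cfg!(target_feature = "sse4.1") {
        return true;
    }
    // CPUID leaf 1 reports SSE4.1 support in bit 19 of ECX.
    let leaf1 = unsafe { core::arch::x86_64::__cpuid(1) };
    (leaf1.ecx >> 19) & 1 == 1
}

/// On other architectures this sketch reports the feature as unavailable.
#[cfg(not(target_arch = "x86_64"))]
pub fn has_sse41() -> bool {
    false
}
```

A caller would branch on `has_sse41()` to pick an accelerated or portable code path. Detecting AVX2 or AVX-512 correctly also requires checking that the OS saves the wider register state (via XGETBV), which this sketch does not attempt and which is one reason to rely on a maintained crate instead.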

The only thing that still depends on the std feature now is std::io.

@oconnor663
Member

Interesting, this is my first time looking at the cpufeatures crate. I'll need to read the docs.

@oconnor663
Member

oconnor663 commented Apr 20, 2025

Do you know what this line means in their docs?

NOTE: target features with an asterisk are unstable (nightly-only) and subject to change to match upstream name changes in the Rust standard library.

avx512f and avx512vl both have stars next to their names, even though they appear to work on the stable toolchain. Are the docs out of date?

@nazar-pc
Contributor Author

nazar-pc commented Apr 20, 2025

There was some drama around AVX10 versions with both 512-bit and 256-bit-only vector support, but eventually Intel decided to roll it back and settled on 512-bit only. I think it is related to that, see rust-lang/rust#138843

I don't know if avx512f and avx512vl will ever go away, but I'd speculate that they probably won't, because there are existing CPUs that don't support any version of AVX10 but do support some of those instructions, and there must be a way to detect that.

I just tested on an AMD 7970X CPU (Zen 4, AVX512-capable) and performance is the same before and after, meaning the features are detected the same way.

Maybe @tarcieri can share some more details about this if he has time.

@nazar-pc
Contributor Author

I think this is a better link, since AVX512 features are about to be stabilized: rust-lang/rust#138940

Comment thread src/platform.rs
#[cfg(blake3_avx512_ffi)]
#[cfg(any(target_arch = "x86", target_arch = "x86_64"))]
#[inline(always)]
#[allow(unreachable_code)]
Member

❤️

@oconnor663 oconnor663 merged commit ed3fd0d into BLAKE3-team:master Apr 24, 2025
64 checks passed
@nazar-pc nazar-pc deleted the no_std-x86-feature-check branch April 24, 2025 16:17
oconnor663 added a commit that referenced this pull request Apr 24, 2025
Previously "std" enabled runtime CPU feature detection on x86, but as of
#469 that's always on.
@oconnor663
Member

Follow-up doc changes in f3e0184.

@nazar-pc
Contributor Author

nazar-pc commented Jan 8, 2026

I'd appreciate a 1.8.3 release with this change included; it has been a while since the PR was merged.

@oconnor663
Member

Good idea, done!
