[RFC] integrate ruapu for runtime cpu isa extension detection #4573

Open
nihui opened this issue Mar 23, 2024 · 5 comments
@nihui

nihui commented Mar 23, 2024

Hello

OpenBLAS uses operating-system-specific methods (parsing /proc/cpuinfo) and architecture-specific methods (x86 cpuid) to obtain the CPU's ISA extension information at runtime and dynamically select the optimized code path.

The neural network acceleration library ncnn ( https://github.com/Tencent/ncnn ) uses similar strategies, but these alone may not be enough to cover more systems and architectures.

Therefore, I recommend integrating ruapu ( https://github.com/nihui/ruapu ) into OpenBLAS. ruapu is a single-header C implementation. It detects CPU ISA extension support by trying instructions and catching SIGILL. This works across many operating systems, such as Linux, Windows, and macOS, and detects support more directly and accurately. Sometimes /proc/cpuinfo or x86 cpuid may lie to us ;)

Comments are welcome on whether ruapu is suitable for the project, as are any other suggestions.
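
For readers unfamiliar with the approach, the SIGILL-trapping idea described above can be sketched roughly as follows. This is a minimal illustration for POSIX systems only, not ruapu's actual code: the names `instruction_is_supported`, `probe_always_legal`, and `probe_always_illegal` are hypothetical, and the two probes merely stand in for "an instruction from the extension under test". It assumes a GCC- or Clang-compatible compiler for the inline assembly.

```c
#define _POSIX_C_SOURCE 200809L
#include <setjmp.h>
#include <signal.h>
#include <string.h>

/* Jump target used to escape from the SIGILL handler. */
static sigjmp_buf probe_env;

static void on_sigill(int signo)
{
    (void)signo;
    siglongjmp(probe_env, 1);
}

/* Hypothetical probe that never faults. */
static volatile int probe_sink;
static void probe_always_legal(void)
{
    probe_sink = 1;
}

/* Hypothetical probe that always faults. A real detector would instead
 * emit one instruction from the ISA extension being tested. */
static void probe_always_illegal(void)
{
#if defined(__x86_64__) || defined(__i386__)
    __asm__ volatile("ud2");
#elif defined(__aarch64__)
    __asm__ volatile(".inst 0x00000000"); /* permanently undefined encoding */
#else
    raise(SIGILL); /* fallback: simulate the fault on other targets */
#endif
}

/* Returns 1 if the probe ran to completion, 0 if it raised SIGILL,
 * and -1 if the signal handler could not be installed. */
static int instruction_is_supported(void (*probe)(void))
{
    struct sigaction sa, old;
    volatile int supported;

    memset(&sa, 0, sizeof(sa));
    sa.sa_handler = on_sigill;
    sigemptyset(&sa.sa_mask);
    if (sigaction(SIGILL, &sa, &old) != 0)
        return -1;

    /* Saving the signal mask (second argument 1) keeps SIGILL unblocked
     * after we jump out of the handler. */
    if (sigsetjmp(probe_env, 1) == 0) {
        probe();
        supported = 1;
    } else {
        supported = 0;
    }

    sigaction(SIGILL, &old, NULL); /* restore the previous handler */
    return supported;
}
```

Calling `instruction_is_supported(probe_always_legal)` and `instruction_is_supported(probe_always_illegal)` exercises both outcomes. The sketch is not thread-safe (it uses a single global jump buffer) and does not cover Windows, which has no `sigaction` and would need a different trapping mechanism.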

@brada4
Contributor

brada4 commented Mar 25, 2024

Just note that it cannot tell Haswell apart from Zen.

@martin-frbg
Collaborator

Thanks, interesting project for sure. (Though we tend to use cpuinfo and similar only for direct identification of the CPU model. I'm not sure whether instruction trapping offers an advantage over querying CPU capability registers for instruction set extensions?)

@nihui
Author

nihui commented Mar 26, 2024

https://github.com/nihui/ruapu?tab=readme-ov-file#features

ruapu is not intended to replace cpuinfo or the register-based methods of obtaining information; it is a complementary detection method. Its main purpose is to provide a unified way to detect extensions where conventional methods such as cpuinfo are not available, for example on the Windows Arm platform or when detecting RISC-V vendor extensions.

ruapu currently cannot identify the CPU core architecture, such as Skylake, Zen 3, or Cortex-A75. I plan to complete CPU ISA extension detection first, and then add other information as needed.

@brada4
Contributor

brada4 commented Mar 26, 2024

You always need CPUID bits.
https://en.wikipedia.org/wiki/FMA_instruction_set#CPUs_with_FMA4
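
To illustrate what "CPUID bits" means for the FMA4 case linked above, here is a small sketch of reading the FMA4 feature flag, which x86 CPUs report in CPUID leaf 0x80000001, ECX bit 16. One reading of the point being made is that a CPU may execute an instruction without advertising it, in which case a trap-based probe and this bit would disagree. The helper names `ecx_advertises_fma4` and `cpu_advertises_fma4` are hypothetical; `__get_cpuid` comes from the `<cpuid.h>` header shipped with GCC and Clang, so the sketch assumes one of those compilers.

```c
#include <stdint.h>

#if defined(__x86_64__) || defined(__i386__)
#include <cpuid.h>
#endif

/* CPUID leaf 0x80000001, ECX bit 16 advertises FMA4. */
#define CPUID_EXT_ECX_FMA4 (UINT32_C(1) << 16)

/* Decode the FMA4 flag from a raw ECX value of leaf 0x80000001. */
static int ecx_advertises_fma4(uint32_t ext_ecx)
{
    return (ext_ecx & CPUID_EXT_ECX_FMA4) != 0;
}

/* Returns 1 if the CPU advertises FMA4, 0 if it does not,
 * and -1 when built for a non-x86 target where CPUID does not exist. */
static int cpu_advertises_fma4(void)
{
#if defined(__x86_64__) || defined(__i386__)
    unsigned int eax, ebx, ecx, edx;

    /* __get_cpuid returns 0 when the requested leaf is unsupported. */
    if (!__get_cpuid(0x80000001u, &eax, &ebx, &ecx, &edx))
        return 0;
    return ecx_advertises_fma4(ecx);
#else
    return -1;
#endif
}
```

A dispatcher that wants to honor the vendor's advertised feature set would consult a check like this rather than rely only on whether the instruction happens to execute.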

@martin-frbg
Collaborator

I must admit I am not aware of the situation around Windows on Arm - currently waiting for a CI solution to become available for that platform. But from what I've seen it would probably be sufficient for OpenBLAS to support a generic ARMV8 target, and possibly detect SVE availability (later).
Finding out RISC-V extensions, in particular the presence (and version) of vector support, would indeed be a valuable feature, as this is an area where there appears to be only sketchy support depending on device and Linux kernel version.
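
As a point of comparison for the RISC-V case, the text-based approach amounts to scanning the ISA string that Linux exposes in the `isa` field of /proc/cpuinfo. The following is a minimal sketch with a hypothetical helper name, `riscv_isa_has_letter`. It assumes the string has the form `rv64imafdcv_zicsr_...`, with single-letter extensions directly after the base width and multi-letter extensions separated by underscores; it does not expand shorthand such as `g`.

```c
#include <ctype.h>
#include <stddef.h>

/* Returns 1 if the single-letter extension `ext` appears in the
 * single-letter part of a RISC-V ISA string, 0 otherwise. */
static int riscv_isa_has_letter(const char *isa, char ext)
{
    const char *p = isa;

    if (p == NULL)
        return 0;

    /* The string must start with "rv"; skipping it also avoids
     * mistaking the 'v' of the prefix for the vector extension. */
    if (tolower((unsigned char)p[0]) != 'r' ||
        tolower((unsigned char)p[1]) != 'v')
        return 0;
    p += 2;

    /* Skip the base width, e.g. "32" or "64". */
    while (isdigit((unsigned char)*p))
        p++;

    /* Single-letter extensions run until the first '_' or the end. */
    for (; *p != '\0' && *p != '_'; p++) {
        if (tolower((unsigned char)*p) == tolower((unsigned char)ext))
            return 1;
    }
    return 0;
}
```

For example, `riscv_isa_has_letter("rv64imafdcv_zicsr_zifencei", 'v')` reports the vector letter as present. The letter alone says nothing about which version of the vector specification the hardware implements, and it depends entirely on what the kernel chooses to report.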
