For a given function f, are there any (typically architecture-specific) heuristics for when a SIMD version of an algorithm should be chosen over a serial one?
There are definitely spots where platform-specific performance tuning can be a great idea. This could be a topic for a proper research project, potentially exposing threshold variables with `extern` and tuning our `sz_find`, `sz_copy`, and other kernels for the target platform. It gets a bit trickier if the same kernel depends on multiple variables and we are forced to grid-search...
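To make the idea of an exposed threshold concrete, here is a minimal sketch in C. It is not StringZilla's implementation or API: `find_byte`, `find_byte_serial`, `find_byte_wide`, and `find_byte_simd_threshold` are hypothetical names, the default of 32 bytes is a placeholder rather than a measured value, and the "wide" path uses a portable 8-bytes-at-a-time (SWAR) loop as a stand-in for a real SIMD kernel.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical tunable, declared `extern` as a header would expose it.
 * Inputs shorter than this many bytes take the serial path. */
extern size_t find_byte_simd_threshold;
size_t find_byte_simd_threshold = 32; /* placeholder, not a measured value */

/* Serial baseline: compares one byte per iteration. */
static const char *find_byte_serial(const char *h, size_t n, char needle) {
    for (size_t i = 0; i < n; ++i)
        if (h[i] == needle) return h + i;
    return NULL;
}

/* Stand-in for a SIMD kernel: tests 8 bytes per iteration (SWAR). */
static const char *find_byte_wide(const char *h, size_t n, char needle) {
    const uint64_t ones = 0x0101010101010101ull;
    const uint64_t highs = 0x8080808080808080ull;
    const uint64_t pattern = ones * (uint8_t)needle; /* needle in every byte */
    size_t i = 0;
    for (; i + 8 <= n; i += 8) {
        uint64_t word;
        memcpy(&word, h + i, 8);     /* unaligned-safe load */
        uint64_t x = word ^ pattern; /* matching bytes become zero */
        if ((x - ones) & ~x & highs) /* nonzero iff some byte of x is zero */
            return find_byte_serial(h + i, 8, needle);
    }
    return find_byte_serial(h + i, n - i, needle); /* tail of < 8 bytes */
}

/* Dispatcher: the length heuristic lives in a single comparison. */
const char *find_byte(const char *h, size_t n, char needle) {
    assert(h != NULL);
    return n < find_byte_simd_threshold ? find_byte_serial(h, n, needle)
                                        : find_byte_wide(h, n, needle);
}
```

Because the threshold is an ordinary global, a benchmark harness could sweep it per platform (for example 0, 16, 32, 64, ...) and keep the fastest setting. With several interacting thresholds, that sweep turns into the grid search mentioned above.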
Describe what you are looking for
For a given function `f`, are there any (typically architecture-specific) heuristics for when a SIMD version of an algorithm should be chosen over a serial one? For instance, for [...] one such heuristic could be [...].
Can you contribute to the implementation?
Is your feature request specific to a certain interface?
It applies to everything
Contact Details
No response
Is there an existing issue for this?
Code of Conduct