Fix reduce to match the documentation and use numeric limits #920

Open · gevtushenko opened this issue Mar 11, 2023 · 0 comments

The documentation for cub::DeviceReduce::{Arg,}{Max,Min} guarantees that we use std::numeric_limits<T>::{lowest,max}, but what we actually use is cub::Traits<T>::{Lowest,Max}. This means that even if a user specializes std::numeric_limits<T> for a custom type, compilation still fails, because cub::Traits<T> knows nothing about the custom type. Reproducer:

#include <cub/cub.cuh>
#include <limits>

struct A {
};

namespace std {
  // Specialize numeric_limits for A
  template <>
  struct numeric_limits<A> {
    static constexpr bool is_specialized = true;
    static constexpr A min() noexcept { return A{}; }
    static constexpr A max() noexcept { return A{}; }
    static constexpr A lowest() noexcept { return A{}; }
  };
}

int main() {
  A* in{};
  A* out{};

  std::size_t temp_storage_bytes{};
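  // Fails to compile: CUB consults cub::Traits<A>, not std::numeric_limits<A>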
  cub::DeviceReduce::Max(nullptr, temp_storage_bytes, in, out, 0);
}

This results in the following error:

/cub/device/device_reduce.cuh(761): error: class "cub::Traits<A>" has no member "Lowest"

Since this guarantee is already broken, we should change the docs along with the code to use ::cuda::std::numeric_limits instead. The only thing preventing us from doing so is that ::cuda::std::numeric_limits doesn't support the extended floating-point types (__half, __nv_bfloat16). @wmaxey for visibility.

If this issue becomes critical, we'll have to wrap ::cuda::std::numeric_limits into our own trait that supports extended floating-point types. Otherwise, I'd prefer to wait until the libcu++ trait supports extended fp types. Extended fp support in libcu++ is, in turn, blocked by bf16 not being redistributable according to the EULA license (issue).
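For reference, a minimal sketch of what such a wrapper could look like. The name cub_limits, the fallback to ::cuda::std::numeric_limits, and the raw bit patterns for __half and __nv_bfloat16 are assumptions for illustration, not existing CUB or libcu++ API:

#include <cuda/std/limits>
#include <cuda_fp16.h>
#include <cuda_bf16.h>

// Hypothetical wrapper trait (name and structure are assumptions): defer to
// ::cuda::std::numeric_limits where it exists, and fill in the extended
// floating-point types by hand via their raw bit patterns.
template <class T>
struct cub_limits : ::cuda::std::numeric_limits<T> {};

template <>
struct cub_limits<__half> {
  static __host__ __device__ __half max() noexcept {
    return __half{__half_raw{0x7BFFu}};  // 65504, largest finite half
  }
  static __host__ __device__ __half lowest() noexcept {
    return __half{__half_raw{0xFBFFu}};  // -65504
  }
};

template <>
struct cub_limits<__nv_bfloat16> {
  static __host__ __device__ __nv_bfloat16 max() noexcept {
    return __nv_bfloat16{__nv_bfloat16_raw{0x7F7Fu}};  // largest finite bf16
  }
  static __host__ __device__ __nv_bfloat16 lowest() noexcept {
    return __nv_bfloat16{__nv_bfloat16_raw{0xFF7Fu}};  // negative of max()
  }
};

With something like this in place, DeviceReduce::Max could seed the reduction with cub_limits<T>::lowest() instead of cub::Traits<T>::Lowest(), restoring the documented numeric_limits customization point for user-defined types.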

jarmak-nv transferred this issue from NVIDIA/cub on Nov 8, 2023