[Runtime] Dynamically load cuTensorMapEncodeTiled by malfet · Pull Request #4330 · triton-lang/triton

malfet · 2024-07-16T00:30:08Z

`cuTensorMapEncodeTiled` is only present in CUDA-12 compatible drivers and is missing from CUDA-11 ones.

This is a spiritual follow-up to #2771: the symbol is now queried dynamically, and when run on an older driver the lookup returns an error.
Also fix `occupancyMaxActiveClusters` behavior when the symbol is not found: before this change it would crash with a null pointer dereference; now it should return a structured exception.
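For readers unfamiliar with the pattern, here is a minimal sketch of "look the symbol up at call time and raise a structured error if it is missing". It is a Python `ctypes` analogue, not the code in this PR (the actual change is in C, in `third_party/nvidia/backend/driver.c`). The names `MissingDriverSymbol`, `load_symbol`, and `demo` are hypothetical, and the sketch assumes a POSIX system where `ctypes.CDLL(None)` exposes the symbols of the current process.

```python
import ctypes


class MissingDriverSymbol(RuntimeError):
    """Hypothetical error type: a required entry point is absent."""


def load_symbol(lib, name):
    """Resolve `name` in an already-loaded shared library at call time.

    ctypes raises AttributeError when the symbol does not exist; we
    convert that into a descriptive exception so that callers never
    end up calling through a null function pointer.
    """
    try:
        return getattr(lib, name)
    except AttributeError as exc:
        raise MissingDriverSymbol(
            f"{name} not found; the installed library may be too old"
        ) from exc


def demo():
    # Assumption: POSIX. CDLL(None) opens the running process itself,
    # which makes libc symbols such as strlen visible.
    lib = ctypes.CDLL(None)
    strlen = load_symbol(lib, "strlen")
    strlen.argtypes = [ctypes.c_char_p]
    strlen.restype = ctypes.c_size_t
    return strlen(b"triton")
```

The design point is that the binary has no link-time dependency on the optional symbol, so it still loads on an older driver and only the code path that needs the symbol fails, with an error the caller can handle.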

The core Triton is a small number of people, and we receive many PRs (thank
you!). To help us review your code more quickly, if you are a new
contributor (less than 3 PRs merged) we ask that you complete the following
tasks and include the filled-out checklist in your PR description.

Complete the following tasks before sending your PR, and replace [ ] with
[x] to indicate you have done them.

I am not making a trivial change, such as fixing a typo in a comment.
I have written a PR description following these rules.
I have run pre-commit run --from-ref origin/main --to-ref HEAD.
Select one of the following.
- I have added tests.
  - /test for lit tests
  - /unittest for C++ tests
  - /python/test for end-to-end tests
- This PR does not need a test because the issue only manifests itself on older systems, but otherwise this codepath should already be covered by tests.
Select one of the following.
- I have not added any lit tests.
- The lit tests I have added follow these best practices,
  including the "tests should be minimal" section. (Usually running Python code
  and using the instructions it generates is not minimal.)

Cherry-pick of #4330 and #4335 to the release/3.0.x branch.
Co-authored-by: Nikita Shulga <2453524+malfet@users.noreply.github.com>
Co-authored-by: Keren Zhou <kerenzhou@openai.com>

[Runtime] Dynamically load cuTensorMapEncodeTiled

739f184

`cuTensorMapEncodeTiled` is only present in CUDA-12 compatible drivers and is missing from CUDA-11 ones. Spiritual follow-up to #2771.

malfet requested a review from ptillet as a code owner July 16, 2024 00:30

malfet mentioned this pull request Jul 16, 2024

Pytorch 2.4 RC cu118 wheels do not work on old drivers pytorch/pytorch#130684

Closed

ThomasRaoux requested a review from Jokeren July 16, 2024 00:33

Jokeren requested changes Jul 16, 2024

Comment thread third_party/nvidia/backend/driver.c Outdated

Fix typo

7c569e1

Jokeren approved these changes Jul 16, 2024

Jokeren merged commit f9f2960 into main Jul 16, 2024

Jokeren deleted the malfet/dynamic-fetch-cuTensorMapEncodeTiled branch July 16, 2024 19:49

atalman mentioned this pull request Jul 17, 2024

[Runtime] Dynamically load cuTensorMapEncodeTiled (#4330) #4339

Merged

jlebar mentioned this pull request Sep 3, 2024

Build LLVMAarch64CodeGen if CMAKE_OSX_ARCHITECTURES is arm64. #4637

Merged

Jokeren merged 2 commits into main from malfet/dynamic-fetch-cuTensorMapEncodeTiled
