
[Runtime] Dynamically load cuTensorMapEncodeTiled (#4330) #4339

Merged
ptillet merged 2 commits into triton-lang:release/3.0.x from atalman:fix_cuda11_cherry_pick
Jul 17, 2024

Conversation

@atalman
Collaborator

@atalman atalman commented Jul 17, 2024

Cherry-pick of #4330 and #4335 to the release/3.0.x branch.

malfet and others added 2 commits July 17, 2024 06:56
`cuTensorMapEncodeTiled` is only present in CUDA-12 compatible drivers and is
missing from CUDA-11 ones.

As a spiritual follow-up to triton-lang#2771, the symbol is now queried
dynamically; when run on an older driver, the lookup returns an error.
Also fix the behavior of `occupancyMaxActiveClusters` when the symbol is not
found: before this change it would crash with a null pointer dereference; now
it should return a structured exception.
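
For readers unfamiliar with the pattern, here is a minimal sketch of a run-time symbol lookup that reports a missing symbol as an error instead of leaving a null function pointer to be called. It is not the code from this patch: `lookup_symbol` and the `lookup_status` codes are hypothetical names for illustration, and only the POSIX `dlopen`/`dlsym`/`dlerror` calls are real.

```c
#include <dlfcn.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical status codes for this sketch (not Triton's or CUDA's). */
enum lookup_status {
    LOOKUP_OK = 0,
    LOOKUP_NO_LIBRARY = 1,
    LOOKUP_NO_SYMBOL = 2
};

/*
 * Resolve `symbol` at run time rather than at link time.
 * `library` may be NULL to search the running program and the
 * libraries it has already loaded.
 * On failure, *out is set to NULL and an error code is returned, so the
 * caller can report the problem instead of calling through a null pointer.
 */
enum lookup_status lookup_symbol(const char *library, const char *symbol,
                                 void **out)
{
    *out = NULL;

    void *handle = dlopen(library, RTLD_LAZY);
    if (handle == NULL) {
        fprintf(stderr, "cannot open library: %s\n", dlerror());
        return LOOKUP_NO_LIBRARY;
    }

    dlerror(); /* clear any stale error state before the lookup */
    void *fn = dlsym(handle, symbol);
    if (fn == NULL) {
        /* An older library that lacks the symbol ends up here. */
        fprintf(stderr, "symbol %s not found\n", symbol);
        return LOOKUP_NO_SYMBOL;
    }

    /* The handle is deliberately kept open so that `fn` stays valid. */
    *out = fn;
    return LOOKUP_OK;
}
```

Under these assumptions, a caller would do something like `lookup_symbol("libcuda.so.1", "cuTensorMapEncodeTiled", &fn)` and raise an error to the user whenever the result is not `LOOKUP_OK`, which is the behavior the commit message describes for older drivers.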
…lang#4335)

The previous patch was missing a function pointer lookup.
triton-lang@f9f2960
@atalman atalman requested a review from ptillet as a code owner July 17, 2024 14:00
@ptillet ptillet merged commit 91f24d8 into triton-lang:release/3.0.x Jul 17, 2024
