
[Runtime] Dynamically load cuTensorMapEncodeTiled #4330

Merged
Jokeren merged 2 commits into main from malfet/dynamic-fetch-cuTensorMapEncodeTiled
Jul 16, 2024



@malfet (Collaborator) commented Jul 16, 2024

cuTensorMapEncodeTiled is only present in CUDA-12-compatible drivers and is missing from CUDA-11 ones.

A spiritual follow-up to #2771: the symbol is now queried dynamically, and when run on an older driver the lookup returns an error.
Also fixes the occupancyMaxActiveClusters behavior when the symbol is not found: before this change it would crash with a null pointer dereference; now it should return a structured exception.
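
For readers unfamiliar with the pattern, here is a minimal sketch of the idea in Python with ctypes. It is not the code from this PR (the change itself lives in third_party/nvidia/backend/driver.c). The names `MissingDriverSymbol` and `get_function_handle` are hypothetical, and the process's own C library stands in for the CUDA driver so the sketch runs on any POSIX machine. The symbol is resolved at call time rather than link time, and a missing symbol becomes an exception instead of a null pointer that is later dereferenced.

```python
import ctypes


class MissingDriverSymbol(RuntimeError):
    """Hypothetical error type: the loaded library does not export the symbol."""


def get_function_handle(lib, name):
    """Look up `name` in `lib` at run time.

    Raises MissingDriverSymbol when the symbol is absent, so callers never
    receive a null function pointer.
    """
    try:
        # ctypes resolves the symbol lazily on attribute access and raises
        # AttributeError if the library does not export it.
        return getattr(lib, name)
    except AttributeError as exc:
        raise MissingDriverSymbol(
            f"{name} not found in the loaded library; "
            "a newer driver may be required"
        ) from exc


# Stand-in for the driver library: symbols already loaded into this
# process (POSIX only). This is an assumption made for the demo.
lib = ctypes.CDLL(None)

# A symbol that exists resolves normally.
strlen = get_function_handle(lib, "strlen")
strlen.argtypes = [ctypes.c_char_p]
strlen.restype = ctypes.c_size_t
print(strlen(b"triton"))

# A symbol that does not exist produces a structured error, not a crash.
try:
    get_function_handle(lib, "hypothetical_symbol_missing_on_old_driver")
except MissingDriverSymbol as err:
    print(f"lookup failed cleanly: {err}")
```

In the setting this PR describes, the library would be the CUDA driver and the symbol would be cuTensorMapEncodeTiled; the same guard applies to any entry point that only newer drivers export.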

The core Triton team is a small number of people, and we receive many PRs (thank
you!). To help us review your code more quickly, if you are a new
contributor (less than 3 PRs merged) we ask that you complete the following
tasks and include the filled-out checklist in your PR description.

Complete the following tasks before sending your PR, and replace [ ] with
[x] to indicate you have done them.

  • I am not making a trivial change, such as fixing a typo in a comment.

  • I have written a PR description following these rules.

  • I have run pre-commit run --from-ref origin/main --to-ref HEAD.

  • Select one of the following.

    • I have added tests.
      • /test for lit tests
      • /unittest for C++ tests
      • /python/test for end-to-end tests
    • This PR does not need a test because the issue only manifests itself on older systems, but otherwise this codepath should already be covered by tests.
  • Select one of the following.

    • I have not added any lit tests.
    • The lit tests I have added follow these best practices,
      including the "tests should be minimal" section. (Usually running Python code
      and using the instructions it generates is not minimal.)

Comment thread third_party/nvidia/backend/driver.c Outdated
@Jokeren Jokeren merged commit f9f2960 into main Jul 16, 2024
@Jokeren Jokeren deleted the malfet/dynamic-fetch-cuTensorMapEncodeTiled branch July 16, 2024 19:49
atalman pushed a commit to atalman/triton that referenced this pull request Jul 17, 2024
ptillet pushed a commit that referenced this pull request Jul 17, 2024
Cherry-pick of #4330 and #4335 to the release/3.0.x branch

---------

Co-authored-by: Nikita Shulga <2453524+malfet@users.noreply.github.com>
Co-authored-by: Keren Zhou <kerenzhou@openai.com>
bertmaher pushed a commit to bertmaher/triton that referenced this pull request Dec 10, 2024