This repository has been archived by the owner on Nov 17, 2023. It is now read-only.

[FEATURE] Add backend MXGetMaxSupportedArch() and frontend get_rtc_compile_opts() for CUDA enhanced compatibility #20443

Merged: 3 commits into apache:master on Jul 13, 2021

DickJC123
Contributor

Description

This PR makes RTC (as invoked by our Python unittests and other model scripts) work with CUDA enhanced compatibility.
As such, it is an extension of PR #19364, which brought that functionality to the C++ backend. This PR keeps test_operator_gpu.py::test_cuda_rtc from failing on systems that rely on CUDA enhanced compatibility, though those systems may not be part of upstream CI at present.

The changes of this PR are:

  • break off the calculation of the max supported arch into a separate function GetMaxSupportedArch(), and enhance it to use nvrtcGetSupportedArchs() if CUDA_VERSION >= 11.2
  • wrap GetMaxSupportedArch() as MXGetMaxSupportedArch() and add it to the C API
  • use MXGetMaxSupportedArch() in a newly created Python utility function get_rtc_compile_opts(ctx)
  • enhance test_cuda_rtc to use this new function
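
As a rough illustration of the first bullet, the sketch below picks the maximum architecture from a list of supported architectures, such as the list nvrtcGetSupportedArchs() would report. It is not the PR's C++ code: the function name max_supported_arch and the fallback parameter (standing in for whatever an older toolkit would use when the query is unavailable) are assumptions made for this example.

```python
from typing import Iterable, Optional


def max_supported_arch(supported_archs: Iterable[int],
                       fallback: Optional[int] = None) -> int:
    """Return the highest SM arch in supported_archs (hypothetical helper).

    supported_archs: arch numbers such as 70, 75, 80, 86.
    fallback: value to use when the list is empty, e.g. when the toolkit
              cannot be queried (an assumption of this sketch).
    """
    archs = list(supported_archs)
    if archs:
        return max(archs)
    if fallback is not None:
        return fallback
    raise ValueError("no supported architectures reported and no fallback given")


# Example: a toolkit reporting support up to arch 86.
print(max_supported_arch([52, 60, 70, 75, 80, 86]))
```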

Our current approach to RTC in Python code, which might fail under CUDA enhanced compatibility:

module = mx.rtc.CudaModule(source)

With this PR, the new approach that succeeds under CUDA enhanced compatibility:

ctx = mx.gpu(0)  # or any other GPU context
module = mx.rtc.CudaModule(source, options=get_rtc_compile_opts(ctx))

get_rtc_compile_opts() returns the list of compile options most appropriate for the system and the given GPU context. Currently this is a single option of the form --gpu-architecture=compute_NN or --gpu-architecture=sm_NN, as needed.
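
One way such an option list could be assembled is sketched below. This is an illustration under stated assumptions, not the code merged in this PR: the function name sketch_rtc_compile_opts, its parameters, and the rule "emit sm_NN only when the toolkit can produce SASS and supports the device's arch" are all hypothetical, and the inputs are passed in as plain integers rather than queried from MXNet.

```python
from typing import List


def sketch_rtc_compile_opts(device_cc: int,
                            max_supported_cc: int,
                            toolkit_can_emit_sass: bool) -> List[str]:
    """Build a one-element nvrtc option list (hypothetical sketch).

    device_cc: compute capability of the target GPU, e.g. 80 for A100.
    max_supported_cc: highest arch the nvrtc toolkit supports.
    toolkit_can_emit_sass: whether the toolkit should be asked for SASS
                           (an assumption supplied by the caller).
    """
    # Compile to SASS only if the toolkit can and it knows the device's arch.
    compile_to_sass = toolkit_can_emit_sass and device_cc <= max_supported_cc
    # Never request an arch newer than the toolkit supports.
    arch = min(device_cc, max_supported_cc)
    prefix = "sm" if compile_to_sass else "compute"
    return ["--gpu-architecture={}_{}".format(prefix, arch)]


print(sketch_rtc_compile_opts(80, 86, True))
print(sketch_rtc_compile_opts(90, 86, True))
```

In the second call the device is newer than the toolkit, so the sketch falls back to PTX for the newest arch the toolkit knows and leaves the final translation to the driver.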

Background

Starting with CUDA 11.1, a user can accept minor release upgrades of the CUDA toolkit (potentially picking up support for a newer GPU arch) without upgrading the driver (per https://docs.nvidia.com/deploy/cuda-compatibility/index.html). In such cases, the toolkit's nvrtc toolchain should not only compile CUDA code to PTX but also translate the PTX to SASS, since the older driver would be unable to JIT-compile to SASS for the newer GPU arch. This is controlled by the nvrtc compiler option used. For example, to compile to SASS for the Ampere A100, the option is --gpu-architecture=sm_80. To compile only to PTX, and so rely on the driver's ability to JIT-compile to SASS, the option is --gpu-architecture=compute_80.
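
To make the sm_NN versus compute_NN distinction concrete, here is a small sketch that reads an option string of the form described above and reports whether it requests SASS or PTX. The helper name describe_arch_option is hypothetical, and the pattern only covers the plain two forms mentioned in this description.

```python
import re
from typing import Tuple

# Matches only the two plain forms discussed here: sm_NN and compute_NN.
_ARCH_OPT = re.compile(r"^--gpu-architecture=(sm|compute)_(\d+)$")


def describe_arch_option(option: str) -> Tuple[str, int]:
    """Return ("SASS" or "PTX", arch number) for an arch option (hypothetical helper)."""
    match = _ARCH_OPT.match(option)
    if match is None:
        raise ValueError("unrecognized architecture option: {!r}".format(option))
    # sm_NN asks nvrtc for device code (SASS); compute_NN stops at PTX.
    kind = "SASS" if match.group(1) == "sm" else "PTX"
    return kind, int(match.group(2))


print(describe_arch_option("--gpu-architecture=sm_80"))
print(describe_arch_option("--gpu-architecture=compute_80"))
```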

Checklist

Essentials

  • PR's title starts with a category (e.g. [BUGFIX], [MODEL], [TUTORIAL], [FEATURE], [DOC], etc)
  • Changes are complete (i.e. I finished coding on this PR)
  • [~] All changes have test coverage [Verified privately, but ideally upstream's CI would have systems that stress this PR]
  • Code is well-documented

@mxnet-bot

Hey @DickJC123, thanks for submitting the PR.
All tests are already queued to run once. If tests fail, you can trigger one or more tests again with the following commands:

  • To trigger all jobs: @mxnet-bot run ci [all]
  • To trigger specific jobs: @mxnet-bot run ci [job1, job2]

CI supported jobs: [clang, website, miscellaneous, unix-cpu, centos-cpu, windows-cpu, unix-gpu, sanity, edge, windows-gpu, centos-gpu]


Note:
Only the following 3 categories can trigger CI: PR Author, MXNet Committer, Jenkins Admin.
All CI tests must pass before the PR can be merged.

@mseth10 mseth10 added the pr-awaiting-testing PR is reviewed and waiting CI build and test label Jul 12, 2021
@DickJC123 DickJC123 requested a review from ptrendx July 12, 2021 02:53
@mseth10 mseth10 added pr-work-in-progress PR is still work in progress pr-awaiting-testing PR is reviewed and waiting CI build and test and removed pr-awaiting-testing PR is reviewed and waiting CI build and test pr-work-in-progress PR is still work in progress labels Jul 12, 2021
@mseth10 mseth10 added pr-work-in-progress PR is still work in progress and removed pr-awaiting-testing PR is reviewed and waiting CI build and test labels Jul 13, 2021
@mseth10 mseth10 added pr-awaiting-testing PR is reviewed and waiting CI build and test and removed pr-work-in-progress PR is still work in progress labels Jul 13, 2021
Member

@ptrendx ptrendx left a comment

LGTM

@mseth10 mseth10 added pr-awaiting-merge Review and CI is complete. Ready to Merge and removed pr-awaiting-testing PR is reviewed and waiting CI build and test labels Jul 13, 2021
@ptrendx ptrendx merged commit 8fd17ce into apache:master Jul 13, 2021
Labels
pr-awaiting-merge Review and CI is complete. Ready to Merge
4 participants