Replace cudaGetDriverEntryPoint with cudaGetDriverEntryPointByVersion #4349

Merged
wujingyue merged 2 commits into main from wjy/driver on Apr 30, 2025

Conversation

@wujingyue
Collaborator

The former (cudaGetDriverEntryPoint) will be deprecated.

// cuStreamWriteValue32 and cuStreamWaitValue32 are CUDA driver API used in the
// context of synchronization in p2p communication over cudaIpcHandle
using StreamOpTest = NVFuserTest;
TEST_F(StreamOpTest, StreamWriteValue32) {

test_driver_api.cc already has a similar test

@wujingyue
Collaborator Author

!test

@github-actions

Description

  • Replace cudaGetDriverEntryPoint with cudaGetDriverEntryPointByVersion

  • Update driver API wrapper macros to include version parameter

  • Remove deprecated CUDA driver API tests


Changes walkthrough 📝

Relevant files

Enhancement

driver_api.cpp: Update driver API wrapper with version parameter
csrc/driver_api.cpp (+28/-25)

  • Update DEFINE_DRIVER_API_WRAPPER macro to include version parameter
  • Use cudaGetDriverEntryPointByVersion instead of cudaGetDriverEntryPoint

driver_api.h: Update driver API declarations with version parameter
csrc/driver_api.h (+43/-22)

  • Update DECLARE_DRIVER_API_WRAPPER macro to include version parameter
  • Specify version for each driver API in ALL_DRIVER_API_WRAPPER
  • Add conditional compilation for CUDA_VERSION >= 12000 to include newer APIs

Tests

test_multidevice_ipc.cpp: Remove deprecated CUDA driver API tests
tests/cpp/test_multidevice_ipc.cpp (+0/-21)

  • Remove tests for deprecated CUDA driver APIs cuStreamWriteValue32 and cuStreamWaitValue32
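
As a rough illustration of the pattern the walkthrough describes (an X-macro list that pairs each driver API with a version, plus a wrapper that resolves the symbol lazily), here is a minimal C++ sketch. It does not reproduce nvFuser's macros. `fake_driver`, `lookupByVersion`, `g_lookups`, and the `lazy` namespace are hypothetical stand-ins so that the example builds without CUDA; real code would call cudaGetDriverEntryPointByVersion where the stand-in is used. The macro names are borrowed from the walkthrough, but their bodies here are assumptions, and 11000 is an arbitrary version value.

```cpp
#include <cassert>
#include <cstring>

// Stand-ins for real driver functions so the sketch builds without CUDA.
namespace fake_driver {
int cuInit(unsigned int flags) { return flags == 0 ? 0 : 1; }
int cuDeviceGetCount(int* count) { *count = 2; return 0; }
}  // namespace fake_driver

// Hypothetical stand-in for cudaGetDriverEntryPointByVersion: looks up a
// symbol by name for a requested version and counts how often it is called.
int g_lookups = 0;
bool lookupByVersion(const char* symbol, void** fn, unsigned int version) {
  ++g_lookups;
  if (version < 11000) return false;  // pretend older ABIs are unavailable
  if (std::strcmp(symbol, "cuInit") == 0) {
    *fn = reinterpret_cast<void*>(&fake_driver::cuInit);
    return true;
  }
  if (std::strcmp(symbol, "cuDeviceGetCount") == 0) {
    *fn = reinterpret_cast<void*>(&fake_driver::cuDeviceGetCount);
    return true;
  }
  return false;
}

// Each entry names an API and the version whose signature the caller expects.
#define ALL_DRIVER_API_WRAPPER(fn) \
  fn(cuInit, 11000)                \
  fn(cuDeviceGetCount, 11000)

// Defines funcName(), which resolves the symbol on first use and caches it.
#define DEFINE_DRIVER_API_WRAPPER(funcName, version)                      \
  decltype(fake_driver::funcName)* funcName() {                           \
    static decltype(fake_driver::funcName)* resolved = [] {               \
      void* ptr = nullptr;                                                \
      bool ok = lookupByVersion(#funcName, &ptr, version);                \
      return ok ? reinterpret_cast<decltype(fake_driver::funcName)*>(ptr) \
                : nullptr;                                                \
    }();                                                                  \
    return resolved;                                                      \
  }

namespace lazy {
ALL_DRIVER_API_WRAPPER(DEFINE_DRIVER_API_WRAPPER)
}  // namespace lazy
```

In this sketch a caller would write `lazy::cuInit()(0)`. The first call triggers the lookup and later calls reuse the cached pointer, so adding a new API only requires one more line in the list.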

PR Reviewer Guide 🔍

Here are some key observations to aid the review process:

🧪 No relevant tests
⚡ Recommended focus areas for review

Possible Issue

The macro DEFINE_DRIVER_API_WRAPPER now requires a version argument, but the usage of this macro in the code does not provide this argument. This will likely cause a compilation error.

    decltype(::funcName)* funcName =                                  \

Documentation

The comment about the magic happening in the csrc/driver_api.h file could be improved for clarity, especially since the macro usage has changed. It should explain the new version argument and its significance.

    // How to lazily load a driver API and invoke it? Just forget about lazy loading
    // and write code as if you are using the driver API directly. Magic will
    // happen. To understand how the magic works, please refer to the cpp file's doc

Removed Test

The test StreamWriteValue32 has been removed. It should be verified if this test is no longer needed or if it should be replaced with a similar test that uses the new cudaGetDriverEntryPointByVersion function.

@wujingyue requested a review from zasdfgbnm on April 30, 2025 05:53
@wujingyue merged commit 02e8b2c into main on Apr 30, 2025
52 of 53 checks passed
@wujingyue deleted the wjy/driver branch on April 30, 2025 17:17

@xwang233
Collaborator

cc @jjsjann123

The new API cudaGetDriverEntryPointByVersion doesn't exist in CUDA 11.8, which causes the nvFuser build with CUDA 11.8 to fail. Since PyTorch still has a CUDA 11.8 wheel, do we want to support nvFuser with CUDA 11.8?

@wujingyue
Collaborator Author

> The new api cudaGetDriverEntryPointByVersion doesn't exist in cuda 11.8, and this is causing nvfuser with cuda 11.8 build to fail. Since pytorch still has cuda 11.8 wheel, do we want to support nvfuser with cuda 11.8?

Can you verify #4447?

@jjsjann123
Collaborator

> cc @jjsjann123
>
> The new api cudaGetDriverEntryPointByVersion doesn't exist in cuda 11.8, and this is causing nvfuser with cuda 11.8 build to fail. Since pytorch still has cuda 11.8 wheel, do we want to support nvfuser with cuda 11.8?

Out of curiosity, are builds against other CUDA versions only done in nightly CI? I assume GitHub CI only builds against the latest CUDA?
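
One way the CUDA 11.8 build failure raised in this thread could be avoided is a compile-time guard that falls back to cudaGetDriverEntryPoint on older toolkits. The sketch below is illustrative only and is not taken from this PR or from #4447. `resolveDriverSymbol`, `HAVE_CUDA_RUNTIME`, and `USE_ENTRY_POINT_BY_VERSION` are hypothetical names, and the 12050 threshold (CUDA 12.5) is an assumption about when the versioned entry point became available that should be checked against the toolkit release notes. The file also compiles without CUDA installed, in which case the helper simply reports failure.

```cpp
#include <cassert>

// Detect the CUDA runtime header if the compiler supports __has_include.
#if defined(__has_include)
#if __has_include(<cuda_runtime_api.h>)
#include <cuda_runtime_api.h>
#define HAVE_CUDA_RUNTIME 1
#endif
#endif
#ifndef HAVE_CUDA_RUNTIME
#define HAVE_CUDA_RUNTIME 0
#endif

// Assumption: the ByVersion entry point is available from CUDA 12.5 onward.
// Verify this threshold against the toolkit release notes before relying on it.
#if HAVE_CUDA_RUNTIME && CUDART_VERSION >= 12050
#define USE_ENTRY_POINT_BY_VERSION 1
#else
#define USE_ENTRY_POINT_BY_VERSION 0
#endif

// Hypothetical helper: resolves a driver symbol, preferring the versioned
// lookup when the toolkit provides it. Returns false if nothing was resolved.
inline bool resolveDriverSymbol(
    const char* symbol, void** fn, unsigned int version) {
  *fn = nullptr;
#if USE_ENTRY_POINT_BY_VERSION
  // Newer toolkits: request the symbol for an explicit driver API version.
  return cudaGetDriverEntryPointByVersion(
             symbol, fn, version, cudaEnableDefault) == cudaSuccess &&
      *fn != nullptr;
#elif HAVE_CUDA_RUNTIME
  // Older toolkits (for example 11.8): unversioned lookup only.
  (void)version;
  return cudaGetDriverEntryPoint(symbol, fn, cudaEnableDefault) ==
      cudaSuccess &&
      *fn != nullptr;
#else
  // No CUDA headers at all: nothing to resolve.
  (void)symbol;
  (void)version;
  return false;
#endif
}
```

The trade-off in a guard like this is that the fallback path loses the explicit version pinning, so callers on older toolkits get whatever signature the unversioned lookup returns.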
