CudnnNormConvolution is no longer supported on NVIDIA Hopper GPUs #48203

Tom-Zheng · 2022-11-21T09:18:27Z

PR types

Bug fixes

PR changes

OPs

Describe

CudnnNormConvolution relies on cuDNN fused_ops, which are deprecated and no longer supported on Hopper GPUs (H100 and later). We add an error message that is raised when the user tries to use this OP on a Hopper device.
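
For illustration only, a guard of this kind could look like the sketch below. It is not the code in this PR: the names `kHopperComputeCapability`, `FusedOpsSupported`, and `EnforceFusedOpsSupported` are hypothetical, the compute capability is assumed to be encoded as major * 10 + minor (so Hopper, compute capability 9.0, is 90), and a plain `std::runtime_error` stands in for Paddle's own enforce macros.

```cpp
#include <sstream>
#include <stdexcept>
#include <string>

// Assumption: compute capability encoded as major * 10 + minor.
// Hopper (H100) is compute capability 9.0, i.e. 90 in this encoding.
constexpr int kHopperComputeCapability = 90;

// Hypothetical helper: true when the deprecated cuDNN fused_ops path
// can still be used on a device with the given compute capability.
inline bool FusedOpsSupported(int compute_capability) {
  return compute_capability < kHopperComputeCapability;
}

// Hypothetical guard: throws with a readable message instead of letting
// the op fail later inside cuDNN. Does nothing on supported devices.
inline void EnforceFusedOpsSupported(int compute_capability,
                                     const std::string& op_name) {
  if (!FusedOpsSupported(compute_capability)) {
    std::ostringstream msg;
    msg << op_name << " relies on cuDNN fused_ops, which is not supported "
        << "on devices with compute capability " << compute_capability
        << " (Hopper and later).";
    throw std::runtime_error(msg.str());
  }
}
```

Calling `EnforceFusedOpsSupported(80, "CudnnNormConvolution")` returns normally, while passing 90 throws. Running the check once, when the op's arguments are set up, keeps the failure early and the message specific to the op.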

We also fix the related unit tests on the new hardware by skipping the ones that use fused_ops on H100.

Note: This is a duplicate of #47089, which was accidentally closed and could not be reopened.

paddle-bot · 2022-11-21T09:18:31Z

Your PR has been submitted. Thanks for your contribution to this open-source project!
Please watch for the results of the automated CI tests. See the Paddle CI Manual for details.

Xreki

LGTM

Xreki · 2022-11-22T06:41:01Z

paddle/fluid/operators/fused/cudnn_norm_conv.cu.h

@@ -45,6 +45,14 @@ struct NormConvolutionArgs {
           int stride,
           int dilation,
           int group) {
+    PADDLE_ENFORCE_LT(


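A small question: to further optimize ResNet50 performance, several fused operators were originally implemented with the cudnnFusedOpsPlan_t interfaces. The code is in cudnn_bn_stats_finalize.cu.h, cudnn_norm_conv.cu.h, and cudnn_scale_bias_add_relu.cu.h, and the shared classes are implemented in cudnn_fusion_helper.h. Is cudnn_norm_conv.cu.h the only one that is no longer supported?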

So far on H100 we have only seen the tests related to cudnn_norm_conv.cu.h fail. The others show no problems for now.


* Reduce squeeze2_matmul_fuse_pass, flattent tests time (#47098)
* Add missing fp32 config and reduce the testing combination
* Reduce trt matmul pass test max examples
* Loose TRT fp16 tests tolerance (#47100)
* Loose TRT half test tolerance to 1e-3 (#47101)
* Loose TRT half test tolerance to 1e-3 (#47106)
* Update distributed_strategy.proto (#46531)
* Close popen pipe after used (#47053)
* Add launch_bounds (#47285)
* Fix TRT UT failures (#47488)
* Format cherry-picked commits
* CudnnNormConvolution is no longer supported on NVIDIA Hopper GPUs (#48203)
* Skip tests that use fused_ops on H100
* Add error message to FusedOps on H100

Co-authored-by: Shijie
Co-authored-by: Leo Chen
Co-authored-by: Tian Zheng

Tom-Zheng added 3 commits November 21, 2022 09:07

Skip tests that use fused_ops on H100

9644fde

Add error message to FusedOps on H100

b4a4c89

Revert change to space

cadfea7

paddle-bot bot added contributor External developers status: proposed labels Nov 21, 2022

Tom-Zheng added the NVIDIA label Nov 21, 2022

paddle-bot bot removed the status: proposed label Nov 21, 2022

onecatcn assigned Xreki Nov 21, 2022

Xreki approved these changes Nov 22, 2022


Xreki merged commit df4dfda into PaddlePaddle:develop Nov 22, 2022

zlsh80826 pushed a commit to zlsh80826/Paddle that referenced this pull request Nov 24, 2022

CudnnNormConvolution is no longer supported on NVIDIA Hopper GPUs (Pa…

12d9edb

…ddlePaddle#48203) * Skip tests that use fused_ops on H100 * Add error message to FusedOps on H100
