clean some IS_TRT_VERSION_LT(8000) #75919
Merged
Conversation
Your PR was submitted successfully. Thank you for contributing to the open source project!
luotao1 (Contributor) approved these changes on Oct 22, 2025 and left a comment:
LGTM from @yuanlehome
zrr1999 added a commit to zrr1999/Paddle that referenced this pull request on Oct 29, 2025:
…els/impl (#4)

* fix custom device save error (PaddlePaddle#75961)
* fix blas for custom device (PaddlePaddle#75969)
* Revert "Revert "Disable NVIDIA_TF32_OVERRIDE by default for better precision.…" (PaddlePaddle#75972). This reverts commit 945ea69.
* [Compat] Define the macro `CHECK` only when it is not already defined (PaddlePaddle#75963)
* [DLPack] Implement dtype and device exchange protocol (PaddlePaddle#75973)
* [CppExtension] Support `os.PathLike` in `CppExtension`/`CUDAExtension` and expose `IS_WINDOWS` to `paddle.utils.cpp_extension` (PaddlePaddle#75976)
* Support md5 checksum for API output tensor (PaddlePaddle#75835): support md5 checksum, dump the checksum to file, add a switch and flags to control precision, refine tests, add unit tests, plus build fixes
* fix shape=int for size_args_decorator (PaddlePaddle#75983)
* fix typo disable_loggling -> disable_logging (PaddlePaddle#75978)
* fix _get_arch_info (PaddlePaddle#75921)
* clean some IS_TRT_VERSION_GE(5130) (PaddlePaddle#75946)
* clean some IS_TRT_VERSION_GE(8000) (PaddlePaddle#75944)
* clean some IS_TRT_VERSION_LT(8000) (PaddlePaddle#75919)
* clean get_cuda_version < 8100 (PaddlePaddle#75895)
* clean get_cuda_version() < 11020 - part (PaddlePaddle#75618)
* clean get_cuda_version() < 11020 in test_variable_length_memory_efficient_attention.py (PaddlePaddle#75600)
* clean IS_TRT_VERSION_LT(8000) in tensorrt plugin (PaddlePaddle#75920)
* fix test_dynamic_engine (PaddlePaddle#75943)
* [Bug Fix] Fix missing header include in activation_offloader.h (PaddlePaddle#75936)
* revert_mkl_num_threads (PaddlePaddle#75985)
* [Bug Fix] Improve error handling and compatibility in TensorRT engine tests (PaddlePaddle#75948); a sketch of the rebuilt sub-network appears after this commit message:
  - In test_tensorrt_engine_instruction.cc, the test previously used TensorRT's `FullyConnected` layer directly; it now builds a Shuffle → Constant → MatrixMultiply → ElementWise → Shuffle sub-network by hand that implements the bias-added fully connected layer equivalently. This works around the limitations of the legacy FC layer in TensorRT and gives clearer control over dynamic shapes and the inference flow.
  - Each step now raises a more specific `PADDLE_ENFORCE_NOT_NULL` message, pointing out why the reshape, constant, matrix-multiply, or addition layer may have failed, so problems can be located quickly when engine building fails.
  - To follow the `ICudaEngine` API change in TensorRT 8.6, a new `IS_TRT_VERSION_GE(8600)` branch checks `getNbIOTensors()` on newer versions and `getNbBindings()` on older ones, so the test validates correctly across TensorRT versions.
  - The dynamic-shape test reports Shuffle failures more precisely, making clear that the problem lies in runtime shape binding.
  - The plugin test likewise improves the messages for plugin creation and layer insertion failures and adds the same TensorRT version compatibility check, making custom plugins easier to diagnose.
* 4th-batch-68: incorrect gradient computation in code (PaddlePaddle#75787)
* Revert test_activation_op.py to fix bug caused by commit deed9d3 (PaddlePaddle#75937); also update max_relative_error in TestSigmoid_Complex64 to improve gradient checking accuracy
* 4th-batch-19: incorrect code invocation (PaddlePaddle#75759)
* 4th-batch-17: code limits multi-device scenarios (follow-up fix) (PaddlePaddle#75959)
* [UnitTestFix No.3] fix test_conv3d_transpose_op.py (PaddlePaddle#75945)
* [Bug Fix] add missing header include in ir_context.h (PaddlePaddle#75927)
* add tensorrt 10 support int64 (PaddlePaddle#75951)
* [Compat] Try import `tvm_ffi` when enable torch proxy (PaddlePaddle#75991)
* clean pip3.8 in Dockerfile.develop.npu (PaddlePaddle#75893)
* fix masked_fill_grad value_grad bug (PaddlePaddle#75988)
* 4th-batch-20: unused variables in code (PaddlePaddle#75761)
* use op_test.get_cuda_version (PaddlePaddle#75994)
* merge ifdef PADDLE_WITH_CUDA in build_strategy.cc (PaddlePaddle#75962)
* [Cherry-pick] Optimize FlashMask v3 performance (PaddlePaddle#75737) (PaddlePaddle#75984): tune the bwd tile size (including for seqlen <= 8192), fix CUDA error 700 caused by an incorrect bwd tile size, set scheduler_needs_semaphore to true, update the fa submodule, fix a mismatched tile size in phi, and refine the bwd interface
* [Stride] Disable Split Stride Kernel (PaddlePaddle#75987)
* [Bug Fix] Fix NaN/Inf check to support float16, bfloat16, and complex types (PaddlePaddle#75935); see the dispatch sketch after this commit message:
  - In nan_inf_utils_detail.h, `TensorCheckerVisitor::apply` is split into several template overloads: integral types are still skipped; standard floating-point types keep the original check; dedicated branches are added for `phi::dtype::float16`, `phi::dtype::bfloat16`, and complex types; other unsupported types print an explicit `VLOG`. Half-precision and bfloat16 values, which `std::is_floating_point` could not identify before, are now covered by the NaN/Inf check.
  - The new `<typeinfo>`, `float16.h`, and `bfloat16.h` includes support the type aliases and `typeid` output used in these branches.
  - The check logic previously spread through `apply` is extracted into a private `do_check`, and the `DeviceContext` pointer becomes `const Context*`, reducing duplication while ensuring the context cannot be modified by mistake.
  - The new "skipping unsupported type" log helps debugging: custom or uncovered data types are reported by name in VLOG, which makes extension easier.
* [Stride] Optimizing H2D Copy by TensorIterator and OpenMP (PaddlePaddle#75192)
* [Precision Depth Alignment] implement torch compatible max_pool2d grad kernel (PaddlePaddle#75965): add torch_compatible_pool_grad, add test, rename flag
* fix to_tensor bug (PaddlePaddle#76000)
* [CINN] Fix bug of infer_symbol_shape for crop op (PaddlePaddle#75992)
* [CUDA Kernel No.93] fix the psroi_pool_grad_kernel operator (PaddlePaddle#75938): fix psroi_pool_grad_kernel.cu and its header include order
* fix win32 rms_norm. (PaddlePaddle#76007)
* Update check_approval.sh (PaddlePaddle#76012)
* [Fix] log sigmoid complex (PaddlePaddle#75953): add specialized LogSigmoidFunctor and CudaLogSigmoidFunctor for complex inputs using direct formulas for improved accuracy and stability, cache exp(-x) to reduce redundant computation, and adjust the formula for numerical stability
* [PHI] Flash Attention V3 128B aligned chunking load/store (PaddlePaddle#76003); also update the flashattn version
* [Slice] Fix big tensor (PaddlePaddle#76004)
* fix python version in ci/utils.sh (PaddlePaddle#75997)
* clean pip3.8 in Dockerfile.develop.dtk (PaddlePaddle#75738)
* fix repeat IS_TRT_VERSION_GE (PaddlePaddle#75975)
* clean IS_TRT_VERSION_GE(5000) (PaddlePaddle#75990)
* Initial plan
* Fix int32 overflow in elementwise_grad_kernel_impl.h
* Fix int32 overflow in accuracy_check and isclose kernel impl
* Fix int32 overflow in renorm, unstack, kldiv, and svdvals_grad impl
* Fix int32 overflow in gumbel_softmax and kldiv_loss impl
* Fix int32 overflow in lrn and frame kernel impl
* Fix function signatures in lrn_kernel_impl to match int64_t parameters
* Add validation checks for large tensor support in LRN kernels
* Fix int32 overflow in stft and fold/unfold kernel impl
* Fix int32 overflow in lstm, lstsq, qr_grad, and spectral_norm_grad impl
* Fix int32 overflow in warpctc, warprnnt, gru_unit and spectral_norm impl
* Fix int32 overflow in svd_grad and conv kernel impl
  (each of these overflow commits was co-authored by zrr1999; they all follow the same int -> int64_t pattern, shown in a sketch after this commit message)

---------

Co-authored-by: Yuqiang Ge <[email protected]>
Co-authored-by: Zhaowu Pan <[email protected]>
Co-authored-by: co63oc <[email protected]>
Co-authored-by: Nyakku Shigure <[email protected]>
Co-authored-by: SUN Dong <[email protected]>
Co-authored-by: HydrogenSulfate <[email protected]>
Co-authored-by: Runming Xie <[email protected]>
Co-authored-by: zhengshengning <[email protected]>
Co-authored-by: fanhaoxuee <[email protected]>
Co-authored-by: Bvicii <[email protected]>
Co-authored-by: Chen Zhiyang <[email protected]>
Co-authored-by: umiswing <[email protected]>
Co-authored-by: Eddie-Wang <[email protected]>
Co-authored-by: Zhan Rongrui <[email protected]>
Co-authored-by: wanghuancoder <[email protected]>
Co-authored-by: zyfncg <[email protected]>
Co-authored-by: xxiu1 <[email protected]>
Co-authored-by: Tao Luo <[email protected]>
Co-authored-by: Qianyue He <[email protected]>
Co-authored-by: copilot-swe-agent[bot] <[email protected]>
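The PaddlePaddle#75948 item above rebuilds the test network without the removed FullyConnected layer and branches on the engine-introspection API by TensorRT version. Below is a minimal, hedged sketch of that pattern; `network`, `engine`, `input`, `w`, `b`, `k`, `n`, and `out_dims` are assumed names, not the actual identifiers in test_tensorrt_engine_instruction.cc.

```cpp
// Sketch only: an FC-style matmul-plus-bias built from explicit layers,
// replacing the legacy IFullyConnectedLayer.
// Assumed context: nvinfer1::INetworkDefinition* network, ICudaEngine* engine,
// nvinfer1::ITensor* input, nvinfer1::Weights w /*K*N values*/, b /*N values*/,
// int k, n (FC input/output widths), nvinfer1::Dims out_dims.
nvinfer1::IShuffleLayer* reshape_in = network->addShuffle(*input);
reshape_in->setReshapeDimensions(nvinfer1::Dims2{-1, k});  // flatten to [M, K]

nvinfer1::IConstantLayer* weight =
    network->addConstant(nvinfer1::Dims2{k, n}, w);        // weight matrix [K, N]
nvinfer1::IMatrixMultiplyLayer* matmul = network->addMatrixMultiply(
    *reshape_in->getOutput(0), nvinfer1::MatrixOperation::kNONE,
    *weight->getOutput(0), nvinfer1::MatrixOperation::kNONE);

nvinfer1::IConstantLayer* bias = network->addConstant(nvinfer1::Dims2{1, n}, b);
nvinfer1::IElementWiseLayer* with_bias = network->addElementWise(
    *matmul->getOutput(0), *bias->getOutput(0),
    nvinfer1::ElementWiseOperation::kSUM);                  // add the bias row

nvinfer1::IShuffleLayer* reshape_out =
    network->addShuffle(*with_bias->getOutput(0));
reshape_out->setReshapeDimensions(out_dims);                // restore the layout

// TensorRT 8.6 replaced binding-based introspection with named IO tensors,
// hence the version branch mentioned in the commit message.
#if IS_TRT_VERSION_GE(8600)
const int num_io = engine->getNbIOTensors();
#else
const int num_io = engine->getNbBindings();
#endif
```

In the actual test, each add* call above would be wrapped in the `PADDLE_ENFORCE_NOT_NULL` checks the item describes, so a null layer pointer reports exactly which construction step failed.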
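The PaddlePaddle#75935 item splits the NaN/Inf checker into per-dtype overloads. The standalone sketch below mirrors only that dispatch idea; it is not TensorCheckerVisitor, and the handling of float16/bfloat16 (widening to float before testing) is an assumption rather than something stated in the patch.

```cpp
// Hedged sketch of per-dtype NaN/Inf dispatch: integers are skipped, standard
// floats are checked, complex values check both parts, everything else is
// only logged (the real code adds dedicated float16/bfloat16 branches).
#include <cmath>
#include <complex>
#include <cstdio>
#include <type_traits>
#include <typeinfo>

template <typename T>
struct is_std_complex : std::false_type {};
template <typename T>
struct is_std_complex<std::complex<T>> : std::true_type {};

template <typename T>
void CheckNanInf(const T* data, size_t n, const char* name) {
  if constexpr (std::is_integral_v<T>) {
    return;  // integers cannot hold NaN/Inf; skip, as in the original logic
  } else if constexpr (std::is_floating_point_v<T>) {
    for (size_t i = 0; i < n; ++i) {
      if (!std::isfinite(data[i])) {
        std::printf("NaN/Inf found in %s\n", name);
        return;
      }
    }
  } else if constexpr (is_std_complex<T>::value) {
    for (size_t i = 0; i < n; ++i) {
      if (!std::isfinite(data[i].real()) || !std::isfinite(data[i].imag())) {
        std::printf("NaN/Inf found in %s\n", name);
        return;
      }
    }
  } else {
    // The Paddle patch adds dedicated branches for phi::dtype::float16 and
    // bfloat16 here (assumed to widen to float) and VLOGs unsupported types.
    std::printf("skip NaN/Inf check for unsupported type %s\n",
                typeid(T).name());
  }
}
```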
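The run of "Fix int32 overflow in ... kernel impl" commits applies one recurring change: index and extent arithmetic inside kernel loops moves from int to int64_t so tensors with more than 2^31 - 1 elements index correctly. A tiny illustration of the pattern (hypothetical code, not taken from those kernels):

```cpp
#include <cstdint>

// Before the fix, `int` row/col arithmetic wraps around once rows * cols
// exceeds INT_MAX; with 64-bit indices the offset stays in range.
void AddBias(float* out, const float* bias, int64_t rows, int64_t cols) {
  for (int64_t r = 0; r < rows; ++r) {
    for (int64_t c = 0; c < cols; ++c) {
      out[r * cols + c] += bias[c];  // r * cols computed in 64-bit
    }
  }
}
```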
PR Category: User Experience
PR Types: Others
Description: clean some IS_TRT_VERSION_LT(8000)
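For context, IS_TRT_VERSION_LT(8000) guards code paths that only run on TensorRT older than 8.0; presumably because those versions are no longer supported, the guarded branches are dead and the guard itself can be dropped. A hypothetical before/after illustrating the kind of cleanup this PR performs (the function names are placeholders, not taken from the PR's diff):

```cpp
// Before: both branches are compiled, but the first can never be taken on a
// supported TensorRT build (>= 8.0).
#if IS_TRT_VERSION_LT(8000)
  ConvertWithLegacyTrt7Api(layer);   // dead code on TensorRT >= 8.0
#else
  ConvertWithTrt8Api(layer);
#endif

// After: the guard and the pre-8.0 branch are removed.
ConvertWithTrt8Api(layer);
```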