@megemini
Contributor

@megemini megemini commented Oct 23, 2025

PR Category

Environment Adaptation

PR Types

Improvements

Description

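This PR fixes the unreasonable installed-package layout of custom operators under setuptools 80+, and adds support for the modern pip install . --no-build-isolation installation method (the python setup.py xxx style was deprecated by setuptools long ago).

This PR does not change how custom operators are written, built, installed, or used, so it should be transparent to users. The directory layout of the installed package changes slightly, but it is equivalent as far as Python's import mechanism is concerned.

Before this change, with setuptools 79: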

`-- mix_relu_extension-0.0.0-py3.10-linux-x86_64.egg
    |-- EGG-INFO
    |   |-- PKG-INFO
    |   |-- SOURCES.txt
    |   |-- dependency_links.txt
    |   |-- native_libs.txt
    |   |-- not-zip-safe
    |   `-- top_level.txt
    |-- mix_relu_extension.py
    |-- mix_relu_extension_pd_.so
    `-- version.txt

After this change, both setuptools 79 and setuptools 80 generate the following layout:

|-- mix_relu_extension
|   |-- __init__.py
|   `-- mix_relu_extension_pd_.so
`-- mix_relu_extension-0.0.0-py3.10.egg-info
    |-- PKG-INFO
    |-- SOURCES.txt
    |-- dependency_links.txt
    |-- not-zip-safe
    `-- top_level.txt

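The pip install . --no-build-isolation installation method added by this PR generates the following layout:

|-- custom_setup_ops # a different OP, so the name differs; only the directory layout matters here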
|   |-- __init__.py
|   `-- custom_setup_ops_pd_.so
`-- custom_setup_ops-0.0.0.dist-info
    |-- INSTALLER
    |-- METADATA
    |-- RECORD
    |-- REQUESTED
    |-- WHEEL
    |-- direct_url.json
    `-- top_level.txt

Related links

@paddle-bot

paddle-bot bot commented Oct 23, 2025

Your PR has been submitted. Thanks for your contribution!
Please wait for the CI results first. See the Paddle CI Manual for details.

@paddle-bot paddle-bot bot added the contributor External developers label Oct 23, 2025
@codecov-commenter

codecov-commenter commented Oct 23, 2025

Codecov Report

❌ Patch coverage is 7.75862% with 107 lines in your changes missing coverage. Please review.
⚠️ Please upload report for BASE (develop@c7ed86a). Learn more about missing BASE report.

Files with missing lines Patch % Lines
python/paddle/utils/cpp_extension/cpp_extension.py 7.75% 107 Missing ⚠️

❌ Your patch status has failed because the patch coverage (7.75%) is below the target coverage (90.00%). You can increase the patch coverage or adjust the target coverage.

Additional details and impacted files
@@            Coverage Diff             @@
##             develop   #76008   +/-   ##
==========================================
  Coverage           ?    7.75%           
==========================================
  Files              ?        1           
  Lines              ?      116           
  Branches           ?        0           
==========================================
  Hits               ?        9           
  Misses             ?      107           
  Partials           ?        0           


Member

@SigureMo SigureMo left a comment

Master Shun, impressive work! I'll find time to take a look.

@megemini
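Contributor Author

Master Shun, impressive work! I'll find time to take a look.

Uh ... this task is yours, you know ~ 🤣

I'm not ready to submit this for review yet ~ 🤭 The tests pass, but there are probably still problems ~

The overall logic is to run custom_write_stub first, then tidy up the package files and place them in the corresponding directory ~

I tried installing locally. It built the package in the current directory but did not move it to /usr/local/lib/python3.9/dist-packages, so pip list cannot see it. Something is wrong here ~

The tests pass only because the package sits in the current directory ~ Importing this module from any other directory fails ~

With the first commit, import worked, but CI failed ~ Here is GPT-5's diagnosis: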


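Locating the cause of the failure
Your test case performs this check in setUp:

  • Use site.getsitepackages()[0] as the install directory
  • Run setup.py install --install-lib=
  • Then call os.listdir(site_dir), keep the entries containing "mix_relu_extension", require exactly 1 match, and append that entry to sys.path

Previously, our install placed three entries matching the name in site_dir:

  • mix_relu_extension.py (the bridging stub)
  • mix_relu_extension.so or mix_relu_extension_pd_.so (the shared library)
  • mix_relu_extension-0.0.0-...-egg-info (the metadata directory)

So the match count is 3, which triggers the assertion.

I'll look further into how exactly to fix this ~ 🫠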
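
The check described in that diagnosis can be sketched roughly as follows. This is a minimal illustration, not the actual test code: the helper name matching_entries is hypothetical, a temporary directory stands in for site.getsitepackages()[0], and the egg-info directory name is borrowed from the listing later in this thread.

```python
import os
import tempfile


def matching_entries(site_dir, name):
    """Return the sorted entries of site_dir whose file name contains `name`."""
    return sorted(x for x in os.listdir(site_dir) if name in x)


# Recreate the three entries named in the diagnosis inside a scratch directory
# (a stand-in for the real site-packages directory).
with tempfile.TemporaryDirectory() as site_dir:
    for entry in (
        "mix_relu_extension.py",
        "mix_relu_extension_pd_.so",
        "mix_relu_extension-0.0.0-py3.9.egg-info",
    ):
        open(os.path.join(site_dir, entry), "w").close()
    found = matching_entries(site_dir, "mix_relu_extension")

# Three entries match, so a test asserting exactly one match fails.
print(len(found))
```

Because the filter is a plain substring match, any layout that puts more than one entry with the package name into the directory changes the count the test sees.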

@megemini
Contributor Author

Update 20251028

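Changes:

  • Reworked the egg-info cleanup logic; pip show and pip list can now find the package
  • Modified the test case: assert len(custom_egg_path) == 1 became assert len(custom_egg_path) == 2. The tests pass with setuptools 80.9.0 and 57.1.0. I'm not sure why the previous test case expected 1 here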
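
Whether pip show and pip list can find a package largely comes down to whether its metadata directory is discoverable. A rough programmatic counterpart, using only the standard library's importlib.metadata (Python 3.8+), might look like the sketch below. The helper name installed_version is hypothetical, and this is not a claim about how pip itself performs the lookup.

```python
from importlib import metadata


def installed_version(dist_name):
    """Return the installed version of a distribution, or None if no metadata is found."""
    try:
        return metadata.version(dist_name)
    except metadata.PackageNotFoundError:
        return None


# Prints a version string when metadata for the package is discoverable,
# and None on a machine where it is not installed.
print(installed_version("mix_relu_extension"))
```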

Test log:


# Build, install, and test

➜  cpp_extension git:(setuptools80) ✗ python test_mixed_extension_setup.py
/usr/local/lib/python3.9/dist-packages/setuptools/_distutils/extension.py:150: UserWarning: Unknown Extension options: 'verbose'
  warnings.warn(msg)
/usr/local/lib/python3.9/dist-packages/setuptools/_distutils/dist.py:289: UserWarning: Unknown distribution option: 'metadata_version'
  warnings.warn(msg)
[2025-10-27 05:51:53,174] [    INFO] dist.py:1018 - running install
/usr/local/lib/python3.9/dist-packages/setuptools/_distutils/cmd.py:90: SetuptoolsDeprecationWarning: setup.py install is deprecated.
!!

        ********************************************************************************
        Please avoid running ``setup.py`` directly.
        Instead, use pypa/build, pypa/installer or other
        standards-based tools.

        By 2025-Oct-31, you need to update your project and remove deprecated calls
        or your builds will no longer be supported.

        See https://blog.ganssle.io/articles/2021/10/setup-py-deprecated.html for details.
        ********************************************************************************

!!
  self.initialize_options()
[2025-10-27 05:51:53,177] [    INFO] dist.py:1018 - running build
[2025-10-27 05:51:53,177] [    INFO] dist.py:1018 - running build_ext
[2025-10-27 05:51:53,197] [    INFO] build_ext.py:538 - building 'mix_relu_extension' extension
[2025-10-27 05:51:53,197] [    INFO] dir_util.py:58 - creating /paddle/Paddle/test/cpp_extension/build/mix_relu_extension/lib.linux-x86_64-cpython-39/build/mix_relu_extension/temp.linux-x86_64-cpython-39
[2025-10-27 05:51:53,198] [    INFO] spawn.py:77 - x86_64-linux-gnu-g++ -pthread -Wno-unused-result -Wsign-compare -DNDEBUG -g -fwrapv -O2 -Wall -g -fstack-protector-strong -Wformat -Werror=format-security -g -fwrapv -O2 -fPIC -I/usr/local/lib/python3.9/dist-packages/paddle/include -I/usr/local/lib/python3.9/dist-packages/paddle/include/third_party -I/usr/local/lib/python3.9/dist-packages/paddle/include/paddle/phi/api/include/compat -I/usr/local/lib/python3.9/dist-packages/paddle/include/paddle/phi/api/include/compat/torch/csrc/api/include -I/usr/lib/python3/dist-packages/paddle/include -I/usr/lib/python3/dist-packages/paddle/include/third_party -I/usr/lib/python3/dist-packages/paddle/include/paddle/phi/api/include/compat -I/usr/lib/python3/dist-packages/paddle/include/paddle/phi/api/include/compat/torch/csrc/api/include -I/usr/lib/python3.9/dist-packages/paddle/include -I/usr/lib/python3.9/dist-packages/paddle/include/third_party -I/usr/lib/python3.9/dist-packages/paddle/include/paddle/phi/api/include/compat -I/usr/lib/python3.9/dist-packages/paddle/include/paddle/phi/api/include/compat/torch/csrc/api/include -I/paddle/Paddle/test/cpp_extension -I/usr/local/lib/python3.9/dist-packages/paddle/include -I/usr/local/lib/python3.9/dist-packages/paddle/include/third_party -I/usr/local/lib/python3.9/dist-packages/paddle/include/paddle/phi/api/include/compat -I/usr/local/lib/python3.9/dist-packages/paddle/include/paddle/phi/api/include/compat/torch/csrc/api/include -I/usr/include/python3.9 -I/usr/include/python3.9 -c /paddle/Paddle/test/cpp_extension/mix_relu_and_extension.cc -o /paddle/Paddle/test/cpp_extension/build/mix_relu_extension/lib.linux-x86_64-cpython-39/build/mix_relu_extension/temp.linux-x86_64-cpython-39/mix_relu_and_extension.o -w -DPADDLE_WITH_CUSTOM_KERNEL -DPADDLE_EXTENSION_NAME=mix_relu_extension -D_GLIBCXX_USE_CXX11_ABI=1 -std=c++17
[2025-10-27 05:51:58,773] [    INFO] spawn.py:77 - x86_64-linux-gnu-g++ -pthread -Wno-unused-result -Wsign-compare -DNDEBUG -g -fwrapv -O2 -Wall -g -fstack-protector-strong -Wformat -Werror=format-security -g -fwrapv -O2 -shared -Wl,-O1 -Wl,-Bsymbolic-functions /paddle/Paddle/test/cpp_extension/build/mix_relu_extension/lib.linux-x86_64-cpython-39/build/mix_relu_extension/temp.linux-x86_64-cpython-39/mix_relu_and_extension.o -L/usr/local/lib/python3.9/dist-packages/paddle/libs -L/usr/local/lib/python3.9/dist-packages/paddle/base -L/usr/lib/x86_64-linux-gnu -Wl,--enable-new-dtags,-rpath,/usr/local/lib/python3.9/dist-packages/paddle/libs -Wl,--enable-new-dtags,-rpath,/usr/local/lib/python3.9/dist-packages/paddle/base -o build/mix_relu_extension/lib.linux-x86_64-cpython-39/mix_relu_extension.so -l:libpaddle.so
[2025-10-27 05:51:59,211] [    INFO] dist.py:1018 - running install_lib
[2025-10-27 05:51:59,219] [    INFO] file_util.py:130 - copying build/mix_relu_extension/lib.linux-x86_64-cpython-39/version.txt -> /usr/local/lib/python3.9/dist-packages
[2025-10-27 05:51:59,219] [    INFO] file_util.py:130 - copying build/mix_relu_extension/lib.linux-x86_64-cpython-39/mix_relu_extension.py -> /usr/local/lib/python3.9/dist-packages
[2025-10-27 05:51:59,219] [    INFO] file_util.py:130 - copying build/mix_relu_extension/lib.linux-x86_64-cpython-39/mix_relu_extension.so -> /usr/local/lib/python3.9/dist-packages
[2025-10-27 05:51:59,223] [    INFO] util.py:485 - byte-compiling /usr/local/lib/python3.9/dist-packages/mix_relu_extension.py to mix_relu_extension.cpython-39.pyc
[2025-10-27 05:51:59,223] [    INFO] dist.py:1018 - running install_egg_info
[2025-10-27 05:51:59,243] [    INFO] dist.py:1018 - running egg_info
[2025-10-27 05:51:59,250] [    INFO] dir_util.py:58 - creating mix_relu_extension.egg-info
[2025-10-27 05:51:59,251] [    INFO] egg_info.py:651 - writing mix_relu_extension.egg-info/PKG-INFO
[2025-10-27 05:51:59,251] [    INFO] egg_info.py:279 - writing dependency_links to mix_relu_extension.egg-info/dependency_links.txt
[2025-10-27 05:51:59,251] [    INFO] egg_info.py:279 - writing top-level names to mix_relu_extension.egg-info/top_level.txt
[2025-10-27 05:51:59,251] [    INFO] util.py:332 - writing manifest file 'mix_relu_extension.egg-info/SOURCES.txt'
[2025-10-27 05:51:59,259] [    INFO] sdist.py:203 - reading manifest file 'mix_relu_extension.egg-info/SOURCES.txt'
[2025-10-27 05:51:59,259] [    INFO] util.py:332 - writing manifest file 'mix_relu_extension.egg-info/SOURCES.txt'
[2025-10-27 05:51:59,259] [    INFO] util.py:332 - Copying mix_relu_extension.egg-info to /usr/local/lib/python3.9/dist-packages/mix_relu_extension-0.0.0-py3.9.egg-info
[2025-10-27 05:51:59,260] [    INFO] dist.py:1018 - running install_scripts
W1027 05:51:59.466156 11598 gpu_resources.cc:114] Please NOTE: device: 0, GPU Compute Capability: 6.1, Driver API Version: 12.2, Runtime API Version: 11.8
/usr/local/lib/python3.9/dist-packages/paddle/pir/math_op_patch.py:240: UserWarning: Tensor do not have 'place' interface for pir graph mode, try not to use it. None will be returned.
  warnings.warn(
I1027 05:51:59.583057 11598 pir_interpreter.cc:1524] New Executor is Running ...
I1027 05:51:59.583117 11598 pir_interpreter.cc:1547] pir interpreter is running by multi-thread mode ...
W1027 05:51:59.595265 11598 tensor.cc:241] Allocating memory through `mutable_data` method is deprecated since version 2.3, and `mutable_data` method will be removed in version 2.4! Please use `paddle::empty/full` method to create a new Tensor with allocated memory, and use data<T>() method to get the memory pointer of tensor instead. Reason: When calling `mutable_data` to allocate memory, the datatype, and data layout of tensor may be in an illegal state.
W1027 05:51:59.596131 11598 tensor.cc:241] Allocating memory through `mutable_data` method is deprecated since version 2.3, and `mutable_data` method will be removed in version 2.4! Please use `paddle::empty/full` method to create a new Tensor with allocated memory, and use data<T>() method to get the memory pointer of tensor instead. Reason: When calling `mutable_data` to allocate memory, the datatype, and data layout of tensor may be in an illegal state.
.
----------------------------------------------------------------------
Ran 1 test in 6.988s

OK

# Tests pass; inspect mix_relu_extension

➜  cpp_extension git:(setuptools80) ✗ pip show mix_relu_extension
Name: mix_relu_extension
Version: 0.0.0
Summary: 
Home-page: 
Author: 
Author-email: 
License: 
Location: /usr/local/lib/python3.9/dist-packages
Requires: 
Required-by: 

# Inspect the install directory; two directories were installed, hence `assert len(custom_egg_path) == 2` in the test case

➜  cpp_extension git:(setuptools80) ✗ ls /usr/local/lib/python3.9/dist-packages | grep mix_relu
mix_relu_extension
mix_relu_extension-0.0.0-py3.9.egg-info

# Check the setuptools version

➜  cpp_extension git:(setuptools80) ✗ pip show setuptools
Name: setuptools
Version: 80.9.0
Summary: Easily download, build, install, upgrade, and uninstall Python packages
Home-page: 
Author: 
Author-email: Python Packaging Authority <[email protected]>
License: 
Location: /usr/local/lib/python3.9/dist-packages
Requires: 
Required-by: astroid, nodeenv, wandb

# Install setuptools==57.1.0 to verify

➜  cpp_extension git:(setuptools80) ✗ pip install setuptools==57.1.0
Looking in indexes: https://mirrors.aliyun.com/pypi/simple/
Collecting setuptools==57.1.0
  Using cached https://mirrors.aliyun.com/pypi/packages/a2/e1/902fbc2f61ad6243cd3d57ffa195a9eb123021ec912ec5d811acf54a39f8/setuptools-57.1.0-py3-none-any.whl (818 kB)
Installing collected packages: setuptools
  Attempting uninstall: setuptools
    Found existing installation: setuptools 80.9.0
    Uninstalling setuptools-80.9.0:
      Successfully uninstalled setuptools-80.9.0
ERROR: pip's dependency resolver does not currently take into account all the packages that are installed. This behaviour is the source of the following dependency conflicts.
wandb 0.15.12 requires protobuf!=4.21.0,<5,>=3.15.0; python_version == "3.9" and sys_platform == "linux", but you have protobuf 6.33.0 which is incompatible.
Successfully installed setuptools-57.1.0
WARNING: Running pip as the 'root' user can result in broken permissions and conflicting behaviour with the system package manager. It is recommended to use a virtual environment instead: https://pip.pypa.io/warnings/venv

# Manually delete the leftover files in the current build directory

➜  cpp_extension git:(setuptools80) ✗ rm -rf mix_relu_extension.egg-info 
➜  cpp_extension git:(setuptools80) ✗ rm -rf build 

# Uninstall the previously installed test package

➜  cpp_extension git:(setuptools80) ✗ pip uninstall mix_relu_extension    
Found existing installation: mix_relu_extension 0.0.0
Uninstalling mix_relu_extension-0.0.0:
  Would remove:
    /usr/local/lib/python3.9/dist-packages/mix_relu_extension
    /usr/local/lib/python3.9/dist-packages/mix_relu_extension-0.0.0-py3.9.egg-info
Proceed (Y/n)? y
  Successfully uninstalled mix_relu_extension-0.0.0
WARNING: Running pip as the 'root' user can result in broken permissions and conflicting behaviour with the system package manager. It is recommended to use a virtual environment instead: https://pip.pypa.io/warnings/venv

# List the current directory

➜  cpp_extension git:(setuptools80) ✗ l
total 76K
drwxrwxr-x  3 1000 1000 4.0K Oct 27 05:52 .
drwxrwxr-x 45 1000 1000 4.0K Oct 21 05:26 ..
-rw-rw-r--  1 1000 1000  482 Jun 17  2024 CMakeLists.txt
drwxr-xr-x  2 root root 4.0K Oct 27 05:42 __pycache__
-rw-rw-r--  1 1000 1000 1.7K Sep  6 05:10 cpp_extension_setup.py
-rw-rw-r--  1 1000 1000 2.4K Sep  6 05:10 custom_extension.cc
-rw-rw-r--  1 1000 1000 1006 Dec 25  2023 custom_power.h
-rw-rw-r--  1 1000 1000 1.9K Aug 12  2024 custom_relu_forward.cu
-rw-rw-r--  1 1000 1000  780 Dec 25  2023 custom_sub.cc
-rw-rw-r--  1 1000 1000 5.7K Oct 21 05:26 mix_relu_and_extension.cc
-rw-rw-r--  1 1000 1000 1.1K Aug 12  2024 mix_relu_and_extension_setup.py
-rw-rw-r--  1 1000 1000 5.8K Sep  6 05:10 test_cpp_extension_jit.py
-rw-rw-r--  1 1000 1000 5.7K Sep  6 05:10 test_cpp_extension_setup.py
-rw-rw-r--  1 1000 1000 7.1K Oct 27 05:49 test_mixed_extension_setup.py
-rw-rw-r--  1 1000 1000 2.7K Sep  6 05:10 utils.py

# Run the test again

➜  cpp_extension git:(setuptools80) ✗ python test_mixed_extension_setup.py
/usr/lib/python3.9/distutils/extension.py:131: UserWarning: Unknown Extension options: 'verbose'
  warnings.warn(msg)
/usr/lib/python3.9/distutils/dist.py:274: UserWarning: Unknown distribution option: 'metadata_version'
  warnings.warn(msg)
W1027 05:53:35.374228 12145 gpu_resources.cc:114] Please NOTE: device: 0, GPU Compute Capability: 6.1, Driver API Version: 12.2, Runtime API Version: 11.8
/usr/local/lib/python3.9/dist-packages/paddle/pir/math_op_patch.py:240: UserWarning: Tensor do not have 'place' interface for pir graph mode, try not to use it. None will be returned.
  warnings.warn(
I1027 05:53:35.432821 12145 pir_interpreter.cc:1524] New Executor is Running ...
I1027 05:53:35.432870 12145 pir_interpreter.cc:1547] pir interpreter is running by multi-thread mode ...
W1027 05:53:35.444814 12145 tensor.cc:241] Allocating memory through `mutable_data` method is deprecated since version 2.3, and `mutable_data` method will be removed in version 2.4! Please use `paddle::empty/full` method to create a new Tensor with allocated memory, and use data<T>() method to get the memory pointer of tensor instead. Reason: When calling `mutable_data` to allocate memory, the datatype, and data layout of tensor may be in an illegal state.
W1027 05:53:35.445716 12145 tensor.cc:241] Allocating memory through `mutable_data` method is deprecated since version 2.3, and `mutable_data` method will be removed in version 2.4! Please use `paddle::empty/full` method to create a new Tensor with allocated memory, and use data<T>() method to get the memory pointer of tensor instead. Reason: When calling `mutable_data` to allocate memory, the datatype, and data layout of tensor may be in an illegal state.
.
----------------------------------------------------------------------
Ran 1 test in 1.372s

OK

# Check the current setuptools version

➜  cpp_extension git:(setuptools80) ✗ pip show setuptools
Name: setuptools
Version: 57.1.0
Summary: Easily download, build, install, upgrade, and uninstall Python packages
Home-page: https://github.com/pypa/setuptools
Author: Python Packaging Authority
Author-email: [email protected]
License: UNKNOWN
Location: /usr/local/lib/python3.9/dist-packages
Requires: 
Required-by: astroid, nodeenv, wandb

# Inspect the installed test package

➜  cpp_extension git:(setuptools80) ✗ pip show mix_relu_extension          
Name: mix-relu-extension
Version: 0.0.0
Summary: UNKNOWN
Home-page: UNKNOWN
Author: 
Author-email: 
License: UNKNOWN
Location: /usr/local/lib/python3.9/dist-packages
Requires: 
Required-by: 


Could you check whether there is anything I should watch out for?

@SigureMo

Member

@SigureMo SigureMo left a comment

Please explain the approach in the PR description first; this task requires an RFC.

      x for x in os.listdir(site_dir) if 'custom_cpp_extension' in x
  ]
- assert len(custom_egg_path) == 1, (
+ assert len(custom_egg_path) == 2, (
Member

The tests pass with setuptools 80.9.0 and 57.1.0. I'm not sure why the previous test case expected 1 here.

Odd. Were these unit tests not being run in CI?

# Setting metadata_version >= 2.1 (introduced in PEP 566) forces setuptools
# to create .dist-info directories instead of .egg-info, which allows pip
# to properly detect and list installed packages via `pip list`.
# Version 2.1 is sufficient for this purpose and maintains compatibility.
Member

➜ cpp_extension git:(setuptools80) ✗ ls /usr/local/lib/python3.9/dist-packages | grep mix_relu
mix_relu_extension
mix_relu_extension-0.0.0-py3.9.egg-info

But this still shows egg-info rather than dist-info.

# Write stub; it will reference the _pd_ renamed resource at import time
custom_write_stub(so_name, pyfile)
except Exception as e:
print(f"Warning: failed to generate python api file: {e}")
Member

Should this be only a warning? Shouldn't it raise an error directly?
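
One way to follow this suggestion is to re-raise with context instead of printing a warning. The sketch below is an assumption about the shape only: generate_stub, its write_stub parameter, and broken_writer are hypothetical stand-ins, not Paddle's actual code.

```python
def generate_stub(write_stub, so_name, pyfile):
    """Call write_stub and surface any failure as an error instead of a warning."""
    try:
        write_stub(so_name, pyfile)
    except Exception as e:
        # Chain the original exception so the root cause stays in the traceback.
        raise RuntimeError(f"failed to generate python api file: {e}") from e


def broken_writer(so_name, pyfile):
    # Hypothetical writer that always fails, used only to demonstrate the behavior.
    raise OSError("disk full")


try:
    generate_stub(broken_writer, "mix_relu_extension_pd_.so", "mix_relu_extension.py")
except RuntimeError as err:
    message = str(err)
    print(message)
```

Failing the build at this point avoids installing a package whose Python stub is missing, which would otherwise only show up later as an import error.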

def finalize_options(self) -> None:
super().finalize_options()
try:
import sys
Member

Put imports at the top of the file.

try:
candidates.extend(site.getsitepackages())
except Exception:
pass
Member

Are these try-except blocks necessary? This feels very much like AI-generated style.

@megemini
Contributor Author

megemini commented Nov 3, 2025

Update 20251103

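The old code could not be installed with pip install . This update fixes the wheel-based pip install . path, so there is no longer any need to distinguish setuptools versions:

  • Because wheel installs are now supported, the test uses custom_install_path instead of custom_egg_path, with len(custom_install_path) == 2
  • The old bdist_egg.write_stub = custom_write_stub override is no longer actually needed; it is kept in case other code or third parties call it

RFC: PaddlePaddle/community#1174

The test and verification process is described in the RFC ~

Odd. Were these unit tests not being run in CI?

The old test was fine. When I made my earlier change I didn't notice that it was in fact already producing two directories.

But this still shows egg-info rather than dist-info.

python setup.py install generates egg-info, while pip install . generates dist-info. An AI suggested adding this setting earlier, but it turned out to be of little use ...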
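
Given that distinction, the suffix of the metadata directory hints at how a package was installed. A small sketch follows; the helper name metadata_kind is hypothetical, and the directory names are taken from the listings earlier in this thread.

```python
def metadata_kind(entry):
    """Classify a site-packages entry by its metadata directory suffix."""
    if entry.endswith(".dist-info"):
        return "dist-info"  # written by wheel-based installs such as `pip install .`
    if entry.endswith(".egg-info"):
        return "egg-info"  # written by `python setup.py install`
    return None  # not a metadata directory


for entry in (
    "custom_setup_ops-0.0.0.dist-info",
    "mix_relu_extension-0.0.0-py3.9.egg-info",
    "mix_relu_extension",
):
    print(entry, "->", metadata_kind(entry))
```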

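P.S. I took many detours over the past two days trying to stay compatible with different setuptools versions, and then realized that setuptools version compatibility does not need to be considered at all; fixing the wheel install is enough ... The old code presumably did not take wheel installs into account at the time, which is why it overrode bdist_egg.write_stub. That now looks entirely unnecessary ...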

@SigureMo

@SigureMo
Member

SigureMo commented Nov 4, 2025

I tested locally with the demo case and it looks fine at first glance. I'll find time to review the implementation ~

@SigureMo SigureMo changed the title from "【Hackathon 9th No.109 】Adapt the custom operator mechanism to Setuptools 80+" to "【Hackathon 9th No.109】Adapt the custom operator mechanism to Setuptools 80+" Nov 4, 2025
@SigureMo
Member

SigureMo commented Nov 7, 2025

I'll review this weekend!

Member

@SigureMo SigureMo left a comment

Overall this looks OK to me. Unreachable logic that ought to be cleaned up can be cleaned up; at worst we revert.

Later today or on Monday I'll test again in a scenario that relies heavily on custom operators, such as FastDeploy. If there are no problems, I think we can merge directly.

# to properly detect and list installed packages via `pip list`.
# Version 2.1 is sufficient for this purpose and maintains compatibility.
if 'metadata_version' not in attr:
attr['metadata_version'] = '2.1'
Member

My concern here is that the latest metadata version has already reached 2.5 [1], and 2.4 is basically widespread. Would hard-coding 2.1 here actually hold back modern metadata?

By the way, is generating .egg-info, as mentioned here, the behavior below 2.1 or the behavior at 2.1 and above (my understanding is the former)?

Footnotes

  1. https://packaging.python.org/en/latest/specifications/core-metadata/

Contributor Author

It's hard to say right now which setuptools versions this setting affects. My understanding is that 2.1 is the recommended minimum, so I added a check that sets it only when the parameter is absent; it mainly targets older setuptools ~ If that really doesn't work, shall we delete this part for now?

Member

@SigureMo SigureMo Nov 8, 2025

OK. You can delete it and test on an older version. If it works (the import succeeds), either .egg-info or .dist-info is fine.

3) rename the compiled library to *_pd_.so to avoid shadowing the python stub
Note: This is primarily for legacy 'python setup.py install' usage.
For modern 'pip install', the BdistWheelCommand handles file layout.
Member

So BdistWheelCommand doesn't need to extend any behavior, right?

Contributor Author

No longer needed ~ This comment is left over from when I was debugging against different setuptools versions ~ I'll delete this part of the comment ~

@SigureMo SigureMo changed the title from "【Hackathon 9th No.109】Adapt the custom operator mechanism to Setuptools 80+" to "【Hackathon 9th No.109】[CppExtension] Support build Custom OP in setuptools 80+ and support install extension via pip install . --no-build-isolation" Nov 7, 2025
@SigureMo
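Member

SigureMo commented Nov 7, 2025

There still seem to be some problems with FastDeploy.

https://github.com/PaddlePaddle/FastDeploy/blob/80aedb82cee1b46ad1b720d502baedc004e54285/build.sh#L73-L74

The suffix on these two lines can be removed to hide the difference in directory layout (later we may need an if to check which of the two directories exists), but after that change the stub is, for some reason, generated into the source tree.

The contents of fastdeploy/model_executor/ops/gpu/__init__.py were replaced with the stub contents.

@megemini Master Shun, could you look at this when you have time? The directory layout does not need to stay exactly the same; the build script can be adjusted.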

@megemini
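Contributor Author

megemini commented Nov 8, 2025

There still seem to be some problems with FastDeploy.

https://github.com/PaddlePaddle/FastDeploy/blob/80aedb82cee1b46ad1b720d502baedc004e54285/build.sh#L73-L74

The suffix on these two lines can be removed to hide the difference in directory layout (later we may need an if to check which of the two directories exists), but after that change the stub is, for some reason, generated into the source tree.

The contents of fastdeploy/model_executor/ops/gpu/__init__.py were replaced with the stub contents.

@megemini Master Shun, could you look at this when you have time? The directory layout does not need to stay exactly the same; the build script can be adjusted.

Is there a test case or a way to test this? I couldn't install FD here earlier because my CUDA version is too old ...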

@SigureMo
Member

SigureMo commented Nov 8, 2025

Is there a test case or a way to test this? I couldn't install FD here earlier because my CUDA version is too old ...

I think it's worth looking at how it is used.

https://github.com/PaddlePaddle/FastDeploy/blob/80aedb82cee1b46ad1b720d502baedc004e54285/build.sh#L157

This line uses --install-lib,

https://github.com/PaddlePaddle/FastDeploy/blob/80aedb82cee1b46ad1b720d502baedc004e54285/build.sh#L78

and this line does a cp.

Could you construct a case based on this?

@SigureMo
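Member

SigureMo commented Nov 8, 2025

Unreachable logic that ought to be cleaned up can be cleaned up

Given scenarios like FD that depend on the directory layout, other downstream ecosystem libraries may have similar usage. On Monday I'll ask colleagues about the impact of this change; the cleanup can wait for now.

If the impact really can't be contained, we may still need the old logic for older versions (79 and below) and the new logic for newer versions (80+). As I understand it, that means adding an if in front of the new code.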
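
The fallback described here, an if in front of the new code, could take roughly the following shape. It is only a sketch: use_new_layout is a hypothetical helper, and it assumes the major version is the first dot-separated component of the setuptools version string.

```python
def use_new_layout(setuptools_version):
    """Return True for setuptools 80+, False for 79 and below."""
    major = int(setuptools_version.split(".")[0])
    return major >= 80


# In real code the string would come from `setuptools.__version__`.
for version in ("57.1.0", "79.0.1", "80.9.0"):
    print(version, "->", "new logic" if use_new_layout(version) else "old logic")
```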

Member

Ah, that was a quick delete. No worries, I'll ask the others on Monday and then confirm; it's only a revert anyway.

Member

I forgot to mention this yesterday: the conclusion from yesterday's discussion with @zyfncg is not to add separate compatibility logic for 79 and below versus 80+ for now, and simply to fix the FD problem on its own. If other scenarios report problems after the FD problem is fixed, we will add the separate compatibility logic then.

Contributor Author

One thing I don't understand: why does FD copy the installed package into the FD directory? To make it easier to uninstall everything together later? That can't be it, since it uses cp rather than mv ...

If FD is to be changed, copying both of the generated directories over together should be enough ~

Member

As I understand it, these fastdeploy_ops custom operators are only an intermediate artifact (operator implementations only) and are ultimately bundled into the fastdeploy package (which includes the Python implementation) for release.

Contributor Author

So in effect these operators are managed together under the FD namespace, rather than as individual separate packages?

OK ~ So what else do we need to do here?

Contributor Author

The contents of fastdeploy/model_executor/ops/gpu/__init__.py were replaced with the stub contents.

This one? Do we need to change things on our side to stay compatible with FD's current implementation?

Member

This one? Do we need to change things on our side to stay compatible with FD's current implementation?

It doesn't have to be handled in the framework; FD's build script can be modified.

Contributor Author

Yes ~ That's what I meant too: this PR shouldn't need changes, and it's better to modify the script on the FD side ~ Shall I open a PR against FD?

Member

Yes. If you're sure the bug isn't caused by this PR, go ahead and open a PR there. That PR should support both layouts, though, because we can't guarantee that everyone uses develop.
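
A build script that has to work with both layouts could probe for whichever one exists. The following is a sketch under assumptions: find_ops_dir is a hypothetical helper, and the two layouts are the ones shown in the PR description (a NAME-*.egg directory before this change, a NAME package directory after it). FD's real script is a shell script; Python is used here only for illustration.

```python
import os
import tempfile


def find_ops_dir(install_dir, name):
    """Return the directory that holds the built ops, whichever layout is present."""
    package_dir = os.path.join(install_dir, name)
    if os.path.isdir(package_dir):
        return package_dir  # new layout: NAME/__init__.py + NAME_pd_.so
    eggs = sorted(
        entry
        for entry in os.listdir(install_dir)
        if entry.startswith(name + "-") and entry.endswith(".egg")
    )
    if eggs:
        return os.path.join(install_dir, eggs[0])  # old layout: NAME-<version>-<tags>.egg/
    raise FileNotFoundError(f"no installed layout found for {name!r} in {install_dir}")


# Build one scratch directory per layout and probe each of them.
with tempfile.TemporaryDirectory() as new_root, tempfile.TemporaryDirectory() as old_root:
    os.mkdir(os.path.join(new_root, "mix_relu_extension"))
    os.mkdir(os.path.join(old_root, "mix_relu_extension-0.0.0-py3.10-linux-x86_64.egg"))
    new_found = os.path.basename(find_ops_dir(new_root, "mix_relu_extension"))
    old_found = os.path.basename(find_ops_dir(old_root, "mix_relu_extension"))

print(new_found)
print(old_found)
```

Checking the package directory first means a machine on develop takes the new path, while an older Paddle install still resolves to the egg directory.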

Contributor Author

--install-lib did have a problem before; the latest commit from a couple of days ago, ba9349b, already fixes it ~

I'll open a PR on the FD side in the next couple of days ~

so_name = os.path.basename(so_path)
build_dir = os.path.dirname(so_path)
# The package name equals distribution name
pkg_name = self.distribution.get_name()
Member

@SigureMo SigureMo Nov 16, 2025

By the way, could the path here be obtained from extension.name, so that it isn't tightly bound to the package name, for example self.get_ext_fullpath(self.extensions[0].name)? We may later want to support forms like extension.name = pkg.submodule.ops, although for now extension.name is presumably overridden by distribution.name. (This description is somewhat inaccurate; see below ~)

Member

@SigureMo SigureMo Nov 16, 2025

The concrete use case looks like this ~

# setup.py
from paddle.utils.cpp_extension import CppExtension, setup

setup(
    name='custom_setup_ops.my_ops.custom_relu',
    ext_modules=CppExtension(
        sources=['my_ops/relu_cpu.cc']
    )
)
# pyproject.toml
[project]
name = "custom-setup-ops"  # distribution.name may return a non-normalized name such as custom-setup-ops
version = "0.0.0"

[tool.setuptools.packages.find]
exclude = []

[build-system]
requires = ["setuptools>=61.0.0,<80.0.0"]
build-backend = "setuptools.build_meta"
// my_ops/relu_cpu.cc
#include "paddle/extension.h"

#include <algorithm>
#include <vector>

#define CHECK_INPUT(x) PD_CHECK(x.is_cpu(), #x " must be a CPU Tensor.")

template <typename data_t>
void relu_cpu_forward_kernel(const data_t* x_data,
                             data_t* out_data,
                             int64_t x_numel) {
  for (int64_t i = 0; i < x_numel; ++i) {
    out_data[i] = std::max(static_cast<data_t>(0.), x_data[i]);
  }
}

std::vector<paddle::Tensor> ReluCPUForward(const paddle::Tensor& x) {
  CHECK_INPUT(x);

  auto out = paddle::empty_like(x);

  PD_DISPATCH_FLOATING_TYPES(
      x.type(), "relu_cpu_forward_kernel", ([&] {
        relu_cpu_forward_kernel<data_t>(
            x.data<data_t>(), out.data<data_t>(), x.numel());
      }));

  return {out};
}

// Shape inference
std::vector<std::vector<int64_t>> ReluInferShape(std::vector<int64_t> x_shape) {
  return {x_shape};
}

// Dtype inference
std::vector<paddle::DataType> ReluInferDtype(paddle::DataType x_dtype) {
  return {x_dtype};
}

PD_BUILD_OP(custom_relu)
    .Inputs({"X"})
    .Outputs({"Out"})
    .SetKernelFn(PD_KERNEL(ReluCPUForward))
    .SetInferShapeFn(PD_INFER_SHAPE(ReluInferShape))
    .SetInferDtypeFn(PD_INFER_DTYPE(ReluInferDtype));

The pyproject.toml is for uv build; python setup.py bdist_wheel reproduces the problem as well ~

The currently packaged directory layout is:

dist/custom_setup_ops/my_ops
├── custom_relu.so  # the .so was not renamed
├── custom_setup_ops.py  # the generated stub name should be the ext name's split(".")[-1], not the distribution name
└── version.txt

The expected layout is:

dist/custom_setup_ops/my_ops
├── custom_relu_pd_.so
├── custom_relu.py
└── version.txt
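
The expected naming can be derived from the dotted extension name alone. A minimal sketch, assuming the stub is named after the last component of the extension name and the library gets the _pd_ suffix before its file extension; stub_and_lib_names is a hypothetical helper, not part of Paddle.

```python
import os


def stub_and_lib_names(ext_name, lib_suffix=".so"):
    """Map a dotted extension name to (package dir, stub file name, renamed library name)."""
    *package_parts, module = ext_name.split(".")
    package_dir = os.path.join(*package_parts) if package_parts else ""
    return package_dir, f"{module}.py", f"{module}_pd_{lib_suffix}"


package_dir, stub, lib = stub_and_lib_names("custom_setup_ops.my_ops.custom_relu")
print(package_dir, stub, lib)
```

A name without dots, such as mix_relu_extension, yields an empty package directory and the same stub and library naming as the flat case earlier in this thread.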

Also note that this part may need to be adjusted as well ~

def custom_write_stub(resource, pyfile):
    ...
    with open(pyfile, 'w') as f:
        f.write(
            _stub_template.format(
                resource=os.path.basename(resource),  # the prefix needs to be stripped here
                custom_api='\n\n'.join(api_content)
            )
        )

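We already assume len(self.extensions) == 1, so self.extensions[0] can be used directly and the multiple-extension case does not need to be considered. As I understand it, this is a minor tweak rather than a major change.

Although this is outside the scope of Setuptools 80+ support, the adjustment looks like a generally useful improvement. Master Shun, could you please tweak this a bit more ~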

cc. @zhangbo9674

ext_name = (
self.extensions[0].name
if self.extensions
else self.distribution.get_name()
Member

Is there still any scenario where self.distribution.get_name() is used here?

Contributor Author

Once the extensions are built normally, it should in theory never be used ~ So the check inside _get_extension_name is unnecessary too?

    def _get_extension_name(self) -> str:
        """
        Get the extension name from the extension module, not the distribution name.
        This ensures we use the correct package name from setup.py.

        Note: This assumes there is only one extension module (len(ext_modules) == 1).

        Returns:
            str: The extension name
        """
        ext_name = None
        if (
            hasattr(self.distribution, 'ext_modules')
            and self.distribution.ext_modules
        ):
            ext_name = self.distribution.ext_modules[0].name

        # If no extension name is found, fall back to distribution name
        if ext_name is None:
            ext_name = self.distribution.get_name()

        return ext_name

Member

Yes, that can be cleaned up ~
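
With the fallback removed, the helper reduces to reading the single extension's name. Below is a sketch of that simplification, written as a free function over a stand-in distribution object. SimpleNamespace is only a test double here; the real method lives on the build command and may differ.

```python
from types import SimpleNamespace


def get_extension_name(distribution):
    """Return the name of the single extension module declared on the distribution."""
    ext_modules = distribution.ext_modules
    # The thread assumes exactly one extension module per setup() call.
    assert len(ext_modules) == 1, "expected exactly one extension module"
    return ext_modules[0].name


fake_distribution = SimpleNamespace(
    ext_modules=[SimpleNamespace(name="custom_setup_ops.my_ops.custom_relu")]
)
print(get_extension_name(fake_distribution))
```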

  filename, ext = os.path.splitext(resource)
- resource = filename + "_pd_" + ext
+ if not filename.endswith("_pd_"):
+     resource = filename + "_pd_" + ext
Member

What specific case is this for?

Contributor Author

I was wondering earlier whether the renaming step could be moved into BuildExtension, in which case the name passed in here would already carry the _pd_ suffix ~ But that isn't used now ~ Delete it?

Member

Yes, let's delete it for now ~

Member

No further issues on my side. FD verification passed, and @zhangbo9674 has also verified the multi-level package name case in PaddleFleet. We can proceed with the merge process ~

Member

@SigureMo SigureMo left a comment

LGTMeow 🐾

@SigureMo
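Member

The custom operator module runs in a separate process, so coverage cannot measure it. All related unit tests pass, and the downstream ecosystem projects FD and PaddleFleet have both been tested. The coverage check should be exempted for this PR.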

@megemini
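Contributor Author

LGTMeow 🐾

Uh ... I still have some questions about PaddlePaddle/FastDeploy#4998 ~

  1. The GPU operators apparently cannot be found
  2. XPU has a build.sh that probably still needs to be modified

Please hold off on merging ~~~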

@SigureMo SigureMo merged commit 049f507 into PaddlePaddle:develop Nov 19, 2025
148 of 156 checks passed
LittleHeroZZZX pushed a commit to LittleHeroZZZX/Paddle that referenced this pull request Nov 19, 2025
…tools 80+ and support install extension via `pip install . --no-build-isolation` (PaddlePaddle#76008)
@SigureMo
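Member

@megemini Master Shun, both CI and CE are failing on the FD side. Could you take another look at the CI when you have time? https://github.com/PaddlePaddle/FastDeploy/actions/runs/19521463241/job/55886058007?pr=5123

For now QA has temporarily pinned the PaddlePaddle version in PaddlePaddle/FastDeploy#5136. Let's see whether this can be fixed before next Monday; if not, we'll add a compatibility strategy ~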


6 participants