
[triton-ext] Allow plugin tests with static LLVM #9549

Closed
neildhar wants to merge 1 commit into triton-lang:main from neildhar:unified-plugin-lib

Conversation

@neildhar
Collaborator

This PR performs a minimal change to export all Triton and LLVM symbols from libtriton.so, and have the plugin resolve symbols directly from it, instead of depending on having both of them link separately against LLVM dynamic libraries. This allows us to run the Triton plugin Python tests without any special build configuration.

This is intended to be a minimal incremental step to get the existing tests working. There are at least two remaining considerations:

  1. This PR does not guarantee that libtriton includes all of the symbols in the static libraries that are linked into libtriton, since we will only pull in the object files containing symbols that libtriton itself uses. This may be good enough in practice; otherwise, we may have to link with whole-archive, which will have a binary size cost.
  2. This PR does not get triton-opt working with the new setup. Since the plugin now takes a direct dependency on libtriton, it must be available when the plugin is loaded. We could either:
    1. Make triton-opt also get its symbols from libtriton. This is potentially invasive, and makes it less convenient to copy around triton-opt executables since they will have a dependency on the dynamic library.
    2. Export all the Triton and LLVM symbols from triton-opt. Then create a fake empty libtriton.so, and make triton-opt dlopen it before loading a plugin. This "hack" ensures that the plugin's dependency is satisfied, but the symbols are actually resolved from triton-opt.
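
To make consideration 1 concrete, here is a toy model (not a real linker) of how a static archive contributes object files. It assumes the usual rule that an archive member is pulled in only when it defines a symbol that is currently undefined, unless whole-archive is requested. The member and symbol names are hypothetical.

```python
def link_archive(roots, members, whole_archive=False):
    """Return the set of archive members that end up in the output.

    roots:   symbols left undefined by the library's own object files.
    members: mapping of member name -> (defined symbols, needed symbols).
    """
    if whole_archive:
        # whole-archive keeps every member, used or not.
        return set(members)

    pulled, defined = set(), set()
    undefined = set(roots)
    progress = True
    while progress:
        progress = False
        for name, (defines, needs) in members.items():
            if name in pulled or not (defines & undefined):
                continue
            # This member satisfies an undefined symbol, so it is pulled in,
            # and its own dependencies become new undefined symbols.
            pulled.add(name)
            defined |= defines
            undefined = (undefined | needs) - defined
            progress = True
    return pulled


# Hypothetical archive: C.o is never referenced by the host library.
ARCHIVE = {
    "A.o": ({"a"}, {"b"}),
    "B.o": ({"b"}, set()),
    "C.o": ({"c"}, set()),
}

default_link = link_archive({"a"}, ARCHIVE)
whole_link = link_archive({"a"}, ARCHIVE, whole_archive=True)
print(sorted(default_link), sorted(whole_link))
```

In this model a plugin that needs symbol `c` would fail to resolve it against the default link, because `C.o` was never pulled in; the whole-archive link includes it at the cost of a larger binary.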

@neildhar
Collaborator Author

neildhar commented Feb 24, 2026

This PR reverses #3879, which seems to have caused problems in the past with conflicts on LLVM symbols. If that is still an issue, it may warrant further discussion on the right path forward for plugins.

cc @whitneywhtsang

@whitneywhtsang
Collaborator

> This PR reverses #3879, which seems to have caused problems in the past with conflicts on LLVM symbols. If that is still an issue, it may warrant further discussion on the right path forward for plugins.
>
> cc @whitneywhtsang

Thanks for the heads up. We'll perform early testing to check for any issues with this change and will update this thread if we encounter any problems.

quinnlp added a commit to intel/intel-xpu-backend-for-triton that referenced this pull request Feb 24, 2026
@plotfi
Contributor

plotfi commented Feb 25, 2026

Excellent, so when this merges I'll take my PR for triton-opt out of draft. Thanks @neildhar

@ThomasRaoux
Collaborator

Exporting all LLVM symbols seems like a bad practice. That means that anybody who uses Triton and has a different LLVM library linked with it will have a linker error, right?

@neildhar
Collaborator Author

neildhar commented Mar 5, 2026

@ThomasRaoux My consideration here focused on the case where libtriton is loaded with dlopen by Python. In that case, in my experiments we use RTLD_LOCAL, so the symbols don't pollute the global namespace. I believe it should be possible to safely load multiple libraries exporting LLVM symbols into a single python process with RTLD_LOCAL.

I agree that if someone either puts both Triton and some other copy of LLVM on the link line, or loads multiple copies of LLVM/Triton with RTLD_GLOBAL, the symbol resolution will create a mess. The linker may complain, or the resolution may come down to the library load order.

Since the plugins need access to MLIR and need to agree on the global data structures in LLVM, exporting at least some portion of LLVM symbols seems necessary for the current plugin design to work.
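
As a rough way to check which mode a given Python process uses, the sketch below decodes dlopen flags with the standard `os.RTLD_*` constants and `sys.getdlopenflags()`. It assumes a Unix platform; on Linux/glibc `RTLD_LOCAL` is zero, so "local" simply means that `RTLD_GLOBAL` is not set. The helper name `describe_dlopen_flags` is made up for this example.

```python
import os
import sys


def describe_dlopen_flags(flags):
    """Decode a dlopen flag word into the bits relevant to this discussion."""
    return {
        # If RTLD_GLOBAL is absent, symbols stay out of the global namespace.
        "global": bool(flags & os.RTLD_GLOBAL),
        "now": bool(flags & os.RTLD_NOW),
        # RTLD_DEEPBIND is not available on every platform.
        "deepbind": bool(flags & getattr(os, "RTLD_DEEPBIND", 0)),
    }


# Flags the running interpreter passes to dlopen for extension modules.
print(describe_dlopen_flags(sys.getdlopenflags()))
```

If `"global"` is reported as true, some code in the process has called `sys.setdlopenflags` with `RTLD_GLOBAL`, which is the situation where exported LLVM symbols could leak into the global namespace.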

In terms of potential alternative approaches:

  1. Force the plugins to operate on IR serialized to text. This could make them completely independent of the LLVM in Triton, and they could ship their own internally. However, this may be restrictive for custom dialects interacting with existing passes.
  2. Create an array of pointers to all the LLVM symbols, and just export that. Then reconstruct LLVM on the plugin side. This would be very elaborate, but probably doable if exporting is a major constraint.

@plotfi
Contributor

plotfi commented Mar 5, 2026

> I agree that if someone either puts both Triton and some other copy of LLVM on the link line, or loads multiple copies of LLVM/Triton with RTLD_GLOBAL, the symbol resolution will create a mess. The linker may complain, or the resolution may come down to the library load order.

How likely is this to happen? I thought we explored this scenario when we chose not to pursue the LLVM dylib route.

@neildhar
Collaborator Author

neildhar commented Mar 5, 2026

@plotfi I think this is orthogonal to having a separate LLVM dylib. An LLVM dylib would also have to export all of the LLVM symbols, which presents a similar set of challenges.

@plotfi
Contributor

plotfi commented Mar 5, 2026

> @plotfi I think this is orthogonal to having a separate LLVM dylib. An LLVM dylib would also have to export all of the LLVM symbols, which presents a similar set of challenges.

Ah, thanks for clarifying.

@CRobeck
Contributor

CRobeck commented Mar 5, 2026

> That means that anybody who uses Triton and has a different LLVM library linked with it will have a linker error, right?

Isn't that also the case with LLVM proper, though? It's not uncommon to run into link issues if you start mixing and matching LLVM libraries for a project built on top of LLVM or using LLVM as a library.

@ThomasRaoux
Collaborator

> > That means that anybody who uses Triton and has a different LLVM library linked with it will have a linker error, right?
>
> Isn't that also the case with LLVM proper, though? It's not uncommon to run into link issues if you start mixing and matching LLVM libraries for a project built on top of LLVM or using LLVM as a library.

Right, I thought that's why hiding the symbols was important? Maybe I'm missing something.

@neildhar
Collaborator Author

neildhar commented Mar 5, 2026

@ThomasRaoux You're right that exporting the LLVM symbols could cause problems if there are multiple copies of LLVM and one of them ends up in the global symbol namespace. I believe for the purposes of loading this into Python, exporting the symbols should be safe because the library is opened with RTLD_LOCAL.

Are there other ways to consume libtriton.so that we care about, or do you have reservations about the Python use case as well?

@plotfi
Contributor

plotfi commented Mar 6, 2026

> @ThomasRaoux You're right that exporting the LLVM symbols could cause problems if there are multiple copies of LLVM and one of them ends up in the global symbol namespace. I believe for the purposes of loading this into Python, exporting the symbols should be safe because the library is opened with RTLD_LOCAL.

@neildhar Some Questions:

From what I gather, it seems invoking symbols from a dlopen'ed lib using RTLD_LOCAL requires invocation through dlsym.

Q1: So then is it the case that global symbols can cause problems mainly because something the RTLD_LOCAL'ed .so is looking for is found in the global table?

Q2: Could we use an lld linker flag like --Bsymbolic to tell dlopen to resolve symbols locally when possible instead of looking in the global table?

Comment thread CMakeLists.txt
# Only build plugins when building libtriton since they depend on libtriton.
add_subdirectory(examples/plugins)
endif()

Contributor

@plotfi plotfi Mar 6, 2026

If --Bsymbolic is a workable solution, I think we could use it here:

if (UNIX AND NOT APPLE)
  set(CMAKE_SHARED_LINKER_FLAGS "${CMAKE_SHARED_LINKER_FLAGS} -Wl,--Bsymbolic")
endif()

whitneywhtsang pushed a commit to intel/intel-xpu-backend-for-triton that referenced this pull request Mar 6, 2026
We need to disable the example plugins on Windows in preparation for
merging triton-lang/triton#9549 into our repo.
Prior to triton-lang/triton#9549, the example
plugin libraries build successfully with LLVM/MLIR's static libraries
and the example plugin tests are guarded by not building against shared
libs. Enabling Triton extensions/plugins on Windows is difficult:
#6248 (comment).

This patch adds a CMake option `TRITON_ENABLE_PLUGINS` to enable/disable
building the plugin libraries in `examples/plugins`. It also adds a LIT
config feature that reads the option to guard LIT tests that require the
example plugin libraries. If the option is on, the example plugin
libraries are built and the example plugin tests are enabled. If the
option is off, the example plugin libraries are not built and the
example plugin tests are skipped.

We require that `TRITON_ENABLE_PLUGINS` is set off to build on Windows.
The default value of the option is OFF for Windows and ON otherwise.
@neildhar
Collaborator Author

neildhar commented Mar 6, 2026

> global symbols can cause problems mainly because something the RTLD_LOCAL'ed .so is looking for is found in the global table

That is one failure mode. Note that this is primarily a concern for the plugin library. If there is some other library exporting LLVM, the plugins may resolve symbols from those libraries instead of libtriton.

The other failure mode (which is what I think Thomas is alluding to) is if we are not using RTLD_LOCAL, our symbols will also be in the global namespace, and can be resolved by other libraries which wanted to use their own copy of LLVM.

> use an lld linker flag like --Bsymbolic

I agree that we should be building with -Bsymbolic and -fno-semantic-interposition to prevent interposition of the exported symbols. But this would only solve the problem for LLVM usage inside libtriton itself. The two failure modes I mentioned above would still apply. (Technically we can solve the plugin case with RTLD_DEEPBIND, but I don't know how widely supported that is, and we don't want to add too many layers of complexity here.)
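
The two failure modes can be sketched with a simplified model of dynamic symbol lookup. It assumes glibc-style ordering: the global scope is searched first, then the local scope of the library doing the lookup, and RTLD_DEEPBIND puts the local scope first. The library and symbol names are hypothetical, and the model ignores details such as symbol versioning and -Bsymbolic.

```python
from typing import NamedTuple


class Lib(NamedTuple):
    name: str
    exports: frozenset


def resolve(symbol, global_scope, local_scope, deepbind=False):
    """Return the name of the first library in lookup order defining symbol."""
    if deepbind:
        order = list(local_scope) + list(global_scope)
    else:
        order = list(global_scope) + list(local_scope)
    for lib in order:
        if symbol in lib.exports:
            return lib.name
    return None


python_exe = Lib("python", frozenset())
other_llvm = Lib("libother_llvm.so", frozenset({"llvm_sym"}))
libtriton = Lib("libtriton.so", frozenset({"llvm_sym", "triton_sym"}))
plugin = Lib("plugin.so", frozenset({"plugin_entry"}))

# Failure mode 1: another library put its LLVM into the global scope, and
# the plugin (loaded RTLD_LOCAL, depending on libtriton) binds to it.
mode1 = resolve("llvm_sym", [python_exe, other_llvm], [plugin, libtriton])

# Same setup with RTLD_DEEPBIND on the plugin: its own scope wins.
mode1_deepbind = resolve(
    "llvm_sym", [python_exe, other_llvm], [plugin, libtriton], deepbind=True
)

# Failure mode 2: libtriton itself is loaded RTLD_GLOBAL, so a library that
# wanted its own LLVM copy binds to libtriton's symbols instead.
mode2 = resolve("llvm_sym", [python_exe, libtriton], [other_llvm])

print(mode1, mode1_deepbind, mode2)
```

In this model, neither failure mode can occur once the LLVM symbols are absent from every library's export set, which matches the point that a robust fix has to be built around not exporting them.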

If we are concerned about the case where LLVM symbols may be introduced to the global namespace for dynamic symbol resolution (either by us or some other library), I think the solution will have to be built around not exporting the symbols.

The other possible concern here is around binary size. Exporting everything also means we cannot analyze which symbols are dead.

@abrown
Contributor

abrown commented Mar 7, 2026

I think there is an alternative to this approach of exporting all the symbols: build one big LLVM shared library, a mega-library (LLVM_BUILD_LLVM_DYLIB), and have both Triton and Triton plugins dynamically link to it. @plotfi mentioned above that this route was not pursued, so I tried to figure out why; I talked to @neildhar for some time about this (thanks!) to understand the drawbacks on both sides. My conclusion is that I'm fine with either approach, but I'm slightly leaning toward the mega-library. Here is my take on the pros and cons:

  • export-all-symbols
    • PRO: it is easier to distribute a single library, libtriton.so
    • CON: I'm unsure about portability, maintainability; that's a bit hazy, but this is a non-conventional approach; is there some other large project that does this?
    • CON: some increased file size; no DCE since all symbols exported
    • CON: does not work on Windows yet due to 64k symbol limit
  • mega-library
    • CON: requires an extra library to distribute; we must set the linking environment, even for tools like triton-opt (possibly via an rpath?)
    • PRO: it is LLVM's recommended distribution method (link), which makes it a bit more conventional
    • CON: incurs a larger package size for the mega-libraries, though can be configured somewhat by adding or removing components
    • PRO: reduces file size for each individual Triton component (e.g., I saw triton-opt drop from >200MB to <20MB)
    • PRO: allows users to (unwisely?) swap out underlying LLVM; e.g., (1) expand which LLVM components are available, (2) to use a LLVM (Intel's #1236 issue)
    • CON: does not work on Windows yet (maybe due to #152371)

Hopefully that helps lay out the decision here. For me, the more conventional shared library approach carries a bit of weight, but if I were told that the only reasonable way to distribute things was with a single library, I would lean back towards exporting all the symbols. If we decide to export all the symbols, then this PR is definitely needed.

quinnlp added a commit to intel/intel-xpu-backend-for-triton that referenced this pull request Mar 26, 2026
@neildhar
Collaborator Author

Superseded by #9783

@neildhar neildhar closed this Mar 28, 2026

6 participants