Investigate tvm tir to tensor core lowering #6

Lurkrazy · 2025-10-19T21:28:41Z

This pull request contains changes made by a Background Agent.

Branch: cursor/investigate-tvm-tir-to-tensor-core-lowering-d0ab


This PR fixes the tvm/dlpack/dmlc header lookup in the FlashInfer kernel JIT compilation. Prior to this fix, the JIT compilation assumed that the environment variable `TVM_SOURCE_DIR` was always defined, which is not always true. The lookup now considers multiple cases, including TVM source builds and pip-installed packages.
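A fallback lookup of this kind could be sketched as follows. This is not the code from the PR: the function names `candidate_roots` and `find_include_dir`, the search order, and the assumed directory layouts (a repo root two levels above the Python package for source builds, an `include/` folder inside the package for pip installs) are all hypothetical and only illustrate the idea of not relying on `TVM_SOURCE_DIR` alone.

```python
import os
from pathlib import Path
from typing import List, Mapping, Optional


def candidate_roots(package_dir: Path,
                    env: Optional[Mapping[str, str]] = None) -> List[Path]:
    """Return directories that might contain an include/ folder, in priority order."""
    env = os.environ if env is None else env
    roots: List[Path] = []
    # 1. Explicit override, used only when the variable is actually set.
    source_dir = env.get("TVM_SOURCE_DIR")
    if source_dir:
        roots.append(Path(source_dir))
    # 2. Assumed source-build layout: <repo>/python/tvm, so the repo root
    #    is two levels above the package directory.
    roots.append(package_dir.parent.parent)
    # 3. Assumed pip-install layout: headers shipped inside the package.
    roots.append(package_dir)
    return roots


def find_include_dir(package_dir,
                     env: Optional[Mapping[str, str]] = None) -> Optional[Path]:
    """Return the first include/ directory that contains a tvm/ subfolder, or None."""
    for root in candidate_roots(Path(package_dir), env):
        include_dir = root / "include"
        if (include_dir / "tvm").is_dir():
            return include_dir
    return None
```

Returning `None` instead of raising leaves it to the caller to decide whether a missing header directory is fatal; a real implementation would probably also look for the dlpack and dmlc headers in a similar way.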

…#18245)

…core (apache#18239) This PR migrates the TVM Python packaging system from the `setup.py` flow to the modern, PEP 517/518 compliant `pyproject.toml` standard, which allows us to produce a single, Python-version-agnostic wheel. This change streamlines the process for both developers and users. For local development, you can now set up a fully functional editable environment with a single command: `pip install -e .`. To create the distributable package for release, run `pip wheel -w dist .`, which produces a universal wheel in the `dist/` folder. This ensures that end users can reliably install TVM with a standard `pip install tvm`, regardless of their specific Python 3 version.

…he#18249)

* Revert the URL out from cmake for libbacktrace
* Switch git submodule to upstream HEAD instead

As discussed in apache#18246 (comment), this reverts the change in favour of the git submodule approach. As found in the same discussion, upstream [already](https://github.com/ianlancetaylor/libbacktrace/blob/793921876c981ce49759114d7bb89bb89b2d3a2d/macho.c#L1273-L1275) incorporates [the one patch](ianlancetaylor/libbacktrace@master...tlc-pack:libbacktrace:master) used, and macOS works fine.

…he#18248) This PR updates `version.py` so that every time the file is run, it also bumps the version number in `pyproject.toml` automatically.
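The bump itself can be as small as a single-line rewrite. The sketch below is an assumption about how such a step might look, not the contents of TVM's `version.py`: the helper name `bump_pyproject_version` is hypothetical, and it assumes the first line of the form `version = "..."` in the file is the project version.

```python
import re

# Matches a line such as: version = "0.1.0"
_VERSION_LINE = re.compile(r'^(version\s*=\s*)"[^"]*"', re.MULTILINE)


def bump_pyproject_version(pyproject_text: str, new_version: str) -> str:
    """Return pyproject_text with its first version line set to new_version."""
    new_text, count = _VERSION_LINE.subn(
        # A function replacement avoids backslash-escape surprises in new_version.
        lambda m: f'{m.group(1)}"{new_version}"',
        pyproject_text,
        count=1,
    )
    if count == 0:
        raise ValueError("no version line found in pyproject text")
    return new_text
```

A caller would read `pyproject.toml`, pass its text through this function, and write the result back. Working on text rather than parsing and re-serializing the TOML keeps comments and formatting in the file untouched.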

Following apache#18239, this PR fixes a few issues we ran into during testing the packaging flow through scikit-build-core.

* upgrade cutlass v4.2.0 supporting cuda 13

…pache#18254) This PR updates the ABI to enable a potential future need for getting metadata from a dynamically loaded module. It orders the current static objects into simple objects that have a C ABI and more complex ones that may need C++. These changes make the ABI future compatible before we freeze.

[FFI] Wheel packaging example

This PR adds an example of wheel packaging. It also fixes various minor source packaging nits.

This PR adds weak ref counter support to the FFI ABI. Weak rc is useful when we want to break cyclic dependencies.

- When the strong rc goes to zero, we call the destructor of the object but do not free the memory.
- When both the strong and weak rc go to zero, we call the memory free operation.

The weak rc mechanism is useful when we want to break cyclic dependencies in objects, where the weak rc can keep the memory alive even though the destructor has been called. As of now, because we deliberately avoid cycles in the codebase, we do not have a strong use case for weak rc. However, weak rc is common practice in shared_ptr and Rust RC, and is also used in torch's c10::intrusive_ptr, so it is better to make sure the ABI is future compatible with such use cases before we freeze.

This PR implements weak rc as a u32 counter and strong rc as a u64 counter, with the following design considerations:

- Weak rc is very rarely used and u32 is sufficient.
- Keeping weak rc in u32 allows us to keep the object header size at 24 bytes, saving an extra 8 bytes (considering alignment).

We also need to update the deleter to take flags that cover both weak and strong deletion events. The implementation tries to optimize the common case where both strong and weak go to 0 at the same time, calling the deleter once with both flags set.
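The counting rules and the size argument can be modelled in a few lines of Python. Everything here is an illustrative assumption rather than the FFI implementation: the flag values, the `Counted` class, the `type_index` and `deleter` header fields, and the convention that the weak count starts at 1 on behalf of all strong references are hypothetical choices made only to show how one deleter call with both flags can cover the common case.

```python
import ctypes

FLAG_STRONG = 1  # hypothetical flag: run the destructor
FLAG_WEAK = 2    # hypothetical flag: free the memory


class Counted:
    """Toy model of an object with a strong and a weak reference count."""

    def __init__(self, deleter):
        self.strong = 1
        self.weak = 1  # assumed: one weak count is held collectively by strong refs
        self._deleter = deleter

    def inc_weak(self):
        self.weak += 1

    def dec_strong(self):
        self.strong -= 1
        if self.strong == 0:
            if self.weak == 1:
                # Common case: no outstanding weak refs, one call with both flags.
                self.weak = 0
                self._deleter(FLAG_STRONG | FLAG_WEAK)
            else:
                # Destructor runs now; memory stays until weak refs are gone.
                self._deleter(FLAG_STRONG)
                self.dec_weak()

    def dec_weak(self):
        self.weak -= 1
        if self.weak == 0:
            self._deleter(FLAG_WEAK)


class HeaderU32Weak(ctypes.Structure):
    # Assumed field set; only the u64 strong / u32 weak widths come from the text.
    _fields_ = [("strong_rc", ctypes.c_uint64), ("weak_rc", ctypes.c_uint32),
                ("type_index", ctypes.c_int32), ("deleter", ctypes.c_void_p)]


class HeaderU64Weak(ctypes.Structure):
    # Same assumed fields, but with a 64-bit weak counter for comparison.
    _fields_ = [("strong_rc", ctypes.c_uint64), ("weak_rc", ctypes.c_uint64),
                ("type_index", ctypes.c_int32), ("deleter", ctypes.c_void_p)]
```

Under this assumed layout, on a platform with 8-byte pointers, `ctypes.sizeof(HeaderU32Weak)` is 24 while `ctypes.sizeof(HeaderU64Weak)` is 32, because the wider counter pushes the pointer past an alignment boundary. That is one way to arrive at the 8-byte saving mentioned above.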

This PR adds the missing files in the packaging example and renames `get_started` to `quick_start`.

…pache#18262) [BugFix][NNAPI] Use kind() after FFI refactor This commit updates nnapi_runtime.cc to override kind() instead of type_key(), aligning NNAPI with the new FFI interface. Behavior is consistent with other runtimes that were updated in commit b8eb80b.

This PR provides miscellaneous docs fixes: it updates the requirements of the FFI docs, removes stale webpages from the header, and updates the embedding script to allow a path.

* finish1
* finish2
* finish3
* update
* update2
* update3
* update4
* update4
* update6
* Rename build step and update installation commandFix
* fix
* fix2
* fix3

Co-authored-by: chendi.li <[email protected]>

MasterJH5574 and others added 22 commits August 27, 2025 17:28

[LLVM][MSWIN][CI] Fix LLVM module build with latest CI update (apache…

335bc16

…#18245)

[FFI][CMAKE] Add missing download path for libbacktrace (apache#18246)

dd1e3f8

resnet 50 bench 20000trials

7652043

Add TVM/MetaSchedule and PyTorch benchmark scripts

80d4ce2

upd

bd88024

[Python] Update version.py to bump pyproject.toml automatically (apac…

e3efec2

…he#18248)

[Python] Complete Python packaging with scikit-build-core (apache#18251)

5feed58


upgrade cutlass v4.2.0 supporting cuda 13 (apache#18236)

601da7b


[FFI][DOCS] Wheel Packaging (apache#18256)

4ec1709


[FFI] fix two seemingly migration issue (apache#18258)

b67650f

[FFI][DOCS] Add missing files in packaging example (apache#18261)

9b5930d


[FFI][DOCS] Initial docs scaffolding (apache#18263)

ab2b2d0

[DOCS] Misc docs fix (apache#18264)

322298a


[Build] Complete TVM wheel building migration (apache#18252)

e56d4b2


Merge branch 'apache:main' into v21_bench

c2144d4

feat: Add TensorCore lowering report and MRE

2fa4fbd

