forked from apache/tvm
Investigate tvm tir to tensor core lowering #6
Draft: Lurkrazy wants to merge 22 commits into main from cursor/investigate-tvm-tir-to-tensor-core-lowering-d0ab
This PR fixes the tvm/dlpack/dmlc header lookup in the FlashInfer kernel JIT compilation. Previously, JIT compilation assumed that the environment variable `TVM_SOURCE_DIR` was always defined, which is not always the case. The fix handles multiple cases, including TVM source builds and pip-installed packages.
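The lookup order described here could be sketched as follows. This is a minimal illustration rather than the code from the PR: the helper name `find_include_dirs`, the `package_root` parameter, and the assumed `include/` directory layout are all hypothetical; only the `TVM_SOURCE_DIR` environment variable comes from the description.

```python
import os
from pathlib import Path
from typing import List, Mapping, Optional


def find_include_dirs(
    package_root: Path, env: Optional[Mapping[str, str]] = None
) -> List[Path]:
    """Return existing header directories, preferring TVM_SOURCE_DIR when set.

    Hypothetical helper: the candidate layout below is an assumption.
    """
    env = os.environ if env is None else env
    candidates = []

    # Case 1: source build with the variable explicitly defined.
    source_dir = env.get("TVM_SOURCE_DIR")
    if source_dir:
        candidates.append(Path(source_dir) / "include")

    # Case 2: pip-installed package; assume headers ship inside the package.
    candidates.append(package_root / "include")

    # Keep only directories that exist, preserving priority order.
    found = [p for p in candidates if p.is_dir()]
    if not found:
        tried = ", ".join(str(p) for p in candidates)
        raise FileNotFoundError(f"no header directory found; tried: {tried}")
    return found
```

The point of the sketch is that an unset `TVM_SOURCE_DIR` is treated as a normal case with a fallback, not as an error.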
…core (apache#18239) This PR migrates the TVM Python packaging system from the setup.py flow to the modern, PEP 517/518 compliant pyproject.toml standard, which allows us to produce a single, Python-version-agnostic wheel. This change streamlines the process for both developers and users. For local development, you can now set up a fully functional editable environment with a single command: `pip install -e .`. To create the distributable package for release, run `pip wheel -w dist .`, which produces a universal wheel in the `dist/` folder. This ensures that end users can reliably install TVM with a standard `pip install tvm`, regardless of their specific Python 3 version.
…he#18249)
* Revert the URL out from cmake for libbacktrace
* Switch git submodule to upstream HEAD instead

As discussed in apache#18246 (comment), this reverts the earlier change in favour of the git submodule approach. As found in the same discussion, upstream [already](https://github.com/ianlancetaylor/libbacktrace/blob/793921876c981ce49759114d7bb89bb89b2d3a2d/macho.c#L1273-L1275) incorporates [the one patch](ianlancetaylor/libbacktrace@master...tlc-pack:libbacktrace:master) that was used, and macOS works fine.
…he#18248) This PR updates `version.py` so that each time the file is run, it also bumps the version number in `pyproject.toml` automatically.
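A version bump of this kind might look like the sketch below. It is an assumption-laden illustration, not the logic in `version.py`: the function name `bump_pyproject_version` is hypothetical, and the code assumes the version is stored as a plain `version = "..."` line.

```python
import re

# Matches the first line of the form `version = "..."` (assumed format).
_VERSION_LINE = re.compile(r'^version\s*=\s*"[^"]*"', flags=re.MULTILINE)


def bump_pyproject_version(text: str, new_version: str) -> str:
    """Return pyproject.toml content with its version field replaced."""
    # A callable replacement avoids backslash-escape processing of new_version.
    updated, count = _VERSION_LINE.subn(
        lambda _match: f'version = "{new_version}"', text, count=1
    )
    if count == 0:
        raise ValueError("no version field found")
    return updated
```

Working on the file content as a string keeps the helper easy to test; a caller would read `pyproject.toml`, pass its text through the function, and write the result back.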
Following apache#18239, this PR fixes a few issues we ran into while testing the packaging flow through scikit-build-core.
* Upgrade CUTLASS to v4.2.0, which supports CUDA 13
…pache#18254) This PR updates the ABI to accommodate a potential future need to get metadata from a dynamically loaded module. It orders the current static objects into simple objects that have a C ABI and more complex ones that may need C++. These changes make the ABI future-compatible before we freeze it.
[FFI] Wheel packaging example. This PR adds an example of wheel packaging. It also fixes various minor nits in source packaging.
This PR adds weak reference counter (weak rc) support to the FFI ABI. A weak rc is useful when we want to break cyclic dependencies between objects: the weak rc can keep the memory alive even after the destructor has been called.

- When the strong rc goes to zero, we call the destructor of the object but do not free the memory.
- When both the strong and weak rc go to zero, we free the memory.

As of now, because we deliberately avoid cycles in the codebase, we do not have a strong use case for weak rc. However, weak rc is common practice in shared_ptr and Rust's Rc, and is also used in torch's c10::intrusive_ptr, so it is better to make sure the ABI is future-compatible with such use cases before we freeze it.

This PR implements the weak rc as a u32 counter and the strong rc as a u64 counter, with the following design considerations:

- Weak rc is very rarely used, so u32 is sufficient.
- Keeping the weak rc in u32 allows us to keep the object header size at 24 bytes, saving an extra 8 bytes (considering alignment).

We also need to update the deleter to take flags that cover both weak and strong deletion events. The implementation tries to optimize the common case where both the strong and weak rc go to 0 at the same time, calling the deleter once with both flags set.
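The counting rules and the size argument can be modelled in a few lines of Python. This is a toy sketch, not the FFI implementation: the flag names, the `RefCounted` class, and the header field order are assumptions chosen only to match the description (u64 strong rc, u32 weak rc, a 24-byte header, and a deleter that receives flags).

```python
import ctypes

# Hypothetical flag names for the two deletion events.
DELETE_STRONG = 0x1  # run the destructor
DELETE_WEAK = 0x2    # free the memory


class HeaderU32Weak(ctypes.Structure):
    # Assumed field order, used only to illustrate the size argument.
    _fields_ = [
        ("strong_rc", ctypes.c_uint64),
        ("weak_rc", ctypes.c_uint32),
        ("type_index", ctypes.c_int32),
        ("deleter", ctypes.c_void_p),
    ]


class HeaderU64Weak(ctypes.Structure):
    # Same assumed fields, but with a u64 weak counter for comparison.
    _fields_ = [
        ("strong_rc", ctypes.c_uint64),
        ("weak_rc", ctypes.c_uint64),
        ("type_index", ctypes.c_int32),
        ("deleter", ctypes.c_void_p),
    ]


class RefCounted:
    """Toy model of the strong/weak protocol described above."""

    def __init__(self, deleter):
        self.strong = 1  # the creator holds one strong reference
        self.weak = 0
        self.deleter = deleter

    def inc_strong(self):
        self.strong += 1

    def inc_weak(self):
        self.weak += 1

    def dec_strong(self):
        self.strong -= 1
        if self.strong == 0:
            if self.weak == 0:
                # Common case: a single deleter call with both flags set.
                self.deleter(DELETE_STRONG | DELETE_WEAK)
            else:
                # Destroy the object but keep its memory for weak holders.
                self.deleter(DELETE_STRONG)

    def dec_weak(self):
        self.weak -= 1
        if self.weak == 0 and self.strong == 0:
            # Last reference of any kind is gone: free the memory.
            self.deleter(DELETE_WEAK)
```

On a typical 64-bit platform, `ctypes.sizeof` reports 24 bytes for the first layout and 32 for the second, because the pointer field must start on an 8-byte boundary.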
This PR adds the missing files in the packaging example and renames get_started to quick_start.
…pache#18262) [BugFix][NNAPI] Use kind() after FFI refactor This commit updates nnapi_runtime.cc to override kind() instead of type_key(), aligning NNAPI with the new FFI interface. Behavior is consistent with other runtimes that were updated in commit b8eb80b.
This PR provides miscellaneous docs fixes: it updates the requirements of the FFI docs, removes stale webpages from the header, and updates the embedding script to allow a path.
* finish1
* finish2
* finish3
* update
* update2
* update3
* update4
* update4
* update6
* Rename build step and update installation commandFix
* fix
* fix2
* fix3
Co-authored-by: chendi.li <[email protected]>
This pull request contains changes made by a Background Agent.
Branch: cursor/investigate-tvm-tir-to-tensor-core-lowering-d0ab