masahi

Organizations

@apache @dmlc @octoml

Popular repositories

  1. torchscript-to-tvm Public

    Cuda 69 17

  2. nnvm-vision-demo Public

    Demos of interesting image-in, image-out networks running on both NVIDIA and AMD GPUs, with NNVM

    Python 49 1

  3. tvm-winograd Public

    Test winograd convolution written in TVM for CUDA and AMDGPU

    Python 41 2

  4. tvm-cutlass-eval Public

    Python 38 7

  5. libflash_attn Public

    C++ 14

  6. mxnet-cpp-inference Public

    Test MXNet C++ API for doing inference, given a trained model

    C++ 6 1

95 contributions in the last year


Contribution activity

April 2025

Created 1 commit in 1 repository

Created a pull request in triton-lang/triton that received 42 comments

[TritonGPU] Enable accum-init optimization for unconditionally zero-ed accumulators

Currently, the pass doesn't fire when no explicit op conditionally clears the accumulator. Thus, it misses the simplest case where th…

+136 −78 lines changed 42 comments
Reviewed 7 pull requests in 1 repository