  • Data Science and Analytic Thrust, Information Hub, HKUST(GZ)
  • Guangzhou

xxyux/README.md

Hi there 👋

Pinned Loading

  1. Attention Public

    This is my final project for the GPU course MICS600J: a hand-written CUDA implementation of the attention computation.

    Cuda 1

  2. Fine-tuning-LLM-with-2-4-sparse Public

    Fine-tuning Llama-2-7B for text classification. Dataset: imdb; framework: DeepSpeed.

    Python 1

  3. Distributed-SpMV Public

    Distributed SpMV in C with MPI/OpenMP. This work was accepted at IEEE/ACM CCGrid'23.

    C 3

  4. cuAlias Public

    GPU-based graph sampling for GNNs. Builds and uses alias tables for fast random sampling.

    C
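The alias-table technique mentioned in cuAlias can be sketched in plain Python. This is a generic illustration of the standard alias method (Vose's variant) for O(1) weighted sampling, not code from the cuAlias repo; all function names here are my own:

```python
import random

def build_alias_table(weights):
    """Build an alias table (Vose's method) for O(1) weighted sampling.

    weights: non-negative weights, normalized internally.
    Returns (prob, alias): per-slot acceptance probability and alias index.
    """
    n = len(weights)
    total = sum(weights)
    scaled = [w * n / total for w in weights]      # mean-normalized weights
    prob = [0.0] * n
    alias = [0] * n
    small = [i for i, p in enumerate(scaled) if p < 1.0]
    large = [i for i, p in enumerate(scaled) if p >= 1.0]
    while small and large:
        s = small.pop()
        l = large.pop()
        prob[s] = scaled[s]          # keep s with prob scaled[s], else redirect
        alias[s] = l
        scaled[l] = (scaled[l] + scaled[s]) - 1.0  # donate mass from l to s
        (small if scaled[l] < 1.0 else large).append(l)
    for i in large + small:          # leftovers are exactly full slots
        prob[i] = 1.0
    return prob, alias

def alias_sample(prob, alias, rng=random):
    """Draw one index: pick a slot uniformly, then accept it or its alias."""
    i = rng.randrange(len(prob))
    return i if rng.random() < prob[i] else alias[i]
```

After the O(n) build, every draw costs one uniform index and one coin flip, which is why the structure maps well onto massively parallel GPU sampling.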

35 contributions in the last year


Contribution activity

April 2025

Created 1 commit in 1 repository

Created an issue in NVIDIA/cutlass that received 1 comment

[QST] Why is sparse-gemm's performance not very good?

Problem description: I tested example/62_hopper_sparse_gemm at different sizes. The performance is not as good as it could be in theory (near 2x speedup …

1 comment
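For context, the "near 2x" theoretical bound comes from 2:4 structured sparsity: in every group of 4 consecutive elements at most 2 are nonzero, so NVIDIA sparse tensor cores can skip half the multiplies. A minimal sketch of the 2:4 pruning pattern, assuming a hypothetical helper (this is an illustration, not CUTLASS code):

```python
def prune_2_4(row):
    """Prune a vector to 2:4 structured sparsity: in every group of 4
    consecutive values, keep the 2 largest-magnitude entries and zero
    the other 2."""
    assert len(row) % 4 == 0, "2:4 sparsity operates on groups of 4"
    out = list(row)
    for g in range(0, len(row), 4):
        group = row[g:g + 4]
        # indices of the two smallest-magnitude entries in this group
        drop = sorted(range(4), key=lambda i: abs(group[i]))[:2]
        for i in drop:
            out[g + i] = 0.0
    return out
```

Exactly 50% of the values survive, which is why the math-throughput ceiling is 2x; the observed speedup in a real GEMM is lower once metadata handling and memory bandwidth are accounted for.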