Overview Repositories 43 Projects 0 Packages 0 Stars 590 Sponsoring 1

💨

accelerating at scale is all you need

Yuning Xia jssonx

Rice CS | Building scalable performance tools @HPCToolkit 🛠️⚡ | Performance Engineering + ML Compiler

16 followers · 195 following

Rice University
Houston, TX
in/yuningxia

awesome-gemm Public

📚 A curated list of awesome matrix-matrix multiplication (A * B = C) frameworks, libraries and software

12 1 MIT License Updated Dec 21, 2024
oneapi-verify-headroom Public

C++ Updated Dec 19, 2024
2025 Public
Forked from asplos-contest/2025

The ASPLOS 2025 / EuroSys 2025 Contest Track

Apache License 2.0 Updated Dec 8, 2024
quarto-cli Public
Forked from quarto-dev/quarto-cli

Open-source scientific and technical publishing system built on Pandoc.

JavaScript Other Updated Nov 26, 2024
sycl-samples Public

C++ Updated Oct 1, 2024
readinglist Public
Forked from cplmakerlab/simple-website-template

Website URL

HTML Updated Sep 27, 2024
llvm Public
Forked from intel/llvm

Intel staging area for llvm.org contribution. Home for Intel LLVM-based projects.

LLVM Other Updated Sep 1, 2024
oneAPI-samples Public
Forked from oneapi-src/oneAPI-samples

Samples for Intel® oneAPI Toolkits

C++ MIT License Updated Aug 27, 2024
ray Public
Forked from ray-project/ray

Ray is a unified framework for scaling AI and Python applications. Ray consists of a core distributed runtime and a set of AI Libraries for accelerating ML workloads.

Python Apache License 2.0 Updated Jul 17, 2024
compute-runtime Public
Forked from intel/compute-runtime

Intel® Graphics Compute Runtime for oneAPI Level Zero and OpenCL™ Driver

C++ MIT License Updated Jun 21, 2024
compute-benchmarks Public
Forked from intel/compute-benchmarks

Compute Benchmarks for oneAPI Level Zero and OpenCL™ Driver

C++ MIT License Updated Jun 7, 2024
hatchet Public
Forked from LLNL/hatchet

Graph-indexed Pandas DataFrames for analyzing hierarchical performance data

JavaScript MIT License Updated May 10, 2024
level-zero Public
Forked from oneapi-src/level-zero

oneAPI Level Zero Specification Headers and Loader

C++ MIT License Updated May 9, 2024
pti-gpu Public
Forked from intel/pti-gpu

Profiling Tools Interfaces for GPU (PTI for GPU) is a set of Getting Started Documentation and Tools Library to start performance analysis on Intel(R) Processor Graphics easily

C++ MIT License Updated May 9, 2024
gemm-kernel-microbenchmark Public

📊 A microbenchmark for GEMM kernels on NVIDIA GPUs with Ampere architecture

C++ 1 Updated Mar 26, 2024
jssonx Public

Updated Mar 5, 2024
cutlass Public
Forked from NVIDIA/cutlass

CUDA Templates for Linear Algebra Subroutines

C++ Other Updated Mar 3, 2024
batched_gemm Public
Forked from lixiuhong/batched_gemm

C Updated Feb 18, 2024
DeepSpeed Public
Forked from microsoft/DeepSpeed

DeepSpeed is a deep learning optimization library that makes distributed training and inference easy, efficient, and effective.

Python Apache License 2.0 Updated Feb 17, 2024
smith-waterman Public

Pairwise sequence alignment algorithm.

C++ MIT License Updated Jan 18, 2024
lightneuron Public

⚡️ An educational ConvNet inference framework designed for x86 architectures

C 1 MIT License Updated Jan 15, 2024
hands-on-simd-programming Public

🧩 Hands-on SIMD Programming with C++

C++ 4 1 Updated Jan 14, 2024
lz77 Public

LZ77 in C.

C MIT License Updated Jan 7, 2024
nlp-with-spark Public

🦣 Insight Mastodon: NLP Analysis with Spark

Python Updated Jan 4, 2024
eva-lang Public

A functional programming language in JavaScript.

Topics: programming-language, interpreter

JavaScript 1 Updated Jan 2, 2024
leakcheck Public

🛠️ Memory leak detector for C programs

C MIT License Updated Jan 2, 2024
hpcconf Public

HTML Updated Dec 24, 2023
gpu_poor Public
Forked from RahulSChand/gpu_poor

Calculate token/s & GPU memory requirement for any LLM. Supports llama.cpp/ggml/bnb/QLoRA quantization

JavaScript Updated Nov 4, 2023
algo-playground Public

optimize to push the limits.

Python 1 Updated Sep 12, 2023
ByteTransformer Public
Forked from bytedance/ByteTransformer

optimized BERT transformer inference on NVIDIA GPU. https://arxiv.org/abs/2210.03052

C++ Apache License 2.0 Updated Jul 24, 2023
