jssonx
💨
accelerating at scale is all you need
  • Rice University
  • Houston, TX
  • LinkedIn in/yuningxia

Sponsoring

@tinygrad

Highlights

  • Pro

  • 📚 A curated list of awesome matrix-matrix multiplication (A * B = C) frameworks, libraries and software

    12 1 MIT License Updated Dec 21, 2024
  • C++ Updated Dec 19, 2024
  • 2025 Public

    Forked from asplos-contest/2025

    The ASPLOS 2025 / EuroSys 2025 Contest Track

    Apache License 2.0 Updated Dec 8, 2024
  • Open-source scientific and technical publishing system built on Pandoc.

    JavaScript Other Updated Nov 26, 2024
  • C++ Updated Oct 1, 2024
  • Website URL

    HTML Updated Sep 27, 2024
  • llvm Public

    Forked from intel/llvm

    Intel staging area for llvm.org contribution. Home for Intel LLVM-based projects.

    LLVM Other Updated Sep 1, 2024
  • Samples for Intel® oneAPI Toolkits

    C++ MIT License Updated Aug 27, 2024
  • ray Public

    Forked from ray-project/ray

    Ray is a unified framework for scaling AI and Python applications. Ray consists of a core distributed runtime and a set of AI Libraries for accelerating ML workloads.

    Python Apache License 2.0 Updated Jul 17, 2024
  • Intel® Graphics Compute Runtime for oneAPI Level Zero and OpenCL™ Driver

    C++ MIT License Updated Jun 21, 2024
  • Compute Benchmarks for oneAPI Level Zero and OpenCL™ Driver

    C++ MIT License Updated Jun 7, 2024
  • hatchet Public

    Forked from LLNL/hatchet

    Graph-indexed Pandas DataFrames for analyzing hierarchical performance data

    JavaScript MIT License Updated May 10, 2024
  • oneAPI Level Zero Specification Headers and Loader

    C++ MIT License Updated May 9, 2024
  • pti-gpu Public

    Forked from intel/pti-gpu

    Profiling Tools Interfaces for GPU (PTI for GPU) is a set of Getting Started Documentation and Tools Library to start performance analysis on Intel(R) Processor Graphics easily

    C++ MIT License Updated May 9, 2024
  • 📊 A microbenchmark for GEMM kernels on NVIDIA GPUs with Ampere architecture

    C++ 1 Updated Mar 26, 2024
  • jssonx Public

    Updated Mar 5, 2024
  • cutlass Public

    Forked from NVIDIA/cutlass

    CUDA Templates for Linear Algebra Subroutines

    C++ Other Updated Mar 3, 2024
  • C Updated Feb 18, 2024
  • DeepSpeed Public

    Forked from microsoft/DeepSpeed

    DeepSpeed is a deep learning optimization library that makes distributed training and inference easy, efficient, and effective.

    Python Apache License 2.0 Updated Feb 17, 2024
  • Pairwise sequence alignment algorithm.

    C++ MIT License Updated Jan 18, 2024
  • lightneuron Public

    ⚡️ An educational ConvNet inference framework designed for x86 architectures

    C 1 MIT License Updated Jan 15, 2024
  • 🧩 Hands-on SIMD Programming with C++

    C++ 4 1 Updated Jan 14, 2024
  • lz77 Public

    LZ77 in C.

    C MIT License Updated Jan 7, 2024
  • 🦣 Insight Mastodon: NLP Analysis with Spark

    Python Updated Jan 4, 2024
  • eva-lang Public

    A functional programming language in JavaScript.

    JavaScript 1 Updated Jan 2, 2024
  • leakcheck Public

    🛠️ Memory leak detector for C programs

    C MIT License Updated Jan 2, 2024
  • hpcconf Public

    HTML Updated Dec 24, 2023
  • gpu_poor Public

    Forked from RahulSChand/gpu_poor

    Calculate token/s & GPU memory requirement for any LLM. Supports llama.cpp/ggml/bnb/QLoRA quantization

    JavaScript Updated Nov 4, 2023
  • optimize to push the limits.

    Python 1 Updated Sep 12, 2023
  • optimized BERT transformer inference on NVIDIA GPU. https://arxiv.org/abs/2210.03052

    C++ Apache License 2.0 Updated Jul 24, 2023