Rice University
Houston, TX
in/yuningxia
awesome-gemm Public
📚 A curated list of awesome matrix-matrix multiplication (A * B = C) frameworks, libraries and software
2025 Public
Forked from asplos-contest/2025
The ASPLOS 2025 / EuroSys 2025 Contest Track
Apache License 2.0 Updated Dec 8, 2024
quarto-cli Public
Forked from quarto-dev/quarto-cli
Open-source scientific and technical publishing system built on Pandoc.
JavaScript Other Updated Nov 26, 2024
readinglist Public
Forked from cplmakerlab/simple-website-template
Website URL
HTML Updated Sep 27, 2024
llvm Public
Forked from intel/llvm
Intel staging area for llvm.org contribution. Home for Intel LLVM-based projects.
LLVM Other Updated Sep 1, 2024
oneAPI-samples Public
Forked from oneapi-src/oneAPI-samples
Samples for Intel® oneAPI Toolkits
C++ MIT License Updated Aug 27, 2024
ray Public
Forked from ray-project/ray
Ray is a unified framework for scaling AI and Python applications. Ray consists of a core distributed runtime and a set of AI Libraries for accelerating ML workloads.
Python Apache License 2.0 Updated Jul 17, 2024
compute-runtime Public
Forked from intel/compute-runtime
Intel® Graphics Compute Runtime for oneAPI Level Zero and OpenCL™ Driver
C++ MIT License Updated Jun 21, 2024
compute-benchmarks Public
Forked from intel/compute-benchmarks
Compute Benchmarks for oneAPI Level Zero and OpenCL™ Driver
C++ MIT License Updated Jun 7, 2024
hatchet Public
Forked from LLNL/hatchet
Graph-indexed Pandas DataFrames for analyzing hierarchical performance data
JavaScript MIT License Updated May 10, 2024
level-zero Public
Forked from oneapi-src/level-zero
oneAPI Level Zero Specification Headers and Loader
C++ MIT License Updated May 9, 2024
pti-gpu Public
Forked from intel/pti-gpu
Profiling Tools Interfaces for GPU (PTI for GPU) is a set of Getting Started Documentation and Tools Library to start performance analysis on Intel(R) Processor Graphics easily
C++ MIT License Updated May 9, 2024
gemm-kernel-microbenchmark Public
📊 A microbenchmark for GEMM kernels on NVIDIA GPUs with Ampere architecture
cutlass Public
Forked from NVIDIA/cutlass
CUDA Templates for Linear Algebra Subroutines
C++ Other Updated Mar 3, 2024
DeepSpeed Public
Forked from microsoft/DeepSpeed
DeepSpeed is a deep learning optimization library that makes distributed training and inference easy, efficient, and effective.
Python Apache License 2.0 Updated Feb 17, 2024
lightneuron Public
⚡️ An educational ConvNet inference framework designed for x86 architectures
eva-lang Public
A functional programming language in JavaScript.
gpu_poor Public
Forked from RahulSChand/gpu_poor
Calculate token/s & GPU memory requirement for any LLM. Supports llama.cpp/ggml/bnb/QLoRA quantization
JavaScript Updated Nov 4, 2023
ByteTransformer Public
Forked from bytedance/ByteTransformer
optimized BERT transformer inference on NVIDIA GPU. https://arxiv.org/abs/2210.03052
C++ Apache License 2.0 Updated Jul 24, 2023