🌋
Exploding gradients

Highlights

  • Pro

Pinned

  1. CUDA_Flash_Attention2 Public

    Flash Attention v2 implemented from the paper alone, in Numba JIT and CUDA

    Cuda 2

  2. SmolLm2-zero Public

    Forked from philschmid/deep-learning-pytorch-huggingface

    Train a small LLM to "think" with no SFT, only RL

    Jupyter Notebook 2

  3. CleanGPT Public

    Simple GPT2 PyTorch implementation and pre-training on the edu-Fineweb dataset

    Python