Brainkite

🌋

Exploding gradients

Antonin Sumner Brainkite

4 followers · 14 following

Lyon

Achievements

Highlights

Pro

Pinned

CUDA_Flash_Attention2 Public

Flash Attention v2 implemented from the paper alone, in Numba JIT and CUDA

Cuda 2

SmolLm2-zero Public

Forked from philschmid/deep-learning-pytorch-huggingface

Train a small LLM to "think" using only RL, without SFT

Jupyter Notebook 2

CleanGPT Public

Simple GPT-2 implementation in PyTorch, with pre-training on the edu-Fineweb dataset

Python