- 🌱 I'm currently learning Local LLM Training + Inference, RLVR, Agentic AI, Model Context Protocol
- 💬 Ask me about Transformers, Local LLM Training + Inference, Efficient LLMs, Synthetic Data Generation, Reinforcement Learning from Human Feedback
- 📫 How to reach me: [email protected] OR [email protected]
- 📖 Learn about my experience here
- 09/2024-Present Master's in Computer Science, New York University
- 08/2020-05/2024 Bachelor of Science in Business Analytics (with Honours), National University of Singapore
- Large Language Models
- RLHF / RLVR
- LLM Alignment
- Synthetic Data Generation
- Lightning-AI/litgpt: 33 commits
- Implemented the Multi-head Latent Attention (DeepseekV3Attention) architecture #2113
- Optimized LoRA finetune script #2086
- Added Qwen 3 series models (Dense + MoE) #2125, #2110, #2060, #2046, #2044
- Fixed incorrect gradient accumulation steps #1947
- Added OLMo 2 by AI2 #1897 and SmolLM2 by Hugging Face #1848
- (first commit) Added Qwen 2.5 series models #1834
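The gradient accumulation fix above (#1947) concerns how micro-batch gradients are combined before an optimizer step. As a rough illustration of the idea (a plain-Python sketch with hypothetical helper names, not litgpt's code): each micro-batch gradient is scaled by 1/`accum_steps`, so accumulating over the micro-batches reproduces the gradient of the full batch.

```python
def grad(w, xs, ys):
    """d/dw of mean squared error for y_hat = w * x over a batch."""
    n = len(xs)
    return sum(2 * (w * x - y) * x for x, y in zip(xs, ys)) / n

def accumulated_grad(w, batch_xs, batch_ys, accum_steps):
    """Split the batch into accum_steps micro-batches and accumulate
    their gradients, each scaled by 1/accum_steps, so the total matches
    the full-batch gradient."""
    micro = len(batch_xs) // accum_steps
    total = 0.0
    for i in range(accum_steps):
        xs = batch_xs[i * micro:(i + 1) * micro]
        ys = batch_ys[i * micro:(i + 1) * micro]
        total += grad(w, xs, ys) / accum_steps
    return total
```

Miscounting `accum_steps` (the nature of the bug class this fix addresses) silently rescales the effective learning rate, which is why such errors are easy to miss.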
- huggingface/aisheets: 2 commits
- Added DuckDB parameterized queries for greater type safety in building SQL statements #430
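To illustrate what parameterized queries buy over string-built SQL: the driver binds values separately from the statement text, so input can never be interpreted as SQL. The sketch below uses Python's built-in sqlite3 module as a stand-in (DuckDB's Python API uses the same `?` placeholder style); the table and column names are made up for the example.

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE cells (row_id INTEGER, value TEXT)")

# The hostile string is bound as a value, not spliced into the SQL,
# so it is stored verbatim and cannot drop the table.
hostile = "Robert'); DROP TABLE cells;--"
con.execute("INSERT INTO cells VALUES (?, ?)", (1, hostile))

row = con.execute("SELECT value FROM cells WHERE row_id = ?", (1,)).fetchone()
```

Building the same statement with string formatting (`f"... VALUES (1, '{hostile}')"`) is exactly the pattern parameterization replaces.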
- EleutherAI/lm-evaluation-harness: 4 commits
- hiyouga/LLaMA-Factory: 1 commit
- Fixed use_cache patching for Gemma 3 multimodal models #7500
- Profile Pic: all credit goes to u/King_of_FAX__No_Cap