Starred repositories

The official implementation of Tensor ProducT ATTenTion Transformer (T6)

Python 336 31 Updated Feb 20, 2025

Codebase for Iterative DPO Using Rule-based Rewards

Python 225 30 Updated Feb 25, 2025

diagnosis_zero: an R1-Zero reproduction on disease diagnosis

Python 11 Updated Feb 8, 2025

This is a replication of DeepSeek-R1-Zero and DeepSeek-R1 training on small models with limited data

Python 3,193 236 Updated Mar 17, 2025

Implement a ChatGPT-like LLM in PyTorch from scratch, step by step

Jupyter Notebook 42,201 5,771 Updated Mar 17, 2025

A GUI agent application based on UI-TARS (Vision-Language Model) that lets you control your computer using natural language.

TypeScript 3,139 232 Updated Mar 19, 2025

RAGEN leverages reinforcement learning to train LLM reasoning agents in interactive, stochastic environments.

Python 1,182 84 Updated Mar 19, 2025

Course to get into Large Language Models (LLMs) with roadmaps and Colab notebooks.

Jupyter Notebook 48,247 5,151 Updated Jan 22, 2025

An open-source library for GPU-accelerated robot learning and sim-to-real transfer.

Jupyter Notebook 810 86 Updated Mar 18, 2025

Scalable RL solution for advanced reasoning of language models

Python 1,408 86 Updated Mar 18, 2025

Finetune Llama 3.3, DeepSeek-R1, Gemma 3 & Reasoning LLMs 2x faster with 70% less memory! 🦥

Python 35,235 2,695 Updated Mar 19, 2025

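The officially designated global GitHub repository for Runxue (润学, the study of emigrating): it compiles the aims, program, theory, and assorted real-world examples of Runxue; addresses the three questions of why to leave, where to go, and how to leave; and aims to become the core religion and core belief of the new Chinese people.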

31,935 2,619 Updated Jul 31, 2024

A generative world for general-purpose robotics & embodied AI learning.

Python 24,445 2,132 Updated Mar 19, 2025

A curated list of Large Language Model resources, covering model training, serving, fine-tuning, and building LLM applications.

2,855 349 Updated Jan 30, 2025

LoRA-Ensemble: Efficient Uncertainty Modelling for Self-attention Networks

Python 47 3 Updated Oct 1, 2024

A curated list of awesome exploration RL resources (continually updated)

454 14 Updated Feb 7, 2025

DeepSeek LLM: Let there be answers

Makefile 6,201 954 Updated Feb 4, 2024

An LLM-based agent that predicts its tasks proactively.

Python 329 29 Updated Mar 7, 2025

Awesome-LLM-Robustness: a curated list of Uncertainty, Reliability and Robustness in Large Language Models

731 48 Updated Feb 28, 2025

Let your Claude be able to think

TypeScript 14,746 1,714 Updated Mar 10, 2025

A bibliography and survey of the papers surrounding o1

TeX 1,181 50 Updated Nov 16, 2024

🌾 OAT: A research-friendly framework for LLM online alignment, including preference learning, reinforcement learning, etc.

Python 224 13 Updated Mar 10, 2025

Official repo for the paper "Scaling Synthetic Data Creation with 1,000,000,000 Personas"

Python 1,086 80 Updated Feb 19, 2025

We perform functional grounding of LLMs' knowledge in BabyAI-Text

Python 249 28 Updated Aug 23, 2024

Fine-tune LLM agents with online reinforcement learning

Python 1,091 50 Updated Mar 19, 2024

Code for the paper "Preserving Diversity in Supervised Fine-tuning of Large Language Models"

Python 14 Updated Mar 4, 2025

O1 Replication Journey

1,977 65 Updated Jan 14, 2025

Official implementation of paper "On the Diagram of Thought" (https://arxiv.org/abs/2409.10038)

176 11 Updated Mar 13, 2025

Autonomous Agents (LLMs) research papers. Updated Daily.

720 39 Updated Mar 19, 2025