Add delta weight sync blogpost #3386

Merged
kashif merged 14 commits into huggingface:main from AmineDiro:amine_dirhoussi/delta-weight-sync
May 27, 2026
Conversation

@AmineDiro
Member

@AmineDiro AmineDiro commented May 22, 2026

Sparse safetensors over HF Buckets for async RL weight sync in TRL. ~99% bf16 sparsity at RL learning rates means per-step payload drops from 1.2 GB to 20-35 MB on Qwen3-0.6B. Includes four interactive animations and a disaggregated demo running on HF Spaces.
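
For readers skimming the thread, here is a minimal sketch of the idea behind the numbers above: compare the 16-bit patterns of the old and new weights, ship only the positions that changed, and patch them in on the inference side. This is not the format used in the blog post or in TRL. The helper names (`encode_delta`, `apply_delta`, `dense_bytes`, `sparse_bytes`) are hypothetical, the weights are modeled as plain lists of integers standing in for bf16 bit patterns, and the 4-byte index cost is an assumption.

```python
# Illustrative only: plain Python lists stand in for bf16 tensors, and each
# element is the raw 16-bit pattern of one weight.

BYTES_PER_BF16 = 2
BYTES_PER_INDEX = 4  # assumption: uncompressed 32-bit indices


def encode_delta(old_bits, new_bits):
    """Return (indices, values) for positions whose bit pattern changed."""
    if len(old_bits) != len(new_bits):
        raise ValueError("tensors must have the same number of elements")
    indices = [i for i, (a, b) in enumerate(zip(old_bits, new_bits)) if a != b]
    values = [new_bits[i] for i in indices]
    return indices, values


def apply_delta(old_bits, indices, values):
    """Rebuild the new weights from the old ones plus a sparse delta."""
    patched = list(old_bits)
    for i, v in zip(indices, values):
        patched[i] = v
    return patched


def dense_bytes(n_params):
    """Size of a full bf16 checkpoint, ignoring file headers."""
    return n_params * BYTES_PER_BF16


def sparse_bytes(n_params, changed_fraction):
    """Size of a delta that stores one value and one index per changed weight."""
    changed = round(n_params * changed_fraction)
    return changed * (BYTES_PER_BF16 + BYTES_PER_INDEX)


old = [0x3F80, 0x4000, 0x4040, 0x4080]
new = [0x3F80, 0x4000, 0x4041, 0x4080]  # only one weight moved
indices, values = encode_delta(old, new)
restored = apply_delta(old, indices, values)

print(indices, restored == new)
print(dense_bytes(600_000_000), sparse_bytes(600_000_000, 0.01))
```

With a nominal 0.6B parameters, the dense estimate is 1.2 GB and the sparse estimate at 1% changed weights is 36 MB. That is in the same range as the 20-35 MB quoted above, which would be consistent with sparsity slightly above 99% or a more compact index encoding than the one assumed here.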

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
@AmineDiro
Member Author

@lewtun @kashif @qgallouedec kindly review when you have time :)

Member

nice!!

Member

@qgallouedec qgallouedec left a comment

Very easy to read from top to bottom! Nice work! I couldn't access the figures though.
I only have some minor recommendations.

Comment thread delta-weight-sync.md Outdated

If you read our previous post on [the landscape of async RL training](https://huggingface.co/blog/async-rl-landscape), you already know the punchline. Every async RL library, regardless of how it spells "actor model" or which color its NCCL backend is painted, eventually trips over the same root: **weight synchronization**.

The inference engine speaks the policy of step N. The trainer just finished step N+1. The fresh weights have to get from one side to the other before the inference engine starts drifting hopelessly off-policy. In a synchronous setup you pay for this once per step and it is no big deal. In an async setup it happens constantly, in the background, while generation is also trying to happen, and it had better be fast.
Member

> In a synchronous setup you pay for this once per step and it is no big deal

I would disagree with this. It's the same problem in the sync setup: you interrupt training for 1 min 30 s when you could interrupt it for only 1 s.

Member Author

Agreed, I need to fix it. The better point to make is that the trainer doesn't need to stop and wait on the upload part of the sync. You can just inform the inference engine that the weights are ready and go fetch from the rollout buffer.
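
A minimal sketch of the point above, assuming the upload can run on a background thread: the trainer hands the weights to a publisher, keeps stepping, and the inference side learns about each new version through a queue once the upload has finished. Everything here is hypothetical and for illustration only (`WeightPublisher`, `upload_weights`, the dict standing in for the bucket, the queue standing in for the notification channel); it is not the TRL implementation.

```python
import queue
import time
from concurrent.futures import ThreadPoolExecutor


def upload_weights(version, payload, store):
    """Stand-in for a slow upload; `store` is a dict acting as the bucket."""
    time.sleep(0.05)  # simulated network latency
    store[version] = payload
    return version


class WeightPublisher:
    """Uploads weights in the background and announces finished versions."""

    def __init__(self, store, ready_queue):
        # One worker keeps uploads, and therefore announcements, in order.
        self._pool = ThreadPoolExecutor(max_workers=1)
        self._store = store
        self._ready = ready_queue

    def publish(self, version, payload):
        """Start the upload and return immediately; the trainer is not blocked."""
        future = self._pool.submit(upload_weights, version, payload, self._store)
        # Announce the version only after its upload has completed.
        future.add_done_callback(lambda f: self._ready.put(f.result()))
        return future

    def close(self):
        """Wait for in-flight uploads before shutting down."""
        self._pool.shutdown(wait=True)


def run_demo(num_steps=3):
    store, ready = {}, queue.Queue()
    publisher = WeightPublisher(store, ready)
    steps_done = []
    for step in range(1, num_steps + 1):
        steps_done.append(step)  # stand-in for one training step
        publisher.publish(step, b"weights-%d" % step)  # returns right away
    publisher.close()

    announced = []
    while not ready.empty():
        announced.append(ready.get())
    return steps_done, announced, store


steps_done, announced, store = run_demo()
print(steps_done, announced, sorted(store))
```

The design choice being illustrated is that `publish` returns before the upload finishes, so the only blocking wait in the sketch is the final `close()`.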

AmineDiro and others added 3 commits May 22, 2026 23:59
Co-authored-by: Quentin Gallouédec <45557362+qgallouedec@users.noreply.github.com>
Co-authored-by: Quentin Gallouédec <45557362+qgallouedec@users.noreply.github.com>
AmineDiro and others added 3 commits May 25, 2026 15:28
Co-authored-by: Kashif Rasul <kashif.rasul@gmail.com>
Co-authored-by: Kashif Rasul <kashif.rasul@gmail.com>
Co-authored-by: Kashif Rasul <kashif.rasul@gmail.com>
@AmineDiro
Member Author

Thanks @kashif 🙏🏼 🙏🏼

@AmineDiro AmineDiro requested review from kashif and qgallouedec May 25, 2026 13:28
AmineDiro and others added 6 commits May 26, 2026 10:10
Co-authored-by: Kashif Rasul <kashif.rasul@gmail.com>
Co-authored-by: Kashif Rasul <kashif.rasul@gmail.com>
Removed user 'nouamanetazi' from the list of users.
Co-authored-by: Kashif Rasul <kashif.rasul@gmail.com>
Co-authored-by: Kashif Rasul <kashif.rasul@gmail.com>
Comment thread _blog.yml Outdated
@kashif kashif merged commit 3a5a344 into huggingface:main May 27, 2026
2 checks passed
Comment thread delta-weight-sync.md

## 1. The One Terabyte Problem

If you read our previous post on [the landscape of async RL training](https://huggingface.co/blog/async-rl-landscape), you already know the punchline. Every async RL library, regardless of how it spells "actor model" or which color its NCCL backend is painted, eventually trips over the same root: **weight synchronization**.
Member Author

Suggested change
If you read our previous post on [the landscape of async RL training](https://huggingface.co/blog/async-rl-landscape), you already know the punchline. Every async RL library, regardless of how it spells "actor model" or which color its NCCL backend is painted, eventually trips over the same root: **weight synchronization**.
If you read our previous post on [the landscape of async RL training](https://huggingface.co/blog/async-rl-training-landscape), you already know the punchline. Every async RL library, regardless of how it spells "actor model" or which color its NCCL backend is painted, eventually trips over the same root: **weight synchronization**.

3 participants