Add delta weight sync blogpost by AmineDiro · Pull Request #3386 · huggingface/blog

AmineDiro · 2026-05-22T20:07:49Z

Sparse safetensors over HF Buckets for async RL weight sync in TRL. ~99% bf16 sparsity at RL learning rates means per-step payload drops from 1.2 GB to 20-35 MB on Qwen3-0.6B. Includes four interactive animations and a disaggregated demo running on HF Spaces.
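A rough sanity check on those payload numbers, as a back-of-envelope sketch rather than code from this PR. It assumes a nominal 0.6e9 parameters, 2 bytes per bf16 value, and a hypothetical sparse layout that stores one 4-byte index next to each changed value; the actual on-disk encoding used by the post may differ.

```python
def dense_bytes(num_params: int, bytes_per_value: int = 2) -> int:
    """Size of a full checkpoint when every parameter is shipped (bf16 = 2 bytes)."""
    return num_params * bytes_per_value


def sparse_bytes(
    num_params: int,
    changed_fraction: float,
    bytes_per_value: int = 2,
    bytes_per_index: int = 4,  # assumption: one int32 index per changed value
) -> int:
    """Size of a delta that ships only changed values plus their indices."""
    changed = int(round(num_params * changed_fraction))
    return changed * (bytes_per_value + bytes_per_index)


if __name__ == "__main__":
    num_params = 600_000_000  # nominal parameter count, an assumption
    print(f"dense: {dense_bytes(num_params) / 1e9:.1f} GB")
    # ~99% sparsity means roughly 1% of values change; 0.5% is shown for comparison.
    for fraction in (0.01, 0.005):
        size_mb = sparse_bytes(num_params, fraction) / 1e6
        print(f"changed fraction {fraction}: {size_mb:.0f} MB")
```

Under these assumptions the dense payload is 1.2 GB and the delta is 36 MB at 1% changed values, or 18 MB at 0.5%, which is the same order of magnitude as the 20-35 MB quoted above.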

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

AmineDiro · 2026-05-22T20:08:38Z

@lewtun @kashif @qgallouedec kindly review when you have time :)

qgallouedec · 2026-05-22T20:13:07Z

Very easy to read from top to bottom! Nice work! I couldn't access the figures though.
I only have some minor recommendations.

qgallouedec · 2026-05-22T20:19:14Z

+
+If you read our previous post on [the landscape of async RL training](https://huggingface.co/blog/async-rl-landscape), you already know the punchline. Every async RL library, regardless of how it spells "actor model" or which color its NCCL backend is painted, eventually trips over the same root: **weight synchronization**.
+
+The inference engine speaks the policy of step N. The trainer just finished step N+1. The fresh weights have to get from one side to the other before the inference engine starts drifting hopelessly off-policy. In a synchronous setup you pay for this once per step and it is no big deal. In an async setup it happens constantly, in the background, while generation is also trying to happen, and it had better be fast.


In a synchronous setup you pay for this once per step and it is no big deal

I would disagree with this. It's the same problem in the sync setup: you interrupt training for 1 min 30 s where you could interrupt it for only 1 s.

Agreed, I need to fix it. The better point to make is that the trainer doesn't need to stop and wait on the upload. You can just inform the inference engine that the weights are ready and let it fetch them from the rollout buffer.
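A minimal sketch of that non-blocking handoff, assuming a background uploader thread and a notification queue. Every name here (`WeightStore`, `run_demo`, the queues) is hypothetical and only illustrates the idea; it is not TRL's or vLLM's API.

```python
import queue
import threading


class WeightStore:
    """Hypothetical stand-in for shared storage that both sides can reach."""

    def __init__(self):
        self._lock = threading.Lock()
        self._blobs = {}

    def put(self, version, payload):
        with self._lock:
            self._blobs[version] = payload

    def get(self, version):
        with self._lock:
            return self._blobs[version]


def run_demo(num_steps=3):
    store = WeightStore()
    upload_q = queue.Queue()  # trainer -> uploader
    ready_q = queue.Queue()   # uploader -> inference ("weights are ready")
    applied = []

    def uploader():
        # Performs the slow upload off the training thread, then notifies.
        while True:
            item = upload_q.get()
            if item is None:
                ready_q.put(None)  # forward the shutdown signal
                break
            version, payload = item
            store.put(version, payload)
            ready_q.put(version)

    def inference_engine():
        # Fetches new weights only after being told they are ready.
        while True:
            version = ready_q.get()
            if version is None:
                break
            applied.append((version, store.get(version)))

    threads = [threading.Thread(target=uploader),
               threading.Thread(target=inference_engine)]
    for t in threads:
        t.start()

    for step in range(1, num_steps + 1):
        payload = f"weights-after-step-{step}"  # placeholder for real tensors
        upload_q.put((step, payload))  # enqueue and return immediately
        # ...the trainer would start the next step here without waiting...

    upload_q.put(None)
    for t in threads:
        t.join()
    return applied


if __name__ == "__main__":
    print(run_demo())
```

The training loop only enqueues work and moves on; the upload and the fetch both happen on other threads, so the trainer never blocks on the transfer.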

Co-authored-by: Quentin Gallouédec <45557362+qgallouedec@users.noreply.github.com>

Co-authored-by: Kashif Rasul <kashif.rasul@gmail.com>

AmineDiro · 2026-05-25T13:28:38Z

Thanks @kashif 🙏🏼 🙏🏼

Co-authored-by: Kashif Rasul <kashif.rasul@gmail.com>

Removed user 'nouamanetazi' from the list of users.

Co-authored-by: Kashif Rasul <kashif.rasul@gmail.com>

AmineDiro · 2026-05-27T14:10:37Z

+
+## 1. The One Terabyte Problem
+
+If you read our previous post on [the landscape of async RL training](https://huggingface.co/blog/async-rl-landscape), you already know the punchline. Every async RL library, regardless of how it spells "actor model" or which color its NCCL backend is painted, eventually trips over the same root: **weight synchronization**.


Suggested change

-If you read our previous post on [the landscape of async RL training](https://huggingface.co/blog/async-rl-landscape), you already know the punchline. Every async RL library, regardless of how it spells "actor model" or which color its NCCL backend is painted, eventually trips over the same root: **weight synchronization**.
+If you read our previous post on [the landscape of async RL training](https://huggingface.co/blog/async-rl-training-landscape), you already know the punchline. Every async RL library, regardless of how it spells "actor model" or which color its NCCL backend is painted, eventually trips over the same root: **weight synchronization**.

qgallouedec reviewed May 22, 2026


Comment thread assets/delta-weight-sync/thumbnail.png


qgallouedec May 22, 2026


nice!!

AmineDiro reacted with heart emoji


AmineDiro and others added 3 commits May 22, 2026 23:59

Update delta-weight-sync.md

d4650a8

Co-authored-by: Quentin Gallouédec <45557362+qgallouedec@users.noreply.github.com>

Update delta-weight-sync.md

54941af

Co-authored-by: Quentin Gallouédec <45557362+qgallouedec@users.noreply.github.com>

Address review comments: iframes, vLLM PR, link cleanup

b9f6540