
Conversation

@semioz (Contributor) commented Aug 24, 2025

Resumes the data buffer properly from a checkpoint. When saving, it stores the rollout buffer, metadata, and the dataset to disk. Whether to restore from a checkpoint or start fresh is controlled by a simple config flag, just like we had with resume_step.


GitHub Issue: #748
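
For illustration, the resume toggle described above might look roughly like this (a minimal sketch; the field names mirror resume_step and the flag used in the snippets reviewed below, while the actual config classes are an assumption):

from dataclasses import dataclass


@dataclass
class CkptConfig:
    # step to resume training from, as before
    resume_step: int | None = None
    # restore the data buffer from the checkpoint at resume_step, or start fresh
    resume_buffer_from_checkpoint: bool = False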

@mikasenghaas (Member)

sorry, we're gonna fix the w&b api key so the integration test runs properly.

@semioz (Contributor, Author) commented Aug 25, 2025

@mikasenghaas i was about to test, but should i also convert my torch.save calls to safetensors as you mentioned in this issue? #821

@mikasenghaas (Member)

@semioz nope, hold off on this for now. we will rewrite from torch.save to safetensors.save in a separate pr. depending on whether this or the other pr goes in first, you may have to rebase.

@mikasenghaas self-requested a review on August 25, 2025 at 08:15
@mikasenghaas (Member) left a comment


couple of comments, but it's on the right track! nice job

Comment on lines 105 to 108
torch.save(self.rollout_buffer, path / "rollout_buffer.pt")
torch.save(self.metadata, path / "metadata.pt")

self.dataset.save_to_disk(path / "buffer_dataset")

hm, i can see how this is the easiest way to implement this, but i wonder if we could somehow save/load from a single dataset instance. e.g. metadata could easily be a column in the dataset. rollout buffer is a bit trickier though.. i think it's fine like this for now, but it feels like a single hf dataset might be the cleanest way of serializing
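
A rough sketch of the metadata-as-columns idea, assuming the rollouts are plain dicts and the metadata is a small dict of scalars (function and column names here are illustrative, not the PR's code):

from datasets import Dataset, load_from_disk


def save_as_single_dataset(rollouts: list[dict], metadata: dict, path: str) -> None:
    # attach the same metadata values to every row as prefixed extra columns
    rows = [{**r, **{f"meta_{k}": v for k, v in metadata.items()}} for r in rollouts]
    Dataset.from_list(rows).save_to_disk(path)


def load_from_single_dataset(path: str) -> tuple[list[dict], dict]:
    # recover the metadata from the first row and strip the extra columns again
    ds = load_from_disk(path)
    meta_cols = [c for c in ds.column_names if c.startswith("meta_")]
    metadata = {c.removeprefix("meta_"): ds[0][c] for c in meta_cols}
    rollouts = [{k: v for k, v in row.items() if k not in meta_cols} for row in ds]
    return rollouts, metadata

The rollout buffer could in principle ride along the same way if its entries are row-like, which is where it gets trickier for nested or ragged structures.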

return sampled_rollouts


def load_buffer(path: Path, buffer_config: DataBufferConfigType) -> Buffer:

can we make this function in-place like our other checkpoint logic?
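
For example, an in-place restore might look roughly like this (a sketch only: the attribute names follow the save snippet reviewed above, while the method and the Buffer class shown here are hypothetical):

from pathlib import Path

import torch
from datasets import load_from_disk


class Buffer:
    def load(self, path: Path) -> None:
        # mutate the existing instance instead of returning a new one,
        # mirroring the other in-place checkpoint logic
        # weights_only=False since these are arbitrary pickled python objects
        self.rollout_buffer = torch.load(path / "rollout_buffer.pt", weights_only=False)
        self.metadata = torch.load(path / "metadata.pt", weights_only=False)
        self.dataset = load_from_disk(str(path / "buffer_dataset"))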

Comment on lines 94 to 95
buffer_path = step_path / "buffer"
buffer.save(buffer_path)

this should be inside _save_to_path for async checkpointing to work properly
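
A sketch of the intent here, assuming the checkpoint manager runs _save_to_path on a background worker (the class and method bodies below are illustrative, not the repo's actual code):

import threading
from pathlib import Path


class CkptManager:
    def __init__(self, buffer):
        self.buffer = buffer

    def _save_to_path(self, step_path: Path) -> None:
        # everything written here runs on the async worker, so the buffer
        # write is covered by the same non-blocking checkpoint as the rest
        step_path.mkdir(parents=True, exist_ok=True)
        self.buffer.save(step_path / "buffer")

    def save_async(self, step_path: Path) -> None:
        threading.Thread(target=self._save_to_path, args=(step_path,), daemon=True).start()

A buffer.save left at the call site outside _save_to_path would run synchronously instead, which seems to be the point of the comment above.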

Comment on lines 100 to 107
# Load buffer from checkpoint
if config.ckpt.resume_buffer_from_checkpoint:
logger.info("Resuming buffer from checkpoint")
buffer_path = ckpt_manager.get_ckpt_path(config.ckpt.resume_step) / "buffer"
buffer = load_buffer(str(buffer_path), config.buffer)
else:
logger.info("Initializing buffer from scratch")
buffer = setup_buffer(dataset, config.buffer)

if we make the suggested changes to the buffer load function and the checkpoint, there should be very minimal changes to the orchestrator code
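
With an in-place buffer.load and the buffer write moved into the checkpoint path, the call site could shrink to something like this (sketch only, reusing setup_buffer, config, ckpt_manager and logger from the snippet above):

buffer = setup_buffer(dataset, config.buffer)
if config.ckpt.resume_buffer_from_checkpoint:
    logger.info("Resuming buffer from checkpoint")
    buffer.load(ckpt_manager.get_ckpt_path(config.ckpt.resume_step) / "buffer")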

@semioz (Contributor, Author) commented Aug 25, 2025

@mikasenghaas thanks. converted the load/save process to be fully hf via columns and moved the resume option to the buffer config (lmk if you guys don't wanna make it configurable at all). also made the other changes you requested. ready for your review again i guess.

@mikasenghaas (Member)

Closing as continued in #839
