Minibatch impl #364
Conversation
@@ -196,6 +196,9 @@ class TrainConfig:

    :param seed: Random seed
    :type seed: int

    :param minibatch_size: Size of model input during one forward pass. Must divide batch size
(very pedantic) nit, feel free to ignore: usually I've heard this called micro batch, with minibatch referring to what we usually call "a batch" (as opposed to a single full batch over the whole dataset)
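Whatever the name, the constraint in the new docstring is easy to pin down: the batch is split into `batch_size // minibatch_size` slices, each run through one forward pass. A minimal sketch of how the field and its divisibility check could look (the defaults and the `__post_init__` check are illustrative assumptions, not the PR's exact code):

```python
from dataclasses import dataclass

@dataclass
class TrainConfig:
    batch_size: int = 32
    seed: int = 1000
    # Size of model input during one forward pass; must divide batch_size.
    minibatch_size: int = 16

    def __post_init__(self):
        assert self.batch_size % self.minibatch_size == 0, \
            "minibatch_size must divide batch_size"
        # Number of forward/backward passes accumulated per optimizer step.
        self.num_mb = self.batch_size // self.minibatch_size
```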
stats_accum = []
for mbi in range(self.num_mb):
    forward_time -= time()
    loss, stats = self.loss(batch)
Shouldn't this get the loss using a minibatch sliced from `batch`?
Ah yeah good catch (I copied this over from a different branch and forgot to change this)
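For concreteness, the acknowledged fix could look like this sketch, where each pass computes the loss on a slice of `batch` rather than on the whole batch (`self.mb_size` and the field-by-field slicing are assumptions about the trainer's internals, and `time()` comes from the `time` module as in the surrounding code):

```python
stats_accum = []
for mbi in range(self.num_mb):
    # Slice every field of the batch for this minibatch
    # (assumes a dataclass-style batch such as PPORLBatch).
    sl = slice(mbi * self.mb_size, (mbi + 1) * self.mb_size)
    mb = type(batch)(
        **{name: getattr(batch, name)[sl] for name in batch.__dataclass_fields__}
    )
    forward_time -= time()
    loss, stats = self.loss(mb)
    forward_time += time()
    stats_accum.append(stats)
```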
forward_time = 0
backward_time = 0
stats_accum = []
for mb in mbs:
To avoid unnecessary gradient synchronization when using gradient accumulation, you can simply add `self.accelerator.accumulate(self.model)` here.
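For reference, the pattern from the accelerate docs applied to this loop would look roughly like the following sketch (`self.opt` is an assumption about the trainer's optimizer attribute):

```python
for mb in mbs:
    # Inside accumulate(), accelerate suppresses the gradient all-reduce
    # (and skips optimizer.step()/zero_grad()) until the final
    # accumulation step, as determined by gradient_accumulation_steps.
    with self.accelerator.accumulate(self.model):
        loss, stats = self.loss(mb)
        self.accelerator.backward(loss)
        self.opt.step()
        self.opt.zero_grad()
```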
Does that require setting `gradient_accumulation_steps` for the accelerator? cc @Dahoas
fwiw i tried using that atop this PR and got weird results: https://wandb.ai/uwu1/trlx/reports/Untitled-Report--VmlldzozOTAyNDg4
yes, but we're specifying that already in the config files like `zero2-bf16.yaml`

did you add it right below this line or literally at the top of the script? it should be added right below this line, since entering the context is how `accumulate` tracks the number of steps being executed and knows when to sync (it keeps an internal step counter)

Also, the division of the loss by `self.num_mb` should go away, since that would be handled by the accelerator.

Here's an example: https://github.com/muellerzr/timing_experiments/blob/main/good.py#L152
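To illustrate the loss-scaling point: with gradient accumulation enabled, `accelerator.backward` divides the loss by `gradient_accumulation_steps` internally, so keeping the manual division would scale the loss twice. Roughly (a sketch, not the PR's diff):

```python
# Before: manual accumulation, so the loss is scaled by hand.
loss = loss / self.num_mb
self.accelerator.backward(loss)

# After: inside accelerator.accumulate(...), pass the unscaled loss;
# accelerate applies the 1/gradient_accumulation_steps factor itself.
self.accelerator.backward(loss)
```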
@eluzhnica want to make a PR adding that atop this PR? then we can merge it in after this one. It would be good to still be able to specify it using the TRLConfig vs having to use a separate one
@cat-state Okay just made a PR. I'll try to run the same tests that @Dahoas ran to confirm it works as intended.
As for the TRLConfig, happy to do so, but it seems to me that the configs for accelerate are set up separately in the repo (configs/accelerate/...) and that is the general pattern I've seen elsewhere too. Accelerate feeds those params behind the scenes for us automatically, so if we were to also specify them in the TRLConfig it would be a bit redundant (and could lead to conflicting values). Let me know what you think.
LGTM!
@@ -468,18 +475,40 @@ def learn(self):  # noqa: C901
        for _ in range(self.config.train.epochs):
            # For each batch
            for batch in self.train_dataloader:
                mbs = [
                    PPORLBatch(
@Dahoas Just to confirm, this is still a draft, correct? Since (although I haven't run it) I think this would break for `ILQLBatch` and other datatypes
Ah you're right, I just ran this with the benchmarking scripts and it crashed here with this error
Fixed this here: #403
Need to make it compatible with ILQL
* Add minibatch iterator
* Add tests
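The merged iterator itself isn't shown here, but a generic version in the same spirit can slice any dataclass-style batch field by field, which is what makes it trainer-agnostic. An illustrative sketch, not the code from #403:

```python
from dataclasses import fields, replace

def minibatches(batch, mb_size: int):
    """Yield copies of a dataclass-style batch (PPORLBatch, ILQLBatch, ...)
    with every field sliced to mb_size rows."""
    total = len(getattr(batch, fields(batch)[0].name))
    for start in range(0, total, mb_size):
        yield replace(
            batch,
            **{f.name: getattr(batch, f.name)[start : start + mb_size]
               for f in fields(batch)},
        )
```

Usage would then be `for mb in minibatches(batch, config.train.minibatch_size): ...` inside the training loop.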
Merged #403 which makes mini-batching compatible with all trainers. Thank you @eluzhnica!
Let's merge this into main once #396 gets merged.
* Avoid gradient synchronization when accumulating
* Fix accumulation to account for dataloader
* Add some tests
@cat-state Can you take another look and if it looks good we can merge?
thanks @eluzhnica and @Dahoas! just tried with ILQL and it seems to work: https://wandb.ai/uwu1/trlx/runs/m2e4rwga?workspace=user-uwu1
Implements minibatching for PPO.

PPO sentiments (bs: 32, mbs: 16): https://wandb.ai/dahoas/trlx/runs/oo6t8rla/overview?workspace=user-dahoas

PPO HH on GPT-NeoX (bs: 4, mbs: 1): https://wandb.ai/dahoas/trlx/runs/9rnbmtu6?workspace=user-dahoas