
Conversation

@thomasw21 thomasw21 (Owner) commented Aug 7, 2021

Context

I started experimenting with tests after running into an issue with prefix LM: the DeepSpeed version of GPT has some hacks that change behaviour around the attention mechanism. https://github.com/bigscience-workshop/Megatron-DeepSpeed/blob/main/pretrain_gpt.py#L56-L71

Basically, the issue is that DeepSpeed overrides some of the arguments passed down, forcing the model to ignore whatever mask I feed to the rest of the stack. Everything runs smoothly, but a causal mask is used instead of a prefix one.
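
The failure mode is roughly of this shape (a toy sketch only, not the actual pretrain_gpt.py code; the function names are made up for illustration): a fixed causal mask is built once and used downstream, so the per-batch mask is accepted but never consulted.

```python
import torch

def build_fixed_causal_mask(seq_length: int) -> torch.Tensor:
    # Lower-triangular boolean matrix: position i may only attend to positions <= i.
    return torch.tril(torch.ones(seq_length, seq_length, dtype=torch.bool))

def masked_scores(scores: torch.Tensor, batch_mask: torch.Tensor,
                  fixed_mask: torch.Tensor) -> torch.Tensor:
    # Bug pattern: `batch_mask` (e.g. a prefix mask) is passed in but never used,
    # so the result is always causal regardless of what the caller provided.
    return scores.masked_fill(~fixed_mask, float("-inf"))
```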

What I want to test

Ideally I would want to test deepspeed --num_gpus 1 pretrain_gpt.py ${ARGS} directly. However, that makes it hard to handle different kinds of inputs, because data is fed via a file. Instead I did the following:

```python
# Initialise the model the same way pretrain_gpt.py does, then drive it by hand.
model, _, _ = setup_model_and_optimizer(gpt_model_provider)
model = model[0]
# Build a batch from raw token ids instead of going through the data files.
input_batch = get_gpt_batch_pipe({"text": token_ids})[0]
output = model(*input_batch)
```

It's not ideal, as it feels like a hacky way to initialise the model correctly, but it lets me use the model "interactively", i.e. I can feed it an input and then test the outputs.

We use patch to simulate different arguments.
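
For illustration, here is a minimal sketch of that pattern, assuming patch means unittest.mock.patch and that the arguments object comes from Megatron's get_args(); the patched attribute name, the dummy token ids and the import paths below are placeholders, not the real test code:

```python
import unittest
from unittest.mock import patch

from megatron import get_args                             # accessor for the parsed arguments (path assumed)
from megatron.training import setup_model_and_optimizer   # path assumed
# gpt_model_provider / get_gpt_batch_pipe are the helpers used above (imports omitted here).

token_ids = [[5, 23, 94, 7, 2, 11, 40, 3]]  # dummy batch of token ids


class TestGPTModel(unittest.TestCase):
    def test_with_patched_argument(self):
        args = get_args()
        # Temporarily flip one parsed argument; "some_masking_flag" is illustrative only.
        with patch.object(args, "some_masking_flag", True, create=True):
            model, _, _ = setup_model_and_optimizer(gpt_model_provider)
            model = model[0]
            input_batch = get_gpt_batch_pipe({"text": token_ids})[0]
            output = model(*input_batch)
        # e.g. the logits should keep the (batch, sequence) dimensions of the input tokens.
        self.assertEqual(output.shape[:2], input_batch[0].shape[:2])
```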

How I ran the tests

```bash
# Prevent the hf transformers and datasets libraries from accessing the internet
export TRANSFORMERS_OFFLINE=1
export HF_DATASETS_OFFLINE=1

# Emulate a distributed env
export MASTER_ADDR=localhost
export MASTER_PORT=9994
export RANK=0
export LOCAL_RANK=0
export WORLD_SIZE=1

# Run the tests
python -m unittest Megatron-DeepSpeed.megatron.model.test.test_gpt_model
```
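
For context, MASTER_ADDR, MASTER_PORT, RANK and WORLD_SIZE are the variables read by torch.distributed's default env:// rendezvous (LOCAL_RANK is what launchers normally set per process), so a single-process "distributed" group can come up without deepspeed or a launcher. A minimal sketch, independent of the Megatron code:

```python
import torch.distributed as dist

# With the environment variables above exported, the env:// rendezvous resolves
# to a single-process group (gloo is used here so no GPU is required).
dist.init_process_group(backend="gloo", init_method="env://")
print(dist.get_rank(), dist.get_world_size())  # -> 0 1
dist.destroy_process_group()
```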

TODO

  • Write test for causal gpt
  • Write test for prefix lm (see the sketch after this list for the kind of property such a test could assert)
  • Write test for rotary embeddings (I'm not sure how to test that exactly, beyond checking that it runs)
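
As a rough idea of what the prefix-lm test could check (a self-contained sketch with made-up helper names, not the repository's masking code): inside the prefix, attention is bidirectional; in the target region it stays causal; and the resulting mask must genuinely differ from the plain causal one, otherwise we are back to the bug described above.

```python
import torch

def make_causal_mask(seq_length: int) -> torch.Tensor:
    return torch.tril(torch.ones(seq_length, seq_length, dtype=torch.bool))

def make_prefix_mask(seq_length: int, prefix_length: int) -> torch.Tensor:
    mask = make_causal_mask(seq_length)
    # The prefix attends to itself bidirectionally.
    mask[:prefix_length, :prefix_length] = True
    return mask

def test_prefix_mask_differs_from_causal():
    seq_length, prefix_length = 8, 3
    causal = make_causal_mask(seq_length)
    prefix = make_prefix_mask(seq_length, prefix_length)
    assert prefix[0, prefix_length - 1]                  # prefix tokens see "future" prefix tokens
    assert not prefix[prefix_length, prefix_length + 1]  # targets still cannot see the future
    assert not torch.equal(causal, prefix)               # the two strategies genuinely differ
```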

```diff
 )
-return (tokens, position_ids, attention_mask), (labels, loss_mask)
+return (tokens, position_ids, attention_mask), (labels, loss_mask), prefix_indices
```
@thomasw21 thomasw21 (Owner, Author) commented:


This is okay, as all values beyond the 3rd index are ignored, which makes it a nice way to expose variables needed for testing.
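
To illustrate why (a toy sketch, not the repository code): a consumer that only indexes the positions it knows about never touches the trailing element, while a test can still read it.

```python
def fake_batch_pipe():
    # Mirrors the new return shape: (inputs, labels, prefix_indices).
    return ("inputs", "labels", "prefix_indices")

batch = fake_batch_pipe()
inputs, labels = batch[0], batch[1]   # all the training path reads
prefix_indices = batch[-1]            # extra value available to the tests
print(inputs, labels, prefix_indices)
```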


stas00 commented Aug 8, 2021

Glad to see you started working on it, @thomasw21

FYI, there will be more tests coming from: bigscience-workshop#47

Well, a lot more.

I guess we can just drop the helper library into tests for now.

  • The base for the tests will be under ./tests and they will run with pytest tests, i.e. all the tests will live there.

Let me know if you prefer I set things up first and then you can drop in all your awesome additions.

@thomasw21 thomasw21 (Owner, Author):

Cool! I don't have much experience with pytest (none, actually) or with how it's done in transformers. Maybe you could set it up and then I can migrate to the new framework?

Either way I think I can still merge this (on my prefix lm branch, but also on master), as these are runnable tests (see the commands in the description), which might help engineers at least check they haven't broken anything on their setup. Once we set up CI it'll be much smoother (and more costly). The tests here don't guarantee you haven't broken anything, but IMO they are a good attempt at catching some basic issues (like my attention one).


stas00 commented Aug 8, 2021

Sounds good, @thomasw21. You can merge, and I will set up the test suite tomorrow; you can then move/adapt your tests, if you prefer that order.

@thomasw21 thomasw21 merged commit 295e8d0 into prefix_lm Sep 16, 2021
thomasw21 added a commit that referenced this pull request Sep 16, 2021
* ICT zeroshot evaluation code

* made more generic, aligned with other tasks

* Fixed based on review recommendation

* fixed another issue

* implementing DPR

* implementation dpr

* adding dpr code

* removed comments

* removed comments

* removed comments

* DPR evaluation debugging

* DPR ongoing

* DPR finetune and evaluation

* fixing model evaluation of retriever

* added pre and post process

* added pre and post process

* evaluation works!

* debugging DPR

* fix copy-n-paste error 

remove erroneous arg.

* Typo fix in readme

* t5 fixes

* before cleaning the comments

* vit pipeline fixes

* cleaning the code

* additional cleaning

* renaming the folders

* Add temporary assert to finetuning until it can be fixed.

* Fixed issues with ICT pretraining

* updated the evaluation script for retriever

* updated the evaluation script for retriever

* updated the evaluation script for retriever

* updated the evaluation script for retriever

* added exit interval for finetuning

* updating the scripts

* updating no load rng

* updating script

* Update T5 scripts

* resolved hang issue

* fixed the tensor size mismatch issue

* fixed the evaluation hangs

* Adding readme

* Adding readme

* Adding readme

* Adding readme

* Adding readme

* Adding readme

* Adding readme

* Adding readme

* Clean up README.md a bit

* addressed comments

* updated readme

* updated readme

* updated readme

* updated readme

* Basic handling of prefix lm by updating the mask

* Add prefix option to gpt temporarily and prevent it from using the custom kernel

* Add argument for prefix lm, in order to configure masking strategy

* Woops

* loss_on_targets_only flag, assert that current prefix implementation only works with reset_attention_mask set to True and attempt to fix empty slice issue

* Format

* Reverse renaming

* Allow prefix on partial document at the end

* WIP: add prefix per row feature

* Document the use of None

* Woops

* Handle empty document better

* We might not be able to concat empty tensors

* Handle empty tensor separately

* Debug

* Test

* Add loss masking as script argument

* Turns out deepspeed integration of attention matrices prevented dynamic masks

* Add more asserts

* Prefix can only see the prefix, it cannot see target

* Remove prefix-lm argument as we split the pretrain script

* Iz PR review

* Make masking row dependent when using prefix

* Revert "Merge remote-tracking branch 'origin/master' into prefix_lm"

This reverts commit d49d6e5, reversing
changes made to 28a712d.

* Tests (#1)

* WIP: test

* Still trying to figure out deepspeed

* WIP

* Test test

* Test how to setup deepspeed in unit tests

* Test something else

* Empty strings might be problematic

* Remove unnecessary arguments

* Woops

* Remove global variables at the end of each test and init deepspeed

* Woops

* Maybe adding classmethod

* Woops

* Add debug print to check that tear down happens

* Reset global variables before

* Let's test this

* Try something else

* WIP

* More fix

* More fix

* More stuff to fix

* We really want to compare vectors and not coordinates

* Reformat

* check something out

* fix test

* Remove prefix-lm flag as it's integrated

* Woops

* Add test for without reset attention mask

* Fix test for non reset attention mask

* Fix test

* Update code for prefix lm

Co-authored-by: Mostofa Patwary <mostofa.patwary@gmail.com>
Co-authored-by: Mostofa Patwary <mpatwary@nvidia.com>
Co-authored-by: Stas Bekman <stas00@users.noreply.github.com>
Co-authored-by: Devrim <46989091+devrimcavusoglu@users.noreply.github.com>
Co-authored-by: Stas Bekman <stas@stason.org>
Co-authored-by: Vijay Korthikanti <vkorthikanti@nvidia.com>
Co-authored-by: Jared Casper <jcasper@nvidia.com>
Co-authored-by: Mohammad Shoeybi <mshoeybi@nvidia.com>
Co-authored-by: Deepak Narayanan <dnarayanan@nvidia.com>