Fix skip of test_training_gradient_checkpointing #34723
ydshieh merged 1 commit into huggingface:main from
Conversation
|
@amyeroberts : test failures seem unrelated. Please advise when I should rebase/try again.
19d58d3 has introduced a context manager to manage subtests of test_training_gradient_checkpointing. However, the test body was not moved under the "with" statement. Thus, while tests are correctly marked as skipped, test bodies were still executed. In some cases, as with llama, this caused attribute errors. Fixes: huggingface#34722 Fixes: 19d58d3 ("Add MLLama (huggingface#33703)") Signed-off-by: Dmitry Rogozhkin <dmitry.v.rogozhkin@intel.com>
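The bug shape described in the commit message can be sketched under the plain unittest runner (the class and test names here are hypothetical, not the actual transformers test code):

```python
import io
import unittest

executed = []

class GradientCheckpointingTest(unittest.TestCase):
    # Buggy shape, as described above: skipTest() is raised inside the
    # subTest context manager, but the body sits AFTER the "with"
    # statement, so it still runs even though the subtest is skipped.
    def test_buggy(self):
        with self.subTest("supports gradient checkpointing"):
            self.skipTest("model does not support gradient checkpointing")
        executed.append("buggy body ran")  # still reached

    # Fixed shape: the whole body lives under the "with" statement,
    # so skipTest() really prevents it from running.
    def test_fixed(self):
        with self.subTest("supports gradient checkpointing"):
            self.skipTest("model does not support gradient checkpointing")
            executed.append("fixed body ran")  # never reached

suite = unittest.TestLoader().loadTestsFromTestCase(GradientCheckpointingTest)
unittest.TextTestRunner(stream=io.StringIO(), verbosity=0).run(suite)
print(executed)
```

Under plain unittest, the subTest context manager catches the SkipTest exception and records the subtest as skipped, then execution simply continues after the `with` block — which is exactly why the body still ran in the buggy version.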
|
@amyeroberts : reran tests, no side failures now. Please help review.
|
Hi @dvrogozh, thank you for the PR. When I run things like …
I see … and I also check the body after the … Could you tell us in which cases it will enter that body while …?
|
Could you set breakpoint(s) to check which … Also, is …?
Shortly: if … @ydshieh : this seems interesting. I was wondering why HF CI does not see this issue. Your comment above also suggests that you don't see the issue on your side. It seems I've found the reason. I have 2 systems, one with XPU, another with CUDA. Initially I saw the issue on both systems. Yesterday, I fully cleaned and reinstalled the environment for XPU, and the issue was gone. So, the issue which I observe is triggered by an environment difference. In particular, it shows up if the following package is installed: …
|
Most likely I got …
Most likely no. I am just building pytorch from source on my side, since I purposely look for XPU backend issues in the most recent code. I did not try to check other pytorch versions, but the issue I see does not seem to be related to pytorch.
|
Hi, thanks for the information. Indeed, our CI env. doesn't have … I will check it and see what a fix would look like. Currently, without …
This sounds like we need to add …
As discussed in [1], pytest-subtests changes the behavior of .skipTest(), making it either really skip individual subtests, or skip the entire test if the module is not installed. Hugging Face Accelerate has the module in its dependencies. It makes sense to add it for Transformers as well, to avoid divergent environments between users and CI. See [1]: huggingface#34723 (comment) Signed-off-by: Dmitry Rogozhkin <dmitry.v.rogozhkin@intel.com>
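Whether the plugin is present can be detected the usual way, in the spirit of the `is_*_available()` helpers Transformers uses for optional dependencies (the helper name here is hypothetical, not an existing transformers API):

```python
import importlib.util

def is_pytest_subtests_available() -> bool:
    # Hypothetical helper: detect whether the pytest-subtests plugin is
    # installed, since its presence changes how skipTest() behaves
    # inside subTest blocks, per the discussion above.
    return importlib.util.find_spec("pytest_subtests") is not None

print(is_pytest_subtests_available())
```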
|
Although I think installing … fails with … while change … I am not sure, maybe I am doing stupid things 😭
|
without …
|
with … So, it loops through all 6 cases. It correctly skips the first 3, but after that I don't understand what's going on. It should have reported 4 failures, but it reported 1 failure, 1 pass, and ate up the 2 others. And in the end it only printed the assertion failure for the last iteration :).
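For reference, a reduced stand-in for such a looped subtest (hypothetical case mix, not the real test), run under the plain unittest runner, where every skipped and failed subtest is accounted for individually:

```python
import io
import unittest

class SixCasesTest(unittest.TestCase):
    # Reduced stand-in for a looped subtest: the first 3 cases are
    # skipped via skipTest(), the remaining 3 deliberately fail.
    def test_six_cases(self):
        for i in range(6):
            with self.subTest(case=i):
                if i < 3:
                    self.skipTest(f"case {i} unsupported")
                self.fail(f"case {i} fails")

suite = unittest.TestLoader().loadTestsFromTestCase(SixCasesTest)
result = unittest.TextTestRunner(stream=io.StringIO(), verbosity=0).run(suite)
# Plain unittest records each subtest separately: 3 skips, 3 failures.
print(len(result.skipped), len(result.failures))
```

The confusion reported above is that, under pytest with the pytest-subtests plugin, the per-subtest reporting does not match this plain-unittest accounting.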
|
Yeah. Maybe it's a bug. I will open an issue. |
You mean without …? Output:
|
Yes, sorry. I mean …
|
@ydshieh : I filed #34755 to specifically consider what to do with the subtests story. See the breakdown there on how it works in different cases. Really confusing... As for the …
|
ArthurZucker
left a comment
Leaving this up to you @ydshieh 🤗 I think this one makes sense!
ydshieh
left a comment
Despite the weird issue from pytest-subtests with subTest + skipTest, this PR itself makes sense.
Furthermore, it doesn't change the current behavior when pytest-subtests is not installed (although that behavior is not desirable).
Therefore LGTM to merge.
|
The docs for this PR live here. All of your documentation changes will be reflected on that endpoint. The docs are available until 30 days after the last update.
19d58d3 has introduced a context manager to manage subtests of test_training_gradient_checkpointing. However, the test body was not moved under the "with" statement. Thus, while tests are correctly marked as skipped, test bodies were still executed. In some cases, as with llama, this caused attribute errors.
Fixes: #34722
Fixes: 19d58d3 ("Add MLLama (#33703)")
CC: @amyeroberts, @ArthurZucker