use patch to fix flaky test optim test in PyTorch 1.12.1 w/ foss/2021a #17733

Closed
branfosj wants to merge 1 commit into easybuilders:develop from branfosj:20230415130205_new_pr_PyTorch1121

Conversation

@branfosj
Member

@branfosj branfosj commented Apr 15, 2023

(created using eb --new-pr)

add patch from #17726 - using separate PRs for each easyconfig

@branfosj
Member Author

Test report by @branfosj
FAILED
Build succeeded for 0 out of 1 (1 easyconfigs in total)
bear-pg0104u17b.bear.cluster - Linux RHEL 8.6, x86_64, Intel(R) Xeon(R) Platinum 8360Y CPU @ 2.40GHz (icelake), Python 3.6.8
See https://gist.github.com/branfosj/37a02d1c28810475bbe5249da2f1e77f for a full test report.

@boegel boegel added this to the next release (4.7.2) milestone Apr 15, 2023
@boegel boegel changed the title from "fix flaky test optim test PyTorch 1.12.1 foss/2021a" to "use patch to fix flaky test optim test in PyTorch 1.12.1 w/ foss/2021a" Apr 15, 2023
@boegel
Member

boegel commented Apr 15, 2023

test_optim failed!

@branfosj 🤔

@boegel
Member

boegel commented Apr 15, 2023

@boegelbot please test @ generoso
CORE_CNT=16

@branfosj
Member Author

:(

======================================================================
FAIL: test_adagrad (__main__.TestOptim)
----------------------------------------------------------------------
Traceback (most recent call last):
  File "/dev/shm/branfosj/build-up-EL8/PyTorch/1.12.1/foss-2021a/pytorch-v1.12.1/test/test_optim.py", line 746, in test_adagrad
    self._test_basic_cases(
  File "/dev/shm/branfosj/build-up-EL8/PyTorch/1.12.1/foss-2021a/pytorch-v1.12.1/test/test_optim.py", line 313, in _test_basic_cases
    self._test_basic_cases_template(
  File "/dev/shm/branfosj/build-up-EL8/PyTorch/1.12.1/foss-2021a/pytorch-v1.12.1/test/test_optim.py", line 141, in _test_basic_cases_template
    self.assertGreater(fn().item(), initial_value)
AssertionError: 0.0 not greater than 0.0

----------------------------------------------------------------------

@boegelbot
Collaborator

@boegel: Request for testing this PR well received on login1

PR test command 'EB_PR=17733 EB_ARGS= EB_CONTAINER= /opt/software/slurm/bin/sbatch --job-name test_PR_17733 --ntasks="16" ~/boegelbot/eb_from_pr_upload_generoso.sh' executed!

  • exit code: 0
  • output:
Submitted batch job 10661

Test results coming soon (I hope)...

Details

- notification for comment with ID 1509843732 processed

Message to humans: this is just bookkeeping information for me,
it is of no use to you (unless you think I have a bug, which I don't).

@boegelbot
Collaborator

Test report by @boegelbot
FAILED
Build succeeded for 0 out of 1 (1 easyconfigs in total)
cnx1 - Linux Rocky Linux 8.5, x86_64, Intel(R) Xeon(R) CPU E5-2690 v3 @ 2.60GHz (haswell), Python 3.6.8
See https://gist.github.com/boegelbot/b305b7963efdd9d7639f58222eb60be5 for a full test report.

@branfosj
Member Author

Based on the failures seen, I am closing this and suggesting we revert adding the patch to the other PyTorch 1.12.1 easyconfig that we've merged (#17737).

@branfosj branfosj closed this Apr 15, 2023
@branfosj branfosj deleted the 20230415130205_new_pr_PyTorch1121 branch April 18, 2023 09:33
3 participants