Disabling Unrelated Tests When Enabling CUDA Async Allocator in CI by eee4017 · Pull Request #65094 · PaddlePaddle/Paddle

eee4017 · 2024-06-12T17:12:42Z

PR Category

Others

PR Types

Others

Description

This PR disables tests that are unrelated to the CUDA Async Allocator when the allocator is enabled in CI.

paddle-bot · 2024-06-12T17:12:47Z

Your PR has been submitted. Thanks for your contribution to the open-source project!
Please wait for the CI results first. See the Paddle CI Manual for details.

jeng1220 · 2024-06-13T02:28:28Z

CI failed but it was NOT related to this PR

ERROR: test_simple_net_hybrid_strategy (__main__.TestSemiAutoParallelLlamaDataLoader)
----------------------------------------------------------------------
Traceback (most recent call last):
  File "/workspace/Paddle/test/collective/test_communication_api_base.py", line 79, in run_test_case
    self._launcher = subprocess.run(
  File "/usr/lib/python3.10/subprocess.py", line 526, in run
    raise CalledProcessError(retcode, process.args,
subprocess.CalledProcessError: Command '['/usr/bin/python', '-u', '-m', 'paddle.distributed.launch', '--log_dir', '/tmp/tmpkx_sv8w9', '--devices', '0,1,2,3,4,5,6,7', 'semi_auto_llama_dataloader.py']' returned non-zero exit status 1.
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
  File "/workspace/Paddle/test/auto_parallel/hybrid_strategy/test_semi_auto_parallel_llama_model.py", line 224, in test_simple_net_hybrid_strategy
    self.run_test_case(
  File "/workspace/Paddle/test/collective/test_communication_api_base.py", line 90, in run_test_case
    raise RuntimeError(
RuntimeError: Error occurs when running this test case. The return code of command ['/usr/bin/python', '-u', '-m', 'paddle.distributed.launch', '--log_dir', '/tmp/tmpkx_sv8w9', '--devices', '0,1,2,3,4,5,6,7', 'semi_auto_llama_dataloader.py'] is 1
----------------------------------------------------------------------
Ran 9 tests in 444.641s
FAILED (errors=1)

tianshuo78520a · 2024-06-13T07:43:08Z

Ok, I'll take a look at the reason

paddle-ci-bot · 2024-06-27T03:12:23Z

Sorry to inform you that the CIs for 1ca684f passed more than 7 days ago. To prevent PR conflicts, you need to re-run all CIs manually.

zyfncg · 2024-07-01T02:07:31Z

paddle/fluid/memory/allocation/allocator_facade.cc

+          if (FLAGS_use_cuda_managed_memory) {
+            PADDLE_ENFORCE_EQ(FLAGS_use_cuda_managed_memory,


Doesn't the check logic here conflict with the if condition?
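
The quoted diff is truncated, so the expected value passed to `PADDLE_ENFORCE_EQ` is not visible here. As a minimal sketch of the reviewer's concern, the following uses a hypothetical `EnforceEq` helper (an assumption for illustration, not Paddle's macro) to show the two possible outcomes when an equality check on a flag sits inside an `if` guarded by that same flag: the check either always throws or can never fire.

```cpp
#include <cassert>
#include <stdexcept>
#include <string>

// Hypothetical stand-in for an equality check such as PADDLE_ENFORCE_EQ.
// It is NOT Paddle's macro; it only mimics "throw when values differ".
inline void EnforceEq(bool actual, bool expected, const std::string& msg) {
  if (actual != expected) throw std::invalid_argument(msg);
}

// Case A: the check expects the opposite of the guard, so the branch
// throws every time it is entered.
inline bool GuardThenEnforceFalse(bool flag) {
  if (flag) {
    EnforceEq(flag, false, "flag must be false here");
  }
  return true;
}

// Case B: the check repeats the guard, so it can never fire.
inline bool GuardThenEnforceTrue(bool flag) {
  if (flag) {
    EnforceEq(flag, true, "flag must be true here");
  }
  return true;
}

// Reports whether calling f(flag) throws std::invalid_argument.
template <typename F>
bool Throws(F f, bool flag) {
  try {
    f(flag);
  } catch (const std::invalid_argument&) {
    return true;
  }
  return false;
}
```

In either case the guard and the check do not carry independent information, which is consistent with the later "fix useless if" commit in this PR's timeline.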

eee4017 · 2024-07-04T02:37:05Z

You must have one RD (From00, zhangbo9674) approval for file changes in paddle/fluid/framework/new_executor.

From00

LGTM

paddle-bot bot added the contributor External developers label Jun 12, 2024

eee4017 mentioned this pull request Jun 12, 2024

The CUDA Async Allocator #65092

Merged

jeng1220 added the NVIDIA label Jun 13, 2024

onecatcn requested a review from risemeup1 June 14, 2024 02:16

onecatcn assigned zyfncg Jun 14, 2024

eee4017 force-pushed the lawu/disable_tests branch from 492ab3c to 1ca684f Compare June 19, 2024 06:38

lawrence910426 added 5 commits June 27, 2024 03:16

Either stream safe or async allocator

a7b5cb2

Ignore if not enabled

c1e5bec

fix: ignore cuda managed

6dc535e

fix: disable async allocator

f88902e

fix: either async or stream safe

b9f9e08

eee4017 force-pushed the lawu/disable_tests branch from 1ca684f to b9f9e08 Compare June 27, 2024 03:35

zyfncg reviewed Jul 1, 2024

View reviewed changes

fix useless if

2751d7c

eee4017 requested a review from zyfncg July 2, 2024 08:09

zyfncg approved these changes Jul 4, 2024

View reviewed changes

onecatcn requested a review from From00 July 4, 2024 02:37

From00 approved these changes Jul 4, 2024

View reviewed changes

From00 merged commit a30c8a5 into PaddlePaddle:develop Jul 4, 2024

From00 merged 6 commits into PaddlePaddle:develop from eee4017:lawu/disable_tests


6 participants
