Skip to content

Conversation

@rkooo567
Copy link
Contributor

@rkooo567 rkooo567 commented Oct 8, 2022

…195)"

This reverts commit 75a0c49.

Why are these changes needed?

This reverts the PR and fixes the test failures

This also fixes a bug around _monitor_jobs API. the monitor job can be called on the same job twice now, which will break the event (because at the end of monitor job, we record the event that job is completed). The same completed event can be reported twice without the fix

Related issue number

Checks

  • I've signed off every commit(by using the -s flag, i.e., git commit -s) in this PR.
  • I've run scripts/format.sh to lint the changes in this PR.
  • I've included any doc changes needed for https://docs.ray.io/en/master/.
  • I've made sure the tests are passing. Note that there might be a few flaky tests, see the recent failures at https://flakey-tests.ray.io/
  • Testing Strategy
    • Unit tests
    • Release tests
    • This PR is not tested :(

Signed-off-by: SangBin Cho <[email protected]>
Signed-off-by: SangBin Cho <[email protected]>
@rkooo567
Copy link
Contributor Author

rkooo567 commented Nov 2, 2022

cc @architkulkarni I also added the idempotent monitor_jobs method here. Can you review one more time? I am planning to merge it soon.

Signed-off-by: SangBin Cho <[email protected]>
Signed-off-by: SangBin Cho <[email protected]>
@rkooo567
Copy link
Contributor Author

rkooo567 commented Nov 2, 2022

test_dashboard is unrelated. It is flaky in the master, and the flaky test has been fixed from other PR

@architkulkarni
Copy link
Contributor

@rkooo567 can you add some context to the PR description (what test was broken, how does this fix it)

Copy link
Contributor

@architkulkarni architkulkarni left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

monitor_jobs changes look good!

@architkulkarni
Copy link
Contributor

test_event failed https://buildkite.com/ray-project/oss-ci-build-pr/builds/3935#018438f6-bd8b-4397-bc31-8523630b6a5f/3477-4459

Ideally we would also have a unit test for self._monitored_jobs.

@architkulkarni architkulkarni added the @author-action-required The PR author is responsible for the next step. Remove tag to send back to the reviewer. label Nov 2, 2022
@rkooo567
Copy link
Contributor Author

rkooo567 commented Nov 2, 2022

Let me make a follow up PRs for unit tests. I will add a bunch of them including verifying events.

@rkooo567
Copy link
Contributor Author

rkooo567 commented Nov 2, 2022

the dashboard failure should be unrelated and fixed in the mastesr

@rkooo567 rkooo567 merged commit 0540b1f into ray-project:master Nov 2, 2022
jjyao pushed a commit that referenced this pull request Nov 3, 2022
Lint failed after it is merged due to some conflict. This should fix the issue

Signed-off-by: SangBin Cho <[email protected]>
minerharry pushed a commit to minerharry/ray that referenced this pull request Dec 6, 2022
…9164)" (ray-project#29… (ray-project#29196)

This reverts the PR and fixes the test failures

This also fixes a bug around _monitor_jobs API. the monitor job can be called on the same job twice now, which will break the event (because at the end of monitor job, we record the event that job is completed). The same completed event can be reported twice without the fix

upgrade gtest to 1.12.1 (ray-project#29902)

While working on ray-project#28209 I hit issue google/googletest#3514, which causes a couple gtest-dependent tests to be unable to build

Signed-off-by: Pablo E <[email protected]>
WeichenXu123 pushed a commit to WeichenXu123/ray that referenced this pull request Dec 19, 2022
…9164)" (ray-project#29… (ray-project#29196)

This reverts the PR and fixes the test failures

This also fixes a bug around _monitor_jobs API. the monitor job can be called on the same job twice now, which will break the event (because at the end of monitor job, we record the event that job is completed). The same completed event can be reported twice without the fix

Signed-off-by: Weichen Xu <[email protected]>
WeichenXu123 pushed a commit to WeichenXu123/ray that referenced this pull request Dec 19, 2022
Lint failed after it is merged due to some conflict. This should fix the issue

Signed-off-by: SangBin Cho <[email protected]>
Signed-off-by: Weichen Xu <[email protected]>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

@author-action-required The PR author is responsible for the next step. Remove tag to send back to the reviewer.

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants