Conversation

@HeartSaVioR
Contributor

@HeartSaVioR HeartSaVioR commented Jan 10, 2020

What changes were proposed in this pull request?

This patch temporarily disables the "test_memory_limit" PySpark test.

Why are the changes needed?

I'm seeing consistent PySpark test failures on multiple PRs (#26955, #26201, #27064), and they're not reproducible in local dev, so I expect it will take some time to fix. (I also don't have the context to fix it and would have to rely on a PySpark expert.)

Does this PR introduce any user-facing change?

No.

How was this patch tested?

Ran python/run-tests --testnames 'pyspark.tests.test_worker' and checked that the test is skipped.

@HeartSaVioR
Contributor Author

Ideally we should fix the test sooner rather than later, but given that multiple PRs are stuck on this consistent failure, I'm proposing to disable it temporarily and file an issue to fix and re-enable it.

@HeartSaVioR
Contributor Author

cc. @srowen @viirya @HyukjinKwon


@unittest.skip("disabled temporarily since it's failing consistently")
def test_memory_limit(self):
    self.sc._conf.set("spark.executor.pyspark.memory", "1m")
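For context on what the skipped test exercises: when spark.executor.pyspark.memory is set, the Python worker caps its own address space with an rlimit (via the resource module), and the test expects an over-allocating task to hit that cap. The sketch below is a simplified illustration with hypothetical helper names (parse_memory_mb, apply_worker_memory_limit), not PySpark's actual code:

```python
import re

try:
    import resource  # POSIX-only; the module does not exist on Windows
except ImportError:
    resource = None


def parse_memory_mb(value):
    """Convert a JVM-style memory string ('1m', '8m', '2g') to MiB.

    Hypothetical helper; PySpark does its own parsing internally.
    """
    match = re.fullmatch(r"(\d+)([kmgt])", value.strip().lower())
    if not match:
        raise ValueError("invalid memory string: %r" % value)
    amount, unit = int(match.group(1)), match.group(2)
    mib_per_unit = {"k": 1.0 / 1024, "m": 1.0, "g": 1024.0, "t": 1024.0 * 1024}
    return int(amount * mib_per_unit[unit])


def apply_worker_memory_limit(memory_mb):
    """Best-effort cap on the process address space; True if applied."""
    if resource is None:
        return False  # no resource module available (e.g. Windows)
    limit_bytes = memory_mb * 1024 * 1024
    try:
        resource.setrlimit(resource.RLIMIT_AS, (limit_bytes, limit_bytes))
        return True
    except (ValueError, OSError):
        return False  # limit rejected, e.g. above the current hard cap
```

Note that apply_worker_memory_limit is deliberately not invoked here; installing a 1 MiB address-space cap in the current process would make it unusable.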
Member

The only other thing I can think of is that this toy memory limit is too low. What if it were 8m or something? If you feel like it, you could try testing that too.

Contributor Author

The odd thing is that it passes in local dev (macOS; Python 3 - 3.7.4 / Python 2 - 2.7.10 / PyPy - Python 2.7.13 & PyPy 7.3.0, GCC 4.2.1), so it's not clear this would help, but if we suspect it we can try.

Member

It will pass on macOS because macOS doesn't actually limit the memory; it doesn't work on Windows either. So this hypothesis could make sense. Can we increase it to something like 8m?
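One way to encode the platform caveat raised here (setrlimit(RLIMIT_AS, ...) being accepted but effectively unenforced on macOS, and the resource module being absent on Windows) is to gate the test on the platform. This is a hypothetical sketch (rlimit_is_enforced, WorkerMemoryTest are made-up names), not PySpark's actual test code:

```python
import platform
import unittest


def rlimit_is_enforced():
    """Heuristic: address-space rlimits are only reliably enforced on Linux.

    macOS accepts setrlimit(RLIMIT_AS, ...) but does not enforce it, and
    Windows has no resource module at all.
    """
    return platform.system() == "Linux"


class WorkerMemoryTest(unittest.TestCase):
    @unittest.skipUnless(rlimit_is_enforced(), "RLIMIT_AS not enforced here")
    def test_memory_limit(self):
        # The real test would set spark.executor.pyspark.memory and assert
        # that an over-allocating task fails with a MemoryError; elided here.
        pass
```

This also explains why the failure only showed up on the (Linux) Jenkins builders and not in local macOS dev environments.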

Contributor Author

OK let me do it instead. Let's modify the PR title/description after confirming it works.

Member

Maybe it's best to file a JIRA btw.

Contributor Author

What about filing an issue after confirming the trick doesn't work and we have to disable the test? We may be able to just treat this PR as MINOR while mentioning, in the "reason", the new JIRA issue that is expected to address this.

Member

Sure, that works too. FWIW, I'm not sure it fails consistently in master yet. If it does, it would be better to fix the test here.

Contributor Author

FYI, #27159 is the patch increasing the value to 8m.

@SparkQA

SparkQA commented Jan 10, 2020

Test build #116439 has finished for PR 27158 at commit 07608af.

  • This patch fails Python style tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@HyukjinKwon
Member

@HeartSaVioR, also, to doubly make sure, can you cherry-pick #26955, apply this change, and see if the tests pass in another [DO-NOT-MERGE] PR?

for pid in current_pids:
    self.assertTrue(pid in previous_pids)
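For readers skimming the hunk above: it asserts worker reuse, i.e. that every worker PID observed after the second job was already present after the first, so no new workers were forked. An equivalent set-based check, using made-up PID lists rather than the actual test data:

```python
previous_pids = [1001, 1002, 1003]  # hypothetical worker PIDs after job 1
current_pids = [1002, 1003]         # hypothetical worker PIDs after job 2

# Same condition as the assertTrue loop in the diff: the current PIDs
# must be a subset of the previous ones (no freshly forked workers).
workers_reused = set(current_pids) <= set(previous_pids)
```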


Member

I wouldn't change such newlines, BTW; this one is per PEP 8. Same for the other newline below; I don't think it violates PEP 8 either.

Contributor Author

My bad. The linter doesn't work on my machine (maybe because the default Python in my dev environment is 2.7?), so I missed that. Will revert.

@HeartSaVioR
Contributor Author

HeartSaVioR commented Jan 10, 2020

OK, good point, as Jenkins may not pick up the test for this PR. Will do. Thanks!

@SparkQA

SparkQA commented Jan 10, 2020

Test build #116440 has finished for PR 27158 at commit 23ad40b.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@viirya
Member

viirya commented Jan 10, 2020

This isn't caused by any recent commit, right? I suppose it may be due to the Jenkins env?

@HeartSaVioR
Contributor Author

#27159 succeeds. I'll convert that PR into a standalone one to fix the test.

@HeartSaVioR
Contributor Author

Done. Let's move to #27162. Closing this one. Thanks!

@HeartSaVioR
Contributor Author

We've realized the patch didn't work, although the builds succeeded for the patch itself.

I feel we may need to reopen this and merge it to unblock PRs, or deal with the test failure immediately. (The latter is not easy for me to do.) WDYT?

@HyukjinKwon
Member

Let me take a look today and see if it can be fixed. Otherwise, yes, we might have to disable the test.
