@LuciferYang
Contributor

What changes were proposed in this pull request?

This PR ignores the test `collect data with single partition larger than 2GB bytes array limit` in `DatasetLargeResultCollectingSuite` by default, because it cannot run successfully with Spark's default Java options.

Why are the changes needed?

To avoid local test failures.

Does this PR introduce any user-facing change?

No, this is a test-only change.

How was this patch tested?

  • Pass GA
  • Manual test: in my test environment, after changing -Xmx4g to -Xmx10g, both Maven and SBT run the test successfully.
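
For the SBT side, the manual change might look roughly like the fragment below. This is a sketch, not the actual contents of `SparkBuild.scala`: `Test / fork` and `Test / javaOptions` are standard sbt keys, but how Spark's build assembles its test JVM options is not shown in this PR, so the surrounding structure is an assumption.

```scala
// sbt settings fragment (sketch only; the real SparkBuild.scala is organized differently).
// javaOptions only take effect when tests run in a forked JVM.
Test / fork := true

// Stand-in for the place where the build currently passes -Xmx4g:
// for a local run of the ignored test, raise the heap to 10g.
Test / javaOptions ++= Seq("-Xmx10g")
```

The Maven equivalent is the same `-Xmx4g` to `-Xmx10g` substitution in the `scalatest-maven-plugin` configuration in `sql/core/pom.xml`.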

@github-actions github-actions bot added the SQL label Nov 18, 2022
@LuciferYang
Contributor Author

cc @HyukjinKwon

Member

@HyukjinKwon HyukjinKwon left a comment

Thanks. LGTM

// default Java Options, if user need do local test, please make the following changes:
// - Maven test: change `-Xmx4g` of `scalatest-maven-plugin` in `sql/core/pom.xml` to `-Xmx10g`
// - SBT test: change `-Xmx4g` of `Test / javaOptions` in `SparkBuild.scala` to `-Xmx10g`
ignore("collect data with single partition larger than 2GB bytes array limit") {
Contributor

@liuzqt, I know this was iterated on multiple times to get it to work. Instead of the shared local Spark session, did it work locally when using a local Spark cluster?

Contributor

Yes, @LuciferYang is right: -Xmx4g needs to be changed to -Xmx10g to make it work. With that change it works for both the shared local session and a local cluster; without it, neither works.

Thanks for the fix! Previously I only tested this in an IDE, and I guess it increased the memory under the hood. Sorry for the inconvenience.

Contributor Author

So how do we move forward? This is blocking developers.

Contributor

I think we can leave it as ignored for now, with the comments about using more memory to make it work. I'm not sure whether we can configure the build args for a specific test suite.
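
One alternative that does not need per-suite build args, sketched here as an assumption rather than something this PR does, is to check the JVM's maximum heap at runtime and skip the test when it is too small. `HeapCheck`, `RequiredHeapBytes`, and the 9 GiB threshold are hypothetical; the threshold is set a little under 10g because `Runtime.maxMemory` can report slightly less than the `-Xmx` value.

```scala
object HeapCheck {
  // Hypothetical threshold, a little under the -Xmx10g mentioned in this PR.
  val RequiredHeapBytes: Long = 9L * 1024 * 1024 * 1024

  // Maximum amount of memory the JVM will attempt to use, in bytes.
  def maxHeapBytes: Long = Runtime.getRuntime.maxMemory()

  // True when the available heap is at least the required size.
  def hasEnoughHeap(
      required: Long = RequiredHeapBytes,
      available: Long = maxHeapBytes): Boolean =
    available >= required

  def main(args: Array[String]): Unit = {
    val gib = maxHeapBytes.toDouble / (1L << 30)
    println(f"max heap: $gib%.2f GiB, enough for the large collect test: ${hasEnoughHeap()}")
  }
}
```

In a ScalaTest suite, a guard such as `assume(HeapCheck.hasEnoughHeap())` at the top of the test would mark it as canceled rather than failed when the heap is too small. Whether that is preferable to `ignore` here is a judgment call.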

@LuciferYang
Contributor Author

friendly ping @HyukjinKwon

@HyukjinKwon
Member

Merged to master.

@LuciferYang
Contributor Author

Thanks @HyukjinKwon @mridulm @liuzqt

SandishKumarHN pushed a commit to SandishKumarHN/spark that referenced this pull request Dec 12, 2022
…larger than 2GB bytes array limit` in `DatasetLargeResultCollectingSuite`

### What changes were proposed in this pull request?
This PR ignores the test `collect data with single partition larger than 2GB bytes array limit` in `DatasetLargeResultCollectingSuite` by default, because it cannot run successfully with Spark's default Java options.

### Why are the changes needed?
Avoid local test failure.

### Does this PR introduce _any_ user-facing change?
No, this is a test-only change.

### How was this patch tested?
- Pass GA
- Manual test: in my test environment, after changing `-Xmx4g` to `-Xmx10g`, both Maven and SBT run the test successfully.

Closes apache#38704 from LuciferYang/SPARK-41193.

Authored-by: yangjie01 <[email protected]>
Signed-off-by: Hyukjin Kwon <[email protected]>
beliefer pushed a commit to beliefer/spark that referenced this pull request Dec 15, 2022
beliefer pushed a commit to beliefer/spark that referenced this pull request Dec 18, 2022