@zhengruifeng
Contributor

What changes were proposed in this pull request?

Use `submitJob` instead of `runJob`.

Why are the changes needed?

`spark.sparkContext.runJob` blocks until all partitions have finished.
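
To make the difference concrete, here is a minimal sketch using only the Python standard library. It is an analogy, not Spark code: `compute_partition`, `run_job_blocking`, and `submit_job` are hypothetical names invented for illustration, and the thread pool merely stands in for a scheduler. The blocking variant returns nothing until every partition is done, while the asynchronous variant returns immediately and passes each partition's result to a handler as soon as that partition finishes.

```python
import time
from concurrent.futures import ThreadPoolExecutor


def compute_partition(index):
    # Hypothetical stand-in for the work done on one partition.
    time.sleep(0.01 * index)
    return index, f"batch-{index}"


def run_job_blocking(num_partitions):
    # Blocking style: the caller gets nothing back until every partition is done.
    with ThreadPoolExecutor(max_workers=4) as pool:
        return list(pool.map(compute_partition, range(num_partitions)))


def submit_job(num_partitions, result_handler):
    # Asynchronous style: returns futures right away; result_handler runs
    # for each partition as soon as that partition finishes.
    def task(index):
        result_handler(*compute_partition(index))

    pool = ThreadPoolExecutor(max_workers=4)
    futures = [pool.submit(task, i) for i in range(num_partitions)]
    pool.shutdown(wait=False)  # do not block; already-submitted tasks still run
    return futures


# Usage: collect per-partition results as they arrive.
received = []
futures = submit_job(3, lambda index, batch: received.append((index, batch)))
for future in futures:
    future.result()  # wait only at the point where completion is required
```

With the asynchronous style, a consumer can start forwarding early partitions while later ones are still being computed, which is presumably the motivation for the change proposed here.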

Does this PR introduce any user-facing change?

No

How was this patch tested?

Existing Tests

@zhengruifeng
Contributor Author

zhengruifeng commented Nov 11, 2022

Thanks, @HyukjinKwon, for pointing it out.

Also cc @hvanhovell @cloud-fan

@HyukjinKwon
Member

#38613 will handle this actually. Let's leave this closed.

@zhengruifeng
Contributor Author

Closing this PR in favor of #38613.

Member

@HyukjinKwon HyukjinKwon left a comment

@zhengruifeng mind reopening this? LGTM

@zhengruifeng zhengruifeng restored the connect_collect_submitJob branch November 11, 2022 12:15
@zhengruifeng zhengruifeng reopened this Nov 11, 2022
@HyukjinKwon
Member

Merged to master.

@zhengruifeng zhengruifeng deleted the connect_collect_submitJob branch November 14, 2022 00:17
SandishKumarHN pushed a commit to SandishKumarHN/spark that referenced this pull request Dec 12, 2022
…ad of `runJob`

### What changes were proposed in this pull request?
Use `submitJob` instead of `runJob`.

### Why are the changes needed?
`spark.sparkContext.runJob` blocks until all partitions have finished.

### Does this PR introduce _any_ user-facing change?
No

### How was this patch tested?
Existing Tests

Closes apache#38614 from zhengruifeng/connect_collect_submitJob.

Authored-by: Ruifeng Zheng <[email protected]>
Signed-off-by: Hyukjin Kwon <[email protected]>
dongjoon-hyun pushed a commit that referenced this pull request Jan 23, 2024
### What changes were proposed in this pull request?
This PR upgrades Arrow from 14.0.2 to 15.0.0. The new version fixes the compatibility issue with Netty 4.1.104.Final (GH-39265).

Additionally, because the `arrow-vector` module now uses `eclipse-collections` in place of `netty-common` as a compile-level dependency, Apache Spark gains a dependency on `eclipse-collections` with the upgrade to Arrow 15.0.0.

### Why are the changes needed?
The new version brings the following major changes:

Bug Fixes

- GH-34610 - [Java] Fix valueCount and field name when loading/transferring NullVector
- GH-38242 - [Java] Fix incorrect internal struct accounting for DenseUnionVector#getBufferSizeFor
- GH-38254 - [Java] Add reusable buffer getters to char/binary vectors
- GH-38366 - [Java] Fix Murmur hash on buffers less than 4 bytes
- GH-38387 - [Java] Fix JDK8 compilation issue with TestAllTypes
- GH-38614 - [Java] Add VarBinary and VarCharWriter helper methods to more writers
- GH-38725 - [Java] decompression in Lz4CompressionCodec.java does not set writer index

New Features and Improvements

- GH-38511 - [Java] Add getTransferPair(Field, BufferAllocator, CallBack) for StructVector and MapVector
- GH-14936 - [Java] Remove netty dependency from arrow-vector
- GH-38990 - [Java] Upgrade to flatc version 23.5.26
- GH-39265 - [Java] Make it run well with the netty newest version 4.1.104

The full release notes are available at:

- https://arrow.apache.org/release/15.0.0.html

### Does this PR introduce _any_ user-facing change?
No

### How was this patch tested?
Pass GitHub Actions

### Was this patch authored or co-authored using generative AI tooling?
No

Closes #44797 from LuciferYang/SPARK-46718.

Authored-by: yangjie01 <[email protected]>
Signed-off-by: Dongjoon Hyun <[email protected]>