@gengliangwang
Member

@gengliangwang gengliangwang commented Feb 15, 2019

What changes were proposed in this pull request?

This is a follow-up PR to fix two issues in #23601:

  1. The class FileWriterFactory has conf: SerializableConfiguration as a member, which duplicates WriteJobDescription.serializableHadoopConf. Removing it reduces the broadcast task binary size by around 70KB.
  2. The test suites OrcV1QuerySuite/OrcV1QuerySuite/OrcV1PartitionDiscoverySuite didn't set the configuration SQLConf.USE_V1_SOURCE_WRITER_LIST to "orc". This PR sets it.
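
To illustrate why item 1 shrinks the serialized payload, here is a minimal, self-contained sketch using plain Java serialization. It is not the Spark code: `ConfWrapper`, `Description`, `FactoryWithDuplicate`, `FactoryWithoutDuplicate`, and `sampleConf` are hypothetical stand-ins, and the sizes it prints are unrelated to the 70KB figure above. It assumes the two configuration wrappers are distinct object instances, so the serializer writes each one in full.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.io.UncheckedIOException;
import java.util.HashMap;
import java.util.Map;

public class DuplicateConfSketch {

    // Hypothetical stand-in for a serializable configuration wrapper.
    public static class ConfWrapper implements Serializable {
        private static final long serialVersionUID = 1L;
        final HashMap<String, String> entries = new HashMap<>();

        public ConfWrapper(Map<String, String> source) {
            // Copy every key and value so each wrapper owns distinct objects;
            // the serializer then cannot emit back-references between wrappers.
            for (Map.Entry<String, String> e : source.entrySet()) {
                entries.put(new String(e.getKey()), new String(e.getValue()));
            }
        }
    }

    // Hypothetical stand-in for a job description that already carries the conf.
    public static class Description implements Serializable {
        private static final long serialVersionUID = 1L;
        final ConfWrapper serializableConf;

        public Description(Map<String, String> conf) {
            this.serializableConf = new ConfWrapper(conf);
        }
    }

    // "Before": the factory holds its own conf in addition to the description's.
    public static class FactoryWithDuplicate implements Serializable {
        private static final long serialVersionUID = 1L;
        final Description description;
        final ConfWrapper conf;

        public FactoryWithDuplicate(Map<String, String> conf) {
            this.description = new Description(conf);
            this.conf = new ConfWrapper(conf);
        }
    }

    // "After": the factory reaches the conf only through the description.
    public static class FactoryWithoutDuplicate implements Serializable {
        private static final long serialVersionUID = 1L;
        final Description description;

        public FactoryWithoutDuplicate(Map<String, String> conf) {
            this.description = new Description(conf);
        }
    }

    // Serialize an object graph in memory and return its size in bytes.
    public static int serializedSize(Serializable obj) {
        try (ByteArrayOutputStream bytes = new ByteArrayOutputStream();
             ObjectOutputStream out = new ObjectOutputStream(bytes)) {
            out.writeObject(obj);
            out.flush();
            return bytes.size();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // Build a configuration with n made-up entries.
    public static Map<String, String> sampleConf(int n) {
        Map<String, String> conf = new HashMap<>();
        for (int i = 0; i < n; i++) {
            conf.put("example.key." + i, "value-" + i);
        }
        return conf;
    }

    public static void main(String[] args) {
        Map<String, String> conf = sampleConf(500);
        System.out.println("with duplicate:    "
            + serializedSize(new FactoryWithDuplicate(conf)) + " bytes");
        System.out.println("without duplicate: "
            + serializedSize(new FactoryWithoutDuplicate(conf)) + " bytes");
    }
}
```

The second number should be smaller than the first, since the configuration entries are written once instead of twice. How much is saved in practice depends on the size of the real configuration.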

How was this patch tested?

Unit test

@gengliangwang
Member Author

@cloud-fan

@cloud-fan
Contributor

LGTM

@gengliangwang gengliangwang changed the title from "[SPARK-26673][SQL] File source V2: remove duplicated broadcast object in FileWriterFactory" to "[SPARK-26673][FollowUp][SQL] File source V2: remove duplicated broadcast object in FileWriterFactory" Feb 15, 2019
@SparkQA

SparkQA commented Feb 15, 2019

Test build #102394 has finished for PR 23800 at commit 07f7a34.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@HyukjinKwon
Member

Merged to master.

jackylee-ch pushed a commit to jackylee-ch/spark that referenced this pull request Feb 18, 2019
…ast object in FileWriterFactory

Closes apache#23800 from gengliangwang/reduceWriteTaskSize.

Authored-by: Gengliang Wang <[email protected]>
Signed-off-by: Hyukjin Kwon <[email protected]>