Skip to content

Conversation

@JingsongLi
Copy link
Contributor

In #1293, we found lot of places use switch case avro orc parquet to create appender, these logicals can be extract to GenericAppenderFactory.

return Avro.write(outputFile)
.schema(schema)
.createWriterFunc(DataWriter::create)
.named(fileFormat.name())
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The name is usually a table name for the given schema. not fileFormat. we may need to remove this line, or set it with a reasonable table name.

return Parquet.write(outputFile)
.schema(schema)
.createWriterFunc(GenericParquetWriter::buildWriter)
.named(fileFormat.name())
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ditto

return ORC.write(outputFile)
.schema(schema)
.createWriterFunc(GenericOrcWriter::buildWriter)
.setAll(config)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

To keep the consistent semantics, I suggest to set overwrite=true. Both flink and spark's FileAppenderFactory have enabled the overwrite.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, they do.

.withPath(file.toURI().toString())
.withMetrics(appender.metrics())
.withFormat(format);
if (partition != null) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We could call the withPartition(partition) directly without this nullable check, because builder will handle this inside it.

@rdblue
Copy link
Contributor

rdblue commented Aug 20, 2020

Thanks @JingsongLi, looks good.

@rdblue rdblue merged commit c66d060 into apache:master Aug 20, 2020
@JingsongLi JingsongLi deleted the appender branch November 5, 2020 09:42
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants