@prazanna
Contributor

Fixes #149

Implemented a custom log format for Hoodie merge-on-read.

@prazanna prazanna self-assigned this May 17, 2017
@prazanna prazanna requested a review from vinothchandar May 17, 2017 20:02
@vinothchandar
Member

@prazanna why does it say the CLA is not signed for you?

Member

@vinothchandar vinothchandar left a comment

LGTM overall.

We still need to implement rollbacks, and as it stands this will break HoodieRealtimeInputFormat until that is fixed. I will make the changes there.


 public enum HoodieFileFormat {
-  PARQUET(".parquet"), AVRO(".avro");
+  PARQUET(".parquet"), HOODIE_LOG(".log");
Member

Maybe just LOG instead of HOODIE_LOG?

/**
* Writer interface to allow appending block to this file format
*/
interface Writer extends Closeable {
Member

Sub-interfaces ;). I like these, and also nested classes (which most folks seem unexcited about for some reason).
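
For readers less familiar with the pattern mentioned here, the following is a minimal sketch of a nested (member) interface in Java. Only `Writer extends Closeable` comes from the diff; `LogFormatSketch`, `appendBlock`, and `InMemoryWriter` are hypothetical names chosen for illustration and should not be read as the PR's actual API.

```java
import java.io.Closeable;
import java.io.IOException;
import java.util.ArrayList;
import java.util.List;

// Hypothetical outer type; only "Writer extends Closeable" is taken from the diff.
interface LogFormatSketch {

  // Member interfaces are implicitly static, so callers refer to
  // LogFormatSketch.Writer without needing an instance of the outer type.
  interface Writer extends Closeable {
    Writer appendBlock(byte[] block) throws IOException;
  }
}

// Toy implementation that keeps blocks in memory instead of writing to a file.
final class InMemoryWriter implements LogFormatSketch.Writer {
  private final List<byte[]> blocks = new ArrayList<>();
  private boolean closed;

  @Override
  public LogFormatSketch.Writer appendBlock(byte[] block) throws IOException {
    if (closed) {
      throw new IOException("writer is closed");
    }
    blocks.add(block.clone()); // defensive copy of the caller's array
    return this;               // returning the writer allows chained appends
  }

  int blockCount() {
    return blocks.size();
  }

  @Override
  public void close() {
    closed = true;
  }
}
```

Grouping the reader and writer contracts under one outer type keeps the format's public surface in a single file, which is presumably what makes the style appealing here.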

return HoodieCorruptBlock.fromBytes(content);
}

private boolean isBlockCorrupt(int blocksize) throws IOException {
Member

Instead of seeking once to find out whether the block is corrupt and then seeking again to read the data, can we detect the corrupt block from the exception thrown when we attempt to read it?
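
The single-pass idea in this comment could look roughly like the sketch below. It assumes a simplified block layout of a 4-byte length followed by that many content bytes, which is not necessarily the layout the PR uses, and every name in it (`BlockReadSketch`, `Block`, `readBlock`) is hypothetical.

```java
import java.io.DataInputStream;
import java.io.EOFException;
import java.io.IOException;

// Hypothetical sketch; this is not the Hoodie log reader.
final class BlockReadSketch {

  /** Outcome of one read attempt: the bytes recovered and whether the block was unusable. */
  static final class Block {
    final byte[] content;
    final boolean corrupt;

    Block(byte[] content, boolean corrupt) {
      this.content = content;
      this.corrupt = corrupt;
    }
  }

  /**
   * Reads one block laid out as [int length][length bytes] in a single pass.
   * A block that ends early surfaces as EOFException from readFully, so no
   * separate "probe" seek is needed to detect it.
   */
  static Block readBlock(DataInputStream in) throws IOException {
    int declaredSize = in.readInt();
    if (declaredSize < 0) {
      // A negative length can never be valid, so flag it without allocating.
      return new Block(new byte[0], true);
    }
    byte[] content = new byte[declaredSize];
    try {
      in.readFully(content);
      return new Block(content, false);
    } catch (EOFException e) {
      // The stream ended before the declared number of bytes arrived.
      return new Block(new byte[0], true);
    }
  }
}
```

Note the limits of this approach: the exception only reveals truncation. A length field that is itself garbage but still lies within the file would read "successfully", and a very large garbage length would trigger a large allocation, so some explicit check may still be needed for those cases.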

@@ -0,0 +1,164 @@
/*
* Copyright (c) 2016 Uber Technologies, Inc. (hoodie-dev-group@uber.com)
Member

This should be 2017.

return HoodieLogBlockType.AVRO_DATA_BLOCK;
}

public static HoodieLogBlock fromBytes(byte[] content, Schema readerSchema) throws IOException {
Member

Maybe move this to AvroUtils?

-      String baseCommitTime, int version) {
-    return String.format("%s_%s%s.%d", fileId, baseCommitTime, logFileExtension, version);
+      String baseCommitTime, int version) {
+    return "." + String.format("%s_%s%s.%d", fileId, baseCommitTime, logFileExtension, version);
Member

Let's pull "." into a constant.
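
One way the suggested change might look is sketched below. The format string and the leading "." come from the diff; the class name, the constant name `LOG_FILE_PREFIX`, and the order of the leading parameters (which the diff excerpt cuts off) are assumptions made for illustration.

```java
// Hypothetical sketch of the suggested refactor, not the merged code.
final class LogFileNameSketch {

  // Assumed constant name; it replaces the inline "." literal from the diff.
  static final String LOG_FILE_PREFIX = ".";

  // Parameter order before baseCommitTime is assumed; the diff excerpt omits it.
  static String makeLogFileName(String fileId, String logFileExtension,
      String baseCommitTime, int version) {
    return LOG_FILE_PREFIX
        + String.format("%s_%s%s.%d", fileId, baseCommitTime, logFileExtension, version);
  }
}
```

With these assumptions, `makeLogFileName("f1", ".log", "20170517", 1)` yields `.f1_20170517.log.1`. Naming the prefix means any code that parses log file names can reference the same constant instead of repeating the literal.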

@vinothchandar vinothchandar merged commit 240c912 into apache:master May 23, 2017
@prazanna prazanna deleted the hoodielogformat branch May 24, 2017 06:31
vinishjail97 pushed a commit to vinishjail97/hudi that referenced this pull request Dec 15, 2023
…ith hudi incr source (apache#7132) (apache#162)

Co-authored-by: Sivabalan Narayanan <n.siva.b@gmail.com>