Skip to content

Conversation

@codope
Copy link
Member

@codope codope commented Nov 17, 2021

What is the purpose of the pull request

Upgrade HBase version from 1.x to 2.x.

Brief change log

  • Upgrade HBase version from 1.2.3 to 2.4.7 and Hadoop from 2.7.3 to 2.10.1.
  • KeyValue.KVComparator is deprecated in 2.x and completely removed from 3.x. We need to extend CellComparatorImpl which implements CellComparator interface.
  • HBase 2.x reader and writer factory methods changed.

Verify this pull request

(Please pick either of the following options)

This pull request is a trivial rework / code cleanup without any test coverage.

(or)

This pull request is already covered by existing tests, such as (please describe tests).

(or)

This change added tests and can be verified as follows:

(example:)

  • Added integration tests for end-to-end.
  • Added HoodieClientWriteTest to verify the change.
  • Manually verified the change by running a job locally.

Committer checklist

  • Has a corresponding JIRA in PR title & commit

  • Commit message is descriptive of the change

  • CI is green

  • Necessary doc changes done or have another open PR

  • For large changes, please consider breaking it into sub-tasks under an umbrella JIRA.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A little worried,This method does not exist even in hbase2.2.3

@codope codope force-pushed the hudi-2783-hbase-upgrade branch from 6e7f204 to 9dc39a8 Compare November 18, 2021 08:00
@codope codope added the priority:blocker Production down; release blocker label Nov 24, 2021
@codope codope force-pushed the hudi-2783-hbase-upgrade branch 2 times, most recently from ca86e9d to 7c4b6d9 Compare November 24, 2021 16:49
@codope codope force-pushed the hudi-2783-hbase-upgrade branch from 7c4b6d9 to 548c193 Compare November 24, 2021 19:49
@codope
Copy link
Member Author

codope commented Nov 24, 2021

@danny0405 Could you please help in testing this patch for Flink?

@vinothchandar vinothchandar changed the title [WIP][HUDI-2783] Upgrade HBase [WIP][HUDI-2783] Upgrade HBase to 2.x Nov 24, 2021
@vinothchandar vinothchandar removed the priority:blocker Production down; release blocker label Nov 26, 2021
ReaderContext context = new ReaderContextBuilder()
.withFilePath(path)
.withInputStreamWrapper(stream)
.withFileSize(getFs("hoodie", conf).getFileStatus(path).getLen())
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this is a fake path so getFileStatus() will fail. file size could be content.length?

.build();
HFileInfo fileInfo = new HFileInfo(context, conf);
this.reader = HFile.createReader(context, fileInfo, new CacheConfig(conf), conf);
fileInfo.initMetaAndIndex(reader);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What is the use of this explicit call? Is this required or some optimization?

<groupId>org.apache.hbase</groupId>
<artifactId>hbase-hadoop-compat</artifactId>
<version>${hbase.version}</version>
</dependency>
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Too many new jar dependencies and i'm worried about the conflicts, we must exclude the jars that are conflict-prone like google guava explicitly if there are indirect dependency.

@codope codope force-pushed the hudi-2783-hbase-upgrade branch 2 times, most recently from a23ba86 to 790d2ba Compare December 9, 2021 12:21
@alexeykudinkin
Copy link
Contributor

@hudi-bot run azure

public List<Pair<String, R>> readRecords(List<String> keys) throws IOException {
reader.loadFileInfo();
Schema schema = new Schema.Parser().parse(new String(reader.loadFileInfo().get(KEY_SCHEMA.getBytes())));
reader.getHFileInfo();
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why are we doing this here?

@alexeykudinkin
Copy link
Contributor

I've been trying to address to fix IT tests after HBase upgrade and kept hitting HBase classes conflicts b/w our HBase deps and Hadoop 2.x deps (there are non-BWC changes).

Tried shading our HBase classes but that didn't go too well, and as such decided to go ahead and try upgrading Hadoop to 3.3.x branch.

It's WIP and you can track progress in the following PR #4286

@codope codope force-pushed the hudi-2783-hbase-upgrade branch from 790d2ba to b8a7bfa Compare January 3, 2022 16:47
@yihua
Copy link
Contributor

yihua commented Feb 17, 2022

I'm going to take over this PR and get it ready for review.

@yihua yihua changed the title [WIP][HUDI-2783] Upgrade HBase to 2.x [HUDI-1180] Upgrade HBase to 2.x Feb 17, 2022
@yihua yihua changed the title [HUDI-1180] Upgrade HBase to 2.x [WIP][HUDI-1180] Upgrade HBase to 2.x Feb 17, 2022
@yihua yihua changed the title [WIP][HUDI-1180] Upgrade HBase to 2.x [WIP][HUDI-1180] Upgrade HBase to 2.4.9 Feb 17, 2022
codope and others added 4 commits February 17, 2022 13:10
Fix some unit tests

Resolve dependency issue

Upgrade Hadoop to 2.10.1 and fix HFile inline reader test

Separate hbase shaded version for presto bundle

Resolve hbase dep conflicts in flink, utilities and hadoop-mr bundles
Diasble access time validation
@yihua yihua force-pushed the hudi-2783-hbase-upgrade branch from b8a7bfa to c904e8f Compare February 17, 2022 23:27
@yihua yihua self-assigned this Feb 17, 2022
@hudi-bot
Copy link
Collaborator

CI report:

Bot commands @hudi-bot supports the following commands:
  • @hudi-bot run azure re-run the last Azure build

@yihua
Copy link
Contributor

yihua commented Mar 11, 2022

I'm closing this in favor of #5004 which has more changes and deviates from this one which has conflicts with master.

@yihua yihua closed this Mar 11, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

9 participants