Skip to content

[HUDI-3104] Kafka-connect support hadoop config environments and properties#4451

Merged
yihua merged 1 commit intoapache:masterfrom
cdmikechen:HUDI-3104
Jan 9, 2022
Merged

[HUDI-3104] Kafka-connect support hadoop config environments and properties#4451
yihua merged 1 commit intoapache:masterfrom
cdmikechen:HUDI-3104

Conversation

@cdmikechen
Copy link
Copy Markdown
Contributor

@cdmikechen cdmikechen commented Dec 27, 2021

What is the purpose of the pull request

Let kafka-connect support hadoop config environments/properties

Brief change log

  • Support hadoop config environments HADOOP_CONF_DIR and HADOOP_HOME
  • Support hadoop config properties hadoop.conf.dir and hadoop.home

Verify this pull request

This change added tests and can be verified as follows:

  • Added TestHdfsConfiguration to verify the change.

Committer checklist

  • Has a corresponding JIRA in PR title & commit

  • Commit message is descriptive of the change

  • CI is green

  • Necessary doc changes done or have another open PR

  • For large changes, please consider breaking it into sub-tasks under an umbrella JIRA.

@cdmikechen
Copy link
Copy Markdown
Contributor Author

cdmikechen commented Jan 2, 2022

@yihua Thanks for reviewing this

@nsivabalan nsivabalan added the priority:critical Production degraded; pipelines stalled label Jan 4, 2022
@nsivabalan nsivabalan removed their assignment Jan 6, 2022
Copy link
Copy Markdown
Member

@codope codope left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@cdmikechen Few minior comments. I would prefer to avoid a new dependency if possible.
@yihua Can you also take a look?

Comment thread hudi-kafka-connect/pom.xml Outdated
<scope>test</scope>
</dependency>

<dependency>
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I see this dependency is required to set env vars at test time. Is there a way to avoid this? Maybe add a wrapper class for system.getEnv calls and mock that class in test?

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

+1

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@codope @yihua
I reconsidered and modified the codes. Use System.env to get environment variables should not be necessary in these test cases, so I deleted dependencies and related codes.

Comment thread hudi-kafka-connect/pom.xml Outdated
<scope>test</scope>
</dependency>

<dependency>
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

+1

// Reference https://docs.confluent.io/kafka-connect-hdfs/current/configuration_options.html#hdfs
public static final ConfigProperty<String> HADOOP_CONF_DIR = ConfigProperty
.key("hadoop.conf.dir")
.defaultValue("")
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Use .noDefault()?

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@yihua
Have replace this~


public static final ConfigProperty<String> HADOOP_HOME = ConfigProperty
.key("hadoop.home")
.defaultValue("")
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same here

import java.nio.file.Path;
import java.util.List;

import static com.github.stefanbirkner.systemlambda.SystemLambda.withEnvironmentVariable;
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could you use mocks to get rid of this import?

@hudi-bot
Copy link
Copy Markdown
Collaborator

hudi-bot commented Jan 7, 2022

CI report:

Bot commands @hudi-bot supports the following commands:
  • @hudi-bot run azure re-run the last Azure build

@nsivabalan
Copy link
Copy Markdown
Contributor

@cdmikechen @yihua @codope : We are targeting this patch for 0.10.1. We have code freeze planned this monday. Would be nice to get this in by then. Wanted to send out a reminder.

Copy link
Copy Markdown
Member

@codope codope left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@cdmikechen Thanks for getting rid of the dependency.
Left a couple of minor comments.

Copy link
Copy Markdown
Contributor

@yihua yihua left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@yihua yihua merged commit 0d8ca8d into apache:master Jan 9, 2022
@vinishjail97 vinishjail97 mentioned this pull request Jan 24, 2022
5 tasks
vingov pushed a commit to vingov/hudi that referenced this pull request Jan 26, 2022
liusenhua pushed a commit to liusenhua/hudi that referenced this pull request Mar 1, 2022
vingov pushed a commit to vingov/hudi that referenced this pull request Apr 3, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

priority:critical Production degraded; pipelines stalled

Projects

None yet

Development

Successfully merging this pull request may close these issues.

6 participants