[HUDI-1594] Some fixes and enhancements to test suite framework #2400
Codecov Report
@@ Coverage Diff @@
## master #2400 +/- ##
=============================================
+ Coverage 50.69% 69.44% +18.75%
+ Complexity 3059 357 -2702
=============================================
Files 419 53 -366
Lines 18810 1931 -16879
Branches 1924 230 -1694
=============================================
- Hits 9535 1341 -8194
+ Misses 8498 458 -8040
+ Partials 777 132 -645
@nsivabalan Please work with @satishkotha before landing this, since he is also looking to add similar tests around clustering.
b405f25 to ab40bd6
no changes except adding this if condition
these are part of reverting e33a8f7
these are part of reverting e33a8f7
all changes in this file are part of reverting e33a8f7
all changes in this file are part of reverting e33a8f7
these are part of reverting e33a8f7
@n3nash @satishkotha : After clustering, when we call readFromSource in deltaStreamer, execution was going into lines 314 to 316. Hence I have added this condition to return an empty checkpoint. Can you confirm this looks OK?
@nsivabalan This should be fine since the commit timeline to read the CHECKPOINT from does not include the clustering instants. But I think this is already fixed with this PR -> #2400
> This should be fine since the commit timeline to read the CHECKPOINT from does not include the clustering instants.
@n3nash @nsivabalan This is not correct. The commit timeline does have the clustering instants (replacecommit) based on the changes in #2048. So this breaks the logic of getting the last checkpoint, even if we have the walk-back logic in #4034, which is skipped.
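To make the checkpoint discussion above concrete, here is a minimal sketch of "walk-back" checkpoint lookup: scan the completed instants from newest to oldest and return the first one that carries a checkpoint, so that a trailing clustering (replacecommit) instant without checkpoint metadata does not reset the reader. This is not the code from this PR, #2048, or #4034. The `Instant` class, the `lastCheckpoint` helper, and the metadata key used here are hypothetical stand-ins for illustration only.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.List;
import java.util.Map;
import java.util.Optional;

final class CheckpointWalkBack {

  // Hypothetical stand-in for a completed timeline instant; not Hudi's HoodieInstant.
  static final class Instant {
    final String timestamp;
    final String action;
    final Map<String, String> extraMetadata;

    Instant(String timestamp, String action, Map<String, String> extraMetadata) {
      this.timestamp = timestamp;
      this.action = action;
      this.extraMetadata = extraMetadata;
    }
  }

  // Assumed metadata key, used here for illustration only.
  static final String CHECKPOINT_KEY = "deltastreamer.checkpoint.key";

  // Walk back from the newest instant and return the first non-empty checkpoint found.
  static Optional<String> lastCheckpoint(List<Instant> timelineOldestFirst) {
    for (int i = timelineOldestFirst.size() - 1; i >= 0; i--) {
      String checkpoint = timelineOldestFirst.get(i).extraMetadata.get(CHECKPOINT_KEY);
      if (checkpoint != null && !checkpoint.isEmpty()) {
        return Optional.of(checkpoint);
      }
    }
    // No instant carries a checkpoint: the caller starts from scratch.
    return Optional.empty();
  }

  public static void main(String[] args) {
    List<Instant> timeline = Arrays.asList(
        new Instant("001", "commit", Collections.singletonMap(CHECKPOINT_KEY, "offset-100")),
        // Clustering instant: present in the timeline but without a checkpoint.
        new Instant("002", "replacecommit", Collections.<String, String>emptyMap()));
    System.out.println(lastCheckpoint(timeline).orElse("<none>")); // prints "offset-100"
  }
}
```

In this sketch, returning an empty checkpoint as soon as the latest instant is a replacecommit would discard "offset-100", which is the failure mode the comment above describes.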
@n3nash : Patch is ready for review.
@nsivabalan The high-level DAG looks good; I will review the code changes more deeply later.
Record size of 100 bytes?
Anyway, without the complex schema we are not generating records of size 7000 bytes, so I thought I would keep it at a sane value.
@nsivabalan Is this supposed to mean "run this twice"?
Nope. It means: execute this node only during iteration count 2.
...lient/hudi-client-common/src/main/java/org/apache/hudi/client/AbstractHoodieWriteClient.java
hudi-integ-test/src/main/java/org/apache/hudi/integ/testsuite/HoodieTestSuiteWriter.java
Can we just rename this to REPEAT_COUNT and use it that way (if that's the intention)?
@n3nash : This refers to the iteration count/number at which this node needs to be executed; basically an "execute at iteration count" config. I couldn't come up with a better name. If you have good suggestions, let me know.
Maybe "itr_count_to_execute", to be explicit.
Can we just use "REPEAT_COUNT"?
As synced up offline, this refers to the current iteration count. I have renamed this arg to "curItrCount".
Is this because the checkpoint is not passed? There is some code that passes the checkpoint even if the write client is used.
I have created a follow-up task for Satish to look into async clustering. I went through the clustering test examples and added support for inline clustering. I haven't had much time to get async clustering working.
Thanks, Satish has a PR ready as well.
I'd like to use the config to drive the itrCount and not have this signature change, please.
Yeah, this refers to the current iteration count. I don't think we can get away with this.
...eg-test/src/main/java/org/apache/hudi/integ/testsuite/dag/nodes/ValidateAsyncOperations.java
s/maxCommitsRetailed/maxCommitsRetained.
Also, I'm not sure what 11 means.
I have fixed this. I instantiate writeClientConfig in HoodieTestSuiteWriter for both code paths (deltastreamer, writeClient) and use it here.
...eg-test/src/main/java/org/apache/hudi/integ/testsuite/dag/nodes/ValidateAsyncOperations.java
Let's chat about all of this logic 1-1. This code is very difficult to read. Maybe there is a better way to achieve what you are trying to do.
Sure.
n3nash left a comment
@nsivabalan Left some comments. Also, since you are reverting e33a8f7, does this mean we cannot generate correct timestamps again?
@nsivabalan Is this ready?
…date async operations node.
ab40bd6 to 729c34c
@nsivabalan Can we create a JIRA for this, prioritize it higher, and land it? We want to get the integ-suite into good shape ASAP.
@n3nash : Addressed all your comments.
@vinothchandar : Yes, I synced up with Nishith yesterday on some of the pending comments. We should land it soon.
n3nash left a comment
LGTM. Can you open a ticket to refactor the itr_count so we don't lose context?
@n3nash @nsivabalan I would like some of this to run on every commit: at least one test each for COW and MOR. Would one of you be able to add that to the CI?
@n3nash : https://issues.apache.org/jira/browse/HUDI-1616
…date async operations node. (apache#2400)
What is the purpose of the pull request
Enhancements and fixes to test suite framework.
Brief change log
validate_hive: optional config to include hive validation as part of ValidateDatasetNode
execute_itr_count: Some nodes need to be executed only in one-off iterations of a long-running job. For example: do clustering at the 25th iteration out of 50 iterations.
validate_archival: optional config to add archival validation in ValidateAsyncOperationsNode
validate_clean: optional config to add clean-up validation in ValidateAsyncOperationsNode
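
The execute_itr_count behaviour in the change log can be sketched as follows: a node with no iteration configured runs in every iteration, while a node configured with an iteration number runs only when the current iteration count matches it. This is a simplified illustration, not the test suite's DagNode code. The `Node` class, the `-1` sentinel for "run every iteration", and the 1-based iteration counter are assumptions made for the example.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

final class IterationGateSketch {

  // Hypothetical DAG node; the real test suite node classes look different.
  static final class Node {
    final String name;
    final int executeItrCount; // assumed sentinel: -1 means "run in every iteration"

    Node(String name, int executeItrCount) {
      this.name = name;
      this.executeItrCount = executeItrCount;
    }

    // Run when no iteration is configured, or when the configured one matches.
    boolean shouldRun(int curItrCount) {
      return executeItrCount == -1 || executeItrCount == curItrCount;
    }
  }

  // Runs the DAG for the given number of iterations and records what executed.
  static List<String> run(List<Node> nodes, int totalIterations) {
    List<String> executed = new ArrayList<>();
    for (int curItrCount = 1; curItrCount <= totalIterations; curItrCount++) {
      for (Node node : nodes) {
        if (node.shouldRun(curItrCount)) {
          executed.add(curItrCount + ":" + node.name);
        }
      }
    }
    return executed;
  }

  public static void main(String[] args) {
    List<Node> dag = Arrays.asList(
        new Node("insert", -1),    // every iteration
        new Node("cluster", 25));  // only at the 25th iteration
    List<String> executed = run(dag, 50);
    System.out.println(executed.size());                // prints 51
    System.out.println(executed.contains("25:cluster")); // prints true
  }
}
```

Passing the current iteration count into the check, rather than reading it from static config, mirrors the "curItrCount" argument discussed in the review thread.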
Verify this pull request
Manually ran test suite jobs to validate the changes.
Committer checklist
Has a corresponding JIRA in PR title & commit
Commit message is descriptive of the change
CI is green
Necessary doc changes done or have another open PR
For large changes, please consider breaking it into sub-tasks under an umbrella JIRA.