[HUDI-4071] Change defaults for some of the configs #5643
@codope: What's the follow-up on this?
Need to rebase and fix tests. Will get to that this week.
8c5174c to cd339e3
  .key("hoodie.embed.timeline.server.async")
- .defaultValue("false")
+ .defaultValue("true")
  .withDocumentation("Controls whether or not, the requests to the timeline server are processed in asynchronous fashion, "
What exactly is the effect of this param? I'm worried it could cause data loss like #6179.
I see. The intent is to improve throughput by sending requests to the timeline server asynchronously. Internally, we ran a long-running test (30+ commits) with this config and did not see any data loss, though our validations were based on counts. If you have any concerns, let me know; we can keep it false by default for now. Data loss issues are more critical.
30+ commits is too few to reproduce the problem. In #6179, we ran 2000+ commits before it showed up. I would suggest running the same test before switching the flag.
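For anyone wanting to repeat the count-based validation over a longer run, a minimal sketch of the check is below. It assumes an insert-only workload that adds a fixed number of records per commit; `CountValidation`, `firstMismatch`, and the `observedCountAfter` callback are hypothetical names for illustration, and the callback stands in for whatever query returns the table's row count after a given commit.

```java
import java.util.function.IntToLongFunction;

// Hypothetical harness, not part of Hudi: compares the expected cumulative
// record count with the observed count after every commit.
final class CountValidation {

    // Returns the 1-based index of the first commit whose observed count
    // differs from the expected count, or -1 if every commit matches.
    static int firstMismatch(int commits, long recordsPerCommit,
                             IntToLongFunction observedCountAfter) {
        long expected = 0;
        for (int commit = 1; commit <= commits; commit++) {
            expected += recordsPerCommit; // insert-only: counts only grow
            if (observedCountAfter.applyAsLong(commit) != expected) {
                return commit;
            }
        }
        return -1;
    }

    public static void main(String[] args) {
        // Simulated table that silently drops one record at commit 1500.
        IntToLongFunction lossy = c -> c >= 1500 ? c * 100L - 1 : c * 100L;
        System.out.println(firstMismatch(2000, 100L, lossy));
    }
}
```

A count check alone only catches a net change in row count, so a run of this length is a lower bar than a record-level comparison.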
- BULK_INSERT_SORT_MODE from GLOBAL_SORT to NONE
- RECONCILE_SCHEMA from false to true
- Match ROLLBACK_USING_MARKERS_ENABLE in spark sql as spark datasource
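Writers that depend on the previous defaults could pin them explicitly instead of relying on the defaults. The sketch below collects the old values in a map. `LegacyHudiDefaults` and `legacyWriteOptions` are hypothetical names, and the `hoodie.bulkinsert.sort.mode` key is an assumption that does not appear in this PR's diff, so check it against the configuration reference for your Hudi version.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper, not part of Hudi: gathers the pre-change values
// so a job can set them explicitly.
final class LegacyHudiDefaults {

    static Map<String, String> legacyWriteOptions() {
        Map<String, String> opts = new LinkedHashMap<>();
        // Assumed key for BULK_INSERT_SORT_MODE; verify for your version.
        opts.put("hoodie.bulkinsert.sort.mode", "GLOBAL_SORT");
        // Key as shown in the RECONCILE_SCHEMA diff in this PR.
        opts.put("hoodie.datasource.write.reconcile.schema", "false");
        return opts;
    }

    public static void main(String[] args) {
        legacyWriteOptions().forEach((k, v) -> System.out.println(k + "=" + v));
    }
}
```

In a Spark job, a map like this would typically be passed to the writer through its `options(...)` method alongside the other Hudi write options.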
cd339e3 to 4a2ccb4
  public static final ConfigProperty<Boolean> RECONCILE_SCHEMA = ConfigProperty
  .key("hoodie.datasource.write.reconcile.schema")
- .defaultValue(false)
+ .defaultValue(true)
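To show who is affected by a default flip like this one, here is a simplified model of default resolution. `DefaultedProperty` is a hypothetical stand-in, not Hudi's `ConfigProperty`, and it assumes that an explicitly set option always takes precedence over the default.

```java
import java.util.HashMap;
import java.util.Map;

// Simplified stand-in for a config property with a default value.
// This is NOT Hudi's ConfigProperty class.
final class DefaultedProperty {
    final String key;
    final String defaultValue;

    DefaultedProperty(String key, String defaultValue) {
        this.key = key;
        this.defaultValue = defaultValue;
    }

    // Returns the user's value when the key is set, otherwise the default.
    String resolve(Map<String, String> userOptions) {
        return userOptions.getOrDefault(key, defaultValue);
    }

    public static void main(String[] args) {
        DefaultedProperty reconcile =
            new DefaultedProperty("hoodie.datasource.write.reconcile.schema", "true");

        Map<String, String> unset = new HashMap<>();
        Map<String, String> pinned = new HashMap<>();
        pinned.put("hoodie.datasource.write.reconcile.schema", "false");

        System.out.println(reconcile.resolve(unset));  // picks up the new default
        System.out.println(reconcile.resolve(pinned)); // keeps the explicit value
    }
}
```

Under that assumption, only jobs that never set the key see a behavior change; jobs that set it explicitly are unaffected.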
  .withPath(basePath)
  .withIndexConfig(HoodieIndexConfig.newBuilder.withIndexType(IndexType.BLOOM).build)
- .withRollbackUsingMarkers(false)
+ .withRollbackUsingMarkers(HoodieWriteConfig.ROLLBACK_USING_MARKERS_ENABLE.defaultValue.toBoolean)
What is the purpose of the pull request
(For example: This pull request adds quick-start document.)
Brief change log
(for example:)
Verify this pull request
(Please pick either of the following options)
This pull request is a trivial rework / code cleanup without any test coverage.
(or)
This pull request is already covered by existing tests, such as (please describe tests).
(or)
This change added tests and can be verified as follows:
(example:)
Committer checklist
Has a corresponding JIRA in PR title & commit
Commit message is descriptive of the change
CI is green
Necessary doc changes done or have another open PR
For large changes, please consider breaking it into sub-tasks under an umbrella JIRA.