-
Notifications
You must be signed in to change notification settings - Fork 14k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
KAFKA-17705: Add Transactions V2 system tests and mark as production ready #18132
base: trunk
Are you sure you want to change the base?
Conversation
@jolshan #17881 adds a "triage" label to PRs from non-committers. Turns out this also affect committers if their membership visibility in the ASF GitHub org is not public. I added instructions for setting your membership visibility to public https://github.com/apache/kafka/blob/trunk/.github/workflows/README.md#pr-triage |
@@ -63,7 +63,7 @@ public static Map<String, VersionRange> defaultFeatureMap(boolean enableUnstable | |||
MetadataVersion.latestTesting().featureLevel() : | |||
MetadataVersion.latestProduction().featureLevel())); | |||
for (Feature feature : Feature.PRODUCTION_FEATURES) { | |||
short maxVersion = enableUnstable ? feature.latestTesting() : feature.latestProduction(); | |||
short maxVersion = enableUnstable ? feature.latestTesting() : feature.defaultLevel(MetadataVersion.LATEST_PRODUCTION); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@junrao @dongnuo123 I noticed we didn't change the defaults here on the previous PR. I have done so here. A test was failing since the production version for transaction version is now not the same as the default based on the latest production MV.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@jolshan : Not sure that I understand this change. The result of defaultFeatureMap
is used for Controller/Broker registration. So, it seems that we should pass in the max supported version of each feature, instead of the default version, right? In fact, defaultFeatureMap
should be renamed to sth like supportedFeatureMap
.
A test was failing since the production version for transaction version is now not the same as the default based on the latest production MV.
Hmm, I thought that with #17886, it's ok for the latest production version for TV to be different from the default. It just needs to be larger.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes. My understanding from #17886 was that we want a separate production vs default value.
I thought these methods were also meant to create the default features, not the max supported ones. It's my bad if I misunderstood that. I will take another look and if that is the case, fix the test.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Updated with the changes to the name and the test
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@jolshan : Thanks for the updated PR. Just a minor comment.
@@ -446,6 +446,7 @@ public synchronized void maybeUpdateTransactionV2Enabled() { | |||
latestFinalizedFeaturesEpoch = info.finalizedFeaturesEpoch; | |||
Short transactionVersion = info.finalizedFeatures.get("transaction.version"); | |||
isTransactionV2Enabled = transactionVersion != null && transactionVersion >= 2; | |||
log.debug("Updating isTV2 enabled to {} at with FinalizedFeaturesEpoch {}", isTransactionV2Enabled, latestFinalizedFeaturesEpoch); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
at with FinalizedFeaturesEpoch => with FinalizedFeaturesEpoch
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
oops. good catch :)
I am seeing some test failures in the system tests, so I will investigate. some issues with the old tests: https://confluent-open-source-kafka-branch-builder-system-test-results.s3-us-west-2.amazonaws.com/trunk/2024-12-11--001.600822cf-da88-4744-b834-795c565d98d5--1733957556--jolshan--kafka-17705--e69893c32c/report.html (some could be already failing on trunk) |
I also pushed a change and started a new test specific run. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@jolshan : Thanks for the updated PR. Is the unit test failure related? Also, it seems that a bunch of system tests timed out.
Thanks Jun. Taking a look. |
Here's the new run of just the changed tests: https://confluent-open-source-kafka-branch-builder-system-test-results.s3-us-west-2.amazonaws.com/trunk/2024-12-12--001.65845de4-1d44-4e7d-8d23-ac15c9440cb2--1733986841--jolshan--kafka-17705--92dfdf6028/report.html Still is a bit flaky even with the timeout increased. Will look at that. I also need to see if the consumer failures are unique to this PR or something that was in trunk at the time I branched. |
The unit test failure does not seem related. |
This explains the main divergence from trunk failures #18036. I do see some issues with fencing in the changes to the tests, so I will continue to investigate |
System tests uncovered a bug! Will fix that and come back here :) https://issues.apache.org/jira/browse/KAFKA-18227 |
Added transaction version 2 to some of the system tests. Also marking TV2 as production ready.
Will share the results of the tests when I get them.