Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

KAFKA-17705: Add Transactions V2 system tests and mark as production ready #18132

Open
wants to merge 6 commits into
base: trunk
Choose a base branch
from

Conversation

jolshan
Copy link
Member

@jolshan jolshan commented Dec 10, 2024

Added transaction version 2 to some of the system tests. Also marking TV2 as production ready.
Will share the results of the tests when I get them.

@github-actions github-actions bot added triage PRs from the community core Kafka Broker producer clients small Small PRs labels Dec 10, 2024
@mumrah
Copy link
Member

mumrah commented Dec 11, 2024

@jolshan #17881 adds a "triage" label to PRs from non-committers. Turns out this also affect committers if their membership visibility in the ASF GitHub org is not public. I added instructions for setting your membership visibility to public https://github.com/apache/kafka/blob/trunk/.github/workflows/README.md#pr-triage

@mumrah mumrah removed the triage PRs from the community label Dec 11, 2024
@github-actions github-actions bot added the kraft label Dec 11, 2024
@@ -63,7 +63,7 @@ public static Map<String, VersionRange> defaultFeatureMap(boolean enableUnstable
MetadataVersion.latestTesting().featureLevel() :
MetadataVersion.latestProduction().featureLevel()));
for (Feature feature : Feature.PRODUCTION_FEATURES) {
short maxVersion = enableUnstable ? feature.latestTesting() : feature.latestProduction();
short maxVersion = enableUnstable ? feature.latestTesting() : feature.defaultLevel(MetadataVersion.LATEST_PRODUCTION);
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@junrao @dongnuo123 I noticed we didn't change the defaults here on the previous PR. I have done so here. A test was failing since the production version for transaction version is now not the same as the default based on the latest production MV.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@jolshan : Not sure that I understand this change. The result of defaultFeatureMap is used for Controller/Broker registration. So, it seems that we should pass in the max supported version of each feature, instead of the default version, right? In fact, defaultFeatureMap should be renamed to sth like supportedFeatureMap.

A test was failing since the production version for transaction version is now not the same as the default based on the latest production MV.

Hmm, I thought that with #17886, it's ok for the latest production version for TV to be different from the default. It just needs to be larger.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes. My understanding from #17886 was that we want a separate production vs default value.

I thought these methods were also meant to create the default features, not the max supported ones. It's my bad if I misunderstood that. I will take another look and if that is the case, fix the test.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Updated with the changes to the name and the test

@github-actions github-actions bot removed the small Small PRs label Dec 11, 2024
Copy link
Contributor

@junrao junrao left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@jolshan : Thanks for the updated PR. Just a minor comment.

@@ -446,6 +446,7 @@ public synchronized void maybeUpdateTransactionV2Enabled() {
latestFinalizedFeaturesEpoch = info.finalizedFeaturesEpoch;
Short transactionVersion = info.finalizedFeatures.get("transaction.version");
isTransactionV2Enabled = transactionVersion != null && transactionVersion >= 2;
log.debug("Updating isTV2 enabled to {} at with FinalizedFeaturesEpoch {}", isTransactionV2Enabled, latestFinalizedFeaturesEpoch);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

at with FinalizedFeaturesEpoch => with FinalizedFeaturesEpoch

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

oops. good catch :)

@jolshan
Copy link
Member Author

jolshan commented Dec 12, 2024

Copy link
Contributor

@junrao junrao left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@jolshan : Thanks for the updated PR. Is the unit test failure related? Also, it seems that a bunch of system tests timed out.

@jolshan
Copy link
Member Author

jolshan commented Dec 12, 2024

Thanks Jun. Taking a look.

@jolshan
Copy link
Member Author

jolshan commented Dec 12, 2024

Here's the new run of just the changed tests: https://confluent-open-source-kafka-branch-builder-system-test-results.s3-us-west-2.amazonaws.com/trunk/2024-12-12--001.65845de4-1d44-4e7d-8d23-ac15c9440cb2--1733986841--jolshan--kafka-17705--92dfdf6028/report.html

Still is a bit flaky even with the timeout increased. Will look at that. I also need to see if the consumer failures are unique to this PR or something that was in trunk at the time I branched.

@jolshan
Copy link
Member Author

jolshan commented Dec 12, 2024

The unit test failure does not seem related.

@jolshan
Copy link
Member Author

jolshan commented Dec 12, 2024

This explains the main divergence from trunk failures #18036. I do see some issues with fencing in the changes to the tests, so I will continue to investigate

@jolshan
Copy link
Member Author

jolshan commented Dec 13, 2024

System tests uncovered a bug! Will fix that and come back here :) https://issues.apache.org/jira/browse/KAFKA-18227

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants