Skip to content

Conversation

@Zouxxyy
Copy link
Contributor

@Zouxxyy Zouxxyy commented Jul 21, 2023

…eClient

Change Logs

  • Follow up #8080, WriteClient and TableServiceClient 's txnManager should be consist
  • Fix flaky test testMultiWriterWithAsyncTableServicesWithConflict in TestHoodieClientMultiWriter caused by #8832
  • Remove unused constructors in HoodieJavaWriteClient

Impact

Make TableServiceClient's txnManager consistent with WriteClient

Risk level (write none, low medium or high below)

low

Documentation Update

None

Contributor's checklist

  • Read through contributor's guide
  • Change Logs and Impact were stated clearly
  • Adequate tests were added if applicable
  • CI passed

@Zouxxyy
Copy link
Contributor Author

Zouxxyy commented Jul 22, 2023

In addition, before #6732, TableServiceClient and WriteClient share member variables. After this PR, there are still metrics and heartbeatClient maintained by them separately. Do we need to pass them to TableServiceClient?

@Zouxxyy Zouxxyy closed this Jul 24, 2023
@danny0405
Copy link
Contributor

6503.patch.zip
Thanks for the contribution, I have reviewd and add some modifications ~

@Zouxxyy Zouxxyy reopened this Jul 24, 2023
@Zouxxyy
Copy link
Contributor Author

Zouxxyy commented Jul 24, 2023

@danny0405 Thank you for the patch, it looks clearer. I closed this PR before, because (1) I think about it again, and think that it’s okay to not pass the txtmanger (2) It caused a lot of unstable tests, but I didn’t locate the reason (Although I don't think this PR will at least cause errors).

I'll reopen it and have a look at the test again.

@danny0405
Copy link
Contributor

I think about it again, and think that it’s okay to not pass the txtmanger

Can you elaborate a little more why the table service client can hold a separate lock ?

@Zouxxyy
Copy link
Contributor Author

Zouxxyy commented Jul 26, 2023

Can you elaborate a little more why the table service client can hold a separate lock ?

Because InProcessLockProvider is valid as long as it is in the same JVM process (see static final Map<String, ReentrantReadWriteLock> LOCK_INSTANCE_PER_BASEPATH = new ConcurrentHashMap<>();), other locks can not even be in the same JVM. Maybe i'm missing something.

Of course, it is best to use the same lock manager, just like it used to be (before #6732). And CI seems to be stable now by the way.

@danny0405
Copy link
Contributor

Okay, a static lock mapping makes sense to me.

@hudi-bot
Copy link
Collaborator

CI report:

Bot commands @hudi-bot supports the following commands:
  • @hudi-bot run azure re-run the last Azure build

@nsivabalan nsivabalan added release-0.14.0 priority:blocker Production down; release blocker labels Aug 4, 2023
@yihua
Copy link
Contributor

yihua commented Aug 4, 2023

Can you elaborate a little more why the table service client can hold a separate lock ?

Because InProcessLockProvider is valid as long as it is in the same JVM process (see static final Map<String, ReentrantReadWriteLock> LOCK_INSTANCE_PER_BASEPATH = new ConcurrentHashMap<>();), other locks can not even be in the same JVM. Maybe i'm missing something.

Of course, it is best to use the same lock manager, just like it used to be (before #6732). And CI seems to be stable now by the way.

LOCK_INSTANCE_PER_BASEPATH is already static final, and multiple InProcessLockProvider instances using the same table base path in the same process should get the same underlying ReentrantReadWriteLock instance. Check this test which passes now: TestInProcessLockProvider.testLockIdentity().

There was a bug before in InProcessLockProvider which is fixed in #8658. Not sure if you hit that.

Option<EmbeddedTimelineService> timelineService) {
super(context, clientConfig, timelineService);
Option<EmbeddedTimelineService> timelineService,
Option<TransactionManager> txnManager) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not sure if it's a good idea to expose this in the constructor, as the client can escape the transaction manager and cause correctness issues in concurrency control.

@danny0405 danny0405 added release-1.0.0 and removed priority:blocker Production down; release blocker release-0.14.0 labels Aug 6, 2023
@danny0405
Copy link
Contributor

Let's move this out of 0.14.0 because it's only a code refactoring.

@vinothchandar vinothchandar self-assigned this Jan 17, 2024
final SparkRDDWriteClient client3 = getHoodieWriteClient(cfg);

// Create upserts, schedule cleaning, schedule compaction in parallel
String fourthInstantTime = HoodieActiveTimeline.createNewInstantTime();
Copy link
Member

@vinothchandar vinothchandar Jan 17, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@Zouxxyy we are trying to reduce flaky tests before we pick up pace on the reviews. Do you think these changes make the test less flaky? if so, love to land this in a minor PR first if possible.

https://dev.azure.com/apache-hudi-ci-org/apache-hudi-ci/_build/results?buildId=21997&view=ms.vss-test-web.build-test-results-tab&runId=8591&resultId=100038 is the failure.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ok, add it, hope it helpful #10526

@Zouxxyy
Copy link
Contributor Author

Zouxxyy commented Jan 18, 2024

Close it because it no need, see #9255 (comment)

@Zouxxyy Zouxxyy closed this Jan 18, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

6 participants