-
Notifications
You must be signed in to change notification settings - Fork 91
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[DVC] Add RequestBasedMetaRepository to enable metadata retrieval directly from server #1467
base: main
Are you sure you want to change the base?
[DVC] Add RequestBasedMetaRepository to enable metadata retrieval directly from server #1467
Conversation
cfa6434
to
80f0745
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks good overall, thanks for the refactoring. Just some minor comments and also looks like test coverage is not passing.
...client/src/main/java/com/linkedin/davinci/repository/ThinClientMetaStoreBasedRepository.java
Outdated
Show resolved
Hide resolved
.../da-vinci-client/src/main/java/com/linkedin/davinci/repository/NativeMetadataRepository.java
Outdated
Show resolved
Hide resolved
...client/src/main/java/com/linkedin/davinci/repository/ThinClientMetaStoreBasedRepository.java
Outdated
Show resolved
Hide resolved
...vinci-client/src/test/java/com/linkedin/davinci/repository/NativeMetadataRepositoryTest.java
Outdated
Show resolved
Hide resolved
...vinci-client/src/test/java/com/linkedin/davinci/repository/NativeMetadataRepositoryTest.java
Show resolved
Hide resolved
00fdc3a
to
50d3ef2
Compare
50d3ef2
to
de203f4
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for the changes and unit tests, I think we are a bit lacking on the integration tests. See comments below.
.../da-vinci-client/src/main/java/com/linkedin/davinci/repository/NativeMetadataRepository.java
Show resolved
Hide resolved
...e-test-common/src/integrationTest/java/com/linkedin/venice/endToEnd/MetaSystemStoreTest.java
Outdated
Show resolved
Hide resolved
de203f4
to
277b16c
Compare
21595b2
to
521c4b3
Compare
06c75e5
to
6cc8286
Compare
File dataDir = getPushJobAvroFileDirectory(storeName); | ||
String dataDirPath = "file:" + dataDir.getAbsolutePath(); | ||
|
||
if (!controllerClients.containsKey(storeName)) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Using the ContollerClients map to keep track of created stores. I think this is appropriate since the IntegrationTestPushUtils.createStoreForJob
method returns a controllerClient, even though we only need one instance of the ControllerClient to create new value schemas.
Digging into the createStoreForJob method, it does look like the D2ControllerClientFactory
caches the controllerClient by (clusterName, d2ServiceName,d2ZkHost)
. So this should result in all of the clients in this map to point to the same client instance.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Typically we share the cluster environment across tests because a cluster is expensive and the things we are testing for generally don't break the cluster environment. However, I'd recommend you stay away from sharing stores across different tests. Stores are cheaper to create and if certain test fail it might leave the store in a bad state where other tests will also fail. This will lead to scenarios where you see 10 tests fail but really only 1 test failed and caused the store to be in a bad state. The order of test execution might also have side effects on the result of the test (making it flaky). Can we refactor this class so we are not sharing stores across tests?
@@ -70,7 +62,7 @@ public abstract class NativeMetadataRepository | |||
private final Map<String, StoreConfig> storeConfigMap = new VeniceConcurrentHashMap<>(); | |||
// Local cache for key/value schemas. SchemaData supports one key schema per store only, which may need to be changed | |||
// for key schema evolvability. | |||
private final Map<String, SchemaData> schemaMap = new VeniceConcurrentHashMap<>(); | |||
protected Map<String, SchemaData> schemaMap = new VeniceConcurrentHashMap<>(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Removing the final
here so we may initialize this map in unit test mocking.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Could you point me to the unit test you are talking about? I don't see why this is necessary to achieve unit test mocking.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
RequestBasedMetaRepositoryTest.java and ThinClientMetaStoreBasedRepositoryTest.java.
In the unit tests, I am partially mocking the client object by using Mockito's thenCallRealMethod
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The mocked clients do not seem to run any initialization on private final class variables, so these maps need to be initialized by the test.
6cc8286
to
ea4f353
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Change looks good, just one minor concern about Thread.sleep
usage in test and exposing some class field for unit test mocking.
assertFalse(resp.isError()); | ||
assertFalse(parentControllerClient.emptyPush(storeName, "test-push-job", 100).isError()); | ||
try { | ||
Thread.sleep(10000); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We are already using TestUtils.waitForNonDeterministicPushCompletion
. We should avoid using Thread.sleep
and why do we even need it here?.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The tests were not passing early on due to the stores not being ready fast enough, however removing it now and testing locally seems to work. I pushed changes to remove this sleep, waiting on CI to finish.
b66105c
to
445c790
Compare
445c790
to
c9b0f16
Compare
[DVC] Add RequestBasedMetaRepository to enable metadata retrieval directly from server
This PR introduces the use of the
/store_properties/{store_name}
endpoint to retrieve and cache store metadata for the DaVinci client. The endpoint was implemented in the Venice Server in this PR: #1374.The
RequestBasedMetaRepository
leverages this endpoint to maintain a cache ofStore
andSchemaData
objects for requested stores. The cache is refreshed by theNativeMetadataRepository
, which this new class extends.By default, this feature is disabled in the
ClientConfig
passed to theNativeMetadataRepository
. The default configuration preserves the current behavior of using theThinClientMetaStoreBasedRepository
. The relevant configuration field isClientConfig.useRequestBasedMetaRepository
. For more details, see NativeMetadataRepository.getInstance.How was this PR tested?
Unit and Integration test included in PR.
Does this PR introduce any user-facing changes?