
[Enhancement]: Adding fetch from tiered storage system test #10302

Closed
showuon opened this issue Jul 4, 2024 · 4 comments · Fixed by #10351

Comments

@showuon
Member

showuon commented Jul 4, 2024

Related problem

No response

Suggested solution

In #10149, we added a system test for tiered storage. But the test case testTieredStorageWithAivenPlugin only tests the "upload" path, i.e. that local log segments can be successfully uploaded to the remote storage. We don't have a test for the "download" path, i.e. the consumer fetching data from the remote storage.

I think we should add some more verifications, like:

  1. Make sure the local log segments are deleted after uploading to remote storage. We have set
     .addToConfig("file.delete.delay.ms", 1000)
     .addToConfig("local.retention.ms", 1000)
     so the deletion should happen quickly.
  2. Create a consumer that fetches from the beginning offset; it should read from the remote storage and get the expected records.
  3. I'm not sure if we want to test the remote log deletion path. If so, we can create a tiered-storage-enabled topic with a short "retention.ms" or "retention.bytes", then verify that the remote logs are deleted after the retention expires or the size limit is breached.
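The local-retention settings in point 1 can be sketched as a plain config map. This is a minimal illustration, not the actual system-test code: the class and method names are hypothetical, while the keys are real Kafka topic-level configs with the values from the snippet above.

```java
import java.util.Map;

public class TieredStorageTopicConfig {

    // Hypothetical helper collecting the topic configs suggested above, so that
    // local segments become eligible for deletion shortly after being uploaded
    // to remote storage.
    static Map<String, String> localRetentionConfigs() {
        return Map.of(
            "remote.storage.enable", "true",  // enable tiered storage for the topic
            "file.delete.delay.ms", "1000",   // delete local files 1s after they become deletable
            "local.retention.ms", "1000"      // keep uploaded segments locally for only 1s
        );
    }

    public static void main(String[] args) {
        localRetentionConfigs().forEach((k, v) -> System.out.println(k + "=" + v));
    }
}
```

With such a short local retention, the "local segments deleted" check in step 1 only needs a short wait before asserting that the local log directory no longer holds the uploaded segments.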

Alternatives

No response

Additional context

No response

@Frawless
Member

@showuon hey, the current implementation of the consumer uses the from-the-beginning config, so it sounds to me that your suggested changes are already in place. Or do you suggest some additional check, like starting to consume only when the Kafka local storage is empty?

Regarding 3.: is it something we should test, or is it more likely plugin functionality?

@showuon
Member Author

showuon commented Jul 11, 2024

Or do you suggest some additional check, like starting to consume only when the Kafka local storage is empty?

Correct! That way, we can verify the path of fetching from remote storage works well.
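The check described here can be sketched as a consumer configuration. This is a hedged sketch, not the system test's actual code: the broker address and group id are assumed, but a fresh consumer group with `auto.offset.reset=earliest` is the standard way to force reading from the beginning offset, which, once the local segments are gone, can only be served from remote storage.

```java
import java.util.Properties;

public class RemoteFetchConsumerConfig {

    // Hypothetical settings for the "download" path verification: a fresh group
    // with no committed offsets plus auto.offset.reset=earliest starts from
    // offset 0, so the data must come from remote storage once local copies
    // have been deleted.
    static Properties consumerProps() {
        Properties props = new Properties();
        props.setProperty("bootstrap.servers", "localhost:9092"); // assumed broker address
        props.setProperty("group.id", "remote-fetch-verifier");   // fresh group, no committed offsets
        props.setProperty("auto.offset.reset", "earliest");       // read from the beginning offset
        return props;
    }

    public static void main(String[] args) {
        System.out.println(consumerProps().getProperty("auto.offset.reset"));
    }
}
```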

@showuon
Member Author

showuon commented Jul 11, 2024

Regarding 3.: is it something we should test, or is it more likely plugin functionality?

We should test it. There are 2 parts of logic here:
(1) On the Kafka side, we should honor the retention settings (i.e. retention.ms/retention.bytes) to delete the expired log segments.
(2) In the remote storage, we invoke the methods via the plugin to delete the remote log segments.

So, yes, you are right, it's partially the plugin's functionality. But part of Kafka's logic is also involved. Just like we tested the upload-to-remote-storage part, we want to verify that the Kafka-side logic works well, and "assume" that Aiven's plugin works as expected.
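The Kafka-side part of this check can be sketched as a topic configuration with a short total retention. Again a hypothetical helper, not the test's real code: the keys are real Kafka topic-level configs, and the idea is that with tiered storage enabled and retention.ms breached, Kafka should drive the deletion of the corresponding remote segments through the plugin.

```java
import java.util.Map;

public class RemoteDeletionTopicConfig {

    // Hypothetical configs for the remote-log deletion verification: a short
    // total retention so that remote segments expire quickly and the
    // Kafka-side deletion logic (which invokes the plugin) can be observed.
    static Map<String, String> shortRetentionConfigs() {
        return Map.of(
            "remote.storage.enable", "true", // topic uses tiered storage
            "local.retention.ms", "1000",    // local copies removed quickly after upload
            "retention.ms", "10000"          // total retention; remote segments past this should be deleted
        );
    }

    public static void main(String[] args) {
        System.out.println(shortRetentionConfigs().get("retention.ms"));
    }
}
```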

Does that make sense?

@scholzj
Member

scholzj commented Jul 11, 2024

Discussed on the community call on 10.7.2024: This should be addressed. @Frawless will take a look.
