
Conversation

@bshashikant
Contributor

What changes were proposed in this pull request?

The issue occurs when the client first seeks to an offset and reads, then seeks to a different offset and reads again, with the two reads covering an overlapping set of chunks. After a seek, the chunkPosition inside each BlockInputStream is not reset to 0. The chunk that the seek offset falls into is read correctly, but every subsequent chunk reports its remaining length as 0, so all reads for those chunks return no data. The fix is to reset the position of all subsequent chunks, across all subsequent blocks, to 0 after a seek, so that reading starts from the beginning of each of those chunks.
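As an illustration only, here is a minimal, self-contained sketch of that idea; the classes and fields below are simplified stand-ins and do not reflect Ozone's actual KeyInputStream/BlockInputStream APIs. After a seek, the stream containing the target offset keeps its in-stream position, while every later stream is reset to 0 so its next read starts from the beginning of its data.

```java
import java.util.ArrayList;
import java.util.List;

/** Simplified stand-in for a per-block stream; 'position' plays the role of chunkPosition. */
class SimpleBlockStream {
  final long length;   // bytes available in this block
  long position;       // current read position inside the block

  SimpleBlockStream(long length) {
    this.length = length;
  }

  void seek(long pos) {
    this.position = pos;
  }
}

/** Simplified stand-in for a key-level stream composed of several block streams. */
class SimpleKeyStream {
  private final List<SimpleBlockStream> blockStreams = new ArrayList<>();
  private int currentIndex = 0;

  void addBlock(long length) {
    blockStreams.add(new SimpleBlockStream(length));
  }

  /** Seek to an absolute offset within the key. */
  void seek(long offset) {
    long remaining = offset;
    for (int i = 0; i < blockStreams.size(); i++) {
      SimpleBlockStream block = blockStreams.get(i);
      if (remaining < block.length) {
        // The target offset falls inside this block.
        block.seek(remaining);
        currentIndex = i;
        // The fix described above: every block after the seek target starts from 0.
        // Without this reset, a stale position left over from an earlier read makes
        // the remaining length of those blocks appear to be 0.
        for (int j = i + 1; j < blockStreams.size(); j++) {
          blockStreams.get(j).seek(0);
        }
        return;
      }
      remaining -= block.length;
    }
    throw new IllegalArgumentException("Offset beyond end of key: " + offset);
  }
}
```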

What is the link to the Apache JIRA

https://issues.apache.org/jira/browse/HDDS-2359

How was this patch tested?

The patch was tested by adding unit tests that reliably reproduce the issue. It was also deployed on the real cluster where the issue was first discovered, and the fix was verified there.

Thanks @fapifta for discovering the issue and helping to verify the fix. Thanks @bharatviswa504 and @hanishakoneru for contributing to the fix.

@hanishakoneru
Contributor

Thank you @bshashikant for working on this.
LGTM. +1 pending CI.

@lokeshj1703
Contributor

The changes look good to me. Can you please verify the test failures? There is a failure in TestKeyInputStream.

@mukul1987
Contributor

/retest

@bshashikant
Contributor Author

Thanks @lokeshj1703 for having a look. The failure in TestKeyInputStream happens during the write: the write chunk request counter does not match the expected value because the request was retried. The test passes when I run it locally, so the failure is not related to the patch itself.

@bharatviswa504
Contributor

bharatviswa504 commented Nov 1, 2019

> Thanks @lokeshj1703 for having a look. The failure in TestKeyInputStream happens during the write: the write chunk request counter does not match the expected value because the request was retried. The test passes when I run it locally, so the failure is not related to the patch itself.

To avoid this flakiness, can we change the check from equality to >= writeChunkCount + 3, so that retries are accounted for? (Or, if we have some way to know whether a retry happened, we could apply the > check only in that case.)
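For illustration, here is a minimal sketch of what such a relaxed check could look like, using a simple in-test counter; the class and method names below are hypothetical and are not the identifiers used in TestKeyInputStream.

```java
import java.util.concurrent.atomic.AtomicLong;

/** Illustrative sketch only: the counter is a stand-in for whatever write-chunk metric the real test reads. */
class WriteChunkAssertionSketch {
  private final AtomicLong writeChunkCount = new AtomicLong();

  void onWriteChunk() {
    // A retried request increments the counter again, which is why an
    // equality check on the final count is flaky.
    writeChunkCount.incrementAndGet();
  }

  void assertAtLeastThreeWriteChunks(long baseline) {
    long observed = writeChunkCount.get() - baseline;
    // >= instead of == so that retried write-chunk requests do not fail the test.
    if (observed < 3) {
      throw new AssertionError("Expected at least 3 write-chunk requests, saw " + observed);
    }
  }
}
```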

@bshashikant
Contributor Author

Thanks @bharatviswa504 for the review. There are multiple flaky client test failures that occur intermittently because of random retries during test execution. Can we address this in a separate JIRA altogether?

@bharatviswa504
Contributor

bharatviswa504 commented Nov 6, 2019

> Thanks @bharatviswa504 for the review. There are multiple flaky client test failures that occur intermittently because of random retries during test execution. Can we address this in a separate JIRA altogether?

Sure. We can open a new Jira to address this.

@bharatviswa504 merged commit 9565cc5 into apache:master Nov 6, 2019
@bharatviswa504
Contributor

Thank you @bshashikant for the contribution and all for the reviews.

ptlrs pushed a commit to ptlrs/ozone that referenced this pull request Mar 8, 2025