[INITIAL VERSION] New PhysicalIO #286
Conversation
    private static final boolean DEFAULT_SMALL_OBJECTS_PREFETCHING_ENABLED = true;
    private static final long DEFAULT_SMALL_OBJECT_SIZE_THRESHOLD = 8 * ONE_MB;
    private static final int DEFAULT_THREAD_POOL_SIZE = 96;
    private static final long DEFAULT_READ_BUFFER_SIZE = 8 * ONE_KB;
This is the default block size, right? Should we rename this to DEFAULT_BLOCK_SIZE then?
We already have a configuration named DEFAULT_BLOCK_SIZE_BYTES, which is not actively used and was defined for a separate purpose. I created a new config so that we do not expose a name that refers to the library's internals, which is why I called it the read buffer. Open to other name suggestions.
    @NonNull Metrics aggregatingMetrics,
    @NonNull BlobStoreIndexCache indexCache,
    @NonNull ExecutorService threadPool,
    StreamContext streamContext) {
We are no longer passing down the stream context; we pass down the openstreaminfo.
That is a recent change; it will be adopted as part of the next steps.
     * @param readMode the mode in which the read is being performed (used for tracking or metrics)
     * @throws IllegalArgumentException if the {@code blocks} list is empty
     * @implNote This method uses a fire-and-forget strategy and doesn't return a {@code Future};
     *     failures are logged or wrapped in a {@code RuntimeException}.
It would be good if the Javadocs also described the computations involved in filling the blocks.
Will update the Javadocs in later PRs.
    .range(requestRange)
    .etag(objectKey.getEtag())
    .referrer(new Referrer(requestRange.toHttpString(), readMode))
    .build();
Here we make a single GET request, which may also include ranges that are not required. How have we confirmed that this does not cause any performance issues?
We currently skip over unneeded byte ranges while reading from the stream, but this can be more expensive than issuing multiple GetObject requests, especially when the skip range is large. To address this, we plan to introduce optimizations at the BlockManager level: block requests will be merged only when the gaps between them are within a reasonable threshold. Otherwise, separate S3 requests will be issued to avoid inefficient skipping.
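For illustration, the gap-threshold merging described in this reply could look roughly like the sketch below. This is not the planned BlockManager implementation: `RangeCoalescingSketch`, `ByteRange`, `coalesce`, and `maxGapBytes` are hypothetical names, ranges are assumed to be inclusive and already sorted by start offset, and the threshold value is left to the caller.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical sketch: merge sorted byte ranges when the gap between neighbours is small. */
final class RangeCoalescingSketch {

  /** Inclusive byte range; a stand-in for the library's own range type (assumption). */
  static final class ByteRange {
    final long start;
    final long end;

    ByteRange(long start, long end) {
      if (start < 0 || end < start) {
        throw new IllegalArgumentException("invalid range: " + start + "-" + end);
      }
      this.start = start;
      this.end = end;
    }
  }

  /**
   * Merges neighbouring ranges whose gap is at most maxGapBytes. Each returned range would map
   * to one GET request; ranges separated by a larger gap stay separate so that no large span of
   * unneeded bytes has to be skipped on the stream.
   */
  static List<ByteRange> coalesce(List<ByteRange> sortedRanges, long maxGapBytes) {
    List<ByteRange> merged = new ArrayList<>();
    for (ByteRange next : sortedRanges) {
      if (!merged.isEmpty()) {
        ByteRange last = merged.get(merged.size() - 1);
        // Number of unneeded bytes between the two inclusive ranges (negative if they overlap).
        long gap = next.start - last.end - 1;
        if (gap <= maxGapBytes) {
          merged.set(
              merged.size() - 1, new ByteRange(last.start, Math.max(last.end, next.end)));
          continue;
        }
      }
      merged.add(next);
    }
    return merged;
  }
}
```

With a 64-byte threshold, the ranges 0-99 and 150-199 (gap of 50 bytes) would be fetched with one request, while a range starting at byte 10000 would get its own request.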
    try (InputStream inputStream = objectContent.getStream()) {
      long currentOffset = rangeStart;
      for (DataBlock block : blocks) {
Can we parallelise the filling of the blocks?
Currently, block filling is done sequentially from a single InputStream, which inherently limits parallelism. Since S3 provides a single continuous stream for a given range request, we cannot parallelize reading from it without breaking the stream. However, if we need true parallelism, we can issue separate GetObject requests per block or per group of blocks and process them in parallel, at the cost of increased request overhead and potential S3 throttling. This trade-off can be managed at the BlockManager level based on access patterns and performance goals.
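To make the sequential constraint concrete, here is a minimal sketch of filling several blocks from one stream while skipping the bytes between them. It is not the code in this PR: `SequentialBlockFillSketch`, `BlockSlot`, and `fill` are hypothetical names, and the blocks are assumed to be sorted by start offset and non-overlapping.

```java
import java.io.EOFException;
import java.io.IOException;
import java.io.InputStream;
import java.util.List;

/** Hypothetical sketch: fill blocks in order from one stream, skipping unneeded bytes. */
final class SequentialBlockFillSketch {

  /** Stand-in for a data block: its start offset in the object and its buffer (assumption). */
  static final class BlockSlot {
    final long start;
    final byte[] data;

    BlockSlot(long start, int length) {
      this.start = start;
      this.data = new byte[length];
    }
  }

  /** Reads each block from the stream; the stream's first byte is at offset rangeStart. */
  static void fill(InputStream in, long rangeStart, List<BlockSlot> blocksSortedByStart)
      throws IOException {
    long currentOffset = rangeStart;
    for (BlockSlot block : blocksSortedByStart) {
      if (block.start < currentOffset) {
        throw new IllegalArgumentException("blocks must be sorted and non-overlapping");
      }
      skipFully(in, block.start - currentOffset); // bytes between blocks are read and discarded
      readFully(in, block.data);
      currentOffset = block.start + block.data.length;
    }
  }

  private static void skipFully(InputStream in, long n) throws IOException {
    while (n > 0) {
      long skipped = in.skip(n);
      if (skipped <= 0) {
        // skip() may legitimately return 0; fall back to reading one byte to detect EOF.
        if (in.read() < 0) {
          throw new EOFException("stream ended while skipping");
        }
        skipped = 1;
      }
      n -= skipped;
    }
  }

  private static void readFully(InputStream in, byte[] buffer) throws IOException {
    int offset = 0;
    while (offset < buffer.length) {
      int read = in.read(buffer, offset, buffer.length - offset);
      if (read < 0) {
        throw new EOFException("stream ended before the block was filled");
      }
      offset += read;
    }
  }
}
```

Because every block depends on the stream position left by the previous one, the loop cannot be split across threads; parallelism would have to come from separate requests, as noted above.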
    /** Closes the {@link DataBlockManager} and frees up all resources it holds */
    @Override
    public void close() {}
This method will be implemented as part of the MemoryManager adoption task, so I left it empty for the initial version.
    @Getter private final BlockKey blockKey;
    @Getter private final long generation;
    private final CountDownLatch dataReadyLatch = new CountDownLatch(1);
nit: a lot of people (like me!) won't be familiar with what a CountDownLatch does; it would be great to add a comment here about the purpose of this variable.
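For readers in the same position, a small sketch of how a one-shot `CountDownLatch` can act as a "data is ready" gate. This is an illustration only, not the DataBlock code from this PR: `LatchedBlockSketch`, `setData`, and `awaitData` are hypothetical names.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.TimeUnit;

/** Hypothetical sketch: readers block until a writer publishes the block's data. */
final class LatchedBlockSketch {
  // Starts at 1 and is counted down exactly once, when the data has been written.
  // Any thread calling await() before that point waits; afterwards await() returns immediately.
  private final CountDownLatch dataReadyLatch = new CountDownLatch(1);
  private volatile byte[] data;

  /** Called by the thread that fills the block. */
  void setData(byte[] bytes) {
    this.data = bytes;
    dataReadyLatch.countDown(); // releases all current and future waiters
  }

  /** Called by readers; waits until the data is available or the timeout elapses. */
  byte[] awaitData(long timeout, TimeUnit unit) {
    try {
      if (!dataReadyLatch.await(timeout, unit)) {
        throw new IllegalStateException("block data was not ready in time");
      }
    } catch (InterruptedException e) {
      Thread.currentThread().interrupt(); // preserve the interrupt for callers
      throw new IllegalStateException("interrupted while waiting for block data", e);
    }
    return data;
  }
}
```

A latch fits here because the transition from "empty" to "filled" happens once and is never undone.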
    final Range range =
        new Range(
            blockIndex * configuration.getReadBufferSize(),
            Math.min((blockIndex + 1) * configuration.getReadBufferSize(), getLastObjectByte()));
nit: maybe move this last-byte calculation out of the constructor call and add comments on how you calculate it, e.g. what does (blockIndex + 1) * configuration.getReadBufferSize() mean?
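One possible shape for the extracted calculation, as a sketch only. `BlockRangeSketch` and `blockRange` are hypothetical names, and the sketch assumes an inclusive end offset (hence the `- 1`), which may differ from how `Range` is defined in this PR.

```java
/** Hypothetical sketch: byte range covered by a fixed-size block, clamped to the object end. */
final class BlockRangeSketch {

  /** Returns {start, end} with both offsets inclusive (assumption). */
  static long[] blockRange(long blockIndex, long blockSize, long objectSize) {
    // Block N starts N full blocks into the object.
    long start = blockIndex * blockSize;
    if (blockIndex < 0 || blockSize <= 0 || start >= objectSize) {
      throw new IllegalArgumentException("block " + blockIndex + " is outside the object");
    }
    // (blockIndex + 1) * blockSize is the start of the NEXT block, so the last byte of this
    // block is one before it.
    long unclampedEnd = (blockIndex + 1) * blockSize - 1;
    // The final block of an object is usually shorter, so clamp to the last object byte.
    long lastObjectByte = objectSize - 1;
    long end = Math.min(unclampedEnd, lastObjectByte);
    return new long[] {start, end};
  }
}
```

For an 8 KB block size and a 20000-byte object, block 0 covers bytes 0-8191 and block 2 covers bytes 16384-19999.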
ahmarsuhail left a comment
Took a high-level look, looks good :) Will review again properly next week.
…alIO (#287)

## Description of change
- This change adopts the changes from [PR](#283) to the new Physical IO implementation.
- Updates comment in DataBlock object

#### Relevant issues
[OpenStremInformation PR](#283)
[Initial version of Physical IO](#286)

---
By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license and I agree to the terms of the [Developer Certificate of Origin (DCO)](https://developercertificate.org/).

Co-authored-by: Erdogan Ozkoca <ozkoca@amazon.com>
## Description of change
This PR adopts the memory manager changes to the new PhysicalIO.

#### Relevant issues
PR History:
- #286
- #287

#### Does this contribution introduce any breaking changes to the existing APIs or behaviors?
No

#### Does this contribution introduce any new public APIs or behaviors?
No

Co-authored-by: Erdogan Ozkoca <ozkoca@amazon.com>
## Description of change
This PR adds a new method optimizeReads to the RangeOptimiser class to improve read performance by intelligently grouping and splitting block indexes. The implementation reduces the complexity in DataBlockManager and makes the optimization logic more testable.

Changes are:
- Adds readAheadBytes logic
- Adds sequential prefetching logic
- Groups sequential block indexes together
- Splits large sequential groups into smaller chunks based on configuration parameters
- Refactored DataBlockManager to use the new method instead of implementing the logic itself
- Added comprehensive unit tests for the new method

Out of Scope
- Range coalescing will be implemented in a separate PR

#### Relevant issues
PR History: #286 #287 #288

#### Does this contribution introduce any breaking changes to the existing APIs or behaviors?
No

#### Does this contribution introduce any new public APIs or behaviors?
No
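As a rough illustration of the grouping and splitting steps listed for optimizeReads, consider the sketch below. It is not the RangeOptimiser code: `BlockGroupingSketch`, `groupAndSplit`, and `maxBlocksPerGroup` are hypothetical, and the real method splits based on configuration parameters that are not shown in this description.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical sketch: group consecutive block indexes, capping the size of each group. */
final class BlockGroupingSketch {

  /**
   * Walks sorted block indexes and starts a new group whenever the sequence breaks or the
   * current group has reached maxBlocksPerGroup. Each group would map to one read request.
   */
  static List<List<Integer>> groupAndSplit(List<Integer> sortedIndexes, int maxBlocksPerGroup) {
    if (maxBlocksPerGroup <= 0) {
      throw new IllegalArgumentException("maxBlocksPerGroup must be positive");
    }
    List<List<Integer>> groups = new ArrayList<>();
    List<Integer> current = new ArrayList<>();
    for (int index : sortedIndexes) {
      boolean contiguous =
          !current.isEmpty() && index == current.get(current.size() - 1) + 1;
      boolean full = current.size() >= maxBlocksPerGroup;
      if (!current.isEmpty() && (!contiguous || full)) {
        groups.add(current);
        current = new ArrayList<>();
      }
      current.add(index);
    }
    if (!current.isEmpty()) {
      groups.add(current);
    }
    return groups;
  }
}
```

For example, indexes 1, 2, 3, 4, 5, 9, 10 with a cap of two blocks per group produce the groups [1, 2], [3, 4], [5], [9, 10].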
## Description of change
This PR merges the new PhysicalIO changes into the Blob object and starts to use the new implementation.

Next Steps:
- Range coalescing implementation
- Retry policy implementation

#### Relevant issues
PR History: #286 #287 #288 #289

#### Does this contribution introduce any breaking changes to the existing APIs or behaviors?
No

#### Does this contribution introduce any new public APIs or behaviors?
No

#### How was the contribution tested?
Unit test

#### Does this contribution need a changelog entry?
N/A
## Description of change
This PR adds the capability of retry to the new PhysicalIO

#### Relevant issues
#286 #287 #288 #289 #294 #316

#### Does this contribution introduce any breaking changes to the existing APIs or behaviors?
No

#### Does this contribution introduce any new public APIs or behaviors?
No

#### How was the contribution tested?
Unit tests
Added iostats get request callback to streamReader (awslabs#317)

This PR moves the IOStats GET request callback from Block to StreamReader. Needs to be done as part of the code rebase.

Telemetry support for physical IO (awslabs#318)

This PR adds telemetry measures for StreamReader and BlockManager.

Introducing retry policy to new PhysicalIO (awslabs#320)

This PR adds the capability of retry to the new PhysicalIO
## Description of change
This PR rebases and integrates the changes in PR #321

#### Relevant issues
#286 #287 #288 #289 #294 #316 #320

#### Does this contribution introduce any breaking changes to the existing APIs or behaviors?
No

#### Does this contribution introduce any new public APIs or behaviors?
No

#### How was the contribution tested?
Unit tests
## Description of change
This PR changes the default read buffer size to 128KB for better performance

#### Relevant issues
#286 #287 #288 #289 #294 #316 #320 #323

#### Does this contribution introduce any breaking changes to the existing APIs or behaviors?
No

#### Does this contribution introduce any new public APIs or behaviors?
No

#### How was the contribution tested?
Unit tests

#### Does this contribution need a changelog entry?
N/A
## Description of change
This PR implements a new PhysicalIO design with key improvements:

- **Fixed Block Size:** Previously, block sizes varied based on request ranges, requiring entire request ranges to complete before blocks became available. The new design uses fixed-size blocks that become ready as soon as individual blocks are filled, enabling faster data access and better parallelization.
- **Direct Block Writing:** Eliminates an extra memory copy by writing S3 data directly into Block storage instead of copying from intermediate buffers, reducing memory overhead and CPU usage.
- **Improved Concurrency:** Fixed-size blocks allow multiple blocks to be processed independently, improving throughput for concurrent read operations.
- **Better Memory Management:** Predictable block sizes enable more efficient memory allocation and cache management strategies.
- **Enhanced Read Performance:** Blocks become available for reading as soon as they're filled, rather than waiting for entire request ranges to complete, reducing read latency.

#### Relevant issues
#286 #287 #288 #289 #294 #316 #320 #323 #324

#### Does this contribution introduce any breaking changes to the existing APIs or behaviors?
No

#### Does this contribution introduce any new public APIs or behaviors?
No

#### How was the contribution tested?
Unit tests, microbenchmarks

#### Does this contribution need a changelog entry?
- [ ] I have updated the CHANGELOG or README if appropriate
Description of change
This PR includes the initial changes required for the new PhysicalIO. The new objects mirror existing ones: for example, DataBlock is the new version of the Block object and will be renamed at the end of the integration.
Next Steps:
Does this contribution introduce any breaking changes to the existing APIs or behaviors?
No
Does this contribution introduce any new public APIs or behaviors?
No
How was the contribution tested?
Integrated with existing system in a separate branch and ran tests
Does this contribution need a changelog entry?
N/A
By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license and I agree to the terms of the Developer Certificate of Origin (DCO).