Add SourcePage interface for delayed materialization of ConnectorSourceData#24011
Conversation
core/trino-spi/src/main/java/io/trino/spi/connector/NoChannelsSourcePage.java (outdated; resolved)
core/trino-spi/src/main/java/io/trino/spi/connector/StaticSourcePage.java (outdated; resolved)
```java
/**
 * Gets all data.
 */
Page getPage();
```
Could we call it getLoadedPage to make it more obvious that this method will load the underlying data?
I have been thinking we could call this getAllColumns and generally use the term column instead of Block or Page.
Anyway, the next PR after this removes lazy loading entirely, so I don't really want to use that term in the codebase for a while.
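Pulling the fragments quoted in this thread together, the shape under discussion looks roughly like the sketch below. This is an illustration only: `Page` is stubbed, and `InMemorySourcePage` is a hypothetical implementation, not the actual SPI code.

```java
// Minimal stub standing in for io.trino.spi.Page (illustration only).
final class Page
{
    private final int positionCount;

    Page(int positionCount)
    {
        this.positionCount = positionCount;
    }

    int getPositionCount()
    {
        return positionCount;
    }
}

// Rough sketch of the SourcePage surface discussed in this thread.
interface SourcePage
{
    // Number of positions currently visible through this source
    int getPositionCount();

    // Materializes all columns into a Page
    Page getPage();

    // Mutation: restricts this source to the given subset of positions
    void selectPositions(int[] positions, int offset, int size);
}

// Hypothetical in-memory implementation, loosely mirroring StaticSourcePage.
final class InMemorySourcePage implements SourcePage
{
    private Page page;

    InMemorySourcePage(Page page)
    {
        this.page = page;
    }

    @Override
    public int getPositionCount()
    {
        return page.getPositionCount();
    }

    @Override
    public Page getPage()
    {
        return page;
    }

    @Override
    public void selectPositions(int[] positions, int offset, int size)
    {
        // The real implementation delegates to Page.getPositions
        page = new Page(size);
    }
}
```

Because selection is a mutation, the position count observed through the interface shrinks in place rather than producing a second object.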
```java
 * and {@link Page#getPositions(int[], int, int)} where possible, as this allows
 * the underlying reader to filter positions on subsequent reads.
 */
void selectPositions(int[] positions, int offset, int size);
```
This is forcing the selected positions to be a positions list; why not use SelectedPositions here instead of int[] positions, to allow ranges to be passed where that is cheaper?
I expect that for file format readers it will be more efficient to choose to decode/skip batches of positions rather than making that decision at the granularity of each row
I think we will be able to use this API more easily within the new columnar filter evaluation if it takes SelectedPositions as input, otherwise we'd need to always convert to positions list.
SelectedPositions isn't in the SPI. We could move it, but I wasn't sure that was something we wanted.
Generally the APIs for SourcePage were created directly from Page with all unnecessary functions removed. Later in the development process I made selected positions a mutation operation and ended up with this API.
I understand it's not in the SPI today, but can we consider moving it there given my rationale above? Or do you prefer deferring that to a future PR?
Also, does selectPositions necessarily have to be a mutation operation? Why not return a new Page?
I would prefer to delay that to a future PR. We also need to decide if we want to have selected positions or just a selectRange method. I don't have strong feelings either way.
As for why selectPositions is a mutation: it has to do with the desire to allow readers to skip data. If it is not a mutation operation, the reader is not free to skip positions, because the original object still exists. We could make it create a new object and at the same time destroy the original, but that seems worse in practice.
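For context on the list-versus-range distinction raised above, here is a simplified stand-in for the engine's SelectedPositions (which, as noted, is not in the SPI today). The class and factory names below mirror the idea being discussed but are assumptions for illustration, not the real engine API.

```java
// Simplified stand-in for the engine's SelectedPositions: it carries either
// an explicit positions list or a contiguous range of positions.
final class SelectedPositions
{
    private final boolean list;
    private final int[] positions; // only meaningful when list is true
    private final int offset;
    private final int size;

    private SelectedPositions(boolean list, int[] positions, int offset, int size)
    {
        this.list = list;
        this.positions = positions;
        this.offset = offset;
        this.size = size;
    }

    static SelectedPositions positionsList(int[] positions, int offset, int size)
    {
        return new SelectedPositions(true, positions, offset, size);
    }

    // A range needs no per-row int allocation, which is cheaper when whole
    // batches of rows survive (or fail) the filter together.
    static SelectedPositions positionsRange(int offset, int size)
    {
        return new SelectedPositions(false, null, offset, size);
    }

    boolean isList()
    {
        return list;
    }

    int size()
    {
        return size;
    }
}
```

Passing such a type instead of `int[]` would let file format readers decode or skip whole batches when the selection is a range, rather than deciding row by row.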
```java
@Override
public void selectPositions(int[] positions, int offset, int size)
{
    page = page.getPositions(positions, offset, size);
}
```
Is converting to dictionary blocks (which happens internally in getPositions) always a good idea?
I think dictionary blocks created this way won't benefit from dictionary processing optimizations and will have the overhead of dictionary look-ups along with higher memory usage, compared to blocks created from copyPositions.
Also, most dictionary optimizations around re-using work done on the dictionary are based on a reference check on the dictionary in DictionaryBlock, so we might need to think about how to avoid affecting that optimization due to the change in dictionary reference when using getPositions/copyPositions on the original DictionaryBlock produced by the page source.
I understand all of that. The code in this PR doesn't try to make significant performance changes like this. I think we could look at making the change you mention, but I think it requires a lot more thought and performance analysis. Or said another way, this is what our code already does today.
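The trade-off in this exchange can be illustrated with plain int arrays standing in for Blocks (a toy analogy, not Trino code): getPositions-style selection keeps the original values and adds an ids indirection, which is cheap now but costs a lookup on every later read, while copyPositions-style selection pays the copy cost once and drops the originals.

```java
// Toy illustration of getPositions vs copyPositions using int arrays.
final class PositionsDemo
{
    // Analogue of Block.getPositions: wrap the original values with an
    // ids indirection array; nothing is copied yet.
    static int[] dictionaryIds(int[] positions, int offset, int size)
    {
        int[] ids = new int[size];
        System.arraycopy(positions, offset, ids, 0, size);
        return ids; // a later read of row i becomes values[ids[i]]
    }

    // Analogue of Block.copyPositions: materialize the selected values once.
    static int[] copyPositions(int[] values, int[] positions, int offset, int size)
    {
        int[] copy = new int[size];
        for (int i = 0; i < size; i++) {
            copy[i] = values[positions[offset + i]];
        }
        return copy;
    }
}
```

The dictionary view retains the full original array (higher retained memory, per-read indirection); the copy retains only the survivors.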
```java
HiveColumnHandle column = buildColumnHandle(field, checkpointSchemaManager, this.metadataEntry, this.protocolEntry, addStatsMinMaxColumnFilter).toHiveColumnHandle();
columnsBuilder.add(column);
disjunctDomainsBuilder.add(buildTupleDomainColumnHandle(field, column));
if (field == ADD) {
```
Please read the commit comment for more details. This code was challenging to figure out (hours in a debugger), but I think I figured out the intent.
```java
/**
 * Gets the number of positions in the page.
 */
int getPositionCount();
```
With SourcePage now being mutable, there's a potential issue where you might retrieve the positionCount, but then another operation (like calling selectPositions) alters the source, causing the positions to no longer align with the current state.
Yes. There are lots of scenarios where you can get in trouble. The interface is single threaded, so there should be no worries about external actors modifying the contents. The interface design is a compromise between simple usability and performance.
I considered designs where selecting positions resulted in a new object, but that has the problem that it does not allow the reading code to skip data, because the original object still exists and someone may decide to use it.
Users of this interface need to be aware of what they are doing, and if they don't want to deal with things changing, they can simply materialize the whole page.
On the other hand, can we leverage this behavior and avoid setting the position count until we materialize the Page?
e.g.
When a page needs to return as many rows as possible while keeping the total size under 1MB, determining the number of positions is straightforward if the page contains only fixed-size columns. However, if it includes non-fixed-size columns, the number of rows must be estimated, typically using a worst-case scenario.
If we don't need to commit the positionCount for the SourcePage, this problem can be solved.
Maybe. The implementation could delay the position count determination until this method is called... but the position count is needed when any block is fetched, so I'm not sure this will help as much as you think. The most common scenario will be:
- execute filter - load one or more blocks and filter
- select filtered positions - reduce page to a set of positions
- project remaining blocks - load the remaining blocks for the selected positions
or there is no filtering so all blocks just get loaded. Either way, the first piece of information you need is the number of positions to return.
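The filter, select, project flow described above can be sketched with a hypothetical two-column source. Plain int arrays stand in for Blocks here (an assumption for illustration, not the SPI types); the point is that after the mutation in step 2, column B is only ever read for rows that survived the filter on column A.

```java
// Hypothetical two-column source illustrating filter -> select -> project.
final class TwoColumnSourcePage
{
    private final int[] columnA;
    private final int[] columnB;
    private int[] selected;

    TwoColumnSourcePage(int[] columnA, int[] columnB)
    {
        this.columnA = columnA;
        this.columnB = columnB;
        this.selected = new int[columnA.length];
        for (int i = 0; i < columnA.length; i++) {
            selected[i] = i;
        }
    }

    int getPositionCount()
    {
        return selected.length;
    }

    // Step 1: load a column and evaluate the filter on it
    int getA(int position)
    {
        return columnA[selected[position]];
    }

    // Step 2: reduce the page to the positions that passed the filter (mutation)
    void selectPositions(int[] positions, int offset, int size)
    {
        int[] next = new int[size];
        for (int i = 0; i < size; i++) {
            next[i] = selected[positions[offset + i]];
        }
        selected = next;
    }

    // Step 3: project the remaining columns for the selected rows only
    int getB(int position)
    {
        return columnB[selected[position]];
    }
}
```

As the reply notes, the position count is the first thing every step needs, whether or not a filter ran.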
Instead of monitoring for lazy block loading, the page size can be checked after state changes in SFP.
This data source can be used to transform raw file output to the shape required for the query.
Make BucketAdapter and BucketValidator top level classes
The AddFileEntryExtractor was relying on a side effect of the Parquet reader that merged columns with the same name and different fields into a base column. The proper way is to use a dereference projection, but this is not needed here. Instead this code only needs one base column with the correct field names. With this change CheckpointFieldExtractor only needs a single block.
Rename variables to match actual meaning
Set useOrcColumnNames when ORC full acid is used
Simplify code structure and fix typos in docs
Description
This adds a new interface to the SPI, `SourcePage`, which will be the eventual replacement for `Page` in `ConnectorPageSource`. Since `SourcePage` is an interface, it allows the connector to directly know when columns are being accessed.

Additionally, `SourcePage` is not intended to be thread safe, so it can be mutable. Specifically, the interface contains the method `void selectPositions(int[] positions, int offset, int size)`. This reduces the positions that will be returned from the `SourcePage`, and since this is a mutation operation the connector knows that only the specified positions can be accessed. This allows data sources to use this information to skip unnecessary reads.
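What the mutation contract buys can be shown with a hypothetical column reader (invented for illustration; not a class in this PR): once positions have been narrowed, only the surviving rows are ever decoded, and a decode counter proves the rest were skipped.

```java
// Hypothetical reader: decoding happens only for requested positions.
final class CountingColumnReader
{
    private final int[] encoded;
    private int decodeCount;

    CountingColumnReader(int[] encoded)
    {
        this.encoded = encoded;
    }

    // Decode only the requested positions; everything else is skipped
    int[] read(int[] positions, int offset, int size)
    {
        int[] result = new int[size];
        for (int i = 0; i < size; i++) {
            result[i] = decode(positions[offset + i]);
        }
        return result;
    }

    private int decode(int position)
    {
        decodeCount++; // stand-in for real decompression/decoding work
        return encoded[position];
    }

    int getDecodeCount()
    {
        return decodeCount;
    }
}
```

If selection produced a new object instead of mutating, the reader could not safely skip the unselected rows, since the original object might still be read.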
This is based on #24062, so ignore the first three commits. The first commit in this PR is "Move Iceberg reader early exit checks to start of method".

Additional Changes

Add `TransformConnectorPageSource`

This utility class in Hive is used by all object store connectors to transform the raw data from file format readers into the final form needed for the query. Specifically, this class has methods for remapping columns, adding constant values, transforming blocks, and most importantly dereferencing fields. `TransformConnectorPageSource` has replaced the custom adapters in ORC and Parquet.

Removal of Hive, Iceberg, Hudi, and Delta ConnectorPageSource

All of these implementations were doing some simple transforms, and have been replaced with `TransformConnectorPageSource`.

Removal of `ReaderColumns` and `ReaderPageSource`

With the introduction of `TransformConnectorPageSource`, the existing code for managing field dereference pushdown is no longer needed. All places where these classes were used have been updated to use `TransformConnectorPageSource` instead. This has the added benefit of simplifying the code by consolidating the multiple layers of transforms into a single place that creates the transformer, which is much easier to read.

Release notes
( ) This is not user-visible or is docs only, and no release notes are required.
( ) Release notes are required. Please propose a release note for me.
( ) Release notes are required, with the following suggested text: