Flink: Project the RowData to remove meta-columns #3240
Merged
Commits (15)
37db27a Flink: Project the RowData from DeleteFilter to remove meta-columns. (openinx)
071079b Address the nested projection issues (openinx)
8a6a6f0 Merge remote-tracking branch 'community/master' into project-row-data (Reo-LEI)
b9e95c6 Merge branch 'apache:master' into flink-project-row-data (Reo-LEI)
a3b3741 Merge branch 'apache:master' into flink-project-row-data (Reo-LEI)
71e1938 Merge remote-tracking branch 'community/master' into flink-project-ro… (Reo-LEI)
b29c56f Make RowDataProjection as row data wrapper and support project nested… (Reo-LEI)
82ea9e5 Fix checkstyle. (Reo-LEI)
c69608c Merge branch 'apache:master' into flink-project-row-data (Reo-LEI)
34670be Remove the commented out code. (Reo-LEI)
e3a73da Merge branch 'flink-project-row-data' of https://github.com/Reo-LEI/i… (Reo-LEI)
35eda4c Merge branch 'apache:master' into flink-project-row-data (Reo-LEI)
b14ace1 Adressing some comments. (Reo-LEI)
4e7b786 Adressing some comments. (Reo-LEI)
debc299 Adressing some comments. (Reo-LEI)
239 changes: 239 additions & 0 deletions
flink/src/main/java/org/apache/iceberg/flink/data/RowDataProjection.java
```java
/*
 * Licensed to the Apache Software Foundation (ASF) under one
 * or more contributor license agreements.  See the NOTICE file
 * distributed with this work for additional information
 * regarding copyright ownership.  The ASF licenses this file
 * to you under the Apache License, Version 2.0 (the
 * "License"); you may not use this file except in compliance
 * with the License.  You may obtain a copy of the License at
 *
 *   http://www.apache.org/licenses/LICENSE-2.0
 *
 * Unless required by applicable law or agreed to in writing,
 * software distributed under the License is distributed on an
 * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
 * KIND, either express or implied.  See the License for the
 * specific language governing permissions and limitations
 * under the License.
 */

package org.apache.iceberg.flink.data;

import java.util.Map;
import org.apache.flink.table.data.ArrayData;
import org.apache.flink.table.data.DecimalData;
import org.apache.flink.table.data.MapData;
import org.apache.flink.table.data.RawValueData;
import org.apache.flink.table.data.RowData;
import org.apache.flink.table.data.StringData;
import org.apache.flink.table.data.TimestampData;
import org.apache.flink.table.types.logical.RowType;
import org.apache.flink.types.RowKind;
import org.apache.iceberg.Schema;
import org.apache.iceberg.flink.FlinkSchemaUtil;
import org.apache.iceberg.relocated.com.google.common.base.Preconditions;
import org.apache.iceberg.relocated.com.google.common.collect.Maps;
import org.apache.iceberg.types.Types;

public class RowDataProjection implements RowData {

  /**
   * Creates a projecting wrapper for {@link RowData} rows.
   * <p>
   * This projection will not project the nested children types of repeated types like lists and maps.
   *
   * @param schema schema of rows wrapped by this projection
   * @param projectedSchema result schema of the projected rows
   * @return a wrapper to project rows
   */
  public static RowDataProjection create(Schema schema, Schema projectedSchema) {
    return RowDataProjection.create(FlinkSchemaUtil.convert(schema), schema.asStruct(), projectedSchema.asStruct());
  }

  /**
   * Creates a projecting wrapper for {@link RowData} rows.
   * <p>
   * This projection will not project the nested children types of repeated types like lists and maps.
   *
   * @param rowType flink row type of rows wrapped by this projection
   * @param schema schema of rows wrapped by this projection
   * @param projectedSchema result schema of the projected rows
   * @return a wrapper to project rows
   */
  public static RowDataProjection create(RowType rowType, Types.StructType schema, Types.StructType projectedSchema) {
    return new RowDataProjection(rowType, schema, projectedSchema);
  }

  private final RowData.FieldGetter[] getters;
  private RowData rowData;

  private RowDataProjection(RowType rowType, Types.StructType rowStruct, Types.StructType projectType) {
    Map<Integer, Integer> fieldIdToPosition = Maps.newHashMap();
    for (int i = 0; i < rowStruct.fields().size(); i++) {
      fieldIdToPosition.put(rowStruct.fields().get(i).fieldId(), i);
    }

    this.getters = new RowData.FieldGetter[projectType.fields().size()];
    for (int i = 0; i < getters.length; i++) {
      Types.NestedField projectField = projectType.fields().get(i);
      Types.NestedField rowField = rowStruct.field(projectField.fieldId());

      Preconditions.checkNotNull(rowField,
          "Cannot locate the project field <%s> in the iceberg struct <%s>", projectField, rowStruct);

      getters[i] = createFieldGetter(rowType, fieldIdToPosition.get(projectField.fieldId()), rowField, projectField);
    }
  }

  private static RowData.FieldGetter createFieldGetter(RowType rowType,
                                                       int position,
                                                       Types.NestedField rowField,
                                                       Types.NestedField projectField) {
    Preconditions.checkArgument(rowField.type().typeId() == projectField.type().typeId(),
        "Different iceberg type between row field <%s> and project field <%s>", rowField, projectField);

    switch (projectField.type().typeId()) {
      case STRUCT:
        RowType nestedRowType = (RowType) rowType.getTypeAt(position);
        return row -> {
          RowData nestedRow = row.isNullAt(position) ? null : row.getRow(position, nestedRowType.getFieldCount());
          return RowDataProjection
              .create(nestedRowType, rowField.type().asStructType(), projectField.type().asStructType())
              .wrap(nestedRow);
        };

      case MAP:
        Types.MapType projectedMap = projectField.type().asMapType();
        Types.MapType originalMap = rowField.type().asMapType();

        boolean keyProjectable = !projectedMap.keyType().isNestedType() ||
            projectedMap.keyType().equals(originalMap.keyType());
        boolean valueProjectable = !projectedMap.valueType().isNestedType() ||
            projectedMap.valueType().equals(originalMap.valueType());
        Preconditions.checkArgument(keyProjectable && valueProjectable,
            "Cannot project a partial map key or value with non-primitive type. Trying to project <%s> out of <%s>",
            projectField, rowField);

        return RowData.createFieldGetter(rowType.getTypeAt(position), position);

      case LIST:
        Types.ListType projectedList = projectField.type().asListType();
        Types.ListType originalList = rowField.type().asListType();

        boolean elementProjectable = !projectedList.elementType().isNestedType() ||
            projectedList.elementType().equals(originalList.elementType());
        Preconditions.checkArgument(elementProjectable,
            "Cannot project a partial list element with non-primitive type. Trying to project <%s> out of <%s>",
            projectField, rowField);

        return RowData.createFieldGetter(rowType.getTypeAt(position), position);

      default:
        return RowData.createFieldGetter(rowType.getTypeAt(position), position);
    }
  }

  public RowData wrap(RowData row) {
    this.rowData = row;
    return this;
  }

  private Object getValue(int pos) {
    return getters[pos].getFieldOrNull(rowData);
  }

  @Override
  public int getArity() {
    return getters.length;
  }

  @Override
  public RowKind getRowKind() {
    return rowData.getRowKind();
  }

  @Override
  public void setRowKind(RowKind kind) {
    throw new UnsupportedOperationException("Cannot set row kind in the RowDataProjection");
  }

  @Override
  public boolean isNullAt(int pos) {
    return rowData == null || getValue(pos) == null;
  }

  @Override
  public boolean getBoolean(int pos) {
    return (boolean) getValue(pos);
  }

  @Override
  public byte getByte(int pos) {
    return (byte) getValue(pos);
  }

  @Override
  public short getShort(int pos) {
    return (short) getValue(pos);
  }

  @Override
  public int getInt(int pos) {
    return (int) getValue(pos);
  }

  @Override
  public long getLong(int pos) {
    return (long) getValue(pos);
  }

  @Override
  public float getFloat(int pos) {
    return (float) getValue(pos);
  }

  @Override
  public double getDouble(int pos) {
    return (double) getValue(pos);
  }

  @Override
  public StringData getString(int pos) {
    return (StringData) getValue(pos);
  }

  @Override
  public DecimalData getDecimal(int pos, int precision, int scale) {
    return (DecimalData) getValue(pos);
  }

  @Override
  public TimestampData getTimestamp(int pos, int precision) {
    return (TimestampData) getValue(pos);
  }

  @Override
  @SuppressWarnings("unchecked")
  public <T> RawValueData<T> getRawValue(int pos) {
    return (RawValueData<T>) getValue(pos);
  }

  @Override
  public byte[] getBinary(int pos) {
    return (byte[]) getValue(pos);
  }

  @Override
  public ArrayData getArray(int pos) {
    return (ArrayData) getValue(pos);
  }

  @Override
  public MapData getMap(int pos) {
    return (MapData) getValue(pos);
  }

  @Override
  public RowData getRow(int pos, int numFields) {
    return (RowData) getValue(pos);
  }
}
```
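The core idea of `RowDataProjection` (map each projected field ID to its position in the source row, then read through a reusable wrapper) can be illustrated without Flink or Iceberg on the classpath. The following is only a minimal sketch under simplifying assumptions: rows are plain `Object[]`, schemas are arrays of field IDs, and `ProjectionSketch`, its `get` method, and the meta-column ID `100` are hypothetical names that do not appear in this PR. Nested struct, map, and list handling is left out.

```java
import java.util.HashMap;
import java.util.Map;

/** Simplified stand-in for RowDataProjection: rows are Object[], schemas are arrays of field IDs. */
final class ProjectionSketch {
  // For each projected position, the position of the same field ID in the source row.
  private final int[] sourcePositions;
  // The row currently being viewed; replaced on every wrap() call.
  private Object[] row;

  private ProjectionSketch(int[] sourcePositions) {
    this.sourcePositions = sourcePositions;
  }

  static ProjectionSketch create(int[] rowFieldIds, int[] projectedFieldIds) {
    // Build the field ID -> source position lookup once, not per row.
    Map<Integer, Integer> fieldIdToPosition = new HashMap<>();
    for (int i = 0; i < rowFieldIds.length; i++) {
      fieldIdToPosition.put(rowFieldIds[i], i);
    }

    int[] positions = new int[projectedFieldIds.length];
    for (int i = 0; i < positions.length; i++) {
      Integer position = fieldIdToPosition.get(projectedFieldIds[i]);
      if (position == null) {
        throw new IllegalArgumentException("Cannot locate projected field id " + projectedFieldIds[i]);
      }
      positions[i] = position;
    }
    return new ProjectionSketch(positions);
  }

  /** Points this wrapper at a new row and returns the same (mutable) instance. */
  ProjectionSketch wrap(Object[] newRow) {
    this.row = newRow;
    return this;
  }

  int getArity() {
    return sourcePositions.length;
  }

  boolean isNullAt(int pos) {
    return row == null || row[sourcePositions[pos]] == null;
  }

  Object get(int pos) {
    return row[sourcePositions[pos]];
  }

  public static void main(String[] args) {
    // Source rows carry field IDs 1 and 2 plus a hypothetical meta-column with ID 100.
    ProjectionSketch projection = create(new int[] {1, 2, 100}, new int[] {2, 1});

    ProjectionSketch view = projection.wrap(new Object[] {"a", 7, 12345L});
    System.out.println(view.getArity() + " " + view.get(0) + " " + view.get(1)); // prints "2 7 a"

    // wrap() hands back the same object, so an earlier reference now sees the new row.
    ProjectionSketch second = projection.wrap(new Object[] {"b", 8, 67890L});
    System.out.println((view == second) + " " + view.get(1)); // prints "true b"
  }
}
```

Because `wrap` returns `this`, as it does in the class above, the wrapper avoids a per-row allocation, but a caller that needs to hold on to a projected row after the next `wrap` call would have to copy its values first.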