Skip to content

[WIP] Query plan pushdown#12375

Closed
hellium01 wants to merge 10 commits intoprestodb:masterfrom
hellium01:QueryPlanPushdown
Closed

[WIP] Query plan pushdown#12375
hellium01 wants to merge 10 commits intoprestodb:masterfrom
hellium01:QueryPlanPushdown

Conversation

@hellium01
Copy link
Contributor

@hellium01 hellium01 commented Feb 22, 2019

Related to: #12368

@highker highker self-assigned this Feb 22, 2019
Copy link

@highker highker left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Typo "funtion" in commit title.

Finished reviewing commit "Move funtion signature to SPI". Minor comments only. I think currently it makes sense to move Signature to spi.function rather than a new package given most function dependencies won't go to the new package and Signature is a common helper that will be used by connectors.

return new Signature(name, FunctionKind.SCALAR, ImmutableList.of(), ImmutableList.of(), returnType, argumentTypes, false);
}

public static SignatureBuilder builder()
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Moving Signature to presto-spi but leaving SignatureBuilder in presto-main reminds me that we have Type in presto-spi but all type coercion in presto-main. This weakens the power of the moved class.

We can

  • move SignatureBuilder + OperatorSignatureUtils to presto-spi
  • rename OperatorSignatureUtils to SignatureUtils.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think SignatureBuilder depends on guava.

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We will use UnmodifiableList

@@ -229,9 +209,4 @@ public static LongVariableConstraint longVariableExpression(String variable, Str
{
return new LongVariableConstraint(variable, expression);
}

public static SignatureBuilder builder()
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Leave as is

Copy link

@highker highker left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Finished reviewing "Move RowExpression to SPI".
Maybe add commit body "Move RowExpression to SPI and rename RowExpression to ColumnExpression".

This commit fails to compile. There are uncleaned RowExpression existing in the codebase. For example, SqlToRowExpressionTranslator is barely a file move without altering the class name.

Copy link

@highker highker left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Finished commit "Add ColumnReferenceExpression". Please update the commit body to say: ColumnReferenceExpression is a new expression for table source blah blah. The only concern of this commit is the translate() interface. That may need some more discussion. Otherwise looks good to me.

}

@Override
public Void visitColumnReference(ColumnReferenceExpression columnReferenceExpression, Void context)
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

columnReferenceExpression -> reference

import static java.util.stream.Collectors.toMap;

public final class SqlToRowExpressionTranslator
public final class SqlToColumnExpressionTranslator
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Either squash the current commit with the previous one ("Move RowExpression to SPI") or move the renaming change to the previous one in order to not to fail the compilation.

return translate(expression, functionKind, types, ImmutableMap.of(), functionRegistry, typeManager, session, optimize);
}

public static ColumnExpression translate(
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Users can be confused by these interfaces. Just keep the most expressive one and remove the other two. Update the callers' parameters with ImmutableList.of() if necessary.

if (columnHandles.containsKey(node.getName())) {
return new ColumnReferenceExpression(columnHandles.get(node.getName()), getType(node));
}
else if (inputChannels.containsKey(node.getName())) {
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

remove else

public ColumnExpression visitCall(CallExpression call, Void context)
{
return call.replaceChildren(call.getArguments().stream()
.map(argument -> argument.accept(this, context)).collect(toImmutableList()));
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

we usually do

something
    .func1()
    .func2()
    .func3();

}
}

private static class InlineInputChannelVistor
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Typo Vistor and call it InlineInputChannelsVisitor

this.inputs = requireNonNull(inputs, "input is null");
}

public static ColumnExpression inlineInputs(ColumnExpression columnExpression, List<ColumnExpression> inputs)
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

inlineInputChannels

public ColumnExpression visitInputReference(InputReferenceExpression reference, Void context)
{
int field = reference.getField();
checkArgument(field >= 0 && field < inputs.size(), "Unknown input field");
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

format("Unknown input field %s", field)

FunctionKind functionKind,
Map<NodeRef<Expression>, Type> types,
Map<String, ColumnHandle> columnHandles,
List<String> inputChannelNames,
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not a big fan passing these two parameters in. They are too specific for table source. But before we address them, I find columnHandles is always empty passed from the callers (even in your last commit). Is that true? If it is, let's just remove it. If not, I'm thinking maybe leveraging context to tell the visitor what exactly to translate: a filter, a project, ... in order to hide the parameters.

protected RowExpression visitSymbolReference(SymbolReference node, Void context)
protected ColumnExpression visitSymbolReference(SymbolReference node, Void context)
{
if (columnHandles.containsKey(node.getName())) {
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This condition uses the implication of column names are symbol names. I feel a bit risky here. Also, these if statements kinda break abstraction because they are basically serving two different callers. One possibility is to leverage Context as stated above. But let's figure out whether columnHandles is used or not...

Copy link

@highker highker left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Skip the review of "Convert ColumnExpression back to Expression" for now. I will come back to this commit after going through all following commits to see if there is way to refactor PlanNode to avoid dependency on sql.tree.

Copy link

@highker highker left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Finished review "Add translation from planNode to tableExpression". Most are minor issues. TableExpression is a much cleaner IR than the current PlanNode. It's very unfortunate we cannot use PlanNode for connectors.

The biggest concern is the translation, not the correctness per se but consistency. Especially if someone alters the PlanNode in the future. Currently, the tests with reflection can avoid this to some extent. But still, I'm thinking if we can move the translation to some place (e.g., RelationPlanner) to have a consistent translation.


public TableExpressionToPlanNodeTranslator(PlanNodeIdAllocator idAllocator, SymbolAllocator symbolAllocator, LiteralEncoder literalEncoder, Metadata metadata)
{
requireNonNull(metadata, "metadata is null");
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Move private final Metadata metadata; as the first member variable. Make this line this.metadata = requireNonNull(metadata, "metadata is null");

return tableExpression.accept(new Visitor(session, connectorId), new Context(outputSymbols));
}

private static class Context
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

PlanNodeTranslatorContext

}
}

private class Visitor
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

PlanNodeTranslatorVisitor

return plan.accept(new PlanRewriter(), null);
}

private class PlanRewriter
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

TableExpressionTranslatorVisitor

private ColumnExpression toColumnExpression(Expression expression, List<Symbol> inputs, FunctionKind type)
{
Map<NodeRef<Expression>, Type> expressionTypes =
getExpressionTypes(session, metadata, parser, types, expression, ImmutableList.of(), WarningCollector.NOOP, false);
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Move this to the previous line

@Override
public String toString()
{
return "FilterExpression{" +
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Use com.google.common.base.MoreObjects.toStringHelper

@Override
public String toString()
{
return "AggregateExpression{" +
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Use com.google.common.base.MoreObjects.toStringHelper

@Override
public String toString()
{
return "ProjectExpression{" +
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Use toStringHelper


import java.util.List;

public abstract class TableExpression
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Add getSources if you want.

@Override
public String toString()
{
return "TableScanExpression{" +
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Use toStringHelper

Copy link

@highker highker left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Finished review "Add connector based optimizer". Pretty clean patch % minor comments

PageSourceManager pageSourceManager,
IndexManager indexManager,
NodePartitioningManager nodePartitioningManager,
ConnectorOptimizationRuleManager optimizationRuleManager, NodePartitioningManager nodePartitioningManager,
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

One parameter a line

// index manager
binder.bind(IndexManager.class).in(Scopes.SINGLETON);

binder.bind(ConnectorOptimizationRuleManager.class).in(Scopes.SINGLETON);
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Abstract ConnectorOptimizationRuleManager out with RuleManager as the interface. Create NoopRuleManager for worker. Bind different implementation for coordinator and worker.


public class ConnectorOptimizationRuleManager
{
private final ConcurrentMap<ConnectorId, ConnectorRuleProvider> providers = new ConcurrentHashMap<>();
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Use Map interface

}

@Override
public PlanNode optimize(PlanNode plan, Session session, TypeProvider types, SymbolAllocator symbolAllocator, PlanNodeIdAllocator idAllocator, WarningCollector warningCollector)
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Add requireNonNulls in the function body.

return progress;
}

private static class Context
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ConnectorOptimizerContext

{
requireNonNull(node, "node is null");
TraitGroup merged = TraitGroup.emptyTraitGroup();
node.getSources().stream()
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

new line stream(). Same for the one below.

traitCollectors.stream()
.filter(traitCollector -> traitCollector.canApplyTo(node))
.map(traitCollector -> traitCollector.exploreTraits(node))
.forEach(traitGroup -> merged.addAll(traitGroup));
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

merged::addAll


public void addAll(TraitGroup other)
{
other.traitGroups.entrySet().stream()
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The following statement is good enough

other.traitGroups.forEach((type, traits) -> traits.getValues().forEach(value -> addTrait(type, value)));


boolean match(ConnectorSession session, TableExpression tableExpression);

Optional<TableExpression> optimize(ConnectorSession session, TableExpression tableExpression);
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why Optional? If there is nothing to optimize, we can return the given tableExpression.

Copy link

@highker highker left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Finished review "Prevent overriding picked layout and move connector optimzation to later stage". LGTM. Refine the title as "Move connector optimization after picking table layout".

{
List<PlanNode> possiblePlans = PickTableLayout.listTableLayouts(node, predicate, true, session, types, idAllocator, metadata, parser, domainTranslator);
List<PlanWithProperties> possiblePlansWithProperties = possiblePlans.stream()
List<PlanWithProperties> possiblePlansWithProperties = Stream.concat(node.getLayout().isPresent() ? Stream.of(node) : Stream.empty(), possiblePlans.stream())
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Add a comment

.addAll(new ExtractSpatialJoins(metadata, splitManager, pageSourceManager, sqlParser).rules())
.add(new InlineProjections())
.build()));
builder.add(new ConnectorOptimizer(metadata, sqlParser, connectorOptimizationRuleManager, new LiteralEncoder(metadata.getBlockEncodingSerde())));
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Add a comment saying why we put the connector optimizer here.

Copy link

@highker highker left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Finished review "Add checks to prevent uncleaned relation be returned from connector". Alter the title to "Add checks to prevent uncleaned relation returned from connectors"

BTW, the commit timestamps are messed up. Use git rebase master --ignore-date -x 'git commit --amend -C HEAD --date=\"$(date -R)\" && sleep 1.05' to clean up timestamps before sending out a PR.


private void checkNoColumnInputs(UnaryTableExpression node)
{
int numColumnReferences = node.getOutput().stream()
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Put stream() to its own line.

import static java.util.stream.Collectors.toMap;

public final class SqlToColumnExpressionTranslator
public final class
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This seems to be a typo

public static class InputCollector
implements ColumnExpressionVisitor<Set<ColumnExpression>, Void>
{
private InputCollector()
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

private InputCollector() {}

@Override
public Set<ColumnExpression> visitCall(CallExpression call, Void context)
{
return call.getArguments().stream()
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Put stream() to its own line.

Copy link

@highker highker left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewed "Allow use intermediate type as input of aggregation functin" but has not finished yet. I will pause it here for a moment given the logic is quite hard to follow. This commit may need some refactoring. But let's don't worry about this for now and focus on the translation part first.

.anyMatch(this::isSpecializedType);
}

private boolean isSpecializedType(TypeSignature typeSignature)
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

static


private boolean isSpecializedType(TypeSignature typeSignature)
{
return !typeSignature.getParameters()
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

inline and use

return typeSignature.getParameters().stream().noneMatch(TypeSignatureParameter::isVariable);

try {
return specializedAggregationCache.getUnchecked(getSpecializedFunctionKey(signature));
}
catch (UncheckedExecutionException e) {
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why this change?

throw new PrestoException(FUNCTION_NOT_FOUND, message);
}

private boolean isSpecializedFunction(Signature signature)
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

static


private boolean isSpecializedFunction(Signature signature)
{
return isSpecializedType(signature.getReturnType()) && signature.getArgumentTypes().stream()
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

return isSpecializedType(signature.getReturnType()) &&
                signature.getArgumentTypes().stream().anyMatch(FunctionRegistry::isSpecializedType);

ConnectorPageSource source = pageSourceProvider.createPageSource(operatorContext.getSession(), split, columns);
if (source instanceof RecordPageSource) {
cursor = ((RecordPageSource) source).getCursor();
pageSource = source;
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is not right.... a typo?

@highker
Copy link

highker commented Feb 25, 2019

Did a first round review. Two high-level comments:

  • Given the amount of code moved to SPI is little, now I'm NOT that concerned whether Signature or TableExpression should go to SPI. It is ok to put them there for now (maybe even with visitor as a first attempt; ultimately we will get rid of visitor pattern).
  • Translation is an issue. From long term, TableExpression will be the IR and PlanNode is the execution plan. To achieve that, we may consider the following bullets while providing the capability for connectors to participate plan optimization:
    • Can we move out dependency of Expression from PlanNode to avoid ColumnExpression translation back to Expression?
    • Can we unify the translation of RelationPlan and PlanNodeToTableExpression/TableExpressionToPlanNode? The reasons are (1) making sure the translation is consistent and (2) moving towards the long term goal to deprecate PlanNodeToTableExpression translation and break RelationPlan translation into AST->IR and IR->PlanNode (TableExpressionToPlanNode).

@highker
Copy link

highker commented Feb 25, 2019

@hellium01, as a first step, I think it is safe to separate the first commit ("move signature to SPI") as a separate PR to prevent future rebase. That one looks clean.

@rongrong
Copy link
Contributor

rongrong commented Feb 25, 2019

@highker @hellium01 Except that it should be FunctionHandle that's moved to SPI rather than Signature after #12360. I'd suggest you wait for that one. Otherwise this (move signature to SPI) will need to be reverted later.

Copy link

@highker highker left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

High-level comments regarding commit "Convert ColumnExpression back to Expression":

  • ColumnExpression is not expressive enough as the IR representation for Expression. Indeed, I feel there is a need to introduce the concept of IR fully. An IR is the augmented representation for AST with metadata info in Reverse Polish notation.
  • With IR introduced, the refactoring can be much easier. Most Expression operations are just doing disjunction/conjunction decomposition, which can be easily achieve by IR. I was also thinking directly use Expression given it has been deeply rooted in our codebase doing the job that originally belongs to IR.

cc: @wenleix @rongrong

@highker
Copy link

highker commented Feb 25, 2019

@hellium01 Add up to the previous comment:
ColumnExpression is very close to what we want now; we can just increment it a bit. Shouldn't be very hard. BTW, instead of using ColumnExpression or RowExpression, how about calling it Value or ValueExpression? PyTorch (https://github.com/pytorch/pytorch/wiki/PyTorch-IR) uses this term...

@highker
Copy link

highker commented Feb 26, 2019

I think I might have a way to avoid explicit translation. The ultimate goal is to

  • Make sure the bi-directional translation is loss-less (i.e., do not lose information after roundtrip).
  • Make sure the bi-directional translation is complete (i.e., every member of PlanNode should be mapped).
  • The future translation (join, union, etc) is scalable (i.e., users do not need to writer visitors or rules).

Proposing annotation-based translation

  • Have a "mapping manager" (e.g., TableExpressionManager or ConnectorPlanNodeManager) to keep the mapping between the two nodes.

  • Like JsonPropery/JsonCreator, we introduce @PlanNodeCreator, @PlanNodeProperty, etc. Each annotation should have a corresponding function to translate each type. Table filter for example:

@Immutable
@PlanNodeMapping(FilterExpression.class)
public class FilterNode
        extends PlanNode
{
    private final PlanNode source;
    private final Expression predicate;

    @JsonCreator
    @PlanNodeCreator
    public FilterNode(
            @JsonProperty("id") PlanNodeId id,
            @JsonProperty("source") @PlanNodeProperty("source") PlanNode source,
            @JsonProperty("predicate") @PlanNodeProperty("predicate") Expression predicate)
    {
        super(id);

        this.source = source;
        this.predicate = predicate;
    }

    @PlanNodeProperty  // with name or not is optional; it can be inferred from getter.
    @JsonProperty("predicate")
    public Expression getPredicate()
    {
        return predicate;
    }

    @PlanNodeProperty
    @JsonProperty("source")
    public PlanNode getSource()
    {
        return source;
    }
}

@PlanNodeProperty tells connectorPlanNodeManager the field to translate. The translation should based on types. Each annotated type should have all its inner types translatable. For example Expression <-> RowExpression is the leaf translation; while PlanNode <-> TableExpression is recursively. In terms of what PlanNode is translatable, it depends on what has been annotated by @PlanNodeMapping.

Both FilterNode and FilterExpression should have annotations to denote their mapping (together with the mappings for their member variables).

The benefit of this approach is to shift node-oriented translation to type-oriented translation, which can easily scale up in the future.

@highker
Copy link

highker commented Feb 27, 2019

Some more example code as a starting point

@Target({ElementType.TYPE})
@Retention(RetentionPolicy.RUNTIME)
public @interface PlanNodeMapping
{
    Class<? extends TableExpression> value() default UnmappedTableExpression.class;

    final class UnmappedTableExpression
            extends TableExpression
    {
        @Override
        public List<ColumnExpression> getOutput()
        {
            throw new UnsupportedOperationException();
        }
    }
}



public class TestTranslationManager
{
    @Test
    public void testTranslation()
    {
        ConnectorPlanNodeManager manager = new ConnectorPlanNodeManager();
        manager.register(FilterNode.class);
        manager.toTableExprssion(new FilterNode(new PlanNodeId("1"), null, new LogicalBinaryExpression(AND, new BooleanLiteral("False"), new BooleanLiteral("true"))));
    }

    public class ConnectorPlanNodeManager
    {
        private final BiMap<Class<? extends PlanNode>, Class<? extends TableExpression>> mapping = HashBiMap.create();

        public ConnectorPlanNodeManager()
        {
        }

        public void register(Class<? extends PlanNode> clazz)
        {
            PlanNodeMapping planNodeMapping = clazz.getAnnotation(PlanNodeMapping.class);
            mapping.put(clazz, planNodeMapping.value());

            // TODO: sanity check to make sure the translation is (1) complete and (2) loss-less
        }

        public TableExpression toTableExprssion(PlanNode planNode)
        {
            System.out.println(mapping.get(planNode.getClass()));
            Method[] methods = planNode.getClass().getMethods();

            for (Method method : methods) {
                PlanNodeProperty planNodeProperty = method.getAnnotation(PlanNodeProperty.class);
                if (planNodeProperty != null) {
                    String property = planNodeProperty.value();
                    if (property.equals(USE_DEFAULT_NAME)) {
                        String methodName = method.getName();
                        if (methodName.startsWith("is")) {
                            property = methodName.substring(2);
                            property = Character.toLowerCase(property.charAt(0)) + property.substring(1);
                        }
                        else if (methodName.startsWith("get")) {
                            property = methodName.substring(3);
                            property = Character.toLowerCase(property.charAt(0)) + property.substring(1);
                        }
                        else {
                            throw new IllegalArgumentException();
                        }
                    }
                    Class<?> fromType = method.getReturnType();
                    // TODO: translate
                }
            }

            return null;
        }

        // TODO: put type-to-type translation here
        private ColumnExpression toColumnExpression(Session session, Expression projectionExpression)
        {
            // TODO: put translation here
            return null;
        }
    }
}

@hellium01
Copy link
Contributor Author

Thanks for the review, I will start addressing it in following days once I am back. About the type to type translation, the difficulty is we sometimes have more than 1 node to represent a relation.
For example, in the case of aggregation with grouping set, we have planNode aggregation -> groupId. GroupId is actually implementation details from engine which we don't want or have no need to let the connector to know about.

@hellium01
Copy link
Contributor Author

Expression has too much semantic information that we actually never use later in planning phase. The only reason we convert columnExpression (or whatever name we decide) back to expression is because we are currently using expression in planNode. That being said, ColumnExpression should be able to fully converted back to expression given enough care but this means we should always attempt to make this mapping work (we today have to manage conversion from expression to columnExpression anyway).

I think we should provide only the simple relation information to connector and leave all the implementation related information in the engine. Simpler representation should always be better since it is easier to be operated on. Rewrite columnExpression into connector logic will be much easier than rewrite expression. Assuming in the future we want to match if a subtree has corresponding materialized table, matching column/table expression (simple relation) will be much easier than matching planNode mapped expression which has more information that connector don't care but has to handle (so even we don't do it in engine level, connector will most likely eventually do a translation itself so that it can reduce complexity so that it can operate and rewrite).

Simpler representation (tableExpression vs planNode) also prevents connector from seeing changing node types or node structure, which makes connector optimization rules more stable across versions: the only care we need is when we have a change in planNode, we need to make sure the translation back/forth does not break which is much easier than making sure all different connectors have no broken rule.

@highker
Copy link

highker commented Feb 27, 2019

Nah, I'm not object to a simple representation, especially given I have proposed the annotation-based translator. I'm totally cool with a "view" concept on top of PlanNode. In terms of how we should implement the translation in details, we can work it out later. Whether it is 1:1 or m:n mapping. Annotation is very powerful to have the capability of supporting m:n mapping as well.

@highker
Copy link

highker commented Jun 19, 2019

superseded by #12920

@highker highker closed this Jun 19, 2019
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants