Skip to content

Function implementation#13384

Merged
rongrong merged 11 commits intoprestodb:masterfrom
rongrong:function-implementation
Oct 29, 2019
Merged

Function implementation#13384
rongrong merged 11 commits intoprestodb:masterfrom
rongrong:function-implementation

Conversation

@rongrong
Copy link
Contributor

@rongrong rongrong commented Sep 12, 2019

Ready for pre-review on high level. I'll probably use some data structures introduced in #13484 and add tests once it's merged.

@rongrong rongrong force-pushed the function-implementation branch from c525055 to b08514e Compare September 12, 2019 06:19
@rongrong rongrong force-pushed the function-implementation branch 5 times, most recently from 24dd8de to 2e31a25 Compare September 14, 2019 01:19
@rongrong rongrong force-pushed the function-implementation branch 6 times, most recently from 66ce470 to bf8ed19 Compare October 9, 2019 01:08
@rongrong rongrong changed the title [POC] Function implementation Function implementation Oct 9, 2019
@caithagoras caithagoras self-assigned this Oct 9, 2019
@rongrong rongrong force-pushed the function-implementation branch 3 times, most recently from 15aab08 to 6bb4075 Compare October 10, 2019 23:50
Copy link
Contributor

@caithagoras caithagoras Oct 11, 2019

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What is preventing us from being able to pass in a Session?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't see this class anywhere in the PR.

@caithagoras
Copy link
Contributor

caithagoras commented Oct 11, 2019

In this PR, FunctionLanguage is used to distinguish between built-in and user-defined functions. Rather than tying it with Language, we should give it a different name since it fullfill a different purpose. For example, suppose we support Python and JS in the future, they'll have a different Language, but might still have the same execution strategy of using an external server.

Some naming suggestions:

  • FunctionImplementationType (this sounds best to me)
  • FunctionImplementation.Type
  • FunctionExecutionStrategy

@caithagoras
Copy link
Contributor

caithagoras commented Oct 11, 2019

A few issues with the FunctionImplementation Interface.

  • Adding FunctionManager.getFunctionImplementation and BuiltInFunctionNamespaceManager.getFunctionImplementation seems to aim for unifying function implementation fetching, since the names are very generic, but that also makes them confusing since they're only fetching scalar functions.
  • I checked all callers of FunctionImplementation.getFunctionLanguage. This method is never actually used to make decisions - it is only used to perform sanity checks, which can be achieved by properly designing the interfaces.
  • FunctionImplementation#getImplementation returns an Object. This indicates something can be improved (even returning an empty interface is better than that).

Here is my proposal:

  • All decisions are made using FunctionMetadata.getFunctionImplemenationType if needed (you're already doing this in this PR)
  • Either
    (*) Introduce a new method FunctionNamespace.getSqlInvokedFunctionImplementation, mimicking the 3 existing getters, doing the checks and the down cast, and returning a SqlInvokedRegularFunctionImplemenation.
    Or
    (**) FunctionManager.getFunctionImplementation and FunctionNamespaceManager.getFunctionImplementation needs to include scalar, aggregation, and window, to at least match the expected behavior as indicated by the method name.
  • Remove the 2 methods in FunctionImplementation to make it an empty
    interface.
    • When you have checkArgument(implementation.getFunctionLanguage() == SQL), that means what you should really pass in instead is a SqlInvokedRegularFunctionImplemenation instance. (Most of cases are in SqlFunctionUtils)
    • If you change all those parameters, I think you'll end up having to do one down-cast right after you fetch the implementation from FunctionManager if you do (**), or no cast at all if you do (*). And I think it is a better practice to cast early in order to fail early.

@rongrong rongrong force-pushed the function-implementation branch from 6bb4075 to d80e9eb Compare October 11, 2019 19:51
@rongrong
Copy link
Contributor Author

In this PR, FunctionLanguage is used to distinguish between built-in and user-defined functions. Rather than tying it with Language, we should give it a different name since it fullfill a different purpose. For example, suppose we support Python and JS in the future, they'll have a different Language, but might still have the same execution strategy of using an external server.

Some naming suggestions:

  • FunctionImplementationType (this sounds best to me)
  • FunctionImplementation.Type
  • FunctionExecutionStrategy

Ya, it makes less and less sense to have this FunctionLanguage especially after the syntax support. The "language" here should theoretically be associated with the "execution engine" (how to run them) so I'll take FunctionExecutionStrategy. If you have more ideas let me know. I'm terrible with names. 🤣

@rongrong
Copy link
Contributor Author

A few issues with the FunctionImplementation Interface.

  • Adding FunctionManager.getFunctionImplementation and BuiltInFunctionNamespaceManager.getFunctionImplementation seems to aim for unifying function implementation fetching, since the names are very generic, but that also makes them confusing since they're only fetching scalar functions.
  • I checked all callers of FunctionImplementation.getFunctionLanguage. This method is never actually used to make decisions - it is only used to perform sanity checks, which can be achieved by properly designing the interfaces.
  • FunctionImplementation#getImplementation returns an Object. This indicates something can be improved (even returning an empty interface is better than that).

Here is my proposal:

  • All decisions are made using FunctionMetadata.getFunctionImplemenationType if needed (you're already doing this in this PR)

  • Either
    (*) Introduce a new method FunctionNamespace.getSqlInvokedFunctionImplementation, mimicking the 3 existing getters, doing the checks and the down cast, and returning a SqlInvokedRegularFunctionImplemenation.
    Or
    (**) FunctionManager.getFunctionImplementation and FunctionNamespaceManager.getFunctionImplementation needs to include scalar, aggregation, and window, to at least match the expected behavior as indicated by the method name.

  • Remove the 2 methods in FunctionImplementation to make it an empty
    interface.

    • When you have checkArgument(implementation.getFunctionLanguage() == SQL), that means what you should really pass in instead is a SqlInvokedRegularFunctionImplemenation instance. (Most of cases are in SqlFunctionUtils)
    • If you change all those parameters, I think you'll end up having to do one down-cast right after you fetch the implementation from FunctionManager if you do (**), or no cast at all if you do (*). And I think it is a better practice to cast early in order to fail early.

We sort of had a conversation about unifying the metadata and implementation but seems like they should still be two separate interfaces. At the point I'm inclined to have the implementation to be an empty interface. I'm thinking of maybe calling it ScalarFunctionImplementation to get away with the integration / window function issues. Problem with unifying them all is that their internal representation for builtin functions are very different.

@rongrong rongrong force-pushed the function-implementation branch from d80e9eb to ecd86a9 Compare October 11, 2019 21:35
Copy link

@highker highker left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

One comment on ScalarFunctionImplementation; otherwise looks good to me.

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This design pattern doesn't look at an "Interface design pattern" but more like a "delegation". FunctionImplementationType should be good enough for caller to decide who to delegate. This interface may not be necessary. In the worst case, presto-main depends on presto-sql-function, parseSqlFunctionExpression can down cast it:

((SqlInvokedRegularSqlFunctionImplementation) functionImplementation).getImplementation()

where only SqlInvokedRegularSqlFunctionImplementation needs a String getImplementation() method.

@rongrong rongrong force-pushed the function-implementation branch 3 times, most recently from c5a897e to 9b6a769 Compare October 23, 2019 19:16
@rongrong
Copy link
Contributor Author

Addressed comments. Please review. Thanks! @caithagoras @highker

@highker highker self-requested a review October 24, 2019 04:12
Copy link

@highker highker left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM; I don't have other comments

@rongrong rongrong force-pushed the function-implementation branch from 9b6a769 to 6563a19 Compare October 29, 2019 00:57
@caithagoras
Copy link
Contributor

Commit title Rename ScalarFunctionImplementation to BuiltInScalarFunctionImplement… exceeds 72 characters and has been cut off.

@rongrong
Copy link
Contributor Author

Commit title Rename ScalarFunctionImplementation to BuiltInScalarFunctionImplement… exceeds 72 characters and has been cut off.

What can I change it to? 🤣

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

5 participants