-
Notifications
You must be signed in to change notification settings - Fork 156
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
feat: add log1p function #273
Conversation
options: [ NAN, ERROR ] | ||
required: false | ||
- value: fp32 | ||
return: fp32 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I realize the surrounding functions don't do this but do these arguments need a nullability
parameter?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Which arguments; as in - value: fp32
? And by the same logic, would we also want a lot of the arithmetic functions to have a nullability
parameter then? Or did you mean something else?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Does this signature handle the case of NULL
inputs? It looks like it only accepts non-nullable inputs.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, because MIRROR
and DECLARED_OUTPUT
are IMO weird special cases simply because the return type derivation expressions lack the generality to describe them. See https://substrait.io/expressions/scalar_functions/#nullability-handling. MIRROR
and DECLARED_OUTPUT
ignore the nullability flag of the arguments and accept any combination of both nullable and non-nullable arguments, and MIRROR
ignores and overrides the nullability flag of the return type with the boolean or of all argument nullability flags.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
FWIW, for the validator I'll be implementing a generalization of the type derivation grammar that uses type??
to specify "either nullable or non-nullable" in a type pattern, and type?parameter
to bind the nullability to the name parameter
, so variadics aside, you could describe MIRROR
and DECLARED_OUTPUT
more flexibly and explicitly if you want. I want to support that if only to avoid adding code for all the special cases, even if Substrait itself won't ever support that syntax.
ETA: heck, I could do away with the special cases even for variadics if I add a type?parameter?
syntax for patterns, where parameter
is bound to true
iff any matched patterns were nullable :D But I realize that the syntax is arcane at best. Ultimately I just want to be able to round-trip the grammar with my IR for these things while being 100% compatible with Substrait grammar (for as far as that's possible at all due to all the conflicting descriptions in Substrait itself).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The function definition looks good to me, though I'm not sure I would consider this function sufficiently primitive to include in core. After all, it's literally just (a probably optimized form of) log(add(literal(1), x))
. Then again... the same could be said about all but one of log2, log10, natural log, and log with a specified base though, since they can all be rewritten to any other using the right identities. And then cos(x)
is just sin(add(literal(pi/2), x))
, and tan(x)
is just divide(sin(x), cos(x))
, and so on...
I don't know.
@jvanstraten For the record, |
I didn't know about log1p specifically, but figured as much. There's lots of functions like that to trade off performance, accuracy, and stability, like inverse sqrt to name one. I'm just not sure where we should draw the line for Substrait core. A calculator doesn't have a dedicated button for log1p or invsqrt either, because when you're thinking at the abstraction level of doing abstract math those are pretty meaningless operations. I doubt SQL has it either for the same reason, but correct me if I'm wrong. You can make the case that Substrait explicitly operates on fp32 or fp64 though, not on abstract real numbers. So again, I don't feel strongly either way in this case. I'm just worried this kind of thing will explode without bound if we don't draw the line somewhere. |
This PR adds the log1p function.