Skip to content

Conversation

@martint
Copy link
Member

@martint martint commented Nov 20, 2025

Standard SQL supports static method calls on types via the :: operator. While this syntax is generally incompatible with PostgreSQL shorthand cast syntax, there's a subset of that can be safely repurposed to support that functionality.

The SQL specification defines static method calls as:

   <static method invocation> ::=
      <path-resolved user-defined type name> <double colon> <method name>
      [ <SQL argument list> ]

where <path-resolved user-defined type name> translates to:

   <user-defined type name> ::=
      [ <schema name> <period> ] <qualified identifier>

To support casts, we need to extend the rule to support arbitrary expressions as the target of the invocation. To disambiguate a static method call from a cast, we distinguish between type-producing expressions and expressions that produce regular values. For the latter, if the method matches the name of a well-known type, we treat it as a cast. Otherwise, the expression assumed to be a static method call and fail the evaluation with a "not yet supported" error.

One limitation is that casts are only supported for simple types. Parametric types are not yet supported, but it wouldn't be too hard to add. Types whose name don't match the syntax of SQL function calls are not supported either and will be harder to support. That will require introducing type-producing expressions into the language and making types first-class expressions.

Fixes #23795

Release notes

(x) Release notes are required, with the following suggested text:

## General
* Add support to simplified cast expressions with the `::` operator. ({issue}`23795`)

@cla-bot cla-bot bot added the cla-signed label Nov 20, 2025
@martint martint requested review from findepi and kasiafi November 20, 2025 22:10
// the target is a primaryExpression to support PostgreSQL-style casts
// of the form <complex expression>::<type>, which are syntactically ambiguous with
// static method calls
| primaryExpression DOUBLE_COLON identifier ('(' (expression (',' expression)*)? ')')? #staticMethodCall
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

use type on the rhs?

AFAIR engines allow a::timestamp with time zone and likely a::double precision too

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can’t use type, and hence the limitation I mentioned in the description. The standard SQL syntax requires this to be a function invocation, so it has to look like it.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

to me, "syntactically ambiguous with static method calls" means that foo::bar(1) is ambiguous in a sense that it can be two different things

  • PostgreSQL-style cast of expression foo to type bar(1)
  • a static method bar invocation on type foo with parameters 1

it is surprising, but it seems to me that foo::bar is also ambiguous because argument braces are optional

image

now,
foo::timestamp already departs from the spec
foo::timestamp with time zone also departs from the spec, but is consistent with foo::timestamp. And is not ambiguous.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

"syntactically ambiguous with static method calls" means that foo::bar(1) is ambiguous in a sense that it can be two different things

Exactly. That's a syntactic ambiguity that's easy to solve during analysis. We parse all occurrences of :: as static method calls and decide later if it was meant to be a call or a cast depending on what's on the left side of the expression. If it's a type-producing expression (currently, an identifier matching a type name), then we treat it as a static method call. Otherwise, we treat it as a cast.

foo::timestamp with time zone also departs from the spec, but is consistent with foo::timestamp. And is not ambiguous.

Right, but to be able to parse that we'd need to allow either identifier ('(' (expression (',' expression)*)? ')') or type on the right side, which introduces all sorts of problems. See my comment below.

Also, as an aside, note that there's another ambiguous usage (I mention it in the comment below, too) with:

SELECT xxx::double precision

In Trino, that could be interpreted either as SELECT xxx::double AS precision or as a cast to double precision. PostgreSQL doesn't have that problem since double is not a valid type. Which goes to say, sometimes we can't just adopt a language feature from another database, since there are intricate interactions between all features that need to be taken into account.

assertThatThrownBy(() -> SQL_PARSER.createExpression("1::double precision"))
.hasMessageMatching(".*mismatched input.*");

assertThatThrownBy(() -> SQL_PARSER.createExpression("'abc'::timestamp with time zone"))
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

we should support this & decimals.
it's not nice to a user to allow foo::type for most but not all types that are commonly cast in sql
this syntax really helps write sql, but we punish users because they will have to sometimes go back to old syntax just because they need decimal instead of a double, or point in time instead of a zoneless timestamp

it's ok not to support this within this PR, but i would want to understand what it takes to support it in the future. also, can help with the implementation.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

To do that, we'll need to unify these two productions:

  • identifier ('(' (expression (',' expression)*)? ')')?
  • type

Some implications:

  • type will need to be a valid expression, since, for instance, you can have arbitrary nesting: array(array(array(double precision)))
  • there's a syntactic ambiguity between identifier ('(' (expression (',' expression)*)? ')')? and identifier ('(' typeParameter (',' typeParameter)* ')')? if the arguments are all integer numbers. It can be parsed either as a static method invocation or as a generic type
  • we'll need to figure out how to deal with the common that could lead to expensive backtracking. For example, in array(array(array(double precision))), the parser wouldn't know which alternative to follow ahead of time. It would have to go down one of the alternatives until it finds it can't parse it, backtrack all the way to the top and try the other alternative
  • We'll have to resolve other ambiguities or backtracking elsewhere. For example, SELECT xxx::double precision could be interpreted either as a cast to double precision or as SELECT xxx::double AS precision

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

type will need to be a valid expression,

this i don't understand.
types are not expressions in current trino (type is not a first class value) and i don't have intention to make them be.

we'll need to figure out how to deal with the common that could lead to expensive backtracking. For example, in array(array(array(double precision))), the parser wouldn't know which alternative to follow ahead of time. It would have to go down one of the alternatives until it finds it can't parse it, backtrack all the way to the top and try the other alternative

yes, it could require backtracking. but would this be expensive backtracking?
it's not exponential, it's just two passes, right?
it could become exponential if in two passes we also can do backtracking, however :: is not valid part of a type definition

SELECT xxx::double precision could be interpreted either as a cast to double precision or as SELECT xxx::double AS precision

that's a very nice example. some systems solve ambiguity by having deterministic precedence, but i get the point.
i am not concerned about users really needing to type ::double precision. it's a understandable limitation to require them to type this as ::double. they can still write what they need
i am more concern about casting to timestamp with time zone (that's my favorite type btw). how can we allow that?

Standard SQL supports static method calls on types via the :: operator.
While this syntax is generally incompatible with PostgreSQL shorthand cast
syntax, there's a subset of that can be safely repurposed to support that
functionality.

The SQL specification defines static method calls as:

   <static method invocation> ::=
      <path-resolved user-defined type name> <double colon> <method name>
      [ <SQL argument list> ]

where <path-resolved user-defined type name> translates to:

   <user-defined type name> ::=
      [ <schema name> <period> ] <qualified identifier>

To support casts, we need to extend the rule to support arbitrary expressions
as the target of the invocation. To disambiguate a static method call from
a cast, we distinguish between type-producing expressions and expressions that
produce regular values. For the latter, if the method matches the name of a
well-known type, we treat it as a cast. Otherwise, the expression assumed
to be a static method call and fail the evaluation with a "not yet supported"
error.

One limitation is that casts are only supported for simple types. Parametric
types are not yet supported, but it wouldn't be too hard to add. Types whose
name don't match the syntax of SQL function calls are not supported either
and will be harder to support. That will require introducing type-producing
expressions into the language and making types first-class expressions.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Development

Successfully merging this pull request may close these issues.

Cast operator ::

2 participants