Skip to content

Change RowExpressionFormatter output to be closer to valid SQL#16335

Merged
rongrong merged 1 commit intoprestodb:masterfrom
lightseba:master
Jun 29, 2021
Merged

Change RowExpressionFormatter output to be closer to valid SQL#16335
rongrong merged 1 commit intoprestodb:masterfrom
lightseba:master

Conversation

@lightseba
Copy link
Contributor

@lightseba lightseba commented Jun 25, 2021

RowExpressionFormatter outputs the type before any constant
(e.g BIGINT 5) for debugging purposes. This is not valid SQL,
so the constant is now output with TypeConstructor (e.g BIGINT'5'),
except for VARBINARY constants, which use the binary literal
syntax (e.g. X'abc123').

RowExpressionFormatter outputs LIKE expressions such as
column LIKE 'pattern%'
as
LIKE(column, CAST(VARCHAR pattern% as LikeExpression))
and LIKE expressions with ESCAPE, such as
column LIKE '$_pattern%' ESCAPE '$'
as
LIKE(column, LIKE_PATTERN(VARCHAR pattern%, VARCHAR $))
The output for LIKE expressions was changed to be valid SQL.

Test plan - unit tests

== RELEASE NOTES ==

General Changes
*  Improved RowExpressionFormatter output to be closer to valid SQL. 
   Constant expressions are now output as ``TYPE 'value'`` 
   (e.g. ``BIGINT'2'``) using TypeConstructor (with single quotes appearing 
   in the string escaped with ``''``), with the exception of ``VARBINARY`` 
   constants, which use the binary literal syntax (e.g. ``X'abc123'``). 
   ``LIKE`` expressions are now output as valid SQL.

@rongrong
Copy link
Contributor

If you want them to be valid prestoSQL they should be in the format of TYPE'value'. For example, bigint'5'. While /*BIGINT*/5 is valid SQL, you might have errors parsing them back because 5 might be resolved to integer rather than bigint for example.

Copy link
Contributor

@kaikalur kaikalur left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@lightseba lightseba force-pushed the master branch 2 times, most recently from 0d148d1 to 10eb5de Compare June 25, 2021 22:04
@lightseba
Copy link
Contributor Author

If you want them to be valid prestoSQL they should be in the format of TYPE'value'. For example, bigint'5'. While /*BIGINT*/5 is valid SQL, you might have errors parsing them back because 5 might be resolved to integer rather than bigint for example.

done

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why is varbinary treated differently rather than using varbinary'xxx'?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It seems like using TypeConstructor on varbinary takes the UTF-8 encoded version of the string as the binary literal, instead of reading it like a hex string. For example, when I run SELECT varbinary'abc123', x'abc123', varbinary'😎' on a local Presto server, I get

presto:sf1> SELECT varbinary'abc123', x'abc123', varbinary'😎';
       _col0       |  _col1   |    _col2    
-------------------+----------+-------------
 61 62 63 31 32 33 | ab c1 23 | f0 9f 98 8e 

I'm not sure if this is intentional behavior or not, but since varbinary literals are output as hex strings, I needed this workaround.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A better way of doing this might be to change the ConstantExpression.toString() rather than putting this complex logic here.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I moved it into a function LiteralInterpreter.evaluateToString(), since turning the constant to a string requires the relevant ConnectorSession and LiteralInterpreter.evaluate() contains similar logic. Thoughts?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sorry I didn't pay close attention. If it's not possible to move this to ConstantExpression, let's keep the logic in RowExpressionFormatter. Thanks!

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You can get the type name generically with type.getTypeSignature().getBase()

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

done

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No space between type and value

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It seems like a space is allowed:

presto:sf1> SELECT bigint'5', TYPEOF(bigint'5'), bigint '5', TYPEOF(bigint '5'); 
 _col0 | _col1  | _col2 | _col3  
-------+--------+-------+--------
     5 | bigint |     5 | bigint 

Is it better to have no space anyways?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ya more like a nit.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

done

@lightseba lightseba force-pushed the master branch 2 times, most recently from 92c1df1 to 9469ccf Compare June 28, 2021 20:32
RowExpressionFormatter outputs the type before any constant
(e.g BIGINT 5) for debugging purposes. This is not valid SQL,
so the constant is now output with TypeConstructor (e.g BIGINT'5'),
except for VARBINARY constants, which use the binary literal
syntax (e.g. X'abc123').

RowExpressionFormatter outputs LIKE expressions such as
  column LIKE 'pattern%'
as
  LIKE(column, CAST(VARCHAR pattern% as LikeExpression))
and LIKE expressions with ESCAPE, such as
  column LIKE '$_pattern%' ESCAPE '$'
as
  LIKE(column, LIKE_PATTERN(VARCHAR pattern%, VARCHAR $))
The output for like expressions was changed to be valid SQL.
@rongrong rongrong merged commit 42cf2e9 into prestodb:master Jun 29, 2021
@ajaygeorge ajaygeorge mentioned this pull request Jul 7, 2021
1 task
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants