Skip to content

Conversation

@ishaoxy
Copy link
Contributor

@ishaoxy ishaoxy commented Jun 25, 2025

Description

  • Introduced a new UDF CompareIpFunction for IP address comparisons.
  • Since Calcite does not provide a native SQL type for IP addresses, EXPR_IP is mapped to SqlTypeName.OTHER.
  • Enhanced the registerOperator method to support registering one or multiple operators under a single function name, enabling function overloading based on operand types.
  • Registered the newly implemented IP comparison operators under the existing comparison function names(e.g., =, !=, <, >, etc.).
  • Added new and specific type checkers for IP comparison and CIDR functions to ensure accurate operand validation and operator resolution.

Related Issues

Resolves #3776

Check List

  • New functionality includes testing.
  • New functionality has been documented.
  • New functionality has javadoc added.
  • New functionality has a user manual doc added.
  • API changes companion pull request created.
  • Commits are signed per the DCO using --signoff.
  • Public documentation issue/PR created.

By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.
For more information on following Developer Certificate of Origin and signing off your commits, please check here.

@qianheng-aws
Copy link
Collaborator

I think we should implement implicit conversion before implementing comparison for non-primitive types.

After that, for Ip comparison, we can convert string to ExprIP. And then we can implement some generic comparison functions for types of comparable, via calling their compareTo method.

But current approach is still acceptable if we won't have other UDT in the future.

Copy link
Collaborator

@qianheng-aws qianheng-aws left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please add IT for ip comparison.

// EXPR_IP is mapped to SqlTypeFamily.NULL in
// UserDefinedFunctionUtils.convertRelDataTypeToSqlTypeName
return UDFOperandMetadata.wrap(OperandTypes.STRING_STRING);
return UDFOperandMetadata.wrap(
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe we should use new PPLTypeChecker here and override its checkOperandTypes method to only allow accept string or ip.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Handled.

OpenSearchTypeFactory.TYPE_FACTORY.createSqlType(SqlTypeName.INTEGER));
case ANY, IGNORE -> List.of(
OpenSearchTypeFactory.TYPE_FACTORY.createSqlType(SqlTypeName.ANY));
// We borrow SqlTypeFamily.NULL to represent EXPR_IP. This is a workaround
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If we stop use CompositeOperandTypeChecker and create a new TypeChecker ourself, maybe we can avoid using SqlTypeFamily.NULL to represent EXPR_IP and check operands on RelDataType level.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@qianheng-aws How about this implementation ishaoxy#1
It doesn't use NULL for IP, but created two other classes for IP type checking.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It just came to me that there's not only compare that checks IP type. CIDR_MATCH also has to validate IP types. If we make a specific IP type checker for compare operators, we may also have to create one for each other functions like cidrmatch and geoip.


void populate() {
// register operators for IP comparing
registerOperator(NOTEQUAL, PPLBuiltinOperators.NOT_EQUALS_IP);
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Questions, Do we need to support IP as UDT in PPL engine?

In OpenSearch PPL, IP handling depends on index field type. CIDR-based filtering should use ip_range queries for ip fields, script pushdown / in-memory processing for keyword, text, and runtime string fields.

Field Type Use Case Expectation
IP Field search index=log ip="192.168.0.0/16" Rewrite as term query
  search index | where cidrmatch("192.168.0.0/16", ip) Rewrite as term query
Keyword Field search index=log ip="192.168.0.0/16" Rewrite as term query, extactally keyword match
  search index | where cidrmatch("192.168.0.0/16", ip) Script pushdown — ip field is a string, not rewrite as term query.
Text Field search index=log ip="192.168.0.0/16" Rewrite as query_string query, full text search
  search index | where cidrmatch("192.168.0.0/16", ip) Script pushdown — ip field is a string, not rewrite as term query.
Runtime Field search index=log | parse ip=regex(...)| where ip="192.168.0.0/16" Script pushdown — ip field is a string, it is a string comparsion query
  search index=log | parse ip=regex(...)| where cidrmatch("192.168.0.0/16", ip) Script pushdown — ip field is a string, not rewrite as term query.

@Swiddis Swiddis added the enhancement New feature or request label Jun 26, 2025
@qianheng-aws
Copy link
Collaborator

qianheng-aws commented Jun 30, 2025

@qianheng-aws How about this implementation ishaoxy#1
It doesn't use NULL for IP, but created two other classes for IP type checking.

Yes, I think that's what we need. In the future when we have type conversion, that IPTypeChecker should only accept type of IP, and type conversion should convert String to ExprIpType

It just came to me that there's not only compare that checks IP type. CIDR_MATCH also has to validate IP types. If we make a specific IP type checker for compare operators, we may also have to create one for each other functions like cidrmatch and geoip.

Yes since we treat ExprIPType as a specific type instead of String

@yuancu

ishaoxy and others added 4 commits June 30, 2025 11:35
Signed-off-by: Xinyu Hao <[email protected]>
Make IpComparisonOperators an inner enum of CompareIPFunction
Comment on lines 619 to 623
registerOperator(EQUAL, PPLBuiltinOperators.EQUALS_IP);
registerOperator(GREATER, PPLBuiltinOperators.GREATER_IP);
registerOperator(GTE, PPLBuiltinOperators.GTE_IP);
registerOperator(LESS, PPLBuiltinOperators.LESS_IP);
registerOperator(LTE, PPLBuiltinOperators.LTE_IP);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

SqlStdOperatorTable.EQUALS` is a very basic std operator which widely used in calcite internal, I am worry that it may introduce potential bugs and performance regression for pushdown.

Copy link
Collaborator

@yuancu yuancu Jul 1, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

But Calcite's built-in operators like SqlStdOperatorTable.EQUALS does not handle our UDT like IP. These new operators are only effective to IP comparison, controlled via type checkers. Comparison between other types will still falls to Calcite's built-in comparison operators.

There was two solutions:

  • one is to convert IP UDT to a type that is comparable by calcite's comparators.
  • another is to add new operators exclusively for IP comparision

We opted the latter.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In order to keep the comparison logic the same as v2, I chose to add these specific operator udfs for IP comparison instead of converting it to string type that calcite can compare.

Copy link
Member

@LantaoJin LantaoJin Jul 1, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

These new operators are only effective to IP comparison, controlled via type checkers. Comparison between other types will still falls to Calcite's built-in comparison operators.

Can we move above logic to a specific method registerOverrideOperator, it quite confused me.

Copy link
Member

@LantaoJin LantaoJin Jul 1, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Or modify to

registerOperator(EQUAL, check(PPLBuiltinOperators.EQUALS_IP,SqlStdOperatorTable.EQUALS));

and leverage check() to manage the specific register logic via typeChecker.

check() can name to case() or set() or queue()...

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Or modify to

registerOperator(EQUAL, check(PPLBuiltinOperators.EQUALS_IP,SqlStdOperatorTable.EQUALS));

and leverage check() to manage the specific register logic via typeChecker.

check() can name to case() or set() or queue()...

Modified, improving codes readability.

ishaoxy and others added 4 commits July 3, 2025 15:53
case EXPR_IP -> SqlTypeName.VARCHAR;
// EXPR_IP is mapped to SqlTypeName.OTHER since there is no
// corresponding SqlTypeName in Calcite.
case EXPR_IP -> SqlTypeName.OTHER;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

please update the PR description to align changes.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for reminding me.

} else {
typeChecker = operator.getOperandTypeChecker();
}
public void registerOperator(BuiltinFunctionName functionName, SqlOperator... operators) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

please add a javadoc for this method to explain what are the operators for

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Added.

Signed-off-by: Xinyu Hao <[email protected]>
IPAddress addr1 = IPUtils.toAddress(ip1);
IPAddress addr2 = IPUtils.toAddress(ip2);
int result = IPUtils.compare(addr1, addr2);
return switch (comparisonType) {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For efficiency consideration, we'd better move this switch case to the implement method

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed.


public static boolean compare(Object obj1, Object obj2, ComparisonType comparisonType) {
try {
String ip1 = extractIpString(obj1);
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why not leverage ExprIpValue::compare? Then we don't need to do extra transformation for ExprIPValue

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sorry for only focusing on IPUtils before and ignoring the functions encapsulated in ExprIpValue. Now fixed.

qianheng-aws
qianheng-aws previously approved these changes Jul 7, 2025
@qianheng-aws qianheng-aws merged commit 6c3efa1 into opensearch-project:main Jul 11, 2025
25 checks passed
@opensearch-trigger-bot
Copy link
Contributor

The backport to 2.19-dev failed:

The process '/usr/bin/git' failed with exit code 128

To backport manually, run these commands in your terminal:

# Navigate to the root of your repository
cd $(git rev-parse --show-toplevel)
# Fetch latest updates from GitHub
git fetch
# Create a new working tree
git worktree add ../.worktrees/sql/backport-2.19-dev 2.19-dev
# Navigate to the new working tree
pushd ../.worktrees/sql/backport-2.19-dev
# Create a new branch
git switch --create backport/backport-3821-to-2.19-dev
# Cherry-pick the merged commit of this pull request and resolve the conflicts
git cherry-pick -x --mainline 1 6c3efa1475f815fc9025a2e3c533b7b21d9ea8cf
# Push it to GitHub
git push --set-upstream origin backport/backport-3821-to-2.19-dev
# Go back to the original working tree
popd
# Delete the working tree
git worktree remove ../.worktrees/sql/backport-2.19-dev

Then, create a pull request where the base branch is 2.19-dev and the compare/head branch is backport/backport-3821-to-2.19-dev.

@LantaoJin
Copy link
Member

@ishaoxy please follow the instructions to backport manually

ishaoxy added a commit to ishaoxy/sql that referenced this pull request Jul 13, 2025
* ip_compare operator added

Signed-off-by: Xinyu Hao <[email protected]>

* only type checker issue left

Signed-off-by: Xinyu Hao <[email protected]>

* fix by modifying ip.sqlTypeName from OTHER to NULL in type checker

Signed-off-by: Xinyu Hao <[email protected]>

* fix less

Signed-off-by: Xinyu Hao <[email protected]>

* modify the CalcitePPLFunctionTypeTest text

Signed-off-by: Xinyu Hao <[email protected]>

* allow CalciteIPComparisonIT in CalciteNoPushdownIT

Signed-off-by: Xinyu Hao <[email protected]>

* Modify the signature description in udf

Signed-off-by: Xinyu Hao <[email protected]>

* fix some typing errors

Signed-off-by: Xinyu Hao <[email protected]>

* modify the udfs for better style

Signed-off-by: Xinyu Hao <[email protected]>

* Make IpComparisonOperators an inner enum of CompareIPFunction

Signed-off-by: Yuanchun Shen <[email protected]>

* modify registerOperator

Signed-off-by: Xinyu Hao <[email protected]>

* modify registerOperator

Signed-off-by: Xinyu Hao <[email protected]>

* add type checker for cidr

Signed-off-by: Xinyu Hao <[email protected]>

* add javadoc

Signed-off-by: Xinyu Hao <[email protected]>

* move switch case to the implement method

Signed-off-by: Xinyu Hao <[email protected]>

---------

Signed-off-by: Xinyu Hao <[email protected]>
Signed-off-by: Yuanchun Shen <[email protected]>
Co-authored-by: Yuanchun Shen <[email protected]>
(cherry picked from commit 6c3efa1)
qianheng-aws pushed a commit that referenced this pull request Jul 15, 2025
* Add compare_ip operator udfs (#3821)

* ip_compare operator added

Signed-off-by: Xinyu Hao <[email protected]>

* only type checker issue left

Signed-off-by: Xinyu Hao <[email protected]>

* fix by modifying ip.sqlTypeName from OTHER to NULL in type checker

Signed-off-by: Xinyu Hao <[email protected]>

* fix less

Signed-off-by: Xinyu Hao <[email protected]>

* modify the CalcitePPLFunctionTypeTest text

Signed-off-by: Xinyu Hao <[email protected]>

* allow CalciteIPComparisonIT in CalciteNoPushdownIT

Signed-off-by: Xinyu Hao <[email protected]>

* Modify the signature description in udf

Signed-off-by: Xinyu Hao <[email protected]>

* fix some typing errors

Signed-off-by: Xinyu Hao <[email protected]>

* modify the udfs for better style

Signed-off-by: Xinyu Hao <[email protected]>

* Make IpComparisonOperators an inner enum of CompareIPFunction

Signed-off-by: Yuanchun Shen <[email protected]>

* modify registerOperator

Signed-off-by: Xinyu Hao <[email protected]>

* modify registerOperator

Signed-off-by: Xinyu Hao <[email protected]>

* add type checker for cidr

Signed-off-by: Xinyu Hao <[email protected]>

* add javadoc

Signed-off-by: Xinyu Hao <[email protected]>

* move switch case to the implement method

Signed-off-by: Xinyu Hao <[email protected]>

---------

Signed-off-by: Xinyu Hao <[email protected]>
Signed-off-by: Yuanchun Shen <[email protected]>
Co-authored-by: Yuanchun Shen <[email protected]>
(cherry picked from commit 6c3efa1)

* backport to java11

Signed-off-by: Xinyu Hao <[email protected]>

* backport to java11

Signed-off-by: Xinyu Hao <[email protected]>

* backport to java11

Signed-off-by: Xinyu Hao <[email protected]>

* backport to java11

Signed-off-by: Xinyu Hao <[email protected]>

* backport to java11

Signed-off-by: Xinyu Hao <[email protected]>

* backport to java11

Signed-off-by: Xinyu Hao <[email protected]>

---------

Signed-off-by: Xinyu Hao <[email protected]>
Signed-off-by: Yuanchun Shen <[email protected]>
Co-authored-by: Yuanchun Shen <[email protected]>
@LantaoJin LantaoJin added the backport-manually Filed a PR to backport manually. label Jul 22, 2025
@opensearch-trigger-bot
Copy link
Contributor

The backport to manually failed:

The process '/usr/bin/git' failed with exit code 128

To backport manually, run these commands in your terminal:

# Navigate to the root of your repository
cd $(git rev-parse --show-toplevel)
# Fetch latest updates from GitHub
git fetch
# Create a new working tree
git worktree add ../.worktrees/sql/backport-manually manually
# Navigate to the new working tree
pushd ../.worktrees/sql/backport-manually
# Create a new branch
git switch --create backport/backport-3821-to-manually
# Cherry-pick the merged commit of this pull request and resolve the conflicts
git cherry-pick -x --mainline 1 6c3efa1475f815fc9025a2e3c533b7b21d9ea8cf
# Push it to GitHub
git push --set-upstream origin backport/backport-3821-to-manually
# Go back to the original working tree
popd
# Delete the working tree
git worktree remove ../.worktrees/sql/backport-manually

Then, create a pull request where the base branch is manually and the compare/head branch is backport/backport-3821-to-manually.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

backport 2.19-dev backport-failed backport-manually Filed a PR to backport manually. enhancement New feature or request

Projects

None yet

Development

Successfully merging this pull request may close these issues.

[FEATURE] Support comparing IP values with Calcite

6 participants