Skip to content

Handle NaN when extracting predicate#4266

Merged
findepi merged 6 commits intotrinodb:masterfrom
findepi:nan
Jul 1, 2020
Merged

Handle NaN when extracting predicate#4266
findepi merged 6 commits intotrinodb:masterfrom
findepi:nan

Conversation

@findepi
Copy link
Copy Markdown
Member

@findepi findepi commented Jun 29, 2020

Fixes #4119
Fixes #4272

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We should probably stop using == checks against types to avoid any future backward compatibility issues if we decide to create instances on the fly, and for consistency across all type checks. .equals() is more robust and can be implemented internally to do the == check as necessary.

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this is omnipresent pattern across a codebase, so we would need to (a) update all the paces and (b) have some safeguards on CI.
especially until we do (a), there isn't much of an added value in avoiding them in new code.

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

different commit?

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I will rename Handle NaN when extracting predicate to better describe its holistic approach

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@sopel39 suggests we can skip the NaN values instead (like we skip NULLs).

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

i pushed an update

@findepi
Copy link
Copy Markdown
Member Author

findepi commented Jun 30, 2020

@sopel39 @losipiuk @martint
i added a bit of new code.
Not a ton, but would appreciate second look.
I will squash to order the commits appropriately.

Copy link
Copy Markdown
Member

@sopel39 sopel39 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Generally, accepting all non-null values because we've seen NaN is really aggressive. I guess it shouldn't occur in practice too much

if (!values.isEmpty()) {
domain = domain.union(Domain.multipleValues(type, values));
Domain domain;
if (values.isEmpty()) {
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do we need to handle empty as special case? Should Domain.multipleValues handle that
in order to reduce iffs?
For example:

Domain.create(ValueSet.copyOf(type, ImmutableList.of()), false).isNone() == true

Yet multipleValues is implemented as

    public static Domain multipleValues(Type type, List<?> values)
    {
        if (values.isEmpty()) {
            throw new IllegalArgumentException("values cannot be empty");
        }
        if (values.size() == 1) {
            return singleValue(type, values.get(0));
        }
        return new Domain(ValueSet.of(type, values.get(0), values.subList(1, values.size()).toArray()), false);
    }

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

i do not have a strong opinion. There are exactly 3 non-test call sites for multipleValues, so updating it in a follow up would not be a problem.

if (values.isEmpty()) {
domain = Domain.none(type);
}
else if (hasNaN) {
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Consider adding comment. It might not be obvious for reader why NaN makes TupleDomain accept all non-null values.

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This knowledge is sprinkled in multiple places in the code base already.
There seems to be no good place to place such a comment though.

@findepi findepi merged commit d068a3f into trinodb:master Jul 1, 2020
@findepi findepi deleted the nan branch July 1, 2020 07:35
@findepi findepi added this to the 338 milestone Jul 1, 2020
@findepi findepi mentioned this pull request Jul 1, 2020
8 tasks
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Development

Successfully merging this pull request may close these issues.

Dynamic Filter failure in presence of NaN Planning failure when effective predicate contains NaN

4 participants