Skip to content

Add block/position convention to all distinct from operators#11983

Merged
haozhun merged 1 commit intoprestodb:masterfrom
haozhun:distinctfrom
Dec 17, 2018
Merged

Add block/position convention to all distinct from operators#11983
haozhun merged 1 commit intoprestodb:masterfrom
haozhun:distinctfrom

Conversation

@haozhun
Copy link
Contributor

@haozhun haozhun commented Nov 27, 2018

This supersedes #11263.

#11746 removed the blocker preventing its merge.

@haozhun haozhun requested a review from wenleix November 27, 2018 00:24
@haozhun haozhun force-pushed the distinctfrom branch 2 times, most recently from 2762bb5 to 5f5ca6b Compare November 27, 2018 00:37
@haozhun haozhun changed the title Add block position convention in all distinct from operators Add block/position convention to all distinct from operators Nov 27, 2018
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What happens when the existing operator (here: IS_DISTINCT_FROM) exists but has different calling convention? Will it be adapted?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There is no adaption support right now. It will fail loudly.

Copy link
Contributor Author

@haozhun haozhun Nov 27, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The plan is to introduce adaption in the future. The current idea is that the adaption graph is going to be a directed graph that is not complete.

For example, there is probably going to be an adaption if the invocation convention is BLOCK_POSITION and the definition convention is RETURN_NULL_ON_NULL, but not the other way around.

For future reference,

  • Definition conventions
    • Argument conventions
      • boxed nullable
      • return null on null
      • boolean null flag
      • block + position
    • Return conventions
      • boxed nullable
      • not explicitly nullable (further divides into: never null, automatic null only)
      • BlockBuilder argument (not implemented)
    • Optional extra arguments
      • session
    • Cached instance binder (TODO? need refactor: exposed to SqlFunctionImplementation)
  • Invocation conventions
    • Argument convention
      • never null
      • boxed nullable
      • boolean null flag
      • block + position
    • Return conventions
      • fail on null
      • boxed nullable
      • BlockBuilder argument
    • Extra argument
      • session
      • callsite instances holder

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I know this isn't new annotation, but... What does @BlockPosition mean? This annotates "a block", not "a position" -- this is why i do not understand.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@findepi : This parameter is Block, and next parameter is required to be annotated with BlockIndex. This convention is already used in aggregation framework.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So the @BlockPosition annotation means "use block position calling convention"? As it stands in the code, it is hard to understand

if there was no (obvious) name clash, calling it (@Block Block block, @BlockPosition int position) would look better.
or @ContainingBlock Block block, @BlockPosition int position)

This convention is already used ...

i know i am late with my thoughts. I was just a bit surprised by the naming and thought that, maybe, we can improve a bit here.

Copy link
Contributor Author

@haozhun haozhun Nov 27, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@Block wouldn't work, annotation classes and regular classes are in the same namespace.

The annotation can be dropped all together, just like how @IsNull is implemented. Parsing code would be a little more complicated that way, but not at all infeasible. The parsing would require lookahead because the type being Block isn't sufficient indication. But we already do look ahead for @IsNull.

I agree that @BlockPosition is a terrible name. I don't particularly like @ContainingBlock either.

For this PR, I'd leave this as is. This PR isn't introducing the syntax. It's just using it.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@Block wouldn't work, annotation classes and regular classes are in the same namespace.

of course! I was only trying to convey my thinking what the ideal naming could be.

Parsing code would be a little more complicated that way, but not at all infeasible.

Surely my intention isn't to make things more complicated. Maybe we can think of some better name. Or maybe dropping the annotation is the way.

For this PR, I'd leave this as is. This PR isn't introducing the syntax. It's just using it.

sure!

@findepi
Copy link
Contributor

findepi commented Nov 28, 2018

@haozhun Travis complains a lot.

@haozhun
Copy link
Contributor Author

haozhun commented Nov 30, 2018

Resolved Travis failures

Copy link
Contributor

@wenleix wenleix left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good. All comments are minor.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: Bigint..., ditto for Tinyint, Smallint.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is it possible to use invokeExact here?

Also, I think we can squash all parameters into one line now.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The long term plan is to gradually deprecated Type.equalTo() and use the EQUALS operator with BLOCK_AND_POSITION calling convention right?

What about Type.compareTo() ?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do we want to also implement block position convention for equality, and delegate to use notEquals here? -- We don't need to implement this for now.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks like we can just keep keyType since we no longer need valueType now.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

invokeExact?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do we want to call these classes XXIsDistinctFrom or XXDistinctFrom? It looks like for Array and Map, we don't have Is

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: keyEqualsFunction and keyHashcodeFunction are no longer needed. We can also do all the cleanup in a separate commit / PR.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: Rename this to METHOD_HANDLE_NULL_FLAG ?

Copy link
Contributor

@wenleix wenleix Dec 5, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not related to this PR itself.

Is the motivation to directly call notEqual here (instead of ask Operator Dependency to provide a MethodHandle) because we want Java to inline the call ?

Copy link
Contributor Author

@haozhun haozhun Dec 11, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The reason notEquals is used here directly is not related to performance. Rather, I believe the direct invocation aligns better with the logic intention this line of code intend to convey. It's also simpler.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is the purpose to test all native container types? Then boolean, Block are still untested.

Also, why do we need this test and the next test?

Map distinct from also doesn't test all the native container types :)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I added these tests to exercise delegation to decimal operators.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@haozhun In that case, do we also want to add tests for Map? Or just Array is good enough ? :)

@wenleix wenleix assigned haozhun and unassigned wenleix Dec 5, 2018
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

can be at the same time rename function to eg elementIsDistinct ?

@findepi
Copy link
Contributor

findepi commented Dec 11, 2018

@haozhun please rebase.

@haozhun haozhun force-pushed the distinctfrom branch 4 times, most recently from 203bc28 to e5938ba Compare December 11, 2018 21:17
This commit adds a second implementation to distinct from operator of all
types. This 2nd implementation uses block/position convention. As a
result, for certain types, it's no longer necessary to extract the value
out of the Block in order to invoke "distinct from" operator. This is
expected to resolve performance issue previously introduced in
baad26e (which was work around with a
hack).
@findepi
Copy link
Contributor

findepi commented Dec 15, 2018

@haozhun is it to merged now?

@haozhun haozhun merged commit b845cd6 into prestodb:master Dec 17, 2018
yingsu00 pushed a commit to yingsu00/presto that referenced this pull request Dec 17, 2018
Resolves:
prestodb#11493
prestodb#11978
prestodb#11977

This commit rewrites ARRAY_INTERSECT, ARRAY_DISTINCT, ARRAY_UNION using
a new class HashTable to:
1) Improve performance and reduce memory consumption;
2) Fix the wrong results bug prestodb#11978 and prestodb#11977.

The HashTable class was adapted from JsonUtil#HashTable and TypedSet
with the following design considerations:

1) The HashTable can be created from an immutable Block or an empty
BlockBuilder. If the input is a Block, addIfAbsent() would only need to
update the hash table to add a new value. No additional memory copy is
needed. If the input if an empty BlockBuilder, addIfAbsent() would also
add the new input value to the BlockBuilder in addtion to updating the
hashtable. In either way, we avoided an extra memory copy. Note that the
check of the Block class type does introduce about 10%-30% performance
regression than using two seperate classes, but it is still showing gain
over TypedSet.

The class can return the result Block from the input Block or
BlockBuilder. If the input is BlockBuilder then a Block directly built
from it is returned. Otherwise the values for the positions present in
the hashtable is built and returned.

2) We tried to avoid hash recomputation from callers for contains and
add.

3) We allow the caller to pass in an estimated hashtable size to avoid
rehashing.

The JMH benchmarks were ran on Blocks of 1000 rows with 10, 100, 1000
array entries in each row. BIGINT, VARCHAR, DOUBLE, and BOOLEAN were
tested. Note that all numbers were not using type specialized code
even if they're present. This is to examine the effect of using the
new HashTable class only.

ARRAY_INTERSECT, Using sort (original):
Benchmark                               (arraySizeString)   (type)  Mode  Cnt        Score        Error  Units
BenchmarkArrayIntersect.arrayIntersect                 10   BIGINT  avgt   20      816.963 ±     19.150  ns/op
BenchmarkArrayIntersect.arrayIntersect                 10  VARCHAR  avgt   20     1437.385 ±     32.980  ns/op
BenchmarkArrayIntersect.arrayIntersect                 10   DOUBLE  avgt   20      741.172 ±     13.893  ns/op
BenchmarkArrayIntersect.arrayIntersect                 10  BOOLEAN  avgt   20      451.989 ±     12.216  ns/op
BenchmarkArrayIntersect.arrayIntersect                100   BIGINT  avgt   20    15009.721 ±    576.803  ns/op
BenchmarkArrayIntersect.arrayIntersect                100  VARCHAR  avgt   20    30356.848 ±    607.700  ns/op
BenchmarkArrayIntersect.arrayIntersect                100   DOUBLE  avgt   20    14211.824 ±    269.999  ns/op
BenchmarkArrayIntersect.arrayIntersect                100  BOOLEAN  avgt   20     2698.627 ±    152.828  ns/op
BenchmarkArrayIntersect.arrayIntersect               1000   BIGINT  avgt   20   192445.860 ±   2278.683  ns/op
BenchmarkArrayIntersect.arrayIntersect               1000  VARCHAR  avgt   20   321142.819 ±   7050.989  ns/op
BenchmarkArrayIntersect.arrayIntersect               1000   DOUBLE  avgt   20   203028.772 ±   4898.172  ns/op
BenchmarkArrayIntersect.arrayIntersect               1000  BOOLEAN  avgt   20    22854.985 ±    445.286  ns/op
BenchmarkArrayIntersect.arrayIntersect              10000   BIGINT  avgt   20  2548819.023 ±  50089.751  ns/op
BenchmarkArrayIntersect.arrayIntersect              10000  VARCHAR  avgt   20  2392899.264 ±  47861.656  ns/op
BenchmarkArrayIntersect.arrayIntersect              10000   DOUBLE  avgt   20  2703042.454 ± 104176.133  ns/op
BenchmarkArrayIntersect.arrayIntersect              10000  BOOLEAN  avgt   20   213666.313 ±   1954.165  ns/op

ARRAY_INTERSECT, Using TypedSet (As shown in PR prestodb#11984:
Benchmark                               (arraySizeString)   (type)  Mode  Cnt       Score      Error  Units
BenchmarkArrayIntersect.arrayIntersect                 10   BIGINT  avgt   20     744.993 ±   26.434  ns/op
BenchmarkArrayIntersect.arrayIntersect                 10  VARCHAR  avgt   20    1626.876 ±   80.488  ns/op
BenchmarkArrayIntersect.arrayIntersect                 10   DOUBLE  avgt   20     762.618 ±   18.264  ns/op
BenchmarkArrayIntersect.arrayIntersect                 10  BOOLEAN  avgt   20     563.116 ±   14.056  ns/op
BenchmarkArrayIntersect.arrayIntersect                100   BIGINT  avgt   20    5307.227 ±  176.027  ns/op
BenchmarkArrayIntersect.arrayIntersect                100  VARCHAR  avgt   20   12503.550 ±  930.792  ns/op
BenchmarkArrayIntersect.arrayIntersect                100   DOUBLE  avgt   20    4960.093 ±  218.068  ns/op
BenchmarkArrayIntersect.arrayIntersect                100  BOOLEAN  avgt   20    3506.155 ±   89.350  ns/op
BenchmarkArrayIntersect.arrayIntersect               1000   BIGINT  avgt   20   54011.841 ± 1157.535  ns/op
BenchmarkArrayIntersect.arrayIntersect               1000  VARCHAR  avgt   20  138138.117 ± 2312.833  ns/op
BenchmarkArrayIntersect.arrayIntersect               1000   DOUBLE  avgt   20   52353.568 ± 1500.466  ns/op
BenchmarkArrayIntersect.arrayIntersect               1000  BOOLEAN  avgt   20   27390.156 ±  538.806  ns/op

ARRAY_INTERSECT, Using HashTable, not using PageBuilder (See note1 below)
Benchmark                               (arraySizeString)   (type)  Mode  Cnt       Score       Error  Units
BenchmarkArrayIntersect.arrayIntersect                 10   BIGINT  avgt   20     588.884 ±    23.353  ns/op
BenchmarkArrayIntersect.arrayIntersect                 10  VARCHAR  avgt   20    1209.353 ±    25.677  ns/op
BenchmarkArrayIntersect.arrayIntersect                 10   DOUBLE  avgt   20     394.150 ±     9.150  ns/op
BenchmarkArrayIntersect.arrayIntersect                 10  BOOLEAN  avgt   20     454.669 ±     8.582  ns/op
BenchmarkArrayIntersect.arrayIntersect                100   BIGINT  avgt   20    4619.287 ±   152.689  ns/op
BenchmarkArrayIntersect.arrayIntersect                100  VARCHAR  avgt   20   10578.612 ±   243.154  ns/op
BenchmarkArrayIntersect.arrayIntersect                100   DOUBLE  avgt   20    2330.880 ±    67.917  ns/op
BenchmarkArrayIntersect.arrayIntersect                100  BOOLEAN  avgt   20    2879.377 ±    59.567  ns/op
BenchmarkArrayIntersect.arrayIntersect               1000   BIGINT  avgt   20   46797.957 ±  1540.572  ns/op
BenchmarkArrayIntersect.arrayIntersect               1000  VARCHAR  avgt   20  106199.153 ±  3875.884  ns/op
BenchmarkArrayIntersect.arrayIntersect               1000   DOUBLE  avgt   20   24430.349 ±   558.838  ns/op
BenchmarkArrayIntersect.arrayIntersect               1000  BOOLEAN  avgt   20   27448.122 ±   767.058  ns/op
BenchmarkArrayIntersect.arrayIntersect              10000   BIGINT  avgt   20  517433.121 ± 17836.669  ns/op
BenchmarkArrayIntersect.arrayIntersect              10000  VARCHAR  avgt   20  891996.441 ± 16470.239  ns/op
BenchmarkArrayIntersect.arrayIntersect              10000   DOUBLE  avgt   20  332203.796 ±  7918.177  ns/op
BenchmarkArrayIntersect.arrayIntersect              10000  BOOLEAN  avgt   20  202184.853 ± 51445.059  ns/op

ARRAY_UNION, Using TypedSet (original)
Benchmark                       (arraySizeString)   (type)  Mode  Cnt      Score      Error  Units
BenchmarkArrayUnion.arrayUnion                 10   BIGINT  avgt   20   1106.702 ±  125.186  ns/op
BenchmarkArrayUnion.arrayUnion                 10  VARCHAR  avgt   20   2936.697 ±   75.938  ns/op
BenchmarkArrayUnion.arrayUnion                 10   DOUBLE  avgt   20   1069.611 ±   30.923  ns/op
BenchmarkArrayUnion.arrayUnion                 10  BOOLEAN  avgt   20    442.312 ±   12.642  ns/op
BenchmarkArrayUnion.arrayUnion                100   BIGINT  avgt   20   7565.756 ±  476.906  ns/op
BenchmarkArrayUnion.arrayUnion                100  VARCHAR  avgt   20  19746.303 ±  637.598  ns/op
BenchmarkArrayUnion.arrayUnion                100   DOUBLE  avgt   20   7990.463 ±  315.215  ns/op
BenchmarkArrayUnion.arrayUnion                100  BOOLEAN  avgt   20   2821.322 ±  126.267  ns/op
BenchmarkArrayUnion.arrayUnion               1000   BIGINT  avgt   20  78448.551 ± 1799.462  ns/op
BenchmarkArrayUnion.arrayUnion               1000  VARCHAR  avgt   20  98506.432 ± 2350.677  ns/op
BenchmarkArrayUnion.arrayUnion               1000   DOUBLE  avgt   20  82625.185 ± 3147.564  ns/op

ARRAY_UNION, Using HashTable, not using PageBuilder (See note1 below)
Benchmark                       (arraySizeString)   (type)  Mode  Cnt      Score      Error  Units
BenchmarkArrayUnion.arrayUnion                 10   BIGINT  avgt   20    552.578 ±   25.674  ns/op
BenchmarkArrayUnion.arrayUnion                 10  VARCHAR  avgt   20   1785.893 ±   57.115  ns/op
BenchmarkArrayUnion.arrayUnion                 10   DOUBLE  avgt   20    554.432 ±   19.863  ns/op
BenchmarkArrayUnion.arrayUnion                 10  BOOLEAN  avgt   20    219.533 ±    5.263  ns/op
BenchmarkArrayUnion.arrayUnion                100   BIGINT  avgt   20   3893.698 ±  228.846  ns/op
BenchmarkArrayUnion.arrayUnion                100  VARCHAR  avgt   20  12585.958 ±  327.471  ns/op
BenchmarkArrayUnion.arrayUnion                100   DOUBLE  avgt   20   3843.612 ±  166.522  ns/op
BenchmarkArrayUnion.arrayUnion                100  BOOLEAN  avgt   20   1583.191 ±  386.227  ns/op
BenchmarkArrayUnion.arrayUnion               1000   BIGINT  avgt   20  39900.585 ±  842.960  ns/op
BenchmarkArrayUnion.arrayUnion               1000  VARCHAR  avgt   20  70375.042 ± 1793.881  ns/op
BenchmarkArrayUnion.arrayUnion               1000   DOUBLE  avgt   20  39905.696 ± 1911.857  ns/op
BenchmarkArrayUnion.arrayUnion               1000  BOOLEAN  avgt   20  11147.282 ±  678.497  ns/op

ARRAY_DISTINCT, Using TypedSet (original)
Benchmark                       (arraySizeString)   (type)  Mode  Cnt      Score      Error  Units
BenchmarkArrayUnion.arrayUnion                 10   BIGINT  avgt   20    551.480 ±   17.743  ns/op
BenchmarkArrayUnion.arrayUnion                 10  VARCHAR  avgt   20   1846.533 ±   85.912  ns/op
BenchmarkArrayUnion.arrayUnion                 10   DOUBLE  avgt   20    573.547 ±   35.368  ns/op
BenchmarkArrayUnion.arrayUnion                 10  BOOLEAN  avgt   20    232.252 ±   19.261  ns/op
BenchmarkArrayUnion.arrayUnion                100   BIGINT  avgt   20   4403.410 ±  631.545  ns/op
BenchmarkArrayUnion.arrayUnion                100  VARCHAR  avgt   20  12964.978 ±  707.499  ns/op
BenchmarkArrayUnion.arrayUnion                100   DOUBLE  avgt   20   3841.738 ±  155.047  ns/op
BenchmarkArrayUnion.arrayUnion                100  BOOLEAN  avgt   20   1197.182 ±   63.073  ns/op
BenchmarkArrayUnion.arrayUnion               1000   BIGINT  avgt   20  40894.698 ± 1356.368  ns/op
BenchmarkArrayUnion.arrayUnion               1000  VARCHAR  avgt   20  68847.286 ± 1380.070  ns/op
BenchmarkArrayUnion.arrayUnion               1000   DOUBLE  avgt   20  40348.384 ±  920.434  ns/op
BenchmarkArrayUnion.arrayUnion               1000  BOOLEAN  avgt   20   8892.462 ±  292.235  ns/op

ARRAY_DISTINCT, Using HashTable
Benchmark                              (arraySizeString)   (type)  Mode  Cnt      Score      Error  Units
BenchmarkArrayDistinct.arrayDistinct                  10   BIGINT  avgt   20    422.732 ±   14.037  ns/op
BenchmarkArrayDistinct.arrayDistinct                  10  VARCHAR  avgt   20    889.063 ±   18.377  ns/op
BenchmarkArrayDistinct.arrayDistinct                  10   DOUBLE  avgt   20    328.964 ±    7.290  ns/op
BenchmarkArrayDistinct.arrayDistinct                  10  BOOLEAN  avgt   20    185.788 ±    4.915  ns/op
BenchmarkArrayDistinct.arrayDistinct                 100   BIGINT  avgt   20   3249.734 ±  112.246  ns/op
BenchmarkArrayDistinct.arrayDistinct                 100  VARCHAR  avgt   20   6774.174 ±  129.837  ns/op
BenchmarkArrayDistinct.arrayDistinct                 100   DOUBLE  avgt   20   2137.946 ±   40.572  ns/op
BenchmarkArrayDistinct.arrayDistinct                 100  BOOLEAN  avgt   20    558.567 ±   31.567  ns/op
BenchmarkArrayDistinct.arrayDistinct                1000   BIGINT  avgt   20  32042.905 ±  637.472  ns/op
BenchmarkArrayDistinct.arrayDistinct                1000  VARCHAR  avgt   20  37180.367 ± 1000.611  ns/op
BenchmarkArrayDistinct.arrayDistinct                1000   DOUBLE  avgt   20  21234.565 ±  707.286  ns/op
BenchmarkArrayDistinct.arrayDistinct                1000  BOOLEAN  avgt   20   3887.345 ±  214.052  ns/op

Note 1:
We used pageBuilder in the implementations, but due to an issue on
CachedInstance being created for every batch, this approach is around
2x-20x worse than just creating a new BlockBuilder for each Block being
processed. The JMH benchmarks above were not using PageBuilder. We
will keep the PageBuilder logic in this commit, and will fix the issue
shortly.

TODO:
HashTable is still using the equal to logic. Once PR prestodb#11983 is merged,
we will change it to use the isDistinctFrom operator.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

5 participants