Skip to content

Use shared Zstd(De)Compressor in ResolvedFunction encoding#11217

Merged
electrum merged 3 commits intotrinodb:masterfrom
wendigo:serafin/faster-resolved-function
Apr 12, 2022
Merged

Use shared Zstd(De)Compressor in ResolvedFunction encoding#11217
electrum merged 3 commits intotrinodb:masterfrom
wendigo:serafin/faster-resolved-function

Conversation

@wendigo
Copy link
Contributor

@wendigo wendigo commented Feb 28, 2022

Description

Speeds up ResolvedFunction serialization and deserialization, i.e. TestLocalExecutionPlanner now executes in 55 30 seconds, instead of 2 minutes 33 seconds.

Is this change a fix, improvement, new feature, refactoring, or other?

Improvement

Is this a change to the core query engine, a connector, client library, or the SPI interfaces? (be specific)

Core query engine

How would you describe this change to a non-technical end user or system administrator?

Speeds up function resolution

Related issues, pull requests, and links

Documentation

(x) No documentation is needed.
( ) Sufficient documentation is included in this PR.
( ) Documentation PR is available with #prnumber.
( ) Documentation issue #issuenumber is filed, and can be handled later.

Release notes

(x ) No release notes entries required.
( ) Release notes entries required with the following suggested text:

# Section
* Fix some things. ({issue}`issuenumber`)

@cla-bot cla-bot bot added the cla-signed label Feb 28, 2022
@findepi findepi requested review from dain and martint February 28, 2022 11:02
@wendigo wendigo force-pushed the serafin/faster-resolved-function branch 2 times, most recently from 0ec551f to bd7976b Compare February 28, 2022 14:23
Copy link
Member

@electrum electrum left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ZstdDecompressor is not safe to share across threads. It uses ZstdFrameDecompressor which holds state during the decompression operation.

ZstdCompressor has no state, so there shouldn't be any benefit to reusing it.

We don't document the thread safety of the Airlift Compressor and Decompressor interfaces, but probably should. cc @martint

@wendigo
Copy link
Contributor Author

wendigo commented Feb 28, 2022

@electrum actually ByteBuffer based methods are thread safe in ZstdDecompressor

@electrum
Copy link
Member

@wendigo you're right. That's a strange choice. We should probably make them both thread safe.

In that case, we need to order the commits so that the byte buffer change is first, so that the commit which uses a shared decompressor is safe.

@wendigo
Copy link
Contributor Author

wendigo commented Feb 28, 2022

@martint
Copy link
Member

martint commented Feb 28, 2022

We don't document the thread safety of the Airlift Compressor and Decompressor interfaces, but probably should. cc

Compressors and Decompressors are not meant to be thread safe. Some may be due to implementation details, but it's not a requirement. Yes, we should document that.

@martint
Copy link
Member

martint commented Feb 28, 2022

actually ByteBuffer based methods are thread safe in ZstdDecompressor

That's only a side effect -- we had to resort to synchronization to prevent the JVM from collecting the objects (and deallocating the underlying buffers) while the method is running. But the methods are not required or guaranteed to be thread safe (as mentioned above, Compressors and Decompressors are not meant to be thread-safe)

martint
martint previously requested changes Mar 1, 2022
@wendigo wendigo force-pushed the serafin/faster-resolved-function branch from bd7976b to 5265db2 Compare March 1, 2022 11:45
@wendigo wendigo requested review from electrum and martint March 2, 2022 10:43
@wendigo wendigo force-pushed the serafin/faster-resolved-function branch from 5265db2 to f08b7b5 Compare March 10, 2022 14:42
@wendigo wendigo requested a review from findepi March 10, 2022 14:42
@findepi
Copy link
Member

findepi commented Mar 14, 2022

is the checks' failure related?

@wendigo
Copy link
Contributor Author

wendigo commented Mar 14, 2022

@findepi I don't think so. There is a problem with upper bounds for dbpool from commons in redis module. I'm investigating.

@wendigo
Copy link
Contributor Author

wendigo commented Mar 14, 2022

@findepi #11462 this will remove the version conflict in this PR.

@findepi
Copy link
Member

findepi commented Mar 14, 2022

@findepi I don't think so. There is a problem with upper bounds for dbpool from commons in redis module. I'm investigating.

Do you mean we don't have precise version pinned?
that would mean the build is not reproducible...

@wendigo
Copy link
Contributor Author

wendigo commented Mar 14, 2022

@findepi there was dependency pulled in by redis module (transitive from jedis client) and I've added another version since the conflict.

@wendigo wendigo force-pushed the serafin/faster-resolved-function branch from f08b7b5 to 81af48e Compare March 15, 2022 11:28
@wendigo
Copy link
Contributor Author

wendigo commented Mar 15, 2022

@findepi PTAL now. I've reworked this PR

Copy link
Member

@findepi findepi left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

"Reuse thread-local Zstd de/compressor in ResolvedFunction to Qualifie…"

@wendigo wendigo force-pushed the serafin/faster-resolved-function branch from 81af48e to faf4cd8 Compare March 15, 2022 12:33
@wendigo wendigo requested a review from findepi March 15, 2022 12:33
@wendigo wendigo requested a review from findepi March 15, 2022 12:33
Copy link
Member

@findepi findepi left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

"Reuse thread-local Zstd de/compressor in ResolvedFunction to Qualifie…"

Copy link
Member

@findepi findepi left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

"Speed up QualifiedName implementation"

Copy link
Member

@findepi findepi left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

"Memoize ResolvedFunction <-> QualifiedName conversions"

Comment on lines 185 to 189
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@dain PTAL

@wendigo wendigo force-pushed the serafin/faster-resolved-function branch from faf4cd8 to 181993d Compare March 15, 2022 19:13
@wendigo
Copy link
Contributor Author

wendigo commented Mar 16, 2022

@findepi
Copy link
Member

findepi commented Mar 17, 2022

thanks @wendigo for benchmark results

@martint @electrum you both requested changes on this PR. Do you actually plan on reviewing it?

@electrum electrum dismissed stale reviews from martint and themself April 7, 2022 17:13

stale

@wendigo wendigo force-pushed the serafin/faster-resolved-function branch from 181993d to 7b5a2ea Compare April 8, 2022 14:09
@wendigo
Copy link
Contributor Author

wendigo commented Apr 8, 2022

@findepi can we merge it?

@findepi
Copy link
Member

findepi commented Apr 8, 2022

@findepi can we merge it?

That's probably a question to David.

@electrum electrum merged commit f0b48b8 into trinodb:master Apr 12, 2022
@github-actions github-actions bot added this to the 377 milestone Apr 12, 2022
@wendigo wendigo deleted the serafin/faster-resolved-function branch April 12, 2022 18:04
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Development

Successfully merging this pull request may close these issues.

4 participants