[native] Add caching of parsed Types#21325
xiaoxmeng
left a comment
@kevinwilfong nice catch. Thanks for the optimization!
Mark pool_ and typeParser_ as const? Thanks!
Nit: drop explicit, since the ctor takes more than one argument? Thanks!
Nit: mark pool_ and queryCtx_ as const?
velox::memory::MemoryPool* const pool_;
velox::core::QueryCtx* const queryCtx_;
We've seen cases of queries that spend a large amount of time just parsing types when converting the Presto Plan to Velox. This seems to be because it parses the same large Row Types that are used across many field accesses. Adding caching within a request shows a substantial decrease in the amount of time it takes to do the conversion. Notably, this helps with timeouts we're seeing making calls from the coordinator to create tasks on the Workers.
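For illustration, the per-request memoization described above might look roughly like the sketch below. It is not the code in this PR: Type is a stand-in for velox::Type, CachingTypeParser and parseCount are hypothetical names, and the expensive parse is replaced by a trivial constructor so the example compiles on its own.

```cpp
#include <cstddef>
#include <memory>
#include <string>
#include <unordered_map>

// Stand-in for velox::Type (assumption, for illustration only).
struct Type {
  std::string signature;
};
using TypePtr = std::shared_ptr<const Type>;

// Hypothetical wrapper that remembers every type string it has parsed.
// One instance is assumed to live for the duration of a single request.
class CachingTypeParser {
 public:
  TypePtr parse(const std::string& text) const {
    auto it = cache_.find(text);
    if (it != cache_.end()) {
      return it->second;  // Cache hit: skip the expensive parse.
    }
    // Placeholder for the real (expensive) parse of the type string.
    ++parseCount_;
    auto type = std::make_shared<const Type>(Type{text});
    cache_.emplace(text, type);
    return type;
  }

  // Number of times the underlying parse actually ran.
  std::size_t parseCount() const {
    return parseCount_;
  }

 private:
  // mutable so that parse() can stay const, as in the diff under review.
  mutable std::unordered_map<std::string, TypePtr> cache_;
  mutable std::size_t parseCount_{0};
};
```

With this shape, parsing the same large Row Type string for many field accesses runs the underlying parse once and returns the same shared pointer afterwards.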
@kevinwilfong I am adding Presto type parser support using Flex/Bison in Velox. facebookincubator/velox#7568
  velox::TypePtr parse(const std::string& text) const;

 private:
  mutable std::unordered_map<std::string, velox::TypePtr> cache_;
Any reason not to use the SimpleLRUCache from Velox?
We use that to cache file handles
https://github.com/facebookincubator/velox/blob/main/velox/connectors/hive/FileHandle.h#L62
I am worried that without a bound, the cache might grow too big in a production system.
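To make the concern concrete, here is a minimal sketch of a size-bounded LRU cache for parsed types. It is a hand-rolled illustration built on std::list and std::unordered_map, not Velox's SimpleLRUCache and not its API; Type is a stand-in for velox::Type, and BoundedTypeCache is a hypothetical name.

```cpp
#include <cstddef>
#include <list>
#include <memory>
#include <string>
#include <unordered_map>
#include <utility>

// Stand-in for velox::Type (assumption, for illustration only).
struct Type {
  std::string signature;
};
using TypePtr = std::shared_ptr<const Type>;

// Hypothetical bounded cache: holds at most `capacity` entries and evicts
// the least recently used one when a new key would exceed that bound.
class BoundedTypeCache {
 public:
  explicit BoundedTypeCache(std::size_t capacity) : capacity_(capacity) {}

  // Returns the cached type and marks it most recently used,
  // or nullptr if the key is not present.
  TypePtr get(const std::string& key) {
    auto it = index_.find(key);
    if (it == index_.end()) {
      return nullptr;
    }
    // splice moves the node to the front without invalidating iterators.
    entries_.splice(entries_.begin(), entries_, it->second);
    return it->second->second;
  }

  void put(const std::string& key, TypePtr value) {
    if (capacity_ == 0) {
      return;
    }
    auto it = index_.find(key);
    if (it != index_.end()) {
      it->second->second = std::move(value);
      entries_.splice(entries_.begin(), entries_, it->second);
      return;
    }
    if (entries_.size() >= capacity_) {
      // Evict the least recently used entry (back of the list).
      index_.erase(entries_.back().first);
      entries_.pop_back();
    }
    entries_.emplace_front(key, std::move(value));
    index_[key] = entries_.begin();
  }

  std::size_t size() const {
    return entries_.size();
  }

 private:
  using Entry = std::pair<std::string, TypePtr>;
  std::size_t capacity_;
  std::list<Entry> entries_;  // Front is most recently used.
  std::unordered_map<std::string, std::list<Entry>::iterator> index_;
};
```

A cache like this trades a little bookkeeping per lookup for a hard upper limit on memory. Whether that is needed depends on the cache's lifetime: a cache scoped to a single request is freed when the request finishes, while a process-wide cache has no such natural bound.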