[native-pos] Disable AsyncDataCache (thus prefetching)#20969
mbasmanova merged 1 commit into prestodb:master
.../src/main/java/com/facebook/presto/spark/execution/property/NativeExecutionSystemConfig.java
@Yuhta Please update the commit message (and PR description) with a brief explanation of why this change is needed.
shrinidhijoshi left a comment
We tried to run a test offline, but the internal tooling was slowing us down. As this is non-breaking and very specific to the C++ process, I am approving this. We can come back and fix it if something is missing.
presto-iceberg/src/test/java/com/facebook/presto/iceberg/TestIcebergParquetMetadataCaching.java
mbasmanova left a comment
@Yuhta Use [native-pos] prefix for Presto-on-Spark specific changes.
mbasmanova left a comment
@Yuhta What is the definition of the num-connector-io-threads property? Is it documented already? If not, would you document it?
mbasmanova left a comment
We see negative performance impact (5% ~ 30%) when split prefetch is enabled,
What is causing that?
mbasmanova left a comment
@Yuhta Since you have the [native-pos] prefix, adding "for Presto on Spark" is redundant. Let's remove it from both the PR title and the commit message.
.../src/main/java/com/facebook/presto/spark/execution/property/NativeExecutionSystemConfig.java
mbasmanova left a comment
We see negative performance impact (5% ~ 30%) when split prefetch is enabled,
@Yuhta What's the cause of the performance impact?
Codenotify: Notifying subscribers in CODENOTIFY files for diff 3d3ee49...ca341eb.
steveburnett left a comment
Thanks for the documentation! I added a few suggestions to make some statements more explicit and easier to understand. If I have made a suggestion that changes the meaning in a way you disagree with, please help me understand the meaning.
Are we missing some other configs re: SSD? I assume one needs to specify a root directory or something, no?
You can use the async data cache without SSD.
Got it, but if I want to use SSD I assume I need more properties than just size, no?
Yes, you will need to set the SSD size in another property to indicate how much space is available.
consistency: async-data-cache-enabled vs. async-cache-ssd-gb - one says data-cache, the other just cache. Would be nice to fix.
That would be a big, risky change; I would rather not make it here.
Understood. Perhaps file a GitHub issue about this.
Should these have default values? true, 0 and 30 (according to documentation).
We set it to false here to disable it. The default values are for when it is not overwritten here.
Would you explicitly set it, or add a comment that the default is 'false'?
private boolean asyncDataCacheEnabled = false;
The default values are for when it is not overwritten here.
I'm not sure I understand.... The user reading the documentation would assume that default is what they get without changing anything. It looks like actual default is false, no?
Setting it explicitly violates a lint rule. I will put a comment.
The default applies when you don't overwrite it in code (i.e., when you instantiate a plain vanilla PrestoServer). Here POS is overwriting it explicitly, so the defaults no longer apply. Do you have a place where I can write the POS defaults?
@mbasmanova I added the comment and added the default value for POS to the documentation, in addition to the one for vanilla Presto.
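For readers following the thread, the two layers of defaults discussed above can be sketched in a few lines of Java. This is only an illustration and not the code in this PR: the class ConfigDefaultsSketch and its methods are hypothetical, and pairing the documented value 0 with async-cache-ssd-gb is an assumption based on the "true, 0 and 30" remark. The only override the thread confirms is that POS sets async-data-cache-enabled to false.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical sketch; this is NOT the real NativeExecutionSystemConfig class.
public class ConfigDefaultsSketch {
    // Defaults of a plain vanilla worker, as far as the thread states them.
    static Map<String, String> vanillaDefaults() {
        Map<String, String> defaults = new LinkedHashMap<>();
        defaults.put("async-data-cache-enabled", "true");
        // Assumption: the documented default of 0 belongs to this property.
        defaults.put("async-cache-ssd-gb", "0");
        return defaults;
    }

    // Values that POS sets explicitly, which replace the vanilla defaults.
    static Map<String, String> posOverrides() {
        Map<String, String> overrides = new LinkedHashMap<>();
        overrides.put("async-data-cache-enabled", "false");
        return overrides;
    }

    // An override wins when present; otherwise the default is kept.
    static Map<String, String> effective(Map<String, String> defaults,
                                         Map<String, String> overrides) {
        Map<String, String> result = new LinkedHashMap<>(defaults);
        result.putAll(overrides);
        return result;
    }

    public static void main(String[] args) {
        System.out.println("vanilla: " + vanillaDefaults());
        System.out.println("POS:     " + effective(vanillaDefaults(), posOverrides()));
    }
}
```

Under these assumptions, a POS user who changes nothing ends up with the cache disabled even though the vanilla default is true, which is why the documentation needs to list both values.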
mbasmanova left a comment
Accepting to unblock, but I'm very confused by how these properties are set and what the defaults are. It would be nice to clear this up in a follow-up.
@Yuhta, @mbasmanova: We noticed this negative performance impact internally for Prestissimo as well. Did you only see it for POS, but not Prestissimo? (Just curious, as you are only turning it off in the POS configs.) We should turn it off in the https://github.com/prestodb/presto/blob/master/presto-native-execution/src/test/java/com/facebook/presto/nativeworker/NativeQueryRunnerUtils.java#L33 config as well. WDYT?
@aditi-pandit We have different deployments of Prestissimo and some do turn the option off. You can use the worker config file to overwrite the system config. POS needs special treatment because its configuration is a little bit weird and the worker config does not work out of the box there.
We see negative performance impact (5% ~ 30%) when split prefetch is enabled, hence adding these config options to disable it.