Improve scale writer creation based on producer buffer #14193

Merged
wenleix merged 1 commit into prestodb:master from wenleix:scale_writer
Mar 6, 2020

Contributor

@wenleix wenleix commented Mar 2, 2020

We observe two cases in production where the current scale writer heuristics
are not able to scale:

  1. When there is skew on the producer side and more than half of the
    producer buffers are not overutilized.
  2. When grouped execution is enabled and each bucket on its own doesn't
    make the buffer overutilized.

This commit tries to improve the situation by considering overall
producer buffer utilization when deciding whether to scale the writers.

== RELEASE NOTES ==

General Changes
* Improve the scale writer heuristics by considering overall producer buffer utilization. This can be enabled by using the session property `optimized_scale_writer_producer_buffer` and the configuration property `optimized-scale-writer-producer-buffer`.

@wenleix wenleix changed the title Improve scale writer creation based on producer buffer [WIP] Improve scale writer creation based on producer buffer Mar 2, 2020
@wenleix wenleix force-pushed the scale_writer branch 4 times, most recently from 46691e3 to fc453bb Compare March 3, 2020 06:38
@wenleix wenleix changed the title [WIP] Improve scale writer creation based on producer buffer Improve scale writer creation based on producer buffer Mar 3, 2020
@wenleix wenleix requested review from arhimondr and rschlussel March 3, 2020 18:54
Member

I'm a little bit confused about this logic. I was thinking more about changing the condition to something like: if 50% overutilized OR 90% non-empty, add more writers. Thoughts?
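For illustration only, the condition suggested in this comment could be sketched as follows. The class, method, and parameter names are hypothetical and not part of the Presto codebase, and treating both thresholds as inclusive (`>=`) is an assumption, since the comment does not say.

```java
// Hypothetical illustration of the reviewer's suggested condition; not Presto code.
class ProposedScaleCondition
{
    // overutilized[i] and nonEmpty[i] describe the i-th producer buffer (assumed inputs).
    static boolean shouldAddWriter(boolean[] overutilized, boolean[] nonEmpty)
    {
        int n = overutilized.length;
        if (n == 0 || nonEmpty.length != n) {
            return false;
        }
        int overCount = 0;
        int nonEmptyCount = 0;
        for (int i = 0; i < n; i++) {
            if (overutilized[i]) {
                overCount++;
            }
            if (nonEmpty[i]) {
                nonEmptyCount++;
            }
        }
        // Add a writer if at least 50% of buffers are overutilized
        // OR at least 90% of buffers hold some data.
        return overCount >= 0.5 * n || nonEmptyCount >= 0.9 * n;
    }
}
```

Under this sketch, a skewed stage with one overutilized buffer out of four would still scale as long as nearly all buffers are non-empty.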

Contributor Author

@wenleix wenleix Mar 5, 2020

@arhimondr: Addressed the comments per offline discussion. Basically, we now add more writers if the overall buffer utilization is more than the number of writers.

We don't need to worry about accidentally opening too many writers, since the overall `writtenBytes` has to be at least `writerMinSizeBytes * scheduledNodes.size()`:

            writtenBytes >= writerMinSizeBytes * scheduledNodes.size()) 
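A minimal sketch of how the two conditions described above might combine. It assumes "overall buffer utilization" means the sum of per-producer utilization fractions (each between 0.0 and 1.0); the class, method, and parameter names are hypothetical, and the merged commit may compute this differently.

```java
import java.util.Arrays;

// Hypothetical sketch; names and the utilization model are assumptions, not Presto code.
class ScaleWriterSketch
{
    static boolean shouldAddWriter(
            double[] producerBufferUtilization, // assumed: one fraction in [0, 1] per producer
            int scheduledWriters,               // stands in for scheduledNodes.size()
            long writtenBytes,
            long writerMinSizeBytes)
    {
        // Condition 1: summed producer buffer utilization exceeds the writer count.
        double totalUtilization = Arrays.stream(producerBufferUtilization).sum();
        boolean buffersBusy = totalUtilization > scheduledWriters;

        // Condition 2: existing writers have written enough data, which guards
        // against opening too many writers.
        boolean writersFull = writtenBytes >= writerMinSizeBytes * scheduledWriters;

        return buffersBusy && writersFull;
    }

    public static void main(String[] args)
    {
        long mb = 1L << 20;
        // Skewed producers: only one of four buffers is fully utilized.
        double[] skewed = {1.0, 0.4, 0.4, 0.4};
        System.out.println(shouldAddWriter(skewed, 1, 64 * mb, 32 * mb)); // true
        System.out.println(shouldAddWriter(skewed, 1, 16 * mb, 32 * mb)); // false
    }
}
```

In the skewed case, fewer than half of the buffers are fully utilized, yet the summed utilization still exceeds the single scheduled writer, so the sketch scales up once enough bytes have been written.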

Member

This is unused now

@wenleix wenleix merged commit 2ab2543 into prestodb:master Mar 6, 2020
@wenleix wenleix deleted the scale_writer branch March 6, 2020 05:51
@caithagoras caithagoras mentioned this pull request Mar 29, 2020
9 tasks
@caithagoras caithagoras mentioned this pull request May 4, 2020
34 tasks