Skip to content

Add scale-writers.max-nodes-count property to limit writer scaling#15877

Closed
radek-kondziolka wants to merge 89 commits intotrinodb:masterfrom
radek-kondziolka:rk/add_upper_limit_on_writer_scaling
Closed

Add scale-writers.max-nodes-count property to limit writer scaling#15877
radek-kondziolka wants to merge 89 commits intotrinodb:masterfrom
radek-kondziolka:rk/add_upper_limit_on_writer_scaling

Conversation

@radek-kondziolka
Copy link
Copy Markdown
Contributor

@radek-kondziolka radek-kondziolka commented Jan 27, 2023

Description

Added option task.scale-writers.max-nodes-count to TaskManagerConfig and a session option to make it possible to limit writer scaling by number of nodes that can be used by ScaledWriterScheduler.

Release notes

( ) This is not user-visible or docs only and no release notes are required.
(x ) Release notes are required, please propose a release note for me.
( ) Release notes are required, with the following suggested text:

@cla-bot cla-bot bot added the cla-signed label Jan 27, 2023
@radek-kondziolka radek-kondziolka force-pushed the rk/add_upper_limit_on_writer_scaling branch 2 times, most recently from 1c426f6 to 0341a57 Compare January 27, 2023 12:05
@radek-kondziolka radek-kondziolka changed the title Add max-fraction-writer-nodes property to limit writer scaling Add scale-writers.max-nodes-count property to limit writer scaling Jan 27, 2023
@radek-kondziolka radek-kondziolka force-pushed the rk/add_upper_limit_on_writer_scaling branch 4 times, most recently from 6d86f08 to 13bb374 Compare January 27, 2023 15:31
@radek-kondziolka radek-kondziolka marked this pull request as ready for review January 27, 2023 16:12
Copy link
Copy Markdown
Member

@gaurav8297 gaurav8297 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Mostly LGTM! There are some tests failing

Adaptively set the hash partition count for
a query. It increases the concurrency on
the large cluster when there are many small
queries.

Benchmarks:

We ran concurrency benchmarks on a cluster size
of 20 with queries from the TPC-DS sf10 set
submitted concurrently from 64 threads.

Before: 17578 queries/hour
After: 22782 queries/hour (30% increase)
@radek-kondziolka radek-kondziolka force-pushed the rk/add_upper_limit_on_writer_scaling branch from 13bb374 to cc1d7f1 Compare January 30, 2023 09:34
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Node limit should also apply task.scale-writers.enabled when preferred partitioning insert mode is used. Please refer to @gaurav8297 for help

@radek-kondziolka radek-kondziolka force-pushed the rk/add_upper_limit_on_writer_scaling branch from cc1d7f1 to e5ae293 Compare January 31, 2023 09:40
findepi and others added 20 commits February 1, 2023 12:12
Usually "a dispatcher" is the entity defining tasks to be done rather
than the entity doing an individual task.

The change has the additional benefit that the matrix task checking
individual commits has a shorter name, so the matrix parameter is
slightly more visible in the GitHub actions UI.
Upcoming 1.2.0 version doesn't delete data in RESTSessionCatalog.dropTable method.
The unsupported types are handled at the end of the `toWriteMapping`
method.
This isn't dissimilar to how we use constants in other temporal types.
Previously `date` would coerce to `timestamp(3) with time zone`, now it
gets coerced to `timestamp(0) with time zone`.
This is sometimes useful, so let's always do it.
This is for code consistency reasons. Most of the places where
`ExpressionMatcher` is constructed already goes through
`PlanMatchPattern.expression` factory method.
`UnwrapCastInComparison` works on any nested comparison expressions.
`StatementAnalyzer.Visitor#analyzeRowFilter` captures necessary
coercions for filters that are not of boolean type (e.g. `null` literal
being of `unknown` type). Before the change, the coercions where not
getting applied though.
When predicate is not of boolean type (e.g. `null` literal being of
`unknown` type), the `Analysis` holds a coercion that should be applied
to it.  Before the change, the coercion was not getting applied though.

Co-authored-by: Assaf Bern <assaf.bern@starburstdata.com>
@radek-kondziolka radek-kondziolka force-pushed the rk/add_upper_limit_on_writer_scaling branch from e5ae293 to 3a64db9 Compare February 1, 2023 12:29
@github-actions github-actions bot added docs jdbc Relates to Trino JDBC driver release-notes tests:hive labels Feb 1, 2023
Added option task.scale-writers.max-nodes-count to TaskManagerConfig
and a session option to make it possible to limit writer scaling
by maximal number of nodes that can be used by ScaledWriterScheduler.
@radek-kondziolka radek-kondziolka force-pushed the rk/add_upper_limit_on_writer_scaling branch from 3a64db9 to ddcbc48 Compare February 1, 2023 14:41
@sopel39
Copy link
Copy Markdown
Member

sopel39 commented Feb 6, 2023

@radek-starburst please rebase

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

cla-signed docs jdbc Relates to Trino JDBC driver release-notes

Development

Successfully merging this pull request may close these issues.