Skip to content

Dispatcher#95

Merged
dain merged 16 commits intotrinodb:masterfrom
dain:dispatcher
May 16, 2019
Merged

Dispatcher#95
dain merged 16 commits intotrinodb:masterfrom
dain:dispatcher

Conversation

@dain
Copy link
Copy Markdown
Member

@dain dain commented Jan 29, 2019

Move queued phase of query from QueryManager to a new dispatcher service. This
change is in preparation for adding a optional new server that moves the queue
phase to a separate process.

Ref prestodb/presto#12176

@cla-bot cla-bot bot added the cla-signed label Jan 29, 2019
@dain
Copy link
Copy Markdown
Member Author

dain commented Jan 29, 2019

Remove system startup minimum worker requirement

is this removing a functionality or moving it elsewhere?

@findepi I added a new system which waits for a minimum number of workers after queuing is complete and before execution starts. The min worker count is configured with query-manager.required-workers-max-wait. This is removes the old system which failed queries until the cluster started.

@sopel39
Copy link
Copy Markdown
Member

sopel39 commented Jan 29, 2019

The min worker count is configured with query-manager.required-workers-max-wait

So the semantics is that queries will be queued until there is min number of workers. What if workers leave the cluster after the cluster is started? Will the new queries be on hold too? This is useful in case of downscaling cluster to 0 nodes for idle periods in cloud environment.

@dain
Copy link
Copy Markdown
Member Author

dain commented Jan 30, 2019

The min worker count is configured with query-manager.required-workers-max-wait

So the semantics is that queries will be queued until there is min number of workers. What if workers leave the cluster after the cluster is started? Will the new queries be on hold too? This is useful in case of downscaling cluster to 0 nodes for idle periods in cloud environment.

Yes. The new behavior is designed for cloud environments that scale down to zero when there is no traffic and then scale back up when there are queries. BTW, that feature is already checked in, so you can use it now.

@dain dain self-assigned this Jan 30, 2019
@dain dain force-pushed the dispatcher branch 2 times, most recently from 69c5323 to 5bc1cd7 Compare February 5, 2019 04:21
Comment thread presto-main/src/main/java/io/prestosql/execution/QueryState.java Outdated
@electrum
Copy link
Copy Markdown
Member

First three commits look good

@raghavsethi
Copy link
Copy Markdown
Member

Can we add a commit prior that adds a specific error code (not generic insufficient resources), so that clients can deal with that situation properly?

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Remote Optional -> Remove Optional

@dain dain mentioned this pull request Mar 6, 2019
@dain dain force-pushed the dispatcher branch 5 times, most recently from d461996 to c512833 Compare March 12, 2019 05:37
Copy link
Copy Markdown
Member

@raghavsethi raghavsethi left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

These commits LGTM

Add query id to NoSuchElementException
Remove system startup minimum worker requirement
Add DISPATCHING query states

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

requireNonNull for queryManagerConfig

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

^

Copy link
Copy Markdown
Member

@raghavsethi raghavsethi Mar 15, 2019

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This export has no effect. I looked in JMX and the counter does not appear.

@dain dain force-pushed the dispatcher branch 2 times, most recently from fe558b1 to fdbc795 Compare March 15, 2019 23:08
private final SessionContext sessionContext;
private final DispatchManager dispatchManager;
private final QueryId queryId;
private final String slug = format("%016x%016x", ThreadLocalRandom.current().nextLong(), ThreadLocalRandom.current().nextLong());
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.


QueryInfo getQueryInfo();

String getSlug();
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

return queryId;
}

public String getSlug()
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Copy link
Copy Markdown
Member

@raghavsethi raghavsethi left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Following commits LGTM:

Remove system startup minimum worker requirement
Add DISPATCHING query states
Split out queued phase from QueryManager
Add query id to NoSuchElementException
Improve query event stats for immediately failed queries
Remove Optional from QueryStateMachine resourceGroup
Change local dispatch to finish immediately after query submission

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For my own reference: here's the bug from yesterday.

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nit? Edge case for static import?

Copy link
Copy Markdown
Member

@raghavsethi raghavsethi left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Following commits look good:

Remove Optional from QueryStateMachine resourceGroup
Simplify DispatchInfo construction
Fix handling of failures during query creation
Simplify query manager stats tracking

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nit: If you named these more specifically (eg queuedDispatchInfo), you could static import.

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I go back and forth on this. In this case I like the FQN.

Comment thread presto-main/src/main/java/io/prestosql/execution/QueryManagerStats.java Outdated
Copy link
Copy Markdown
Member

@raghavsethi raghavsethi left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Following commits LGTM % nits:

Rename SqlQueryManagerStats to QueryManagerStats
Cleanup dispatcher executor management

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Are we moving to the closer vs the annotation pattern?

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not sure what you mean. This class uses a closer and an @PreDestroy

Comment thread presto-main/src/main/java/io/prestosql/dispatcher/DispatchExecutor.java Outdated
Comment thread presto-main/src/main/java/io/prestosql/dispatcher/DispatchExecutor.java Outdated
Copy link
Copy Markdown
Member

@raghavsethi raghavsethi left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Following commits look good:

Fixup! Cleanup dispatcher executor management
Remove bad call to recordHeartbeat in dispatch query
Fix visibility of failed queries in LocalDispatchQuery
Make protocol Query public
Catch errors from LocalDispatchQuery querySubmitter

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Curious: Why this is called LocalDispatchQuery ?

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

^

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

^

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Development

Successfully merging this pull request may close these issues.

6 participants