Skip to content

Support multi coordinator while serving query state info#16163

Merged
tdcmeehan merged 1 commit intoprestodb:masterfrom
abhiseksaikia:asaikia_disagg_pqs
Jun 17, 2021
Merged

Support multi coordinator while serving query state info#16163
tdcmeehan merged 1 commit intoprestodb:masterfrom
abhiseksaikia:asaikia_disagg_pqs

Conversation

@abhiseksaikia
Copy link
Contributor

@abhiseksaikia abhiseksaikia commented May 26, 2021

Currently /v1/queryState/* endpoints serve query state info available within a particular coordinator. The change here is to enable these endpoints to retrieve cluster wide query state info.

Test plan - Verified though unit test and TpchQueryRunner

== RELEASE NOTES ==

General Changes
* Support multi coordinator for /v1/queryState/ endpoint

@abhiseksaikia abhiseksaikia force-pushed the asaikia_disagg_pqs branch 3 times, most recently from 8770903 to 3adb730 Compare May 26, 2021 20:45
@abhiseksaikia abhiseksaikia changed the title [WIP] Support multi coordinator while serving query state info Support multi coordinator while serving query state info May 27, 2021
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think, given the asynchronous nature of the work in this method, it makes sense to make it completely async.

Look at examples in the codebase for:

  • Using @Suspended AsyncResponse asyncResponse to asynchronously return results in the endpoint.
  • Futures.transform / Futures.transformAsync to avoid blocking on the results of the futures.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It makes a lot of sense to make it non blocking and leverage asyncResponse here. I have made the changes.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is this a pure refactoring or are there new tests introduced as part of it?

Also, I would prefer to just duplicate what's needed between the two tests. There's no need IMO to introduce an inheritance hierarchy.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Agree unless it's a shared tests that we want to run, lets not introduce inheritance in tests.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think I did this to leverage same multi coordinator setup and teardown for both tests, but I agree we should avoid inheritance for the tests. I will make the change and may be leverage util/rules to share common stuff between the tests.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

instead of pulling all info from RM, we could just use the other coordinator query info from RM and merge it with the existing query info. Not a must but a good to have optimization.

Copy link
Contributor Author

@abhiseksaikia abhiseksaikia Jun 1, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we can do this optimization. Only caveat is that we need to do back and forth GET-POST conversion while proxying request between coordinator and RM. I looked at the changes required for this and this seems like a relatively bigger one to add on top of the existing changes. I will create a separate PR for addressing this optimization.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Agree unless it's a shared tests that we want to run, lets not introduce inheritance in tests.

Copy link
Contributor

@tdcmeehan tdcmeehan left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Mostly good!

{
try {
Set<InternalNode> coordinators = internalNodeManager.getCoordinators();
List<ListenableFuture<List<QueryStateInfo>>> queryStateInfoFutureList = new ArrayList<>();
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In general we use immutable data structures. Here I would either stream through the coordinators, perform the action and collect() it into an toImmutableList(), or use an ImmutableList.Builder

Copy link
Contributor Author

@abhiseksaikia abhiseksaikia Jun 4, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I have to get used to immutable data structures :) I made the changes to use ImmutableList.Builder

private static final JsonCodec<List<QueryStateInfo>> JSON_CODEC = JsonCodec.listJsonCodec(QueryStateInfo.class);
private final ResourceManagerClusterStateProvider clusterStateProvider;
private final InternalNodeManager internalNodeManager;
private ListeningExecutorService executor;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Make this final

public class DistributedQueryInfoResource
{
private static final Logger log = Logger.get(DistributedQueryInfoResource.class);
private static final JsonCodec<List<QueryStateInfo>> JSON_CODEC = JsonCodec.listJsonCodec(QueryStateInfo.class);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Typically this is wired in via Guice. See existing codecs defined in ResourceManagerModule.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for checking this! I missed this, it makes sense to use Guice here. I made the changes.

@abhiseksaikia abhiseksaikia force-pushed the asaikia_disagg_pqs branch 3 times, most recently from 0dd8aa8 to 1083051 Compare June 4, 2021 21:47
@abhiseksaikia abhiseksaikia requested a review from tdcmeehan June 4, 2021 21:49
Copy link
Contributor

@tdcmeehan tdcmeehan left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

One question, otherwise LGTM

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think the logic here is simple enough, and not proxy-like enough, that it should just be implemented inline in this class. i.e. let's remove the changes to add this to the RM proxy, as it's not really proxying, it's just performing the request.

Copy link
Contributor Author

@abhiseksaikia abhiseksaikia Jun 8, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Makes sense since this class is meant to be used for proxy like stuff. I was using it like a client helper earlier. I made the change.

@abhiseksaikia abhiseksaikia force-pushed the asaikia_disagg_pqs branch 5 times, most recently from 310f2aa to 948ad88 Compare June 9, 2021 02:56
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There's technically nothing here that is specific to multi-coordinator setup or resource manager. Shall we rename this class to something like QueryExecutionClientUtil? I would also suggest making the HTTP client a parameter, that way we can just static import these methods in the tests that use them.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It was earlier named as a test helper to be used for tests under resource manager package, but it makes sense to move these to specific utils and we can use it in other tests as well if required. I made the changes.

@abhiseksaikia abhiseksaikia force-pushed the asaikia_disagg_pqs branch 3 times, most recently from 93a302f to 9ed08fd Compare June 10, 2021 15:31
@tdcmeehan tdcmeehan self-requested a review June 14, 2021 15:32
Currently /v1/queryState/* endpoints serve query state info available within a particular coordinator. The change here is to enable these endpoints to retrieve cluster wide query state info.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants