[Scheduling] Expand the ability of resource evaluator #2997

zhongchun · 2022-05-05T13:08:03Z

What do these changes do?

The resource evaluator estimates and sets the resources required by subtasks. It can be backed by an internal service or an external service. With an internal service, we can set default or adjustable resources for subtasks. With an external service, we should report the running result of the task to that service so that it can accurately predict the resources subtasks require from historical running information; we call this HBO.
However, it is not easy to implement and configure a new resource evaluator. This PR introduces an extension point for resource evaluators, with which a new evaluator can be added as follows:

1. Inherit from ResourceEvaluator and implement the create, evaluate and report methods. The create method creates a new resource evaluator instance. The evaluate method estimates and sets the required resources for the subtasks of a task stage; it must be implemented. The report method reports the running information and result of the task; implementing it is optional.
2. Add the default configs that the new evaluator needs to base_config.xml or its descendant files.
3. Set resource_evaluator in base_config.xml to choose a resource evaluator when running a Mars job.
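
The steps above can be sketched roughly as follows. This is a minimal, self-contained illustration and not the code from this PR: the `Subtask` dataclass, the `register` and `from_config` helpers, the `default_cpu` and `default_memory` keys, and the synchronous method signatures are all assumptions made for the example, and a plain dict stands in for the values loaded from the config file. The sketch treats create and evaluate as required and leaves report as an optional no-op.

```python
from abc import ABC, abstractmethod
from dataclasses import dataclass
from typing import Any, Callable, Dict, List, Type


@dataclass
class Subtask:
    """Hypothetical stand-in for a Mars subtask (not the real class)."""

    subtask_id: str
    cpu: float = 0.0
    memory: int = 0  # bytes


class ResourceEvaluator(ABC):
    """Illustrative base class; the real Mars interface may differ."""

    _registry: Dict[str, Type["ResourceEvaluator"]] = {}

    @classmethod
    def register(cls, name: str) -> Callable:
        """Hypothetical helper: make an evaluator selectable by name."""

        def decorator(evaluator_cls):
            cls._registry[name] = evaluator_cls
            return evaluator_cls

        return decorator

    @classmethod
    def from_config(cls, config: Dict[str, Any]) -> "ResourceEvaluator":
        """Pick the evaluator named by the `resource_evaluator` key."""
        return cls._registry[config["resource_evaluator"]].create(config)

    @classmethod
    @abstractmethod
    def create(cls, config: Dict[str, Any]) -> "ResourceEvaluator":
        """Build an evaluator instance from its config."""

    @abstractmethod
    def evaluate(self, subtasks: List[Subtask]) -> None:
        """Estimate and set the resources of each subtask in a stage."""

    def report(self, result: Dict[str, Any]) -> None:
        """Optional hook for reporting run results; a no-op by default."""


@ResourceEvaluator.register("default")
class DefaultEvaluator(ResourceEvaluator):
    """Internal evaluator that assigns fixed defaults to every subtask."""

    def __init__(self, cpu: float, memory: int):
        self.cpu = cpu
        self.memory = memory

    @classmethod
    def create(cls, config: Dict[str, Any]) -> "DefaultEvaluator":
        # `default_cpu` / `default_memory` are assumed config keys.
        return cls(
            cpu=config.get("default_cpu", 1.0),
            memory=config.get("default_memory", 256 * 1024 ** 2),
        )

    def evaluate(self, subtasks: List[Subtask]) -> None:
        for subtask in subtasks:
            subtask.cpu = self.cpu
            subtask.memory = self.memory


# Usage: the dict plays the role of the loaded configuration.
config = {"resource_evaluator": "default", "default_cpu": 2.0}
evaluator = ResourceEvaluator.from_config(config)
stage_subtasks = [Subtask("s1"), Subtask("s2")]
evaluator.evaluate(stage_subtasks)
evaluator.report({"status": "succeeded"})  # inherited no-op
```

An evaluator backed by an external service would, under the same assumptions, override report to send the run result to that service and have evaluate query it for predictions.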

Related issue number

Fixes #xxxx

Check code requirements

tests added / passed (if needed)
Ensure all linting tests pass, see here for how to run them

qinxuye

Is there a plan for the resource evaluator? Apart from the plan, I left some comments.

mars/services/task/execution/mars/executor.py

mars/services/task/execution/mars/resource.py

mars/services/task/execution/mars/executor.py

zhongchun · 2022-05-06T03:58:50Z

> About resource evaluator, is there a plan about it? Apart from the plan, I left some comments.

To make full use of the resource evaluator, we need to build an external resource recommendation service that can take advantage of historical job information. Maybe we could add an extension library for Mars later.

qinxuye

LGTM

hekaisheng

LGTM

zhongchun added 2 commits May 5, 2022 20:54

Expand the ability to resource evaluator

a076f20

Fix comments

b0b1dc7

zhongchun requested review from wjsi, qinxuye and hekaisheng as code owners May 5, 2022 13:08

zhongchun requested review from fyrestone, Catch-Bull and chaokunyang May 5, 2022 13:08

zhongchun changed the title ~~Expand the ability to resource evaluator~~ [Scheduling] Expand the ability to resource evaluator May 5, 2022

Ignore F821

7792b88

qinxuye reviewed May 6, 2022

mars/services/task/execution/mars/executor.py (outdated)

mars/services/task/execution/mars/resource.py (outdated)

qinxuye added type: enhancement request mod: scheduling service labels May 6, 2022

qinxuye added this to In progress in Distributed via automation May 6, 2022

qinxuye added this to PR-In progress in v0.9 Release via automation May 6, 2022

qinxuye added this to the v0.9.0rc3 milestone May 6, 2022

fyrestone reviewed May 6, 2022

mars/services/task/execution/mars/executor.py (outdated)

zhongchun added 2 commits May 6, 2022 12:00

Add type hint for resource evaluator argument and make resource evaluator as abstract class

094584a

Remove creation check of resource evaluator

b78bae5

qinxuye approved these changes May 6, 2022

qinxuye changed the title ~~[Scheduling] Expand the ability to resource evaluator~~ [Scheduling] Expand the ability of resource evaluator May 6, 2022

hekaisheng approved these changes May 6, 2022

qinxuye merged commit 261eaaf into mars-project:master May 6, 2022

Distributed automation moved this from In progress to Done May 6, 2022

v0.9 Release automation moved this from PR-In progress to PR-Done May 6, 2022
