FTE task scheduling improvements #23562

Open
wants to merge 23 commits into
base: master
from

Conversation

losipiuk
Member

@losipiuk losipiuk commented Sep 25, 2024

Description

Includes #23585

Additional context and related issues

Release notes

(x) This is not user-visible or is docs only, and no release notes are required.
( ) Release notes are required. Please propose a release note for me.
( ) Release notes are required, with the following suggested text:

# Section
* Fix some things. ({issue}`issuenumber`)

@cla-bot cla-bot bot added the cla-signed label Sep 25, 2024
@github-actions github-actions bot added the docs label Sep 25, 2024
@losipiuk losipiuk force-pushed the lukaszos/move-more-logic-into-preschedulingtaskcontexts-a1f30f branch 3 times, most recently from 376b9f6 to 41e2091 Compare September 26, 2024 16:20
@losipiuk losipiuk changed the title [WIP] FTE task scheduling improvements FTE task scheduling improvements Sep 26, 2024
@losipiuk losipiuk force-pushed the lukaszos/move-more-logic-into-preschedulingtaskcontexts-a1f30f branch from 41e2091 to 670e4bc Compare September 26, 2024 17:54
@losipiuk losipiuk marked this pull request as ready for review September 27, 2024 09:22
Contributor

@dekimir dekimir left a comment

First batch of comments, more to come...

If a task suggests a specific node for execution but can run elsewhere,
another node will be picked if the designated node is out of resources for
a prolonged time.
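
A minimal sketch of the fallback described above, for illustration only. `PreferredNodeFallback`, `pickNode`, and the `maxWait` threshold are hypothetical names, not the API changed in this PR, and choosing the node with the most free memory is an assumption made to keep the example simple:

```java
import java.time.Duration;
import java.util.Map;
import java.util.Optional;

// Hypothetical sketch; none of these names come from the Trino code base.
final class PreferredNodeFallback
{
    private PreferredNodeFallback() {}

    /**
     * Picks a node for a pending request.
     *
     * @param preferred node suggested by the task, if any
     * @param freeMemory free memory per node, in bytes
     * @param required memory needed by the task, in bytes
     * @param waited how long the request has been pending
     * @param maxWait how long to insist on the preferred node (assumed knob)
     * @return the chosen node, or empty if the request should keep waiting
     */
    static Optional<String> pickNode(
            Optional<String> preferred,
            Map<String, Long> freeMemory,
            long required,
            Duration waited,
            Duration maxWait)
    {
        if (preferred.isPresent()) {
            String node = preferred.get();
            if (freeMemory.getOrDefault(node, 0L) >= required) {
                // The designated node has room: use it.
                return preferred;
            }
            if (waited.compareTo(maxWait) < 0) {
                // Not waiting long enough yet: keep insisting on the designated node.
                return Optional.empty();
            }
        }
        // No preference, or the designated node was busy for too long:
        // fall back to any node with enough room (most free memory first).
        return freeMemory.entrySet().stream()
                .filter(entry -> entry.getValue() >= required)
                .max(Map.Entry.comparingByValue())
                .map(Map.Entry::getKey);
    }
}
```

The request stays pending while the designated node is busy and the wait is still short. Only after `maxWait` elapses does it compete for the remaining nodes.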
@losipiuk losipiuk force-pushed the lukaszos/move-more-logic-into-preschedulingtaskcontexts-a1f30f branch from 670e4bc to f31006c Compare September 30, 2024 11:25
@losipiuk
Member Author

Addressed/replied

@losipiuk losipiuk force-pushed the lukaszos/move-more-logic-into-preschedulingtaskcontexts-a1f30f branch 2 times, most recently from fba74f5 to 785123b Compare September 30, 2024 14:31
Contributor

@dekimir dekimir left a comment

Some prior feedback seems still unaddressed. Plus one new bit about clarifying verify messages.

We used a list there, but effectively it always had 0 or 1 elements.
Benchmark                                                            (leasesCount)  (nodeCount)  (preferredNodes)  (specificCatalogs)  Mode  Cnt        Score       Error  Units
BenchmarkBinPackingNodeAllocator.benchmarkProcessPendingAllocations            100           64             false               false  avgt   20   413451.269 ±  1397.575  ns/op
BenchmarkBinPackingNodeAllocator.benchmarkProcessPendingAllocations            100           64             false                true  avgt   20   581265.632 ± 11256.748  ns/op
BenchmarkBinPackingNodeAllocator.benchmarkProcessPendingAllocations            100           64              true               false  avgt   20    94551.139 ±   433.725  ns/op
BenchmarkBinPackingNodeAllocator.benchmarkProcessPendingAllocations            100           64              true                true  avgt   20   254866.254 ±  2692.992  ns/op
BenchmarkBinPackingNodeAllocator.benchmarkProcessPendingAllocations           1000           64             false               false  avgt   20  3791936.468 ± 39483.935  ns/op
BenchmarkBinPackingNodeAllocator.benchmarkProcessPendingAllocations           1000           64             false                true  avgt   20  5304851.875 ± 59196.050  ns/op
BenchmarkBinPackingNodeAllocator.benchmarkProcessPendingAllocations           1000           64              true               false  avgt   20   610103.852 ±  1935.714  ns/op
BenchmarkBinPackingNodeAllocator.benchmarkProcessPendingAllocations           1000           64              true                true  avgt   20  2164744.556 ± 40551.285  ns/op
@losipiuk losipiuk force-pushed the lukaszos/move-more-logic-into-preschedulingtaskcontexts-a1f30f branch from 785123b to 24762a3 Compare October 2, 2024 16:35
Benchmark                                                            (leasesCount)  (nodeCount)  (preferredNodes)  (specificCatalogs)  Mode  Cnt        Score        Error  Units
BenchmarkBinPackingNodeAllocator.benchmarkProcessPendingAllocations            100           64             false               false  avgt   20   419293.594 ±   1430.376  ns/op
BenchmarkBinPackingNodeAllocator.benchmarkProcessPendingAllocations            100           64             false                true  avgt   20   599678.686 ±  20983.678  ns/op
BenchmarkBinPackingNodeAllocator.benchmarkProcessPendingAllocations            100           64              true               false  avgt   20   104018.298 ±   2182.237  ns/op
BenchmarkBinPackingNodeAllocator.benchmarkProcessPendingAllocations            100           64              true                true  avgt   20   259052.723 ±   2338.331  ns/op
BenchmarkBinPackingNodeAllocator.benchmarkProcessPendingAllocations           1000           64             false               false  avgt   20  3923838.021 ± 109646.795  ns/op
BenchmarkBinPackingNodeAllocator.benchmarkProcessPendingAllocations           1000           64             false                true  avgt   20  5540916.250 ± 195298.291  ns/op
BenchmarkBinPackingNodeAllocator.benchmarkProcessPendingAllocations           1000           64              true               false  avgt   20   590766.769 ±   2964.269  ns/op
BenchmarkBinPackingNodeAllocator.benchmarkProcessPendingAllocations           1000           64              true                true  avgt   20  2171346.868 ±  19084.670  ns/op
Benchmark                                                            (leasesCount)  (nodeCount)  (preferredNodes)  (specificCatalogs)  Mode  Cnt        Score       Error  Units
BenchmarkBinPackingNodeAllocator.benchmarkProcessPendingAllocations            100           64             false               false  avgt   20   261361.627 ±   809.996  ns/op
BenchmarkBinPackingNodeAllocator.benchmarkProcessPendingAllocations            100           64             false                true  avgt   20   409676.930 ±  2075.109  ns/op
BenchmarkBinPackingNodeAllocator.benchmarkProcessPendingAllocations            100           64              true               false  avgt   20   114465.035 ±  1527.033  ns/op
BenchmarkBinPackingNodeAllocator.benchmarkProcessPendingAllocations            100           64              true                true  avgt   20   264062.856 ±  1285.936  ns/op
BenchmarkBinPackingNodeAllocator.benchmarkProcessPendingAllocations           1000           64             false               false  avgt   20  1068282.563 ±  3537.391  ns/op
BenchmarkBinPackingNodeAllocator.benchmarkProcessPendingAllocations           1000           64             false                true  avgt   20  2587284.647 ± 32666.139  ns/op
BenchmarkBinPackingNodeAllocator.benchmarkProcessPendingAllocations           1000           64              true               false  avgt   20   636299.146 ±  2357.665  ns/op
BenchmarkBinPackingNodeAllocator.benchmarkProcessPendingAllocations           1000           64              true                true  avgt   20  2216356.698 ± 92424.342  ns/op
Benchmark                                                            (leasesCount)  (nodeCount)  (preferredNodes)  (specificCatalogs)  Mode  Cnt        Score      Error  Units
BenchmarkBinPackingNodeAllocator.benchmarkProcessPendingAllocations            100           64             false               false  avgt   20   267082.973 ± 5137.556  ns/op
BenchmarkBinPackingNodeAllocator.benchmarkProcessPendingAllocations            100           64             false                true  avgt   20   292547.745 ± 1092.674  ns/op
BenchmarkBinPackingNodeAllocator.benchmarkProcessPendingAllocations            100           64              true               false  avgt   20   134841.540 ± 1293.610  ns/op
BenchmarkBinPackingNodeAllocator.benchmarkProcessPendingAllocations            100           64              true                true  avgt   20   151523.257 ± 2924.575  ns/op
BenchmarkBinPackingNodeAllocator.benchmarkProcessPendingAllocations           1000           64             false               false  avgt   20  1113035.959 ± 3771.173  ns/op
BenchmarkBinPackingNodeAllocator.benchmarkProcessPendingAllocations           1000           64             false                true  avgt   20  1085887.350 ± 5215.795  ns/op
BenchmarkBinPackingNodeAllocator.benchmarkProcessPendingAllocations           1000           64              true               false  avgt   20   804516.131 ± 2892.900  ns/op
BenchmarkBinPackingNodeAllocator.benchmarkProcessPendingAllocations           1000           64              true                true  avgt   20   676353.953 ± 2750.783  ns/op
Benchmark                                                            (leasesCount)  (nodeCount)  (preferredNodes)  (specificCatalogs)  Mode  Cnt        Score        Error  Units
BenchmarkBinPackingNodeAllocator.benchmarkProcessPendingAllocations            100           64             false               false  avgt   20   236344.792 ±    710.618  ns/op
BenchmarkBinPackingNodeAllocator.benchmarkProcessPendingAllocations            100           64             false                true  avgt   20   256848.350 ±   8301.158  ns/op
BenchmarkBinPackingNodeAllocator.benchmarkProcessPendingAllocations            100           64              true               false  avgt   20   141538.765 ±   4917.644  ns/op
BenchmarkBinPackingNodeAllocator.benchmarkProcessPendingAllocations            100           64              true                true  avgt   20   157876.417 ±   4639.070  ns/op
BenchmarkBinPackingNodeAllocator.benchmarkProcessPendingAllocations           1000           64             false               false  avgt   20   744818.759 ±   6280.880  ns/op
BenchmarkBinPackingNodeAllocator.benchmarkProcessPendingAllocations           1000           64             false                true  avgt   20   619565.616 ±   5323.407  ns/op
BenchmarkBinPackingNodeAllocator.benchmarkProcessPendingAllocations           1000           64              true               false  avgt   20   822423.150 ±  28199.269  ns/op
BenchmarkBinPackingNodeAllocator.benchmarkProcessPendingAllocations           1000           64              true                true  avgt   20   699053.103 ±  32292.766  ns/op
BenchmarkBinPackingNodeAllocator.benchmarkProcessPendingAllocations          10000           64             false               false  avgt   20  5876228.751 ±  39982.549  ns/op
BenchmarkBinPackingNodeAllocator.benchmarkProcessPendingAllocations          10000           64             false                true  avgt   20  4412155.854 ± 136046.311  ns/op
BenchmarkBinPackingNodeAllocator.benchmarkProcessPendingAllocations          10000           64              true               false  avgt   20  7612951.146 ± 139080.899  ns/op
BenchmarkBinPackingNodeAllocator.benchmarkProcessPendingAllocations          10000           64              true                true  avgt   20  6225676.438 ± 146037.043  ns/op
Pass information about who is requesting a node to
BinPackingNodeAllocatorService. The requests from different requesters
are processed in a fair manner.

That way we will be able to increase the number of tasks waiting for a node
on the EventDrivenFaultTolerantQueryScheduler side without tasks from one
query starving tasks from another query.
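
One way to picture the fairness described above is a round-robin pass over per-requester queues. This is only a sketch under assumptions: `FairPendingQueue`, `add`, and `drainFairly` are hypothetical names, and the actual policy in BinPackingNodeAllocatorService may differ.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.Iterator;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical sketch; not the data structure used in this PR.
final class FairPendingQueue<R, T>
{
    // One FIFO queue per requester; LinkedHashMap keeps requester arrival order.
    private final Map<R, Deque<T>> pending = new LinkedHashMap<>();

    void add(R requester, T request)
    {
        pending.computeIfAbsent(requester, ignored -> new ArrayDeque<>()).addLast(request);
    }

    // Returns requests in the order they would be considered:
    // one request per requester per round, until all queues are empty.
    List<T> drainFairly()
    {
        List<T> order = new ArrayList<>();
        while (!pending.isEmpty()) {
            Iterator<Deque<T>> queues = pending.values().iterator();
            while (queues.hasNext()) {
                Deque<T> queue = queues.next();
                order.add(queue.removeFirst());
                if (queue.isEmpty()) {
                    // Drop exhausted requesters so later rounds skip them.
                    queues.remove();
                }
            }
        }
        return order;
    }
}
```

With three requests from one query and one from another, the second query's request is considered right after the first query's first request instead of waiting behind all three.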

Benchmark                                                            (leasesCount)  (nodeCount)  (preferredNodes)  (requestersCount)  (specificCatalogs)  Mode  Cnt        Score        Error  Units
BenchmarkBinPackingNodeAllocator.benchmarkProcessPendingAllocations            100           64             false                  1               false  avgt   20   236219.794 ±    901.521  ns/op
BenchmarkBinPackingNodeAllocator.benchmarkProcessPendingAllocations            100           64             false                  1                true  avgt   20   246114.804 ±   1103.992  ns/op
BenchmarkBinPackingNodeAllocator.benchmarkProcessPendingAllocations            100           64             false                 10               false  avgt   20   239386.380 ±   1460.258  ns/op
BenchmarkBinPackingNodeAllocator.benchmarkProcessPendingAllocations            100           64             false                 10                true  avgt   20   254222.950 ±   6733.860  ns/op
BenchmarkBinPackingNodeAllocator.benchmarkProcessPendingAllocations            100           64             false                100               false  avgt   20   248875.061 ±   4511.145  ns/op
BenchmarkBinPackingNodeAllocator.benchmarkProcessPendingAllocations            100           64             false                100                true  avgt   20   268068.940 ±  18156.268  ns/op
BenchmarkBinPackingNodeAllocator.benchmarkProcessPendingAllocations            100           64              true                  1               false  avgt   20   141119.168 ±   1109.988  ns/op
BenchmarkBinPackingNodeAllocator.benchmarkProcessPendingAllocations            100           64              true                  1                true  avgt   20   157305.123 ±    693.511  ns/op
BenchmarkBinPackingNodeAllocator.benchmarkProcessPendingAllocations            100           64              true                 10               false  avgt   20   147334.146 ±  10441.633  ns/op
BenchmarkBinPackingNodeAllocator.benchmarkProcessPendingAllocations            100           64              true                 10                true  avgt   20   159163.343 ±   1786.215  ns/op
BenchmarkBinPackingNodeAllocator.benchmarkProcessPendingAllocations            100           64              true                100               false  avgt   20   146048.896 ±   1444.922  ns/op
BenchmarkBinPackingNodeAllocator.benchmarkProcessPendingAllocations            100           64              true                100                true  avgt   20   161208.952 ±   3715.410  ns/op
BenchmarkBinPackingNodeAllocator.benchmarkProcessPendingAllocations           1000           64             false                  1               false  avgt   20   764475.610 ±   8732.411  ns/op
BenchmarkBinPackingNodeAllocator.benchmarkProcessPendingAllocations           1000           64             false                  1                true  avgt   20   623774.929 ±  11331.334  ns/op
BenchmarkBinPackingNodeAllocator.benchmarkProcessPendingAllocations           1000           64             false                 10               false  avgt   20   762493.286 ±  18650.893  ns/op
BenchmarkBinPackingNodeAllocator.benchmarkProcessPendingAllocations           1000           64             false                 10                true  avgt   20   628688.085 ±  14672.321  ns/op
BenchmarkBinPackingNodeAllocator.benchmarkProcessPendingAllocations           1000           64             false                100               false  avgt   20   766290.547 ±   7830.801  ns/op
BenchmarkBinPackingNodeAllocator.benchmarkProcessPendingAllocations           1000           64             false                100                true  avgt   20   632645.998 ±   6778.130  ns/op
BenchmarkBinPackingNodeAllocator.benchmarkProcessPendingAllocations           1000           64              true                  1               false  avgt   20   837527.999 ±   8708.213  ns/op
BenchmarkBinPackingNodeAllocator.benchmarkProcessPendingAllocations           1000           64              true                  1                true  avgt   20   708984.716 ±  16561.732  ns/op
BenchmarkBinPackingNodeAllocator.benchmarkProcessPendingAllocations           1000           64              true                 10               false  avgt   20   851645.901 ±  31810.762  ns/op
BenchmarkBinPackingNodeAllocator.benchmarkProcessPendingAllocations           1000           64              true                 10                true  avgt   20   722967.207 ±  33678.655  ns/op
BenchmarkBinPackingNodeAllocator.benchmarkProcessPendingAllocations           1000           64              true                100               false  avgt   20   841761.599 ±  10801.442  ns/op
BenchmarkBinPackingNodeAllocator.benchmarkProcessPendingAllocations           1000           64              true                100                true  avgt   20   716503.975 ±  10194.003  ns/op
BenchmarkBinPackingNodeAllocator.benchmarkProcessPendingAllocations          10000           64             false                  1               false  avgt   20  6019031.583 ±  53365.260  ns/op
BenchmarkBinPackingNodeAllocator.benchmarkProcessPendingAllocations          10000           64             false                  1                true  avgt   20  4552106.459 ± 101192.792  ns/op
BenchmarkBinPackingNodeAllocator.benchmarkProcessPendingAllocations          10000           64             false                 10               false  avgt   20  6056989.127 ±  54721.687  ns/op
BenchmarkBinPackingNodeAllocator.benchmarkProcessPendingAllocations          10000           64             false                 10                true  avgt   20  4604220.906 ± 156878.917  ns/op
BenchmarkBinPackingNodeAllocator.benchmarkProcessPendingAllocations          10000           64             false                100               false  avgt   20  6107233.313 ± 170989.749  ns/op
BenchmarkBinPackingNodeAllocator.benchmarkProcessPendingAllocations          10000           64             false                100                true  avgt   20  4713464.926 ± 197102.157  ns/op
BenchmarkBinPackingNodeAllocator.benchmarkProcessPendingAllocations          10000           64              true                  1               false  avgt   20  7769610.040 ± 205044.664  ns/op
BenchmarkBinPackingNodeAllocator.benchmarkProcessPendingAllocations          10000           64              true                  1                true  avgt   20  6366632.981 ± 188204.229  ns/op
BenchmarkBinPackingNodeAllocator.benchmarkProcessPendingAllocations          10000           64              true                 10               false  avgt   20  7799547.770 ± 266060.358  ns/op
BenchmarkBinPackingNodeAllocator.benchmarkProcessPendingAllocations          10000           64              true                 10                true  avgt   20  6582084.562 ± 174761.752  ns/op
BenchmarkBinPackingNodeAllocator.benchmarkProcessPendingAllocations          10000           64              true                100               false  avgt   20  8460711.729 ± 221220.668  ns/op
BenchmarkBinPackingNodeAllocator.benchmarkProcessPendingAllocations          10000           64              true                100                true  avgt   20  7151663.644 ± 238792.415  ns/op
@losipiuk losipiuk force-pushed the lukaszos/move-more-logic-into-preschedulingtaskcontexts-a1f30f branch from 24762a3 to 9e2ff44 Compare October 2, 2024 16:42
@losipiuk losipiuk requested a review from dekimir October 2, 2024 16:42
@losipiuk
Member Author

losipiuk commented Oct 2, 2024

AC - hopefully did not miss anything

4 participants