
Add poller autoscaler #1184

Merged: 3 commits into cadence-workflow:master on Aug 26, 2022

Conversation

@shijiesheng (Member) commented Aug 20, 2022

What changed?

  • added autoscaler package with an interface and a linearRecommender implementation (sketched below)
  • added pollerAutoscaler implementation
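
The merged code is not shown in this thread; as a rough sketch of what the recommender interface and a linear implementation might look like (the type names follow the ResourceUnit/MilliUsage naming settled later in this review, but the method signature and the target-utilization math here are assumptions, not the actual PR code):

package autoscaler

type (
	// ResourceUnit is the unit of scalable resources (e.g. number of pollers).
	ResourceUnit uint
	// MilliUsage is resource usage times 1000, kept unsigned so negative
	// values never have to be handled.
	MilliUsage uint64
)

// Recommender suggests a new resource count from the current count and usage.
// The interface shape is an assumption based on the PR description.
type Recommender interface {
	Recommend(current ResourceUnit, usage MilliUsage) ResourceUnit
}

// linearRecommender scales the resource count linearly toward a target
// per-unit usage, clamped to [lower, upper].
type linearRecommender struct {
	lower, upper            ResourceUnit
	targetMilliUsagePerUnit MilliUsage
}

func (l linearRecommender) Recommend(current ResourceUnit, usage MilliUsage) ResourceUnit {
	if l.targetMilliUsagePerUnit == 0 {
		return l.lower
	}
	// The count at which per-unit usage would hit the target.
	proposed := ResourceUnit(uint64(usage) / uint64(l.targetMilliUsagePerUnit))
	switch {
	case proposed < l.lower:
		return l.lower
	case proposed > l.upper:
		return l.upper
	default:
		return proposed
	}
}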

Why?

Pollers constantly poll the cadence-frontend for tasks even when there are no tasks available.
To reduce unnecessary polling, an autoscaler is used to limit the number of concurrent polls. This PR covers the implementation of pollerAutoscaler; a follow-up PR will wire the autoscaler into the pollers.

As for the design of the autoscaler, a resizable semaphore is used to limit concurrency.
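
The semaphore itself is not shown in this excerpt; a minimal sketch of one possible resizable semaphore built on sync.Cond (a hypothetical illustration, not the implementation the PR actually uses):

package autoscaler

import "sync"

// resizableSemaphore limits concurrency to a capacity that can change at
// runtime. Shrinking takes effect gradually as permits are released.
type resizableSemaphore struct {
	mu   sync.Mutex
	cond *sync.Cond
	cap  int // current capacity
	held int // permits currently held
}

func newResizableSemaphore(capacity int) *resizableSemaphore {
	s := &resizableSemaphore{cap: capacity}
	s.cond = sync.NewCond(&s.mu)
	return s
}

// Acquire blocks until a permit is available under the current capacity.
func (s *resizableSemaphore) Acquire() {
	s.mu.Lock()
	defer s.mu.Unlock()
	for s.held >= s.cap {
		s.cond.Wait()
	}
	s.held++
}

// Release returns a permit and wakes any waiters.
func (s *resizableSemaphore) Release() {
	s.mu.Lock()
	defer s.mu.Unlock()
	s.held--
	s.cond.Broadcast()
}

// Resize changes the capacity; waiters recheck against the new value.
func (s *resizableSemaphore) Resize(capacity int) {
	s.mu.Lock()
	defer s.mu.Unlock()
	s.cap = capacity
	s.cond.Broadcast()
}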

How did you test it?

unit tests

Potential risks

@coveralls commented Aug 26, 2022

Pull Request Test Coverage Report for Build 0182dc0d-f5a2-4033-bffd-2b4b85a25e34

  • 107 of 110 (97.27%) changed or added relevant lines in 3 files are covered.
  • No unchanged relevant lines lost coverage.
  • Overall coverage increased (+0.2%) to 64.063%

Changes Missing Coverage                  | Covered Lines | Changed/Added Lines | %
internal/common/autoscaler/recommender.go | 21            | 22                  | 95.45%
internal/internal_poller_autoscaler.go    | 80            | 82                  | 97.56%

Totals (Coverage Status):
Change from base Build 01826fe7-7f06-47b4-bbe2-421af8f4fde0: +0.2%
Covered Lines: 12539
Relevant Lines: 19573

💛 - Coveralls

want: Resource(12),
},
{
name: "over utilized, scale up",

Contributor: Duplicates test above?

Member Author: removed

want: Resource(10),
},
{
name: "under utilized, scale down",

Contributor: Worth adding a test where usage is 0?

Member Author: added

go func() {
for {
select {
case <-p.ctx.Done():

This comment was marked as resolved.

Member Author: good catch!

}()
}
wg.Wait()
time.Sleep(time.Millisecond * 500)

Contributor: Instead of sleeping yourself, it would be better to use assert.Eventually() - less chance of a race condition, and the test should run faster.
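
For reference, testify's assert.Eventually re-checks a condition on a tick until it holds or a timeout elapses; a sketch against a hypothetical observable (pollerCount, the durations, and the test name are illustrative, not from this PR):

package internal

import (
	"sync/atomic"
	"testing"
	"time"

	"github.com/stretchr/testify/assert"
)

func TestScalesDown(t *testing.T) {
	var pollerCount int32 // stand-in for whatever state the real test observes
	// ... start the component under test, which drives pollerCount ...
	// Poll every 10ms for up to 2s instead of sleeping a fixed 500ms.
	assert.Eventually(t, func() bool {
		return atomic.LoadInt32(&pollerCount) == 0
	}, 2*time.Second, 10*time.Millisecond)
}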

Member Author: TIL. Thank you for the advice. I've added optional hook functions in the pollerAutoscaler implementation to simplify the testing.
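
One common shape for such test hooks (a sketch under assumed field names, not necessarily the merged code) is a slice of callbacks invoked after each autoscale pass, which a test can block on instead of sleeping:

package internal

// pollerAutoscaler sketch: tests register a callback that fires after every
// scaling pass, so they can wait on a channel rather than call time.Sleep.
type pollerAutoscaler struct {
	// ... recommender, semaphore, ctx, etc. ...
	onAutoScale []func() // hook functions run after each autoscale pass
}

func (p *pollerAutoscaler) autoScale() {
	// ... read usage, compute the recommendation, resize the semaphore ...
	for _, hook := range p.onAutoScale {
		hook()
	}
}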


type (
// Resource is the unit of scalable resources
Resource uint

Contributor: Should this be called ResourceCount, or something that signifies that it's an int / unit? The way "Resource" reads to me makes it sound like it's more than an int.

Member Author: Initially I was more ambitious, thinking a single autoscaler could handle multiple resources, but it doesn't seem we will need that any time soon. So yeah, I've changed it to ResourceUnit.

Resource uint

// UsageInMilli is the unit of Resource usage times 1000
UsageInMilli uint64

Contributor: And just to clarify the naming of this - it's got nothing to do with milliseconds, right? As far as I can tell, it's just resource usage * 1000. "Milli" in the name makes it sound like milliseconds, but maybe that's just me. Would this not work as a float?

Member Author: I've changed it to MilliUsage, similar to how the Kubernetes autoscaler names Millicores: https://sourcegraph.com/github.com/kubernetes/autoscaler@master/-/blob/vertical-pod-autoscaler/pkg/recommender/logic/recommender.go#L27:5

From the beginning I was trying to avoid floats so that negative numbers can't appear; otherwise I'd need to handle negative cases in a lot of places unnecessarily.
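
Concretely (illustrative values only), a measured utilization of 1.5 resource units would be stored as

usage := MilliUsage(1500) // 1.5 units * 1000, unsigned throughout

so the recommender never deals with floats or negative values.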


Contributor: Sounds like a valid reason not to use floats - I definitely prefer simplicity in this case too. How about calling it Millicores or MilliSomething?


@ZackLK (Contributor) reviewed: Looks good, thanks for making those changes!

@shijiesheng merged commit d94db89 into cadence-workflow:master on Aug 26, 2022
maeer pushed a commit to maeer/cadence-client that referenced this pull request Jun 29, 2023
* add poller autoscaler

* add tests

* address comments