Make sure ConsistentHashingNodeProvider returns unique candidates #17228

Merged
rongrong merged 1 commit into prestodb:master from rongrong:consistent-hashing on Feb 7, 2022

Contributor

rongrong commented Jan 26, 2022

Test plan - unit test

== RELEASE NOTE ==
* Fix a bug where cache performance might be affected when ``CONSISTENT_HASHING`` is used as the scheduling strategy.

Contributor

rschlussel left a comment

I see there's no release note. What was the motivation for this change? Does it fix a bug or some other issue?

Contributor

Instead of building the node list here, why not add to the unique set no matter what, and then convert the set to an immutable list before returning it?

Contributor Author

Ordering is important here.
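
To illustrate the point about ordering: a plain hash set does not guarantee iteration order, so converting it to a list afterwards could reorder the candidates. A minimal sketch of the alternative, tracking seen nodes in a set while appending to a list, is below. The class name, method name, and the use of strings in place of nodes are all hypothetical and are not taken from the PR's code.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Hypothetical helper for illustration only; strings stand in for nodes.
final class OrderedCandidates
{
    private OrderedCandidates() {}

    // Returns the first `count` distinct entries of `probes`, in first-seen order.
    static List<String> firstUnique(List<String> probes, int count)
    {
        Set<String> seen = new HashSet<>();
        List<String> result = new ArrayList<>();
        for (String node : probes) {
            if (result.size() >= count) {
                break;
            }
            // Set.add returns false for a duplicate, so only first occurrences are appended.
            if (seen.add(node)) {
                result.add(node);
            }
        }
        return result;
    }

    public static void main(String[] args)
    {
        // Prints [worker-c, worker-a]
        System.out.println(firstUnique(List.of("worker-c", "worker-a", "worker-c", "worker-b"), 2));
    }
}
```

A `LinkedHashSet` would also preserve insertion order, so it is another way to get uniqueness and ordering together.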

@rongrong
Contributor Author

> I see there's no release note. what was the motivation for this change? Does it fix a bug or other issue?

Welcome back! Yes, it's fixing a bug. The method should return `count` candidates. If the candidates happen not to be unique, that requirement isn't satisfied. Let me check whether we already have release notes for the feature.

Contributor

Could this be an infinite loop, e.g., when the number of workers is less than `count`? Can you add a test for this case?

It is also possible that this will take many iterations to find the next unique candidate if the hash function happens to return an already visited virtual node many times.
How about searching the candidates map to find the next unique candidate instead of relying on a random function? Specifically, if we find a duplicate node, we get its iterator/pointer in the map and check the following nodes in map order until we find a new node or have exhausted all nodes.
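
The suggestion above, walking the map in order until a new physical node turns up, might look roughly like the following. This is only a sketch under stated assumptions: the ring is modeled as a `TreeMap` from virtual-node hash to physical node name, and `RingWalk`, `uniqueCandidates`, and the string node type are hypothetical, not the actual `ConsistentHashingNodeProvider` code.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.TreeMap;

// Illustrative sketch only: names and the String node type are assumptions.
final class RingWalk
{
    private RingWalk() {}

    // Walks the ring clockwise from `hash` and returns up to `count` distinct
    // physical nodes, in the order they are first encountered.
    static List<String> uniqueCandidates(TreeMap<Integer, String> ring, int hash, int count)
    {
        // Never ask for more nodes than exist, so the walk cannot loop forever.
        int target = Math.min(count, new HashSet<>(ring.values()).size());
        List<String> result = new ArrayList<>();
        if (target <= 0) {
            return result;
        }
        Set<String> seen = new HashSet<>();
        // First virtual node at or after the hash, wrapping to the start of the ring.
        Map.Entry<Integer, String> entry = ring.ceilingEntry(hash);
        if (entry == null) {
            entry = ring.firstEntry();
        }
        while (result.size() < target) {
            if (seen.add(entry.getValue())) {
                result.add(entry.getValue());
            }
            // Step to the next virtual node, wrapping around at the end of the ring.
            entry = ring.higherEntry(entry.getKey());
            if (entry == null) {
                entry = ring.firstEntry();
            }
        }
        return result;
    }

    public static void main(String[] args)
    {
        TreeMap<Integer, String> ring = new TreeMap<>();
        ring.put(10, "worker-a");
        ring.put(20, "worker-a");
        ring.put(30, "worker-b");
        // Prints [worker-a, worker-b]
        System.out.println(uniqueCandidates(ring, 15, 2));
    }
}
```

Because the target is capped at the number of distinct physical nodes, one lap around the ring is always enough, which also covers the case where there are fewer workers than `count`.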

Contributor Author

Hmm, I think the assumption here is that `count` is small; 2 is used right now. But I agree we should add a check. In any realistic scenario the probability of repeatedly hitting the same virtual nodes is pretty low, but that also assumes `count` is small. Using the next physical node on the ring seems better.

@rongrong
Contributor Author

rongrong commented Feb 2, 2022

@rschlussel @beinan Can you guys take a look? Thanks!

rongrong force-pushed the consistent-hashing branch 2 times, most recently from 000d433 to f425728 on February 3, 2022 01:50