Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

RATIS-1925. Support Zero-Copy in GrpcClientProtocolService #990

Closed
wants to merge 15 commits into from

Conversation

duongkame
Copy link
Contributor

@duongkame duongkame commented Dec 13, 2023

What changes were proposed in this pull request?

Support Zero-Copy in GrpcClientProtocolService.
Zero copy applied to ordered and unordered API.

Incoming streams are managed by a component called RaftLogZeroCopyCleaner. It sorts the streams by associating the log index and observes the raft log cache eviction events to release the streams accordingly.

What is the link to the Apache JIRA

https://issues.apache.org/jira/browse/RATIS-1925

How was this patch tested?

Added integration test for zero-copy.

@duongkame duongkame marked this pull request as draft December 13, 2023 06:24
@duongkame duongkame marked this pull request as ready for review December 15, 2023 06:39
@duongkame
Copy link
Contributor Author

@szetszwo can you have a look?

@szetszwo
Copy link
Contributor

The change assume

  • processClientRequest completes before evictCache

This is not necessarily true -- when the log can be fast but the StateMachine is slow, evictCache can happen before processClientRequest.

Also, it needs a separated map in RaftLogZeroCopyCleaner to manage all the ReferenceCountedClosable. It is not needed since we may release the buffer when the reference count becomes zero.

@szetszwo
Copy link
Contributor

@duongkame , I suggest to do the following:

  • Wrap the Closable with ReferenceCountedObject, retain() (i.e. refCount=1) and pass it around.

    1. When the request log is added to the cache, retain() (i.e. refCount=2).
    2. When the log is evicted from the cache, release().
    3. When the request is processed, release().
    4. When refCount becomes zero, close the Closable.

Note that (ii) and (iii) can happen in any order.

@duongkame
Copy link
Contributor Author

This is not necessarily true -- when the log can be fast but the StateMachine is slow, evictCache can happen before processClientRequest.

That's the reason why we need to keep the buffer sorted by log index, so if some entry is missed because evictCache happens before processClientRequest, it'll be caught by the next evictCache. All buffers ready to release can be easily addressed from a log index. This approach is way cleaner than managing each buffer separately.

Also, evictCache only happens when the cache capacity is about to be filled. There may be (very) few occurences of evictCache before processClientRequest returns, but not the majority.

Plus, as discussed as per RATIS-1979, in the case of stateMachineCachingEnabled, we may consider releasing zero-copy buffers much earlier than cache evict. The ability to release all zero copy buffers up to a certain log index seems to be crucial.

@szetszwo
Copy link
Contributor

That's the reason why we need to keep the buffer sorted by log index, so if some entry is missed because evictCache happens before processClientRequest, it'll be caught by the next evictCache.

Consider evictCache happens before processClientRequest below:

  1. evictCache(LogSegment(11, 20)) -> release(groupId, 11, 20)
  2. processClientRequest(reply(20)) -> watch(reply(20)) -> group.put(20)

How could 20 be released later on?

@szetszwo
Copy link
Contributor

... we may consider releasing zero-copy buffers much earlier than cache evict. The ability to release all zero copy buffers up to a certain log index seems to be crucial.

Let's don't mix the buffer releasing logic with the RaftLog index. Why the releasing buffer at index 10 has any relation to the buffer at index 11?

Use ref count described in #990 (comment) can solve the problem. Netty uses a similar approach which works very well.

@duongkame duongkame closed this Dec 23, 2023
@duongkame
Copy link
Contributor Author

duongkame commented Jan 4, 2024

@szetszwo My basic assumption is that hat when applied and follower indices reach a particular point, x, we can safely clean up any zero-copy for write/appendEntries up to x (0-x). I did that in the Ozone PoC and that works fine. But I'm not sure if it's correct universally for any use-cases.

I think your suggestion is a better approach. Only concern is that it may lead to a more complex change as we need to pass the ReferenceCountedObject from GRPC to raft internal. E.g. for ClientServiceProtocol (appendEntries) we may need to change the RaftServerAsynchronousProtocol.appendEntriesAsync because now it only accepts a protobuf object.
Anyway, I think it's viable. And I just saw #998 which should help this. Thanks.

Closed this PR for now. Will submit a new one for the new implementation.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants