Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Avoid temporary allocations by improved APIs #580

Merged
merged 3 commits into from
Sep 6, 2024

Conversation

antiguru
Copy link
Member

@antiguru antiguru commented Sep 6, 2024

This PR aims at removing temporary allocations by:

  • Exposing a better addr function, i.e, addr_for that allocates the correct size.
  • Passing Vec<usize> instead of &[usize] whenever the receiver needs to take ownership. In this case, the caller can determine whether they can pass ownership or need to clone, instead of unconditionally forcing the receiver to allocating a new vector to take ownership.

@@ -64,7 +64,7 @@ impl<G: Scope> OperatorBuilder<G> {
where
P: ParallelizationContract<G::Timestamp, C> {

let connection = vec![Antichain::from_elem(Default::default()); self.builder.shape().outputs()];
let connection = (0..self.builder.shape().outputs()).map(|_| Antichain::from_elem(Default::default())).collect();
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Note for future selves, this is a common enough pattern (antichain with default summary) that it probably wants to be represented more compactly, perhaps as an enum variant. It seems that such enums around vectors are surprisingly cheap now (more than just the one additional variant allowed by e.g. Option<Vec<_>>).

@@ -97,6 +97,15 @@ where
{
fn name(&self) -> String { self.subgraph.borrow().name.clone() }
fn addr(&self) -> Vec<usize> { self.subgraph.borrow().path.clone() }

fn addr_for(&self, index: usize) -> Vec<usize> {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we should eventually add a comment here, or enrich the name to addr_for_child to be clear. Or addr_plus or .. idk something to be clear what is going on here. It's not a great mystery once you read the code, ofc. Don't worry about this if you don't want, and I can tidy in post.

Comment on lines 36 to 38
/// A sequence of scope identifiers describing the path from the worker root to the index in
/// this scope.
fn addr_for(&self, index: usize) -> Vec<usize>;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh boy. You did have the docs and I read the files out of order. I'm sorry! But maybe "to the index" -> "to the child indicated by index"?

Copy link
Member

@frankmcsherry frankmcsherry left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This looks great! I left some notes, but they are all minor.

Another thing we discussed offline, and I wanted to record here, is that perhaps we should be using Rc<[usize]> for the addresses, rather than Vec<usize>. We don't need to mutate anything, and there are moments that we have multiple copies of the same thing. Another potentially useful idea is to have the worker support a per-dataflow allocation for storing addresses, as they all have the same shared lifetime (the dataflow), and they never change. So, e.g. a map from operator id (integer) back to the path (integer sequence).

Anyhow, this seems great independently, and these ideas are potentially good follow-ons!

Signed-off-by: Moritz Hoffmann <[email protected]>
@antiguru antiguru merged commit adbaf2d into TimelyDataflow:master Sep 6, 2024
1 check passed
@antiguru antiguru deleted the temporary_allocations branch September 6, 2024 18:15
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants