Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Precise parallel relation size #57

Closed
StarGazerM opened this issue Dec 6, 2024 · 4 comments · Fixed by #62
Closed

Precise parallel relation size #57

StarGazerM opened this issue Dec 6, 2024 · 4 comments · Fixed by #62

Comments

@StarGazerM
Copy link

Hi:

Can ask for the reason why in default RelIndexRead implementation for CRelIndex, we are computing an approximate size?
I was thinking use precise relation size may avoid the redundunt computation in the case that some inner join body clause is empty.

Yihao

@StarGazerM
Copy link
Author

StarGazerM commented Dec 6, 2024

For more details, what I want do is add following change in in ascent_macro/src/ascent_codegen when iterating over scc.rules :

let size_check_code_list = rule.body_items.iter().filter_map(|bi| {
         if let MirBodyItem::Clause(clause) = bi {
            let rel = expr_for_rel(&clause.rel, mir);
               Some(quote! {
                  #rel.len() > 0
               })
         } else {None}
      });
      // connect the size check code with logical and
      let size_check_code = size_check_code_list.reduce(|a, b| quote!{#a && #b}).unwrap_or(quote!{true});
      evaluate_rules.push(if rule_parallelism { quote! {
         ascent::internal::comment(#msg);
         if #size_check_code {
            __scope.spawn(|_| {
               #before_rule_var
               #rule_compiled
               #update_rule_time_field
            });
         }
      }} else { quote! {
         #before_rule_var
         ascent::internal::comment(#msg);
         if #size_check_code {
            #rule_compiled
         }
         #update_rule_time_field
      }});
   }

Seems when running large rules like ddisasm, this extra check could save us lots of time avoid spinning in long tail iterations

@s-arash
Copy link
Owner

s-arash commented Dec 12, 2024

Hey @StarGazerM, yeah I think that's reasonable.

I initially made the len() an estimate for perf reasons (you can check its impl for parallel relations). But maybe we can calculate the len() when freezing the relation; but I'd need to think about how it would affect BYODS relations.

In any case, if making the len() precise turns out not to be feasible for BYODS relations, we can add an is_empty() method to RelIndexRead and do the optimization you suggested.

@regexident
Copy link
Contributor

In any case, if making the len() precise turns out not to be feasible for BYODS relations, we can add an is_empty() method to RelIndexRead and do the optimization you suggested.

Something like Iterator's size_hint() should work quite well for this (and already be familiar to most), I think?

fn size_hint(&self) -> (usize, Option<usize>)

https://doc.rust-lang.org/1.82.0/std/iter/trait.Iterator.html#method.size_hint

The idea is that if you return (n, Some(n) you get exact size, while (m, None) provides a lower bound, (0, Some(n)) an upper bound and (m, Some(n) (where m <= n) provides lower and upper bounds.

@StarGazerM
Copy link
Author

yep, I agree, if the len can be unreliable for many data structure maybe we can consider rename? ==, although now these change will immediate force everywhere to be changed in current BYODS relations....

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging a pull request may close this issue.

3 participants