compiler: Add skewing pass towards Temporal Blocking #1620
Out of curiosity, have you considered using an implicit tree visitor for Clusters (i.e. a subclass of Queue), as done for example here: https://github.com/devitocodes/devito/blob/master/devito/passes/clusters/blocking.py#L12 ?
A few questions:
- Did you understand how a Queue works as for example in Blocking?
- Did you think it's not necessary here? As in, do you think that working on individual clusters in isolation is the way to go? (Haven't really thought about it, honestly, for this specific pass)
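To make the question about Queue-style visitors concrete, here is a toy sketch of the general idea: clusters that share a loop-nest prefix are grouped, and a callback is invoked per prefix, deepest first. This is not Devito's Queue; ToyCluster, ToyQueue, RecordPrefixes and the process method are hypothetical names invented for illustration, and only the callback(self, clusters, prefix) signature mirrors the one visible in the diff for blocking.py.

```python
from itertools import groupby


class ToyCluster:
    """Hypothetical stand-in for a Cluster: a name plus its loop nest."""

    def __init__(self, name, dims):
        self.name = name
        self.dims = tuple(dims)


class ToyQueue:
    """Toy visitor: groups clusters by shared loop prefix, deepest first."""

    def callback(self, clusters, prefix):
        # Subclasses override this; the default leaves clusters untouched
        return clusters

    def process(self, clusters, level=1):
        processed = []
        # Consecutive clusters sharing the first `level` dims form a group
        for pfx, group in groupby(clusters, key=lambda c: c.dims[:level]):
            group = list(group)
            if len(pfx) < level:
                # Loop nest exhausted: nothing deeper to visit
                processed.extend(group)
            else:
                inner = self.process(group, level + 1)
                processed.extend(self.callback(inner, pfx))
        return processed


class RecordPrefixes(ToyQueue):
    """Example pass: records each prefix handed to the callback."""

    def __init__(self):
        self.seen = []

    def callback(self, clusters, prefix):
        self.seen.append(prefix)
        return clusters


clusters = [ToyCluster("c0", "txy"), ToyCluster("c1", "txy"), ToyCluster("c2", "tz")]
visitor = RecordPrefixes()
result = visitor.process(clusters)
print(visitor.seen)
```

The point of the pattern is that a pass only implements callback and sees all clusters under a given prefix together, rather than handling each cluster in isolation.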
The other key comment I have is the following: when you implement a pass (and, in general, any algorithm in your life), you need to think not once, not ten times, but rather one hundred times about the complexity of your implementation. Useful questions to ask yourself to drive the implementation: "are all these lists/dicts/... really necessary?", "can I get away with fewer data structures and/or loops?", etc. For example, here you have both skew_dims and skewable, which doesn't make much sense to me. Also, remove from skewable is hardly necessary... I'm fairly sure there's a neater way of achieving this. You also have two nested ifs: the first one is if i.dim not in skew_dims, while the nested one is if index < .... and i.dim in skewable. I hardly believe that comparing indices and checking again for the presence of a dimension in skewable is the simplest way to go.
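As one possible illustration of "fewer data structures and/or loops", the sketch below finds the sequential dimension and the dimensions nested under it in a single pass, with no index comparisons and no removals. It is not the PR's code: Interval, its is_sequential flag and find_skewable are hypothetical, and the rule "skewable means nested under the first sequential loop" is an assumption taken from the code comment quoted later in this review.

```python
from collections import namedtuple
from itertools import dropwhile

# Hypothetical stand-in for an interval: a dimension name plus a flag
Interval = namedtuple("Interval", ["dim", "is_sequential"])


def find_skewable(intervals):
    """Return (sequential dim, dims nested under it), or (None, [])."""
    # Drop everything outside the first sequential loop
    tail = list(dropwhile(lambda i: not i.is_sequential, intervals))
    if not tail:
        return None, []
    seq, *nested = tail
    return seq.dim, [i.dim for i in nested if not i.is_sequential]


nest = [Interval("t", True), Interval("x", False), Interval("y", False)]
print(find_skewable(nest))
```

A single return value replaces the two lists that had to be kept consistent with each other.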
devito/passes/clusters/temporal.py
skewable.append(i.dim)

if len(skew_dims) > 1:
    raise warning("More than 1 dimensions that can be skewed.\
No nested skewing? :D
warning or error?
Re "No nested skewing? :D": why not?
Codecov Report
@@            Coverage Diff             @@
##           master    #1620      +/-   ##
==========================================
+ Coverage   86.50%   86.55%   +0.04%
==========================================
  Files         216      216
  Lines       32479    32598     +119
  Branches     4279     4296      +17
==========================================
+ Hits        28097    28215     +118
+ Misses       3901     3899       -2
- Partials      481      484       +3
devito/passes/clusters/blocking.py
intervals.append(i)
# Since we are here, prefix is skewable and nested under a
# SEQUENTIAL loop. Do not skew innermost loop
# TODO: In case of subdomains or perfect loops nests with more
I wonder whether this should rather be fixed in this PR
------
In case of skewing, if 'blockinner' is enabled, the innermost loop is also skewed.
"""
processed = preprocess(clusters, options)
This was pointed out a few times: something is "processed" only at the end, when it needs no more processing... anyway, nitpicking
@@ -88,9 +98,14 @@ def callback(self, clusters, prefix):
exprs = [uxreplace(e, {d: bd}) for e in c.exprs]

# The new Cluster properties
# TILABLE property is dropped after the blocking.
# comments don't take a full stop at the end
after the blocking -> after blocking
sounds like a useless comment though
ok
@@ -103,6 +118,28 @@ def callback(self, clusters, prefix):
return processed


def preprocess(clusters, options):
    # Preprocess: heuristic: drop TILABLE from innermost Dimensions to
this could maybe be turned into a docstring now that preprocess has its own function
ok
Left some nitpicks, addressable in the next PR.
This now looks great! Thanks for the patience :)
This is the first step towards Temporal Blocking automation.
This PR aims to add skewing as a pass to:
- A new Queue subclass is implemented.
- A skewinner option is added to choose whether or not to skew the innermost loop.
- No changes are made to the computation order. Nothing is optimized.
Codegen tests have been added, covering simple time-stepping computations, autotuning, and subdomains.
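The statement that skewing on its own does not change the computation order can be checked on a toy loop nest. The sketch below is an illustration only, not the code Devito generates: plain_order and skewed_order are hypothetical helpers, a skew factor of 1 against the time loop is assumed, and the skewinner flag borrows its name from the option described above.

```python
def plain_order(nt, nx, ny):
    """Iteration order of the original t/x/y loop nest."""
    return [(t, x, y) for t in range(nt) for x in range(nx) for y in range(ny)]


def skewed_order(nt, nx, ny, skewinner=False):
    """Same nest with x (and optionally y) skewed against t by a factor of 1."""
    order = []
    for t in range(nt):
        # Skewing shifts the loop bounds by t ...
        for xs in range(t, nx + t):
            y_shift = t if skewinner else 0
            for ys in range(y_shift, ny + y_shift):
                # ... and the accesses shift back by the same amount
                order.append((t, xs - t, ys - y_shift))
    return order


assert skewed_order(3, 4, 5) == plain_order(3, 4, 5)
assert skewed_order(3, 4, 5, skewinner=True) == plain_order(3, 4, 5)
```

Both variants visit the same points in the same order; the reordering only appears once the skewed loops are blocked across time, which is left to later steps.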