-
Notifications
You must be signed in to change notification settings - Fork 229
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
compiler: Revamp code generation for asynchronous operations #2376
Conversation
writes = c1.scope.writes[target] | ||
except KeyError: | ||
# No read-write dependency, ignore | ||
def _schedule_waitlocks(self, c0, d, clusters, locks, syncs): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Note for reviewers: this is mostly the same as before (with same exceptions), but split over different methods for better code readability
Codecov ReportAttention: Patch coverage is
Additional details and impacted files@@ Coverage Diff @@
## master #2376 +/- ##
==========================================
- Coverage 86.81% 86.77% -0.05%
==========================================
Files 234 234
Lines 44004 44245 +241
Branches 8127 8181 +54
==========================================
+ Hits 38204 38393 +189
- Misses 5093 5137 +44
- Partials 707 715 +8 ☔ View full report in Codecov by Sentry. |
|
||
__all__ = ['buffering'] | ||
|
||
|
||
@timed_pass() | ||
def buffering(clusters, callback, sregistry, options, **kwargs): | ||
def buffering(clusters, key, sregistry, options, **kwargs): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Note for reviewers: unlike passes/clusters/asynchrony.py
, which for a good 85-90% is the same code but refactored, this pass has been totally revamped and the result is, imho, vastly better (more robust, easier to read, dramatically easier to understand and modify) code
devito/operator/profiling.py
Outdated
|
||
k = PerfKey(name, rank) | ||
|
||
if not ops: | ||
if not ops or any(np.isnan(i) for i in [ops, points, traffic]): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ah same
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
np.isfinite
to catch the Inf cases too
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
so something like
any(not np.isfinite(i) ...)
instead of
any(np.isnan(i) ...)
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
yes
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
reminder
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
done
devito/passes/iet/instrument.py
Outdated
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
οκ
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Big PR, I need some more time, to have another look and most importantly understand a few things!
Check out this pull request on See visual diffs & provide feedback on Jupyter Notebooks. Powered by ReviewNB |
devito/ir/iet/efunc.py
Outdated
@@ -215,6 +215,13 @@ def functions(self): | |||
launch_args += (self.stream.function,) | |||
return super().functions + launch_args | |||
|
|||
@cached_property | |||
def expr_symbols(self): | |||
launch_symbols = (self.grid, self.block,) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Don't need second comma in this tuple
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
OK
devito/ir/support/space.py
Outdated
if callable(maybe_callable): | ||
dims = {i.dim for i in self if maybe_callable(i.dim)} | ||
else: | ||
dims = set(as_tuple(maybe_callable)) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Suggestion: add an as_set
utility to do this. They could both be thin wrappers around some as_iterable
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
OK
intervals = IntervalGroup(intervals, relations=relations) | ||
|
||
return IterationSpace(intervals, sub_iterators, directions) | ||
return self.ispace |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Could this be dropped then?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
maybe, given the major change I prefer to keep it like that for now, it also helps me remember what .ispace
actually represents (and why) when obtained via .written
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks mostly good
devito/core/cpu.py
Outdated
options0 = kwargs0.pop('options') | ||
|
||
def wrapper(expressions, **kwargs1): | ||
options = {**options0, **kwargs1.pop('options', {})} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
just wrapper(expressions, options=None, **kwargs1):
and skip the pop
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
right
kwargs = {'options': options, **kwargs0, **kwargs1} | ||
|
||
# User-provided openmp flag has precedence over defaults | ||
if not options['openmp']: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
options.get ?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
no square brackets is fine because by construction openmp
must be there
devito/ir/stree/algorithms.py
Outdated
processed.append(c.rebuild(exprs=None, guards=c.guards, syncs=c.syncs)) | ||
elif c.is_critical_region: | ||
if c.is_wait: | ||
processed.append(c.rebuild(exprs=[], syncs=c.syncs)) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
can probably drop c.sync
it's unchanged
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
good catch
"""Shorthand to retrieve the DiscreteFunctions in `exprs`.""" | ||
indexeds = search(exprs, q_indexed, mode, 'dfs') | ||
|
||
functions = search(exprs, q_function, mode, 'dfs') |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
should this be in a deep
mode like for retrieve_dimensions
@@ -672,29 +672,6 @@ class Dummy(SubDomainSet): | |||
assert z.is_Parallel | |||
|
|||
|
|||
class TestMultiSubDimension: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Removed test?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There were issues rebuilding MultiSubDimension
because of how the thickness worked, so it has a bunch of extra gubbins to handle this. This PR removes the thickness as a kwarg and all the extra machinery, so this test isn't really needed anymore
57c7acd
to
2e5d1eb
Compare
|
||
@cached_property | ||
def readfrom(self): | ||
def last_idx(self): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That is actually useful for the xdsl lowering, I had no idea it existed
def _make_snapout(self, iet, *args): | ||
return self._make_callable('write_snapshot', iet, *args) | ||
|
||
def _make_snapin(self, iet, *args): | ||
return self._make_callable('read_snapshot', iet, *args) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I would keep the func name as the callable arg for being homogeneous to 'init_array'
# 2-tuples representing ranges | ||
imask = [] | ||
for v in o.imask: | ||
try: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The single dispatch of uxrteplace doesn't handle these two cases?
devito/operator/profiling.py
Outdated
|
||
k = PerfKey(name, rank) | ||
|
||
if not ops: | ||
if not ops or any(np.isnan(i) for i in [ops, points, traffic]): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
yes
devito/operator/profiling.py
Outdated
|
||
k = PerfKey(name, rank) | ||
|
||
if not ops: | ||
if not ops or any(np.isnan(i) for i in [ops, points, traffic]): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
reminder
@@ -36,6 +36,16 @@ def is_on_device(obj, gpu_fit): | |||
return all(f in gpu_fit for f in fsave) | |||
|
|||
|
|||
def stream_dimensions(f): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Not really related to this PR but is it good that these functions live in init?
Maybe some better place? other/new file?
return clusters | ||
|
||
try: | ||
pd = prefix[-2].dim |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is not equivalent to dim.parent
?
right = parent.symbolic_max - rst | ||
def __init_finalize__(self, name, parent, msd): | ||
# NOTE: a MultiSubDimension stashes a reference to the originating | ||
# MultiSubDomain. This creates a circular pattern as the `msd` itself |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
two spaces?
2e5d1eb
to
0decbb2
Compare
This PR is the result of the last 40 days of work, more or less. It would deserve to be split over multiple PRs, as what I've done here is really really bad, but point is, 1) I didn't have the time to go back and forth and 2) it's all intertwined, somehow...
Strictly speaking, there are no new features here. "Just" a major revamp to generalise code generation for asynchronous operations, which will mostly impact PRO.
A lot of compiler passes have been rewritten from scratch, and in my opinion, the code quality has increased significantly. Some legacy code has finally gone for good...
Despite being a giant PR, it's worth saying that all edits go in the same direction and contribute to the same goal; in other words, this PR isn't the sum of N orthogonal changes, even though, I agree, the appearance may be deceptive