-
Notifications
You must be signed in to change notification settings - Fork 229
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
compiler: Improve data streaming backend #2390
Conversation
Codecov ReportAttention: Patch coverage is
Additional details and impacted files@@ Coverage Diff @@
## master #2390 +/- ##
==========================================
- Coverage 86.78% 86.69% -0.09%
==========================================
Files 234 234
Lines 44271 44383 +112
Branches 8184 8216 +32
==========================================
+ Hits 38419 38477 +58
- Misses 5137 5185 +48
- Partials 715 721 +6 ☔ View full report in Codecov by Sentry. |
devito/arch/archinfo.py
Outdated
@@ -19,6 +19,8 @@ | |||
'get_cuda_path', 'get_hip_path', 'check_cuda_runtime', 'get_m1_llvm_path', | |||
'Platform', 'Cpu64', 'Intel64', 'IntelSkylake', 'Amd', 'Arm', 'Power', | |||
'Device', 'NvidiaDevice', 'AmdDevice', 'IntelDevice', | |||
# Generic | |||
'CPU64', |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
SO both CPU and Cpu ?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
CPU64 = Cpu64('cpu64')
in archinfo
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Agrred, CPU64 + Cpu64 seems like there shound't need both
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
one is the class the other an instance I agree it's pretty horrible will check
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ok, done CPU64 -> ANYCPU
@@ -939,21 +955,23 @@ def _emit_apply_profiling(self, args): | |||
if metrics: | |||
perf("Global performance: [%s]" % ', '.join(metrics)) | |||
|
|||
# Same as above, but excluding the setup phase, e.g. the CPU-GPU |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
So the data transfer time from device to host is not included ?
is it reported somewhere else?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
wdym ? this is just an additional line -- a line that does not include the setup phase
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I will cross-check the logs again
metrics.append("%.2f s" % fround(v.time)) | ||
metrics.append("%.2f GPts/s" % fround(v.gpointss)) | ||
|
||
perf("Global performance <w/o setup>: [%s]" % ', '.join(metrics)) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I tried to run locally the PR and I do not see any diff in the log, should I ?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
it should yes according to these changes
devito/tools/abc.py
Outdated
@@ -151,6 +151,22 @@ def __init__(self, a, b, c=4): | |||
|
|||
return cls(*args, **kwargs) | |||
|
|||
def _rreplace(self, subs): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why? this is just self._rebuild(**subs)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
not only was it pretty similar if not identical to _rebuild
, but also -- I just noted, sorry -- it appears to be completely unused at this stage. Probably something I coded during development which I eventually managed to avoid using, hence I'm dropping it for good!
b79cf67
to
5d9dfc6
Compare
devito/ir/clusters/algorithms.py
Outdated
@@ -433,7 +433,7 @@ def callback(self, clusters, prefix, seen=None): | |||
|
|||
key = lambda i: i in prefix[:-1] or i in hs.loc_indices | |||
ispace = c.ispace.project(key) | |||
# HaloTOuch are not parallel | |||
# HaloTouch's are not parallel |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
"HaloTouches"
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
OK
devito/operator/operator.py
Outdated
@@ -975,21 +993,12 @@ def lower_perfentry(v): | |||
|
|||
if v.time <= 0.01: | |||
# Trim down the output for very fast sections | |||
name = "%s%s<>" % (k.name, rank) | |||
name = "%s%s" % (k.name, rank) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Could you define this first thing in the function? Would save the repeated line
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
good catch, OK
def _signature(self): | ||
""" | ||
The signature of an AbstractFunction is the set of fields that | ||
makes it "compatible" with another AbstractFunction. The fact that |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Might be worth defining "compatible" here
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
that's encoded in the first part of the sentence
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In what sense? The current docstring doesn't help me understand what it means for one AbstractFunction
to be compatible with another in this context
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I will try to improve it in my next PR
7ab94e2
to
4d08839
Compare
Yet another bunch of misc compiler tweaks and generalisations, mostly to better support lazy data streaming in PRO