Skip to content

compiler: Revamp data streaming #1702

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 39 commits into from
Jun 17, 2021
Merged

compiler: Revamp data streaming #1702

merged 39 commits into from
Jun 17, 2021

Conversation

FabioLuporini
Copy link
Contributor

@FabioLuporini FabioLuporini commented Jun 7, 2021

This PR:

  • Fixes a host-side memory leak when streaming data from the host to device. This was due to the pthreads allocating so-called pinned memory under-the-hood to perform the transfers; the issue was that this memory never got freed. Now such pinned memory regions are created by the parent thread, through the use of buffers (hence the significant changes to the buffering pass) and by requesting the OpenACC runtime to always allocate pinned memory (option gpu=pinned added to NvidiaCompiler class)
  • Enables data streaming on AMD cards via OpenMP offloading (reusing part of the technology implemented for the bug fix above!)
  • Reworks and improves various components revolving around the entire data streaming process
  • Makes data streaming , and in particular host-to-device prefetching, more efficient
  • Adds quite a few new data streaming tests

@codecov
Copy link

codecov bot commented Jun 7, 2021

Codecov Report

Merging #1702 (e21744f) into master (7f564c1) will decrease coverage by 0.56%.
The diff coverage is 75.12%.

Impacted file tree graph

@@            Coverage Diff             @@
##           master    #1702      +/-   ##
==========================================
- Coverage   89.10%   88.54%   -0.57%     
==========================================
  Files         207      208       +1     
  Lines       32759    33440     +681     
  Branches     4245     4352     +107     
==========================================
+ Hits        29190    29608     +418     
- Misses       3085     3334     +249     
- Partials      484      498      +14     
Impacted Files Coverage Δ
devito/arch/compiler.py 52.39% <0.00%> (-2.24%) ⬇️
devito/passes/equations/__init__.py 100.00% <ø> (ø)
devito/passes/iet/languages/openmp.py 100.00% <ø> (ø)
tests/test_adjoint.py 100.00% <ø> (ø)
tests/test_gpu_common.py 1.70% <0.00%> (-0.20%) ⬇️
devito/symbolics/queries.py 39.72% <6.12%> (-8.97%) ⬇️
devito/ir/support/vector.py 88.23% <9.09%> (-5.58%) ⬇️
devito/passes/clusters/asynchrony.py 34.71% <38.56%> (-13.52%) ⬇️
devito/passes/iet/orchestration.py 60.89% <55.20%> (-4.71%) ⬇️
devito/ir/iet/efunc.py 90.00% <87.50%> (-0.48%) ⬇️
... and 36 more

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 7f564c1...e21744f. Read the comment docs.

@FabioLuporini FabioLuporini requested a review from mloubout June 7, 2021 13:02
@FabioLuporini FabioLuporini force-pushed the revamp-streaming-final branch 3 times, most recently from a606c53 to b5669de Compare June 9, 2021 07:37
@@ -460,7 +460,7 @@ def __init__(self, *args, **kwargs):
self.cflags.remove('-std=c99')
self.cflags.remove('-O3')
self.cflags.remove('-Wall')
self.cflags += ['-std=c++11', '-acc:gpu', '-mp']
self.cflags += ['-std=c++11', '-acc:gpu', '-gpu=pinned', '-mp']
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ok

"""
# TODO: this is currently very rudimentary
required = derive_parameters(iet)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

required assigned twice?

# Attach SyncOps to Clusters
actions = defaultdict(Actions)

# Case 1
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Any docs to provide more details for each case?

else:
size = async_degree

# Replace `d` with a suitable CustomDimension `bd`
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

bd-db mismatch ?

`time` Dimension, we could have `{time: (time_m + db0) % 2, (time_m + db0)}`;
likewise, for backwards, `{time: (time_M - 2 + db0) % 4, time_M - 2 + db0}`.
"""
mapper = {}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

bd bm mismatch

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yeah, you're very right, but the gencode name convention and the variable name convention in the python code is sometimes different. I'll see what needs to be done to fix this

Copy link
Contributor

@mloubout mloubout left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Except the Precedence thing I don't think makes sense, no other comments loks fine to me.

else:
nofs, size = i
if isinstance(nofs, Mod):
# NOTE: work around alleged SymPy bug which doesn't allow
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As said on slack, it's not a sympy bug, it's a choice of devito to go with % while sympy uses fmod which is a function call and therefore always has precedence over Mul.
I think it makes much more sense to fix print_Mod rather than adding this odd extra type.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's actually very easy to fix if you use ourMod instead of the sympy one and add

@property
def precedence(self):
    return 49  # precedence(Mul)-1

since the first thing sympy checks is if it's defined before going to the defaults
https://github.com/sympy/sympy/blob/126f80578140e752ad5135aac77b8ff887eede3e/sympy/printing/precedence.py#L123

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

right. However, as explained on Slack, I forgot that this piece of code is actually dead, so I just dropped it

@FabioLuporini FabioLuporini force-pushed the revamp-streaming-final branch from 18765ab to e21744f Compare June 16, 2021 13:56
@FabioLuporini FabioLuporini merged commit 87ee495 into master Jun 17, 2021
@FabioLuporini FabioLuporini deleted the revamp-streaming-final branch June 17, 2021 22:13
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants