Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

compiler: Refactorings, simplifications, generalizations #1810

Merged
merged 47 commits into from
Dec 29, 2021

Conversation

FabioLuporini
Copy link
Contributor

@FabioLuporini FabioLuporini commented Dec 16, 2021

As the title says, zero new features introduced by this PR.

The main changes are the following:

  • The DataSpace construction is now vastly easier than in master. Also, it is performed only once, when all Clusters will have been constructed and optimized. This makes the code: easier to understand and faster to execute
  • Guards (ie conditionals) are now a first-class citizen in the compiler. See ir/support/guards.py
  • A lot of dead code (also as a consequence of the aforementioned changes) was dropped (in particular, most of the Stencil class from ir/support/stencil.py , most of DataSpace, ...)
  • The Dimension hierarchy was refactored. Now there's IncrDimension and BlockDimension, the latter for loop blocking. They inherit from the same class, which contains most of the logic, AbstractIncrDimension. They behave the same! But explicit distinction helps with making the code neater in more than one place. Also, the is_SubIterator class property could be introduced, which introduces a clear separation between Dimensions that cannot create loops (the SubIterators...) and the other ones (like Dimension, SubDimension, CustomDimension)

I will also add comments to the code to guide the reviewing

fixes #1771


def lshw_gpu_info(text):
def lshw_single_gpu_info(text):
def homogenise_gpus(gpu_infos):
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

here u see a big diff, but it's mostly due to indentation...


gpu_info = homogenise_gpus(gpu_infos)

# Also attach callbacks to retrieve instantaneous memory info
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

... this is really the only new part

devito/passes/clusters/asynchrony.py Show resolved Hide resolved
devito/passes/iet/parpragma.py Show resolved Hide resolved
the Devito compiler) to ensure data coherence. 'uniform', instead,
means the Array will be used as if the underlying node had uniform
memory address space.
The memory space. Allowed values: 'local', 'mapped'. Defaults to 'local'.
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

simplifcations...

@@ -790,76 +795,9 @@ def free_symbols(self):
_pickle_kwargs = DerivedDimension._pickle_kwargs + ['factor', 'condition', 'indirect']


class SteppingDimension(DerivedDimension):
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this was not dropped -- just moved below towards the bottom of the file!

@codecov
Copy link

codecov bot commented Dec 16, 2021

Codecov Report

Merging #1810 (9ce8cb6) into master (3c89f70) will decrease coverage by 0.03%.
The diff coverage is 75.00%.

Impacted file tree graph

@@            Coverage Diff             @@
##           master    #1810      +/-   ##
==========================================
- Coverage   89.60%   89.57%   -0.04%     
==========================================
  Files         209      209              
  Lines       34307    34379      +72     
  Branches     5198     5197       -1     
==========================================
+ Hits        30741    30794      +53     
- Misses       3077     3097      +20     
+ Partials      489      488       -1     
Impacted Files Coverage Δ
devito/arch/compiler.py 57.41% <ø> (ø)
devito/passes/iet/orchestration.py 60.11% <0.00%> (ø)
devito/symbolics/extended_sympy.py 96.01% <0.00%> (-0.72%) ⬇️
tests/test_gpu_common.py 1.48% <0.00%> (-0.06%) ⬇️
devito/arch/archinfo.py 47.32% <16.66%> (-1.83%) ⬇️
devito/core/arm.py 50.00% <50.00%> (+5.17%) ⬆️
devito/core/operator.py 87.59% <50.00%> (+0.77%) ⬆️
devito/passes/clusters/asynchrony.py 37.33% <50.00%> (+2.35%) ⬆️
devito/logger.py 77.77% <60.00%> (-2.62%) ⬇️
devito/ir/support/guards.py 70.23% <70.23%> (ø)
... and 48 more

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 3c89f70...9ce8cb6. Read the comment docs.

Copy link
Contributor

@mloubout mloubout left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks like a nice cleanup. Will make a full pass over the next few days.

Copy link
Contributor

@georgebisbas georgebisbas left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A zero-level review. Will look properly next days. Interesting work done!

devito/ir/clusters/cluster.py Show resolved Hide resolved
devito/ir/equations/equation.py Show resolved Hide resolved
if d.is_SubIterator:
iterators.setdefault(d.root, set()).add(d)
elif d.is_Conditional:
# Use `parent`, and `root`, because a ConditionalDimenion may
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

typo

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

OK, fixing

devito/ir/equations/equation.py Show resolved Hide resolved

There are two main differences between a LoweredEq and a
ClusterizedEq:

* In a ClusterizedEq, the iteration and data spaces must *always*
be provided by the caller.
* In a ClusterizedEq, the iteration space must *always* be provided
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

the IterationSpace

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

OK


collapsable = self._find_collapsable(root, candidates)
root, collapsable = self._select_candidates(candidates)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

hmm....maybe better before...

devito/passes/iet/parpragma.py Show resolved Hide resolved
@@ -36,13 +36,11 @@ class Dimension(ArgProvider):
|
---------------------------
| |
BasicDimension DefaultDimension
DerivedDimension DefaultDimension
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

maybe add all DImension subclasses to this plot?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

as the upper part of the docstring says, just the API classes

tests/test_dle.py Show resolved Hide resolved
@@ -2565,11 +2572,11 @@ def test_fullopt(self):

# Check expected opcount/oi
assert summary[('section1', None)].ops == 90
assert np.isclose(summary[('section1', None)].oi, 1.511, atol=0.001)
assert np.isclose(summary[('section1', None)].oi, 2.031, atol=0.001)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

? Why?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

byproduct of a more accurate dspace calculation

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

seems way.. more accurate..

def lshw_single_gpu_info(text):
def homogenise_gpus(gpu_infos):
"""
Run homogeneity checks on a list of GPU, return GPU with count if
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

list of GPUs

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

OK

@@ -56,11 +56,15 @@ def autotune(operator, args, level, mode):
if mode in ['preemptive', 'destructive']:
for p in operator.parameters:
if isinstance(p, MPINeighborhood):
at_args.update(MPINeighborhood(p.neighborhood)._arg_values())
at_args.update(
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

needed this change?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

it's just to make the code more homogeneous. The line below had to be split over multiple lines due to the additions of the args argument, so I did the same aesthetic change here

@review-notebook-app
Copy link

Check out this pull request on  ReviewNB

See visual diffs & provide feedback on Jupyter Notebooks.


Powered by ReviewNB

@FabioLuporini FabioLuporini force-pushed the tweaks-101010 branch 2 times, most recently from 79b00a2 to 45e6e23 Compare December 20, 2021 16:09
Copy link
Contributor Author

@FabioLuporini FabioLuporini left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

pushing revisions now

def lshw_single_gpu_info(text):
def homogenise_gpus(gpu_infos):
"""
Run homogeneity checks on a list of GPU, return GPU with count if
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

OK

@@ -56,11 +56,15 @@ def autotune(operator, args, level, mode):
if mode in ['preemptive', 'destructive']:
for p in operator.parameters:
if isinstance(p, MPINeighborhood):
at_args.update(MPINeighborhood(p.neighborhood)._arg_values())
at_args.update(
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

it's just to make the code more homogeneous. The line below had to be split over multiple lines due to the additions of the args argument, so I did the same aesthetic change here

devito/ir/clusters/cluster.py Show resolved Hide resolved
if d.is_SubIterator:
iterators.setdefault(d.root, set()).add(d)
elif d.is_Conditional:
# Use `parent`, and `root`, because a ConditionalDimenion may
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

OK, fixing


There are two main differences between a LoweredEq and a
ClusterizedEq:

* In a ClusterizedEq, the iteration and data spaces must *always*
be provided by the caller.
* In a ClusterizedEq, the iteration space must *always* be provided
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

OK

@@ -36,13 +36,11 @@ class Dimension(ArgProvider):
|
---------------------------
| |
BasicDimension DefaultDimension
DerivedDimension DefaultDimension
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

as the upper part of the docstring says, just the API classes

@@ -2565,11 +2572,11 @@ def test_fullopt(self):

# Check expected opcount/oi
assert summary[('section1', None)].ops == 90
assert np.isclose(summary[('section1', None)].oi, 1.511, atol=0.001)
assert np.isclose(summary[('section1', None)].oi, 2.031, atol=0.001)
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

byproduct of a more accurate dspace calculation

@@ -182,6 +173,14 @@ def has_increments(self):
def is_scalar(self):
return not any(f.is_Function for f in self.scope.writes)

@cached_property
def grid(self):
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I wonder whether this could help with some bugs that we have seen in the past..... need to search ...

@@ -47,9 +47,19 @@
}


def _set_log_level(level):
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do we need both _set_log_level, set_log_level ?

@@ -792,9 +800,9 @@ def test_nested_cache_blocking_structure_subdims(self, blocklevels):
@pytest.mark.parametrize('exprs,collapsed,scheduling', [
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So we are OK with this? Maybe we can drop some of them since there is no diff now.

@@ -2565,11 +2572,11 @@ def test_fullopt(self):

# Check expected opcount/oi
assert summary[('section1', None)].ops == 90
assert np.isclose(summary[('section1', None)].oi, 1.511, atol=0.001)
assert np.isclose(summary[('section1', None)].oi, 2.031, atol=0.001)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

seems way.. more accurate..

@FabioLuporini FabioLuporini merged commit 8e10af2 into master Dec 29, 2021
@FabioLuporini FabioLuporini deleted the tweaks-101010 branch December 29, 2021 13:35
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Integer overflow with Linearize.
3 participants