compiler: Rework HaloSpot optimization #2477

Open
wants to merge 16 commits into
base: master
from

Conversation

georgebisbas
Contributor

Fix for issue #2448, supplemented with some code edits

@georgebisbas georgebisbas added MPI mpi-related compiler labels Nov 4, 2024
@georgebisbas georgebisbas self-assigned this Nov 4, 2024
@georgebisbas georgebisbas linked an issue Nov 4, 2024 that may be closed by this pull request
codecov bot commented Nov 4, 2024

Codecov Report

Attention: Patch coverage is 99.20319% with 2 lines in your changes missing coverage. Please review.

Project coverage is 87.35%. Comparing base (41d04ed) to head (309dd1b).

Files with missing lines Patch % Lines
devito/passes/iet/mpi.py 96.77% 1 Missing and 1 partial ⚠️
Additional details and impacted files
@@            Coverage Diff             @@
##           master    #2477      +/-   ##
==========================================
+ Coverage   87.29%   87.35%   +0.06%     
==========================================
  Files         238      238              
  Lines       45382    45561     +179     
  Branches     4031     4034       +3     
==========================================
+ Hits        39616    39802     +186     
+ Misses       5083     5078       -5     
+ Partials      683      681       -2     

frozenset(self.halos),
frozenset(self.dims)))

def rebuild(self, **kwargs):
Contributor

Subclass Reconstructable and add rargs?

@@ -95,8 +127,16 @@ def __init__(self, exprs, ispace):
self._honored = frozendict(self._honored)

def __repr__(self):
Contributor

Could this be attached to a mixin class for both HaloSpot and HaloSchemeEntry?

Contributor

yes but where would you add it?

I would just drop all these repr it's a lot of code for what at the end of the day?

Contributor Author

When you print the IET, you see more useful details, it makes dev/debug life easier!

Contributor

anyway, Ed has a point, this is redundant code which should somehow be factorized (a private method of HaloScheme, also called by HaloSpot.__repr__? not sure)

Contributor Author

I was also thinking of this, but fmapper is the problem, we need to drop it from the entry? and thus be a class itself?
needs some thought, I may restructure more

Contributor

I don't understand your last comment

Contributor Author

What I am trying to say is that in order to factor code out, someone has to factor out the fmapper/functions, but I had failed to do it. Now I think I did it, fix coming

@@ -28,7 +28,39 @@ class HaloLabel(Tag):
STENCIL = HaloLabel('stencil')


HaloSchemeEntry = namedtuple('HaloSchemeEntry', 'loc_indices loc_dirs halos dims')
class HaloSchemeEntry:
Contributor

as Ed wrote below, this should inherit from Reconstructable

HaloSchemeEntry = namedtuple('HaloSchemeEntry', 'loc_indices loc_dirs halos dims')
class HaloSchemeEntry:

def __init__(self, loc_indices, loc_dirs, halos, dims):
Contributor

can it not inherit, alternatively, from EnrichedTuple?

Contributor Author

There are some 'getters' that look redundant?
Reconstructable seems to work fine

Contributor Author

Is that ok now?

Contributor

that look redundant

with EnrichedTuple, once you properly override __rargs__ and __rkwargs__, you probably don't need to override __init__ and __eq__

Contributor Author

I have failed to do so, so far; the existing code of DimensionTuple does not seem to help me...

Contributor

so what's the issue?

There's a lot of redundant code here -- we should use EnrichedTuple

Then:

  • you shouldn't need to override __eq__
  • you shouldn't need to override __repr__
  • I have some doubts about overriding __hash__, but maybe also see comment below

Contributor Author

I still do not understand this. Should I still be able to construct HaloSchemeEntry the same way I do it now?
Will I need to specify getters?
Like e.g. here:

    def glb_shape(self):
        """Shape of the decomposed domain."""
        return EnrichedTuple(*self._glb_shape, getters=self.dimensions)

Contributor

Should I still be able to construct HaloSchemeEntry the same way I do it now?

what happens if you try (so yes, w/o specifying any getters)

Contributor Author

ok, I think I managed to do it, not sure what I was doing wrong earlier

# a stopper to halo merging
# Loop over the functions in the HaloSpots
for f, v in hs1.fmapper.items():
# If no time accesses, skip
Contributor

"E.g., if no time accesses, skip"

you have to say it's an example...

Contributor Author

What do you mean by 'say it's an example'?

return not any(d in hs.dimensions or dep.distance_mapper[d] is S.Infinity
for d in dep.cause)
# If the function is not in both HaloSpots, skip
if (*hs0.functions, *hs1.functions).count(f) < 2:
Contributor

.intersection(...), again as mentioned above?

Contributor Author

not needed anymore
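
For illustration, the set-based check suggested above could look like the sketch below. It is not taken from the PR: `funcs0` and `funcs1` are hypothetical stand-ins for `hs0.functions` and `hs1.functions`, and plain strings replace the Devito function objects.

```python
# Hypothetical stand-ins for hs0.functions and hs1.functions
funcs0 = ('u', 'v')
funcs1 = ('v', 'w')

# Counting occurrences in the concatenated tuple, as in the diff under review
shared_by_count = [f for f in funcs1 if (*funcs0, *funcs1).count(f) >= 2]

# The same question asked with a set intersection
shared_by_set = set(funcs0).intersection(funcs1)

print(shared_by_count)
print(sorted(shared_by_set))
```

The two forms agree as long as neither tuple lists the same function twice; the intersection states the intent ("present in both") more directly.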


candidates = [i.dim._defines for i in iters[n:]]
for hs1 in halo_spots[i+1:]:
Contributor

maybe you could avoid an indentation level by considering all pairs instead

from itertools import combinations

...

for hs0, hs1 in combinations(halo_schemes, 2):
    ...

Contributor

btw, are we sure that the current code works if there are 2 hoisting opportunities, instead of 1 ?

could we add a test?

Contributor Author

I am working on this now

Contributor Author

done
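
As a rough sketch of the `combinations` suggestion in this thread, the snippet below compares the nested-loop form with the flat one. The list `halo_spots` holds hypothetical placeholder strings, not actual HaloSpot nodes.

```python
from itertools import combinations

# Hypothetical placeholders standing in for HaloSpot nodes
halo_spots = ['hs0', 'hs1', 'hs2']

# Nested form: each spot paired with every later spot
nested = []
for i, hs0 in enumerate(halo_spots):
    for hs1 in halo_spots[i+1:]:
        nested.append((hs0, hs1))

# Flat form: one loop level less, same pairs in the same order
flat = list(combinations(halo_spots, 2))

print(nested == flat)
```

Both forms visit each unordered pair once, so the loop body can stay unchanged when switching between them.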

@georgebisbas georgebisbas force-pushed the halo_opt_revamp_II branch 4 times, most recently from ca5251a to 5951eaf Compare November 8, 2024 11:04
hse = HaloSchemeEntry(frozendict(loc_indices),
frozendict(loc_dirs),
hse0.halos, hse0.dims)
hse = hse0._rebuild(loc_indices=frozendict(loc_indices),
Contributor

maybe the cast to frozendict should be moved to inside HaloSchemeEntry.__init__

Contributor Author

done

Contributor Author

should it be for all attributes?
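
One way to picture "freeze everything inside `__init__`" is the minimal sketch below. It is an assumption-laden illustration, not Devito's implementation: `Entry` is a hypothetical stand-in for `HaloSchemeEntry`, and `types.MappingProxyType` over a private copy stands in for Devito's `frozendict`.

```python
from types import MappingProxyType


class Entry:
    """Hypothetical stand-in for HaloSchemeEntry (not Devito's class)."""

    def __init__(self, loc_indices, loc_dirs, halos, dims):
        # Freeze every argument once, here, so that callers can pass
        # plain dicts, sets or lists without casting them first
        self.loc_indices = MappingProxyType(dict(loc_indices))
        self.loc_dirs = MappingProxyType(dict(loc_dirs))
        self.halos = frozenset(halos)
        self.dims = frozenset(dims)


entry = Entry({'t': 0}, {'t': 'forward'}, {('x', 'left')}, ['x'])

try:
    entry.loc_indices['t'] = 1
except TypeError:
    print("loc_indices is read-only")
```

With the casts centralized like this, call sites such as `process_loc_indices` would not need to return frozen containers themselves.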

for hs, v in cond_mapper.items()}
cond_mapper = _make_cond_mapper(iet)

mapper = {}

iter_mapper = MapNodes(Iteration, HaloSpot, 'immediate').visit(iet)
Contributor

you may introduce a private function to get this iter_mapper too, since you're doing the same exact filtering that appears in hoist_invariant

Contributor

now that I think about it, you can probably also move the cond_mapper inside such a new utility function, and do the conditional filtering in there directly, rather than repeating it in both passes (it's basically the same stuff repeated at the moment)

Contributor Author

I think this can be resolved

if i is None or len(halo_spots) <= 1:
continue
# Drop pairs that have keys that are None
iter_mapper = {k: v for k, v in iter_mapper.items() if k is not None}
Contributor

This whole section looks like a plain copy-paste of lines 89-119, especially the loop that follows, the iter_mapper filtering and the conditions. Can't this be merged or lifted into some common piece?

Contributor Author

simplified a bit

iter_mapper = {k: v for k, v in iter_mapper.items() if len(v) > 1}

for it, halo_spots in iter_mapper.items():

Contributor Author

dropped

if ensure_control_flow(hs0, hs1, cond_mapper):
continue

# If there are overlapping time accesses, skip
Contributor Author

changed, is it better?

Contributor

@EdCaunt EdCaunt left a comment

Few minor comments. Looks very good to me


# Test two variants of receiver interpolation
nrec = 1
rec = SparseTimeFunction(name="rec", grid=grid, npoint=nrec, nt=tn)
Contributor

Nitpick: You could remove some lines by having npoint=1, nt=30 in here. You can do similar stuff throughout this test (for example Grid(shape=(2,))).

Contributor Author

For the rest ok, for shape maybe better to stay as is

Contributor

@FabioLuporini FabioLuporini left a comment

A few more comments in, but it's surely getting closer and closer to a mergeable state

@@ -556,7 +596,7 @@ def process_loc_indices(raw_loc_indices, directions):
known = set().union(*[i._defines for i in loc_indices])
loc_dirs = {d: v for d, v in directions.items() if d in known}

return frozendict(loc_indices), frozendict(loc_dirs)
return loc_indices, loc_dirs
Contributor

why?

Contributor

ping

Contributor Author

because the returned loc_indices and loc_dirs go directly to the constructor, where they are frozen inside HaloSchemeEntry's __init__ (I thought this was enough?)

Contributor

ok so these two lines above

        halos = frozenset(halos)
        dims = frozenset(dims)

are droppable as well?

Contributor Author

I would say yes!

raw_loc_indices = {}

for d in hse.loc_indices:
md = hse.loc_indices[d]
Contributor

you can spare one line

for d, md in hse.loc_indices.items():
    ...

Does md stand for ModuloDimension? are we sure (I really don't remember) that at this point it can only be a ModuloDimension ? I don't think so actually, even though in most cases it will be. Anyway, it might be more prudent to call it v

Contributor Author

You are right!

md_sub = it.start
raw_loc_indices[d] = md.symbolic_min.subs(it.dim, md_sub)
else:
raw_loc_indices[d] = md
Contributor

what's this case?

Contributor Author

It was only this tutorial that was failing:

https://github.com/devitocodes/devito/blob/master/examples/seismic/tutorials/12_time_blocking.ipynb

need to check closer if I can factor out some mfe I guess

class TestElastic:

@pytest.mark.parallel(mode=[(1, 'diag')])
def test_elastic_structure(self, mode):
Contributor

at first glance, it looks like there's a lot of redundant checks that also appear in the huge test above... but I may be wrong

Contributor Author

well... I think that there are no similar checks in other tests.
And this structure is something that we really need to ensure is tested.

Contributor Author

We check not only where haloupdates are placed, but also functions in each one of them...something that could possibly be affected by future work in halo-related optimizations.

assert calls[4].arguments[1] is v[1]


class TestTTIwMPI:
Contributor

you can drop "wMPI"

Contributor Author

yes

Contributor Author

@georgebisbas georgebisbas left a comment

I realized that I had added comments that were not posted, only drafted... releasing them + some new ones

@@ -64,4 +64,4 @@ def ForwardOperator(model, geometry, space_order=4, save=False, **kwargs):

# Substitute spacing terms to reduce flops
return Operator([u_v, u_r, u_t] + src_rec_expr, subs=model.spacing_map,
name='Forward', **kwargs)
name='ViscoElForward', **kwargs)
Contributor Author

same way

Contributor Author

?

Contributor

the new names aren't consistent since here you don't specify the type of media

Contributor Author

ViscoIsoElForward

Contributor Author

@georgebisbas georgebisbas left a comment

More comments and fixes to follow tomorrow

return "<%s(%s)>" % (self.__class__.__name__, functions)


class HaloSpot(HaloMixin, Node):
Contributor

after re-thinking about this change, I struggle to understand.

  • HaloScheme (in mpi.py) has this new __repr__
  • HaloSpot (here) carries a halo_scheme
  • Hence, couldn't HaloSpot.__repr__ simply resort to self.halo_scheme.__repr__() (or a wrapper of it if you want to add more information)

I think this thing that the same exact representation is used in two different places doesn't make much sense

Contributor Author

I did a trick to emit the same repr.
Lemme know what you think

self.halos == other.halos and
self.dims == other.dims)

def __hash__(self):
Contributor

why do you need to use all these frozensets?

if halos and dims are mutable, they should be turned into immutable inside __init__

Contributor Author

is this fine now, or do I still not understand?

# The correct we want
assert len(calls) == 5

assert len(FindNodes(HaloUpdateCall).visit(op.body.body[1].body[1].body[0])) == 1
Contributor

are you saying these are not covered already in the new test_issue_* that you've added above?

it seems to be the same dependency pattern, hence a redundant test...

Contributor Author

What is better here is that it has more dimensions, closer to the >1d example rather than the 1d MFE of test_issue_*.
This also checks that merge_halospots does the right thing, as in haloupdate(v_x, v_y....) and not only v_x....

@georgebisbas georgebisbas force-pushed the halo_opt_revamp_II branch 3 times, most recently from 3c19ade to 2fbed4d Compare December 9, 2024 19:29
@@ -1492,6 +1495,9 @@ def body(self):
def functions(self):
return tuple(self.fmapper)

def __repr__(self):
Contributor

should stay below __init__, just like all other special __X methods


Halo = namedtuple('Halo', 'dim side')

OMapper = namedtuple('OMapper', 'core owned')


class HaloScheme:
class HaloScheme():
Contributor

avoid

Contributor Author

yes

@@ -1492,6 +1495,9 @@ def body(self):
def functions(self):
return tuple(self.fmapper)

def __repr__(self):
funcs = self.halo_scheme.__reprfuncs__()
Contributor

again, it's a useless complication. Just return something along the lines of HaloSpot(f,g) and if the user really cares about the underlying halo_scheme, they can always do hs.halo_scheme and this way get the textual information they are after

so, IOW, drop __reprfuncs__

Contributor Author

ok, then I can just return to the old one I guess
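
A minimal sketch of the simpler `__repr__` arrangement discussed in this thread follows. The classes `Scheme` and `Spot` are hypothetical stand-ins for `HaloScheme` and `HaloSpot`, and the string keys replace real Devito functions; the format string mirrors the `"<%s(%s)>"` pattern visible in the diff.

```python
class Scheme:
    """Hypothetical stand-in for HaloScheme."""

    def __init__(self, fmapper):
        self.fmapper = fmapper

    def __repr__(self):
        # The scheme is free to print as much detail as it likes
        return "<%s(%s)>" % (self.__class__.__name__, ",".join(self.fmapper))


class Spot:
    """Hypothetical stand-in for HaloSpot, which carries a halo scheme."""

    def __init__(self, halo_scheme):
        self.halo_scheme = halo_scheme

    @property
    def functions(self):
        return tuple(self.halo_scheme.fmapper)

    def __repr__(self):
        # Keep it short: only the function names, no shared helper needed
        return "<%s(%s)>" % (self.__class__.__name__, ",".join(self.functions))


hs = Spot(Scheme({'f': None, 'g': None}))
print(repr(hs))
print(repr(hs.halo_scheme))
```

Anyone who wants the richer description can still reach it through `hs.halo_scheme`, which is the point made by the reviewer.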

@@ -93,9 +120,17 @@ def __init__(self, exprs, ispace):
self._honored[i.root] = frozenset([(ltk, rtk)])
self._honored = frozendict(self._honored)

def __reprfuncs__(self):
Contributor

as explained above, avoid; put it back inside __repr__

Contributor Author

dropped

else:
raw_loc_indices[d] = v

hse = hse._rebuild(loc_indices=raw_loc_indices)
Contributor

again you might not need this extra variable if you just use loc_indices

Contributor Author

I think it is good here to differentiate since we have both raw and processed



def _make_cond_mapper(iet):

Contributor

no blank line; add simple docstring

Contributor Author

ok

for hs, v in MapHaloSpots().visit(iet).items():
conditionals = set()
for i in v:
if i.is_Conditional and not isinstance(i.condition, GuardFactorEq):
Contributor

this smells like a hack. What are you trying to accomplish here?

Is this a recent addition or is it something I've missed all along until now?

also, it should have been a set comprehension --

conditionals = {i for i in v if i.is_Conditional ...}

But again, this line seems a bit dodgy to me

Contributor Author

It is only factored-out code; I did not write this, just to avoid duplication.

Contributor Author

Did comprehension
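
For reference, the loop-to-comprehension rewrite requested in this thread can be sketched as below. `Node` is a hypothetical class carrying only the `is_Conditional` flag; the extra `GuardFactorEq` condition from the diff is left out because its semantics are not shown here.

```python
class Node:
    """Hypothetical IET-like node carrying only the flag used below."""

    def __init__(self, name, is_Conditional):
        self.name = name
        self.is_Conditional = is_Conditional


nodes = [Node('a', True), Node('b', False), Node('c', True)]

# Explicit loop, as in the factored-out code
conditionals_loop = set()
for i in nodes:
    if i.is_Conditional:
        conditionals_loop.add(i)

# Equivalent set comprehension
conditionals = {i for i in nodes if i.is_Conditional}

print(sorted(i.name for i in conditionals))
```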

@@ -1220,7 +1223,7 @@ def test_avoid_fullmode_if_crossloop_dep(self, mode):
assert np.all(f.data[:] == 2.)

@pytest.mark.parallel(mode=2)
def test_avoid_haloudate_if_flowdep_along_other_dim(self, mode):
def test_avoid_haloupdate_if_flowdep_along_other_dim(self, mode):
Contributor

why is this being tagged as a diff?

Contributor Author

typo in haloupdate, letter p missing

@@ -1333,8 +1336,10 @@ def test_merge_haloupdate_if_diff_locindices_v1(self, mode):

* the second and third Eqs cannot be fused in the same loop

In the IET we end up with *one* HaloSpots, placed right before the
second Eq. The third Eq will seamlessy find its halo up-to-date.
In the IET we end up with *two* HaloSpots, one placed before the
Contributor

does the name of the test still make sense or should it change as well?

Contributor Author

right, renamed

Labels
compiler MPI mpi-related
Projects
None yet
Development

Successfully merging this pull request may close these issues.

compiler: Redundant haloupdate
4 participants