
MPI fixes + smoother builtin #931

Merged
merged 11 commits into master from gaussian_smooth
Oct 23, 2019

Conversation

rhodrin
Contributor

@rhodrin rhodrin commented Sep 9, 2019

This PR:

  1. Includes some bug fixes associated with MPI data slicing, along with the relevant tests.
  2. Adds a Gaussian smoother builtin for Devito functions that mimics scipy.ndimage's gaussian_filter.

Note: Some changes and tidying are still required.
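For context, here is a minimal numpy-only sketch of the kind of truncated-Gaussian smoothing the builtin mimics. The function name, the truncation width `lw`, and the zero-padded 'constant' boundary handling are illustrative assumptions, not the PR's actual implementation:

```python
import numpy as np

def gaussian_smooth_1d(a, sigma=1.0, lw=4):
    # Truncated Gaussian weights over [-lw, lw], normalised to sum to 1
    x = np.linspace(-lw, lw, 2*lw + 1)
    w = np.exp(-0.5/sigma**2 * x**2)
    w /= w.sum()
    # 'constant' boundary mode: pad with zeros, then keep the valid part
    padded = np.pad(a, lw, mode='constant')
    return np.convolve(padded, w, mode='valid')

smoothed = gaussian_smooth_1d(np.ones(10), sigma=1.0)
```

Interior points of a constant field are preserved (the kernel is normalised), while points near the boundary are damped by the zero padding.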

@rhodrin rhodrin added the testing label Sep 9, 2019
devito/builtins.py (outdated)
"""
Gaussian smooth function.
"""
class DataDomain(dv.SubDomain):
Contributor

Is it worth moving this inside Devito?
BTW, I wouldn't call it "DataDomain" -- all subdomains are data domains after all ?

Contributor Author

@rhodrin rhodrin Sep 10, 2019

Is it worth moving this inside Devito?

Not at the moment, I don't think - if we're going to have some 'pre-set' subdomain methods, it's probably better to introduce them after a little more thought. Or is that not what you meant?

BTW, I wouldn't call it "DataDomain" -- all subdomains are data domains after all ?

Yeah, sure. I think ObjectiveDomain is a better name.

def define(self, dimensions):
return {d: ('middle', self.lw, self.lw) for d in dimensions}

def fill_data(g, f, dim, lw, mode):
Contributor

isn't this basically model/initialize_function? Would be a lot shorter/more readable to reuse it, IMO

Contributor Author

Will take a look, but I would have thought model/initialize_function can't do the 'reflect' conditions?

Contributor

I didn't include it because I didn't need it, but it's just a small change in the rhs of the equation. Currently it's Eq(u(d), u(npml)); you can just do Eq(u(d), u(npml+d)) or something like that and you get the reflect.
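As an aside, the 'reflect' condition under discussion mirrors interior values into the padded layer. In plain numpy terms (illustrative only; the PR builds this symbolically with Eq objects):

```python
import numpy as np

a = np.array([1, 2, 3, 4])
# Mirror interior points into a 2-point pad layer, without repeating the edge
padded = np.pad(a, 2, mode='reflect')
print(padded)  # [3 2 1 2 3 4 3 2]
```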

Contributor Author

Oki, will check it out.

Contributor

is it worth promoting initialize_function to a builtin ?


return

def subfloor(f, g):
Contributor

Probably don't need this with the right sizes + Operator rather than python fill_data

Contributor Author

Yeah, probably, will test.

weights = np.exp(-0.5/sigma**2*(np.linspace(-lw, lw, 2*lw+1))**2)
weights = weights/weights.sum()

for d in dims:
Contributor

You can do it in one go with a 3D kernel formed from the outer product of the 1D kernel and avoid the loop, since the 3D smooth is W[i,j,k] = w[i]*w[j]*w[k]; that needs one 3D convolution rather than multiple 1D ones with padding.
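The separability claim is easy to check numerically: contracting a patch against the outer-product kernel W gives the same value as applying the 1D kernel along each axis in turn. A sketch, with an arbitrary truncation width:

```python
import numpy as np

lw, sigma = 2, 1.0
x = np.linspace(-lw, lw, 2*lw + 1)
w = np.exp(-0.5/sigma**2 * x**2)
w /= w.sum()

# 3D kernel from the outer product of the 1D kernel: W[i,j,k] = w[i]*w[j]*w[k]
W = np.einsum('i,j,k->ijk', w, w, w)

rng = np.random.default_rng(0)
patch = rng.random((2*lw + 1,) * 3)

# One 3D stencil application at the centre point...
full = np.sum(W * patch)

# ...equals three successive 1D contractions, one per axis
sep = patch
for _ in range(3):
    sep = np.tensordot(sep, w, axes=([0], [0]))
```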

Contributor Author

Sure, but then the stencil is a square/cube (2d/3d). I think the symbolic coefficients stuff may need an update to get that to work, will double check. (It should of course be done that way, and will be at some point, but I don't think it'll work immediately which is why it's currently like this).

Contributor

Yes, but you can just do sum w[i]*f.subs(...) rather than symbolic coeffs, and do all dims at once.

@codecov-io

Codecov Report

Merging #931 into master will increase coverage by <.01%.
The diff coverage is 91.82%.


@@            Coverage Diff            @@
##           master    #931      +/-   ##
=========================================
+ Coverage   90.09%   90.1%   +<.01%     
=========================================
  Files         148     149       +1     
  Lines       19815   19991     +176     
  Branches     3080    3111      +31     
=========================================
+ Hits        17852   18012     +160     
- Misses       1596    1611      +15     
- Partials      367     368       +1
Impacted Files Coverage Δ
devito/types/dense.py 92.95% <ø> (-0.02%) ⬇️
devito/types/grid.py 93.12% <100%> (-0.04%) ⬇️
devito/data/data.py 96.93% <100%> (+0.01%) ⬆️
tests/test_builtins.py 100% <100%> (ø)
devito/types/dimension.py 93.64% <100%> (+0.01%) ⬆️
devito/data/utils.py 92.57% <100%> (+1.52%) ⬆️
tests/test_data.py 97.5% <100%> (+0.06%) ⬆️
devito/builtins.py 72.18% <80.68%> (+8.77%) ⬆️
devito/mpi/distributed.py 91.37% <0%> (+0.43%) ⬆️

Continue to review full report at Codecov.

Legend
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 3911117...10ff2d6. Read the comment docs.

@@ -176,7 +176,9 @@ def __getitem__(self, glb_idx, comm_type):
if comm_type is index_by_index:
# Retrieve the pertinent local data prior to mpi send/receive operations
data_idx = loc_data_idx(loc_idx)
self._index_stash = flip_idx(glb_idx, self._decomposition)
Contributor

Probably time to introduce a context manager

return as_tuple(processed)


def distributed_data_size(shape, coords, topology, nprocs):
Contributor

just to be sure: does this name mean "compute_distributed_data_size" ?

Contributor Author

It's to compute the 'cumulative distributed data size' (could maybe call it cum_dist_dat_size? Anyway, alternate name suggestions are welcome!). Quick example:
Rank 0-3 dat shapes are (2, 2) and grid is a square.
This function will return:

[[(2, 2), (2, 4)],
 [(4, 2), (4, 4)]]
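The example above can be reproduced in a few lines of numpy. This is an illustrative reconstruction of the return value, not the function's actual implementation:

```python
import numpy as np

# 2 x 2 processor topology; every rank owns a (2, 2) local block
topology = (2, 2)
local = (2, 2)

# Cumulative extent along each topology axis
rows = np.cumsum([local[0]] * topology[0])  # array([2, 4])
cols = np.cumsum([local[1]] * topology[1])  # array([2, 4])

cshape = [[(int(r), int(c)) for c in cols] for r in rows]
# [[(2, 2), (2, 4)], [(4, 2), (4, 4)]]
```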

@rhodrin rhodrin force-pushed the gaussian_smooth branch 2 times, most recently from 097ed68 to a41aa15 Compare September 18, 2019 21:15
thickness=lw)
dim_r = dv.SubDimension.right(name='abc_%s_r' % d.name, parent=d,
thickness=lw)
if mode == 'constant':
Contributor

aside from the to_copy variables, the construction of the eqs is pretty much identical in the two cases (constant, reflect). So we can probably trim down the bodies of this if-else ?

Contributor Author

Yes, can do.

return f


def initialize_function(function, data, nbpml, mode='constant'):
Contributor

Maybe this should be squashed with the assign builtin so as to create something unique and sufficiently general for all our needs ? otherwise, there's an overlap between these builtins...although their names are totally different.

Contributor Author

Yeah, should probably do something along those lines.


Parameters
----------
idx: Tuple of slices/ints/tuples
Contributor

numpydoc nitpicking:

idx : tuple of slices/ints/tuples
     ...

Contributor Author

Yeah, still need to fix a few docstrings.

"""
Compute the cumulative shape of the distributed data (cshape).

coords: Array containing the local shape of data to each rank.
Contributor

use numpydoc?

Parameters
-----------
...

@rhodrin rhodrin removed the testing label Sep 20, 2019
@FabioLuporini
Contributor

just noticed that basically this is a single commit with loads of changes in it 😬

@rhodrin
Contributor Author

rhodrin commented Oct 16, 2019

just noticed that basically this is a single commit with loads of changes in it

Oh. I've messed up a rebase somewhere. Will take a look.



def assign(f, v=0):
def assign(f, RHS=0, options=None, name='asign', **kwargs):
Contributor

typo (asign)

"""
Assign a value to a Function.
Assign a list of RHS's to a list of Function's.
Contributor

nitpicking: we don't use apostrophes for plurals such as Functions

Contributor

perhaps it's now worth adding a couple examples?

Contributor Author

Yes, there will be + quite a few tests.

The left-hand side of the assignment.
v : scalar, optional
RHS : expression or list of expression's, optional
Contributor

expression -> expr-like

"""
dv.Operator(dv.Eq(f, v), name='assign')()
if not isinstance(f, list):
Contributor

does it work with

f = as_tuple(f) 

?

Contributor

or use the new as_list ?

rhs.append(function.subs({d: subsr}))
options.extend([None, None])

if bool(additional_expressions):
Contributor

pythonicity : no need for bool

if not isinstance(RHS, list):
RHS = [RHS, ]*len(f)
eqs = []
if bool(options):
Contributor

pythonicity : no need for bool

expr['options'] = options
additional_expressions[d] = expr

if isinstance(f, dv.Function):
Contributor

perhaps this if-else should be inside initialize-function ? ie, it's up to initialize-function to do the right thing based on whether the second argument is of type numpy.ndarray or Function?

rhs = []
options = []

for d in function.dimensions:
Contributor

@mloubout
Note that this won't work with TimeFunctions

perhaps we should use space_dimensions instead of dimensions ?

or am I missing something?

Contributor

reminder

Contributor

@FabioLuporini FabioLuporini left a comment

I've just added some more nitpicking, but I understand this is pretty much ready to go ?

@mloubout can you give a final look?


Parameters
----------
f : Function
f : Function or list of Function's
Contributor

nitpicking: Functions



def assign(f, v=0):
def assign(f, RHS=0, options=None, name='assign', **kwargs):
Contributor

'rhs', not 'RHS': not sure it should be all capital letters


additional_expressions = {}
for d in f_c.dimensions:
expr = {}
Contributor

nitpicking: you probably don't need this one (see below)

expr['lhs'] = lhs
expr['rhs'] = rhs
expr['options'] = options
additional_expressions[d] = expr
Contributor

additional_expressions[d] = {'lhs': lhs, 'rhs': rhs, 'options': options}


b = [min(l) for l in (w for w in (buff(i, j) for i, j in zip(local_size, halo)))]
if any(np.array(b) < 0):
raise ValueError("Function's halo is not sufficiently thick.")
Contributor

Best if we say which Function

"Function `%s`  ... " % function

devito/builtins.py
weights = np.exp(-0.5/sigma**2*(np.linspace(-lw, lw, 2*lw+1))**2)
weights = weights/weights.sum()

additional_expressions = {}
Contributor

nitpicking: for this sort of thing, throughout Devito we use the conventional name "mapper"

if mode == 'reflect' and function.grid.distributor.is_parallel:
# Check that HALO size is appropriate
halo = function.halo
local_size = function.grid.distributor.shape
Contributor

why not just function.shape ?

@rhodrin
Contributor Author

rhodrin commented Oct 21, 2019

I've just added some more nitpicking, but I understand this is pretty much ready to go ?

Yes, should be.
Will address the comments above with a new commit shortly.

…with negative step.

devito/data/data.py: Update slice step normalisation, devito/data/decomposition.py: Modify the manner in which negative slices are dealt with.

test_data.py: Fix 'test_advanced_indexing' hopefully.

Fix test.

Some not final changes to ensure we now get tests to pass.

tests/test_data.py: Move slice tests with negative step to their own 'tests' and skip them if yask enabled.

Remove unused import.

test_data.py: Seeing what works and what doesn't.

data.py: Breaking for testing purposes.

Example update.

Edit notebook.

data.py; decomposition.py: Some fixes. f.data[::-1, ::-1] ordering still broken.

Start clean up.

data.py, test_data.py: Enable slicing with step < 0 when MPI is enabled and add test for this functionality.

data.py: Modifications to allow slicing with MPI. Code is in a mess but working for basic operations. Still need to fix for some advanced slicing.

data.py: Start updating __getitem__. Still not working as intended.

data.py: tweaking.

Fix an error.

Misc. fixes.

Breaking it.

data.py: More fixes.

Start tidying.

Fixes + tidying.

Testing.

This is painful.

data.py: A mess, but progress. Many tests passing now, lots of tidying and simplifying to do.

data.py, test_data.py: Some updates + adding more tests.

flake8.

decomposition.py, data.py: Add fixes for step sizes > 1. Currently incomplete and some cases will still fail.

data.py, decomposition.py: Fixes for negative slicing with step < -1.

decomposition.py: Fix formatting error. test_data.py: Add tests with step sizes > 1.

data.py, decomposition.py: __setitem__ fix for when (with mpi_slicing) the dim of val is 1.

Data: Fix for cases with dimension > 2. Add more tests.

Fix.

test_data.py: Add tests for when dim(slice) < dim(Data).

data.py: Some tidying.

data.py: Start moving code to new methods.

data.py: Start moving code to new methods.

data: Further tidying + trimming.

data: Minor tidying + add one new test.

Bug testing.

data: Start adding support for mpi slicing with step size > 2 + appropriate tests. One important test still broken owing to bug.

flake8

data: Working on final test.

data: Fix bug and add another test. Most tests in place now, lots of tidying + re-structuring still required.

test_data: Update test_convert_index + add local to global tests.

Add notebook before rebase to my other branch.

decomposition.py: Fix bug for negative step, test_data.py: Add tests with negative step.

flake8.

Tidying (no important changes).

Attempt to fix test.

test_data.py: Remove test that's no longer needed.

data: Tidying.

data: Further tidying. Note: Still some things to address (see Fabio's comments above) but approaching complete code.

data.py: Change mpi_slicing to SLICING tag.

data.py: Fix rebase errors.

data, test_data: Further rebase fixes.

data.py: Update docstrings.

Add gaussian_smooth fn + example.

Remove needless expression

Fix.

Add gaussian_smooth fn + example.

Remove needless expression

Fix.

builtins.py: Begin adding boundary conditions to the Gaussian filter.

Tinkering.

builtins.py: Modify gaussian smoother. Now returns the same results as numpy in the examples tested.

Add builtins test file.

updates.

Working on MPI.

Bug fixing. Delete.

Testing. Remove.

MPI slicing fixes. Smoother now working.

Update.

Add fixme reminder.

Fix tests for 'N' procs.

Enable N-dims.

grid, dimension: Memoize _arg_defaults for performance.

grid.py: Change _arg_defaults decorator from memoized_meth -> memoized_func.

Bug fix.

data: Add test and fixme.

data.utils: Fix flip for slices with negative start/stops.

builtins: Improve index creation. grid: Change _arg_defaults mpi test.

grid: Remove commented out import.

Tidying.

Tidying.

test_builtins: Skip mpi test if no mpi.

No need for list

data.utils: Add docstrings.

builtins: Move init func to builtins. Start tidying Gaussian smooth.

Making test function.

builtins: 'New' Gaussian smoother fixes.

builtins: Minor tidying of gaussian_smooth + initialize_function.

Minor tidying.

data.utils: Fix docstrings.

types.dense: Add note _arg_defaults note. test_mpi: Add test.

test_mpi.py: Fix rebase error.
…builtins: Minor tidying.

test_builtins: Start adding 'assign' tests.

dle.transformer: Try another 'noop' override config.

dle.transformer: Switch the strings, stupid!
>>> import numpy as np
>>> from devito import Grid, SubDomain, Function, initialize_function

Create a subdomain representing the `inner` region:
Contributor

this is actually grid.interior ?

def initialize_function(function, data, nbl, mapper=dict(), mode='constant',
name='padfunc'):
"""
Initialize a `Function` with the given ``data``. ``data``
Contributor

we don't use backticks for types (so `Function` -> Function)

----------
function : Function
The initialised object.
data : ndarray of Function
Contributor

*or ?

return f


def initialize_function(function, data, nbl, mapper=dict(), mode='constant',
Contributor

never use mutable objects as default values. Just google it and you'll find loads of explanations, for example here
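The pitfall being flagged: a default like mapper=dict() is evaluated once, at function-definition time, so every call that omits the argument shares the same dict object. A minimal demonstration, with the conventional None-default fix (function names are illustrative):

```python
def leaky(mapper=dict()):
    # BUG: the same default dict object is reused across all calls
    mapper['calls'] = mapper.get('calls', 0) + 1
    return mapper

def safe(mapper=None):
    # Fresh dict per call: the conventional fix
    mapper = {} if mapper is None else mapper
    mapper['calls'] = mapper.get('calls', 0) + 1
    return mapper

leaky()
print(leaky()['calls'])  # 2 -- state leaked from the first call
safe()
print(safe()['calls'])   # 1 -- each call starts clean
```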

nbl : int
Number of outer layers (such as PML layers for boundary damping).
mapper : dict, optional
Dictionary containing, for each dimension of function, a sub-dictionary
Contributor

here you probably mean `function` as you're referring to a specific variable name

rhs = []
options = []

for d in function.dimensions:
Contributor

reminder

"""
dv.Operator(dv.Eq(f, v), name='assign')()
if not isinstance(f, list):
Contributor

or use the new as_list ?

rhs_extra = exprs['rhs']
lhs.extend(as_list(lhs_extra))
rhs.extend(as_list(rhs_extra))
if 'options' in exprs.keys() and exprs['options']:
Contributor

no need for .keys()

options_extra = exprs['options']
else:
options_extra = len(as_list(lhs_extra))*[None, ]
if isinstance(options_extra, list):
Contributor

you may be able to use as_list again, instead of this if-else ?

Contributor Author

Not in this case, since it could be a single dictionary, which the current versions of as_tuple and as_list mess up. But I'm not sure if this behavior was intended or is a bug?
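The "mess up" likely refers to how iterating a dict yields only its keys, so a generic to-tuple helper that falls back on tuple(item) silently drops the values. A plain-Python illustration (this is a hypothetical helper in the spirit of as_tuple, not Devito's actual implementation):

```python
def as_tuple_sketch(item):
    # Hypothetical helper: wrap scalars, pass iterables through tuple() --
    # which is exactly what bites for dicts, since tuple(d) keeps only keys
    if item is None:
        return ()
    try:
        return tuple(item)
    except TypeError:
        return (item,)

print(as_tuple_sketch({'lhs': 1, 'rhs': 2}))  # ('lhs', 'rhs') -- values lost
print(as_tuple_sketch(3))                     # (3,)
```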

devito/tools/utils.py
rhs_extra = exprs['rhs']
lhs.extend(as_list(lhs_extra))
rhs.extend(as_list(rhs_extra))
if 'options' in exprs and exprs['options']:
Contributor

the pythonic way to write these 4 lines would be

options_extra = exprs.get('options', len(as_list(lhs_extra))*[None, ])

>>> grid = Grid(shape=(6, 6))
>>> x, y = grid.dimensions

Create the function we wish to set along with the data to set it:
Contributor

Function

@FabioLuporini FabioLuporini merged commit c4ea844 into master Oct 23, 2019
@FabioLuporini FabioLuporini deleted the gaussian_smooth branch October 23, 2019 07:57
@FabioLuporini
Contributor

merged

4 participants