@DPeterK DPeterK commented Oct 26, 2017

We initially chose to make Iris default to using only one thread when processing dask graphs. This was done by setting global dask state with dask.set_options(). That is far from ideal, because it means Iris is changing dask state, which it should not do. This PR updates the way dask processing options are set so that the single-threaded default is maintained without changing dask state.
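
As a rough illustration of the approach described above, the sketch below resolves processing options at call time instead of writing to shared state. It deliberately does not import dask: `GLOBAL_OPTIONS` is a hypothetical stand-in for dask's global options, and `LIBRARY_DEFAULTS`, `resolve_options` and the `num_workers` key are illustrative names, not Iris or dask APIs.

```python
# Hypothetical stand-in for a third-party library's global option store.
GLOBAL_OPTIONS = {}

# Hypothetical library-level defaults (for example, "use a single thread").
LIBRARY_DEFAULTS = {"num_workers": 1}


def resolve_options(global_options, defaults):
    """Return the options to use for one call.

    Anything the user has set globally wins; the library defaults only
    fill the gaps. Neither input mapping is modified.
    """
    resolved = dict(defaults)
    resolved.update(global_options)
    return resolved


# No user setting: the library default applies and the global store stays empty.
opts = resolve_options(GLOBAL_OPTIONS, LIBRARY_DEFAULTS)

# A user setting takes precedence over the library default.
user_opts = resolve_options({"num_workers": 4}, LIBRARY_DEFAULTS)
```

The design point is that the merged options live only for the duration of a call, so a user who inspects or sets the global options never sees values they did not put there.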

@DPeterK DPeterK requested review from djkirkham and pelson October 26, 2017 15:30
pelson commented Oct 26, 2017

Closes #2739.

@pelson pelson added this to the v2.0.0 milestone Oct 26, 2017
"""
dask_opts = {}
if 'pool' not in dask.context._globals and \

I'd probably guard against accessing dask.context._globals directly. They could legitimately remove this and leave iris unable to load data... 🤒

(to be clear, in this context I simply mean getattr(dask.context, '_globals'))

Given the nature of this API (private), I'd also be tempted to add a test that asserts our assumption still holds: that options set through dask.set_options() continue to be stored in dask.context._globals.
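
The guard suggested in this thread could look roughly like the sketch below. It uses `types.SimpleNamespace` objects as stand-ins for `dask.context` so that it runs without dask, and `read_private_globals` is a hypothetical helper name. Note that `getattr` needs an explicit default to avoid raising `AttributeError` when the attribute is missing.

```python
from types import SimpleNamespace


def read_private_globals(context_module):
    """Return a copy of the module's private `_globals` mapping, or {} if absent."""
    # The third argument is the fallback value; without it, getattr raises
    # AttributeError when `_globals` does not exist.
    private_globals = getattr(context_module, "_globals", None)
    if private_globals is None:
        return {}
    return dict(private_globals)


# Stand-in for a dask version that still exposes `_globals`.
with_globals = SimpleNamespace(_globals={"pool": "user-pool"})
# Stand-in for a dask version where the private attribute has been removed.
without_globals = SimpleNamespace()

present = read_private_globals(with_globals)
missing = read_private_globals(without_globals)
```

A companion test in the same spirit would set an option through the public API and assert that it appears in the private mapping, so that a change in dask fails loudly in CI instead of silently at load time.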

DPeterK commented Oct 26, 2017

@pelson I had a sneaky extra 10min, so I implemented the changes you suggested 😎

@pelson pelson left a comment

Thanks for updating @dkillick. Good use of a sneaky 10 mins 😄

_iris_dask_defaults()
dask_opts = {}
dask_globals = getattr(dask.context, '_globals', None)
if dask_globals is not None:

So if dask change their dask.context._globals storage location, we let dask give us the default engine. Is that what is intended?

# We may need to unset a previously-set default.
if dask_opts.get('get') is not None:
    dask_opts = {key: value for key, value in dask_opts.items()
                 if key != 'get'}

I'm trying to understand why the get from the dask options is not used. I think this comes down to my not understanding the dask globals and what they actually mean...

@djkirkham djkirkham assigned pp-mo and unassigned djkirkham Oct 27, 2017
@djkirkham djkirkham removed their request for review October 27, 2017 14:09
pelson commented Oct 27, 2017

Deferring for now. #2879 removes the default change, and we can re-introduce a different default when we have a "smoking gun" (evidence of a need to diverge from dask's defaults).

@pelson pelson closed this Oct 27, 2017