
Fix sharedarray indexing regression #12964

Merged 1 commit into JuliaLang:master on Sep 7, 2015
Conversation

@rened (Member) commented Sep 5, 2015

#12560 got rid of the pass-through getindex/setindex! definitions for SharedArray. As a result, getindex(::SharedArray, ...) always creates a new shared array, which quickly exhausts the available file handles. Instead, plain Arrays should be returned. This PR reintroduces the old behavior.

To trigger the problem on current master:

```julia
d = SharedArray(Float64, (2,3))   # 0.4-era constructor syntax
for x in 1:100000
    @show x
    d[:,2]   # each slice allocates a new SharedArray, leaking a file handle
end
```

I think this is a blocker for 0.4, as SharedArrays are totally unusable in the current state.
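For context, the pass-through idea can be sketched with a self-contained stand-in type (hypothetical names; the actual PR edits Base's SharedArray code): delegating indexing to the wrapped local Array means a slice comes back as a plain Array, so no new shared segment, and hence no file handle, is created.

```julia
# Hedged sketch with a stand-in wrapper type, not the actual PR diff.
struct PassThroughSA{T,N}
    s::Array{T,N}   # stand-in for the locally mapped shared memory
end

# Pass indexing straight through to the local Array: non-scalar
# indexing now returns plain Arrays instead of new shared arrays.
Base.getindex(A::PassThroughSA, I...) = getindex(A.s, I...)
Base.setindex!(A::PassThroughSA, v, I...) = setindex!(A.s, v, I...)

d = PassThroughSA(zeros(2, 3))
col = d[:, 2]
@assert col isa Vector{Float64}   # a plain Array, not a PassThroughSA
```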
cc @timholy @amitmurthy

@@ -97,6 +97,11 @@ for p in procs(d)
@test d[idxl] == p
end

d = SharedArray(Float64, (2,3))
for x in 1:100000
d[:,2]
A reviewer (Member) commented on the diff:

This isn't quite a real test, since it doesn't explicitly fail for the reason you want it to.

How about testing the return type?
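A return-type assertion along these lines might look like this sketch; note it uses the modern SharedArrays stdlib constructor, not the 0.4-era SharedArray(Float64, (2,3)) syntax from this PR:

```julia
using SharedArrays, Test

d = SharedArray{Float64}(2, 3)   # modern stdlib constructor syntax
# With the fix, non-scalar indexing returns a plain Array, so this test
# fails for the right reason (wrong type) instead of by exhausting
# file descriptors.
@test d[:, 2] isa Vector{Float64}
@test !(d[:, 2] isa SharedArray)
```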

@rened (author) replied:

That would of course be the other, better way of testing this, agreed. This way you just run out of file descriptors and crash, I guess...

@timholy (Member) commented Sep 5, 2015

Good catch. They are of course fine if you always just index scalars, but I agree this should be fixed.

This is one way to fix it; another would be to have similar return an Array. Any preferences?
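The similar-based alternative can be sketched with a hypothetical AbstractArray wrapper (stand-in names, not the Base implementation): once similar allocates a plain Array, generic code that builds its result through similar, including the non-scalar getindex fallback, returns Arrays.

```julia
# Hedged sketch: hook similar so results allocated through it are Arrays.
struct SimilarDemoSA{T,N} <: AbstractArray{T,N}
    s::Array{T,N}   # stand-in for the locally mapped shared memory
end
Base.size(A::SimilarDemoSA) = size(A.s)
Base.getindex(A::SimilarDemoSA, i::Int...) = A.s[i...]
# Allocate a plain Array instead of another "shared" array:
Base.similar(A::SimilarDemoSA, ::Type{T}, dims::Dims) where {T} =
    Array{T}(undef, dims)

A = SimilarDemoSA(ones(2, 3))
@assert similar(A) isa Matrix{Float64}
@assert A[:, 2] isa Vector{Float64}   # fallback getindex builds via similar
```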

@rened (author) commented Sep 5, 2015

Regarding similar: sorry, I don't know that part of Julia's array design well enough yet. Whatever you think is best.

@rened (author) commented Sep 5, 2015

Thinking about it, I believe it's better to leave similar returning a SharedArray and to use the pass-through in getindex/setindex!. There are valid use cases for wanting a similar shared array.

@timholy (Member) commented Sep 5, 2015

There are, but on the other hand it means that A+3, cumsum, etc will return SharedArrays. That could be good or bad, depending on whether we want to encourage people to call those functions on SharedArrays. (It seems more likely you'd want to call them on the "assigned chunk.")

@kshyatt added the labels domain:parallelism (Parallel or distributed computation) and kind:regression (Regression in behavior compared to a previous version) on Sep 6, 2015.
@rened (author) commented Sep 6, 2015

True. similar seems better. I am changing and testing this now.

@rened (author) commented Sep 6, 2015

Implemented now by making similar always return a non-shared array. This works, except for the unit test for #6362 in https://github.com/rened/julia/blob/shgetindex/test/parallel.jl#L184-L192: now the result of copy or deepcopy is no longer a shared array.

Also, there is now no clean way of getting a similar shared array (one that, e.g., lies on the same pids).

I think it is good that working on a shared array does not implicitly create new shared arrays; that should be an explicit operation. Losing the ability to get an actually similar shared array, however, needs to be mitigated somehow. @timholy, do you know a way around this?

@tkelman (Contributor) commented Sep 6, 2015

I'm of the opinion that similar(::SharedArray) should return a SharedArray.

@timholy (Member) commented Sep 6, 2015

You can always get a shared array with SharedArray(eltype(A), size(A)).

I am torn about changing similar, but on balance lean towards making it return an Array. The main reason is that only one process (typically, the one with myid() == 1) should be creating a SharedArray: worker computations should not result in multiple copies springing up like daisies.
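With today's SharedArrays stdlib (constructor syntax differs from the 0.4-era calls in this thread), the distinction reads roughly as follows; treat this as an illustrative sketch:

```julia
using SharedArrays

A = SharedArray{Float64}(2, 3)
B = SharedArray{eltype(A)}(size(A))   # explicit: a new shared array, same shape
C = similar(A)                        # after this PR: a plain, non-shared Array

@assert B isa SharedArray
@assert C isa Array && !(C isa SharedArray)
```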

@rened (author) commented Sep 6, 2015

I believe this is the better approach, yes. Updated the unit test for copy/deepcopy.

@JeffBezanson (Member) commented:

Ok, I'm going to trust the practical experience here.

JeffBezanson added a commit that referenced this pull request Sep 7, 2015
Fix sharedarray indexing regression
@JeffBezanson JeffBezanson merged commit e6b59b6 into JuliaLang:master Sep 7, 2015
@tkelman (Contributor) commented Sep 8, 2015

similar is sort of messy and inelegant, but I don't see why SharedArray should be special-cased differently than any other expensive-to-create array type here. SharedArray(eltype(A), size(A)) defeats generic code, and while typeof(A)(eltype(A), size(A)) might be workable, isn't that what similar(A) is supposed to be shorthand for?

@rened (author) commented Sep 8, 2015

@amitmurthy, care to comment? You have the most detailed view of this, I guess!

@amitmurthy (Contributor) commented:

Actually, since I don't use this stuff in any real world applications, I would withhold any detailed opinion except to observe that while Tony's view is more purist in nature, Tim's is more practical.

@timholy (Member) commented Sep 8, 2015

I'm not sure it is different than other types. For example, similar(::SubArray) returns an Array.

I think if it were returning something of the same type, it would be called same. The name seems to be license to make a different choice.

@tkelman (Contributor) commented Sep 8, 2015

Reminding me why I dislike similar, ref #11574. I thought the decision of when to return a different type was more based on whether the elements can be modified (perhaps taking aliasing into consideration, for the SubArray case)?

@timholy (Member) commented Sep 8, 2015

I agree this isn't easy. But even if one only wanted a "modifiability" exemption, this might pass it: if it returns a SharedArray, then in principle other processes can modify elements in this new thing. If it's an Array, they can't.

@mbauman (Member) commented Sep 8, 2015

There's been some more discussion around this in packages, see JuliaStats/NullableArrays.jl#56 (comment). I really think that similar just means: "give me the best mutable subtype of AbstractArray for a given shape and element type, stemming from this source array." I hold that this operation is crucially important, and really needs to be flexible enough to be guided by practical experience. Sure, it's a subjective choice. But there's definitely value in allowing the flexibility to make good, practical choices here.

@tkelman (Contributor) commented Sep 8, 2015

Dunno, the subjectivity of the manual array-type mapping in similar just leaves me with a lingering distrust of it. It seems like an underspecified, unreliable, ad-hoc choice that has to be made every single time a new AbstractArray subtype comes up; otherwise the fallback means you get unspecialized Arrays from a lot of operations you might not expect them from.

I guess we can try this out for a while, but for the examples of A+3 and cumsum, returning a SharedArray feels more appropriate to me than an Array.

@jakebolewski (Member) commented:

I agree with @tkelman, something like A + 3 should definitely return a shared array here. Otherwise it destroys the whole PGAS abstraction.

@mbauman (Member) commented Sep 8, 2015

Generic programming with array types that require certain algorithmic complexities is extremely hard. My instinct has been to make the base library correct in terms of the index=>value mapping, and then punt to the individual subtypes for specializations that provide different algorithmic complexities. But I'm learning to appreciate that a fallback with a wrong complexity class for some applications may even be worse than if it were not implemented in the first place.

I'm not sure how much the base library can do here. There's no limit to how much we can divide up similar into particular tasks (well, besides a vocabulary), but in doing so it loses its generalizability and power. :-\

The question, then, is who has to pay the price of not fitting in? And where is that price the lowest? The answer may be that SharedArrays need to define their own non-scalar indexing operations.

@jakebolewski (Member) commented:

That is fine, but you are concentrating too much on computational complexity and leaving out memory/space requirements. You can have an out-of-core shared array; you cannot have one (at least not while preserving the abstraction) if A + 3 returns a dense array.


@mbauman (Member) commented Sep 8, 2015

Yes, that's precisely what I mean to say by algorithmic complexity, that is, including both space and time complexities for the data structure and common operations on it.

@jakebolewski (Member) commented:

Ok, fair enough. SharedArray has good enough performance that you can reuse the existing indexing infrastructure, but those abstractions fall down when applied to DArrays. I think it is inevitable that redefining indexing operations has to occur for specific distributed array types. No one is going to accept the performance tradeoff needed to make these abstractions universal in the dense Array case.

@ChrisRackauckas (Member) commented:

This choice makes it extremely hard to use SharedArrays in generic algorithms, because most operations automatically convert the result back to an Array. Could making similar return a SharedArray be reconsidered, especially now that we have the broadcast changes, which implicitly fix a lot of indexing and performance issues but now cause trouble since they return Arrays?
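The implicit conversion being described is easy to observe; a sketch with the modern SharedArrays stdlib (where, as far as I can tell, the behavior chosen in this PR still holds):

```julia
using SharedArrays

A = SharedArray{Float64}(2, 3)
fill!(A, 1.0)
B = A .+ 3   # broadcast allocates its output as a plain Array here
@assert B isa Array && !(B isa SharedArray)   # the result is no longer shared
```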

@timholy (Member) commented Oct 25, 2017

Now that we have good views (SubArray, ReshapedArray, ReinterpretArray, MappedArray) there's a lot you can do without ever needing to create new storage. So the landscape may have changed.

But it will take someone sitting down and thinking out a proposal for how to manage the issues above.

EDIT: a recommendation would be to implement the change locally and see if you still like it a week later.
