
Fix sharedarray indexing regression #12964

Merged 1 commit into JuliaLang:master on Sep 7, 2015
Conversation

@rened (Member) commented Sep 5, 2015

#12560 got rid of the pass-through getindex/setindex! definitions for SharedArray. As a result, getindex(::SharedArray, ...) always creates a new shared array, which quickly exhausts the available file handles. Instead, plain Arrays should be returned. This PR reintroduces the old behavior.

To trigger the problem on current master:

```julia
d = SharedArray(Float64, (2,3))   # 0.4-era constructor syntax
for x in 1:100000
    @show x
    d[:,2]   # each slice allocates a new SharedArray, leaking a file handle
end
```

I think this is a blocker for 0.4, as SharedArrays are totally unusable in the current state.
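For context, the pass-through idea can be sketched with a self-contained stand-in type (hypothetical names; the actual PR edits Base's SharedArray code): delegating indexing to the wrapped local Array means a slice comes back as a plain Array, so no new shared segment, and hence no file handle, is created.

```julia
# Hedged sketch with a stand-in wrapper type, not the actual PR diff.
struct PassThroughSA{T,N}
    s::Array{T,N}   # stand-in for the locally mapped shared memory
end

# Pass indexing straight through to the local Array: non-scalar
# indexing now returns plain Arrays instead of new shared arrays.
Base.getindex(A::PassThroughSA, I...) = getindex(A.s, I...)
Base.setindex!(A::PassThroughSA, v, I...) = setindex!(A.s, v, I...)

d = PassThroughSA(zeros(2, 3))
col = d[:, 2]
@assert col isa Vector{Float64}   # a plain Array, not a PassThroughSA
```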
cc @timholy @amitmurthy

@@ -97,6 +97,11 @@ for p in procs(d)
@test d[idxl] == p
end

d = SharedArray(Float64, (2,3))
for x in 1:100000
d[:,2]
A reviewer (Member) commented on the diff:

This isn't quite a real test, since it doesn't explicitly fail for the reason you want it to.

How about testing the return type?
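A return-type assertion along these lines might look like this sketch; note it uses the modern SharedArrays stdlib constructor, not the 0.4-era SharedArray(Float64, (2,3)) syntax from this PR:

```julia
using SharedArrays, Test

d = SharedArray{Float64}(2, 3)   # modern stdlib constructor syntax
# With the fix, non-scalar indexing returns a plain Array, so this test
# fails for the right reason (wrong type) instead of by exhausting
# file descriptors.
@test d[:, 2] isa Vector{Float64}
@test !(d[:, 2] isa SharedArray)
```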

@rened (author) replied:

That would of course be the other, better way of testing this, agreed. This way you just run out of file descriptors and crash, I guess...

@timholy (Member) commented Sep 5, 2015

Good catch. They are of course fine if you always just index scalars, but I agree this should be fixed.

This is one way to fix it; another would be to have similar return an Array. Any preferences?
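The similar-based alternative can be sketched with a hypothetical AbstractArray wrapper (stand-in names, not the Base implementation): once similar allocates a plain Array, generic code that builds its result through similar, including the non-scalar getindex fallback, returns Arrays.

```julia
# Hedged sketch: hook similar so results allocated through it are Arrays.
struct SimilarDemoSA{T,N} <: AbstractArray{T,N}
    s::Array{T,N}   # stand-in for the locally mapped shared memory
end
Base.size(A::SimilarDemoSA) = size(A.s)
Base.getindex(A::SimilarDemoSA, i::Int...) = A.s[i...]
# Allocate a plain Array instead of another "shared" array:
Base.similar(A::SimilarDemoSA, ::Type{T}, dims::Dims) where {T} =
    Array{T}(undef, dims)

A = SimilarDemoSA(ones(2, 3))
@assert similar(A) isa Matrix{Float64}
@assert A[:, 2] isa Vector{Float64}   # fallback getindex builds via similar
```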

@rened (author) commented Sep 5, 2015

Regarding similar: sorry, I don't know that part of Julia's array design well enough yet. Whatever you think is best.

@rened (author) commented Sep 5, 2015

Thinking about it, I believe it's better to leave similar returning a SharedArray and to use the pass-through in getindex/setindex!. There are valid use cases for wanting a similar shared array.

@timholy (Member) commented Sep 5, 2015

There are, but on the other hand it means that A+3, cumsum, etc will return SharedArrays. That could be good or bad, depending on whether we want to encourage people to call those functions on SharedArrays. (It seems more likely you'd want to call them on the "assigned chunk.")

@kshyatt added the labels domain:parallelism (Parallel or distributed computation) and kind:regression (Regression in behavior compared to a previous version) on Sep 6, 2015.
@rened (author) commented Sep 6, 2015

True. similar seems better. I am changing and testing this now.

@rened (author) commented Sep 6, 2015

Implemented now by making similar always return a non-shared array. This works, except for the unit test for #6362 in https://github.com/rened/julia/blob/shgetindex/test/parallel.jl#L184-L192: now the result of copy or deepcopy is no longer a shared array.

Also, there is now no clean way of getting a similar shared array (one that, e.g., lies on the same pids).

I think it is good that working on a shared array does not implicitly create new shared arrays; that should be an explicit operation. Losing the ability to get an actually similar shared array, however, needs to be mitigated somehow. @timholy, do you know a way around this?

@tkelman (Contributor) commented Sep 6, 2015

I'm of the opinion that similar(::SharedArray) should return a SharedArray.

@timholy (Member) commented Sep 6, 2015

You can always get a shared array with SharedArray(eltype(A), size(A)).

I am torn about changing similar, but on balance lean towards making it return an Array. The main reason is that only one process (typically, the one with myid() == 1) should be creating a SharedArray: worker computations should not result in multiple copies springing up like daisies.
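With today's SharedArrays stdlib (constructor syntax differs from the 0.4-era calls in this thread), the distinction reads roughly as follows; treat this as an illustrative sketch:

```julia
using SharedArrays

A = SharedArray{Float64}(2, 3)
B = SharedArray{eltype(A)}(size(A))   # explicit: a new shared array, same shape
C = similar(A)                        # after this PR: a plain, non-shared Array

@assert B isa SharedArray
@assert C isa Array && !(C isa SharedArray)
```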

@rened (author) commented Sep 6, 2015

I believe this is the better approach, yes. Updated the unit test for copy/deepcopy.

@JeffBezanson (Member) commented:

Ok, I'm going to trust the practical experience here.

JeffBezanson added a commit that referenced this pull request Sep 7, 2015
Fix sharedarray indexing regression
@JeffBezanson JeffBezanson merged commit e6b59b6 into JuliaLang:master Sep 7, 2015
@tkelman (Contributor) commented Sep 8, 2015

similar is sort of messy and inelegant, but I don't see why SharedArray should be special-cased differently than any other expensive-to-create array type here. SharedArray(eltype(A), size(A)) defeats generic code, and while typeof(A)(eltype(A), size(A)) might be workable, isn't that what similar(A) is supposed to be shorthand for?

@rened (author) commented Sep 8, 2015

@amitmurthy, care to comment? You have the most detailed view of this, I guess!

@amitmurthy (Contributor) commented:

Actually, since I don't use this stuff in any real world applications, I would withhold any detailed opinion except to observe that while Tony's view is more purist in nature, Tim's is more practical.

@timholy (Member) commented Sep 8, 2015

I'm not sure it is different than other types. For example, similar(::SubArray) returns an Array.

I think if it were returning something of the same type, it would be called same. The name seems to be license to make a different choice.

@tkelman (Contributor) commented Sep 8, 2015

Reminding me why I dislike similar, ref #11574. I thought the decision of when to return a different type was more based on whether the elements can be modified (perhaps taking aliasing into consideration, for the SubArray case)?

@timholy (Member) commented Sep 8, 2015

I agree this isn't easy. But even if one only wanted a "modifiability" exemption, this might pass it: if it returns a SharedArray, then in principle other processes can modify elements in this new thing. If it's an Array, they can't.

@mbauman (Member) commented Sep 8, 2015

There's been some more discussion around this in packages, see JuliaStats/NullableArrays.jl#56 (comment). I really think that similar just means: "give me the best mutable subtype of AbstractArray for a given shape and element type, stemming from this source array." I hold that this operation is crucially important, and really needs to be flexible enough to be guided by practical experience. Sure, it's a subjective choice. But there's definitely value in allowing the flexibility to make good, practical choices here.

@tkelman (Contributor) commented Sep 8, 2015

Dunno, the subjectivity of the manual array-type mapping in similar just leaves me with a lingering distrust of it. It seems like an underspecified, unreliable, ad-hoc choice that has to be made every single time a new AbstractArray subtype comes up; otherwise the fallback means you get unspecialized Arrays from a lot of operations you might not expect them from.

I guess we can try this out for a while, but for the examples of A+3 and cumsum, returning a SharedArray feels more appropriate to me than an Array.

@jakebolewski (Member) commented:

I agree with @tkelman, something like A + 3 should definitely return a shared array here. Otherwise it destroys the whole PGAS abstraction.

@mbauman (Member) commented Sep 8, 2015

Generic programming with array types that require certain algorithmic complexities is extremely hard. My instinct has been to make the base library correct in terms of the index=>value mapping, and then punt to the individual subtypes for specializations that provide different algorithmic complexities. But I'm learning to appreciate that a fallback with a wrong complexity class for some applications may even be worse than if it were not implemented in the first place.

I'm not sure how much the base library can do here. There's no limit to how much we can divide up similar into particular tasks (well, besides a vocabulary), but in doing so it loses its generalizability and power. :-\

The question, then, is who has to pay the price of not fitting in? And where is that price the lowest? The answer may be that SharedArrays need to define their own non-scalar indexing operations.

@jakebolewski (Member) commented:

That is fine, but you are concentrating too much on computational complexity and leaving out memory/space requirements. You can have an out-of-core shared array; you cannot have one (at least not while preserving the abstraction) if A + 3 returns a dense array.


@mbauman (Member) commented Sep 8, 2015

Yes, that's precisely what I mean to say by algorithmic complexity, that is, including both space and time complexities for the data structure and common operations on it.

@jakebolewski (Member) commented:

Ok, fair enough. SharedArray has good enough performance that you can reuse the existing indexing infrastructure, but those abstractions fall down when applied to DArrays. I think it is inevitable that redefining indexing operations has to occur for specific distributed array types. No one is going to accept the performance tradeoff needed to make these abstractions universal in the dense Array case.

@ChrisRackauckas (Member) commented:

This choice makes it extremely hard to use SharedArrays in generic algorithms, because most operations automatically convert the result back to an Array. Could making similar return a SharedArray be reconsidered, especially now that we have the broadcast changes, which implicitly fix a lot of indexing and performance issues but now cause trouble since they return Arrays?
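The implicit conversion being described is easy to observe; a sketch with the modern SharedArrays stdlib (where, as far as I can tell, the behavior chosen in this PR still holds):

```julia
using SharedArrays

A = SharedArray{Float64}(2, 3)
fill!(A, 1.0)
B = A .+ 3   # broadcast allocates its output as a plain Array here
@assert B isa Array && !(B isa SharedArray)   # the result is no longer shared
```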

@timholy (Member) commented Oct 25, 2017

Now that we have good views (SubArray, ReshapedArray, ReinterpretArray, MappedArray) there's a lot you can do without ever needing to create new storage. So the landscape may have changed.

But it will take someone sitting down and thinking out a proposal for how to manage the issues above.

EDIT: a recommendation would be to implement the change locally and see if you still like it a week later.
