Implemented a variety of segmented algorithms #2859

ghost · 2017-08-24T13:46:09Z

Implemented the following segmented algorithms:

all_of, any_of, none_of
binary transform_reduce
adjacent_find
adjacent_difference

Unit tests provided for each.

See #1338

ghost · 2017-08-24T13:46:37Z

@hkaiser, @mcopik Please have a look.

mcopik · 2017-08-24T13:49:09Z

This implements several missing algorithms from #1338

mcopik

Good work, Ajai! I have left a few comments suggesting improvements. Most of them are minor nitpicks, except for the question of how the binary operator is invoked in the algorithms from the adjacent family. We should discuss this issue further.

mcopik · 2017-08-27T21:43:26Z

hpx/parallel/segmented_algorithms/adjacent_difference.hpp

+                dest, std::forward<Op>(op), is_seq());
+        }
+
+        // forward declare the non-segmented version of this algorithm


Why is there a forward declaration if you are including the definition with algorithms/adjacent_difference.hpp?

This is being done in all the segmented algorithms I have implemented. I started with the existing for_each as base, which contained this forward declaration. It is also there in the existing scans and the generate etc. I am not sure if it is needed, but I have just been following the standard. :P

mcopik · 2017-08-27T21:53:46Z

tests/unit/parallel/segmented_algorithms/partitioned_vector_adjacent_difference.cpp

@@ -0,0 +1,100 @@
+//  Copyright (c) 2017 Ajai V George


Tests include only a case where the difference in adjacent pairs is constant. I think it should be extended with at least one case where it is either constantly increasing/decreasing or random.

A test case with an operator (x, y) -> y would produce non-constant results which can be easily checked on a vector with N elements (0, 1, ..., N-1).

Please check the current test cases.
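
The test idea suggested above can be sketched with the standard library alone. This is a minimal, non-segmented illustration, not the test from the PR; the helper names `previous_element_test` and `expected_previous_element` are hypothetical. It assumes the operator is called as op(current, previous), which is how std::adjacent_difference invokes it, so (x, y) -> y returns the previous element.

```cpp
#include <cassert>
#include <cstddef>
#include <numeric>
#include <vector>

// Builds the input 0, 1, ..., n-1 and applies std::adjacent_difference with
// an operator that ignores the current element and returns the previous one.
std::vector<int> previous_element_test(std::size_t n)
{
    std::vector<int> input(n);
    std::iota(input.begin(), input.end(), 0);

    std::vector<int> output(n);
    // std::adjacent_difference calls op(current, previous); the first
    // element is copied to the output unchanged.
    std::adjacent_difference(input.begin(), input.end(), output.begin(),
        [](int /*current*/, int previous) { return previous; });
    return output;
}

// Expected output for the input above: 0, 0, 1, ..., n-2.
std::vector<int> expected_previous_element(std::size_t n)
{
    std::vector<int> expected(n);
    for (std::size_t i = 0; i < n; ++i)
        expected[i] = (i == 0) ? 0 : static_cast<int>(i) - 1;
    return expected;
}
```

A check such as `assert(previous_element_test(8) == expected_previous_element(8));` then fails if any element, including one at a segment boundary in a segmented version, is computed from the wrong neighbour.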

mcopik · 2017-08-27T22:16:26Z

hpx/parallel/segmented_algorithms/adjacent_difference.hpp

+                            std::true_type(), beg, end, ldest, op);
+
+                        beginning = traits1::compose(sit, beg);
+                        if(beginning != last)


If I understand correctly, here you apply the difference operator to the adjacent pair (the last value in the previous segment, the first value in the current segment)? I think it will cause an unnecessary communication between localities. I think this could be simplified by dispatching a modified function object which accepts an additional parameter and returns the last value in the segment. I think it's worth saving time by sending data together.

@hkaiser, what do you think?

OK, so I will create another function object for both adjacent_difference and adjacent_find that accepts a parameter for the previous value. Also, just to confirm: the function object should return the last value, right? Not an iterator to the last input value? Currently the function object returns an iterator to the last output.

What about the case of the parallel version? As you know parallel and sequential versions cannot call different function objects. So the parallel version will be an exact copy of the existing parallel version, right?
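
The idea discussed in this thread, a per-segment function object that takes the previous segment's last value and returns its own last value, can be sketched for the sequential case as follows. This is only an illustration under stated assumptions: plain `std::vector`s stand in for the partitions of a partitioned_vector, and `segment_adjacent_difference` and `segmented_adjacent_difference` are hypothetical names, not the function objects from the PR or from HPX.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <vector>

// Hypothetical per-segment worker: computes the adjacent difference of one
// segment. If has_prev is true, the first output is op(first element, prev)
// instead of a plain copy. Returns the last input value of the segment so
// the caller can pass it on to the next segment.
template <typename T, typename Op>
T segment_adjacent_difference(std::vector<T> const& in, std::vector<T>& out,
    bool has_prev, T prev, Op op)
{
    assert(!in.empty());
    out.resize(in.size());
    out[0] = has_prev ? op(in[0], prev) : in[0];
    for (std::size_t i = 1; i < in.size(); ++i)
        out[i] = op(in[i], in[i - 1]);
    return in.back();
}

// Sequential driver: the only data carried between segments is one value.
template <typename T, typename Op>
std::vector<std::vector<T>> segmented_adjacent_difference(
    std::vector<std::vector<T>> const& segments, Op op)
{
    std::vector<std::vector<T>> result(segments.size());
    bool has_prev = false;
    T carry{};
    for (std::size_t s = 0; s < segments.size(); ++s)
    {
        if (segments[s].empty())
            continue;    // empty segments contribute nothing
        carry = segment_adjacent_difference(
            segments[s], result[s], has_prev, carry, op);
        has_prev = true;
    }
    return result;
}
```

For example, calling `segmented_adjacent_difference` on the segments {1, 3, 6}, {10, 15}, {21} with `std::minus<int>()` yields {1, 2, 3}, {4, 5}, {6}, the same values a single flat std::adjacent_difference would produce.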

mcopik · 2017-08-27T22:32:23Z

hpx/parallel/segmented_algorithms/adjacent_difference.hpp

+                        while(start != between_segments.end())
+                        {
+                            FwdIter2 curr = dest;
+                            std::advance(curr, std::distance(first, *start));


Since you're invoking op synchronously on the executing thread, the execution time of this code depends on the latency of communication between localities. Two reads and one write are necessary for each call to op. There are different ways to do the same thing, though. For example, could we attach a continuation to each future which would immediately write the last value of adjacent_difference on a given segment to the first element of the next segment? It would only require disabling the first copy in the adjacent_difference function object to avoid a race condition.

Perhaps it would be good to make a quick benchmark comparing how much time could be spent in this section, compared to the parallel execution. I might be wrong about this.

@hkaiser, do you have an opinion on this?

Hmm, so the parallel function object will be exactly the same, except that the first value is ignored. Instead I will attach a then clause to the future, which will call the function object on the first value of the current segment and the last value of the previous segment and write the result to the correct position. Am I right? And something similar for adjacent_find, right?

mcopik · 2017-08-27T22:35:31Z

hpx/parallel/segmented_algorithms/adjacent_find.hpp

+                        output = traits::compose(sit, out);
+                    }
+                }
+                FwdIter ending = traits::compose(sit, std::prev(end));


I think that the arguments of this logical conjunction should be reversed. The comparison should be performed only if found is false.
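
The effect of the operand order can be shown with a small, self-contained sketch. The condition from the PR is not visible in this excerpt, so the names below (`boundary`, `first_matching_boundary`, `comparisons`) are hypothetical; the sketch only relies on the fact that `&&` evaluates its right operand only when the left one is true.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// The two values on either side of a segment boundary (illustrative type).
struct boundary
{
    int last_of_prev;
    int first_of_next;
};

// Returns the index of the first boundary whose two values compare equal, or
// boundaries.size() if there is none. comparisons counts how often the
// (potentially remote and therefore expensive) comparison is evaluated.
std::size_t first_matching_boundary(
    std::vector<boundary> const& boundaries, std::size_t& comparisons)
{
    comparisons = 0;
    bool found = false;
    std::size_t index = boundaries.size();
    for (std::size_t i = 0; i < boundaries.size(); ++i)
    {
        auto equal = [&comparisons](boundary const& b) {
            ++comparisons;
            return b.last_of_prev == b.first_of_next;
        };
        // !found is tested first, so equal() is skipped after a match.
        if (!found && equal(boundaries[i]))
        {
            found = true;
            index = i;
        }
    }
    return index;
}
```

With the operands in the opposite order, `equal(...) && !found`, the comparison would run for every boundary even after a match has been found.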

mcopik · 2017-08-27T22:52:08Z

hpx/parallel/segmented_algorithms/all_any_none.hpp

+                        std::vector<bool> res =
+                            hpx::util::unwrap(std::move(r));
+                        auto it = res.begin();
+                        while (it != res.end())


Not a problem, but a small remark: this loop could be replaced by one call to std::all_of.

mcopik · 2017-08-27T22:52:43Z

hpx/parallel/segmented_algorithms/all_any_none.hpp

+                        >::call(r, errors);
+                        std::vector<bool> res =
+                            hpx::util::unwrap(std::move(r));
+                        auto it = res.begin();


As above, std::any_of.
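
A minimal sketch of the suggested simplification, assuming the per-segment results have already been unwrapped into a `std::vector<bool>` as in the quoted diff. The `combine_*` helper names are hypothetical and only serve to illustrate the reduction step.

```cpp
#include <algorithm>
#include <cassert>
#include <vector>

// Each element of res is the result reported by one segment.

// all_of holds globally only if it holds on every segment.
bool combine_all_of(std::vector<bool> const& res)
{
    return std::all_of(res.begin(), res.end(), [](bool v) { return v; });
}

// any_of holds globally if it holds on at least one segment.
bool combine_any_of(std::vector<bool> const& res)
{
    return std::any_of(res.begin(), res.end(), [](bool v) { return v; });
}

// none_of holds globally only if every segment reported none_of. Here res
// is assumed to contain the per-segment none_of results.
bool combine_none_of(std::vector<bool> const& res)
{
    return std::all_of(res.begin(), res.end(), [](bool v) { return v; });
}
```

Each helper replaces a hand-written iterator loop with a single standard algorithm call and behaves sensibly for an empty result vector (all_of and none_of are true, any_of is false).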

mcopik · 2017-08-27T22:56:47Z

hpx/parallel/segmented_algorithms/transform_reduce.hpp

+                local_iterator_type1 end1 = traits1::local(last1);
+                if (beg1 != end1)
+                {
+                    overall_result = hpx::util::invoke(red_op, overall_result,


The indentation and brace placement are slightly confusing here. The style used later in this function (line 292) is much easier to read.

mcopik · 2017-08-27T22:57:31Z

hpx/parallel/segmented_algorithms/transform_reduce.hpp

+                > forced_seq;
+
+
+            std::vector<shared_future<T> > segments;


Same as any_of etc. Is a shared_future really necessary here?

mcopik · 2017-08-27T23:02:03Z

tests/unit/parallel/segmented_algorithms/partitioned_vector_adjacent_find.cpp

+template <typename T>
+void initialize(hpx::partitioned_vector<T> & xvalues)
+{
+    T init_array[SIZE] = {1,2,3,4, 5,1,2,3, 1,5,2,3, 4,2,3,2, 1,2,3,4, 5,6,5,6,


Could you add a test using the binary predicate?

Please check the current test cases.

ghost · 2017-08-28T10:00:10Z

@mcopik, I will try to fix the minor issues by today and then work on the new function objects.

hkaiser · 2017-09-05T14:46:16Z

Now that GSoC is formally over - what is the state of this PR?

mcopik · 2017-09-06T16:29:03Z

@hkaiser @ajaivgeorge has fixed some issues which I have mentioned but I believe that two algorithms should be improved. Right now, there is an additional round of communication between segments after finishing the dispatched work. I think it should be possible to hide latencies introduced by this additional communication.

ghost · 2017-09-07T04:27:12Z

@hkaiser This PR is almost ready for merging.

@mcopik I am working on eliminating the additional round of communication in the sequential versions of adjacent_find and adjacent_difference with a custom function object which returns the first value (as you described). Will try to get that done by evening.

Regarding the parallel version, I do not see how appending a future will eliminate the extra round of communication.

ghost · 2017-09-11T11:38:18Z

@mcopik I have been trying to implement the custom function object for the sequential version to reduce communication overhead, as you suggested. For this, the new function objects return the last value of each segment, which is taken as a parameter by the function object working on the next segment. One issue is that I also need to return an iterator to the last destination computed, as per the specification of adjacent_find/adjacent_difference. This is easy enough to compute when all the segments are successfully computed. But what happens if one of the segments errors out and is unable to compute the entire result, so that last computed dest != end of dest segment? How will this error be detected if the function object does not return this iterator? I suppose I could return a pair<last element in input, last computed iterator in dest>, but this does not seem very elegant. What do you suggest?
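
One way to make the pair mentioned above more readable is a small named result type. The sketch below is purely illustrative and is not taken from the PR: `segment_result`, `difference_with_carry`, `segment_complete`, and the `limit` parameter (used only to imitate a segment that stops early) are all hypothetical, and plain subtraction stands in for the user-supplied operator.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical result type: a named alternative to
// pair<last element in input, last computed iterator in dest>.
template <typename T, typename Iter>
struct segment_result
{
    T last_input;      // value to hand to the next segment
    Iter last_dest;    // one past the last destination element written
};

// Processes at most limit elements of one segment. prev is the last input
// value of the previous segment; if nothing is processed it is returned
// unchanged as last_input.
template <typename T>
segment_result<T, typename std::vector<T>::iterator> difference_with_carry(
    std::vector<T> const& in, std::vector<T>& out, T prev, std::size_t limit)
{
    assert(out.size() == in.size());
    std::size_t n = limit < in.size() ? limit : in.size();
    for (std::size_t i = 0; i < n; ++i)
    {
        out[i] = in[i] - prev;
        prev = in[i];
    }
    return {prev, out.begin() + static_cast<std::ptrdiff_t>(n)};
}

// The caller detects an incomplete segment by comparing the returned
// iterator with the end of the destination segment.
template <typename T, typename Iter>
bool segment_complete(segment_result<T, Iter> const& r, Iter dest_end)
{
    return r.last_dest == dest_end;
}
```

Returning both pieces in one object keeps the carried value and the progress marker together, so the driver can pass `last_input` to the next segment and still check `last_dest` against the end of the destination segment.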

hkaiser · 2018-01-26T12:10:23Z

@ajaivgeorge, @mcopik what will happen to this PR? Should we abandon it? How much work is left to get this into the main branch?

mcopik · 2018-01-29T10:43:35Z

@hkaiser Ignoring merge conflicts, the PR is ready - all algorithms seem to be correctly implemented. I have a few doubts about whether this is the optimal way to achieve that, that's all.

mcopik · 2018-01-29T10:43:45Z

@ajaivgeorge Can you resolve those conflicts?

msimberg · 2018-09-10T09:57:43Z

@mcopik, @ajaivgeorge I see the source repo has been deleted now. Is there anything we could do to help get this merged? Would it be possible, for example, to leave out the problematic parts so that we could have at least some of the work in master? It would be a shame to have all the work thrown away.

mcopik · 2018-09-10T10:17:43Z

@ajaivgeorge Ajai, why did you delete the repository?

@msimberg I might have a fork on my old machine. I didn't fork his repo on GH.

hkaiser · 2018-09-11T00:22:48Z

@msimberg we should still be able to get the patch (we have the hashes of all commits).

hkaiser · 2018-09-11T00:27:30Z

Here are the patches for the commits:

msimberg · 2018-09-11T06:46:55Z

@hkaiser, awesome, thanks!

msimberg · 2018-11-13T08:48:25Z

Closing this in favor of #3525.

Ajai V George added 9 commits August 12, 2017 23:11

segmented any_of, none_of, all_of implemented

2f263bf

Binary Transform Reduce added

e8be3ae

adjacent_find and adjacent_difference added

8e9d992

Update to adjacent find

a10b846

Merge branch 'master' of https://github.com/STEllAR-GROUP/hpx into segmented_algorithms

9df89d5

Working Adjacent find

f65fc77

Adjacent find between segments fixed

618be41

Partial adjacent_difference

b34e1cd

Working adjacent difference, minor changes to adjacent find and fix inspect errors

1774acd

Fix Circle build errors and warnings

395dddd

hkaiser added category: algorithms type: enhancement labels Aug 25, 2017

hkaiser added this to the 1.1.0 milestone Aug 25, 2017

hkaiser mentioned this pull request Aug 25, 2017

Extend parallel algorithms to work with hpx::partitioned_vector et.al. #1338

Open

36 tasks

mcopik reviewed Aug 27, 2017

PR Review minor changes

554a5d8

msimberg removed this from the 1.1.0 milestone Mar 22, 2018

msimberg mentioned this pull request Sep 10, 2018

Catch non-existent source repo biddisco/pycicle#36

Merged

hkaiser mentioned this pull request Nov 5, 2018

Segmented algorithms #3525

Merged

msimberg closed this Nov 13, 2018
