Fix _cumsum helper function in multi-GPU
#2636
+4 −2
Merged