Hi there, I've been playing around with this promising project and I'm really excited about what it can achieve.
In the examples of reduce, all aggregations seem to be per-key. I'm trying to find a way to aggregate all values in a collection into a single value, for example by summing over a collection.
I've attempted this with explode, using a constant as a key:
use differential_dataflow::input::Input;
use differential_dataflow::operators::arrange::agent::TraceAgent;
use differential_dataflow::operators::arrange::{ArrangeByKey, ArrangeBySelf};
use differential_dataflow::operators::Count;
use differential_dataflow::trace::cursor::CursorDebug;
use differential_dataflow::trace::implementations::ord::OrdValSpine;
use differential_dataflow::trace::{Cursor, TraceReader};
use timely::dataflow::operators::probe::Handle;
use timely::{Configuration, PartialOrder};

pub fn sum_with_workers(initial_items: Vec<(String, isize)>) {
    timely::execute(Configuration::Process(3), move |worker| {
        let worker_index = worker.index();
        let initial_items = initial_items.clone();
        let mut probe = Handle::new();
        let (mut input_session, mut trace) = worker.dataflow(|scope| {
            let (input_session, collection) = scope.new_collection_from(initial_items);
            let trace = collection.probe_with(&mut probe).arrange_by_self().trace;
            (input_session, trace.clone())
        });
        let mut sum_trace = worker.dataflow(|scope| {
            let collection = trace
                .import(scope)
                .as_collection(|k, v| {
                    k.clone()
                })
                .probe_with(&mut probe);
            let sum = collection
                .explode(|(k, v)| {
                    Some(("CONSTANT_KEY".to_string(), v))
                })
                .count()
                .probe_with(&mut probe);
            sum.arrange_by_key().trace.clone()
        });
        let time = 2;
        input_session.advance_to(time);
        input_session.flush();
        worker.step_while(|| probe.less_than(&time));
        sum_trace.advance_by(&[time]);
        sum_trace.distinguish_since(&[time]);
        let sum_result = read_collection_at_time(&mut sum_trace, time);
        println!(
            "Worker: {}, Time: {}. Obtained sum of all items: {:?}",
            worker_index, time, sum_result
        )
    });
}

fn read_collection_at_time(
    trace_reader: &mut TraceAgent<OrdValSpine<String, isize, usize, isize>>,
    time: usize,
) -> Option<isize> {
    let (mut cursor, storage) = trace_reader.cursor();
    let mut result = None;
    while cursor.key_valid(&storage) {
        while cursor.val_valid(&storage) {
            let item = cursor.val(&storage);
            let key = cursor.key(&storage);
            let mut total = 0;
            cursor.map_times(&storage, |timestamp, update| {
                if timestamp.less_equal(&time) {
                    total = total + update;
                }
            });
            if total > 0 {
                result = Some(item);
            }
            cursor.step_val(&storage);
        }
        cursor.step_key(&storage);
    }
    result.map(|v| *v)
}
Inputting a value of
The problem with this approach is that the result is actual_sum * num_of_workers. Should I approach this differently?
Is my approach applicable to other types of aggregations that do not use explode, e.g. getting the min of all items using reduce?
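To make the symptom concrete, here is a minimal sketch in plain Rust (standard library only, no timely or differential dataflow calls) of what I suspect is happening. It rests on an assumption I have not confirmed: that the closure passed to timely::execute runs once per worker, so every worker loads the full initial_items and the collection ends up holding each record num_of_workers times. The helpers global_input and aggregate_under_constant_key are hypothetical names made up for this illustration; they only model the behaviour and are not part of either library.

```rust
use std::collections::BTreeMap;

/// Models the union of the records contributed by all workers (an assumption
/// about how the dataflow behaves, not a library call).
/// If `every_worker_loads` is true, each worker inserts the full input;
/// otherwise only the worker with index 0 inserts it.
fn global_input(
    items: &[(String, isize)],
    workers: usize,
    every_worker_loads: bool,
) -> Vec<(String, isize)> {
    let mut all = Vec::new();
    for index in 0..workers {
        if every_worker_loads || index == 0 {
            all.extend(items.iter().cloned());
        }
    }
    all
}

/// Re-keys every record under one constant key and folds the values of that
/// single group, which is the shape of a "global" aggregation built from a
/// per-key one. Returns None for an empty input.
fn aggregate_under_constant_key<F>(records: &[(String, isize)], fold: F) -> Option<isize>
where
    F: Fn(isize, isize) -> isize,
{
    let mut groups: BTreeMap<&'static str, isize> = BTreeMap::new();
    for (_, value) in records {
        groups
            .entry("CONSTANT_KEY")
            .and_modify(|acc| *acc = fold(*acc, *value))
            .or_insert(*value);
    }
    groups.get("CONSTANT_KEY").copied()
}
```

In this model, with three workers and the values 1, 2 and 3, folding with addition gives 18 when every worker loads the input and 6 when only worker 0 does, which matches the actual_sum * num_of_workers result I am seeing. Folding with min gives 1 in both cases, since duplicated records do not change a minimum. If the assumption holds, the constant-key approach itself may be fine and the fix could be to insert the data from a single worker only, but I would appreciate confirmation.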