Conversation
Benchmarks: TPC-HTable of Results
|
Benchmarks: random_access |
CodSpeed Performance ReportMerging #2348 will improve performances by 43.48%Comparing Summary
Benchmarks breakdown
|
vortex-layout/src/scan/task.rs
Outdated
| #[async_trait] | ||
| impl TaskExecutor for InlineTaskExecutor { | ||
| async fn execute(&self, array: &Array, tasks: &[ScanTask]) -> VortexResult<Array> { | ||
| fn execute(&self, array: &Array, tasks: &[ScanTask]) -> VortexResult<Array> { |
There was a problem hiding this comment.
this is a flyby but seems like it can be blocking now? potentially not even needed but I'm not sure what are your plans for it
Benchmarks: compressTable of Results
|
|
made another round of changes, I think this interface is more general (and works better for the non-specific runtime case), but |
|
I have an idea for the non-specific-runtime variant, coding it now |
| result: oneshot::Sender<R>, | ||
| } | ||
|
|
||
| impl<F, R> Task for ExecutorTask<F, R> |
just a temporary measure until the [language API](rust-lang/rust#70142) is stabilized. I have a couple of these cases in #2348 so I figured its worth pulling into a separate PR.
Benchmarks: TPC-H on NVMETable of Results
|
| } | ||
| } | ||
|
|
||
| struct Inner { |
There was a problem hiding this comment.
This structure is needed to achieve the behavior we want, where arbitrary futures can be spawned on this runtime. I've tried to structure it differently but I honestly think this is pretty nice, and we can definitely try and improve that later.
Benchmarks: TPC-H on S3Table of Results
|
Benchmarks: Clickbench on NVMETable of Results
|
Benchmarks: TPC-H on NVMETable of Results
|
|
All benchmarks look good, not much difference on S3 (maybe slightly better) and much better on disk (especially for TPC-H). |
vortex-layout/src/scan/mod.rs
Outdated
| split_by: SplitBy::Layout, | ||
| canonicalize: false, | ||
| concurrency: 32, | ||
| concurrency: std::thread::available_parallelism() |
There was a problem hiding this comment.
I don't know that this should default to nthreads, this is "concurrency" and not "parallelism", the number of row splits to make progress on concurrently. Basically... we want as many as possible provided the data fits in some reasonable memory, since the more concurrency, the more visibility into upcoming I/O, and therefore the more coalescing.
There was a problem hiding this comment.
yeah, it was an idea but we probably do want to oversubscribe the tokio runtime if we can and buffer by a different metric
| use vortex_error::VortexResult; | ||
|
|
||
| pub trait Executor { | ||
| /// Spawns a future to run on a different runtime. The shouldn't be polled to make progress. |
There was a problem hiding this comment.
LMK what you think of the new one
|
|
||
| pub trait Executor { | ||
| /// Spawns a future to run on a different runtime. The shouldn't be polled to make progress. | ||
| fn spawn<F>(&self, f: F) -> VortexResult<BoxFuture<'static, VortexResult<F::Output>>> |
There was a problem hiding this comment.
Does this need to return a result? Can it just be BoxFuture<'static, VortexResult<..>>? Since the impl can always return ready(Err(...)) if it needs to.
There was a problem hiding this comment.
I don' think we want to return ready(Err(...)) here, but if we're ok with panicking in cases where we can't submit work/tasks start getting cancelled, we can get the nicer function signature.
|
|
||
| /// Generic wrapper around different async runtimes. Used to spawn async tasks to run in the background, concurrently with other tasks. | ||
| #[derive(Clone)] | ||
| pub enum TaskExecutor { |
There was a problem hiding this comment.
I wonder what the performance hit is to just have a dyn Executor? Seems like a lot of code here for basically |f| tokio::spawn(f)
There was a problem hiding this comment.
IDK if we can have dyn Executor here without a major change due to the generics
There was a problem hiding this comment.
You would have to take / return BoxFuture. You already return one
There was a problem hiding this comment.
I don't think its that much code (probably less than what happens in Arc), but in order to make it dyn-compatible we'll also have to move the generic bound from the function to the trait itself (example), which makes this thing less useful in other cases IMO.
very hacky demo