-
Notifications
You must be signed in to change notification settings - Fork 4k
ARROW-4464: [Rust] [DataFusion] Add support for LIMIT #3669
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
andygrove
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is looking good so far ... I've left some feedback that should help you out
|
|
||
| let input_schema = input_rel.as_ref().borrow().schema().clone(); | ||
|
|
||
| let compiled_expr = compile_scalar_expr(&self, expr, &input_schema)?; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The LIMIT expression will be a literal e.g. LIMIT 10 so I don't think we need to compile and evaluate the expression but just use pattern matching to extract the number.
match expr {
&Expr::Literal(ref value) => /* extract value */,
_ => /* error - only literals are supported */
}
``
| input: Rc<RefCell<Relation>>, | ||
| expr: RuntimeExpr, | ||
| schema: Arc<Schema>, | ||
| num_consumed_rows: i64, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
add limit: usize
| input: Rc<RefCell<Relation>>, | ||
| expr: RuntimeExpr, | ||
| schema: Arc<Schema>, | ||
| num_consumed_rows: i64, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
num_consumed_rows should be usize
| fn next(&mut self) -> Result<Option<RecordBatch>> { | ||
| match self.input.borrow_mut().next()? { | ||
| Some(batch) => { | ||
| match self.expr.get_func()(&batch)? |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
no need to evaluate the expression, use limit field from struct
| } | ||
|
|
||
| //TODO: move into Arrow array_ops | ||
| fn filter(a: &Array, num_rows_to_read: i64) -> Result<ArrayRef> { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
rename filter to limit
|
Thanks a lot @andygrove I extracted the limit and cast it with |
Codecov Report
@@ Coverage Diff @@
## master #3669 +/- ##
==========================================
- Coverage 87.85% 87.76% -0.09%
==========================================
Files 689 689
Lines 84012 84023 +11
Branches 1081 1081
==========================================
- Hits 73805 73745 -60
- Misses 10094 10163 +69
- Partials 113 115 +2
Continue to review full report at Codecov.
|
| .collect(); | ||
|
|
||
| let limited_batch: RecordBatch = | ||
| RecordBatch::new(Arc::new(Schema::empty()), limited_columns?); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The new batch needs the same schema as the original, not an empty schema
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Once this is fixed, I'm ready to approve. Thanks @ntrinquier !
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Indeed that makes more sense. filter is also creating the new batch with an empty schema, so I guess this would have to be changed there too?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
oops, good catch, yes I'll file a bug for that
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@andygrove I created it: https://issues.apache.org/jira/browse/ARROW-4721
andygrove
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM
No description provided.