-
Notifications
You must be signed in to change notification settings - Fork 1.9k
feat: Improve datafusion-cli memory usage and considering reserve mem… #14766
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Changes from 12 commits
69412e9
54bc8b7
ac7e783
ac24222
31866e9
1f9026c
b7205fd
0ecc9d2
aa838b2
4380843
a0a0db1
6715d23
ef4f184
File filter
Filter by extension
Conversations
Jump to
Diff view
Diff view
There are no files selected for viewing
| Original file line number | Diff line number | Diff line change |
|---|---|---|
|
|
@@ -121,6 +121,13 @@ struct Args { | |
| )] | ||
| maxrows: MaxRows, | ||
|
|
||
| #[clap( | ||
| short, | ||
| long, | ||
| help = "Whether to stop early when max rows is reached, this will help reduce the memory usage when the result is large" | ||
|
||
| )] | ||
| stop_after_max_rows: bool, | ||
|
|
||
| #[clap(long, help = "Enables console syntax highlighting")] | ||
| color: bool, | ||
| } | ||
|
|
||
| Original file line number | Diff line number | Diff line change |
|---|---|---|
|
|
@@ -102,14 +102,14 @@ impl PrintOptions { | |
| schema: SchemaRef, | ||
| batches: &[RecordBatch], | ||
| query_start_time: Instant, | ||
| row_count: usize, | ||
|
Contributor
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. this is a great idea |
||
| ) -> Result<()> { | ||
| let stdout = std::io::stdout(); | ||
| let mut writer = stdout.lock(); | ||
|
|
||
| self.format | ||
| .print_batches(&mut writer, schema, batches, self.maxrows, true)?; | ||
|
|
||
| let row_count: usize = batches.iter().map(|b| b.num_rows()).sum(); | ||
| let formatted_exec_details = get_execution_details_formatted( | ||
| row_count, | ||
| if self.format == PrintFormat::Table { | ||
|
|
||
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Would be nice if it even could print based on the stream instead of collecting.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks @Dandandan review , it's a good point if we setting the maxrows huge number because the results only keep the maxrows count batch size.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Created a follow-up ticket:
#14810
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nice