Skip to content

Conversation

@andygrove
Copy link
Member

@andygrove andygrove commented Sep 25, 2019

I discovered this last minute while running manual tests. I have been able to run parallel queries against parquet files using this branch as a dependency.

@andygrove
Copy link
Member Author

Sorry @paddyhoran I found this last minute issue ... pretty small fix though.

Copy link
Contributor

@paddyhoran paddyhoran left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM, just one question.

Err(ExecutionError::General("No files found".to_string()))
} else {
let parquet_file = ParquetFile::open(&filenames[0], None, 0)?;
let schema = parquet_file.projection_schema.clone();
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What happens if the schema of the files differ? I guess it just fails are execution time when a different schema is encountered?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, this code assumes that all of the partitions have the same schema currently. It's pretty basic. I imagine we could eventually have schema merging.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I will write up a JIRA to add validation that all the partitions have the same schema. That would be a nice improvement.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Makes sense. Feel free to merge.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants