Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Simplify ParquetExec::new() #10643

Closed
wants to merge 1 commit into from
Closed

Conversation

alamb
Copy link
Contributor

@alamb alamb commented May 23, 2024

Note there is an alternate approach that is not an API change that uses a dedicated builder in #10636, but then I realized that
there were many fields (like enable_page_index etc that could already be set with builder apis on ParquetExec so making them redundant with ParquetExec seemed uncessary

Which issue does this PR close?

Part of #10546

Rationale for this change

While working on #10549 it was cumbersome
to create a ParquetExec -- specifically there are a few fields on ParquetExec::new that
must always be provided even though they are Options and most of the rest of the options can be set via builder style
APIs like with_schema_adapter_factory

In addition, some fields like schema_adapter_factory can not be set by constructor but are instead set
by builder function with_schema_adapater_factor.

Thus it is confusing and imposes a cognitive load that the properties setting is inconsistent.

What changes are included in this PR?

  1. Remove all arguments except FileScanConfig from ParquetExec::new
  2. Add builder style APIs
  3. Improve documentation
  4. Add examples

Are these changes tested?

Yes by existing CI

Are there any user-facing changes?

This is a user API change for anyone who creates ParquetExec's directly (as we do in InfluxDB IOx for example)

There is also better documentation

@alamb alamb added the api change Changes the API exposed to users of the crate label May 23, 2024
@github-actions github-actions bot added the core Core DataFusion crate label May 23, 2024
@alamb alamb closed this May 23, 2024
@alamb
Copy link
Contributor Author

alamb commented May 23, 2024

The more I think about this the more I like #10636 and deprecate the ParquetExec::new function...

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
api change Changes the API exposed to users of the crate core Core DataFusion crate
Projects
None yet
Development

Successfully merging this pull request may close these issues.

1 participant