Catalog Snapshot Isolation #4697

Closed
tustvold opened this issue Dec 21, 2022 · 2 comments
Labels
enhancement New feature or request

Comments

@tustvold
Contributor

tustvold commented Dec 21, 2022

Is your feature request related to a problem or challenge? Please describe what you are trying to do.

Currently, SqlToRel takes a ContextProvider, on which it calls ContextProvider::get_table_provider whilst it traverses the SQL AST. If a table appears multiple times in the same query, SqlToRel calls ContextProvider::get_table_provider once per occurrence, and may obtain a different Arc<dyn TableSource> for the same table each time.

This is perfectly fine, provided TableSource implementations are not interior mutable, that is, their contents can't change. However, in a system supporting mutation, it is unclear how to provide a consistent snapshot of that data to the query.

The obvious place to obtain this snapshot would be when constructing the TableSource / TableProvider in SchemaProvider. However, because this method is called multiple times, this could result in multiple snapshots for the same query.
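To make the hazard concrete, here is a minimal sketch that does not use DataFusion at all. `MutableTable`, `TableSnapshot`, `lookup`, `write`, and `plan_self_join` are hypothetical names standing in for a catalog entry that takes a snapshot on every lookup. It only illustrates how two lookups of the same table within one query can disagree if a write lands between them.

```rust
use std::sync::{Arc, Mutex};

/// Hypothetical immutable view of a table at one point in time.
#[derive(Debug, PartialEq)]
pub struct TableSnapshot {
    pub version: u64,
}

/// Hypothetical mutable catalog entry (an assumption for illustration,
/// not a DataFusion type). It snapshots on every lookup.
pub struct MutableTable {
    version: Mutex<u64>,
}

impl MutableTable {
    pub fn new() -> Self {
        Self { version: Mutex::new(0) }
    }

    /// Simulates a write committed by some other session.
    pub fn write(&self) {
        *self.version.lock().unwrap() += 1;
    }

    /// Each lookup captures whatever version is current at call time.
    pub fn lookup(&self) -> Arc<TableSnapshot> {
        Arc::new(TableSnapshot {
            version: *self.version.lock().unwrap(),
        })
    }
}

/// Resolves the same table twice, as a self-join would during planning,
/// with a write landing between the two lookups.
pub fn plan_self_join(table: &MutableTable) -> (Arc<TableSnapshot>, Arc<TableSnapshot>) {
    let left = table.lookup();
    table.write();
    let right = table.lookup();
    (left, right)
}
```

In this sketch the two sides of the self-join end up holding different versions of the same table, which is the inconsistency described above.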

Describe the solution you'd like

Some component, likely SessionContext, should first identify all the relations that appear in the query and obtain a snapshot of each unique TableSource by calling SchemaProvider::table. This immutable state can then be handed off to SqlToRel to produce the LogicalPlan.
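A rough sketch of that two-phase shape, again with hypothetical stand-in types rather than the real DataFusion API: `Catalog::table` plays the role of SchemaProvider::table, `snapshot_relations` is the step a component such as SessionContext might perform, and `plan` stands in for SqlToRel resolving names against the frozen map. The `lookups` counter exists only to show that each unique relation is resolved once.

```rust
use std::collections::HashMap;
use std::sync::Arc;

/// Hypothetical immutable view of a table (not a DataFusion type).
#[derive(Debug, PartialEq)]
pub struct TableSnapshot {
    pub name: String,
    pub version: u64,
}

/// Hypothetical catalog that counts how often it is asked for a table.
pub struct Catalog {
    versions: HashMap<String, u64>,
    pub lookups: usize,
}

impl Catalog {
    pub fn new(tables: &[(&str, u64)]) -> Self {
        Self {
            versions: tables.iter().map(|(n, v)| (n.to_string(), *v)).collect(),
            lookups: 0,
        }
    }

    /// Stand-in for a catalog lookup that snapshots the table.
    pub fn table(&mut self, name: &str) -> Option<Arc<TableSnapshot>> {
        self.lookups += 1;
        self.versions.get(name).map(|v| {
            Arc::new(TableSnapshot { name: name.to_string(), version: *v })
        })
    }
}

/// Phase 1: resolve each unique relation in the query exactly once.
pub fn snapshot_relations(
    catalog: &mut Catalog,
    relations: &[&str],
) -> HashMap<String, Arc<TableSnapshot>> {
    let mut resolved = HashMap::new();
    for name in relations {
        if !resolved.contains_key(*name) {
            if let Some(table) = catalog.table(name) {
                resolved.insert(name.to_string(), table);
            }
        }
    }
    resolved
}

/// Phase 2: the planner only sees the frozen map, so every occurrence
/// of a table shares the same snapshot. Returns None for unknown tables.
pub fn plan(
    relations: &[&str],
    snapshot: &HashMap<String, Arc<TableSnapshot>>,
) -> Option<Vec<Arc<TableSnapshot>>> {
    relations.iter().map(|name| snapshot.get(*name).cloned()).collect()
}
```

With this split, a self-join over one table triggers a single catalog lookup, and both occurrences point at the same `Arc`. How the relation names are extracted from the SQL AST is left out of the sketch.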

Describe alternatives you've considered

TableProvider could potentially infer some notion of session state based on the session_id, but it is unclear how this would work with multiple concurrent queries on the same SessionContext.

Additional context

#4617 tracks the general issue of mutation isolation
#4607 implements a form of this as part of supporting async catalogs

@tustvold tustvold added the enhancement New feature or request label Dec 21, 2022
@jiacai2050
Contributor

If a table appears multiple times in the same query, it will call ContextProvider::get_table_provider multiple times, and obtain potentially different Arc<dyn TableSource> for the same table.

For a query with only one table, perhaps we could add a cache to avoid these repeated lookups?

@tustvold
Contributor Author

This has actually been fixed by #4607
