
Example using SQLite via SQLx for secondary indexes #1

Merged: 21 commits into main from sqlx on Jun 18, 2024

Conversation

adriangb (Collaborator) commented Jun 7, 2024:

No description provided.

sqlx-sqlite/Cargo.toml: resolved comment (outdated)

SQLite is used as a stand-in for an external remote relational database; it should be easy to adapt this example to use another database.

This example should be considered incomplete: it does not try to handle **many** edge cases or push down filters as much as possible.
adriangb (Collaborator, Author):

In particular, I am quite concerned that the filter pushdown might be incorrect in some ways; I don't intend for people to copy this or use it directly. Happy to add some tests to increase confidence, then tone down this warning.

Contributor:

I'll point out any obvious errors I see

sqlx-sqlite/src/main.rs: resolved comment
FileStatistics::RowGroupCount,
])
.column(ColumnStatistics::RowGroup)
.distinct() // could be distinct_on(vec![ColumnStatistics::FileId, ColumnStatistics::RowGroup]) if the backing store supports it
adriangb (Collaborator, Author):

There are a couple of things along these lines that are just limitations of SQLite.
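For a backing store that does support it, the sea_query side is a small change. A minimal sketch against Postgres; the Iden enum here is a stand-in for the example's actual ColumnStatistics type, with variant names assumed:

```rust
use sea_query::{Iden, PostgresQueryBuilder, Query};

// Stand-in for the example's statistics table; variant names assumed.
#[derive(Iden)]
enum ColumnStatistics {
    Table,
    FileId,
    RowGroup,
}

fn main() {
    // DISTINCT ON deduplicates per (file_id, row_group) key pair, which
    // SQLite lacks but Postgres supports natively.
    let sql = Query::select()
        .column(ColumnStatistics::RowGroup)
        .distinct_on(vec![ColumnStatistics::FileId, ColumnStatistics::RowGroup])
        .from(ColumnStatistics::Table)
        .to_string(PostgresQueryBuilder);
    // Prints something like:
    //   SELECT DISTINCT ON ("file_id", "row_group") "row_group" FROM "column_statistics"
    println!("{sql}");
}
```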

sqlx-sqlite/src/index.rs: resolved comment (outdated)
adriangb (Collaborator, Author) commented Jun 7, 2024:

cc @alamb

adriangb (Collaborator, Author):

This is just an empty sqlite file

adriangb requested a review from alamb on June 7, 2024 at 19:59
adriangb self-assigned this on Jun 7, 2024
alamb (Contributor) left a comment:

Thank you @adriangb -- this looks very cool. Thank you for sharing.


/// Push down a simple binary expression to the index
/// Only a subset of expressions are supported since `a = 1` has to be rewritten as `a_int_max_value >= 1 AND a_int_min_value <= 1`
fn push_down_binary_filter(value: &ScalarValue, op: &Operator) -> Option<SimpleExpr> {
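For readers skimming the thread, here is a sketch of what a body for this function could look like, handling only Int64 literals and assuming the sparse int_min_value/int_max_value statistics columns discussed below; it is not the PR's actual code:

```rust
use datafusion::logical_expr::Operator;
use datafusion::scalar::ScalarValue;
use sea_query::{Alias, Expr as SeaExpr, SimpleExpr};

fn push_down_binary_filter(value: &ScalarValue, op: &Operator) -> Option<SimpleExpr> {
    // Only Int64 literals are handled in this sketch.
    let ScalarValue::Int64(Some(v)) = value else {
        return None;
    };
    let min = SeaExpr::col(Alias::new("int_min_value"));
    let max = SeaExpr::col(Alias::new("int_max_value"));
    match op {
        // `a = 1` can only hold in a row group whose [min, max] range contains 1.
        Operator::Eq => Some(max.gte(*v).and(min.lte(*v))),
        // `a < 1` can only hold in a row group whose minimum is below 1.
        Operator::Lt => Some(min.lt(*v)),
        Operator::Gt => Some(max.gt(*v)),
        // Anything else: don't push down, let DataFusion evaluate it.
        _ => None,
    }
}
```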
Contributor:

It might be clearer to call this "rewrite_filter" or "rewrite_binary_expr", as it isn't really "pushing down" the filter in my mind -- it is creating a SimpleExpr that will be "pushed down" (into SQLite).

Contributor:

BTW this rewrite looks like a combination of:

  1. The expr rewrite that PruningPredicate does internally
  2. A translation from the DataFusion Exprs to sea_query

If you separated the two passes the logic might be clearer, and maybe you could reuse some existing code. Some thoughts:

  1. Consider using the rewritten PruningPredicate expr directly (there has been a lot of effort put into testing that rewrite for correctness); see the sketch below
  2. Use the Expr --> SQL code in expr_to_sql instead of sea_query (as the code already exists)
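A sketch of suggestion 1, showing how the PruningPredicate rewrite produces a predicate over min/max statistics columns. Module paths and method names here are from roughly contemporary DataFusion and may differ by version:

```rust
use std::sync::Arc;
use datafusion::arrow::datatypes::{DataType, Field, Schema};
use datafusion::common::DFSchema;
use datafusion::execution::context::ExecutionProps;
use datafusion::physical_expr::create_physical_expr;
use datafusion::physical_optimizer::pruning::PruningPredicate;
use datafusion::prelude::{col, lit};

fn main() -> datafusion::error::Result<()> {
    // Schema of the data being pruned: a single Int64 column `a`.
    let schema = Arc::new(Schema::new(vec![Field::new("a", DataType::Int64, false)]));
    let df_schema = DFSchema::try_from(schema.clone())?;

    // Lower the logical predicate `a = 1` to a physical expression.
    let props = ExecutionProps::new();
    let expr = create_physical_expr(&col("a").eq(lit(1i64)), &df_schema, &props)?;

    // PruningPredicate rewrites it into a check over `a_min` / `a_max`
    // statistics columns, e.g. `a_min@0 <= 1 AND 1 <= a_max@1`.
    let pruning = PruningPredicate::try_new(expr, schema)?;
    println!("{}", pruning.predicate_expr());
    Ok(())
}
```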

adriangb (Collaborator, Author), Jun 10, 2024:

This is great, I was wondering if there's existing tested code to do these conversions. I see that I can build a PruningPredicate and call PruningPredicate::predicate_expr to get the rewritten expr, so I can then run that against SQLite and not have to materialize the statistics in memory (which is what PruningPredicate seems to be used for generally).

However, the way the rewriting happens, it expects the schema of the statistics to match the schema of the table: a column `value` needs statistics columns `value_min` and `value_max` of the correct type. That makes sense when your statistics are stored with the actual data or in memory, but it breaks down if you use a traditional RDBMS and want to support more than one table schema, because you don't want to make a super wide table and add dynamically generated columns.

The schema I'm using here gets around that by storing a sparse table of values (one min and one max column for each type) with (file_id, row_group, column_name) as the primary key.

What do you think generally of the statistics schema being used here? Any idea how we can generalize what PruningPredicate does without requiring the schema of the statistics table to depend on the schema of the data?
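For concreteness, the sparse layout described above, sketched as sea_query DDL. The column names and the set of typed min/max pairs are illustrative, not the PR's exact schema:

```rust
use sea_query::{ColumnDef, Iden, Index, SqliteQueryBuilder, Table};

#[derive(Iden)]
enum ColumnStatistics {
    Table,
    FileId,
    RowGroup,
    ColumnName,
    IntMinValue,
    IntMaxValue,
    StringMinValue,
    StringMaxValue,
}

fn main() {
    // One row per (file, row group, column); exactly one typed min/max
    // pair is non-NULL depending on the column's data type.
    let ddl = Table::create()
        .table(ColumnStatistics::Table)
        .col(ColumnDef::new(ColumnStatistics::FileId).integer().not_null())
        .col(ColumnDef::new(ColumnStatistics::RowGroup).integer().not_null())
        .col(ColumnDef::new(ColumnStatistics::ColumnName).string().not_null())
        .col(ColumnDef::new(ColumnStatistics::IntMinValue).big_integer())
        .col(ColumnDef::new(ColumnStatistics::IntMaxValue).big_integer())
        .col(ColumnDef::new(ColumnStatistics::StringMinValue).string())
        .col(ColumnDef::new(ColumnStatistics::StringMaxValue).string())
        .primary_key(
            Index::create()
                .col(ColumnStatistics::FileId)
                .col(ColumnStatistics::RowGroup)
                .col(ColumnStatistics::ColumnName),
        )
        .to_string(SqliteQueryBuilder);
    println!("{ddl}");
}
```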

adriangb (Collaborator, Author):

Looking a bit further into what PruningPredicate is doing, it seems to me that the LiteralGuarantees part is not necessary if I'm pushing work to another database. Let me know if that sounds right or not.

adriangb (Collaborator, Author), Jun 10, 2024:

If we wanted to keep the current schema and re-use existing logic, I think we'd need to make RequiredColumns::stat_column_expr pluggable.

I'll also note that I use the type of the value to determine which column to query (since they're named int_max_value and such), which seems wrong as soon as someone does `date = '2020-01-01'`. I think it would be better to use the approach the existing code seems to use of getting the type from the schema.

Review comment:

I think here you mean `col` can never be `x` and `y` at the same time 👀

Contributor:

I see -- yes, if you want to push the evaluation down into the lower-level database without a column per statistic (which could indeed result in a large number of columns), it would be hard to reuse the existing pruning predicate logic.

adriangb (Collaborator, Author):

I think we may still be able to make it work by doing a join per column, which I don't love, but it should work and avoids any dynamic schemas in the index database. I'll give that a try and loop back.

adriangb (Collaborator, Author), Jun 11, 2024:

I'm going to rework this to be simple and hardcoded with the index's schema having a min/max column per column in the actual schema. It was getting out of hand to try to make this fancier for an example.

@alamb is there a way to go from a PhysicalExpr to an Expr if I want to use expr_to_sql? It's another thing where it would be simpler for the example but not universal (I assume it generates DataFusion-flavored SQL; there's no guarantee that's valid for the lower-level index database).

adriangb (Collaborator, Author):

@alamb I've reworked this to use PruningPredicate to do the rewriting. The statistics table now has largely the same schema as what PruningPredicate expects.

I did not use expr_to_sql because that generates DataFusion-flavored SQL, which would not be compatible with many index databases. Instead I opted to implement a function to convert from PhysicalExpr -> sea_query::SimpleExpr. I think this is valuable to have as an example for folks: it covers a wide range of index databases (since SeaQuery can generate SQL for SQLite, Postgres, MySQL, or be extended) and is not that much code (at least for the cases I chose to handle).
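A condensed sketch of the shape of such a conversion, handling only AND/OR and `column <op> literal` comparisons; the PR's actual function covers more cases:

```rust
use std::sync::Arc;
use datafusion::logical_expr::Operator;
use datafusion::physical_plan::expressions::{BinaryExpr, Column, Literal};
use datafusion::physical_plan::PhysicalExpr;
use datafusion::scalar::ScalarValue;
use sea_query::{Alias, Expr as SeaExpr, SimpleExpr};

fn physical_expr_to_sea_query(expr: &Arc<dyn PhysicalExpr>) -> Option<SimpleExpr> {
    let bin = expr.as_any().downcast_ref::<BinaryExpr>()?;

    // Recurse into conjunctions and disjunctions.
    if matches!(bin.op(), Operator::And | Operator::Or) {
        let left = physical_expr_to_sea_query(bin.left())?;
        let right = physical_expr_to_sea_query(bin.right())?;
        return Some(match bin.op() {
            Operator::And => left.and(right),
            _ => left.or(right),
        });
    }

    // Base case: `column <op> literal`.
    let column = bin.left().as_any().downcast_ref::<Column>()?;
    let literal = bin.right().as_any().downcast_ref::<Literal>()?;
    let value: sea_query::Value = match literal.value() {
        ScalarValue::Int64(Some(v)) => (*v).into(),
        ScalarValue::Utf8(Some(s)) => s.clone().into(),
        _ => return None, // unsupported literal type: don't push down
    };
    let col = SeaExpr::col(Alias::new(column.name()));
    Some(match bin.op() {
        Operator::Eq => col.eq(value),
        Operator::Lt => col.lt(value),
        Operator::LtEq => col.lte(value),
        Operator::Gt => col.gt(value),
        Operator::GtEq => col.gte(value),
        _ => return None,
    })
}
```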

// TODO: we could aggregate the row groups into an array in the query to transmit less data over the wire
// (and maybe avoid the join), leaving that as a TODO since it introduces more complexity and coupling to the index's backing store
// Result is in the form of (file_name, file_size, row_group_count, row_group_to_scan)
let row_groups: Vec<(String, i64, i64, i64)> = sqlx::query_as_with(&sql, values)
Contributor:

FWIW you could probably use SQLite to do the aggregations.

Like add a SELECT DISTINCT ... to the query so you wouldn't have to handle duplicates in the Rust code (handling duplicates is fine, I am just pointing it out).

adriangb (Collaborator, Author):

There's already a distinct in there (currently line 91).
I do think these queries could be a bit fancier, e.g. I'd like it to just return file_name string, file_size int, row_groups int[], but I'm not sure that's possible with SQLite; I'm most familiar with Postgres. But I do plan on giving it another shot before merging this.
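One possibility along those lines, sketched with SQLite's json_group_array. The view name and column set here are assumptions, and the JSON array comes back as text that would still need parsing:

```rust
use sqlx::SqlitePool;

// Hypothetical aggregated variant: one row per file, with the row groups
// to scan packed into a JSON array by SQLite's json_group_array.
async fn files_to_scan(pool: &SqlitePool) -> sqlx::Result<Vec<(String, i64, i64, String)>> {
    sqlx::query_as(
        "SELECT file_name, file_size, row_group_count, \
                json_group_array(row_group) AS row_groups \
         FROM file_statistics_view \
         GROUP BY file_name, file_size, row_group_count",
    )
    .fetch_all(pool)
    .await
}
```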

Contributor:

Cool -- sorry I didn't see that. Nice!

Review comment:

Possibly don't want to commit the DB to the repo?

adriangb (Collaborator, Author):

I actually do. Running this is idempotent and it's one less setup command. The empty db is also 24kB.

adriangb changed the title from "Example using SQLx + SQLite" to "Example using SQLx + SQLite for secondary indexes" on Jun 11, 2024
adriangb changed the title from "Example using SQLx + SQLite for secondary indexes" to "Example using SQLite via SQLx for secondary indexes" on Jun 11, 2024
samuelcolvin:

Looks great in general, I'll read this more later.

I do wonder if we could do something similar using a bloom filter stored in the external index - either using the bloom crate, then serializing that and storing the data in SQLite (that's what I prototyped in logfire), or better, using a bloom filter that's natively part of the database.

A bloom filter will only make sense on some columns, and even then on relatively large row groups or entire files, but still it could be very effective for stuff like trace_id where min/max won't help and there are not that many distinct values.

adriangb (Collaborator, Author) commented Jun 11, 2024:

> I do wonder if we could do something similar using a bloom filter stored in the external index

Yes, absolutely! I think that's beyond the scope of this example though.

samuelcolvin:

yes definitely.

alamb (Contributor) commented Jun 11, 2024:

> I do wonder if we could do something similar using a bloom filter stored in the external index - either using the bloom crate, then serialize that and store the data in sqlite (that's what I prototyped in logfire), or better using a bloom filter that's natively part of the database.

BTW bloom filters (and similar structures) are the use case for PruningStatistics::contains.

Specifically, that will tell you the constant values to check for in your bloom filter. We use this code in the parquet reader to evaluate parquet bloom filters.

adriangb (Collaborator, Author):

> BTW bloom filters (and similar structures) are the use case for PruningStatistics::contains

It'd be really cool to have an example (again, I think not this one) where PruningStatistics::contains gets used to get the constant values to check, and those constant values are checked against a secondary index instead of bloom filters in the parquet file (either using the index's native filtering, or pulling the bloom filter as a blob from the index and filtering against it in memory).

sqlx-sqlite/src/index.rs: four resolved comments (outdated)
let null_counts = null_counts.as_primitive::<UInt64Type>();

for row_group in 0..metadata.num_row_groups() {
    match field.data_type() {

Review comment:

You could probably get some re-use here with two generic functions (one for IntXType and one for strings).
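For reference, the kind of generic helper being suggested might look like this; the Into<Value> bound is an assumption about how the example maps Arrow native values to sea_query values:

```rust
use datafusion::arrow::array::PrimitiveArray;
use datafusion::arrow::datatypes::ArrowPrimitiveType;
use sea_query::Value;

/// Convert a statistics array of any primitive Arrow type into sea_query
/// values, mapping Arrow NULLs to SQL NULLs.
fn primitive_stats_to_values<T>(array: &PrimitiveArray<T>) -> Vec<Value>
where
    T: ArrowPrimitiveType,
    Option<T::Native>: Into<Value>,
{
    array.iter().map(Into::into).collect()
}
```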

Review comment:

Ok I nerd-sniped myself into trying this; davidhewitt@9fa9e07 - feel free to pull the commit if you want it :)

adriangb (Collaborator, Author):

I ended up going with e88641b, which also means we do the downcasts once instead of for each value in the stats arrays. Let me know if that sounds good to you.

Review comment:

Ah definitely better!

sqlx-sqlite/src/index.rs: resolved comment (outdated)
let values = match array.data_type() {
    DataType::Int8 => {
        let array = array.as_primitive::<datatypes::Int8Type>();
        array.iter().map(|v| {

Review comment:

Can the map closure here be simplified to:

array.iter().map(Value::TinyInt).collect()

and similar for the other data types?

adriangb (Collaborator, Author):

Yes, at least those that don’t require a box

adriangb (Collaborator, Author):

I think this is in a good enough state to merge and be iterated upon in the future.

adriangb merged commit ae4edd4 into main on Jun 18, 2024
adriangb deleted the sqlx branch on June 18, 2024 at 18:38