Description
Parts of datafusion and its ecosystem support DELETE and UPDATE queries (for example the sql parser), but other parts only really support SELECT and INSERT (for example, the TableProvider trait, the DefaultPhysicalPlanner).
Overall this means that datafusion by default does not support updates or deletes.
datafusion-postgres currently reflects this by supporting SELECT and INSERT queries. Supporting INSERT includes mapping from the returned table structure and correctly returning the number of affected rows to the postgresql client. The format it needs to map from is from DataSinkExec:
/// Create a output record batch with a count
///
/// ```text
/// +-------+,
/// | count |,
/// +-------+,
/// | 6     |,
/// +-------+,
/// ```
fn make_count_batch(count: u64) -> RecordBatch {
    let array = Arc::new(UInt64Array::from(vec![count])) as ArrayRef;
    RecordBatch::try_from_iter_with_nullable(vec![("count", array, false)]).unwrap()
}
It is actually possible today for applications using datafusion to support DELETE and UPDATE queries by using a custom QueryPlanner implementation. However, this will not be handled correctly by datafusion-postgres yet, because it doesn't map the number of rows affected for those statements.
As an experiment, I wrote a custom QueryPlanner implementation with support for update and delete queries. It returns the same "count" structure that DataSinkExec returns, and I hacked datafusion-postgres to map the returned count correctly. This all works fine.
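To make the missing mapping concrete, here is a minimal sketch of how a row count could be turned into the command tag a PostgreSQL client expects ("INSERT 0 n", "UPDATE n", "DELETE n"). The names `WriteKind` and `command_tag` are hypothetical and only for illustration; this is not how datafusion-postgres is actually structured.

```rust
/// Hypothetical: the kinds of write statement whose row count is reported.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum WriteKind {
    Insert,
    Update,
    Delete,
}

/// Build the PostgreSQL command tag for a completed write statement.
pub fn command_tag(kind: WriteKind, rows: u64) -> String {
    match kind {
        // INSERT tags carry an extra OID field, which is reported as 0.
        WriteKind::Insert => format!("INSERT 0 {}", rows),
        WriteKind::Update => format!("UPDATE {}", rows),
        WriteKind::Delete => format!("DELETE {}", rows),
    }
}
```

With something like this, the count taken from the single-row "count" batch would only need to be paired with the statement kind to produce the right response.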
My question is: How should this be supported properly in datafusion-postgres? It seems wrong to add support for update & delete mapping based on assuming a specific table structure when nothing is defined in datafusion itself for updates & deletes.
Maybe datafusion-postgres could expose a way to customise or override how it maps results to pgwire responses?
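A rough sketch of what such an override hook might look like, assuming a trait that applications can implement. Everything here is hypothetical: `RowCountMapper`, `CountColumnMapper` and `SimpleBatch` are made-up names, and `SimpleBatch` is a simplified stand-in for an Arrow `RecordBatch` so that the example has no dependencies.

```rust
/// Simplified stand-in for an Arrow RecordBatch: named u64 columns only.
pub struct SimpleBatch {
    pub columns: Vec<(String, Vec<u64>)>,
}

/// Hypothetical hook: decides how the output of a write plan becomes a
/// row count for the pgwire response.
pub trait RowCountMapper {
    /// Returns None when the batch is not recognised as a write result.
    fn rows_affected(&self, batch: &SimpleBatch) -> Option<u64>;
}

/// Possible default: accept only the single-column, single-row "count"
/// shape that DataSinkExec produces.
pub struct CountColumnMapper;

impl RowCountMapper for CountColumnMapper {
    fn rows_affected(&self, batch: &SimpleBatch) -> Option<u64> {
        match batch.columns.as_slice() {
            [(name, values)] if name.as_str() == "count" && values.len() == 1 => {
                Some(values[0])
            }
            _ => None,
        }
    }
}
```

An application with its own QueryPlanner could then supply a different implementation if its update and delete plans return another shape, without datafusion-postgres having to assume one.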
I'd be happy to work on this, but I'm not sure what the best approach is.
(Longer term, I also wonder whether datafusion's TableProvider trait might one day add support for updates and deletes, and whether working on that is the right approach.)