Description
Materializations are split into non-distributed (standard) and distributed (non-standard) versions, i.e., `table` vs. `distributed_table` and `incremental` vs. `distributed_incremental`. As a result, we often encounter issues caused by redundancy and divergence in their implementations. This has led to distributed materializations lacking full feature support or containing bugs in duplicated sections of the codebase.
Some examples:
- Schema change validation for contracted incremental models in `dbt-core` is only applied to `"incremental"` materializations. A related issue prevents microbatches from running when using the `distributed_incremental` materialization.
- Significant code duplication between the `incremental` and `distributed_incremental` materializations makes maintenance harder and introduces subtle bugs when switching between them.
Suggested Solution
Consolidate the distributed logic into the existing `incremental` and `table` materializations, controlled via a new model configuration flag (e.g., a boolean `is_distributed`). This would:
- Simplify the codebase by reducing redundancy.
- Ensure consistent feature support and behavior across both modes.
- Enable more comprehensive integration testing for both distributed and non-distributed materializations.
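
To make the idea concrete, here is a minimal Python sketch of a single code path that branches on the proposed flag. It is not taken from the adapter: `ModelConfig`, `plan_materialization`, and all step names are hypothetical, and the distributed steps are only an assumption about how the work might be split.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ModelConfig:
    """Hypothetical subset of a model's configuration."""
    name: str
    materialized: str             # "table" or "incremental"
    is_distributed: bool = False  # the flag proposed in this issue


def plan_materialization(config: ModelConfig) -> list[str]:
    """Return the ordered steps for one model (illustrative step names)."""
    if config.materialized not in ("table", "incremental"):
        raise ValueError(f"unsupported materialization: {config.materialized}")

    steps = []
    # Shared logic lives in one place, so both modes get it.
    if config.materialized == "incremental":
        steps.append("validate_schema_changes")

    # Only the physical table layout branches on the flag (assumed steps).
    if config.is_distributed:
        steps.append("create_local_tables_on_cluster")
        steps.append("create_distributed_table")
    else:
        steps.append("create_table")

    steps.append("insert_rows")
    return steps
```

With this shape, a check such as schema change validation is written once and runs whether or not `is_distributed` is set, which is the consistency the list above is after.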
The existing `distributed_incremental` and `distributed_table` materializations could be retained as aliases (with deprecation warnings) for backwards compatibility.
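
One possible shape for the alias handling, sketched in Python. `LEGACY_ALIASES` and `resolve_materialization` are hypothetical names used for illustration only, and the warning text is an assumption; the real adapter may surface deprecations differently.

```python
import warnings

# Hypothetical mapping from legacy names to (materialization, is_distributed).
LEGACY_ALIASES = {
    "distributed_table": ("table", True),
    "distributed_incremental": ("incremental", True),
}


def resolve_materialization(name: str, is_distributed: bool = False) -> tuple[str, bool]:
    """Map a configured materialization name to its consolidated form."""
    if name in LEGACY_ALIASES:
        base, distributed = LEGACY_ALIASES[name]
        warnings.warn(
            f"'{name}' is deprecated; use materialized='{base}' "
            "with is_distributed=True",
            DeprecationWarning,
            stacklevel=2,
        )
        return base, distributed
    # Non-legacy names pass through unchanged.
    return name, is_distributed
```

Existing projects would keep working with the old names while being nudged toward the flag, for example `resolve_materialization("distributed_incremental")` yields `("incremental", True)` and emits a `DeprecationWarning`.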