Diffa is a command-line interface (CLI) tool for comparing data between two database systems. It supports configuration through both environment variables and a configuration file.
On the Meltano side, Diffa can be installed like any other plugin:
- name: diffa
  namespace: diffa
  pip_url: git+https://github.com/Kaligo/diffa.git
  executable: diffa
  config:
    uri:
      source: ${DIFFA__SOURCE_URI}
      target: ${DIFFA__TARGET_URI}
      diffa_db: ${DIFFA__DIFFA_DB_URI}
  settings:
    - name: uri.source
      env: DIFFA__SOURCE_URI
    - name: uri.target
      env: DIFFA__TARGET_URI
    - name: uri.diffa_db
      env: DIFFA__DIFFA_DB_URI
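The plugin definition reads its three connection strings from environment variables, so a quick pre-flight check can catch a missing one before a run. The following is only a sketch: the helper name `missing_diffa_vars` is hypothetical and not part of Diffa or Meltano, and it assumes all three variables are expected to come from the environment.

```python
import os

# Variable names taken from the plugin definition.
DIFFA_ENV_VARS = ("DIFFA__SOURCE_URI", "DIFFA__TARGET_URI", "DIFFA__DIFFA_DB_URI")


def missing_diffa_vars(env=None):
    """Return the names of Diffa variables that are unset or empty.

    Hypothetical helper for illustration; not provided by Diffa.
    """
    env = os.environ if env is None else env
    return [name for name in DIFFA_ENV_VARS if not env.get(name)]


if __name__ == "__main__":
    missing = missing_diffa_vars()
    if missing:
        print("Missing environment variables: " + ", ".join(missing))
    else:
        print("All Diffa connection variables are set.")
```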
Users can configure database connection strings in two ways:
- Environment Variables (higher priority)
  - DIFFA__SOURCE_URI: Connection string for the source database.
  - DIFFA__TARGET_URI: Connection string for the target database.
  - DIFFA__DIFFA_DB_URI: Connection string for the Diffa database.
- Configuration File (if environment variables are not set)
  - Run the following command to configure Diffa interactively:
    diffa configure
  - This will store the connection strings in ~/.diffa/config.json.
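The precedence rule (environment variable first, configuration file as fallback) can be sketched as follows. This is an illustration, not Diffa's actual implementation: the function names are hypothetical, and the key names assumed inside `~/.diffa/config.json` (`source_uri`, `target_uri`, `diffa_db_uri`) are guesses, since the file layout is not documented here.

```python
import json
import os
from pathlib import Path

CONFIG_PATH = Path.home() / ".diffa" / "config.json"

# ASSUMPTION: the config file is a flat JSON object with these keys.
# The real key names written by `diffa configure` may differ.
ENV_TO_FILE_KEY = {
    "DIFFA__SOURCE_URI": "source_uri",
    "DIFFA__TARGET_URI": "target_uri",
    "DIFFA__DIFFA_DB_URI": "diffa_db_uri",
}


def load_file_config(path=CONFIG_PATH):
    """Load the JSON config file, or return an empty dict if it is absent."""
    if not path.exists():
        return {}
    with path.open() as fh:
        return json.load(fh)


def resolve_uri(env_name, env=None, file_config=None):
    """Return the environment value if set and non-empty, else the file value.

    Returns None when neither source provides a value.
    """
    env = os.environ if env is None else env
    if file_config is None:
        file_config = load_file_config()
    value = env.get(env_name)
    if value:
        return value
    return file_config.get(ENV_TO_FILE_KEY[env_name])
```

Passing `env` and `file_config` explicitly keeps the lookup easy to test without touching the real environment or home directory.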
- Interactively configure database connections and save them to ~/.diffa/config.json. Environment variables take precedence over the configuration in the file.
  diffa configure
- Run the Diffa database migrations.
  diffa migrate
The data-diff command checks data differences between two database systems.
To compare data between two databases, run:
diffa data-diff \
--source-database loyalty_engine_staging \
--source-schema public \
--source-table users \
--target-database rc-us_dev \
--target-schema loyalty_engine \
--target-table users \
--lookback-window 1 \
--execution-date 2025-02-02
- --source-database: Name of the source database (default: inferred from the connection string).
- --source-schema: Schema of the source table (default: public).
- --source-table: (Required) Name of the source table.
- --target-database: Name of the target database (default: inferred from the connection string).
- --target-schema: Schema of the target table (default: public).
- --target-table: (Required) Name of the target table.
- --lookback-window: (Required) Lookback window in days.
- --execution-date: (Required) Execution date in YYYY-MM-DD format.
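When data-diff is launched from a scheduler or script, it can help to assemble the argument list programmatically. The sketch below uses only the options listed above. The function `build_data_diff_command` is a hypothetical helper, not part of Diffa, and the date check is an extra safeguard added for illustration.

```python
import shlex
from datetime import datetime


def build_data_diff_command(source_table, target_table, lookback_window,
                            execution_date, source_database=None,
                            source_schema=None, target_database=None,
                            target_schema=None):
    """Build the argv list for `diffa data-diff` (hypothetical helper).

    Optional flags left as None are omitted so Diffa applies its defaults.
    """
    # Raises ValueError if the string cannot be parsed as year-month-day.
    datetime.strptime(execution_date, "%Y-%m-%d")

    options = [
        ("--source-database", source_database),
        ("--source-schema", source_schema),
        ("--source-table", source_table),
        ("--target-database", target_database),
        ("--target-schema", target_schema),
        ("--target-table", target_table),
        ("--lookback-window", lookback_window),
        ("--execution-date", execution_date),
    ]
    cmd = ["diffa", "data-diff"]
    for flag, value in options:
        if value is not None:
            cmd.extend([flag, str(value)])
    return cmd


if __name__ == "__main__":
    cmd = build_data_diff_command(
        source_table="users",
        target_table="users",
        lookback_window=1,
        execution_date="2025-02-02",
        target_schema="loyalty_engine",
    )
    # Print the command in shell-quoted form instead of running it.
    print(shlex.join(cmd))
```

The resulting list can be passed to `subprocess.run(cmd, check=True)`, which avoids shell quoting issues with database or table names.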