SQLMesh Data Pipeline for Drupal Integration

Overview

This document outlines an hourly process using SQLMesh to harvest data from external REST APIs, transform it, and store it in MySQL for consumption by Drupal views.

Open in GitHub Codespaces

Developing locally

The justfile in this repository contains the most useful commands:

$ just -l -u
Available recipes:
    default         # Choose a task to run
    prereqs         # Install project tools
    minikube        # Setup minikube
    mysql-svc       # Forward mysql from service defined in env
    dev             # SQLMesh ui for local dev
    skaffold *args  # skaffold configured with env and minikube
    mysqldump *args # mysqldump configured with same env as SQLMesh
    mysql *args     # mysql configured with same env as SQLMesh
    everestctl      # Install percona everest cli
    everest         # Percona Everest webui to manage databases

To get started, run just everest and use the web UI to create a database. Configure the database details in the .env file (see example.env for reference). Once configured, run just local-dev to forward the MySQL port and expose the SQLMesh UI.
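
A typical first-run sequence might look like the following sketch (the exact variables to set are defined in example.env):

just everest          # create a database via the Percona Everest web UI
cp example.env .env   # copy the template, then fill in the database details
just local-dev        # forward the MySQL port and expose the SQLMesh UI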

To dump the sqlmesh database for validation/testing:

just mysqldump sqlmesh | gzip > sqlmesh.sql.gz
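
To load that dump elsewhere for inspection, the same env-configured recipes can be used in reverse. A minimal sketch, assuming the mysql recipe passes its arguments straight through to the mysql client:

gunzip -c sqlmesh.sql.gz | just mysql sqlmesh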

Testing container with skaffold

Configure secrets, then run skaffold dev (which expects the secrets to already exist in the cluster).
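
One way to create such a secret is from the same .env file used for local development. This is a hypothetical sketch; check the Kubernetes manifests for the secret name and keys skaffold actually expects:

kubectl create secret generic harvest-consultations-env --from-env-file=.env   # hypothetical secret name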

Container publish workflow

Process Design

  1. Hourly Data Harvesting: SQLMesh connects to and harvests data from external REST APIs (see the run sketch after this list)

  2. Data Transformation: SQLMesh processes the harvested data

    • SQLMesh SQL Models
    • Data cleaning and standardization
    • Value translation based on mapping configuration
    • Data cloned from the DuckDB state engine to MySQL target tables
  3. Content Management:

    • Read-only imports of external content
    • Full management of Drupal-authored content
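
The hourly cadence is driven by SQLMesh's own scheduler: each model declares a cron schedule and sqlmesh run executes whichever models are due. A minimal sketch of the commands involved (how this repository actually invokes them, e.g. from a Kubernetes CronJob, may differ):

sqlmesh plan --auto-apply   # apply any pending model changes without prompting
sqlmesh run                 # execute models whose cron schedule (e.g. '@hourly') is due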

Notes on Development

For detailed implementation guidance, refer to: