Apache Airflow - A platform to programmatically author, schedule, and monitor workflows
Apache DolphinScheduler is a modern data orchestration platform. Agile creation of high-performance workflows with low code.
An orchestration platform for the development, production, and observation of data assets.
RAGFlow is an open-source RAG (Retrieval-Augmented Generation) engine based on deep document understanding.
🧙 Build, run, and manage data pipelines for integrating and transforming data.
Open source libraries and APIs to build custom preprocessing pipelines for labeling, training, or production machine learning pipelines.
MLeap: Deploy ML Pipelines to Production
Build data pipelines, the easy way 🛠️
Lean and mean distributed stream processing system written in Rust and WebAssembly.
Use this template repository to write projects and tenders data ingestion pipelines
The dbt-native data observability solution for data & analytics engineers. Monitor your data pipelines in minutes. Available as self-hosted or cloud service with premium features.
Optimus is an easy-to-use, reliable, and performant workflow orchestrator for data transformation, data modeling, pipelines, and data quality management.
Dataform is a framework for managing SQL-based data operations in BigQuery.
Meltano: the declarative code-first data integration engine that powers your wildest data and ML-powered product ideas. Say goodbye to writing, maintaining, and scaling your own API integrations.
The best place to learn data engineering. Built and maintained by the data engineering community.
Python ETL framework for stream processing, real-time analytics, LLM pipelines, and RAG.
First open-source data discovery and observability platform. We make life easy for data practitioners so you can focus on your business.
Relational data pipelines for the science lab
dbt package that is part of Elementary, the dbt-native data observability solution for data & analytics engineers. Monitor your data pipelines in minutes. Available as self-hosted or cloud service with premium features.
Learn the basics of Apache Kafka® from leaders in the Kafka community with these video courses covering the Kafka ecosystem and hands-on exercises.