Apache DolphinScheduler is a modern data orchestration platform, agile at creating high-performance workflows with low code.
Updated Jul 5, 2024 - Java
RAGFlow is an open-source RAG (Retrieval-Augmented Generation) engine based on deep document understanding.
Apache Airflow - A platform to programmatically author, schedule, and monitor workflows
An orchestration platform for the development, production, and observation of data assets.
Lean and mean distributed stream processing system written in Rust and WebAssembly.
🧙 Build, run, and manage data pipelines for integrating and transforming data.
Ministream is a small, stand-alone, real-time event messaging streaming server
dbt package that is part of Elementary, the dbt-native data observability solution for data & analytics engineers. Monitor your data pipelines in minutes. Available as self-hosted or cloud service with premium features.
Python ETL framework for stream processing, real-time analytics, LLM pipelines, and RAG.
Python toolkit for working with high-dimensional neural data recorded during naturalistic, continuous stimuli @a-darcher @rachrapp
Build data pipelines with SQL and Python, ingest data from different sources, add quality checks, and build end-to-end flows.
The dbt-native data observability solution for data & analytics engineers. Monitor your data pipelines in minutes. Available as self-hosted or cloud service with premium features.
Building data processing pipelines for document processing with NLP using Apache NiFi and related services
One framework to develop, deploy and operate data workflows with Python and SQL.
Framework for standardizing, transforming, and applying quality checks to time series data.
Open source libraries and APIs to build custom preprocessing pipelines for labeling, training, or production machine learning pipelines.
Low-code ETL for structured and unstructured data. Generates Python code you can deploy anywhere.
Meltano: the declarative code-first data integration engine that powers your wildest data and ML-powered product ideas. Say goodbye to writing, maintaining, and scaling your own API integrations.
MLeap: Deploy ML Pipelines to Production
Dataform is a framework for managing SQL-based data operations in BigQuery