The leading data integration platform for ETL / ELT data pipelines from APIs, databases & files to data warehouses, data lakes & data lakehouses. Both self-hosted and Cloud-hosted.
Updated Jul 7, 2024 - Python
Apache Airflow - A platform to programmatically author, schedule, and monitor workflows
Building data processing pipelines for document processing with NLP using Apache NiFi and related services
A high-performance, extremely flexible, and easily extensible multipurpose workflow engine.
A collection of Apache Hudi-related resources
🧙 Build, run, and manage data pipelines for integrating and transforming data.
SeaTunnel is a next-generation super high-performance, distributed, massive data integration tool.
An open-source compound AI toolchain for fast, accurate, and efficient entity matching, powered by LLMs.
Privacy- and security-focused Segment alternative, in Golang and React
Upserts, Deletes And Incremental Processing on Big Data.
Hop Orchestration Platform
Flink CDC is a streaming data integration tool
A Python framework for data mining microbial natural products by integrating genomics and metabolomics data
Jitsu is an open-source Segment alternative. Fully scriptable data ingestion engine for modern data teams. Set up a real-time data pipeline in minutes, not days
The open source high performance ELT framework powered by Apache Arrow
Lean and mean distributed stream processing system written in Rust and WebAssembly.
CloudQuery Go SDK for source and destination plugins
Powerful RDF Knowledge Graph Generation with RML Mappings
Conduit streams data between data stores. Kafka Connect replacement. No JVM required.
Integrated chemical property values from many source databases.