Scalable identity resolution, entity resolution, data mastering and deduplication using ML
Updated Jul 7, 2024 - Java
Wrangler Transform: A DMD system for transforming Big Data
Data transformation framework for ETL processing with SQL-like syntax and GIS extensions, based on Apache Spark
Preprocessing of data (e.g. filling missing values, normalization, etc.) in the field of Data Mining (Knowledge Discovery).
🗓️ iCalendar proxy reshaping the data for your needs
Pluggable framework that can be used to spider websites and extract data.
The project efficiently processes user data, demonstrating key components. Explore the code for a structured approach to large-scale data transformations.
Apache Spark-based 'Dist' utility to supplement the Data Cooker ETL tool
[👨🎓 BSc thesis] merGeo: Integration Platform For Linked Data Management Tools
API to receive IoT data from an end device
DeltaFi is a flexible, code-light data transformation and normalization platform.