hadoop-hdfs

Here are 56 public repositories matching this topic...

Morphl-AI / MorphL-Community-Edition

MorphL Community Edition uses big data and machine learning to predict user behaviors in digital products and services with the end goal of increasing KPIs (click-through rates, conversion rates, etc.) through personalization

kubernetes machine-learning cassandra pipeline artificial-intelligence pyspark user-experience data-driven-design conversion-rate-optimization front-end-development product-development hadoop-hdfs morphl-platform

Updated Oct 2, 2019
Python

vim89 / datapipelines-essentials-python

Simplified ETL process in Hadoop using Apache Spark. Includes a complete ETL pipeline for a data lake: SparkSession extensions, DataFrame validation, Column extensions, SQL functions, and DataFrame transformations.

python big-data spark apache-spark hadoop etl xml python3 xml-parsing pyspark data-pipeline datalake hadoop-mapreduce spark-sql etl-framework hadoop-hdfs etl-pipeline etl-components

Updated May 6, 2023
Python

AhmetFurkanDEMIR / Data-Engineering-Project-with-HDFS-and-Kafka

Data Engineering Project with Hadoop HDFS and Kafka

Updated Nov 4, 2023
Python

prabal03 / python-automation-in-linux

Python automation in Linux

linux docker aws webserver lvm hadoop-hdfs

Updated Nov 8, 2020
Python

Ren294 / Log-Analysis-Project

This project builds a scalable log analytics pipeline using a Lambda architecture for real-time and batch processing of NASA server logs.

data-science big-data cassandra apache-spark hive hadoop grafana data-engineering spark-streaming apache-kafka apache-nifi powerbi spark-sql big-data-analytics hadoop-hdfs cassandra-driver spark-rdd

Updated Sep 16, 2024
Python

briandi26 / Machine-Learning-for-Forest-Fire-Prediction

Machine Learning for Forest Fire Prediction using the Hadoop ecosystem and Spark tools (PySpark)

machine-learning spark pyspark forest-fire-model hadoop-hdfs

Updated Aug 2, 2019
Python

viveknigam3003 / hadoop-linux-setup

Python scripts to assist setting up Hadoop v1 in Linux and starting a NameNode, DataNodes and Client.

linux hadoop cluster hdfs hadoop-hdfs

Updated Dec 17, 2018
Python

MarwanMashra / Hadoop-MapReduce

Map/Reduce project with Hadoop

python distributed-systems hadoop mapreduce hadoop-mapreduce hadoop-hdfs

Updated Feb 27, 2022
Python

rajoffl / BigData-Project-3-Kafka-Streaming

Real-Time Meetup RSVP Data Processing using Kafka and Spark

kafka spark aws-s3 pyspark matplotlib hadoop-hdfs

Updated Sep 7, 2021
Python

Luissalazarsalinas / Avocado-Yield-Prediction

Freelancer Project - Batch processing data pipeline and machine learning application.

docker data-science data machine-learning airflow hive docker-compose agriculture postgresql python3 data-engineering xgboost powerbi datawarehouse avocado hadoop-hdfs fastapi

Updated Oct 23, 2023
Python

pranay1603 / Linux-Automation

Linux-Automation

linux docker aws webserver python3 lvm hadoop-hdfs

Updated Nov 11, 2020
Python

SakhriHoussem / MapReduce-Python

MapReduce Python Example

Updated Jul 11, 2018
Python

nbfujx / hadoop-learn-demo

hadoop hadoop-mapreduce hadoop-hdfs

Updated Jan 16, 2018
Python

OmarZOS / deep-learning-at-scale

This repository contains the scripts needed for oil production flow prediction models that use Spark's MLlib.

spark cnn-keras lstm-neural-networks oil-wells hadoop-hdfs oil-and-gas

Updated Jan 28, 2022
Python

BurraAbhishek / Python_Hadoop_MapReduce_MarketBasketAnalysis

Market Basket Analysis using Hadoop MapReduce in Python

python frequent-itemset-mining hadoop-mapreduce hadoop-streaming apriori-algorithm market-basket-analysis hadoop-hdfs affinity-analysis

Updated Jul 25, 2021
Python

theNeo39 / airline_analysis

Airline On-Time Performance Analysis Using Hive

python hive tez jdbc-connector hadoop-hdfs

Updated Jan 25, 2021
Python

divithraju / divith-raju-pipeline-hadoop-pyspark

This project presents a comprehensive data pipeline designed to predict customer churn using historical customer data. By leveraging Hadoop and PySpark, this pipeline efficiently processes large datasets, performs feature engineering, and trains a machine learning model to identify customers at risk of leaving.

linux open-source data database hadoop pipeline ubuntu bigdata apache project python3 pyspark software-engineering dataengineering hadoop-hdfs pyspark-mllib pyspark-python project-repository

Updated Aug 17, 2024
Python

everthonnreis / hadoop-spark-install-shell-script

Script for installing a standalone Hadoop and Spark environment

scala spark jupyterhub shell-script hadoop-hdfs anaconda3

Updated Sep 6, 2021
Python

divithraju / divith-raju-ETL-Airflow-Project

This ETL pipeline project is a practical demonstration of my skills in data engineering and automation using Python and Apache Airflow. By integrating MySQL for data storage and leveraging Airflow for task orchestration, the project simulates a scalable and modular ETL solution often required in enterprise data workflows.

linux data airflow sql apache-spark ubuntu etl script bigdata apache project python3 mysql-database dataengineering hadoop-hdfs etl-pipeline airflow-dags project-repository

Updated Aug 17, 2024
Python

dastagiri7 / Deep-Learning-Prediction-Engine-on-Big-Data

To support the housing business in a specific region, the proposed model helps predict the best feasible real-estate pricing.

deep-learning neural-network tensorflow keras big-data-analytics hadoop-hdfs big-data-for-official-statistics

Updated Aug 18, 2021
Python
