ronellsalunke / Titanic-BigData Public


Java Hadoop MapReduce code for my Big Data Analytics Project using the Titanic dataset


Repository contents:

q1
q2
q3
q4
Questions.txt
README.md
input.csv
titanic.csv


BDA Project (Sem2)

Java Hadoop MapReduce code for my Big Data Analytics Project

Pre-requisites:

JDK
Hadoop

Steps

Make sure HADOOP_CLASSPATH points to the tools.jar inside jdkx.y/lib/ (tools.jar ships with JDK 8 and earlier)
Compile the code: hadoop com.sun.tools.javac.Main File.java
Create the JAR file: jar cf File.jar *.class
Create an input directory in HDFS: hadoop dfs -mkdir /dir (on recent Hadoop versions hadoop dfs is deprecated; hdfs dfs accepts the same options)
Upload the input file to HDFS: hadoop dfs -put input.csv /dir/input.txt
Run the job: hadoop jar File.jar File /dir/input.txt /dir/out.txt
Check the output directory (despite its name, /dir/out.txt is a directory that the job creates): hadoop dfs -ls /dir/*
Verify the output: hadoop dfs -cat /dir/out.txt/part-r-00000
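
To make it easier to see what a job like File.java does between the upload and the part-r-00000 file, here is a minimal, dependency-free sketch of the map, shuffle and reduce phases in plain Java. It is not the code in q1 to q4: the class name SurvivalCountSketch, the question it answers (survivors per sex), the simplified column layout and the sample rows are all made up for illustration, and Hadoop's Mapper and Reducer classes are replaced by plain static methods so that it runs without a cluster.

```java
import java.util.AbstractMap.SimpleEntry;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical stand-in for a MapReduce job; not taken from the repository.
public class SurvivalCountSketch {

    // Map phase: one CSV line in, zero or one (key, value) pair out.
    // Assumed (hypothetical) layout: PassengerId,Survived,Pclass,Sex,Age
    static List<Map.Entry<String, Integer>> map(String line) {
        List<Map.Entry<String, Integer>> out = new ArrayList<>();
        String[] fields = line.split(",");
        // Skip the header row and lines with too few columns.
        if (fields.length < 4 || fields[0].equals("PassengerId")) {
            return out;
        }
        // Emit (sex, 1) only for passengers who survived.
        if (fields[1].trim().equals("1")) {
            out.add(new SimpleEntry<>(fields[3].trim(), 1));
        }
        return out;
    }

    // Reduce phase: sum every value that was emitted for one key.
    static int reduce(String key, List<Integer> values) {
        int sum = 0;
        for (int v : values) {
            sum += v;
        }
        return sum;
    }

    // Driver stand-in: map each line, group values by key (the shuffle),
    // then reduce each group. TreeMap keeps keys sorted, as Hadoop does.
    static Map<String, Integer> run(List<String> lines) {
        Map<String, List<Integer>> grouped = new TreeMap<>();
        for (String line : lines) {
            for (Map.Entry<String, Integer> kv : map(line)) {
                grouped.computeIfAbsent(kv.getKey(), k -> new ArrayList<>())
                       .add(kv.getValue());
            }
        }
        Map<String, Integer> result = new TreeMap<>();
        for (Map.Entry<String, List<Integer>> e : grouped.entrySet()) {
            result.put(e.getKey(), reduce(e.getKey(), e.getValue()));
        }
        return result;
    }

    public static void main(String[] args) {
        // Made-up sample rows in the assumed layout.
        List<String> lines = Arrays.asList(
            "PassengerId,Survived,Pclass,Sex,Age",
            "1,0,3,male,22",
            "2,1,1,female,38",
            "3,1,3,female,26",
            "4,1,1,male,35");
        // Print key<TAB>value, the shape of Hadoop's default text output.
        run(lines).forEach((k, v) -> System.out.println(k + "\t" + v));
    }
}
```

The sketch splits on every comma, which only works for the simplified rows above. If input.csv follows the Kaggle Titanic layout, the quoted Name column contains commas, so a real mapper would need a CSV-aware split or different column indices.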
