GitHub

Big-Data

Hadoop & Mapreduce

In Hadoop, MapReduce is a computation that decomposes large manipulation jobs into individual tasks that can be executed in parallel cross a cluster of servers. The results of tasks can be joined together to compute final results.

[MapReduce] consists of 2 steps:

Map Function – It takes a set of data and converts it into another set of data, where individual elements are broken down into tuples (Key-Value pair).
Reduce Function – Takes the output from Map as an input and combines those data tuples into a smaller set of tuples.
Work Flow of Program

Name		Name	Last commit message	Last commit date
Latest commit History 8 Commits
NGram		NGram
PageRank		PageRank
WordCount		WordCount
pics		pics
README.md		README.md

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Repository files navigation

Big-Data

About

Releases

Packages

Languages

KevinZhou92/Big_Data

Folders and files

Latest commit

History

Repository files navigation

Big-Data

About

Resources

Stars

Watchers

Forks

Releases

Packages 0

Languages

Packages