GitHub - rkh/hadoop-scripting: Scripting Languages on Hadoop: Jaql vs. Pig Latin (MapReduce stuff)

Scripting Languages on Hadoop: Jaql vs. Pig Latin (MapReduce stuff)

tinyurl.com/pig-jaql

Repository contents:
  benchmarks
  jaql
  java/markov
  paper
  pig
  presentation
  ruby
  .gitignore
  LICENSE
  Rakefile
  markov.sh
  readme.txt
  wordcount.sh

This code was written by Johan Uhle and Konstantin Haase during the summer term 2009 at Hasso Plattner Institute, Universität Potsdam, D-14482 Potsdam, Germany, in the seminar “Map/Reduce Algorithms on Hadoop” supervised by Alexander Albrecht and Prof. Felix Naumann.

Since this is neither an Eclipse nor a Java project, we did not provide an Ant file.

For running wordcount benchmarks:
Make sure you have set up Hadoop, Jaql and Pig.
Execute ./wordcount.sh

For running markov benchmarks:
Make sure you have set up Hadoop and Pig.
Execute ./markov.sh

For generating the PDF:
Make sure you have LaTeX, gnuplot and Ruby installed.
Execute rake
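
Each step above starts with a "make sure you have ..." check. A minimal sketch of automating that check is shown below. The helper name missing_tools is hypothetical and not part of this repository, and the executable names passed to it (latex, gnuplot, ruby, rake) are assumptions that may differ on your system.

```shell
#!/bin/sh
# Hypothetical helper, not part of this repository.
# Prints "missing: NAME" for every command that is not found on PATH.
# Returns 0 if all commands were found, 1 otherwise.
missing_tools() {
    rc=0
    for tool in "$@"; do
        # command -v is the POSIX way to test whether a command exists.
        if ! command -v "$tool" >/dev/null 2>&1; then
            echo "missing: $tool"
            rc=1
        fi
    done
    return "$rc"
}
```

For example, before generating the PDF one could run: missing_tools latex gnuplot ruby rake || exit 1
The same call with the names of your Hadoop and Pig launchers could precede the benchmark scripts.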

For generating the Pig UDF:
cd pig/splitsuc
javac -cp pig.jar SPLITSUC.java STORESQL.java
cd ..
jar -cf splitsuc.jar splitsuc