Individual MR job commands

Madhav Sharan edited this page Jan 19, 2017 · 7 revisions

Video Processing Jobs

Command to run OTS

Input - /user/pts/output/OpticalAndGradientTimeSeriesInput
(The input directory contains a file, original_videos.txt, that lists the path of every video file)
Output - /user/pts/output/OTSOutput
(of.txt files for each video)
Command

hadoop fs -rm -r /user/pts/output/OTSOutput
hadoop jar /mnt/pooled_time_series/target/pooled-time-series-1.0-SNAPSHOT-jar-with-dependencies.jar org.pooledtimeseries.OpticalTimeSeries /user/pts/output/OpticalAndGradientTimeSeriesInput /user/pts/output/OTSOutput 

Command to run GTS

Input - /user/pts/output/OpticalAndGradientTimeSeriesInput
(The input directory contains a file, original_videos.txt, that lists the path of every video file)
Output - /user/pts/output/GTSOutput
(hog.txt files for each video)
Command

hadoop fs -rm -r /user/pts/output/GTSOutput
hadoop jar /mnt/pooled_time_series/target/pooled-time-series-1.0-SNAPSHOT-jar-with-dependencies.jar org.pooledtimeseries.GradientTimeSeries /user/pts/output/OpticalAndGradientTimeSeriesInput /user/pts/output/GTSOutput 

Similarity Jobs

Command to run MeanChiSquareDistanceCalcs

Input - /user/pts/output/MeanChiSquareAndSimilarityInput
(The input directory contains a file, videos.txt, with one pair of video file paths on each line)
Output - /user/pts/output/MeanChiSquaredCalcOutput
(mean_dist.txt)
Command

hadoop fs -rm -r /user/pts/output/MeanChiSquaredCalcOutput
hadoop jar target/pooled-time-series-1.0-SNAPSHOT-jar-with-dependencies.jar org.pooledtimeseries.MeanChiSquareDistanceCalculation /user/pts/output/MeanChiSquareAndSimilarityInput /user/pts/output/MeanChiSquaredCalcOutput
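
The SimilarityCalculation job described next expects /user/pts/output/mean_dists.txt, the merged output of this job. A minimal sketch of one way to produce that file is shown here. It assumes the merged file should live on HDFS and uses the standard `hadoop fs -getmerge` and `hadoop fs -put -f` commands; the helper name `merge_mean_dists` and the `HADOOP` override variable are hypothetical and not part of the project.

```shell
# Sketch only. HADOOP can be overridden (for example HADOOP=echo) to preview
# the commands without touching a cluster.
HADOOP="${HADOOP:-hadoop}"

# Hypothetical helper: merge the part files of MeanChiSquaredCalcOutput into a
# single mean_dists.txt and upload it to HDFS.
merge_mean_dists() {
  tmp_dir="$(mktemp -d)" || return 1
  # getmerge concatenates every file under the HDFS directory into one local
  # file; put -f then overwrites any mean_dists.txt left from an earlier run.
  "$HADOOP" fs -getmerge /user/pts/output/MeanChiSquaredCalcOutput "$tmp_dir/mean_dists.txt" &&
    "$HADOOP" fs -put -f "$tmp_dir/mean_dists.txt" /user/pts/output/mean_dists.txt
  status=$?
  rm -rf "$tmp_dir"
  return "$status"
}

# Run after MeanChiSquareDistanceCalculation has finished:
# merge_mean_dists
```

Going through a temporary local directory keeps the merge independent of the current working directory.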

Command to run SimilarityCalculation

Input 1 - /user/pts/output/MeanChiSquareAndSimilarityInput
(Input 1 contains a file, videos.txt, with one pair of video file paths on each line)

Input 2 - /user/pts/output/mean_dists.txt
(Input 2 is the merged output of MeanChiSquaredCalcOutput; the SimilarityCalculation mapper jobs read it)
Output - /user/pts/output/SimilarityCalc/
(similarity_calc.txt)
Command

hadoop fs -rm -r /user/pts/output/SimilarityCalc
hadoop jar target/pooled-time-series-1.0-SNAPSHOT-jar-with-dependencies.jar org.pooledtimeseries.SimilarityCalculation /user/pts/output/MeanChiSquareAndSimilarityInput /user/pts/output/SimilarityCalc/ /user/pts/output/mean_dists.txt

Notes

The video list used as input above is in ./OpticalTimeSeriesInput/videos.txt and looks like this:

/Path/to/example/videos/badvideo.mp4
/Path/to/example/videos/goodvideo.mp4
/Path/to/example/videos/movie2.mp4
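
A list like the one above could be generated rather than typed by hand. The sketch below assumes the videos are .mp4 files under a single directory; the helper name `list_videos` is hypothetical, and the upload command in the comment assumes the HDFS input directory named earlier on this page.

```shell
# Hypothetical helper: print the path of every .mp4 file under a directory,
# one per line, in sorted order. Pass an absolute directory to get absolute paths.
list_videos() {
  find "$1" -type f -name '*.mp4' | sort
}

# Possible usage (assumed workflow, adjust the paths to your setup):
# list_videos /Path/to/example/videos > original_videos.txt
# hadoop fs -put -f original_videos.txt /user/pts/output/OpticalAndGradientTimeSeriesInput/
```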

The input used for the similarity job above, ./SimilarityInput, looks like the following. It should contain every pair of videos to be evaluated.

/Path/to/badvideo.mp4,/Path/to/badvideo.mp4
/Path/to/badvideo.mp4,/Path/to/goodvideo.mp4
/Path/to/goodvideo.mp4,/Path/to/goodvideo.mp4
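
The pairs file can be derived from the plain video list. The following is a minimal sketch, assuming one path per line and that each unordered pair, including a video paired with itself, should appear once, as in the example above. The helper name `make_pairs` is hypothetical.

```shell
# Hypothetical helper: read a file with one video path per line and print
# every unordered pair (self-pairs included) as "first,second".
make_pairs() {
  awk 'NF { paths[n++] = $0 }
       END {
         for (i = 0; i < n; i++)
           for (j = i; j < n; j++)
             print paths[i] "," paths[j]
       }' "$1"
}

# Possible usage (assumed file names):
# make_pairs original_videos.txt > videos.txt
```

For n videos this produces n * (n + 1) / 2 lines, so the similarity input grows quadratically with the number of videos.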

Example output from the similarity calculation looks something like this:

/Path/to/badvideo.mp4,/Path/to/badvideo.mp4     1.0
/Path/to/badvideo.mp4,/Path/to/goodvideo.mp4    0.0326700669930306
/Path/to/goodvideo.mp4,/Path/to/goodvideo.mp4   1.0
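
To read such output more easily, the pairs can be ranked by score. This sketch assumes the output has been copied to a local file, that the pair and the score are separated by whitespace, and that paths contain no spaces. The helper name `rank_pairs` is hypothetical, and `sort -g` is an extension available in GNU and BSD sort.

```shell
# Hypothetical helper: drop self-pairs (which always score 1.0 in the example)
# and print the remaining lines as "score<TAB>pair", highest score first.
rank_pairs() {
  awk '{
         split($1, p, ",")
         if (p[1] != p[2]) print $NF "\t" $1
       }' "$1" | sort -gr
}

# Possible usage (assumed local file name):
# rank_pairs similarity_calc.txt | head
```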

Easy Script

All of the above commands can be run in one go with the easy script pooled-time-series-hadoop.

cd $HOME/hadoop-pot  # project home directory
./src/main/bin/pooled-time-series-hadoop `pwd` /full/path/to/example_videos_dir  # You can also use an alias, but keep the same home directory