Skip to content

We perform spatial operations on the dataset distributed over HDFS and Spark cluster consisting of a master and two worker nodes. We use New York City taxi dataset to obtain spatial statistics and use that to identify the statistically significant clusters (Hot Spot Analysis)

Notifications You must be signed in to change notification settings

narendrakumar92/Distributed-Geospatial-computation

Folders and files

NameName
Last commit message
Last commit date

Latest commit

 

History

14 Commits
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 

Repository files navigation

Distributed-Geospatial-computation

We perform spatial operations on the dataset distributed over HDFS and Spark cluster consisting of a master and two worker nodes. We use New York City taxi dataset to obtain spatial statistics and use that to identify the statistically significant cell.

For Phase 1 description, please check CSE512-Phase1.pdf

Youtube Demo Phase 1: https://www.youtube.com/watch?v=oe5NLMuJWAI

For Phase 2 description, please check Phase2-requirement.pdf

A new function is created in GeoSpark Library to perform Naive Cartesian Product and the performance is compared with other optimized API calls. Full detailed analysis is reported (Phase2.pdf)

For Phase 3 description, please check Phase3-requirement.pdf.

We use New York City taxi dataset to find the important points in the envelope and create neighborhood of each cell in the space-time cube with latitude, longitude, and date (each cell has 26 neighbors), calculate significant cells using Getis-ord statistic (Hot Spot Analysis)

Full report: Final Report.pdf

About

We perform spatial operations on the dataset distributed over HDFS and Spark cluster consisting of a master and two worker nodes. We use New York City taxi dataset to obtain spatial statistics and use that to identify the statistically significant clusters (Hot Spot Analysis)

Resources

Stars

Watchers

Forks

Releases

No releases published

Packages

No packages published