GIS Tools for Hadoop for Beginners

Sarah Ambrose edited this page Jun 15, 2015 · 15 revisions

#Try out GIS Tools for Hadoop with a Virtual Machine

This tutorial walks you through setting up a virtual machine (VM) and running GIS Tools for Hadoop without a cluster. The example is oriented towards Windows users.

##Requirements

  • A virtual machine with a Hadoop environment. We used the Hortonworks Sandbox with VirtualBox.

    At the time of writing, installing HDP v2.2 produced errors, while HDP v2.1 worked well.

Once you have downloaded a VM and Hadoop Environment, complete the set-up instructions by following the steps in the installation guide.

###Optional

If you plan to use Cygwin, the openssh package is required. It is not selected by default in the standard install.
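A quick way to confirm that an ssh client is available is to ask the shell whether it can find the command. The sketch below is a minimal check, assuming it is run from a Cygwin (or any POSIX-style) shell; the helper name `have_cmd` is hypothetical and not part of Cygwin or GIS Tools for Hadoop.

```shell
# Hypothetical helper: succeeds if the named command is on the PATH.
have_cmd() {
    command -v "$1" >/dev/null 2>&1
}

if have_cmd ssh; then
    echo "ssh client found"
else
    echo "ssh client not found: rerun the Cygwin installer and select the openssh package"
fi
```

If the second message appears, the openssh package was most likely left unselected during installation.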

##Instructions

  1. Open the VM and click the green Show arrow in VirtualBox.

If you do not have the Hortonworks Sandbox listed on the left, you will need to add it - follow the installation guide instructions.

  2. Make a note of the IP address. This will allow you to access the Sandbox later using ssh:

    ssh root@<IP address> -p 2222

  3. Press Alt + F5 and log in with the username root and the password hadoop.

  4. Type ls and press Enter. You will see the files listed in the root folder.

  5. Make a folder named esri-git to hold the GitHub project: type mkdir esri-git and press Enter.

  6. Type ls and press Enter again, and you will see the newly created folder.

  7. Type cd esri-git to enter the newly created directory.

  8. You are now going to clone the GitHub repository. Since VirtualBox does not recognise the web address "www.github.com", you will need to find the IP address of github.com. In either Windows Command Prompt or Cygwin, type ping github.com. Once a ping has been returned, press Ctrl + C to stop the responses. Make a note of the IP address.

Here it would be 190.30.252.131

  9. In VirtualBox, type:

    git clone git@xxx.xx.xxx.xxx:Esri/gis-tools-for-hadoop.git

    where xxx.xx.xxx.xxx is the IP address from Step 8.

  10. You have now cloned the GIS Tools for Hadoop toolkit. If you would like to work in Cygwin (which is much easier to use), continue on; if not, skip to Step 14.

  11. In Cygwin, type ssh root@<IP address> -p 2222 (from Step 2).

  12. Enter the password: hadoop

  13. Change directories if needed (cd esri-git).

  14. You are now able to complete the sample. Completing the sample successfully means that everything is installed correctly, and it will give you a good introduction to using the framework.
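The IP lookup and clone steps above can be sketched as two small shell helpers. This is only an illustration under stated assumptions: the function names `first_ipv4` and `clone_url` are hypothetical, the `git@<address>:` form of the clone address is assumed from GitHub's usual SSH syntax, and `grep -oE` is assumed to be available (it is in GNU and BSD grep).

```shell
# Hypothetical helper: print the first IPv4 address found in text on stdin,
# for example the output of "ping github.com" copied from Step 8.
first_ipv4() {
    grep -oE '[0-9]{1,3}(\.[0-9]{1,3}){3}' | head -n 1
}

# Hypothetical helper: build the clone address for Step 9 from that IP address.
# The "git@" prefix is an assumption based on GitHub's usual SSH clone syntax.
clone_url() {
    printf 'git@%s:Esri/gis-tools-for-hadoop.git\n' "$1"
}

# Example, run inside the Sandbox with the address noted in Step 8:
#   mkdir -p esri-git && cd esri-git
#   git clone "$(clone_url xxx.xx.xxx.xxx)"
```

Nothing here contacts the network until the commented `git clone` line is run by hand, so the helpers can be tried safely before cloning.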