This Quick Start helps you build a multi-node Cloudera Enterprise Data Hub (EDH) cluster on the AWS Cloud by integrating Cloudera Director with AWS services such as Amazon EC2 and Amazon VPC. EDH lets you store your data with the flexibility to run a variety of enterprise workloads (including batch processing, interactive SQL, enterprise search, and advanced analytics) while using robust security, governance, data protection, and management features. You can deploy Cloudera EDH into a new VPC or into your existing VPC. The Quick Start includes AWS CloudFormation templates that automate each option.
This reference architecture supports two options for deploying Cloudera's Enterprise Data Hub within an Amazon VPC. The first option launches all the nodes within a public subnet, which gives them direct internet access. The second option deploys all the nodes within a private subnet. The reference deployment builds both a public and a private subnet, and you select which subnet the cluster is deployed into through the configuration file.
Deployment steps:
- Sign up for an AWS account at http://aws.amazon.com, select a region, and create a key pair.
- In the AWS CloudFormation console, launch one of the following templates to build a new stack:
  - /templates/cloudera-master.template (to deploy Cloudera EDH into a new VPC)
  - /templates/cloudera.template (to deploy Cloudera EDH into your existing VPC)
The Quick Start provides parameters that you can set to customize your deployment. For architectural details, best practices, step-by-step instructions, and customization options, see the deployment guide.
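The launch step above can also be scripted instead of using the console. The following is a minimal Python sketch, not part of the Quick Start itself: it picks one of the two template paths listed above and builds a request in the shape the CloudFormation `CreateStack` API expects. The helper names, the `template_base_url` argument (wherever you host the templates, for example an S3 bucket), and the `KeyName` parameter used in the usage note are assumptions for illustration; check the deployment guide for the actual parameter names. The optional `launch` function assumes the third-party `boto3` package is installed and AWS credentials are configured.

```python
# Template paths as listed in the deployment steps.
TEMPLATES = {
    "new_vpc": "/templates/cloudera-master.template",
    "existing_vpc": "/templates/cloudera.template",
}


def choose_template(use_existing_vpc):
    """Return the template path for the chosen deployment option."""
    return TEMPLATES["existing_vpc" if use_existing_vpc else "new_vpc"]


def to_cfn_parameters(values):
    """Convert a plain dict into the ParameterKey/ParameterValue list
    used by the CloudFormation CreateStack API."""
    return [
        {"ParameterKey": key, "ParameterValue": str(value)}
        for key, value in sorted(values.items())
    ]


def build_request(stack_name, template_base_url, use_existing_vpc, values):
    """Assemble keyword arguments for a create_stack call.

    template_base_url is a hypothetical location where the templates
    are hosted; it is not defined by the Quick Start text above.
    """
    template_url = template_base_url.rstrip("/") + choose_template(use_existing_vpc)
    return {
        "StackName": stack_name,
        "TemplateURL": template_url,
        "Parameters": to_cfn_parameters(values),
    }


def launch(request, region):
    """Create the stack and return its ID (requires boto3 and AWS credentials)."""
    import boto3  # imported lazily so the helpers above run without it

    client = boto3.client("cloudformation", region_name=region)
    return client.create_stack(**request)["StackId"]
```

For example, `build_request("cloudera-edh", "https://example-bucket.s3.amazonaws.com/quickstart", False, {"KeyName": "my-key-pair"})` produces a request for the new-VPC template; the bucket URL and key pair name here are placeholders. Keeping request construction separate from the `launch` call lets you inspect or log the request before any AWS resources are created.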