Deployment from Scratch

How to deploy the entire governance watchdog infrastructure from scratch.

Infra Deployment via Terraform

Terraform State Management

Google Cloud Permission Requirements

Using Service Account Impersonation (recommended)

The project is preconfigured to impersonate our shared terraform service account (see ./infra/versions.tf). The only permission you will need on your own gcloud user account is roles/iam.serviceAccountTokenCreator to allow you to impersonate our shared terraform service account.
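For reference, granting and verifying that role could look roughly like the sketch below. The email addresses are placeholders (assumptions); the real service account is the one referenced in ./infra/versions.tf. The helper functions are hypothetical and only print the gcloud commands, so someone with IAM admin rights can review them before running anything.

```shell
#!/usr/bin/env bash
# Placeholders (assumptions): replace with the service account from
# ./infra/versions.tf and your own gcloud user email.
SA_EMAIL="${SA_EMAIL:-terraform@example-project.iam.gserviceaccount.com}"
USER_EMAIL="${USER_EMAIL:-you@example.com}"

# Print (not run) the command that lets a user impersonate the service account.
print_grant_cmd() {
  printf 'gcloud iam service-accounts add-iam-policy-binding %s --member=user:%s --role=roles/iam.serviceAccountTokenCreator\n' "$1" "$2"
}

# Print (not run) a command that checks impersonation by minting a token.
print_verify_cmd() {
  printf 'gcloud auth print-access-token --impersonate-service-account=%s\n' "$1"
}

print_grant_cmd "$SA_EMAIL" "$USER_EMAIL"
print_verify_cmd "$SA_EMAIL"
```

If the verify command prints a token, impersonation is working and the Terraform setup should be able to use it.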

Using Your Own Gcloud User Account (not recommended)

If for whatever reason service account impersonation doesn't work, you'll need at least the following permissions on your personal gcloud account to deploy this project with terraform:

  • roles/resourcemanager.folderViewer on the folder that you want to create the project in
  • roles/resourcemanager.organizationViewer on the organization
  • roles/resourcemanager.projectCreator on the organization
  • roles/billing.user on the organization
  • roles/storage.admin to allow creation of new storage buckets
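
To see which of the organization-level roles from the list above are still missing on your account, a small helper along these lines may be useful. It is a sketch, not part of the repo: `missing_roles` is a hypothetical function, the user email is a placeholder, and it only compares exact role names, so roles granted through a group or a broader role such as owner will not be detected.

```shell
#!/usr/bin/env bash
# Organization-level roles taken from the list above.
REQUIRED_ORG_ROLES="roles/resourcemanager.organizationViewer
roles/resourcemanager.projectCreator
roles/billing.user"

# Reads granted roles (one per line) on stdin and prints every
# required role that does not appear among them.
missing_roles() {
  granted="$(cat)"
  printf '%s\n' "$REQUIRED_ORG_ROLES" | while IFS= read -r role; do
    printf '%s\n' "$granted" | grep -qxF "$role" || printf '%s\n' "$role"
  done
}

# Example usage (requires gcloud; org ID and email are placeholders):
# gcloud organizations get-iam-policy "<our-org-id>" \
#   --flatten="bindings[].members" \
#   --filter="bindings.members:user:you@example.com" \
#   --format="value(bindings.role)" | missing_roles
```

An empty result means all three organization-level roles are bound directly to your user. The folder-level role and roles/storage.admin have to be checked separately.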

Deployment

  1. Run ./bin/set-up-terraform.sh to check required permissions and provision all required terraform providers and modules

  2. Create a ./infra/terraform.tfvars file. This is like .env for Terraform:

    touch ./infra/terraform.tfvars
    # This file is `.gitignore`d to avoid accidentally leaking sensitive data
  3. Add Google Cloud Org ID and Billing Account to your local terraform.tfvars

    # Required for creating new GCP projects
    # Get it via `gcloud organizations list`
    org_id               = "<our-org-id>"
    
    # Required for creating new GCP projects
    # Get it via `gcloud billing accounts list` (pick the GmbH account)
    billing_account      = "<our-billing-account-id>"
  4. Create a Discord Webhook URL for the channel you want to receive notifications in

  5. Add the Discord Webhook URL to your local terraform.tfvars:

    # This will be stored in Google Secret Manager upon deployment via Terraform
    echo "discord_webhook_url = \"<discord-webhook-url>\"" >> ./infra/terraform.tfvars
  6. Create a Telegram group and invite a new bot into it

    • Open a new telegram chat with @BotFather

    • Use the /newbot command to create a new bot

    • Copy the API key printed out at the end of the prompt and store it in your terraform.tfvars

      telegram_bot_token = "<bot-api-key>"
    • Get the Chat ID by inviting @MissRose_bot to the group and then using the /id command

    • Add the Chat ID to your terraform.tfvars

      telegram_chat_id = "<group-chat-id>"
    • Remove @MissRose_bot from the group once you have the Chat ID

  7. Get a QuickNode API key (or generate one if none exists yet) to enable Terraform to provision QuickAlerts

    quicknode_api_key = "<quicknode-api-key>"
  8. Get a VictorOps webhook URL by copying the Service API Endpoint URL from the VictorOps Stackdriver Integration. The routing key can be found under the Settings tab

    # Required to send on-call alerts to VictorOps
    victorops_webhook_url   = "<victorops-webhook-url>/<victorops-routing-key>"
  9. Generate an auth key to allow us to test the deployed function from our local machines

    • You can use your password manager to generate a long and secure (url-compatible) key
    • Add it to terraform.tfvars
    x_auth_token = "<x-auth-token>"
  10. Deploy the entire project via terraform apply

    • You will see an overview of all resources to be created. Review them if you like and then type "yes" to confirm (Terraform only accepts the exact lowercase word).

    • This command can take up to 10 minutes because it does a lot of work creating and configuring all defined Google Cloud Resources

    • ❌ Given the complexity of setting up an entire Google Cloud Project incl. service accounts, permissions, etc., you might run into deployment errors with some components.

      Often a simple retry of terraform apply helps. Sometimes a dependency of a resource has simply not finished creating when terraform already tried to deploy the next one, so waiting a few minutes for things to settle can help.

  11. Set your local gcloud project ID to our freshly created one and populate your local cache with frequently used project values:

    npm run cache:clear
  12. Check that everything worked as expected

    # 1. Call the deployed function via:
    npm run test:prod
    
    # 2. Monitor the configured Discord channel for a message to appear
    open https://discord.com/channels/966739027782955068/1262714272476037212
    
    # 3. Monitor the configured Telegram channel for a message to appear
    
    # 4. Check the function logs via:
    npm run logs # prints logs into your local terminal (with a few seconds of latency)
    # OR
    npm run logs:url # prints a URL to the cloud console logs in the browser
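
Before running terraform apply, it can help to confirm that steps 3 to 9 actually left a value for every variable in terraform.tfvars. The sketch below is not part of the repo: `gen_token` and `missing_tfvars` are hypothetical helpers, the key list is simply the set of variable names mentioned in the steps above, and the check is a plain text match rather than a real HCL parse. `gen_token` is an alternative to a password manager for step 9.

```shell
#!/usr/bin/env bash
# Variable names taken from steps 3 to 9 above.
REQUIRED_KEYS="org_id billing_account discord_webhook_url telegram_bot_token telegram_chat_id quicknode_api_key victorops_webhook_url x_auth_token"

# Generate a URL-safe token: 32 random bytes rendered as 64 hex characters.
gen_token() {
  head -c 32 /dev/urandom | od -An -tx1 | tr -d ' \n'
}

# Print every required key that has no `key = "value"` line in the file.
# Empty values and values still starting with "<" (unfilled placeholders)
# count as missing.
missing_tfvars() {
  file="$1"
  for key in $REQUIRED_KEYS; do
    if ! grep -Eq "^[[:space:]]*${key}[[:space:]]*=[[:space:]]*\"[^<\"][^\"]*\"" "$file"; then
      printf '%s\n' "$key"
    fi
  done
}

# Example usage from the repo root:
# echo "x_auth_token = \"$(gen_token)\"" >> ./infra/terraform.tfvars
# missing_tfvars ./infra/terraform.tfvars
```

No output from `missing_tfvars` means every listed variable has a non-placeholder value.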

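The "retry and wait a few minutes" advice from step 10 can be wrapped in a small helper. This is a generic sketch, not something the repo ships: `retry` is a hypothetical function, and the attempt count and delay in the example are arbitrary.

```shell
#!/usr/bin/env bash
# retry <attempts> <delay-seconds> <command...>
# Runs the command until it succeeds or the attempts are used up.
retry() {
  attempts="$1"; delay="$2"; shift 2
  n=1
  while ! "$@"; do
    if [ "$n" -ge "$attempts" ]; then
      echo "giving up after $n attempts: $*" >&2
      return 1
    fi
    echo "attempt $n failed; retrying in ${delay}s" >&2
    sleep "$delay"
    n=$((n + 1))
  done
}

# Example: up to 3 attempts, waiting 2 minutes between them.
# terraform still asks for confirmation on every attempt.
# retry 3 120 terraform apply
```

If the same error keeps coming back after several attempts, it is probably not a timing issue and is worth reading in detail.
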
Debugging Problems

View Logs

For most problems, you'll likely want to check the cloud function logs first.

  • npm run logs will print the latest 50 staging log entries into your local terminal for quick and easy access
  • npm run logs:url will print the URL to the staging function logs in the Google Cloud Console for full access

Teardown

  1. Run npm run destroy to delete the entire production environment from Google Cloud
    • You might run into permission issues here, especially around deleting the associated billing account resources
    • The minimum set of permissions required to delete this project has not been worked out yet. If you run into issues, the easiest fix is to have an organization owner (i.e. Bogdan) run this with full permissions