Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Roles for our pathogen-repo-build GitHub Actions workflow #6

Closed
wants to merge 2 commits into from
Closed
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
24 changes: 24 additions & 0 deletions env/production/.terraform.lock.hcl

Some generated files are not rendered by default. Learn more about how customized files appear on GitHub.

Original file line number Diff line number Diff line change
Expand Up @@ -5,7 +5,7 @@ import {

resource "aws_iam_role" "GitHubActionsRoleNextstrainBatchJobs" {
name = "GitHubActionsRoleNextstrainBatchJobs"
description = "Provides permissions to run jobs on AWS Batch via the Nextstrain CLI to select GitHub Actions OIDC workflows."
description = "Provides permissions to launch and monitor jobs on AWS Batch via the Nextstrain CLI to select GitHub Actions OIDC workflows."

max_session_duration = 43200 # seconds (12 hours)

Expand All @@ -21,7 +21,7 @@ resource "aws_iam_role" "GitHubActionsRoleNextstrainBatchJobs" {
"Condition": {
"StringLike": {
"token.actions.githubusercontent.com:aud": "sts.amazonaws.com",
"token.actions.githubusercontent.com:sub": "repo:nextstrain/.github:*"
"token.actions.githubusercontent.com:sub": "repo:nextstrain/*:*:job_workflow_ref:nextstrain/.github/.github/workflows/pathogen-repo-build.yaml@refs/heads/*",
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Would @refs/heads/* prevent the use of specific commit hashes, e.g. pathogen-repo-build.yaml@e48956bb5eb0b1db1ba3d614ede9d1a6275b72ec?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, it would… and that was kinda intentional. I was worried that forks could get a specific commit id accessible in workflow calls. Reason is that GitHub (roughly speaking) stores all forks in the same Git storage backend, which means you can do things like view commits in forks via the forked repo, e.g. nextstrain/test-github-actions@0d1fee0 which really comes from https://github.com/tsibley/test-github-actions.

However, GitHub's protected against such attacks:

image

(In hindsight, of course they have, but they've also had other oversights in Actions before, so I don't think I was wrong to be wary.)

}
},
}
Expand Down
100 changes: 100 additions & 0 deletions env/production/aws-iam-role-GitHubActionsRoleNextstrainBuilds.tf
Original file line number Diff line number Diff line change
@@ -0,0 +1,100 @@
resource "aws_iam_role" "GitHubActionsRoleNextstrainBuilds" {
name = "GitHubActionsRoleNextstrainBuilds"
description = "Provides permissions for a Nextstrain build (i.e. in a pathogen repo) to upload datasets, workflow files, etc. for select GitHub Actions OIDC workflows."

max_session_duration = 43200 # seconds (12 hours)

assume_role_policy = jsonencode({
"Version": "2012-10-17",
"Statement": [
{
"Effect": "Allow",
"Principal": {
"Federated": aws_iam_openid_connect_provider.github-actions.arn
},
"Action": "sts:AssumeRoleWithWebIdentity",
"Condition": {
"StringLike": {
"token.actions.githubusercontent.com:aud": "sts.amazonaws.com",
"token.actions.githubusercontent.com:sub": "repo:nextstrain/*:*:job_workflow_ref:nextstrain/.github/.github/workflows/pathogen-repo-build.yaml@refs/heads/*",
}
},
}
]
})

# This role provides a superset of the permissions expected to actually be
# required by any individual Nextstrain pathogen build. In practice, we
# further scope down permissions per-repo using an inline session policy
# declared in our centralized and trusted pathogen-repo-build workflow. The
# inline session policy is obviously less of a hard boundary, but it still
# provides guardrails against accidental operations. See also the discussion
# in <https://github.com/nextstrain/private/issues/96>.
# -trs, 15 May 2024
managed_policy_arns = [
# Builds inside the AWS Batch runtime need access to the jobs bucket.
aws_iam_policy.NextstrainJobsAccessToBucket.arn,
Comment on lines +35 to +36
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm confused by this. Why does the build require access to the jobs bucket? I thought that was handled by the Nextstrain CLI?

Edit: OH duh, it's the workdir for the Batch job so the build needs to be able to write to bucket.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yup. That permission is provided by default by our Batch job role, but that role doesn't apply if you pass in your own credentials. In my ideal world we'd be able to take the union of the default job role + any additional creds passed in, but that's non-trivial in my understanding.

]

# All builds need a subset of this access for downloading starting data and
# publishing results.
inline_policy {
name = "S3Access"
policy = jsonencode({
"Version": "2012-10-17",
"Statement": [
# Technically we don't need to include the public buckets
# nextstrain-data and nextstrain-staging in this statement since they
# already allow a superset of this with their bucket policies, but it's
# good to be explicit about what permissions we require.
# -trs, 16 Feb 2024
{
"Sid": "List",
"Effect": "Allow",
"Action": [
"s3:ListBucket",
"s3:ListBucketVersions",
"s3:GetBucketLocation",
"s3:GetBucketVersioning",
],
"Resource": [
"arn:aws:s3:::nextstrain-data",
"arn:aws:s3:::nextstrain-data-private",
"arn:aws:s3:::nextstrain-ncov-private",
"arn:aws:s3:::nextstrain-staging",
],
},
{
"Sid": "ReadWrite",
"Effect": "Allow",
"Action": [
"s3:GetObject",
"s3:GetObjectTagging",
"s3:GetObjectVersion",
"s3:GetObjectVersionTagging",
"s3:PutObject",
"s3:PutObjectTagging",
"s3:DeleteObject",
# but NOT s3:DeleteObjectVersion so objects can't be completely wiped
],
"Resource": [
"arn:aws:s3:::nextstrain-data/*.json",
"arn:aws:s3:::nextstrain-data/files/workflows/*",
"arn:aws:s3:::nextstrain-data/files/datasets/*",

"arn:aws:s3:::nextstrain-data-private/*.json",
"arn:aws:s3:::nextstrain-data-private/files/workflows/*",
"arn:aws:s3:::nextstrain-data-private/files/datasets/*",

# This bucket is akin to nextstrain-data-private/files/{workflows,datasets}/ncov/.
"arn:aws:s3:::nextstrain-ncov-private/*",

"arn:aws:s3:::nextstrain-staging/*.json",
"arn:aws:s3:::nextstrain-staging/files/workflows/*",
"arn:aws:s3:::nextstrain-staging/files/datasets/*",
],
},
]
})
}
}
19 changes: 19 additions & 0 deletions env/production/github-oidc.tf
Original file line number Diff line number Diff line change
@@ -0,0 +1,19 @@
data "github_repositories" "nextstrain" {
query = "org:nextstrain"
}

resource "github_actions_repository_oidc_subject_claim_customization_template" "nextstrain" {
for_each = toset(data.github_repositories.nextstrain.names)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Am I understanding this correctly that this is for existing Nextstrain repos? For any new repos, we'd have to re-deploy the Terraform plan?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Correct. No code changes required, but a new plan and apply cycle would be needed. (We could run that periodically on automation…)

repository = each.key

# <https://docs.github.com/en/actions/deployment/security-hardening-your-deployments/about-security-hardening-with-openid-connect>
use_default = false
include_claim_keys = [
# The GitHub default…
"repo",
"context",

# …plus the <org>/<repo>/<path>@<ref> of the workflow obtaining the token, if any.
"job_workflow_ref",
]
}
9 changes: 9 additions & 0 deletions env/production/terraform.tf
Original file line number Diff line number Diff line change
Expand Up @@ -5,6 +5,10 @@ terraform {
source = "registry.terraform.io/hashicorp/aws"
version = "~> 4.32"
}
github = {
source = "integrations/github"
version = "~> 6.0"
}
}

backend "s3" {
Expand All @@ -25,3 +29,8 @@ terraform {
provider "aws" {
region = "us-east-1"
}

provider "github" {
# Authn is via GITHUB_TOKEN
owner = "nextstrain"
}