Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Resource name sanitization #108

Merged
merged 11 commits into from
Feb 2, 2023
Merged

Conversation

DavidSeptimus-Klotho
Copy link
Contributor

@DavidSeptimus-Klotho DavidSeptimus-Klotho commented Jan 18, 2023

Resolves #49 (minus the actual namespacing piece)

This PR adds sanitization for the identifiers of AWS resources specified for creation via Klotho's IAC layer. This change should not change any happy path identifiers (resource names or Pulumi URNs)

Resource names are generated using tagged string templates as in the following examples using the sanitized() tag function:

let appName = 'my-very-long-app-name'
let resourceId = 'my-load-balancer'// ensure that the output is a valid load balancer name
// from left to right, hash appName and then resourceId if the name is too long
sanitized(loadBalancer.nameValidation())`${h(appName)}-${h(resourceId)}-lb`
// 2d958-my-load-balancer-lb// prioritize hashing resourceId before appName since its priority is 1 (default is 100)
sanitized(loadBalancer.nameValidation())`${h(appName)}-${h(resourceId, 1)}-lb`
// my-very-long-app-name-0ea7a-lb

When shortening resource names, raw text or template expressions of type string are not modified. Expressions of type Component are shortened from left to right based on priority (default priority is 100) using the Component instance's configured ShorteningStrategy. In the examples above, the h() function returns a Component with a ShorteningStrategy that converts the supplied text into a 5 character truncated SHA256 hash.

The process of generating a valid resource name is as follows:

  1. Shorten Components from left to right ranked by priority until the name's length <= maxLength.
  2. Sanitize the generated string in one or more passes (up to 5 passes by default):
    1. Truncate the string if it's longer than max length (arguably, we might want this to be an error).
    2. Throw an error if the string is shorter than the min length.
    3. Apply the supplied SanitizationRules and capture any violations.
    4. For any violations, sequentially apply the violated rules' supplied FixFunc (if present) to resolve any issues.
    5. Revalidate the string if it's been fixed
    6. Return the valid string or start another pass if the string is still invalid
  3. return the sanitized result or throw an exception if any constraint violations are present in the result.

Standard checks

  • Unit tests: Any special considerations? We don't have a pattern for testing IAC-related code at the moment, but we should really figure that out.
  • Docs: Do we need to update any docs, internal or public? No
  • Backwards compatibility: Will this break existing apps? This PR aims to avoid changing "happy path" resource names so there should be no issue there.

@DavidSeptimus-Klotho
Copy link
Contributor Author

rebased and squashed old commits

Copy link
Contributor

@ewucc ewucc left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

First pass, looks good in general. I think general comments

  • seems like for every new entity that we support we'll need to add that set of rules. Seems like the right way to do it but yeh definitely a lot of bloat to the compiled dir since i believe these will end up there.
  • Might just be me but while it does seem comprehensive, it also seems somewhat complex of a process. Specifically around the weighting (0 -> 100) of components and in the shortening func, it does a length subtraction vs. building from most important components up and hashing once it's exceeded the length.

Last point is also not really like a negative, just more of an observation so not a big deal if no one else has similar thoughts.

pkg/infra/pulumi_aws/iac/sanitization/sanitizer.ts Outdated Show resolved Hide resolved
pkg/infra/pulumi_aws/iac/sanitization/sanitizer.ts Outdated Show resolved Hide resolved
@DavidSeptimus-Klotho
Copy link
Contributor Author

First pass, looks good in general. I think general comments

  • seems like for every new entity that we support we'll need to add that set of rules. Seems like the right way to do it but yeh definitely a lot of bloat to the compiled dir since i believe these will end up there.

That's true. Ideally, this functionality will move into the compiler at some point. Another option is running it through a build tool to merge all the sanitization files into a single file that we include in the output.

@jhsinger-klotho
Copy link
Contributor

What do we do if we have like multiple clusters and install a controller on both. Is it on the plugin to label that resourceId cluster-resource?

switch (params.loadBalancerType) {
case 'application':
lb = new aws.lb.LoadBalancer(`${appName}-${resourceId}-alb`, {
name: `${appName}-${resourceId}`,
name: lbName,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

also a general thing, we likely shouldnt set actual names of resources because then they cant get replaced if they change. We should be doing the validation on the resource id for pulumi i believe so that it autogenerates the name + some uuid for uniqueness if we ever need replacements.

let me know your thought on that, i think i added the name field, but i was going to remove it in my next pr because i noticed it could cause issues. We should likely see where else we do this

@@ -46,7 +47,10 @@ export const setupElasticacheCluster = (
retentionInDays: 0,
})

const clusterName = sanitizeClusterName(appName, dbName)
// TODO: look into removing sanitizeClusterName when making other breaking changes to resource names
const clusterName = sanitized(
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

same here, need to remove clustername as the actual name for this and memdb but lets maybe chat about that tomorrow

@DavidSeptimus-Klotho DavidSeptimus-Klotho temporarily deployed to integ_test January 30, 2023 15:12 — with GitHub Actions Inactive
@DavidSeptimus-Klotho DavidSeptimus-Klotho temporarily deployed to integ_test January 30, 2023 15:12 — with GitHub Actions Inactive
@DavidSeptimus-Klotho DavidSeptimus-Klotho temporarily deployed to integ_test January 30, 2023 15:12 — with GitHub Actions Inactive
@DavidSeptimus-Klotho DavidSeptimus-Klotho temporarily deployed to integ_test January 30, 2023 15:12 — with GitHub Actions Inactive
@DavidSeptimus-Klotho DavidSeptimus-Klotho temporarily deployed to integ_test January 30, 2023 15:12 — with GitHub Actions Inactive
@DavidSeptimus-Klotho DavidSeptimus-Klotho temporarily deployed to integ_test January 30, 2023 15:12 — with GitHub Actions Inactive
@DavidSeptimus-Klotho DavidSeptimus-Klotho temporarily deployed to integ_test January 30, 2023 15:12 — with GitHub Actions Inactive
@DavidSeptimus-Klotho DavidSeptimus-Klotho temporarily deployed to integ_test January 30, 2023 15:12 — with GitHub Actions Inactive
@DavidSeptimus-Klotho DavidSeptimus-Klotho temporarily deployed to integ_test January 30, 2023 15:12 — with GitHub Actions Inactive
@DavidSeptimus-Klotho DavidSeptimus-Klotho temporarily deployed to integ_test January 30, 2023 15:12 — with GitHub Actions Inactive
@DavidSeptimus-Klotho DavidSeptimus-Klotho temporarily deployed to integ_test January 30, 2023 15:12 — with GitHub Actions Inactive
@DavidSeptimus-Klotho DavidSeptimus-Klotho temporarily deployed to integ_test January 30, 2023 15:12 — with GitHub Actions Inactive
@DavidSeptimus-Klotho DavidSeptimus-Klotho temporarily deployed to integ_test January 30, 2023 15:12 — with GitHub Actions Inactive
@DavidSeptimus-Klotho DavidSeptimus-Klotho temporarily deployed to integ_test January 30, 2023 15:12 — with GitHub Actions Inactive
@DavidSeptimus-Klotho DavidSeptimus-Klotho temporarily deployed to integ_test January 30, 2023 15:12 — with GitHub Actions Inactive
@DavidSeptimus-Klotho DavidSeptimus-Klotho temporarily deployed to integ_test January 30, 2023 15:12 — with GitHub Actions Inactive
@DavidSeptimus-Klotho DavidSeptimus-Klotho temporarily deployed to integ_test January 30, 2023 15:12 — with GitHub Actions Inactive
@DavidSeptimus-Klotho DavidSeptimus-Klotho temporarily deployed to integ_test January 31, 2023 14:57 — with GitHub Actions Inactive
@DavidSeptimus-Klotho DavidSeptimus-Klotho temporarily deployed to integ_test January 31, 2023 14:57 — with GitHub Actions Inactive
@DavidSeptimus-Klotho DavidSeptimus-Klotho temporarily deployed to integ_test January 31, 2023 14:57 — with GitHub Actions Inactive
@DavidSeptimus-Klotho DavidSeptimus-Klotho temporarily deployed to integ_test January 31, 2023 14:57 — with GitHub Actions Inactive
@DavidSeptimus-Klotho DavidSeptimus-Klotho temporarily deployed to integ_test January 31, 2023 14:57 — with GitHub Actions Inactive
@DavidSeptimus-Klotho DavidSeptimus-Klotho temporarily deployed to integ_test January 31, 2023 14:57 — with GitHub Actions Inactive
@DavidSeptimus-Klotho DavidSeptimus-Klotho temporarily deployed to integ_test January 31, 2023 14:57 — with GitHub Actions Inactive
@DavidSeptimus-Klotho DavidSeptimus-Klotho temporarily deployed to integ_test January 31, 2023 14:57 — with GitHub Actions Inactive
Copy link

@ghost ghost left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

lgtm, other than the merge conflicts. I had a suggestion for a simplification, but it's not blocking. May want to get @jhsinger-klotho's quick glance too, since he took a more in-depth look a couple weeks back.

pkg/infra/pulumi_aws/iac/sanitization/aws/common.ts Outdated Show resolved Hide resolved
export function regexpMatch(
description: string,
pattern: RegExp,
fix: FixFunc | undefined = undefined
Copy link

@ghost ghost Jan 31, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

small suggestion: I think all of the calls to this are basically:

regexpMatch(desc, /^P+$/, (n) => n.replace(/P/g, R))

... where P is some pattern and R is a replacement string. For example:

regexpMatch('', /^[\w-]+$/, (n) => n.replace(/[^\w-]/g, '-'))

If that's true, I think you could simplify this to just take the pattern and a replace char, and generate the validate and FixFunc from them:

regexpMatch('', /[^\w-]/, '-'))

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That's true for a lot of them, but definitely not all.

@jhsinger-klotho
Copy link
Contributor

yeah, LGTM and if the test are passing definitely a good starting point. The name issue looks to be resolved but we can continue to remove those if we find more

if entry.IsDir() {
dirSuffix = "/"
}
path := strings.TrimSuffix(parentDir, "/") + "/" + entry.Name() + dirSuffix
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: use path.Join instead of + "/" + to be more platform agnostic

Copy link
Contributor Author

@DavidSeptimus-Klotho DavidSeptimus-Klotho Feb 1, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The separator is always a forward slash when reading from embed.FS.

From the docs for go:embed:

The //go:embed directive accepts multiple space-separated patterns for brevity, but it can also be repeated, to avoid very long lines when there are many patterns. The patterns are interpreted relative to the package directory containing the source file. The path separator is a forward slash, even on Windows systems. Patterns may not contain ‘.’ or ‘..’ or empty path elements, nor may they begin or end with a slash. To match everything in the current directory, use ‘*’ instead of ‘.’. To allow for naming files with spaces in their names, patterns can be written as Go double-quoted or back-quoted string literals.
https://pkg.go.dev/embed#:~:text=The%20//go%3Aembed%20directive%20accepts,back%2Dquoted%20string%20literals.

pkg/infra/pulumi_aws/plugin_iac.go Show resolved Hide resolved
Comment on lines 182 to 184
if err != nil {
return
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This won't propagate the error like addFile since err is defined within the scope of this function (at the files.ReadDir line).

}
}

addDir("iac/sanitization", "**/*.{test,spec}.{ts,js}")
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice, I like this

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It would be even better if those spec files existed. 😅

@@ -152,6 +154,50 @@ func (p Plugin) Transform(result *core.CompilationResult, deps *core.Dependencie
addFile("iac/k8s/add_ons/external_dns/index.ts")
addFile("iac/k8s/add_ons/index.ts")

addDir := func(dir string, exclusions ...string) {
var unreadEntries []string
dirContents, err := files.ReadDir(dir)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should this do something with err if it's not nil?

unreadEntries = unreadEntries[1:]
if strings.HasSuffix(entry, "/") {
var childEntries []os.DirEntry
childEntries, err = files.ReadDir(strings.TrimSuffix(entry, "/"))
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LIikewise, handling this err

@github-actions
Copy link

github-actions bot commented Feb 2, 2023

Package Line Rate Health
github.com/klothoplatform/klotho/pkg/analytics 2%
github.com/klothoplatform/klotho/pkg/annotation 23%
github.com/klothoplatform/klotho/pkg/cli 4%
github.com/klothoplatform/klotho/pkg/core 21%
github.com/klothoplatform/klotho/pkg/env_var 82%
github.com/klothoplatform/klotho/pkg/exec_unit 53%
github.com/klothoplatform/klotho/pkg/infra/kubernetes 59%
github.com/klothoplatform/klotho/pkg/infra/kubernetes/helm 39%
github.com/klothoplatform/klotho/pkg/input 63%
github.com/klothoplatform/klotho/pkg/lang 38%
github.com/klothoplatform/klotho/pkg/lang/dockerfile 0%
github.com/klothoplatform/klotho/pkg/lang/golang 19%
github.com/klothoplatform/klotho/pkg/lang/javascript 47%
github.com/klothoplatform/klotho/pkg/lang/python 60%
github.com/klothoplatform/klotho/pkg/lang/yaml 0%
github.com/klothoplatform/klotho/pkg/logging 7%
github.com/klothoplatform/klotho/pkg/multierr 95%
github.com/klothoplatform/klotho/pkg/provider/aws 59%
github.com/klothoplatform/klotho/pkg/runtime 75%
github.com/klothoplatform/klotho/pkg/static_unit 33%
github.com/klothoplatform/klotho/pkg/validation 73%
github.com/klothoplatform/klotho/pkg/yaml_util 79%
Summary 40% (3793 / 9454)

@DavidSeptimus-Klotho DavidSeptimus-Klotho merged commit 35df9fd into main Feb 2, 2023
@DavidSeptimus-Klotho DavidSeptimus-Klotho deleted the resource-name-validation branch February 2, 2023 14:44
jhsinger-klotho pushed a commit that referenced this pull request Feb 6, 2023
Sanitizes AWS resource names in deploylib and related IAC files
jhsinger-klotho pushed a commit that referenced this pull request Feb 6, 2023
Sanitizes AWS resource names in deploylib and related IAC files
atorres-klo added a commit that referenced this pull request Aug 14, 2024
updating tests after merge to main
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Use appname namespace for all resources
4 participants