-
Notifications
You must be signed in to change notification settings - Fork 6
DATA-5849: File download support #51
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Conversation
Wiz Scan Summary
To detect these findings earlier in the dev lifecycle, try using Wiz Code VS Code Extension. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Pull Request Overview
This PR adds file download support to the ECS command execution system. It introduces functionality to download files from S3 (and potentially other sources) before executing the main container command.
- Added startup script template and generation logic for file downloads
- Extended ECS command context with file download configuration
- Implemented container modification pattern to inject startup scripts
Reviewed Changes
Copilot reviewed 2 out of 2 changed files in this pull request and generated 3 comments.
File | Description |
---|---|
startup_script_template.sh | New shell script template for downloading files before container execution |
ecs.go | Extended ECS command functionality with file download support and container modification options |
Tip: Customize your code reviews with copilot-instructions.md. Create the file or learn how to get started.
) | ||
|
||
var ( | ||
templatePath = "internal/pkg/object/command/ecs/startup_script_template.sh" |
Copilot
AI
Sep 25, 2025
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The hardcoded template path will fail when deployed as the relative path won't exist. Consider embedding the template file using embed.FS
or making the path configurable.
Copilot uses AI. Check for mistakes.
Co pilot review suggestions Co-authored-by: Copilot <[email protected]>
type ScriptDownload struct { | ||
Source string | ||
Destination string | ||
IsS3 bool |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
let's add enum which will show location for download. If we add gsc,azure tomorrow we'll not need to add more and more variables
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Please add test information
echo 'Starting file downloads to {{.DownloadDir}}...' | ||
{{if .CreateDirs}}mkdir -p {{.DownloadDir}}{{end}} | ||
|
||
if ! command -v aws &> /dev/null; then |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this check should be run if file is S3
mkdir -p $(dirname {{.Destination}}) | ||
{{if .IsS3}} | ||
echo "Downloading from S3: {{.Source}}" | ||
if aws s3 cp '{{.Source}}' '{{.Destination}}' --cli-read-timeout {{$.Timeout}} --cli-connect-timeout {{$.Timeout}}; then |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
let's move downloading from s3 in separate function
} | ||
|
||
// WithStartupScriptWrapper creates a container option that injects startup script for file downloads | ||
func (execCtx *executionContext) WithStartupScriptWrapper() ContainerOption { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: don't mix object definitions and functions.
Structure file in the way that on top we have object definitions and functions lower
containerDefinitions := execCtx.TaskDefinitionWrapper.TaskDefinition.ContainerDefinitions | ||
|
||
// Apply container options using the options pattern | ||
containerOptions := execCtx.getDefaultContainerOptions() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
you have containerOptions inside execCtx and you call applyContainerOptions on execCtx don't pass containerOptions to the method call.
// applyContainerOptions applies container options to container definitions | ||
func (execCtx *executionContext) applyContainerOptions(containerDefinitions []types.ContainerDefinition, options []ContainerOption) error { | ||
for _, option := range options { | ||
for i := range containerDefinitions { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: for _, cd := range containerDefinitions { here and after that option.ModifyContainer(cd)
} | ||
|
||
// ContainerModificationOption represents a generic option for modifying container definitions | ||
type ContainerModificationOption func(*types.ContainerDefinition) error |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
let's move options to separate file, it makes code more cleaner and easier for read.
WDYT?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It should be private object
} | ||
|
||
// ScriptTemplateData represents data for populating the startup script template | ||
type ScriptTemplateData struct { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
try to make types private if they are not used outside of the package.
@@ -1,11 +1,13 @@ | |||
package ecs | |||
|
|||
import ( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
the idea is really nice, and I like how you did that
} | ||
|
||
// getDefaultContainerOptions returns the default container options | ||
func (execCtx *executionContext) getDefaultContainerOptions() []ContainerOption { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
it's not default container options. It's just container options
} | ||
|
||
// getDefaultContainerOptions returns the default container options | ||
func (execCtx *executionContext) getDefaultContainerOptions() []ContainerOption { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit; let's consider to move this function to the buildExecutionContext() and after that call getOptions. don't create options during registerTaskDefinition
The major motivation for that from my point of view is that if we failed to build any option in the future we receive errors on the buildExecutionContext step, not on registrating them
|
||
// generateStartupScript creates a startup script for downloading files using a template | ||
func generateStartupScript(fileDownloads []FileDownload, config *StartupScriptConfig) (string, error) { | ||
if len(fileDownloads) == 0 { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
you don't need that check because you've already checked it on line 79
// Parse template | ||
tmpl, err := template.New("startup_script").Parse(string(templateContent)) | ||
if err != nil { | ||
return "", fmt.Errorf("failed to parse template: %w", err) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
consider to start emit this metrics like in this PR
On low leve functions we only emmit metrics on top level we add logs as well
echo "Executing: $ORIGINAL_COMMAND" | ||
exec "$ORIGINAL_COMMAND" | ||
else | ||
echo "No original command found, executing: $@" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
maybe we should add error here
@@ -0,0 +1,47 @@ | |||
#!/bin/bash |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
do we want to hardcode 1 script for every user of heimdall or we can add more flexibility provide predefined template but allows task, to override path to the script.
@babourine WDYT?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Add testing section in your PR. Please describe how the code was tested
task_count: 2 | ||
|
||
file_uploads: | ||
- data: "Processing completed at 2025-10-13" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Please add to the comment that this field can be overridden in the execution
} | ||
|
||
// WithFileUploadScript wraps container commands with a script that downloads files from S3 to the container | ||
func WithFileUploadScript(fileUploads []FileUpload, localDir string) ContainerOption { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
with file download script? Name doesn't match description
} | ||
} | ||
|
||
const downloadScriptTemplate = ` |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
the general rule is to put constants on the top of the file. You can also consider the option to move it to the separate file
return script | ||
} | ||
|
||
// Future container options can be added here as needed |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You can add it as a comment in PR, but when time comes people will figure out how to do that
"cluster_criteria": ["type:fargate"], | ||
"context": { | ||
"task_count": 1, | ||
"file_uploads": [ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
these examples can miss lead users. They show task results, but basically the goal is to upload some files on ecs task machine
MaxFailCount int `yaml:"max_fail_count,omitempty" json:"max_fail_count,omitempty"` // max failures before giving up | ||
|
||
// File upload configuration | ||
FileUploads []FileUpload `yaml:"file_uploads,omitempty" json:"file_uploads,omitempty"` |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
make it a list of pointers
|
||
// Upload files to S3 if configured | ||
if err := execCtx.uploadFilesToS3(); err != nil { | ||
return fmt.Errorf("failed to upload files to S3: %w", err) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
metrics were added to the heimdall please start to use them
} | ||
|
||
// parseS3URI parses an S3 URI into bucket and key components | ||
func parseS3URI(s3URI string) (bucket, key string, err error) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
can you move this function to another package because this function can be reused in different places
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Please add steps for testing and fix review comments
What : Support for donwloading files from S3 on remote ECS fargate containers
Why: For certain use cases wherein my config files are large in size even after base64 encoding, I need a user friendly way to pass them on remote ECS container, instead of passing as part of API payload.