Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

add envfile as task resource and retrieve from s3 #2383

Merged
merged 1 commit into from
Mar 9, 2020

Conversation

jy19
Copy link
Contributor

@jy19 jy19 commented Mar 4, 2020

and write s3 contents to disk

Summary

Getting this pull request out to get it looked at. Using envfiles as a task resource isn't fully piped through yet.

Implementation details

envfile task resource -> retrieve each specified file from s3 and write contents to disk.

Testing

make test

New tests cover the changes: yes (unit tests)

Description for the changelog

Add envfile as task resource

Licensing

By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.

@jy19 jy19 requested review from yhlee-aws and fenxiong March 4, 2020 00:59
taskARN string
createdAt time.Time
region string
resourceDir string // path to store env var files
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We are not capturing the "type" value (currently only s3) in this object, and implementation including error message assumes s3. We don't have to over engineer ant this point, but it'd be a good data to capture and at least make it available for future iteration.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

good point, will add

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

on second thought, "type" wasn't included because we have a list of the environment files - each object in the list could be of a different type (if we ever support anything besides s3).
concerns/thoughts?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

can't we store the type per file? we are accepting a list of env files, are you modeling the entire list as 1 task resource?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

type IS stored per file - each EnvironmentFile object has a Value and Type. Unless you mean each file should be a separate resource? So each object in the list is its own resource? I've currently modeled the entire list as 1 task resource because that's how it's defined from the payload.

type EnvironmentFileResource struct {
cluster string
taskARN string
createdAt time.Time
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

why this while there's a createdAtUnsafe below?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

removed this and renamed places where it was accessed to createdAtUnsafe since it required a lock.

lock sync.RWMutex
}

func NewEnvironmentFileResource(cluster, taskARN, region, dataDirOnHost string, credentialsManager credentials.Manager,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

i think where the code needs to write to is under ECS_DATADIR (by default /data), rather than under ECS_HOST_DATA_DIR (by default /var/lib/ecs),so dataDirOnHost should probably name as dataDir instead, and you don't really need to know the host dir here

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

/var/lib/ecs is what was decided in the design doc. lmk concerns around this

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

agent container has a bind mount /var/lib/ecs/data:/data. if you want to write to /var/lib/ecs/data on the host, you need to write to /data in the code.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

sure i'll change the name here and make sure to pass in ECS_DATADIR when this method is called

return nil
}

func (envfile *EnvironmentFileResource) writeEnvFile(writeFunc func(file oswrapper.File) error, fullPathName string) error {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

are we supporting windows? if so, writing file in windows is different. refer to agent/statemanager/state_manager_windows.go

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nvm i see that credentialspec is using the same code, so it's probably fine

@jy19 jy19 force-pushed the env_files branch 2 times, most recently from c255ace to d48fc2c Compare March 6, 2020 22:41
@jy19 jy19 force-pushed the env_files branch 2 times, most recently from 5f7a0f9 to c95cc31 Compare March 6, 2020 23:22

taskARNFields := strings.Split(taskARN, "/")
taskID := taskARNFields[len(taskARNFields)-1]
// we save envfiles to path: /var/lib/ecs/data/envfiles/cluster_name/task_id/${s3filename.env}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

what does this s3filename.env mean?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

the name of the s3 file but it is required to have extension .env

// so this should be ok to be called by multiple go routines
tmpFile, err := envfile.ioutil.TempFile(envfile.resourceDir, envTempFilePrefix)
if err != nil {
seelog.Infof("something went wrong trying to create a temp file with prefix %s", envTempFilePrefix)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The log message should start with capital letter. Also, I think this should be Errorf, since the file downloading will fail here.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

changed the logs in err != nil statements to be Errorf and capitalized letters

}

// UnmarshalJSON deserializes the raw JSON to an EnvironmentFileResource struct
func (envfile *EnvironmentFileResource) UnmarshalJSON(b []byte) error {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think you miss the assignment of executionCredentialsID here.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

oops you're right, also task arn. thanks for catching

Copy link
Contributor

@fenxiong fenxiong left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

couple small comments, mostly around lock..


// SetKnownStatus safely sets the currently known status of the resource
func (envfile *EnvironmentFileResource) SetKnownStatus(status resourcestatus.ResourceStatus) {
envfile.lock.RLock()
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

should be Lock/Unlock

Copy link
Contributor Author

@jy19 jy19 Mar 7, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

i didn't look at my RLock/RUnlock(s) vs Lock/Unlocks carefully based on the method, my bad. should all be fixed now.

return
}

envfile.lock.RLock()
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

should be Lock/Unlock


// GetName returns the name fo the resource
func (envfile *EnvironmentFileResource) GetName() string {
envfile.lock.RLock()
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: this doesn't need a lock

// SetAppliedStatus sets the applied status of the resource and returns whether
// the resource is already in a transition
func (envfile *EnvironmentFileResource) SetAppliedStatus(status resourcestatus.ResourceStatus) bool {
envfile.lock.RLock()
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

should be Lock/Unlock

resourceDir string // path to store env var files

// env file related attributes
environmentFilesSourceUnsafe []apicontainer.EnvironmentFile // list of env file objects
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

does this field ever change after initializing from NewEnvironmentFileResource? seems like it doesn't change so don't need lock or be marked as unsafe

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

right it doesn't change, was thinking of something else. unmarked as unsafe.

return envfile.environmentFilesSourceUnsafe
}

func (envfile *EnvironmentFileResource) getExecutionCredentialsID() string {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: same as environmentFileSource, i don't quite see why you need a lock for executionCredentialsID, since no one writes to it. read can be concurrent without lock

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

oh i thought the credentials get rotated so it might change. i must've remembered wrong, credentials rotating during task resource creation seems unlikely

CreatedAt *time.Time `json:"createdAt,omitempty"`
DesiredStatus *EnvironmentFileStatus `json:"desiredStatus"`
KnownStatus *EnvironmentFileStatus `json:"knownStatus"`
EnvironmentFilesSource []apicontainer.EnvironmentFile `json:"envfileResources"`
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: maybe envFileSources? should match the field name

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

made sure matches

return nil
}

type EnvironmentFileResourceJSON struct {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

i don't think this need to be exported?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

un-exported

envfile.credentialsManager = resourceFields.CredentialsManager
envfile.s3ClientCreator = factory.NewS3ClientCreator()
envfile.os = oswrapper.NewOS()
envfile.ioutil = ioutilwrapper.NewIOUtil()
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There is one thing here you may want to think about is whether we want to redownload the S3 file when agent crash and restart during resource CREATED but task hasn't go to RUNNING. if we want to redownload, we need to reset the task resource status back to none here.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

i think you're right. i guess if agent crashes we'd jsut re-download everything again, even if files were previously partially downloaded.

@jy19 jy19 force-pushed the env_files branch 2 times, most recently from 1e4f610 to 91c14af Compare March 7, 2020 03:33
@yhlee-aws
Copy link
Contributor

Travis check is failing for all 3 platforms with "The command "make static-check" exited with 2"

@jy19
Copy link
Contributor Author

jy19 commented Mar 9, 2020

go vet is failing on something related to the locks, am fixing.

@@ -0,0 +1,260 @@
// Copyright Amazon.com Inc. or its affiliates. All Rights Reserved.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this needs a:

// +build unit

at the top

@jy19 jy19 merged commit 40ddd41 into aws:env_files Mar 9, 2020
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

6 participants