Skip to content

Filter tasks returned by introspection API by short container ID #775

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Closed
wants to merge 23 commits into from

Conversation

aaronwalker
Copy link
Contributor

@aaronwalker aaronwalker commented Apr 21, 2017

Summary

Adds the ability to lookup a task via the introspection service using the short version of the docker Id. Use case is for a container to be able to discover this own task arn and other task related metadata

addresses feature-request #770

Implementation details

A function TaskByShortID was added to TaskEngineState interface to find the first matching Tasks who's container id started with the given short id. The tasksV1RequestHandlerMaker was then updated to call either the TaskByID or TaskByShortID depending on the length of the dockerId request param

Testing

  • Builds on Linux (make release)
  • Builds on Windows (go build -out amazon-ecs-agent.exe ./agent)
  • Unit tests on Linux (make test) pass
  • Unit tests on Windows (go test -timeout=25s ./agent/...) pass
  • Integration tests on Linux (make run-integ-tests) pass
  • Integration tests on Windows (.\scripts\run-integ-tests.ps1) pass
  • Functional tests on Linux (make run-functional-tests) pass
  • Functional tests on Windows (.\scripts\run-functional-tests.ps1) pass

New tests cover the changes: yes

Description for the changelog

Feature - Filter tasks returned by introspection API by short container ID

Licensing

This contribution is under the terms of the Apache 2.0 License: yes

Copy link
Contributor

@samuelkarp samuelkarp left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@aaronwalker Thank you so much for contributing, and especially for adding a test covering your changes! I have one small change I'd like you to make in the test and a small concern about collisions.


var taskResponse TaskResponse
err := json.Unmarshal(recorder.Body.Bytes(), &taskResponse)
if err != nil {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you change this to require.NoError(t, err)? You'll need to add an import for github.com/stretchr/testify/require. We're trying to slowly change to using the assert and require packages and new tests should use those.

defer state.lock.RUnlock()

for id := range state.idToTask {
if strings.HasPrefix(id, cid) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think my only (relatively minor) concern here is that it's really easy for collisions to happen (especially with super short prefixes) and it'll be non-deterministic which task gets returned (Go map traversal is randomized). It might be worth requiring a minimum prefix length of 12 (the prefix length Docker appears to use) in order to reduce the likelihood of collisions.

@aaithal any thoughts here?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@samuelkarp I've been thinking that maybe this should be model like how the docker cli tools work allowing you to specify the the small number of characters if actually check for collisions and return an error if there is

something like this

docker logs c6
Error response from daemon: Multiple IDs found with provided prefix: c69a1491a2eab90f2153ea18b86ac43f24dbc4a3228b44363a67c21bddd3435a

WDYT?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, I think I'd prefer returning an error when collisions are found rather than randomly returning one of the matching tasks.

Copy link
Contributor Author

@aaronwalker aaronwalker Apr 28, 2017

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@samuelkarp how about I change the TaskByShortID api to return an array of matching tasks and then update the handler to return an error?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think it would be reasonable to have func (state *DockerTaskEngineState) TaskByShortID(cid string) ([]*api.Task, bool).

…ision occurs and updates the handler to return a bad request when this does occur
var tasks []*api.Task
for id := range state.idToTask {
if strings.HasPrefix(id, cid) {
task, _ := state.TaskByID(id)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@richardpen please correct me if I'm wrong here. But, IIUC, this has the potential to cause a deadlock as an intervening write lock between line 151 (state.lock.RLock()) and this line has the potential to cause a deadlock.

Please see RWMutex for more details.

I think something like this is a better alternative:

func (state *DockerTaskEngineState) GetAllContainerIDs() []string {
    state.lock.RLock()
    defer state.lock.RUnlock()

    var ids []string
    for id := range state.idToTask {
        ids = append(ids, id)
    }

    return ids
}

func (state *DockerTaskEngineState) TaskByShortID(cid string) ([]*api.Task, bool) {
    containerIDs := state.GetAllContainerIDs()
    var tasks []*api.Task
    for id := range containerIDs {
        if strings.HasPrefix(id, cid) {
            if task, ok := state.TaskByID(id); ok {
                tasks = append(tasks, task)
            }
        }
    }

    return tasks, len(tasks) > 0
}

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good eye; this is definitely a deadlock risk.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@aaithal happy to make the recommended changes. Do we want to add the GetAllContainerIDs to the API def or just make it private? @samuelkarp ?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@aaronwalker thank you! I'm in favor of adding the AllContainers() (GetAllContainerIDs() in my example code) method to the TaskEngineState interface itself (on the lines of AllTasks()) and implementing the same in DockerTaskEngineState. However, this also probably means that you'd have to update the mock for the interface (by running make gogenerate). If you think that's getting too tricky/complicated, I'd happy with a private/local scope method.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@aaronwalker I think either approach is fine.

@aaithal aaithal requested a review from richardpen May 3, 2017 18:04
aaithal and others added 20 commits May 18, 2017 12:06
…hods.

This method has now been broken into:
i. startContainerTransitions
ii. onContainersUnableToTransitionState and
iii. waitForContainerTransitions

The main aim is to make this method more readable and testable. On that note,
multiple unit tests have also been added to test each method that constitutes
progressContainers()
This change is needed to support alternative URLs for per-region
bucket support. The validation of Host and Location provide no extra
security guarantees as the update's checksum is transmitted along with
the location across a trusted communication channel and used to verify
the download.
Testify's assertions clean this up considerably. The require.* funcs
are needed to stop the test on assert failures. assert.* funcs are
used only where the test can continue regardless of the assertion,
despite the test being marked as a FAILure.
This patch improves clarity to the existing `ECS_INSTANCE_ATTRIBUTES`.
It also points to additional documentation on AWS docs.

Signed-off-by: Vinothkumar Siddharth <[email protected]>
* Refactor http request code to use public API
* Adding "codegen" build constraint
* Regenerating ECS and ECR client code
* Minor fixes
This commit only includes changes to vendored dependencies
Mockgen now prints stderr instead of swallowing it
Previously, it would remove the entire import line and depend on
goimports to correct the issue. goimports + mockgen doesn't always
play well with vendored dependencies.
New behavior only removes the vendor directory prefix, but still
runs goimport to validate / organize.
The verifyTaskStopped sometimes won't work when there there are multiple
task waiting on the same taskevents chanel, because task can drop other
task's events from taskevents channel
@aaronwalker
Copy link
Contributor Author

closing with squash changes into new PR

@aaronwalker aaronwalker deleted the task_by_short_container_id branch May 18, 2017 03:18
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

7 participants