Create a new nox task to clean up images on dockerhub #292

@tomuben

Background

We need to clean up the cache images on Docker Hub.
It would be useful to have a command in exaslct that removes all tags of a given image that are older than a given number of days.

Acceptance Criteria

Create command cleanup-tags-dockerhub with parameters --image-name and --older-than-days.
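
A rough sketch of what the command-line surface could look like, assuming a plain argparse entry point. Only the command name and the two options come from this issue; `build_parser`, `split_image_name`, and the `myorg/myimage` value are hypothetical, and the real exaslct or nox integration may wire the options differently.

```python
import argparse
from typing import Tuple


def build_parser() -> argparse.ArgumentParser:
    # Hypothetical parser: only the command name and option names are from the issue.
    parser = argparse.ArgumentParser(prog="cleanup-tags-dockerhub")
    parser.add_argument("--image-name", required=True,
                        help="Docker Hub image as <namespace>/<repo>")
    parser.add_argument("--older-than-days", required=True, type=int,
                        help="remove tags last updated more than this many days ago")
    return parser


def split_image_name(image_name: str) -> Tuple[str, str]:
    # Assumption: a name without a namespace refers to an official image under "library".
    namespace, _, repo = image_name.rpartition("/")
    return (namespace or "library", repo)
```

For example, `build_parser().parse_args(["--image-name", "myorg/myimage", "--older-than-days", "30"])` yields `image_name` and `older_than_days` attributes that can be passed on to the tag listing and filtering steps.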

Possible solution

Here’s a simple Python example that uses the Docker Hub v2 API to list all tags of a public repository along with their last_updated timestamps (the time of the last push, which is not necessarily the creation time). It follows the pagination links until every page has been fetched:

import requests

def get_dockerhub_tags(namespace: str, repo: str, page_size: int = 100, headers: dict = None):
    """
    Fetch all tags for a given Docker Hub repository and return name + last_updated.

    :param namespace: e.g. "library" for official images, or your Docker ID/org
    :param repo: repository name, e.g. "nginx"
    :param page_size: how many tags to fetch per API call (max 100)
    :param headers: optional request headers, e.g. with an Authorization entry for private repos
    :return: list of dicts: [{'name': '1.19', 'last_updated': '2023-05-02T12:34:56.789Z'}, …]
    """
    url = f"https://hub.docker.com/v2/repositories/{namespace}/{repo}/tags?page_size={page_size}"
    if headers is None:
        headers = {"Accept": "application/json"}

    tags = []
    while url:
        # A timeout keeps the loop from hanging on a stalled connection
        resp = requests.get(url, headers=headers, timeout=30)
        resp.raise_for_status()
        data = resp.json()
        
        for item in data.get("results", []):
            tags.append({
                "name": item["name"],
                "last_updated": item["last_updated"]
            })
        
        # Docker Hub API provides a `next` URL for pagination
        url = data.get("next")

    return tags

if __name__ == "__main__":
    namespace = "library"   # official images live under "library"
    repo       = "nginx"    # replace with your repo name
    all_tags = get_dockerhub_tags(namespace, repo)
    
    for tag in all_tags:
        print(f"{tag['name']:20s} {tag['last_updated']}")

How it works

  1. Endpoint
    GET
    https://hub.docker.com/v2/repositories/{namespace}/{repo}/tags
  2. Pagination
    The response JSON contains a next field. Loop until next is null.
  3. Timestamp
    Each tag object has a last_updated field (ISO 8601 string).
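
To get from the tag list to the tags that qualify for cleanup, the last_updated strings have to be parsed and compared against a cutoff. A minimal sketch, assuming the timestamps are UTC with a trailing "Z" as in the docstring example above; `parse_last_updated` and `tags_older_than` are hypothetical helper names, and tags without a timestamp are skipped rather than deleted:

```python
from datetime import datetime, timedelta, timezone


def parse_last_updated(value: str) -> datetime:
    # Drop the trailing "Z" and any fractional seconds, then treat the result as UTC.
    base = value.rstrip("Z").split(".")[0]
    return datetime.strptime(base, "%Y-%m-%dT%H:%M:%S").replace(tzinfo=timezone.utc)


def tags_older_than(tags, days, now=None):
    """Return the names of tags whose last_updated is more than `days` days before `now`."""
    now = now or datetime.now(timezone.utc)
    cutoff = now - timedelta(days=days)
    return [
        tag["name"]
        for tag in tags
        # Skip tags with a missing timestamp instead of guessing their age.
        if tag.get("last_updated") and parse_last_updated(tag["last_updated"]) < cutoff
    ]
```

The optional `now` argument exists only to make the cutoff reproducible in tests; in normal use it defaults to the current UTC time.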

Authentication (for private repos)
If you need to fetch tags from a private repo, first obtain a token:

login_url = "https://hub.docker.com/v2/users/login"
login_data = {"username": "YOUR_USER", "password": "YOUR_PASS"}
login_resp = requests.post(login_url, json=login_data, timeout=30)
login_resp.raise_for_status()  # fail early on bad credentials
token = login_resp.json()["token"]
headers = {
    "Authorization": f"JWT {token}",
    "Accept": "application/json"
}
# then call get_dockerhub_tags with these headers

That’s all! You now have a list of tag names and their last_updated timestamps directly from Docker Hub. For the cleanup command, this list still has to be filtered by age, and the matching tags deleted.
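
The deletion step itself is not covered above. A hedged sketch follows, assuming Docker Hub accepts an authenticated DELETE on the per-tag URL that mirrors the listing endpoint; that endpoint should be verified against the current Docker Hub API reference before use. `delete_tags` is a hypothetical helper, and `session` is assumed to be a `requests.Session` (or any object with a compatible `delete()` method) that already carries the Authorization header.

```python
def delete_tags(session, namespace, repo, tag_names, dry_run=True):
    """Delete the given tags and return the names that were (or would be) deleted."""
    processed = []
    for name in tag_names:
        # Assumed endpoint, mirroring the listing URL; not confirmed by this issue.
        url = f"https://hub.docker.com/v2/repositories/{namespace}/{repo}/tags/{name}/"
        if not dry_run:
            resp = session.delete(url, timeout=30)
            resp.raise_for_status()
        processed.append(name)
    return processed
```

Defaulting to `dry_run=True` means a first run only reports which tags would be removed; a session could be prepared with `session = requests.Session()` followed by `session.headers.update(headers)`, using the headers from the authentication snippet.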
