Support Runner inside of Docker Container #406

jpb · 2020-04-03T21:18:49Z

Describe the enhancement

Fully support all features when runner is within a Docker container.

Not all features are currently supported when the runner is within a Docker container, specifically those features that use Docker like Docker-based Actions and services. Running self-hosted runners using Docker is an easy way to scale out runners on some sort of Docker-based cluster and an easy way to provide clean workspaces for each run (with ./run.sh --once).

Code Snippet

Possible implementation that I am using now.

Additional information

There are a few areas of concern when the runner executes in a Docker container:

Filesystem access for other containers needed as part of the job. This can be resolved by using a volume mount from the host which uses a matching host and container path (for example: docker run -v /home/github:/home/github, although it doesn't have to be this exact directory) and telling the runner to use a directory within that for the work directory (./config.sh --work /home/github/work). This works with the current volume mounting behaviour for containers created by the runner. This would need to be documented as part of the setup process for a Docker-based runner.
Network between runner and other containers needed as part of the job. This could be resolved by not creating a network as part of the run and instead optionally accepting an existing network to be used. I have found that it works well to use --network container:<container ID of the runner> to reuse the network from the runner container without having to orchestrate a network created via docker network create. There is no straightforward way to discover the network or ID of a container from within it, so it would likely need to be the responsibility of the user to pass this information to the runner (I currently do something like "container:$(cat /proc/self/cgroup | grep "cpu" | head -n 1 | rev | cut -d/ -f 1 | rev)" from within the runner container to find the ID and pass this to the runner, although this isn't guaranteed to work in all cases).
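
The ID lookup described above can be sketched as a small shell helper. This is only a sketch, not something the runner does itself: the function names container_id_from_cgroup and runner_network_arg are hypothetical, and it assumes a cgroup v1 host where /proc/self/cgroup contains the 64-character container ID. On other setups (cgroup v2, for instance) the file may not contain the ID, in which case the helper prints nothing and returns non-zero.

```sh
#!/bin/sh
# Print the first 64-hex-character ID found in a cgroup file.
# Assumption: cgroup v1 layout, e.g. "12:cpu,cpuacct:/docker/<id>".
container_id_from_cgroup() {
  cgroup_file="${1:-/proc/self/cgroup}"
  grep -oE '[0-9a-f]{64}' "$cgroup_file" | head -n 1
}

# Build the value for `--network` so that job containers reuse the runner
# container's network. Returns non-zero when no ID could be found.
runner_network_arg() {
  id="$(container_id_from_cgroup "${1:-/proc/self/cgroup}")"
  [ -n "$id" ] || return 1
  printf 'container:%s\n' "$id"
}
```

From inside the runner container this could be used as docker run --network "$(runner_network_arg)" ..., combined with the matching -v /home/github:/home/github mount from the first point.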


jpb · 2020-04-07T05:33:30Z

There appear to be a couple more things that need to be done to account for multiple runners on the same host concurrently:

docker network prune can not run concurrently and should likely be retried if such an error is received:

/usr/local/bin/docker network prune --force --filter "label=898d1dec6adc"
Error response from daemon: a prune operation is already running
##[warning]Delete stale container networks failed, docker network prune fail with exit code 1

The docker label is not sufficient for isolating separate runners on the same host. The current hash of the root directory will result in the same label being used for all runners with the exact same version. In my testing I've switched this to use the hostname, but perhaps something like the runner name or run ID could be used.
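
Both points could be handled along the following lines. This is a minimal sketch under stated assumptions: runner_label and retry are hypothetical helper names, the label scheme (a hash of host name plus runner name) is just one of the options floated above and not what the runner does today, and the wrapper retries on any failure rather than only on the "a prune operation is already running" error.

```sh
#!/bin/sh
# Derive a short label from a host name and a runner name so that two
# runners on the same host do not end up with the same label.
# (Hypothetical scheme; the runner currently hashes its root directory.)
runner_label() {
  printf '%s/%s' "$1" "$2" | sha256sum | cut -c1-12
}

# Run a command up to $1 times, sleeping $2 seconds between attempts.
# Returns 0 as soon as the command succeeds, 1 if every attempt failed.
retry() {
  attempts="$1"; delay="$2"; shift 2
  try=1
  until "$@"; do
    [ "$try" -lt "$attempts" ] || return 1
    try=$((try + 1))
    sleep "$delay"
  done
}
```

A prune call would then look like retry 5 2 docker network prune --force --filter "label=$(runner_label "$(hostname)" my-runner)", where my-runner stands in for whatever name the runner was registered with.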

jpb · 2020-05-13T18:04:06Z

@TingluoHuang @bryanmacfarlane I'm hoping to get your feedback on this – getting official support for this would be a huge help for me. I'm happy to work on an implementation if that is helpful.

SonicGD · 2020-06-08T07:49:26Z

This is a big problem for us. We want to run the gh runner in Docker for easier scaling and isolation, but we also need to run services for tests. Our workaround for now is to run multiple runners on the host, but scaling containers with docker-compose is so much easier and more convenient.

npalm · 2020-06-08T11:54:55Z

> This is a big problem for us. We want to run the gh runner in Docker for easier scaling and isolation, but we also need to run services for tests. Our workaround for now is to run multiple runners on the host, but scaling containers with docker-compose is so much easier and more convenient.

Fully agree. For the time being we have built a scalable solution on AWS spot instances to serve our Docker builds. A detailed blog post with a reference to the code: https://040code.github.io/2020/05/25/scaling-selfhosted-action-runners

jupe · 2021-02-16T12:36:13Z

> This is a big problem for us. We want to run the gh runner in Docker for easier scaling and isolation, but we also need to run services for tests. Our workaround for now is to run multiple runners on the host, but scaling containers with docker-compose is so much easier and more convenient.

Currently we have to rely on non-optimal workarounds to deploy tens of runners, or on rather ugly workarounds for container usage in jobs. How can we raise the priority of this?

Just curious, how do others manage scaling the runners? This is probably one of the most interesting approaches I've seen so far. I guess many of us have faced this same challenge when scaling gh-runners. "Official" scaling proposals from GitHub would be more than welcome :)

vincentbrison · 2021-02-19T09:08:07Z

> This is a big problem for us. We want to run the gh runner in Docker for easier scaling and isolation, but we also need to run services for tests. Our workaround for now is to run multiple runners on the host, but scaling containers with docker-compose is so much easier and more convenient.

Big kudos to @npalm for the AWS solution. We also built a similar solution for GCP that lets us scale our self-hosted runners for a whole GitHub organization: https://github.com/faberNovel/terraform-gcp-github-runner

callum-tait-pbx · 2021-04-17T11:10:08Z

@jupe we use https://github.com/summerwind/actions-runner-controller which has worked really well for us so far

pratikbin · 2021-06-27T07:19:11Z

waiting for this one so bad

uwehdaub · 2021-07-02T07:56:28Z

@jpb Is there any possibility to get a higher priority on this?

uwehdaub · 2021-07-08T09:26:38Z

For now we will use a workaround based on docker-compose.
We have the following docker-compose.yaml file in the repo to set up the services.

version: "3.3"
services:
  nginx:
    image: nginx
  redis:
    image: redis

We then connect the self-hosted runner, which is also running inside Docker, to the created network.
This is one example workflow:

name: Start docker compose
on:
  workflow_dispatch:
jobs:
  start-docker-compose:
    # should run as docker container with connection to the dockerd of the host
    runs-on: [self-hosted] 
    steps:
      - name: Checkout code
        uses: actions/checkout@v2
        with:
          fetch-depth: 1
      - name: Start docker compose
        id: start-docker-compose
        run: |
          project_prefix=my-project
          project_name="${project_prefix}-$(head /dev/urandom | tr -dc A-Za-z0-9 | head -c 13 | tr '[:upper:]' '[:lower:]')"
          my_container_id=$(grep docker /proc/self/cgroup | head -n 1 | sed "s|^.*/docker/\(.*\)|\\1|")

          docker-compose -p "${project_name}" up -d
          while ! docker network inspect "${project_name}_default" > /dev/null ; do
            sleep 1
          done

          docker network connect "${project_name}_default" "${my_container_id}"

          echo "::set-output name=my_container_id::$my_container_id"
          echo "::set-output name=project_name::$project_name"

      - name: Check output
        run: |
          echo "Project name: ${{ steps.start-docker-compose.outputs.project_name}}"
          echo "Container ID: ${{ steps.start-docker-compose.outputs.my_container_id}}"

      - name: Use the started docker compose services
        run: |
          # Install netcat to check redis
          apt-get update
          apt-get install -y netcat
          echo '--------------------'
          ping -c 1 nginx
          curl nginx
          echo '--------------------'
          ping -c 1 redis
          echo ping | netcat -w 2 redis 6379
      - name: Cleanup started docker compose services
        if: always()
        run: |
          docker network disconnect ${{ steps.start-docker-compose.outputs.project_name}}_default ${{ steps.start-docker-compose.outputs.my_container_id}}
          docker-compose -p ${{ steps.start-docker-compose.outputs.project_name}} down

bryanmacfarlane · 2021-07-09T01:33:35Z

@jpb, since you asked me, I'm ➕ on this, but adding @hross to weigh in since he's driving the runner area now. 🚀

hross · 2021-07-12T11:18:00Z

We still want to do this and it's on our list but we don't have a date or schedule for shipping this type of feature right now.

brandonschabell · 2021-07-13T23:44:31Z

Would love to see this prioritized. Can't really run docker-in-docker on Kubernetes self-hosted runners without this.

nehagargSeequent · 2021-08-18T20:11:20Z

Any update on this issue?

myoung34 · 2021-10-07T12:46:53Z

Ping @bryanmacfarlane =)

na-jakobs · 2021-11-17T22:32:39Z

Plus one here, any update ETA? @hross

pl4nty · 2021-12-18T22:22:23Z

Another +1 here, for me this is blocking some 3rd-party deployment workflows with private AKS clusters

ecout · 2022-09-07T23:34:33Z

> This is a big problem for us. We want to run the gh runner in Docker for easier scaling and isolation, but we also need to run services for tests. Our workaround for now is to run multiple runners on the host, but scaling containers with docker-compose is so much easier and more convenient.
>
> Fully agree. For the time being we have built a scalable solution on AWS spot instances to serve our Docker builds. A detailed blog post with a reference to the code: https://040code.github.io/2020/05/25/scaling-selfhosted-action-runners

Makes sense, I did something similar for DNS Resolver ENIs with CloudWatch as inputs.

ecout · 2022-09-07T23:57:00Z

The main issue I see with all this is access to docker.sock...the whole docker in docker with root access scenario.
myoung34/docker-github-actions-runner#61
From the examples mentioned here:
#367 (comment)
You can try rootless,
https://docs.docker.com/engine/security/rootless/#rootless-docker-in-docker

But then you run into limitations. With a Docker container "runner" running inside a rootless Docker container, itself inside your typical rootful Docker, can you even do docker build?
Buildkit: https://www.containiq.com/post/docker-alternatives

And apparently some things have stopped working:
#2103

And then again, at the end of the day, you'll want container orchestration to bring your runners up and down.

Can your actions consider docker alternatives to build images with a container runner?

https://snyk.io/blog/building-docker-images-kubernetes/

For our team specifically we do want container runners that are also able to run containers.

alexjoeyyong · 2023-01-18T17:03:56Z

Any update on this or new news?

AJMcKane · 2023-02-25T15:47:35Z

Also need to plus one this issue. I've tried every workaround including the latest changes to

https://github.com/actions/runner/blob/main/images/Dockerfile by @TingluoHuang and the team, but having the CLI isn't of much use unless we can run docker pull xxx. More specifically, anyone developing actions has to be hyper-aware of what the action is written in.

AllanOricil · 2023-05-03T01:56:05Z

I really want to run my jobs using that container feature :(

...
jobs:
  job:
    runs-on:
      labels:
        - self-hosted
        - linux
        - ${{ inputs.RUNNER_LABEL }}
    container:
      image: ${{ inputs.DOCKER_IMAGE }}
      credentials:
        username: ${{ github.actor }}
        password: ${{ secrets.GITHUB_TOKEN }}

    steps:
      - run: |
          sfdx version --json

Since I can't execute jobs that run inside containers, and my workflows don't need any other dockerized services, I can work around this limitation by creating pods from a Docker image that contains the runner plus everything from my own image that the job needs. The downside is that I need to create a new image. The following image shows exactly what I'm thinking about:

Note: I don't need more than one node. If an entire node goes down, I can just wait for EKS to recreate it, along with its pods.

With the workaround architecture in place, I can then remove the container configuration from the job manifest.

...
jobs:
  job:
    runs-on:
      labels:
        - self-hosted
        - linux
        - ${{ inputs.RUNNER_LABEL }}

    steps:
      - run: |
          sfdx version --json

Note: I'm just not sure whether the storage will somehow be shared by the pods or whether each pod gets its own, even when using the same name. If the storage is shared between pods, then one job could impact another if both pods run on the same node.

AllanOricil · 2023-05-03T09:37:08Z

I have a POD running a github-runner and a dind container. When a job that runs on a container is taken by the github-runner container, the job can't execute a simple inline script such as echo hello. Am I doing something wrong, or is it a problem caused by this issue as stated in this other issue?

In the following image you can see that the Container is created by the dind container without a problem, but the inline script can't be executed inside the container

This is my kubernetes deployment manifest:

apiVersion: apps/v1
kind: Deployment
metadata:
  name: github-runner-public
  namespace: github-runners
  labels:
    app: github-runner-public
spec:
  replicas: 1
  selector:
    matchLabels:
      app: github-runner-public
  template:
    metadata:
      labels:
        app: github-runner-public
    spec:
      nodeSelector:
        eks.amazonaws.com/nodegroup: public-nodegroup
      containers:
        - name: github-runner
          image: 225537886698.dkr.ecr.eu-west-1.amazonaws.com/github-runner-test:v1.1.3
          env:
            - name: DOCKER_HOST
              value: tcp://localhost:2375
            - name: DOCKER_API_VERSION
              value: "1.42"
          volumeMounts:
            - name: runner-workspace
              mountPath: /actions-runner/_work
        - name: dind-daemon
          image: docker:23.0.5-dind
          command: ["dockerd", "--host", "tcp://127.0.0.1:2375"]
          securityContext:
            privileged: true
          volumeMounts:
            - name: docker-graph-storage
              mountPath: /var/lib/docker
      volumes:
        - name: docker-graph-storage
          emptyDir: {}
        - name: runner-workspace
          emptyDir: {}

Why can't the container that runs inside dind access /actions-runner/_work/_temp from github-runner? I don't get it.
After reading this post I understood that the directories of the container inside dind would be mapped to the directories inside github-runner. So, if the container created by dind maps /actions-runner/_work from the github-runner container to /__w inside the container, as shown below, why isn't /__w/_temp/<bla>.sh available?

/actions-runner/_work (volume in the node) <- github-runner [/actions-runner/_work] -> (dind) -> my-container [/__w]

Shouldn't /actions-runner/_work/_temp/<bla>.sh be inside the volume?

This is the content of the /actions-runner/_work/_temp directory inside the github-runner container. For some reason it is empty. Does this mean that the controller can't create the inline script inside the runner when it is running in a container?

ChristopherHX · 2023-05-03T19:30:43Z

Your Kubernetes manifest has a problem: actions/runner performs Docker bind mounts, and you point DOCKER_HOST=tcp:// (the same applies to DOCKER_HOST=ssh://) at a different system with its own filesystem, so the mounted folder doesn't exist in dind-daemon.
Mounting the _work folder in both containers might help with run steps, but because the externals folder is empty on the Docker machine, it won't be able to find externals/node16/bin/node. So you would also need to download or mount the externals into the dind-daemon container before starting the container job.

You probably don't need to share any runner credentials with the dind container, but it would be easier to install actions/runner into the dind image rather than using two isolated containers.

apiVersion: apps/v1
kind: Deployment
metadata:
  name: github-runner-public
  namespace: github-runners
  labels:
    app: github-runner-public
spec:
  replicas: 1
  selector:
    matchLabels:
      app: github-runner-public
  template:
    metadata:
      labels:
        app: github-runner-public
    spec:
      nodeSelector:
        eks.amazonaws.com/nodegroup: public-nodegroup
      containers:
        - name: github-runner
          image: 225537886698.dkr.ecr.eu-west-1.amazonaws.com/github-runner-test:v1.1.3
          env:
            - name: DOCKER_HOST
              value: tcp://localhost:2375
            - name: DOCKER_API_VERSION
              value: "1.42"
          volumeMounts:
            - name: runner-workspace
              mountPath: /actions-runner/_work
        - name: dind-daemon
          image: docker:23.0.5-dind
          command: ["dockerd", "--host", "tcp://127.0.0.1:2375"]
          securityContext:
            privileged: true
          volumeMounts:
            - name: docker-graph-storage
              mountPath: /var/lib/docker
            - name: runner-workspace # You need to share this, docker bind mounts only work if the docker daemon can find the path locally
              mountPath: /actions-runner/_work
            # TODO download the external tools of the actions/runner to `/actions-runner/externals` to be able to use `actions/checkout@v3` and all other nodejs actions.
      volumes:
        - name: docker-graph-storage
          emptyDir: {}
        - name: runner-workspace
          emptyDir: {}
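
Whether the daemon really sees the runner's files at a given path can be checked with a marker file. A minimal sketch, with several assumptions: daemon_sees_path and the DOCKER_BIN override are hypothetical names, alpine is just an arbitrary small image, and the check has to run from the runner container with DOCKER_HOST set as in the manifest. It relies on the point made above: the daemon resolves the bind-mount source on its own filesystem, so a marker written by the runner only shows up if the path is shared.

```sh
#!/bin/sh
# Return 0 if the Docker daemon sees the same directory contents as this
# process at the given path, 1 if it does not, 2 if the marker file
# cannot be created. DOCKER_BIN lets the docker CLI be swapped out.
daemon_sees_path() {
  dir="$1"
  marker=".bind-mount-probe-$$"
  touch "$dir/$marker" || return 2
  # The bind-mount source is looked up on the daemon's filesystem,
  # not on the filesystem of the process calling the CLI.
  if "${DOCKER_BIN:-docker}" run --rm -v "$dir:/probe:ro" alpine \
       test -e "/probe/$marker"; then
    seen=0
  else
    seen=1
  fi
  rm -f "$dir/$marker"
  return "$seen"
}
```

Running daemon_sees_path /actions-runner/_work should fail with the original manifest and succeed once runner-workspace is mounted into both containers.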

AllanOricil · 2023-05-03T21:32:47Z

@ChristopherHX thank you for helping again 😄

AllanOricil · 2023-05-05T00:00:49Z

@ChristopherHX you are a god! Thank you! it worked :D

AllanOricil · 2023-05-05T01:32:48Z

before sharing /actions-runner/externals directory with the dind container

after sharing /actions-runner/externals directory with the dind container

This is my final deployment manifest. With this deployment, I was able to run a github actions job inside a Kubernetes POD

apiVersion: apps/v1
kind: Deployment
metadata:
  name: github-runner-public
  namespace: github-runners
  labels:
    app: github-runner-public
spec:
  replicas: 1
  selector:
    matchLabels:
      app: github-runner-public
  template:
    metadata:
      labels:
        app: github-runner-public
    spec:
      nodeSelector:
        eks.amazonaws.com/nodegroup: public-nodegroup
      containers:
        - name: github-runner
          image: 225537886698.dkr.ecr.eu-west-1.amazonaws.com/github-runner-test:v1.1.3
          env:
            - name: DOCKER_HOST
              value: tcp://localhost:2375
            - name: DOCKER_API_VERSION
              value: "1.42"
          volumeMounts:
            - name: runner-workspace
              mountPath: /actions-runner/_work
            - name: runner-externals
              mountPath: /externals
          lifecycle:
            postStart:
              exec:
                command:
                  ["/bin/sh", "-c", "cp -a /actions-runner/externals/. /externals"]
        - name: dind-daemon
          image: docker:23.0.5-dind
          command: ["dockerd", "--host", "tcp://127.0.0.1:2375"]
          securityContext:
            privileged: true
          volumeMounts:
            - name: docker-graph-storage
              mountPath: /var/lib/docker
            - name: runner-workspace
              mountPath: /actions-runner/_work
            - name: runner-externals
              mountPath: /actions-runner/externals
      volumes:
        - name: docker-graph-storage
          emptyDir: {}
        - name: runner-workspace
          emptyDir: {}
        - name: runner-externals
          emptyDir: {}

and this is the workflow that has a job that runs inside a container (dind)

name: Test Self-hosted Runners Docker

on:
  workflow_dispatch:
    inputs:
      RUNNER_LABEL:
        type: string
        description: choose the runner using a label
      DOCKER_IMAGE:
        type: string
        description: docker image
        default: ghcr.io/vodafoneis/salesforce-build-image:v3.x

env:
  HOME: /root

jobs:
  job:
    runs-on:
      labels:
        - self-hosted
        - linux
        - ${{ inputs.RUNNER_LABEL }}
    container:
      image: ${{ inputs.DOCKER_IMAGE }}
      credentials:
        username: ${{ github.actor }}
        password: ${{ secrets.GITHUB_TOKEN }}

    steps:
      - run: |
          echo $HOME
          echo $PATH
          sfdx version --json

      - uses: actions/checkout@v3

my github-runner image is using v2.303.0

Thanks @ChristopherHX for the tips

AllanOricil · 2023-05-05T02:01:44Z

Many jobs running on the same Node

AllanOricil · 2023-05-05T02:10:35Z

As an enhancement to avoid having each dind downloading the same image over and over again, which stresses the disk a lot, I'm going to follow these steps: https://blog.argoproj.io/storage-considerations-for-docker-in-docker-on-kubernetes-ed928a83331c

AllanOricil · 2023-05-05T13:36:48Z

I have also verified that my implementation enables service containers. Below you can see the job execution for the redis example from this repository.

AllanOricil · 2023-05-05T13:44:57Z

@TingluoHuang @bryanmacfarlane @nikola-jokic I think this issue can be closed. If not, could you provide an example of workflow manifest that won't work on my kubernetes cluster.

Sebastian-0 · 2023-05-08T08:19:31Z

@AllanOricil I disagree, this issue is not only about Kubernetes. We host our own runners in-house using docker (without Kubernetes) and this limitation is causing problems. The temporary solution for us is to install the runners directly on the machines rather than inside containers, but that's not the way I would like it configured.

AllanOricil · 2023-05-08T09:20:22Z

@Sebastian-0 I understand that there are limitations. But the way this issue is currently written does not say what these limitations are exactly. The way it is currently written, especially its title, can lead people to believe that it is not possible to run the runner inside a container in any way, which, after some trials, I discovered is not true.
I know that there is a solution that allows the runner to run inside a Docker container, because I was able to do it. However, this evidence alone does not prove that there isn't an edge case that won't work. That is why I'm asking for an example of a workflow that won't work with this solution. This can help other people better understand what the real problem is.

nbrugger-tgm · 2023-06-26T10:37:51Z

@AllanOricil, I do not think that this issue can be closed as long as there is no official guide/info on this such as:

docker will not be supported
to run in docker you need to ...image... entry point ... network … mounts ... etc that is kept up to date by the gh-actions team

timnolte · 2023-08-21T03:04:00Z

> @Sebastian-0 I understand that there are limitations. But the way this issue is currently written does not say what these limitations are exactly. The way it is currently written, especially its title, can lead people to believe that it is not possible to run the runner inside a container in any way, which, after some trials, I discovered is not true. I know that there is a solution that allows the runner to run inside a Docker container, because I was able to do it. However, this evidence alone does not prove that there isn't an edge case that won't work. That is why I'm asking for an example of a workflow that won't work with this solution. This can help other people better understand what the real problem is.

@AllanOricil basically these items are not working: myoung34/docker-github-actions-runner#98

Meaning that in GitHub Actions you can't really get container services working/running that might be needed. I was getting failures when attempting to use the "Local Registry" solution for GitHub Actions when building Docker images in order to test them.

/usr/bin/docker create --name fc2ec27c43d44dc68880bee45a7b12fb_registry2_b6e2e5 --label c3f261 --network github_network_6886f8441d2f4f3489b19978db60872b --network-alias registry -p 5000:5000  -e GITHUB_ACTIONS=true -e CI=true registry:2
  83b0180533c3ec325b50cff90428487490e74aaa907a46ac8ae45f08bc866755
  /usr/bin/docker start 83b0180533c3ec325b50cff90428487490e74aaa907a46ac8ae45f08bc866755
  Error response from daemon: network github_network_6886f8441d2f4f3489b19978db60872b not found
  Error: failed to start containers: 83b0180533c3ec325b50cff90428487490e74aaa907a46ac8ae45f08bc866755

I was trying to use the following Local Registry GitHub Actions setup for Docker image builds.

AllanOricil · 2023-08-21T03:07:21Z

Interesting. Now it makes sense to me why this should not be closed. Thanks @timnolte

fabio-s-franco · 2023-09-14T09:08:58Z

I have set up dind using Terraform and a custom image based on dind. That all ran in my local WSL2, so I assume it wouldn't be far-fetched to do it for the other scenarios. In an approach similar to what @AllanOricil did, I had to set up a bridge network properly, which took a lot of trial and error, but in the end it worked, both with a shared socket and with an independent socket. I will share what I did as is, and maybe someone can pick it up from there.

This is not a solution, but whoever is having problems may pick up some ideas from what I did:

Terraform code

(Should be straightforward to convert to anything else.)

Network

resource "docker_network" "rke_network" {

  name = local.docker_network_name

  driver     = "bridge"
  attachable = true
  internal   = false

  ipam_config {
    ip_range = local.network_subnet
    subnet   = local.network_subnet
    gateway  = local.network_gateway
  }

  options = {
    "com.docker.network.bridge.enable_icc"           = "true"
    "com.docker.network.bridge.enable_ip_masquerade" = "true"
    "com.docker.network.bridge.host_binding_ipv4"    = "0.0.0.0"
    "com.docker.network.bridge.name"                 = local.iface_name
    "com.docker.network.driver.mtu"                  = "65000"
    "com.docker.network.driver.txqueuelen"           = "10000"
  }

  provisioner "local-exec" {
    command = "sudo ip link set dev eth0 txqueuelen 10000 && sudo ip link set dev eth0 mtu 65000 && sysctl -w net/netfilter/nf_conntrack_max=393216"
  }
}
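
For anyone not using Terraform, the network resource above maps fairly directly onto docker network create. The following is a rough sketch rather than a tested equivalent: create_bridge_network and the DOCKER_BIN override are hypothetical names, the subnet, gateway and interface name are placeholders for the local.* values that are not shown, and only a subset of the options is carried over (the local-exec provisioner is not reproduced).

```sh
#!/bin/sh
# Create an attachable bridge network similar to the Terraform resource.
# Arguments: network name, subnet (CIDR), gateway IP, bridge interface name.
# Set DOCKER_BIN=echo to print the arguments instead of calling docker.
create_bridge_network() {
  name="$1"; subnet="$2"; gateway="$3"; iface="$4"
  "${DOCKER_BIN:-docker}" network create \
    --driver bridge --attachable \
    --subnet "$subnet" --ip-range "$subnet" --gateway "$gateway" \
    --opt "com.docker.network.bridge.enable_icc=true" \
    --opt "com.docker.network.bridge.enable_ip_masquerade=true" \
    --opt "com.docker.network.bridge.host_binding_ipv4=0.0.0.0" \
    --opt "com.docker.network.driver.mtu=65000" \
    --opt "com.docker.network.bridge.name=$iface" \
    "$name"
}
```

For example, DOCKER_BIN=echo followed by create_bridge_network rke_net 10.10.0.0/24 10.10.0.1 rke0 prints the arguments that would be passed to docker, which is a cheap way to review them before creating anything.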

Docker container node definition (where the agent would be)

resource "docker_container" "node" {
  count = 4

  name  = "rke-dind-${local.nodes[count.index].ipv4_address}"
  image = "docker.local/dind-ssh:latest"

  privileged        = true
  publish_all_ports = true

  user          = "root"
  cgroupns_mode = "host"
  ipc_mode      = "shareable"
  stdin_open    = true
  tty           = true
  runtime       = "runc"

  network_mode = "bridge"

  env = [ "DOCKER_TLS_CERTDIR=\"\"", "AUTH_PUBKEY=${tls_private_key.ssh_key.public_key_openssh}", "AUTH_PRVKEY=${tls_private_key.ssh_key.private_key_openssh}" ]

  networks_advanced {
    name         = local.docker_network_name
    ipv4_address = local.nodes[count.index].ipv4_address
  }

  ports {
    internal = 22
    ip = "0.0.0.0"
    protocol = "tcp"
  }

  ports {
    internal = 2375
    ip = "0.0.0.0"
    protocol = "tcp"
  }

  ports {
    internal = 2376
    ip = "0.0.0.0"
    protocol = "tcp"
  }

  ports {
    internal = 2379
    ip = "0.0.0.0"
    protocol = "tcp"
  }
}

Building a custom dind image

I did this to enable SSH into it, but the same approach can also be used to add an agent, for example.
You should generate the RSA keys yourself, as they are used in the image build process (they could also be generated during the build instead). My use case required that I had these keys pre-made elsewhere.

Dockerfile

FROM docker:dind AS dindssh

# Install SSH server
RUN apk add openssh-server

# Generate SSH host keys
RUN ssh-keygen -A

# Copy sshd_config file
COPY sshd_config /etc/ssh/sshd_config
COPY daemon.json /etc/docker/daemon.json
RUN chmod 700 /usr/local/bin/dockerd-entrypoint.sh
COPY dockerd-entrypoint.sh /usr/local/bin/dockerd-entrypoint.sh
RUN chmod +x /usr/local/bin/dockerd-entrypoint.sh

# Add authorized keys for root user
RUN mkdir -p /root/.ssh
RUN touch /root/.ssh/authorized_keys
RUN touch /root/.ssh/id_rsa
RUN chmod 600 /root/.ssh/authorized_keys
RUN chmod 600 /root/.ssh/id_rsa
RUN chown root:root /root/.ssh/authorized_keys
RUN chown root:root /root/.ssh/id_rsa

# Expose SSH, Docker and etcd ports
EXPOSE 22 2375 2379

# Start SSH daemon and Docker daemon
ENTRYPOINT (/usr/sbin/sshd -D &) && dockerd-entrypoint.sh
CMD []

Entrypoint file

Don't remember where I got the base file from, but it is highly customized to work with SSH + DIND

#!/bin/sh
set -eu

_tls_ensure_private() {
	local f="$1"; shift
	[ -s "$f" ] || openssl genrsa -out "$f" 4096
}
_tls_san() {
	{
		ip -oneline address | awk '{ gsub(/\/.+$/, "", $4); print "IP:" $4 }'
		{
			cat /etc/hostname
			echo 'docker'
			echo 'localhost'
			hostname -f
			hostname -s
		} | sed 's/^/DNS:/'
		[ -z "${DOCKER_TLS_SAN:-}" ] || echo "$DOCKER_TLS_SAN"
	} | sort -u | xargs printf '%s,' | sed "s/,\$//"
}
_tls_generate_certs() {
	local dir="$1"; shift

	# if server/{ca,key,cert}.pem && !ca/key.pem, do NOTHING except verify (user likely managing CA themselves)
	# if ca/key.pem || !ca/cert.pem, generate CA public if necessary
	# if ca/key.pem, generate server public
	# if ca/key.pem, generate client public
	# (regenerating public certs every startup to account for SAN/IP changes and/or expiration)

	if [ -s "$dir/server/ca.pem" ] && [ -s "$dir/server/cert.pem" ] && [ -s "$dir/server/key.pem" ] && [ ! -s "$dir/ca/key.pem" ]; then
		openssl verify -CAfile "$dir/server/ca.pem" "$dir/server/cert.pem"
		return 0
	fi

	# https://github.com/FiloSottile/mkcert/issues/174
	local certValidDays='825'

	if [ -s "$dir/ca/key.pem" ] || [ ! -s "$dir/ca/cert.pem" ]; then
		# if we either have a CA private key or do *not* have a CA public key, then we should create/manage the CA
		mkdir -p "$dir/ca"
		_tls_ensure_private "$dir/ca/key.pem"
		openssl req -new -key "$dir/ca/key.pem" \
			-out "$dir/ca/cert.pem" \
			-subj '/CN=docker:dind CA' -x509 -days "$certValidDays"
	fi

	if [ -s "$dir/ca/key.pem" ]; then
		# if we have a CA private key, we should create/manage a server key
		mkdir -p "$dir/server"
		_tls_ensure_private "$dir/server/key.pem"
		openssl req -new -key "$dir/server/key.pem" \
			-out "$dir/server/csr.pem" \
			-subj '/CN=docker:dind server'
		cat > "$dir/server/openssl.cnf" <<-EOF
			[ x509_exts ]
			subjectAltName = $(_tls_san)
		EOF
		openssl x509 -req \
				-in "$dir/server/csr.pem" \
				-CA "$dir/ca/cert.pem" \
				-CAkey "$dir/ca/key.pem" \
				-CAcreateserial \
				-out "$dir/server/cert.pem" \
				-days "$certValidDays" \
				-extfile "$dir/server/openssl.cnf" \
				-extensions x509_exts
		cp "$dir/ca/cert.pem" "$dir/server/ca.pem"
		openssl verify -CAfile "$dir/server/ca.pem" "$dir/server/cert.pem"
	fi

	if [ -s "$dir/ca/key.pem" ]; then
		# if we have a CA private key, we should create/manage a client key
		mkdir -p "$dir/client"
		_tls_ensure_private "$dir/client/key.pem"
		chmod 0644 "$dir/client/key.pem" # openssl defaults to 0600 for the private key, but this one needs to be shared with arbitrary client contexts
		openssl req -new \
				-key "$dir/client/key.pem" \
				-out "$dir/client/csr.pem" \
				-subj '/CN=docker:dind client'
		cat > "$dir/client/openssl.cnf" <<-'EOF'
			[ x509_exts ]
			extendedKeyUsage = clientAuth
		EOF
		openssl x509 -req \
				-in "$dir/client/csr.pem" \
				-CA "$dir/ca/cert.pem" \
				-CAkey "$dir/ca/key.pem" \
				-CAcreateserial \
				-out "$dir/client/cert.pem" \
				-days "$certValidDays" \
				-extfile "$dir/client/openssl.cnf" \
				-extensions x509_exts
		cp "$dir/ca/cert.pem" "$dir/client/ca.pem"
		openssl verify -CAfile "$dir/client/ca.pem" "$dir/client/cert.pem"
	fi
}

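# point clients at the local unix socket (the scheme is required for the "unix://*" match below)
export DOCKER_HOST=unix:///var/run/docker.sock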
# no arguments passed
# or first arg is `-f` or `--some-option`
if [ "$#" -eq 0 ] || [ "${1#-}" != "$1" ]; then
	# set "dockerSocket" to the default "--host" *unix socket* value (for both standard or rootless)
	uid="$(id -u)"
	if [ "$uid" = '0' ]; then
		dockerSocket='unix:///var/run/docker.sock'
	else
		# if we're not root, we must be trying to run rootless
		: "${XDG_RUNTIME_DIR:=/run/user/$uid}"
		dockerSocket="unix://$XDG_RUNTIME_DIR/docker.sock"
	fi
	case "${DOCKER_HOST:-}" in
		unix://*)
			dockerSocket="$DOCKER_HOST"
			;;
	esac

	# add our default arguments
	if [ -n "${DOCKER_TLS_CERTDIR:-}" ]; then
		_tls_generate_certs "$DOCKER_TLS_CERTDIR"
		# generate certs and use TLS if requested/possible (default in 19.03+)
		set -- dockerd \
			--tlsverify \
			--tlscacert "$DOCKER_TLS_CERTDIR/server/ca.pem" \
			--tlscert "$DOCKER_TLS_CERTDIR/server/cert.pem" \
			--tlskey "$DOCKER_TLS_CERTDIR/server/key.pem" \
			"$@"
		DOCKERD_ROOTLESS_ROOTLESSKIT_FLAGS="${DOCKERD_ROOTLESS_ROOTLESSKIT_FLAGS:-} -p 0.0.0.0:2376:2376/tcp"
	else
		# TLS disabled (-e DOCKER_TLS_CERTDIR='') or missing certs
		set -- dockerd \
			"$@"
		DOCKERD_ROOTLESS_ROOTLESSKIT_FLAGS="${DOCKERD_ROOTLESS_ROOTLESSKIT_FLAGS:-} -p 0.0.0.0:2375:2375/tcp"
	fi
fi

if [ "$1" = 'dockerd' ]; then
	# explicitly remove Docker's default PID file to ensure that it can start properly if it was stopped uncleanly (and thus didn't clean up the PID file)
	find /run /var/run -iname 'docker*.pid' -delete || :

	if dockerd --version | grep -qF ' 20.10.'; then
		# XXX inject "docker-init" (tini) as pid1 to workaround https://github.com/docker-library/docker/issues/318 (zombie container-shim processes)
		set -- docker-init -- "$@"
	fi

	if ! iptables -nL > /dev/null 2>&1; then
		# if iptables fails to run, chances are high the necessary kernel modules aren't loaded (perhaps the host is using nftables with the translating "iptables" wrappers, for example)
		# https://github.com/docker-library/docker/issues/350
		# https://github.com/moby/moby/issues/26824
		modprobe ip_tables || :
	fi

	uid="$(id -u)"
	if [ "$uid" != '0' ]; then
		# if we're not root, we must be trying to run rootless
		if ! command -v rootlesskit > /dev/null; then
			echo >&2 "error: attempting to run rootless dockerd but missing 'rootlesskit' (perhaps the 'docker:dind-rootless' image variant is intended?)"
			exit 1
		fi
		user="$(id -un 2>/dev/null || :)"
		if ! grep -qE "^($uid${user:+|$user}):" /etc/subuid || ! grep -qE "^($uid${user:+|$user}):" /etc/subgid; then
			echo >&2 "error: attempting to run rootless dockerd but missing necessary entries in /etc/subuid and/or /etc/subgid for $uid"
			exit 1
		fi
		: "${XDG_RUNTIME_DIR:=/run/user/$uid}"
		export XDG_RUNTIME_DIR
		if ! mkdir -p "$XDG_RUNTIME_DIR" || [ ! -w "$XDG_RUNTIME_DIR" ] || ! mkdir -p "$HOME/.local/share/docker" || [ ! -w "$HOME/.local/share/docker" ]; then
			echo >&2 "error: attempting to run rootless dockerd but need writable HOME ($HOME) and XDG_RUNTIME_DIR ($XDG_RUNTIME_DIR) for user $uid"
			exit 1
		fi
		if [ -f /proc/sys/kernel/unprivileged_userns_clone ] && unprivClone="$(cat /proc/sys/kernel/unprivileged_userns_clone)" && [ "$unprivClone" != '1' ]; then
			echo >&2 "error: attempting to run rootless dockerd but need 'kernel.unprivileged_userns_clone' (/proc/sys/kernel/unprivileged_userns_clone) set to 1"
			exit 1
		fi
		if [ -f /proc/sys/user/max_user_namespaces ] && maxUserns="$(cat /proc/sys/user/max_user_namespaces)" && [ "$maxUserns" = '0' ]; then
			echo >&2 "error: attempting to run rootless dockerd but need 'user.max_user_namespaces' (/proc/sys/user/max_user_namespaces) set to a sufficiently large value"
			exit 1
		fi
		# TODO overlay support detection?
		exec rootlesskit \
			--net="${DOCKERD_ROOTLESS_ROOTLESSKIT_NET:-vpnkit}" \
			--mtu="${DOCKERD_ROOTLESS_ROOTLESSKIT_MTU:-1500}" \
			--disable-host-loopback \
			--port-driver=builtin \
			--copy-up=/etc \
			--copy-up=/run \
			${DOCKERD_ROOTLESS_ROOTLESSKIT_FLAGS:-} \
			"$@"
	elif [ -x '/usr/local/bin/dind' ]; then
		# if we have the (mostly defunct now) Docker-in-Docker wrapper script, use it
		set -- '/usr/local/bin/dind' "$@"
	fi
else
	# if it isn't `dockerd` we're trying to run, pass it through `docker-entrypoint.sh` so it gets `DOCKER_HOST` set appropriately too
	set -- docker-entrypoint.sh "$@"
fi

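if [ -n "${AUTH_PUBKEY:-}" ]; then
	# install the provided SSH key pair for root; ssh refuses private keys with loose permissions
	mkdir -p /root/.ssh
	chmod 0700 /root/.ssh
	echo "$AUTH_PUBKEY" > /root/.ssh/authorized_keys
	echo "${AUTH_PRVKEY:-}" > /root/.ssh/id_rsa
	chmod 0600 /root/.ssh/authorized_keys /root/.ssh/id_rsa
fi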

mount --make-shared / && mount --make-shared /sys && mount --make-shared /var/lib/docker
exec "dockerd" "--tls=false"
#exec "$@"
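
The first check in the entrypoint (no arguments, or a first argument starting with `-`) decides whether `dockerd` gets prepended to the command line. A minimal sketch of that idiom in isolation; the function name `needs_default_command` is hypothetical and is not part of the script above:

```shell
#!/bin/sh
# Succeeds when the entrypoint would prepend `dockerd` itself:
# either no arguments were given, or the first one starts with "-".
needs_default_command() {
	# "${1#-}" strips one leading "-"; if the result differs from "$1",
	# the first argument was an option such as -f or --some-option.
	[ "$#" -eq 0 ] || [ "${1#-}" != "$1" ]
}

# An option as first argument: treated as flags for dockerd.
needs_default_command --debug && echo 'would run: dockerd --debug'

# Any other command: passed through docker-entrypoint.sh instead.
needs_default_command sh || echo 'would run: docker-entrypoint.sh sh'
```

Note that the modified script ends with `exec "dockerd" "--tls=false"`, so the arguments assembled by this logic are not used unless the commented-out `exec "$@"` is restored.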

daemon.json file

{
    "debug": false,
    "hosts": [
        "tcp://0.0.0.0:2375",
        "unix:///var/run/docker.sock"
    ],
    "runtimes": {
        "sysbox-runc": {
            "path": "/usr/bin/sysbox-runc"
        }
    },
    "dns": ["1.1.1.1"],
    "userland-proxy": false
}
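
With the `hosts` entries above, the daemon listens on the unix socket and on plain TCP port 2375 (no TLS, so anything that can reach the port can control the daemon). A rough sketch of how another container, such as the runner, might point its Docker CLI at that listener; the container name `dind` and the helper `dind_docker_host` are assumptions for illustration and do not appear in the original comment:

```shell
#!/bin/sh
# Build a DOCKER_HOST value for the plain-TCP listener from daemon.json.
# "dind" is a hypothetical container name reachable on a shared Docker network;
# 2375 matches the "tcp://0.0.0.0:2375" entry in "hosts".
dind_docker_host() {
	host="${1:-dind}"
	port="${2:-2375}"
	printf 'tcp://%s:%s\n' "$host" "$port"
}

# Example usage from the runner container (requires the docker CLI):
#   DOCKER_HOST="$(dind_docker_host)" docker info
```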

juannavalonribas · 2024-02-15T11:54:53Z

Hello, is there any update on this feature?

sergii-rybin-tfs · 2024-08-02T14:02:49Z

Hello, we need this feature as well.

jpb added the enhancement label Apr 3, 2020

jpb mentioned this issue Apr 3, 2020

Allow using Docker actions when running within a Docker container #383

Closed

TingluoHuang added the Runner ❤️ Container label Jun 6, 2020

davidkarlsen mentioned this issue Jun 24, 2020

Can't run Jobs with JS & Docker Actions if they are interdependent evryfs/github-actions-runner-operator#39

Closed

tcardonne mentioned this issue Jul 10, 2020

[Question] Using under docker compose tcardonne/docker-github-runner#17

Closed

myoung34 mentioned this issue Oct 14, 2020

Container feature is not supported when runner is already running inside container. myoung34/docker-github-actions-runner#61

Closed

dimisjim mentioned this issue Mar 10, 2021

Self Hosted Runner running in a container fails to run steps inside a service container "sh: 0: Can't open /__w/_temp/xxx.sh" #988

Closed

lee-at-work mentioned this issue Mar 12, 2021

Container feature is not supported when runner is already running inside container. tcardonne/docker-github-runner#36

Open

dimisjim mentioned this issue Mar 22, 2021

Add disclaimer in README that it does not support running "Container" and "Services" workflows myoung34/docker-github-actions-runner#98

Closed

tomkerkhove mentioned this issue Apr 14, 2021

Provide scaler for GitHub Actions Runner kedacore/keda#1732

Closed

skjnldsv mentioned this issue Jul 29, 2021

Fix CI failures when building settings app nextcloud/server#28202

Merged

0x2b3bfa0 mentioned this issue Sep 30, 2021

Standardize on container images instead of machine images iterative/terraform-provider-iterative#146

Open

myoung34 mentioned this issue Oct 7, 2021

matrix docker (container) builds fail myoung34/docker-github-actions-runner#156

Closed

madebyTimo mentioned this issue Jul 20, 2022

Maintenance: Parallelized Github Actions runner Gamify-IT/issues#119

Closed

nmalaguti mentioned this issue Dec 7, 2022

nodejs cannot be found when running Docker inside Docker myoung34/docker-github-actions-runner#261

Closed

th0th mentioned this issue Feb 22, 2023

i/o timeout error th0th/rancher-redeploy-workload#4

Open

l-maciej mentioned this issue Mar 8, 2023

Self hosted github runner needs version bump to 2.302.1 marcel-dempers/docker-development-youtube-series#189

Closed

karahiyo mentioned this issue Jun 10, 2023

support Docker container action karahiyo/actions-job#2

Open

echus mentioned this issue Oct 4, 2023

Set up Github Actions runners on NCI for deployment workflows ACCESS-NRI/build-ci#5

Closed

myoung34 mentioned this issue Mar 22, 2024

Not able to run GH action in container (aarch64/arm64) myoung34/docker-github-actions-runner#349

Closed

dk-coligo mentioned this issue May 22, 2024

TCB - docker-in-docker build toradex/vscode-torizon-templates#202

Closed
