Skip to content

Commit

Permalink
[SPARK-26685][K8S] Correct placement of ARG declaration
Browse files Browse the repository at this point in the history
Latest Docker releases are stricter in their enforcement of build argument scope.  The location of the `ARG spark_uid` declaration in the Python and R Dockerfiles means the variable is out of scope by the time it is used in a `USER` declaration resulting in a container running as root rather than the default/configured UID.

Also with some of the refactoring of the script that has happened since my PR that introduced the configurable UID it turns out the `-u <uid>` argument is not being properly passed to the Python and R image builds when those are opted into

## What changes were proposed in this pull request?

This commit moves the `ARG` declaration to just before the argument is used such that it is in scope.  It also ensures that Python and R image builds receive the build arguments that include the `spark_uid` argument where relevant

## How was this patch tested?

Prior to the patch images are produced where the Python and R images ignore the default/configured UID:

```
> docker run -it --entrypoint /bin/bash rvesse/spark-py:uid456
bash-4.4# whoami
root
bash-4.4# id -u
0
bash-4.4# exit
> docker run -it --entrypoint /bin/bash rvesse/spark:uid456
bash-4.4$ id -u
456
bash-4.4$ exit
```

Note that the Python image is still running as `root` having ignored the configured UID of 456 while the base image has the correct UID because the relevant `ARG` declaration is correctly in scope.

After the patch the correct UID is observed:

```
> docker run -it --entrypoint /bin/bash rvesse/spark-r:uid456
bash-4.4$ id -u
456
bash-4.4$ exit
exit
> docker run -it --entrypoint /bin/bash rvesse/spark-py:uid456
bash-4.4$ id -u
456
bash-4.4$ exit
exit
> docker run -it --entrypoint /bin/bash rvesse/spark:uid456
bash-4.4$ id -u
456
bash-4.4$ exit
```

Closes #23611 from rvesse/SPARK-26685.

Authored-by: Rob Vesse <[email protected]>
Signed-off-by: Marcelo Vanzin <[email protected]>
  • Loading branch information
rvesse authored and Marcelo Vanzin committed Jan 22, 2019
1 parent c542c24 commit dc2da72
Show file tree
Hide file tree
Showing 3 changed files with 4 additions and 3 deletions.
3 changes: 2 additions & 1 deletion bin/docker-image-tool.sh
Original file line number Diff line number Diff line change
Expand Up @@ -154,10 +154,11 @@ function build {
fi

local BINDING_BUILD_ARGS=(
${BUILD_PARAMS}
${BUILD_ARGS[@]}
--build-arg
base_img=$(image_ref spark)
)

local BASEDOCKERFILE=${BASEDOCKERFILE:-"kubernetes/dockerfiles/spark/Dockerfile"}
local PYDOCKERFILE=${PYDOCKERFILE:-false}
local RDOCKERFILE=${RDOCKERFILE:-false}
Expand Down
Original file line number Diff line number Diff line change
Expand Up @@ -16,7 +16,6 @@
#

ARG base_img
ARG spark_uid=185

FROM $base_img
WORKDIR /
Expand All @@ -35,4 +34,5 @@ WORKDIR /opt/spark/work-dir
ENTRYPOINT [ "/opt/entrypoint.sh" ]

# Specify the User that the actual main process will run as
ARG spark_uid=185
USER ${spark_uid}
Original file line number Diff line number Diff line change
Expand Up @@ -16,7 +16,6 @@
#

ARG base_img
ARG spark_uid=185

FROM $base_img
WORKDIR /
Expand Down Expand Up @@ -46,4 +45,5 @@ WORKDIR /opt/spark/work-dir
ENTRYPOINT [ "/opt/entrypoint.sh" ]

# Specify the User that the actual main process will run as
ARG spark_uid=185
USER ${spark_uid}

0 comments on commit dc2da72

Please sign in to comment.