Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[Build] Improve docker build performance #11111

Merged
merged 4 commits into from
Jun 16, 2022

Conversation

xumia
Copy link
Collaborator

@xumia xumia commented Jun 11, 2022

Why I did it

The docker storage driver vfs is not a good option for build, it uses the “deep copy” when building a new layer, leads to lower performance and more space used on disk than other storage drivers.
A better docker storage driver is the default one overlay2, it is a modern union filesystem.

  1. Improve build performance, reduce the build time
    Reduce around 3 hours in Broadcom official build, from 12.5 hours to 9.5 hours.
  2. Reduce the disk write bytes
    Reduce from 1600G to 360G in Broadcom official build.
  3. Reduce the required disk size during the build.

How I did it

  1. Change the storage driver setting of the docker in slave container.
  2. Mount the local file system to /var/lib/docker, since the overlay fs as the backing filesystem is not supported.

How to verify it

Verify the build option:
For default option:
BLDENV=buster make -f Makefile.work docker-start KEEP_SLAVE_ON=yes

xumia@dccae0cd559c:/sonic$ docker info | grep "Storage Driver"
Storage Driver: vfs

Enable overlay2:
SONIC_SLAVE_DOCKER_DRIVER=overlay2 BLDENV=buster make -f Makefile.work docker-start KEEP_SLAVE_ON=yes

xumia@ba4b89040dbb:/sonic$ docker info | grep "Storage Driver"
Storage Driver: overlay2

Compare the disk write bytes metrics:
image

Which release branch to backport (provide reason below if selected)

  • 201811
  • 201911
  • 202006
  • 202012
  • 202106
  • 202111

Description for the changelog

Link to config_db schema for YANG module changes

A picture of a cute animal (not mandatory but encouraged)

@xumia xumia requested review from qiluo-msft and lguohan as code owners June 11, 2022 00:18
@xumia xumia added the Build label Jun 11, 2022
@xumia xumia changed the title [Build] Improve docker build performance [Build] Improve SONiC docker build performance Jun 11, 2022
@xumia xumia changed the title [Build] Improve SONiC docker build performance [Build] Improve docker build performance Jun 11, 2022
@lguohan
Copy link
Collaborator

lguohan commented Jun 11, 2022

what is the difference between this one with the SONIC_CONFIG_USE_NATIVE_DOCKERD_FOR_BUILD?

@xumia
Copy link
Collaborator Author

xumia commented Jun 11, 2022

...

@xumia
Copy link
Collaborator Author

xumia commented Jun 11, 2022

what is the difference between this one with the SONIC_CONFIG_USE_NATIVE_DOCKERD_FOR_BUILD?

It uses different docker instances, the native docker uses the same docker instance started on the host. In the native docker mode, in the salve container, you can see everything the same as in the host. If multiple users use the same build system, they are sharing the same docker instance. We do not use the native docker in azp.
The PR is not for native docker, it still starts a new docker instance inside of the docker slave container, the behavior is not changed.

In the salve container, we can see the container d48d9949fb53 itself, when native docker enabled.

$ BLDENV=buster make -f Makefile.work sonic-slave-bash SONIC_CONFIG_USE_NATIVE_DOCKERD_FOR_BUILD=y

xumia@d48d9949fb53:/sonic$ sudo docker ps | grep d48d9949fb53
d48d9949fb53        sonic-slave-buster-xumia:c177586b087   "bash"                   About a minute ago   Up About a minute   22/tcp              strange_kilby
xumia@d48d9949fb53:/sonic$ 

@lguohan
Copy link
Collaborator

lguohan commented Jun 11, 2022

The default docker storage driver vfs is not configured in sonic slave images. If someone uses the images for the other
scenarios, not sonic build, and want to use the docker daemon inside of the salve docker, needs to config the docker options
himself or herself.

I am wondering if you can keep vfs as the default option and add new option for this? i think the slave image is used in some other places.

Makefile.work Show resolved Hide resolved
@xumia
Copy link
Collaborator Author

xumia commented Jun 12, 2022

The default docker storage driver vfs is not configured in sonic slave images. If someone uses the images for the other
scenarios, not sonic build, and want to use the docker daemon inside of the salve docker, needs to config the docker options
himself or herself.

I am wondering if you can keep vfs as the default option and add new option for this? i think the slave image is used in some other places.

Yes, fixed. We can config the overlay2 option inside of the slave containers, not necessary to change in the slave image.

@xumia
Copy link
Collaborator Author

xumia commented Jun 13, 2022

@lguohan , do you have more comments?

rules/config Outdated
@@ -229,3 +229,6 @@ DEFAULT_CONTAINER_REGISTRY ?=
# ENABLE_FIPS - support FIPS flag, if enabled, no additional config requred for the image to support FIPS
ENABLE_FIPS_FEATURE ?= y
ENABLE_FIPS ?= n

# SONIC_SLAVE_DOCKER_DRIVER - set the sonic slave docker storage driver
SONIC_SLAVE_DOCKER_DRIVER ?= overlay2
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

suggest to keep the default option as vfs, and we can experiment this overlay2 in our azure pipeline more before we change this option?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, it makes sense to only enable it in azp.
Fixed, thanks.

@lguohan
Copy link
Collaborator

lguohan commented Jun 15, 2022

for reference, why we changed to vfs in the first place. #2016

@xumia
Copy link
Collaborator Author

xumia commented Jun 15, 2022

for reference, why we changed to vfs in the first place. #2016

Thanks for the info. It is true, the overlay fs cannot be used as backing filesystem of docker. If the source folder is an overlay fs, then it will be an issue, need to mount a non-overlay fs for the slave docker. In the azp agents, it should work fine.

@xumia xumia merged commit b6811a5 into sonic-net:master Jun 16, 2022
@xumia xumia deleted the improve-docker-storage branch June 16, 2022 06:13
yxieca pushed a commit that referenced this pull request Jun 17, 2022
Why I did it
The docker storage driver vfs is not a good option for build, it uses the “deep copy” when building a new layer, leads to lower performance and more space used on disk than other storage drivers.
A better docker storage driver is the default one overlay2, it is a modern union filesystem.
@jusherma
Copy link
Contributor

This change assumes that all users building SONIC have password-less sudo access, which is not the case in our build environment. This is causing build problems for us.

@xumia
Copy link
Collaborator Author

xumia commented Jul 8, 2022

This change assumes that all users building SONIC have password-less sudo access, which is not the case in our build environment. This is causing build problems for us.

@jusherma , it was caused by the following line?
It is to clean up the data, maybe we can move the cleanup action inside of the slave container in slave.mk.

$(shell sudo rm -rf $(DOCKER_ROOT) && mkdir -p $(DOCKER_ROOT))

lguohan pushed a commit that referenced this pull request Jun 5, 2024
overlay2 is more faster comparing to vfs (more info in #11111)
And we use this option to build SONiC almost two years.
But it may be not obvious for some people that default choice is slow.
I think we should set overlay2 as a default storage driver for DinD.
arun1355492 pushed a commit to arun1355492/sonic-buildimage that referenced this pull request Jul 26, 2024
overlay2 is more faster comparing to vfs (more info in sonic-net#11111)
And we use this option to build SONiC almost two years.
But it may be not obvious for some people that default choice is slow.
I think we should set overlay2 as a default storage driver for DinD.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants