huge file size in tarpaulin builds #516

Closed
xMAC94x opened this issue Jul 29, 2020 · 3 comments

Comments

@xMAC94x
Contributor

xMAC94x commented Jul 29, 2020

Hey devs,
We did some tests in our CI, where we run `cargo check`, build, test, bench, doc, release builds, and tarpaulin.
We stored their caches in different directories:

$ tree -L 2
.
|-- cache-all
|   |-- debug
|   `-- release
|-- cache-release-linux
|   `-- release
|-- cache-release-macos
|   |-- release
|   `-- x86_64-apple-darwin
|-- cache-release-windows
|   |-- release
|   `-- x86_64-pc-windows-gnu
|-- cache-tarpaulin
|   `-- debug

with the following sizes

$ du -sh *
5.3G	cache-all
1.3G	cache-release-linux
1.2G	cache-release-macos
1.2G	cache-release-windows
11G	cache-tarpaulin

We noticed that tarpaulin alone uses 11 GB, while all other builds combined (check, build, test, bench, doc) take only about 5 GB of space.
In this issue we wanted to ask whether there is potential for improvement.
We know that tarpaulin of course needs to add some information to the binaries in order to measure coverage, but let's go on:

$ cache-tarpaulin/debug# du -sh *
500M	build
5.7G	deps
4.0K	examples
4.1G	incremental

We see about 6 GB of dependencies and 4 GB of incremental builds:

$cache-tarpaulin/debug/deps# ls -Shl
total 5.7G
-rwxr-xr-x 2 root root  585M Jul  7 12:28 veloren_voxygen-a83d23850890f9ee
-rw-r--r-- 1 root root  219M Jul  7 12:27 libveloren_voxygen-99fa318ed0535ffc.rlib
-rw-r--r-- 1 root root  212M Jul  7 12:27 libveloren_voxygen-c940f6851bc3d210.rlib
-rwxr-xr-x 1 root root  187M Jul  7 12:28 veloren_voxygen-3329630e0989a9f9
-rwxr-xr-x 1 root root  145M Jul  7 12:26 veloren_client-723270e98d6d62c5
-rwxr-xr-x 1 root root  143M Jul  7 12:27 veloren_server-0549e449137df30c
-rw-r--r-- 1 root root  137M Jul  7 12:25 libveloren_server-ab80dc9c64adfe14.rlib
-rw-r--r-- 1 root root  132M Jul  7 12:25 libveloren_server-2a0d58b13bb14580.rlib
-rwxr-xr-x 1 root root  119M Jul  7 12:26 veloren_world-4cf4a3e00d663e42
-rw-r--r-- 1 root root  103M Jul  7 12:23 libgtk-5f86ecb13cac07aa.rlib
-rw-r--r-- 1 root root  102M Jul  7 12:24 libgtk-5f12019f6217edef.rlib
-rwxr-xr-x 1 root root   90M Jul  7 12:24 integration-4439d9a92a137e7b
-rw-r--r-- 1 root root   89M Jul  7 12:24 libveloren_common-b328f57ca788792e.rlib
-rw-r--r-- 1 root root   86M Jul  7 12:25 libveloren_common-6115d7ac45ca4614.rlib
-rw-r--r-- 1 root root   72M Jul  7 12:26 libveloren_world-98f55c44726caace.rlib
-rw-r--r-- 1 root root   71M Jul  7 12:26 libveloren_world-1031254064696422.rlib
-rwxr-xr-x 1 root root   51M Jul  7 12:23 closing-74f274e9c3d629bb
-rwxr-xr-x 1 root root   44M Jul  7 12:25 veloren_common-3b8b497e3122ba3d
-rwxr-xr-x 1 root root   44M Jul  7 12:21 libdiesel_derives-acf9b65d516c16df.so
...

These big files add up to about 2581 MB of the 5.7 GB total.

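I am wondering about two things here:

  • Many files seem to be duplicated in here (e.g. two libveloren_voxygen-*.rlib files with different hashes). This is how we install and run tarpaulin: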
RUSTFLAGS="--cfg procmacro2_semver_exempt" cargo install --git https://github.com/xd009642/tarpaulin --rev d3df2d98c6eb459129193c74aac941b4fec43e29;
ln -s /dockercache/cache-tarpaulin /dockercache/veloren/target; \
cargo tarpaulin -v; \
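
To check the duplication point above in a given cache, a small shell sketch like the following could list the crate names that have more than one .rlib in a deps directory. The function name `list_duplicate_rlibs` is hypothetical, and the sketch assumes Cargo's `lib<name>-<hash>.rlib` naming as seen in the listing above.

```shell
# Print each crate name that has more than one .rlib in a deps directory.
# Assumes file names of the form lib<name>-<hash>.rlib.
list_duplicate_rlibs() {
    deps_dir="$1"
    for f in "$deps_dir"/lib*-*.rlib; do
        [ -e "$f" ] || continue        # glob matched nothing
        name="${f##*/}"                # drop the directory part
        name="${name#lib}"             # drop the "lib" prefix
        printf '%s\n' "${name%-*}"     # drop "-<hash>.rlib"
    done | sort | uniq -d              # keep only names seen more than once
}
```

It would be called as `list_duplicate_rlibs cache-tarpaulin/debug/deps`. In the listings in this issue, the same two libveloren_voxygen hashes also show up under cache-all, so this particular duplication does not appear to be specific to tarpaulin.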

The next question I have is:

  • Some files seem quite big and bloated, so I am wondering whether they are compiled with detailed info. Is this needed for third-party crates that are not covered by tarpaulin anyway, e.g. libgtk, closing, libdiesel_derives?

Let's compare with the normal cargo build sizes:

$ cache-all/debug/deps# ls -Shl
total 1.8G
-rw-r--r-- 1 root root   32M Jul  7 12:02 libveloren_voxygen-99fa318ed0535ffc.rlib
-rw-r--r-- 2 root root   29M Jul  7 12:02 libveloren_voxygen-c940f6851bc3d210.rlib
-rwxr-xr-x 2 root root   27M Jul  7 12:03 veloren_voxygen-a83d23850890f9ee
-rw-r--r-- 1 root root   26M Jul  7 12:01 libgtk-5f12019f6217edef.rlib
-rw-r--r-- 1 root root   25M Jul  7 12:01 libveloren_common-b328f57ca788792e.rlib
-rw-r--r-- 1 root root   25M Jul  7 12:01 libgtk-5f86ecb13cac07aa.rlib
...
-rw-r--r-- 1 root root   18M Jul  7 12:00 libgtk-5f12019f6217edef.rmeta
-rw-r--r-- 1 root root   18M Jul  7 12:00 libgtk-5f86ecb13cac07aa.rmeta
...
-rw-r--r-- 1 root root   14M Jul  7 11:56 libdiesel-ad6c877cc5d19df1.rlib

If we don't cover those libs, couldn't we use the smaller versions of them?

Also, when inspecting the cache-tarpaulin/debug/incremental directory, I noticed that it has a lot of duplicate folders, e.g.:

26M	veloren_voxygen-11dxozsyonp5x
301M	veloren_voxygen-1kgnv3waswtoz
40M	veloren_voxygen-1ofw4sg2egpcz
473M	veloren_voxygen-2pndicdynx6t0
464M	veloren_voxygen-3cb6v82afvb1k
338M	veloren_world-35lnid4gbvwhd
302M	veloren_world-3p6wlvl9al6t0
277M	veloren_world-z5wvp4cjry4b
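
One option that could be tried for the incremental directory, assuming the CI job can afford non-incremental rebuilds, is to turn incremental compilation off with Cargo's `CARGO_INCREMENTAL` environment variable. This is only a sketch: the `run_coverage` wrapper is a hypothetical name, and whether tarpaulin passes the variable through unchanged has not been confirmed here.

```shell
# CARGO_INCREMENTAL is a documented Cargo environment variable; setting it
# to 0 disables incremental compilation for builds started from this shell.
export CARGO_INCREMENTAL=0

# Hypothetical wrapper around the invocation used in this issue. It is
# defined as a function so that sourcing this snippet does not start a build.
run_coverage() {
    cargo tarpaulin -v
}
```

The trade-off is build time: without incremental data, every changed crate is recompiled from scratch on each run.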

Some finishing words:
Of course we are aware that you are probably not the cargo developers and don't know all the internals.
However, we are creating 20 GB Docker images in order to provide our runners with caches, and we just wanted to ask you to look for some optimization potential. Reducing it to something like 16 GB would already be a great win :)
With that said: have a nice day :)

@xd009642
Owner

So tarpaulin uses link-dead-code as a linker option because otherwise it doesn't detect unused functions in your own code. I don't think there's a way to specify a linker flag without it propagating to all the dependencies, so all the unused parts of your dependencies can be linked in as well. I'd guess this is what creates the bloat.
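
A rough way to see how much of the size difference comes from this flag alone is to build the same workspace twice into separate target directories and compare them. This is a sketch under assumptions: the function name and the `target-plain` / `target-ldc` directory names are hypothetical, and tarpaulin may set other flags as well, so this only isolates the one named here.

```shell
# Build test artifacts with and without -C link-dead-code (a rustc codegen
# option) into separate target directories, then compare the deps sizes.
compare_link_dead_code() {
    cargo build --tests --target-dir target-plain &&
    RUSTFLAGS="-C link-dead-code" cargo build --tests --target-dir target-ldc &&
    du -sh target-plain/debug/deps target-ldc/debug/deps
}
```

Because `RUSTFLAGS` applies to every crate in the build, the second build also recompiles all dependencies with the flag, which matches the propagation described above.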

Alternatively, I could look at removing the flag and using the source-analysis part of tarpaulin to identify every line you can hit, but that's a significant amount of work and might not be robust against things like macros. I've been wondering for a while whether that is a worthwhile route to take, so I'll start looking into prototyping something for it and let you know if I make some progress 👍

Also, side note: you don't need RUSTFLAGS="--cfg procmacro2_semver_exempt" for tarpaulin since proc macros reached stable, so you can remove that unless it's needed for your own code to build.

@xMAC94x
Contributor Author

xMAC94x commented Jul 30, 2020

Hi @xd009642, thank you for your answer; it helps me a lot in understanding what is going on (:
I am looking forward to hearing how the prototyping works out.
And thanks for the tip regarding the RUSTFLAGS ;)
Have a nice evening

@xd009642
Owner

So I'm going to close this as inactive, but I am watching some RFCs on crate-specific RUSTFLAGS, which will dramatically reduce file sizes! Unfortunately, my other experiments in this area didn't work.
