Why does creating a virtualenv for torch take about 7 seconds? #2312
Comments
I got pretty much all of that wrong!
Ok, for the last line A quick experiment though shows that the times here are ~10x larger than they should be; so I need to drill into this aspect. Next though is the So I think both of these can likely be optimized, but I won't be back at a keyboard to dive in until ~January 4th.
Ok, so it turns out both of these ~3s cases are entirely consumed by hashing distribution files, so the time taken at least makes sense now: this is hashing ~4GB of files, twice. The trick is to see if the hashing can be avoided, amortized, or parallelized.
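To illustrate the "parallelized" option only, here is a minimal sketch of hashing a directory tree with a thread pool. It is not Pex's implementation: `hash_file` and `hash_tree` are hypothetical names, and SHA-256 is an assumed choice of algorithm. Threads can help here because `hashlib` releases the GIL while digesting large buffers.

```python
import hashlib
import os
from concurrent.futures import ThreadPoolExecutor


def hash_file(path, chunk_size=1024 * 1024):
    """Return the SHA-256 hex digest of one file, read in fixed-size chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as fp:
        for chunk in iter(lambda: fp.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def hash_tree(root, max_workers=None):
    """Hash every file under root; return {path relative to root: hex digest}."""
    paths = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            paths.append(os.path.join(dirpath, name))
    paths.sort()  # deterministic ordering, independent of os.walk order
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # pool.map preserves input order, so digests line up with paths.
        digests = list(pool.map(hash_file, paths))
    return {os.path.relpath(p, root): d for p, d in zip(paths, digests)}
```

Whether this actually beats a serial loop depends on disk throughput and core count, so it would need measuring against the ~4GB case above.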
Previously, installing a wheel that was already installed incurred the cost of hashing the installed wheel chroot every time. The overhead of this wasted work for a warm cache was egregious for large distributions like PyTorch, with gigabytes of files to hash taking seconds. Work towards pex-tool#2312.
Ok, #2315 addresses "Building 0 artifacts and installing 22: 3142.4ms" and takes it to ~10ms. There is still "Installing 22 wheels in venv at ./tenv: 3585.3ms" to improve. The issue there is the same, re-hashing all files in an installed wheel chroot, but instead of doing that to get a single chroot hash, it's being done to create a compliant RECORD for the venv (for interoperability; e.g. so you can run
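For context on why producing a compliant RECORD requires reading every file, here is a rough sketch of how RECORD rows are formed under the wheel specification: each row is the path, `sha256=` plus the URL-safe base64 digest with padding stripped, and the size in bytes, while the RECORD file's own row leaves hash and size empty. The helper names `record_entry` and `render_record` are hypothetical, and this is not Pex's code.

```python
import base64
import csv
import hashlib
import io
import os


def record_entry(root, rel_path, chunk_size=1024 * 1024):
    """Build one RECORD row (path, hash, size) for a file under root."""
    digest = hashlib.sha256()
    size = 0
    with open(os.path.join(root, rel_path), "rb") as fp:
        for chunk in iter(lambda: fp.read(chunk_size), b""):
            digest.update(chunk)
            size += len(chunk)
    # RECORD uses URL-safe base64 with the trailing '=' padding removed.
    encoded = base64.urlsafe_b64encode(digest.digest()).rstrip(b"=").decode("ascii")
    return (rel_path.replace(os.sep, "/"), "sha256=" + encoded, str(size))


def render_record(root, rel_paths, record_rel_path):
    """Render RECORD text for rel_paths, plus the RECORD file's own row."""
    out = io.StringIO()
    writer = csv.writer(out, lineterminator="\n")
    for rel_path in sorted(rel_paths):
        writer.writerow(record_entry(root, rel_path))
    # The RECORD file cannot contain its own hash, so those fields stay empty.
    writer.writerow((record_rel_path, "", ""))
    return out.getvalue()
```

Since the hash depends only on file contents, rows computed once for an installed wheel could in principle be reused rather than recomputed per venv; whether and how Pex does that is not shown here.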
This is a question about pex's performance when creating virtualenvs. The example below is extracted from a Dockerfile where I am trying to understand if I can speed up the slowest step.
I created a lockfile for torch by running:
Then later I create a venv by running:
This takes about 7 seconds on my machine even when all of the wheels have been downloaded and extracted.
The output shows:
I see two steps that appear to take some time:
and
Adding more `-v` flags doesn't shed more light on this. Is there any obvious reason why these steps would take multiple seconds?
For the first line, I would expect that since the wheels are already installed to `.pex/installed_wheels`, it should be a no-op.
For the second line, since installing wheels into a venv is just creating hardlinks from the `.pex/installed_wheels` directory and updating the `RECORD` from the wheel, I would expect it to be really fast.
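To make the hardlink expectation concrete, here is a minimal sketch of mirroring an installed wheel directory into a venv's site-packages with hard links, falling back to a copy when linking fails (for example across filesystems). `link_tree` is a hypothetical helper, not a Pex function, and the real installer does more than this.

```python
import os
import shutil


def link_tree(src_root, dst_root):
    """Mirror src_root into dst_root, hard-linking files where possible.

    Returns (linked, copied) counts.
    """
    linked = copied = 0
    for dirpath, _dirnames, filenames in os.walk(src_root):
        rel_dir = os.path.relpath(dirpath, src_root)
        dst_dir = os.path.normpath(os.path.join(dst_root, rel_dir))
        os.makedirs(dst_dir, exist_ok=True)
        for name in filenames:
            src = os.path.join(dirpath, name)
            dst = os.path.join(dst_dir, name)
            try:
                os.link(src, dst)  # metadata-only operation; no file data is read
                linked += 1
            except OSError:
                # Cross-device link, unsupported filesystem, or dst already exists.
                shutil.copy2(src, dst)
                copied += 1
    return linked, copied
```

Linking alone never reads file contents, which is why it would be fast by itself; the cost discussed in this thread comes from hashing, not from the links.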