Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Why does creating a virtualenv for torch take about 7 seconds? #2312

Open
zmanji opened this issue Dec 29, 2023 · 4 comments
Open

Why does creating a virtualenv for torch take about 7 seconds? #2312

zmanji opened this issue Dec 29, 2023 · 4 comments

Comments

@zmanji
Copy link
Collaborator

zmanji commented Dec 29, 2023

This is a question about pex's performance when creating virtualenvs. The example below is extracted from a Dockerfile where I am trying to understand if I can speed up the slowest step.

I created a lockfile for torch by running:

PEX_SCRIPT=pex3 pex lock create -v --resolver-version pip-2020-resolver --pip-version 22.3 --no-build -o torch.lock --platform 'linux_x86_64-cp-311-cp311' --platform 'macosx_11_0_arm64-cp-311-cp311' --platform 'linux_x86_64-cp-310-cp310' 'torch==2.1.2'

Then later I create venv by running:

PEX_SCRIPT='pex3' pex venv create -v --force -d ./tenv  --no-build --lock torch.lock 'torch'

This takes about 7 seconds on my machine even when all of the wheels have been downloaded and extracted.

The output shows:

PEX_SCRIPT='pex3' pex venv create -v --force -d ./tenv  --no-build --lock torch.lock 'torch'
pex: Using cached artifact at /home/zmanji/.pex/downloads/57dbda9b35157b05fb3e58ee91448612eb674172fab98ee235ccb0b5bee19a1c for FileArtifact(url='https://files.pythonhosted.org/packages/81/54/84d42a0bee35edba99dee7b59a8d4970eccdd44b99fe728ed912106fc781/filelock-3.13.1-py3-none-any.whl', fingerprint=Fingerprint(algorithm='sha256', hash='57dbda9b35157b05fb3e58ee91448612eb674172fab98ee235ccb0b5bee19a1c'), verified=False, filename='filelock-3.13.1-py3-none-any.whl')
pex: Using cached artifact at /home/zmanji/.pex/downloads/3a871edd6c02dae77ad810335c0833391c1a4ce49af21ea8cf0f6a5d2096eea8 for FileArtifact(url='https://files.pythonhosted.org/packages/03/f1/13137340776dd5d5bcfd2574c9c6dfcc7618285035cd77240496e5c1a79b/torch-2.1.2-cp310-cp310-manylinux1_x86_64.whl', fingerprint=Fingerprint(algorithm='sha256', hash='3a871edd6c02dae77ad810335c0833391c1a4ce49af21ea8cf0f6a5d2096eea8'), verified=False, filename='torch-2.1.2-cp310-cp310-manylinux1_x86_64.whl')
pex: Using cached artifact at /home/zmanji/.pex/downloads/d800d87f72189a745fa3d6b033b9dc4a34ad069f60ca60b943a63599f5501960 for FileArtifact(url='https://files.pythonhosted.org/packages/70/25/fab23259a52ece5670dcb8452e1af34b89e6135ecc17cd4b54b4b479eac6/fsspec-2023.12.2-py3-none-any.whl', fingerprint=Fingerprint(algorithm='sha256', hash='d800d87f72189a745fa3d6b033b9dc4a34ad069f60ca60b943a63599f5501960'), verified=False, filename='fsspec-2023.12.2-py3-none-any.whl')
pex: Using cached artifact at /home/zmanji/.pex/downloads/6088930bfe239f0e6710546ab9c19c9ef35e29792895fed6e6e31a023a182a61 for FileArtifact(url='https://files.pythonhosted.org/packages/bc/c3/f068337a370801f372f2f8f6bad74a5c140f6fda3d9de154052708dd3c65/Jinja2-3.1.2-py3-none-any.whl', fingerprint=Fingerprint(algorithm='sha256', hash='6088930bfe239f0e6710546ab9c19c9ef35e29792895fed6e6e31a023a182a61'), verified=False, filename='Jinja2-3.1.2-py3-none-any.whl')
pex: Using cached artifact at /home/zmanji/.pex/downloads/f18c69adc97877c42332c170849c96cefa91881c99a7cb3e95b7c659ebdc1ec2 for FileArtifact(url='https://files.pythonhosted.org/packages/d5/f0/8fbc882ca80cf077f1b246c0e3c3465f7f415439bdea6b899f6b19f61f70/networkx-3.2.1-py3-none-any.whl', fingerprint=Fingerprint(algorithm='sha256', hash='f18c69adc97877c42332c170849c96cefa91881c99a7cb3e95b7c659ebdc1ec2'), verified=False, filename='networkx-3.2.1-py3-none-any.whl')
pex: Using cached artifact at /home/zmanji/.pex/downloads/ee53ccca76a6fc08fb9701aa95b6ceb242cdaab118c3bb152af4e579af792728 for FileArtifact(url='https://files.pythonhosted.org/packages/37/6d/121efd7382d5b0284239f4ab1fc1590d86d34ed4a4a2fdb13b30ca8e5740/nvidia_cublas_cu12-12.1.3.1-py3-none-manylinux1_x86_64.whl', fingerprint=Fingerprint(algorithm='sha256', hash='ee53ccca76a6fc08fb9701aa95b6ceb242cdaab118c3bb152af4e579af792728'), verified=False, filename='nvidia_cublas_cu12-12.1.3.1-py3-none-manylinux1_x86_64.whl')
pex: Using cached artifact at /home/zmanji/.pex/downloads/339b385f50c309763ca65456ec75e17bbefcbbf2893f462cb8b90584cd27a1c2 for FileArtifact(url='https://files.pythonhosted.org/packages/b6/9f/c64c03f49d6fbc56196664d05dba14e3a561038a81a638eeb47f4d4cfd48/nvidia_cuda_nvrtc_cu12-12.1.105-py3-none-manylinux1_x86_64.whl', fingerprint=Fingerprint(algorithm='sha256', hash='339b385f50c309763ca65456ec75e17bbefcbbf2893f462cb8b90584cd27a1c2'), verified=False, filename='nvidia_cuda_nvrtc_cu12-12.1.105-py3-none-manylinux1_x86_64.whl')
pex: Using cached artifact at /home/zmanji/.pex/downloads/794e3948a1aa71fd817c3775866943936774d1c14e7628c74f6f7417224cdf56 for FileArtifact(url='https://files.pythonhosted.org/packages/86/94/eb540db023ce1d162e7bea9f8f5aa781d57c65aed513c33ee9a5123ead4d/nvidia_cufft_cu12-11.0.2.54-py3-none-manylinux1_x86_64.whl', fingerprint=Fingerprint(algorithm='sha256', hash='794e3948a1aa71fd817c3775866943936774d1c14e7628c74f6f7417224cdf56'), verified=False, filename='nvidia_cufft_cu12-11.0.2.54-py3-none-manylinux1_x86_64.whl')
pex: Using cached artifact at /home/zmanji/.pex/downloads/5ccb288774fdfb07a7e7025ffec286971c06d8d7b4fb162525334616d7629ff9 for FileArtifact(url='https://files.pythonhosted.org/packages/ff/74/a2e2be7fb83aaedec84f391f082cf765dfb635e7caa9b49065f73e4835d8/nvidia_cudnn_cu12-8.9.2.26-py3-none-manylinux1_x86_64.whl', fingerprint=Fingerprint(algorithm='sha256', hash='5ccb288774fdfb07a7e7025ffec286971c06d8d7b4fb162525334616d7629ff9'), verified=False, filename='nvidia_cudnn_cu12-8.9.2.26-py3-none-manylinux1_x86_64.whl')
pex: Using cached artifact at /home/zmanji/.pex/downloads/e54fde3983165c624cb79254ae9818a456eb6e87a7fd4d56a2352c24ee542d7e for FileArtifact(url='https://files.pythonhosted.org/packages/7e/00/6b218edd739ecfc60524e585ba8e6b00554dd908de2c9c66c1af3e44e18d/nvidia_cuda_cupti_cu12-12.1.105-py3-none-manylinux1_x86_64.whl', fingerprint=Fingerprint(algorithm='sha256', hash='e54fde3983165c624cb79254ae9818a456eb6e87a7fd4d56a2352c24ee542d7e'), verified=False, filename='nvidia_cuda_cupti_cu12-12.1.105-py3-none-manylinux1_x86_64.whl')
pex: Using cached artifact at /home/zmanji/.pex/downloads/6e258468ddf5796e25f1dc591a31029fa317d97a0a94ed93468fc86301d61e40 for FileArtifact(url='https://files.pythonhosted.org/packages/eb/d5/c68b1d2cdfcc59e72e8a5949a37ddb22ae6cade80cd4a57a84d4c8b55472/nvidia_cuda_runtime_cu12-12.1.105-py3-none-manylinux1_x86_64.whl', fingerprint=Fingerprint(algorithm='sha256', hash='6e258468ddf5796e25f1dc591a31029fa317d97a0a94ed93468fc86301d61e40'), verified=False, filename='nvidia_cuda_runtime_cu12-12.1.105-py3-none-manylinux1_x86_64.whl')
pex: Using cached artifact at /home/zmanji/.pex/downloads/f3b50f42cf363f86ab21f720998517a659a48131e8d538dc02f8768237bd884c for FileArtifact(url='https://files.pythonhosted.org/packages/65/5b/cfaeebf25cd9fdec14338ccb16f6b2c4c7fa9163aefcf057d86b9cc248bb/nvidia_cusparse_cu12-12.1.0.106-py3-none-manylinux1_x86_64.whl', fingerprint=Fingerprint(algorithm='sha256', hash='f3b50f42cf363f86ab21f720998517a659a48131e8d538dc02f8768237bd884c'), verified=False, filename='nvidia_cusparse_cu12-12.1.0.106-py3-none-manylinux1_x86_64.whl')
pex: Using cached artifact at /home/zmanji/.pex/downloads/8a7ec542f0412294b15072fa7dab71d31334014a69f953004ea7a118206fe0dd for FileArtifact(url='https://files.pythonhosted.org/packages/bc/1d/8de1e5c67099015c834315e333911273a8c6aaba78923dd1d1e25fc5f217/nvidia_cusolver_cu12-11.4.5.107-py3-none-manylinux1_x86_64.whl', fingerprint=Fingerprint(algorithm='sha256', hash='8a7ec542f0412294b15072fa7dab71d31334014a69f953004ea7a118206fe0dd'), verified=False, filename='nvidia_cusolver_cu12-11.4.5.107-py3-none-manylinux1_x86_64.whl')
pex: Using cached artifact at /home/zmanji/.pex/downloads/9d264c5036dde4e64f1de8c50ae753237c12e0b1348738169cd0f8a536c0e1e0 for FileArtifact(url='https://files.pythonhosted.org/packages/44/31/4890b1c9abc496303412947fc7dcea3d14861720642b49e8ceed89636705/nvidia_curand_cu12-10.3.2.106-py3-none-manylinux1_x86_64.whl', fingerprint=Fingerprint(algorithm='sha256', hash='9d264c5036dde4e64f1de8c50ae753237c12e0b1348738169cd0f8a536c0e1e0'), verified=False, filename='nvidia_curand_cu12-10.3.2.106-py3-none-manylinux1_x86_64.whl')
pex: Using cached artifact at /home/zmanji/.pex/downloads/1a6c4acefcbebfa6de320f412bf7866de856e786e0462326ba1bac40de0b5e71 for FileArtifact(url='https://files.pythonhosted.org/packages/a4/05/23f8f38eec3d28e4915725b233c24d8f1a33cb6540a882f7b54be1befa02/nvidia_nccl_cu12-2.18.1-py3-none-manylinux1_x86_64.whl', fingerprint=Fingerprint(algorithm='sha256', hash='1a6c4acefcbebfa6de320f412bf7866de856e786e0462326ba1bac40de0b5e71'), verified=False, filename='nvidia_nccl_cu12-2.18.1-py3-none-manylinux1_x86_64.whl')
pex: Using cached artifact at /home/zmanji/.pex/downloads/66439923a30d5d48399b08a9eae10370f6c261a5ec864a64983bae63152d39d7 for FileArtifact(url='https://files.pythonhosted.org/packages/4d/22/91a8af421c8a8902dde76e6ef3db01b258af16c53d81e8c0d0dc13900a9e/triton-2.1.0-0-cp310-cp310-manylinux2014_x86_64.manylinux_2_17_x86_64.whl', fingerprint=Fingerprint(algorithm='sha256', hash='66439923a30d5d48399b08a9eae10370f6c261a5ec864a64983bae63152d39d7'), verified=False, filename='triton-2.1.0-0-cp310-cp310-manylinux2014_x86_64.manylinux_2_17_x86_64.whl')
pex: Using cached artifact at /home/zmanji/.pex/downloads/dc21cf308ca5691e7c04d962e213f8a4aa9bbfa23d95412f452254c2caeb09e5 for FileArtifact(url='https://files.pythonhosted.org/packages/da/d3/8057f0587683ed2fcd4dbfbdfdfa807b9160b809976099d36b8f60d08f03/nvidia_nvtx_cu12-12.1.105-py3-none-manylinux1_x86_64.whl', fingerprint=Fingerprint(algorithm='sha256', hash='dc21cf308ca5691e7c04d962e213f8a4aa9bbfa23d95412f452254c2caeb09e5'), verified=False, filename='nvidia_nvtx_cu12-12.1.105-py3-none-manylinux1_x86_64.whl')
pex: Using cached artifact at /home/zmanji/.pex/downloads/af72aea155e91adfc61c3ae9e0e342dbc0cba726d6cba4b6c72c1f34e47291cd for FileArtifact(url='https://files.pythonhosted.org/packages/b7/f4/6a90020cd2d93349b442bfcb657d0dc91eee65491600b2cb1d388bc98e6b/typing_extensions-4.9.0-py3-none-any.whl', fingerprint=Fingerprint(algorithm='sha256', hash='af72aea155e91adfc61c3ae9e0e342dbc0cba726d6cba4b6c72c1f34e47291cd'), verified=False, filename='typing_extensions-4.9.0-py3-none-any.whl')
pex: Using cached artifact at /home/zmanji/.pex/downloads/c3588cd4295d0c0f603d0f2ae780587e64e2efeedb3521e46b9bb1d08d184fa5 for FileArtifact(url='https://files.pythonhosted.org/packages/d2/05/e6600db80270777c4a64238a98d442f0fd07cc8915be2a1c16da7f2b9e74/sympy-1.12-py3-none-any.whl', fingerprint=Fingerprint(algorithm='sha256', hash='c3588cd4295d0c0f603d0f2ae780587e64e2efeedb3521e46b9bb1d08d184fa5'), verified=False, filename='sympy-1.12-py3-none-any.whl')
pex: Using cached artifact at /home/zmanji/.pex/downloads/65c1a9bcdadc6c28eecee2c119465aebff8f7a584dd719facdd9e825ec61ab52 for FileArtifact(url='https://files.pythonhosted.org/packages/12/b3/d9ed2c0971e1435b8a62354b18d3060b66c8cb1d368399ec0b9baa7c0ee5/MarkupSafe-2.1.3-cp310-cp310-manylinux_2_17_x86_64.manylinux2014_x86_64.whl', fingerprint=Fingerprint(algorithm='sha256', hash='65c1a9bcdadc6c28eecee2c119465aebff8f7a584dd719facdd9e825ec61ab52'), verified=False, filename='MarkupSafe-2.1.3-cp310-cp310-manylinux_2_17_x86_64.manylinux2014_x86_64.whl')
pex: Using cached artifact at /home/zmanji/.pex/downloads/64335a8088e2b9d196ae8665430bc6a2b7e6ef2eb877a9c735c804bd4ff6467c for FileArtifact(url='https://files.pythonhosted.org/packages/1e/07/bf730d44c2fe1b676ad9cc2be5f5f861eb5d153fb6951987a2d6a96379a9/nvidia_nvjitlink_cu12-12.3.101-py3-none-manylinux1_x86_64.whl', fingerprint=Fingerprint(algorithm='sha256', hash='64335a8088e2b9d196ae8665430bc6a2b7e6ef2eb877a9c735c804bd4ff6467c'), verified=False, filename='nvidia_nvjitlink_cu12-12.3.101-py3-none-manylinux1_x86_64.whl')
pex: Using cached artifact at /home/zmanji/.pex/downloads/a0b2b9fe80bbcd81a6647ff13108738cfb482d481d826cc0e02f5b35e5c88d2c for FileArtifact(url='https://files.pythonhosted.org/packages/43/e3/7d92a15f894aa0c9c4b49b8ee9ac9850d6e63b03c9c32c0367a13ae62209/mpmath-1.3.0-py3-none-any.whl', fingerprint=Fingerprint(algorithm='sha256', hash='a0b2b9fe80bbcd81a6647ff13108738cfb482d481d826cc0e02f5b35e5c88d2c'), verified=False, filename='mpmath-1.3.0-py3-none-any.whl')
pex: Resolving distributions :: Resolving requirements from lock file torch.lock :: Building 0 artifacts and installing 22 :: Calculating project names for direct requirements:
  PyPIRequirement(line=LogicalLine(raw_text='torch', processed_text='torch', source='<string>', start_line=1, end_line=1), requirement=Requirement(name='torch', url=None, extras=frozenset(), specifier=<Spex: Resolving distributions :: Resolving requirements from lock file torch.lock :: Building 0 artifacts and installing 22 :: Installing 22 distributions                                                                                                                                                                                                                                                             pex: Resolving distributions: 3175.2ms                                                                                                                   
pex:   Parsing lock torch.lock: 31.1ms
pex:   Resolving requirements from lock file torch.lock: 3143.6ms
pex:     Parsing requirements: 0.5ms
pex:     Resolving urls to fetch for 1 requirements from lock torch.lock: 10.3ms
pex:     Hashing pex: 28.2ms
pex:     Isolating pex: 0.0ms
pex:     Downloading 22 distributions to satisfy 1 requirements: 8.5ms
pex:     Categorizing 22 downloaded artifacts: 0.2ms
pex:     Building 0 artifacts and installing 22: 3094.9ms
pex:       Calculating project names for direct requirements:
  PyPIRequirement(line=LogicalLine(raw_text='torch', processed_text='torch', source='<string>', start_line=1, end_line=1), requirement=Requirement(name='torch', url=None, extras=frozenset(), specifier=<SpecifierSet('')>, marker=None), editable=False): 0.0ms
pex:       Installing 22 distributions: 3026.8ms
pex: Re-writing /home/zmanji/tmp/tenv/bin/convert-caffe2-to-onnx
pex: Re-writing /home/zmanji/tmp/tenv/bin/convert-onnx-to-caffe2
pex: Re-writing /home/zmanji/tmp/tenv/bin/isympy
pex: Re-writing /home/zmanji/tmp/tenv/bin/torchrun
pex: Installing 22 wheels in venv at ./tenv: 3585.3ms

I see two seconds that appear to take some time:

Building 0 artifacts and installing 22: 3142.4ms

and

Installing 22 wheels in venv at ./tenv: 3650.0ms

Adding more -v flags doesn't shed more light on this. Is there any obvious reason why these steps would take multiple seconds?

For the first line I would expect that since the wheels are already installed to .pex/installed_wheels it should be a no-op.

For the second line since installing wheels into a venv is just creating hardlinks from the .pex/installed_wheels directory and updating the RECORD from the wheel I would expect it to be really fast.

@jsirois
Copy link
Member

jsirois commented Dec 29, 2023

For the second line since installing wheels into a venv is just creating hardlinks ...

Negative. When externalizing PEX never uses symlinks or hard links back into the PEX cache, it always copies. ... or it should. Digging a bit deeper here, but also, you appear to be missing updates from the last few releases that offer --no-pre-install-wheels and --max-install-jobs / PEX_MAX_INSTALL_JOBS to help deal with ridiculous distributions like Torch.

I got pretty much all of that wrong!

@jsirois
Copy link
Member

jsirois commented Dec 29, 2023

Ok, for the last line pex: Installing 22 wheels in venv at ./tenv: 3585.3ms - that is actually the time it takes to hardlink the site-packages tree. If I switch the symlink=False here to True I go from ~3.5s myself to ~30ms:
https://github.com/pantsbuild/pex/blob/a58848cbee4eff915a978e340c608fd9b50bfbb0/pex/cli/commands/venv.py#L311-L317

A quick experiment though shows that the times here are ~10x larger than they should be; so I need to drill into this aspect.

Next though is the pex: Building 0 artifacts and installing 22: 3094.9ms line. That appears to be all due to re-hashing as part of this optimization:
https://github.com/pantsbuild/pex/blob/7a2cee3126d05408252c8d0b103adaf13eff8fe6/pex/resolver.py#L433-L500

So I think both of these can likely be optimized, but I won't be back at a keyboard to dive in until ~January 4th.

@jsirois
Copy link
Member

jsirois commented Jan 4, 2024

Ok, so it turns out both of these cases of ~3s are all consumed in hashing distribution files. So the time taken at least makes sense now. This is hashing ~4GB of files x2. The trick is to see if the hashing can be avoided or amortized or parallelized.

jsirois added a commit to jsirois/pex that referenced this issue Jan 5, 2024
Previously, installing a wheel that was already installed incurred the
cost of hashing the installed wheel chroot every time. The overhead of
this wasted work for a warm cache was egregious for large distributions
like PyTorch, with gigabytes of files to hash taking seconds.

Work towards pex-tool#2312.
@jsirois
Copy link
Member

jsirois commented Jan 5, 2024

Ok, #2315 addresses "Building 0 artifacts and installing 22: 3142.4ms" and takes it to ~10ms. There is still "Installing 22 wheels in venv at ./tenv: 3585.3ms" to improve. The issue there is the same, re-hashing all files in an installed wheel chroot, but instead of doing that to get a single chroot hash, it's being done to create a compliant RECORD for the venv (for interoperability; e.g. so you can run pip uninstall X in the venv) with a hashed / sized entry for each file in the wheel. I think I can just re-use the hash and size values from the original .whl file RECORD but I need to think through the chain of custody / security implications a bit before charging ahead there. I'll leave this issue open for that work.

jsirois added a commit that referenced this issue Jan 5, 2024
Previously, installing a wheel that was already installed incurred the
cost of hashing the installed wheel chroot every time. The overhead of
this wasted work for a warm cache was egregious for large distributions
like PyTorch, with gigabytes of files to hash taking seconds.

Work towards #2312.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

2 participants