bugfix: Fix flashinfer download-cubin#1729
Conversation
Summary of ChangesHello @tiran, I'm Gemini Code Assist1! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed! This pull request addresses critical issues and introduces significant performance enhancements for the Highlights
Using Gemini Code AssistThe full guide for Gemini Code Assist can be found on our documentation page, here are some quick tips. Invoking Gemini You can request assistance from Gemini at any point by creating a comment using either
Customization To customize Gemini Code Assist for GitHub experience, repository maintainers can create a configuration file and/or provide a custom code review style guide (such as PEP-8 for Python) by creating and adding files to a Limitations & Feedback Gemini Code Assist may make mistakes. Please leave feedback on any instances where its feedback is incorrect or counter productive. You can react with 👍 and 👎 on @gemini-code-assist comments. If you're interested in giving your feedback about your experience with Gemini Code Assist for Github and other Google products, sign up here. You can also get AI-powered code generation, chat, as well as code reviews directly in the IDE at no cost with the Gemini Code Assist IDE Extension. Footnotes
|
There was a problem hiding this comment.
Code Review
This pull request significantly improves the flashinfer download-cubin functionality. It correctly uses tqdm.contrib.logging to fix a logging issue, properly manages ThreadPoolExecutor resources with a context manager to prevent leaks, and introduces requests.Session to dramatically speed up downloads by reusing connections. These are excellent changes that enhance performance and robustness. My review includes a couple of suggestions for further improving code clarity and explicitness in control flow.
| download_file(uri, cubin_path, session=session) | ||
| return load_cubin(cubin_path, sha256) |
There was a problem hiding this comment.
The download_file function returns a boolean indicating success or failure. It's better to check this return value explicitly rather than relying on load_cubin to fail implicitly. This makes the control flow clearer and more robust, as download_file already handles logging on failure.
| download_file(uri, cubin_path, session=session) | |
| return load_cubin(cubin_path, sha256) | |
| if download_file(uri, cubin_path, session=session): | |
| return load_cubin(cubin_path, sha256) | |
| return b"" |
The previous version was attempting to use `tqdm` module as callable. The new version replaces the custom logging handler with `tqdm.contrib.logging` helper. Fixes `'module' object is not callable. Did you mean: 'tqdm.tqdm(...)'?` The `ThreadPoolExecutor` is now correctly wrapped in a context manager. The progress bar is updated by a future done callback. Signed-off-by: Christian Heimes <cheimes@redhat.com>
`requests.get` creates a new session object for each GET request. This is pretty inefficient, because each request has to perform DNS lookup, TCP handshake, and HTTPS handshake including certificate validation. A `requests.Session` can be shared between requests and across threads to make use of HTTP keep-alive. This change more than doubles the download speed and reduces the load on the server. Signed-off-by: Christian Heimes <cheimes@redhat.com>
6a29b89 to
4de6fc3
Compare
|
@yzh119 for review, please |
|
|
||
| def download_file(source, local_path, retries=3, delay=5, timeout=10, lock_timeout=30): | ||
| def download_file( | ||
| source, local_path, retries=3, delay=5, timeout=10, lock_timeout=30, session=None |
There was a problem hiding this comment.
better to also document session
|
@yzh119 when you plan to tag a new release? Thank you for your reviews and help! |
📌 Description
Summary: fix
flashinfer download-cubinand double download speedThe previous version was attempting to use
tqdmmodule as callable. The new version replaces the custom logging handler withtqdm.contrib.logginghelper. Fixes'module' object is not callable. Did you mean: 'tqdm.tqdm(...)'?The
ThreadPoolExecutoris now correctly wrapped in a context manager. The progress bar is updated by a future done callback.requests.getcreates a new session object for each GET request. This is pretty inefficient, because each request has to perform DNS lookup, TCP handshake, and HTTPS handshake including certificate validation.A
requests.Sessioncan be shared between requests and across threads to make use of HTTP keep-alive. This change more than doubles the download speed and reduces the load on the server.🔍 Related Issues
Fixes #1728
🚀 Pull Request Checklist
Thank you for contributing to FlashInfer! Before we review your pull request, please make sure the following items are complete.
✅ Pre-commit Checks
pre-commitby runningpip install pre-commit(or used your preferred method).pre-commit install.pre-commit run --all-filesand fixed any reported issues.🧪 Tests
unittest, etc.).Reviewer Notes