Fix cutlass inductor options for PyTorch < 2.8.0 #3988
The cuda.cutlass_epilogue_fusion_enabled and cuda.cutlass_tma_only inductor config options were added in PyTorch 2.8.0. Using these options on older PyTorch versions causes a RuntimeError during GRPOTrainer initialization. This fix adds a version check to only include these options when running PyTorch 2.8.0 or later, allowing GRPO training to work on older PyTorch versions (e.g., Colab environments with PyTorch 2.5-2.7).
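The gating described above can be sketched as follows. This is a minimal illustration, not the PR's actual code: `build_inductor_options` is a hypothetical helper, the PyTorch release is passed in as a plain tuple so the snippet runs without torch installed, and `sm_major` stands in for `torch.cuda.get_device_capability()[0]`.

```python
def build_inductor_options(torch_release, sm_major):
    """Return inductor options, adding cutlass keys only on PyTorch >= 2.8.0.

    torch_release: (major, minor, patch) tuple, e.g. (2, 7, 1) (hypothetical input)
    sm_major: CUDA compute capability major version (assumed to be an int)
    """
    is_sm90_plus = sm_major >= 9
    options = {"triton.enable_persistent_tma_matmul": is_sm90_plus}
    # The cutlass options only exist from PyTorch 2.8.0 onward, so skip them
    # on older releases instead of passing keys that would be rejected.
    if torch_release >= (2, 8, 0):
        options["cuda.cutlass_epilogue_fusion_enabled"] = is_sm90_plus
        options["cuda.cutlass_tma_only"] = is_sm90_plus
    options["cuda.compile_opt_level"] = "-O2"
    options["cuda.enable_cuda_lto"] = True
    return options
```

With this shape, `build_inductor_options((2, 7, 1), 9)` omits both cutlass keys, while `build_inductor_options((2, 8, 0), 9)` includes them.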
Summary of Changes
Hello @danielhanchen, I'm Gemini Code Assist! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed.
This pull request addresses a critical compatibility issue preventing [...]
Code Review
This pull request correctly adds a version check for PyTorch to conditionally include cutlass-specific inductor options, which is a great fix to support older PyTorch versions. I've identified a couple of areas for improvement. Firstly, the PyTorch version parsing could be made more robust to handle different version string formats. Secondly, there's a critical bug in how the cuda_options string is constructed, which would lead to a SyntaxError. My review includes suggestions to address both of these points.
cuda_options = """
"triton.enable_persistent_tma_matmul": torch.cuda.get_device_capability()[0] >= 9,"""
# cutlass options were added in PyTorch 2.8.0
if torch_version >= Version("2.8.0"):
    cuda_options += """
"cuda.cutlass_epilogue_fusion_enabled": torch.cuda.get_device_capability()[0] >= 9,
"cuda.cutlass_tma_only": torch.cuda.get_device_capability()[0] >= 9,
"cuda.cutlass_tma_only": torch.cuda.get_device_capability()[0] >= 9,"""
cuda_options += """
"cuda.compile_opt_level" : "-O2",
"cuda.enable_cuda_lto" : True,
}"""
The way the cuda_options string is constructed introduces a syntax error. There's a trailing comma after "cuda.enable_cuda_lto": True, which is the last item in the dictionary being constructed. This will cause a SyntaxError when the string is evaluated as Python code.
I've provided a suggestion that fixes this bug by removing the comma and also makes the string construction slightly more readable by combining the unconditional string additions.
Suggested change
- cuda_options = """
- "triton.enable_persistent_tma_matmul": torch.cuda.get_device_capability()[0] >= 9,"""
- # cutlass options were added in PyTorch 2.8.0
- if torch_version >= Version("2.8.0"):
-     cuda_options += """
- "cuda.cutlass_epilogue_fusion_enabled": torch.cuda.get_device_capability()[0] >= 9,
- "cuda.cutlass_tma_only": torch.cuda.get_device_capability()[0] >= 9,
- "cuda.cutlass_tma_only": torch.cuda.get_device_capability()[0] >= 9,"""
- cuda_options += """
- "cuda.compile_opt_level" : "-O2",
- "cuda.enable_cuda_lto" : True,
- }"""
+ cuda_options = '''
+ "triton.enable_persistent_tma_matmul": torch.cuda.get_device_capability()[0] >= 9,'''
+ # cutlass options were added in PyTorch 2.8.0
+ if torch_version >= Version("2.8.0"):
+     cuda_options += '''
+ "cuda.cutlass_epilogue_fusion_enabled": torch.cuda.get_device_capability()[0] >= 9,
+ "cuda.cutlass_tma_only": torch.cuda.get_device_capability()[0] >= 9,'''
+ cuda_options += '''
+ "cuda.compile_opt_level" : "-O2",
+ "cuda.enable_cuda_lto" : True
+ }'''
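As a quick check of the trailing-comma concern: Python's grammar allows a comma after the last item of a dict literal, so a string ending in `True,` followed by `}` should still evaluate. The sketch below assumes the assembled string is prefixed with an opening brace elsewhere (the hunk does not show it) and substitutes literal values for the `torch.cuda` expressions so that it runs without a GPU.

```python
import ast

# Stand-in for the assembled options string; the boolean literals replace
# the torch.cuda.get_device_capability() expressions (an assumption made
# only so that this snippet is self-contained).
options_src = """{
    "triton.enable_persistent_tma_matmul": False,
    "cuda.compile_opt_level" : "-O2",
    "cuda.enable_cuda_lto" : True,
}"""

# A trailing comma before the closing brace parses without a SyntaxError.
options = ast.literal_eval(options_src)
print(options["cuda.enable_cuda_lto"])  # prints True
```

Removing the comma is harmless, but on this reading it is a style choice rather than a bug fix.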
# Get PyTorch version for feature detection
try:
    torch_version = Version(torch.__version__.split("+")[0].split("a")[0].split("b")[0])
The current method of parsing the PyTorch version string is a bit fragile. It handles a and b for alpha/beta releases, but it doesn't handle dev releases correctly. For example, a version like 2.8.0.dev20240715 would not be stripped down, and Version('2.8.0.dev...') is considered less than Version('2.8.0'), which might not be the intended behavior if dev releases should already include the new features.
A more robust approach would be to use a regular expression to extract just the base X.Y.Z version string. This would handle a wider variety of version formats gracefully.
Suggested change
- torch_version = Version(torch.__version__.split("+")[0].split("a")[0].split("b")[0])
+ torch_version = Version(re.match(r"^\d+\.\d+\.\d+", torch.__version__).group(0))
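One caveat with the one-line regex is that `re.match` returns `None` when the string does not start with three dotted numbers, so `.group(0)` would raise an `AttributeError`. A minimal sketch of a guarded variant follows; `base_release` is a hypothetical helper name, and it returns a plain tuple rather than a `Version` object so that it depends only on the standard library.

```python
import re


def base_release(version_string):
    """Extract (major, minor, patch) from strings like '2.8.0.dev20240715+cu121'."""
    match = re.match(r"^(\d+)\.(\d+)\.(\d+)", version_string)
    if match is None:
        # Fail loudly instead of hitting AttributeError on None.group(...)
        raise ValueError(f"unrecognized version string: {version_string!r}")
    return tuple(int(part) for part in match.groups())
```

Under this scheme a dev build such as `"2.8.0.dev20240715"` maps to `(2, 8, 0)` and passes a `>= (2, 8, 0)` check, which matches the behavior the review comment asks for; whether dev builds should count as 2.8.0 is a judgment call for the maintainers.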
Co-authored-by: Daniel Hanchen <danielhanchen@users.noreply.github.com>
Summary
cuda.cutlass_epilogue_fusion_enabled and cuda.cutlass_tma_only options were added in PyTorch 2.8.0
RuntimeError: Unexpected optimization option cuda.cutlass_epilogue_fusion_enabled
Changes
torch_version variable for PyTorch version detection
(torch_version >= Version("2.8.0"))
Test plan