[release/2.5] FEAT:at least one of ROCM_HOME or CUDA_HOME must be None. #1814

Merged: jithunnair-amd merged 1 commit into release/2.5 from jnair/cp_PR_1809 on Jan 6, 2025

@jithunnair-amd (Collaborator) commented Jan 6, 2025

This PR is a release/2.5-based version of #1809

Copied description by @hj-wei from #1809

Hi all, I manually generate an nvcc to bypass the NVIDIA component checks in Megatron-LM; see
https://github.com/NVIDIA/Megatron-LM/blob/2da43ef4c1b9e76f03b7567360cf7390e877f1b6/megatron/legacy/fused_kernels/__init__.py#L57

However, this can lead to an incorrect CUDA_HOME configuration, which in turn can cause initialization anomalies in downstream libraries such as DeepSpeed.
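
For illustration only, since the diff itself is not part of this conversation: a minimal Python sketch of the invariant named in the title, that at least one of ROCM_HOME or CUDA_HOME ends up as None. The helper name `resolve_toolkit_homes`, its `is_rocm_build` parameter, and the policy of keeping the home that matches the build are all assumptions made for this sketch, not the code merged here.

```python
from typing import Optional, Tuple


def resolve_toolkit_homes(
    cuda_home: Optional[str],
    rocm_home: Optional[str],
    is_rocm_build: bool,
) -> Tuple[Optional[str], Optional[str]]:
    """Return (cuda_home, rocm_home) with at most one of the two set.

    Hypothetical helper: the name, signature, and tie-break policy are
    assumptions for illustration, not the change made in this PR.
    """
    if cuda_home is not None and rocm_home is not None:
        # Both toolkits were detected, for example because a stand-in
        # nvcc is on PATH. Keep only the home that matches the build.
        if is_rocm_build:
            cuda_home = None
        else:
            rocm_home = None
    return cuda_home, rocm_home


# Usage: a ROCm build where a stand-in nvcc made CUDA_HOME resolve too.
cuda_home, rocm_home = resolve_toolkit_homes("/usr/local/cuda", "/opt/rocm", True)
assert cuda_home is None or rocm_home is None
```

With a rule like this in place, a downstream library that checks whether CUDA_HOME is set would no longer see a CUDA toolkit path on a ROCm build.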

@jithunnair-amd merged commit e814ee8 into release/2.5 on Jan 6, 2025
@jithunnair-amd deleted the jnair/cp_PR_1809 branch on January 6, 2025 16:26
@jithunnair-amd (Collaborator, Author) commented:

! cherry-pick --onto release/2.4

@jithunnair-amd (Collaborator, Author) commented:

!cherry-pick --onto release/2.4

rocm-mici pushed a commit that referenced this pull request Jan 6, 2025
@rocm-mici commented:

Created branch release/2.4_cherry-pick_pr-1814 and #1815

@jithunnair-amd changed the title from "FEAT:at least one of ROCM_HOME or CUDA_HOME must be None." to "[release/2.5] FEAT:at least one of ROCM_HOME or CUDA_HOME must be None." on Jan 6, 2025
jithunnair-amd added a commit that referenced this pull request Jan 6, 2025
…HOME must be None. (#1815)

Cherry-pick of #1814

Co-authored-by: Jithun Nair <[email protected]>