v4.1.x: Fix compilation issue in OFI with CUDA #11382

jtamzn · 2023-02-06T18:49:52Z

This patch fixes the compilation issue reported in #11381.

bot:notacherrypick

github-actions · 2023-02-06T18:50:39Z

Hello! The Git Commit Checker CI bot found a few problems with this PR:

e492149: Fix compilation issue in OFI with CUDA

check_signed_off: does not contain a valid Signed-off-by line
check_cherry_pick: does not include a cherry pick message (did you need to bot:notacherrypick?)

Please fix these problems and, if necessary, force-push new commits back up to the PR branch. Thanks!

github-actions · 2023-02-06T18:52:39Z

Hello! The Git Commit Checker CI bot found a few problems with this PR:

3e1d43c: Fix compilation issue in OFI with CUDA

check_cherry_pick: does not include a cherry pick message (did you need to bot:notacherrypick?)

Please fix these problems and, if necessary, force-push new commits back up to the PR branch. Thanks!

jsquyres · 2023-02-06T18:53:17Z

Thanks for the fix! Does this fix also apply to the main branch?

github-actions · 2023-02-06T18:54:22Z

Hello! The Git Commit Checker CI bot found a few problems with this PR:

3e1d43c: Fix compilation issue in OFI with CUDA

check_cherry_pick: does not include a cherry pick message (did you need to bot:notacherrypick?)

Please fix these problems and, if necessary, force-push new commits back up to the PR branch. Thanks!

wenduwan · 2023-02-06T19:05:28Z

> Thanks for the fix! Does this fix also apply to the main branch?

Thanks for looking at the issue! The main branch does not have this problem, so we do not need to apply the fix there.

jtamzn · 2023-02-06T19:22:10Z

@jsquyres no, we believe main/5.0.x is good. I think it only applies to the v4 branch.

jsquyres · 2023-02-06T19:33:10Z

Ok, if it's not needed on main, then bot:notacherrypick is appropriate -- I see you already added that.

bwbarrett

I think this is the wrong fix for what is actually a bad merge in 55d3501.

bwbarrett · 2023-02-06T21:17:24Z

Yeah, there's no reason for the OFI MTL on the 4.1.x series to have any CUDA code in it at all. It shouldn't be closing it, and it looks like it's just a bad merge that brought it in.

jtamzn · 2023-02-06T21:21:29Z

> Yeah, there's no reason for the OFI MTL on the 4.1.x series to have any CUDA code in it at all. It shouldn't be closing it, and it looks like it's just a bad merge that brought it in.

Oh, I forgot to switch GitHub accounts.

Actually, this sounds more reasonable to me. I was thinking that it could be backported CUDA code. If not, we should remove it.

Signed-off-by: Jingyin Tang <[email protected]>

jtamzn · 2023-02-07T18:18:00Z

@bwbarrett I updated the PR to remove the CUDA-related code that was blocking compilation. However, upon checking the file, there is an additional CUDA-guarded block at line 911:

#if OPAL_CUDA_SUPPORT
    /**
     * Some providers do not require the use of the CUDA convertor
     * in OMPI and its use will cause performance degradation. The
     * following providers will disable it when selected.
     */
    if (!strncmp(prov->fabric_attr->prov_name, "psm3", 4)
        || !strncmp(prov->fabric_attr->prov_name, "psm2", 4))
    {
        ompi_mtl_ofi.base.mtl_flags |= MCA_MTL_BASE_FLAG_CUDA_INIT_DISABLE;
    }
#endif /* OPAL_CUDA_SUPPORT */

It doesn't prevent compilation, and the compiled binary passes the OSU micro-benchmarks. Are these lines legitimate here? If not, we should remove them, too.

bwbarrett · 2023-02-07T20:52:38Z

> @bwbarrett I updated the PR to remove the CUDA-related code that was blocking compilation. However, upon checking the file, there is an additional CUDA-guarded block at line 911:

I think that code should be fine, and since it doesn't make any CUDA calls, it doesn't need the include you asked about.
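To illustrate the point above, here is a minimal standalone sketch (not the Open MPI source) of the same check. It assumes a plain string in place of `prov->fabric_attr->prov_name`, and it uses a hypothetical flag name and value, `SKETCH_FLAG_CUDA_INIT_DISABLE`, as a stand-in for `MCA_MTL_BASE_FLAG_CUDA_INIT_DISABLE`. The only header it needs is `<string.h>`, since the block is a prefix comparison plus a bitwise OR and makes no CUDA calls.

```c
/* Illustrative sketch only; names prefixed with "sketch_" / "SKETCH_" are
 * hypothetical and do not exist in Open MPI. */
#include <string.h>

/* Assumed stand-in for MCA_MTL_BASE_FLAG_CUDA_INIT_DISABLE; the real value
 * is defined by Open MPI and may differ. */
#define SKETCH_FLAG_CUDA_INIT_DISABLE 0x1u

/* Return `flags` with the disable bit set when the provider name starts
 * with "psm3" or "psm2"; otherwise return `flags` unchanged. */
static unsigned int sketch_apply_cuda_flag(const char *prov_name,
                                           unsigned int flags)
{
    if (!strncmp(prov_name, "psm3", 4)
        || !strncmp(prov_name, "psm2", 4))
    {
        flags |= SKETCH_FLAG_CUDA_INIT_DISABLE;
    }
    return flags;
}
```

Because `strncmp` compares only the first four characters, a provider name that merely begins with `psm2` or `psm3` also matches, while names such as `efa` leave the flags untouched.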

github-actions bot added the Target: v4.1.x label Feb 6, 2023

jtamzn force-pushed the v4.1.x branch from e492149 to 3e1d43c Compare February 6, 2023 18:51

jsquyres added this to the v4.1.5 milestone Feb 6, 2023

jsquyres changed the title ~~Fix compilation issue in OFI with CUDA~~ v4.1.x: Fix compilation issue in OFI with CUDA Feb 6, 2023

lrbison approved these changes Feb 6, 2023

bwbarrett self-assigned this Feb 6, 2023

bwbarrett requested changes Feb 6, 2023

bwbarrett added the Severity: blocker label Feb 6, 2023

Fix compilation issue in OFI with CUDA

7676618

Signed-off-by: Jingyin Tang <[email protected]>

jtamzn force-pushed the v4.1.x branch from 3e1d43c to 7676618 Compare February 7, 2023 18:15

bwbarrett approved these changes Feb 13, 2023

bwbarrett merged commit 1bebe3d into open-mpi:v4.1.x Feb 13, 2023

github-actions bot mentioned this pull request Mar 21, 2023

migrate Jenkins CI to Pipeline Joe-Downs/ompi#29

Merged

Flamefire mentioned this pull request Jul 2, 2024

add patch to fix implicit function declaration in OpenMPI 4.1.4 easybuilders/easybuild-easyconfigs#20949

Merged
