@snnn snnn commented Apr 10, 2025

The Azure DevOps pipeline template tools/ci_build/github/azure-pipelines/nuget/templates/dml-vs-2022.yml builds the ONNX Runtime DirectML (DML) components. It historically contained two mechanisms for creating NuGet packages:

  1. Invoking python tools/ci_build/build.py with the --build_nuget flag.
  2. Executing a specific NuPackScript (usually calling msbuild /t:CreatePackage).

This redundancy created a significant problem during release builds (when the pipeline parameter IsReleaseBuild is set to true). Here's why:

  • Duplicate Package Creation: Both packaging methods would execute.
    • build.py --build_nuget created a package with a development/pre-release version suffix (e.g., Microsoft.ML.OnnxRuntime.DirectML.1.21.1-dev-20250408-0849-84808eb710.nupkg).
    • The NuPackScript's msbuild call, influenced by IsReleaseBuild=true, created the clean release version package (e.g., Microsoft.ML.OnnxRuntime.DirectML.1.21.1.nupkg).
  • ren Command Failure: For the x86 and arm64 builds, the NuPackScript contains a command like:
    ren Microsoft.ML.OnnxRuntime.DirectML.* win-dml-x86.zip
    This command fails when two files match the pattern Microsoft.ML.OnnxRuntime.DirectML.* (the dev package and the release package), because ren cannot rename more than one source file to the same fixed target name.
  • Result: Release-candidate and final release builds failed for the x86 and arm64 DML components. Regular nightly builds (IsReleaseBuild: false) were typically unaffected, most likely because only one package (the dev version) was produced, so the ren command succeeded. As a result, the problem surfaced only during a patch release for ONNX Runtime 1.21.
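
The collision described above can be illustrated with a small simulation. This is a simplified model of renaming every wildcard match to one fixed name, not cmd.exe's actual implementation; the function name `simulate_ren` and the constants are made up for illustration, and the file names are the examples quoted above.

```python
import fnmatch

PATTERN = "Microsoft.ML.OnnxRuntime.DirectML.*"
TARGET = "win-dml-x86.zip"


def simulate_ren(existing_files, pattern, target):
    """Model renaming every file that matches `pattern` to one fixed `target`.

    Returns the resulting set of file names, or raises FileExistsError when a
    second match would collide with the target created by the first rename.
    """
    files = set(existing_files)
    for name in sorted(fnmatch.filter(existing_files, pattern)):
        if target in files:
            raise FileExistsError(f"cannot rename {name}: {target} already exists")
        files.remove(name)
        files.add(target)
    return files


# Nightly build: a single dev package, so the rename goes through.
nightly = ["Microsoft.ML.OnnxRuntime.DirectML.1.21.1-dev-20250408-0849-84808eb710.nupkg"]
print(simulate_ren(nightly, PATTERN, TARGET))

# Release build: dev and release packages both match, so the second rename fails.
release = nightly + ["Microsoft.ML.OnnxRuntime.DirectML.1.21.1.nupkg"]
try:
    simulate_ren(release, PATTERN, TARGET)
except FileExistsError as err:
    print("rename failed:", err)
```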

(@amarin16, the release manager of ONNX Runtime 1.21, found the issue and explained to us why the pipeline was not working.)

The change is relatively simple. This PR removes the --build_nuget flag from the python tools/ci_build/build.py command within the dml-vs-2022.yml template. By removing the redundant packaging step from build.py, only the NuPackScript's msbuild command generates a package file. This ensures only one file matches the Microsoft.ML.OnnxRuntime.DirectML.* pattern, allowing the subsequent ren command in the x86 and arm64 scripts to execute successfully during release builds.

Background (how the DML packaging pipeline works)

The build has two stages:

  1. Individual Architecture Builds (Using dml-vs-2022.yml): Each stage (x64, x86, arm64) runs, now reliably using only its specific NuPackScript to generate its artifact without the risk of the ren command failing during release.
    • x64 produces: Microsoft.ML.OnnxRuntime.DirectML.[version].nupkg
    • x86 produces: win-dml-x86.zip
    • arm64 produces: win-dml-arm64.zip
    • arm32 is not built or included.
  2. Final Packaging Stage (e.g., stages/nuget_dml_packaging_stage.yml): Downloads these artifacts and combines them by unpacking the base x64 .nupkg, injecting the contents of the .zip files into the appropriate runtimes/ directories (e.g., runtimes/win-x86/native/, runtimes/win-arm64/native/), and re-packing the final, multi-architecture Microsoft.ML.OnnxRuntime.DirectML.nupkg.
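
As a rough sketch of the merge in stage 2, the snippet below treats the .nupkg as an ordinary zip archive (which NuGet packages are) and copies each architecture's files under runtimes/<rid>/native/. The function name `merge_into_nupkg`, the flat layout assumed inside the per-architecture zips, and the use of Python's zipfile module are all assumptions made for illustration; the real stage is implemented by the pipeline's own scripts and may differ.

```python
import zipfile


def merge_into_nupkg(base_nupkg, arch_zips, output_nupkg):
    """Copy the base package and add each architecture zip under runtimes/<rid>/native/.

    `arch_zips` maps a runtime identifier such as "win-x86" to a zip path.
    `output_nupkg` must be a different path from `base_nupkg`.
    """
    with zipfile.ZipFile(output_nupkg, "w", zipfile.ZIP_DEFLATED) as out:
        # Carry over everything already in the x64 base package.
        with zipfile.ZipFile(base_nupkg) as base:
            for info in base.infolist():
                out.writestr(info, base.read(info.filename))
        # Inject each architecture's files into its runtimes/ directory.
        for rid, zip_path in arch_zips.items():
            with zipfile.ZipFile(zip_path) as arch:
                for info in arch.infolist():
                    if info.is_dir():
                        continue
                    dest = f"runtimes/{rid}/native/{info.filename}"
                    out.writestr(dest, arch.read(info.filename))
```

A call would look like `merge_into_nupkg("base.nupkg", {"win-x86": "win-dml-x86.zip", "win-arm64": "win-dml-arm64.zip"}, "merged.nupkg")`, where the first and last file names are placeholders.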

In stage 1, only x64 produces a NuGet package. Accordingly, /p:IsReleaseBuild=${{ parameters.IsReleaseBuild }} is passed to every architecture's MSBuild call, while /p:CurrentData=$(BuildDate) and /p:CurrentTime=$(BuildTime) are passed only in the x64 script. Note that the property name "CurrentData" appears to be a typo for CurrentDate.
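
The per-architecture difference in MSBuild properties can be summarized in a few lines. This is only a restatement of the paragraph above in code form: the helper `msbuild_properties` is hypothetical, the date and time values are placeholders, and the property name is kept as `CurrentData` because that is the spelling the pipeline currently uses.

```python
def msbuild_properties(arch, is_release_build, build_date, build_time):
    """Return the /p: properties described above for one architecture."""
    # Every architecture receives IsReleaseBuild.
    props = [f"/p:IsReleaseBuild={str(is_release_build).lower()}"]
    # Only x64 builds a real NuGet package, so only x64 gets the date and time.
    if arch == "x64":
        props += [f"/p:CurrentData={build_date}", f"/p:CurrentTime={build_time}"]
    return props


for arch in ("x64", "x86", "arm64"):
    print(arch, msbuild_properties(arch, True, "20250408", "0849"))
```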

@snnn snnn changed the title from "Remove build-nuget from dml-vs-2022.ymlhttps://github.com/microsoft/onnxruntime/compare/nuget/templates/dml-vs-2022.yml" to "Remove build-nuget from dml-vs-2022.yml" Apr 10, 2025
@amarin16 amarin16 merged commit 2f8d79e into main Apr 10, 2025
134 of 143 checks passed
@amarin16 amarin16 deleted the snnn-patch-5 branch April 10, 2025 15:26
amarin16 pushed a commit that referenced this pull request Apr 10, 2025
adrianlizarraga added a commit that referenced this pull request Apr 22, 2025
Reverts #24372

The above PR removed the `--build_nuget` command-line argument from the
`dml-vs-2022.yml` file. This PR reverts that change and adds
`--build_nuget` back to the file.

The `--build_nuget` option creates the
`csharp\src\Microsoft.ML.OnnxRuntime\bin\RelWithDebInfo` directory
structure and stores binaries there. A subsequent task in the
YAML file tries to sign the DLLs in
`csharp\src\Microsoft.ML.OnnxRuntime\bin\RelWithDebInfo`, but that
task fails because the directory structure is never created once
`--build_nuget` is removed.
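
The dependency described above can be made explicit with a small pre-check. This is a hypothetical helper, not part of the pipeline: `dlls_to_sign` is an invented name, and the real signing step is a pipeline task whose internals are not shown here. The sketch only demonstrates how a missing directory could be reported before signing is attempted.

```python
from pathlib import Path


def dlls_to_sign(sign_dir):
    """List DLL names under `sign_dir`, failing early if the directory is missing."""
    root = Path(sign_dir)
    if not root.is_dir():
        # This is the situation hit after --build_nuget was removed:
        # nothing created the directory the signing step expects.
        raise FileNotFoundError(f"signing directory not found: {root}")
    return sorted(p.name for p in root.rglob("*.dll"))
```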
ashrit-ms pushed a commit that referenced this pull request Apr 24, 2025
intbf pushed a commit to intbf/onnxruntime that referenced this pull request Apr 25, 2025
Signed-off-by: bfilipek <[email protected]>
vraspar pushed a commit that referenced this pull request Apr 28, 2025