
Conversation

Contributor

@fs-eire fs-eire commented Jun 6, 2025

Description

In-memory initializers are currently not handled correctly for non-CPU devices: if the memory planner assigns an initializer to a non-CPU device, the initializer is not copied to that device during session initialization.

This PR fixes this problem by adding an extra step to copy the initializer to the device if needed.
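The extra step can be sketched in standalone C++. The types below (`DeviceType`, `ElementType`, `Tensor`, `PlaceInitializer`) are simplified, hypothetical stand-ins for ONNX Runtime's tensor and device abstractions, not the actual API; the real change lives in onnxruntime/core/framework/session_state_utils.cc.

```cpp
#include <stdexcept>
#include <vector>

// Hypothetical simplified stand-ins for ONNX Runtime's device/tensor types.
enum class DeviceType { CPU, GPU };
enum class ElementType { Float, String };

struct Tensor {
  DeviceType device = DeviceType::CPU;
  ElementType elem_type = ElementType::Float;
  std::vector<float> data;  // payload for the non-string case
};

// Mirrors the PR's logic: if the planner placed the initializer on a
// non-CPU device, allocate a tensor there and copy the data over; string
// tensors cannot be copied between allocators, so they are rejected.
Tensor PlaceInitializer(const Tensor& cpu_initializer, DeviceType target) {
  if (target == DeviceType::CPU) {
    return cpu_initializer;  // original path: use the CPU tensor as-is
  }
  if (cpu_initializer.elem_type == ElementType::String) {
    // the real code returns a status via ORT_MAKE_STATUS instead of throwing
    throw std::runtime_error(
        "string tensor is not supported for copying between allocators");
  }
  Tensor device_tensor;
  device_tensor.device = target;              // "allocate" on the target device
  device_tensor.elem_type = cpu_initializer.elem_type;
  device_tensor.data = cpu_initializer.data;  // stand-in for the host-to-device copy
  return device_tensor;
}
```

A CPU-placed initializer passes through unchanged; a GPU-placed one comes back tagged with the target device and carrying the same payload, while a string tensor triggers the same error message the real code returns.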

Motivation and Context

Fixes #24768

@fs-eire fs-eire requested review from Copilot and skottmckay June 6, 2025 21:10
@fs-eire
Contributor Author

fs-eire commented Jun 6, 2025

@huningxin


Copilot AI left a comment


Pull Request Overview

This PR addresses an issue where in-memory initializers assigned to non-CPU devices were not correctly copied to their target device. The changes add an extra step to check the device type and, if needed, allocate a new tensor on the non-CPU device and perform a copy of the initializer.

  • Added a conditional branch to check if the device is non-CPU and perform tensor copying.
  • Incorporated error handling for string tensors during the copy operation.
  • Maintained backward compatibility by preserving the original initializer handling logic when on CPU.
Comments suppressed due to low confidence (1)

onnxruntime/core/framework/session_state_utils.cc:415

  • [nitpick] Consider adding a comment explaining why initializers assigned to non-CPU devices require an explicit copy from CPU memory, to help future maintainers understand the rationale behind this conditional branch.
if (device_type != OrtDevice::CPU) {

  // if the initializer is on a non-CPU device, copy it from CPU to the device.
  const auto& initializer_tensor = ort_value.Get<Tensor>();
  if (initializer_tensor.GetElementType() == ONNX_NAMESPACE::TensorProto_DataType_STRING) {
    return ORT_MAKE_STATUS(ONNXRUNTIME, FAIL, "string tensor is not supported for copying between allocators");

Copilot AI Jun 6, 2025


The error message for string tensors ('string tensor is not supported for copying between allocators') could be enhanced with additional context to guide users toward proper usage or handling. Consider providing a more descriptive message or suggesting alternative approaches where applicable.

Suggested change
return ORT_MAKE_STATUS(ONNXRUNTIME, FAIL, "string tensor is not supported for copying between allocators");
return ORT_MAKE_STATUS(ONNXRUNTIME, FAIL,
                       "String tensor copying between allocators is not supported due to limitations in ONNX Runtime. "
                       "Consider preprocessing string tensors on the CPU or avoiding operations that require copying them between devices.");

@snnn
Contributor

snnn commented Jun 7, 2025

Add a test case?

@fs-eire
Contributor Author

fs-eire commented Jun 9, 2025

The original issue can also be fixed by #23979. As suggested by @skottmckay, we can wait for that PR to be merged first.

@fs-eire
Contributor Author

fs-eire commented Jul 9, 2025

Closing this PR. The problem is fixed by #23979.

@fs-eire fs-eire closed this Jul 9, 2025


Development

Successfully merging this pull request may close these issues.

Native WebGPU EP fails to run model with in-memory external data
