
Error running phi-3 vision directml P5000 gpu #822

Open
elephantpanda opened this issue Aug 20, 2024 · 3 comments

elephantpanda commented Aug 20, 2024

I am running the phi-3 vision DirectML tutorial code on an NVIDIA Quadro P5000 GPU (16 GB VRAM) with 12 GB of system RAM on Windows 10, but it fails when I pass an image path:

[screenshot: tutorial code and the error raised when an image path is passed]

It works when I don't pass an image:

[screenshot: the same code running successfully with no image]

I have tried both JPG and PNG images. Here is my image:
[attached image: decoded]

Any ideas what could be wrong?

I have 16 GB of GPU RAM and 12 GB of system RAM, and the run only uses about half of that, so I don't think memory is the problem.

Come to think of it, the phi-3 vision tutorial doesn't say it supports DirectML yet, even though there is a DirectML model. It says "Support for DirectML is coming soon!", but it's not clear how soon that means.

I tried it in C# and get the same error ☹:

OnnxRuntimeGenAIException: Non-zero status code returned while running MemcpyToHost node. Name:'Memcpy_token_5' Status Message: D:\a\_work\1\s\onnxruntime\core\providers\dml\DmlExecutionProvider\src\MLOperatorAuthorImpl.cpp(2557)\onnxruntime.dll!00007FF8171EFC45: (caller: 00007FF81780254D) Exception(9) tid(5324) 887A0006 The GPU will not respond to more commands, most likely because of an invalid command passed by the calling application.

Microsoft.ML.OnnxRuntimeGenAI.Result.VerifySuccess (System.IntPtr nativeResult) (at D:/a/_work/1/onnxruntime-genai/src/csharp/Result.cs:26)
Microsoft.ML.OnnxRuntimeGenAI.Generator.ComputeLogits () (at D:/a/_work/1/onnxruntime-genai/src/csharp/Generator.cs:25)
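
For reference, the C# side is essentially the phi-3 vision sample from onnxruntime-genai. A minimal sketch of what I am running (the model and image paths are placeholders, and the prompt template is what I believe the sample uses):

```csharp
using System;
using Microsoft.ML.OnnxRuntimeGenAI;

// Placeholder paths: local DirectML model folder and the test image.
using Model model = new Model(@"path\to\Phi-3-vision-128k-instruct-onnx-directml");
using MultiModalProcessor processor = new MultiModalProcessor(model);
using TokenizerStream tokenizerStream = processor.CreateStream();

// Loading and processing the image: the failure only happens when this is included.
using Images images = Images.Load(@"path\to\decoded.png");
string prompt = "<|user|>\n<|image_1|>\nWhat is shown in this image?<|end|>\n<|assistant|>\n";
using NamedTensors inputTensors = processor.ProcessImages(prompt, images);

using GeneratorParams generatorParams = new GeneratorParams(model);
generatorParams.SetSearchOption("max_length", 3072);
generatorParams.SetInputs(inputTensors);

using Generator generator = new Generator(model, generatorParams);
while (!generator.IsDone())
{
    generator.ComputeLogits();      // <-- throws the OnnxRuntimeGenAIException above
    generator.GenerateNextToken();
    var sequence = generator.GetSequence(0);
    Console.Write(tokenizerStream.Decode(sequence[sequence.Length - 1]));
}
```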

I believe my specs meet or exceed the recommended requirements.
(I also tried the CPU-only version; it works, but it is incredibly slow, e.g. 5+ minutes to get a response even with a very small image. Oddly, the image size doesn't seem to make any difference(!). I'm not sure how the vision part works. Is it iterating over every small patch or something?)


elephantpanda commented Aug 20, 2024

I think it is something to do with input length, since I can trigger the same bug with a really long prompt like this:
for (int i = 0; i < 500; i++) prompt += " cat";
But it is a bit weird to get a memory bug when only about half my VRAM and RAM are in use. As other people have noted, it seems to have problems with long contexts and memory bugs there.
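
Expanded into a self-contained repro, this is roughly what I run (sketch only; the prompt template is an assumption, and generation is set up exactly as in the image sketch in my first comment):

```csharp
// Text-only repro sketch: no image at all, just an artificially long prompt.
// Padding the prompt with ~500 extra words triggers the same
// "GPU will not respond to more commands" DML error for me.
string prompt = "<|user|>\n";
for (int i = 0; i < 500; i++) prompt += " cat";
prompt += "<|end|>\n<|assistant|>\n";
// ...then tokenize and generate exactly as in the image-free path above.
```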

[Actually, the problem with certain prompts causing an error seems to be a different bug.]

As an aside, the image seems to be encoded into about 2,500 tokens (50×50?). Is there a way to lower this for smaller images?

elephantpanda commented

The same bug is present in version 0.4.0.

elephantpanda commented

See also here for running just the vision part of the model in onnxruntime.
