
Error running phi-3 vision directml P5000 gpu #822

Open
elephantpanda opened this issue Aug 20, 2024 · 3 comments

elephantpanda commented Aug 20, 2024

I am running the phi-3 vision DirectML tutorial code on an NVIDIA Quadro P5000 GPU (16 GB VRAM) with 12 GB of system RAM on Windows 10, but it fails when I pass an image path:

[screenshot: tutorial code and the error raised when an image path is passed]

It works when I don't pass an image:

[screenshot: the same code running successfully with no image]

I have tried both JPG and PNG images. Here is my image:
[attached image: decoded]

Any ideas what could be wrong?

I have 16 GB of GPU RAM and 12 GB of system RAM, and the run only uses about half of that, so I don't think memory is the problem.

Come to think of it, the phi-3 vision tutorial doesn't say it supports DirectML yet, even though there is a DirectML model. It says "Support for DirectML is coming soon!", but it's not clear how soon that means.

I tried it in C# and get the same error ☹:

OnnxRuntimeGenAIException: Non-zero status code returned while running MemcpyToHost node. Name:'Memcpy_token_5' Status Message: D:\a\_work\1\s\onnxruntime\core\providers\dml\DmlExecutionProvider\src\MLOperatorAuthorImpl.cpp(2557)\onnxruntime.dll!00007FF8171EFC45: (caller: 00007FF81780254D) Exception(9) tid(5324) 887A0006 The GPU will not respond to more commands, most likely because of an invalid command passed by the calling application.

Microsoft.ML.OnnxRuntimeGenAI.Result.VerifySuccess (System.IntPtr nativeResult) (at D:/a/_work/1/onnxruntime-genai/src/csharp/Result.cs:26)
Microsoft.ML.OnnxRuntimeGenAI.Generator.ComputeLogits () (at D:/a/_work/1/onnxruntime-genai/src/csharp/Generator.cs:25)
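
For reference, the C# side is essentially the phi-3 vision sample from onnxruntime-genai. A minimal sketch of what I am running (the model and image paths are placeholders, and the prompt template is what I believe the sample uses):

```csharp
using System;
using Microsoft.ML.OnnxRuntimeGenAI;

// Placeholder paths: local DirectML model folder and the test image.
using Model model = new Model(@"path\to\Phi-3-vision-128k-instruct-onnx-directml");
using MultiModalProcessor processor = new MultiModalProcessor(model);
using TokenizerStream tokenizerStream = processor.CreateStream();

// Loading and processing the image: the failure only happens when this is included.
using Images images = Images.Load(@"path\to\decoded.png");
string prompt = "<|user|>\n<|image_1|>\nWhat is shown in this image?<|end|>\n<|assistant|>\n";
using NamedTensors inputTensors = processor.ProcessImages(prompt, images);

using GeneratorParams generatorParams = new GeneratorParams(model);
generatorParams.SetSearchOption("max_length", 3072);
generatorParams.SetInputs(inputTensors);

using Generator generator = new Generator(model, generatorParams);
while (!generator.IsDone())
{
    generator.ComputeLogits();      // <-- throws the OnnxRuntimeGenAIException above
    generator.GenerateNextToken();
    var sequence = generator.GetSequence(0);
    Console.Write(tokenizerStream.Decode(sequence[sequence.Length - 1]));
}
```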

I believe my specs meet or exceed the recommended requirements.
(I also tried the CPU-only version; it works, but it is incredibly slow, e.g. 5+ minutes to get a response even with a very small image. Oddly, the image size doesn't seem to make any difference(!). I'm not sure how the vision part works. Is it iterating over every small patch or something?)


elephantpanda commented Aug 20, 2024

I think it is something to do with input length, since I can trigger the same bug with a really long prompt like this:
for (int i = 0; i < 500; i++) prompt += " cat";
But it is a bit weird to get a memory bug when only about half my VRAM and RAM are in use. As other people have noted, it seems to have problems with long contexts and memory bugs there.
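
Expanded into a self-contained repro, this is roughly what I run (sketch only; the prompt template is an assumption, and generation is set up exactly as in the image sketch in my first comment):

```csharp
// Text-only repro sketch: no image at all, just an artificially long prompt.
// Padding the prompt with ~500 extra words triggers the same
// "GPU will not respond to more commands" DML error for me.
string prompt = "<|user|>\n";
for (int i = 0; i < 500; i++) prompt += " cat";
prompt += "<|end|>\n<|assistant|>\n";
// ...then tokenize and generate exactly as in the image-free path above.
```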

[Actually, the problem with certain prompts causing an error seems to be a different bug.]

As an aside, the image seems to be encoded into about 2,500 tokens (50×50?). Is there a way to lower this for smaller images?

elephantpanda commented

The same bug is present in version 0.4.0.

elephantpanda commented

See also here for running just the vision part of the model in onnxruntime.
