DmlExecutionProvider is missing after olive-ai installation #1619

Open
sashidhar540 opened this issue Feb 14, 2025 · 3 comments
Labels: DirectML

@sashidhar540

I am using a Windows laptop with an AMD Ryzen 7 PRO 7840U processor with Radeon 780M Graphics, 32 GB RAM, and DirectX support.

I created a new conda environment and activated it using the commands below:

conda create -n olive-directml python=3.12

conda activate olive-directml

The installed Python version is 3.12.9.

First, I checked whether DirectML support is present on my device by installing onnxruntime-directml with the command below:

pip install onnxruntime-directml

onnxruntime-directml version installed: 1.20.1

I checked the provider and device information using the code below:

import onnxruntime as ort
providers = ort.get_available_providers()
print("Available providers:", providers)
print("Device:", ort.get_device())

Output:

Available providers: ['DmlExecutionProvider', 'CPUExecutionProvider']
Device: CPU-DML
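
For reference, this is how I understand a session would be pinned to DirectML once the provider shows up (model.onnx is just a placeholder path here, not a file from this setup):

import onnxruntime as ort

# Placeholder path: replace with an actual exported ONNX model.
model_path = "model.onnx"

# Request DirectML first and fall back to CPU if it is unavailable.
session = ort.InferenceSession(
    model_path,
    providers=["DmlExecutionProvider", "CPUExecutionProvider"],
)
print("Session providers:", session.get_providers())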

Next, I followed the GitHub documentation for Olive to set up Olive for Windows DirectML:

https://microsoft.github.io/Olive/getting-started/getting-started.html

I ran the command:

pip install olive-ai[directml,finetune]

onnxruntime-directml version installed: 1.20.1

Now, when I checked the provider and device information using the same code:

import onnxruntime as ort
providers = ort.get_available_providers()
print("Available providers:", providers)
print("Device:", ort.get_device())

Output:

Available providers: ['AzureExecutionProvider', 'CPUExecutionProvider']
Device: CPU

'DmlExecutionProvider' is missing after the olive-ai installation.

Is the missing DmlExecutionProvider a known issue, or am I doing something wrong?

Can I use my machine's integrated Radeon 780M GPU for inferencing with DirectML? If so, please let me know the steps to follow.

I was able to follow the Olive documentation for CPU and perform inferencing using Olive, but I would also like to explore the DirectML capabilities of my device.

@jambayk
Contributor

jambayk commented Feb 14, 2025

Hi, this happened because the onnxruntime-genai dependency of the finetune extra also installed the CPU onnxruntime package. This installation overrode your existing onnxruntime-directml installation.

I opened PR #1620 to fix this.

Meanwhile, you can fix this by running:

pip uninstall -y onnxruntime onnxruntime-directml onnxruntime-genai
pip install onnxruntime-directml onnxruntime-genai-directml
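
To double-check which onnxruntime wheels remain installed afterwards, something like this should work (a rough sketch using importlib.metadata; any onnxruntime package other than the -directml ones would indicate a leftover CPU build):

import importlib.metadata as md
import onnxruntime as ort

# Print every installed distribution whose name mentions onnxruntime.
for dist in md.distributions():
    name = dist.metadata["Name"]
    if name and "onnxruntime" in name.lower():
        print(name, dist.version)

# The DirectML provider should be listed again after the reinstall.
print("Available providers:", ort.get_available_providers())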

@sashidhar540
Author

sashidhar540 commented Feb 15, 2025

I started again from scratch to check the complete flow:

conda create --name olive-directml python=3.12

conda activate olive-directml

Following the documentation at https://microsoft.github.io/Olive/getting-started/getting-started.html, I performed the Olive installation steps for Windows DirectML:

pip install olive-ai[directml,finetune]

pip install transformers==4.44.2 onnxruntime-genai-directml

After following these steps, I could not find 'DmlExecutionProvider'.

So, as per your comments above, I executed the commands below:

pip uninstall -y onnxruntime onnxruntime-directml onnxruntime-genai
pip install onnxruntime-directml onnxruntime-genai-directml

Now I could see 'DmlExecutionProvider' in the providers list and the device as 'CPU-DML'.

Later, I tried to execute the command below for automatic optimization of the model with Olive:

olive auto-opt --model_name_or_path meta-llama/Llama-3.2-1B-Instruct --trust_remote_code --output_path models/Llama-3.2-1B-Instruct --device gpu --provider DmlExecutionProvider --use_ort_genai --precision int4 --log_level 1

I got the error: ModuleNotFoundError: No module named 'onnxruntime_genai.models'

So I installed onnxruntime-genai without dependencies, as below:

pip install onnxruntime-genai --no-deps

Even after executing this command, I could still see 'DmlExecutionProvider' in the providers list and the device as 'CPU-DML'.

I again tried to execute the command below for automatic optimization of the model with Olive:

olive auto-opt --model_name_or_path meta-llama/Llama-3.2-1B-Instruct --trust_remote_code --output_path models/Llama-3.2-1B-Instruct --device gpu --provider DmlExecutionProvider --use_ort_genai --precision int4 --log_level 1

I used gpu as the device and DmlExecutionProvider as the provider.

This time, the model was successfully downloaded and saved to the given output path.

When I try to run the model using the code below:

import onnxruntime_genai as og

model_folder = "models/Llama-3.2-1B-Instruct/model"

model = og.Model(model_folder)
tokenizer = og.Tokenizer(model)
tokenizer_stream = tokenizer.create_stream()

search_options = {}
search_options['max_length'] = 4096
search_options['past_present_share_buffer'] = False

chat_template = """<|begin_of_text|><|start_header_id|>system<|end_header_id|>

You are a helpful assistant<|eot_id|><|start_header_id|>user<|end_header_id|>

{input}<|eot_id|><|start_header_id|>assistant<|end_header_id|>
"""

text = input("Input: ")

while text != "exit":
    if not text:
        print("Error, input cannot be empty")
        exit()  # stop on empty input

    # generate prompt (prompt template + input)
    prompt = f'{chat_template.format(input=text)}'

    # encode the prompt using the tokenizer
    input_tokens = tokenizer.encode(prompt)

    params = og.GeneratorParams(model)
    params.set_search_options(**search_options)
    
    generator = og.Generator(model, params)
    generator.append_tokens(input_tokens)

    print("Output: ", end='', flush=True)
    # stream the output
    try:
        while not generator.is_done():
            generator.generate_next_token()

            new_token = generator.get_next_tokens()[0]
            print(tokenizer_stream.decode(new_token), end='', flush=True)
    except KeyboardInterrupt:
        print("  --control+c pressed, aborting generation--")

    print()
    text = input("Input: ")

I am getting the error:

model = og.Model(model_folder)
        ^^^^^^^^^^^^^^^^^^^^^^

RuntimeError: Unknown provider type: dml
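
In case it helps with diagnosing this, my understanding is that the provider the generated model asks for is recorded in the genai_config.json inside the output folder; a minimal sketch to print it (the path matches the output_path I used above, adjust if the layout differs):

import json
from pathlib import Path

# Folder produced by the olive auto-opt run above; adjust if your layout differs.
config_path = Path("models/Llama-3.2-1B-Instruct/model/genai_config.json")

with config_path.open() as f:
    config = json.load(f)

# The provider is usually listed under the decoder's session options,
# but printing the whole file is the simplest way to inspect it.
print(json.dumps(config, indent=2))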

Is the way I am passing gpu as the device and DmlExecutionProvider as the provider for automatic optimization with Olive correct?

Please help me resolve this issue. I am very interested in utilizing the DirectML capabilities of my device.

@jambayk
Contributor

jambayk commented Feb 20, 2025

I think the package installations might have gotten mixed up again. Could you check pip list to see which onnxruntime and onnxruntime-genai packages are installed?
If there is more than one of each, please uninstall all of them and then reinstall onnxruntime-directml and onnxruntime-genai-directml.

@devang-ml added the DirectML label Mar 24, 2025