diff --git a/doc/RunGPTOSSWebGPU.md b/doc/RunGPTOSSWebGPU.md
deleted file mode 100644
index e51ba8a6d2..0000000000
--- a/doc/RunGPTOSSWebGPU.md
+++ /dev/null
@@ -1,46 +0,0 @@
-# Run GPT OSS 20B with Web GPU (preview)
-
-These instructions allow you to experiment with GPT OSS 20B on any GPU.
-
-Note: once the model is officially supported by Foundry Local that will be the easiest way to run it.
-
-## Model
-
-Download the model
-
-```bash
-https://huggingface.co/onnx-community/gpt-oss-20b-ONNX
-```
-
-Copy the files in the onnx folder into the parent folder
-
-Change the name of model_q4f16.onnx into model.onnx (as referenced in genai_config.json)
-
-## Runtime
-
-```bash
-pip install onnxruntime-webgpu
-pip install onnxruntime-genai --no-deps
-```
-
-## Example script
-
-```bash
-curl -O https://raw.githubusercontent.com/microsoft/onnxruntime-genai/refs/heads/main/examples/python/model-qa.py
-```
-
-## Run the model
-
-```bash
-python model-qa.py -m gpt-oss-20b-ONNX
-```
-
-Note that the reasoning tokens are not processed by this script and appear in the output.
-
-```bash
-Prompt (Use quit() to exit): Hello
-
-Output: <|channel|>analysis<|message|>We need to respond to "Hello". Simple greeting. Probably a short hello back and ask how can help.<|end|><|start|>assistant<|channel|>final<|message|>Hello! 👋 How can I help you today?
-
-Prompt (Use quit() to exit):
-```
diff --git a/doc/DownloadModels.md b/documents/DownloadModels.md
similarity index 100%
rename from doc/DownloadModels.md
rename to documents/DownloadModels.md
diff --git a/doc/Runtime_option.md b/documents/Runtime_option.md
similarity index 100%
rename from doc/Runtime_option.md
rename to documents/Runtime_option.md