
feat: Add Run With ollama API On M1 Mac (run_for_ollama_api_in_M1_mac.sh) #322

Merged
6 changes: 6 additions & 0 deletions README.md
@@ -61,6 +61,12 @@ bash scripts/run_for_openai_api_with_gpu_in_Linux_or_WSL.sh
bash scripts/run_for_openai_api_in_M1_mac.sh
```

## Run With ollama API On M1 Mac

```bash
bash scripts/run_for_ollama_api_in_M1_mac.sh
```

## Run With 3B LLM (MiniChat-2-3B-INT8-GGUF) On M1 Mac
```bash
bash scripts/run_for_3B_in_M1_mac.sh
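
The new README section assumes a local Ollama server that already serves the `qwen:32b` model. A hypothetical pre-flight sketch (the `ollama` CLI commands and port are the standard defaults; whether the project expects you to pull the model beforehand is an assumption, so the network steps are left commented out):

```shell
#!/bin/sh
# Hypothetical pre-flight for scripts/run_for_ollama_api_in_M1_mac.sh.
MODEL="qwen:32b"                      # the tag the script passes via -n
BASE_URL="http://localhost:11434/v1"  # the base URL the script passes via -b
echo "Checking $BASE_URL for $MODEL before launching QAnything"
# ollama pull "$MODEL"                               # fetch the model if missing
# curl -s "$BASE_URL/models" | grep -q "$MODEL" \
#   || echo "model not loaded"                       # verify the endpoint responds
# bash scripts/run_for_ollama_api_in_M1_mac.sh       # then launch QAnything
```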
6 changes: 6 additions & 0 deletions README_zh.md
@@ -58,6 +58,12 @@ bash scripts/run_for_openai_api_with_gpu_in_Linux_or_WSL.sh
bash scripts/run_for_openai_api_in_M1_mac.sh
```

## Use the Ollama API on an M1 Mac

```bash
bash scripts/run_for_ollama_api_in_M1_mac.sh
```

## Use a 3B LLM (MiniChat-2-3B-INT8-GGUF) on an M1 Mac

```bash
6 changes: 5 additions & 1 deletion qanything_kernel/connector/llm/llm_for_openai_api.py
@@ -83,6 +83,7 @@ def num_tokens_from_messages(self, messages, model=None):
"gpt-4-32k-0613",
"gpt-4-32k",
# "gpt-4-1106-preview",
"qwen:32b",
}:
tokens_per_message = 3
tokens_per_name = 1
@@ -97,7 +98,10 @@ def num_tokens_from_messages(self, messages, model=None):
# gpt-4 may be updated over time; return the token count assuming gpt-4-0613 and log a warning
debug_logger.info("Warning: gpt-4 may update over time. Returning num tokens assuming gpt-4-0613.")
return self.num_tokens_from_messages(messages, model="gpt-4-0613")

elif "qwen:32b" in model:
# qwen may be updated over time; return the token count assuming qwen:32b and log a warning
debug_logger.info("Warning: qwen may update over time. Returning num tokens assuming qwen:32b.")
return self.num_tokens_from_messages(messages, model="qwen:32b")
else:
# 对于没有实现的模型,抛出未实现错误
raise NotImplementedError(
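
The diff above routes `qwen:32b` through the same message-token accounting the class uses for the OpenAI chat models: 3 tokens of overhead per message, 1 extra per `name` field, plus 3 for the assistant reply priming. A minimal standalone sketch of that accounting, with a whitespace tokenizer standing in for the real encoding (`tiktoken` has no `qwen:32b` model, so the counts here are illustrative, not exact):

```python
def approx_tokens_from_messages(messages, encode=lambda s: s.split()):
    """Approximate a chat token count using the gpt-4-0613 accounting rules.

    `encode` is a stand-in tokenizer (whitespace split by default); the real
    implementation would use a model-specific encoding.
    """
    tokens_per_message = 3  # per-message framing overhead
    tokens_per_name = 1     # an explicit "name" field costs one extra token
    num_tokens = 0
    for message in messages:
        num_tokens += tokens_per_message
        for key, value in message.items():
            num_tokens += len(encode(value))
            if key == "name":
                num_tokens += tokens_per_name
    num_tokens += 3         # every reply is primed with an assistant turn
    return num_tokens

msgs = [
    {"role": "system", "content": "You are helpful."},
    {"role": "user", "content": "Hello there"},
]
print(approx_tokens_from_messages(msgs))  # → 16
```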
2 changes: 2 additions & 0 deletions scripts/run_for_ollama_api_in_M1_mac.sh
@@ -0,0 +1,2 @@
#!/bin/bash
bash scripts/base_run.sh -s "M1mac" -w 4 -m 19530 -q 8777 -o -b 'http://localhost:11434/v1' -k 'ollama' -n 'qwen:32b' -M '32B' -l '4096'
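
The script points QAnything at Ollama's OpenAI-compatible endpoint (`-b 'http://localhost:11434/v1'`) with the placeholder API key `ollama` and the model tag `qwen:32b`. A standard-library sketch of the kind of chat-completion request that endpoint accepts (sending is left commented out, since it assumes `ollama serve` is running locally):

```python
import json
import urllib.request

OLLAMA_BASE = "http://localhost:11434/v1"  # same base URL the script passes via -b

def build_chat_request(prompt, model="qwen:32b", max_tokens=4096):
    """Build an OpenAI-style /chat/completions POST for a local Ollama server."""
    payload = {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
        "max_tokens": max_tokens,
    }
    return urllib.request.Request(
        f"{OLLAMA_BASE}/chat/completions",
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            # Ollama ignores the key's value, but OpenAI-style clients must send one
            "Authorization": "Bearer ollama",
        },
    )

req = build_chat_request("Hello")
# To actually send (requires a running Ollama server):
#   with urllib.request.urlopen(req) as resp:
#       print(json.load(resp)["choices"][0]["message"]["content"])
```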