3. Model configuration

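Coze Studio is an AI application development platform built on large language models. Before you deploy and run the open-source edition of Coze Studio for the first time, you need to configure the required models in your locally cloned project. While the project is running, you can also add new model services or remove ones you no longer need at any time.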

Model list

Coze Studio supports the following model services:

Volcengine Ark | BytePlus ModelArk
OpenAI
DeepSeek
Claude
Ollama
Qwen
Gemini

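Model configuration overview

In the open-source edition of Coze Studio, all model configurations live in the backend/conf/model directory. The directory contains multiple YAML files, and each file corresponds to one accessible model. To help developers get set up quickly, Coze Studio provides template files in the backend/conf/model/template directory that cover common model types, such as Volcengine Ark and OpenAI. Find the template for your model provider, copy it to the backend/conf/model directory, and set each parameter according to the comments in the template.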

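Notes

Before you fill in a model configuration file, make sure you understand the following:

- Keep the id in each model file unique, and do not change it after the configuration goes live. The id identifies the model configuration as it moves through the system, and agents and workflows call models by this id. Changing the id of a live model may break existing agents or cause model calls to fail.
- Before you delete a model, make sure it no longer receives any online traffic.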
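
One possible way to stage the removal of a model is sketched below. It assumes that meta.status behaves as described in the field reference on this page, where 5 means the model can still be used but can no longer be newly selected. The id and name are taken from the Ark example and stand in for your own values; whether this fits your rollout process is an assumption, not a documented procedure.

```yaml
# Fragment of an existing file under backend/conf/model.
# Fields that are not shown stay exactly as they were.
id: 2002          # keep the original id; agents and workflows reference the model by it
name: Doubao Model
meta:
    status: 5     # 5 = pending retirement: still usable, but cannot be newly selected
```

Once nothing calls the model any more, the file can be deleted and the service restarted.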

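Configure models for Coze Studio

Before you deploy and start the open-source edition of Coze Studio for the first time, you need to configure a model service in the Coze Studio project. Otherwise, you cannot select a model when creating an agent or a workflow. If this is your first deployment and you are setting up the Ark model service, you can follow the Quick Start guide directly. To add more models or switch to another model service, follow the steps below.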

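Step 1: Edit the model configuration file

Copy the template file.
1. In the backend/conf/model/template directory, find the template YAML file for the model you want to add. For example, the template for OpenAI models is model_template_openai.yaml.
2. Copy the template file to the backend/conf/model directory.
Edit the model configuration.
1. Go to the backend/conf/model directory and open the file you copied in the previous step.
2. Set the id, meta.conn_config.api_key, and meta.conn_config.model fields, then save the file.
  - id: The model ID in Coze Studio. You define it yourself; it must be a non-zero integer and globally unique. Agents and workflows call models by model ID, so do not change the ID of a model that is already live, or model calls may fail.
  - meta.conn_config.api_key: The API key of the model service. For Volcengine Ark, this is the Ark API key; see "Get a Volcengine Ark API Key" or "Get a BytePlus ModelArk API Key".
  - meta.conn_config.model: The model name of the model service. For Volcengine Ark, this is the Ark Model ID or Endpoint ID; see "Get a Volcengine Ark Model ID" / "Get a Volcengine Ark Endpoint ID", or "Get a BytePlus ModelArk Model ID" / "Get a BytePlus ModelArk Endpoint ID".
  Users in China can use Volcengine Ark; users outside China can use BytePlus ModelArk.
  
  The other parameters can keep their default values, or you can change them as needed. For Ark models, for example, see the Volcengine Ark model list.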
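
As a rough sketch of the edit in step 1, the fragment below shows only the three fields that change in a copy of model_template_ark.yaml. The file name placeholder, the API key, and the model value are invented for illustration and are not real credentials or identifiers.

```yaml
# Fragment of backend/conf/model/<your_file>.yaml (hypothetical file name).
# Fields that are not shown keep the values from the template.
id: 2002                                     # non-zero integer, unique across all model files
meta:
    protocol: ark
    conn_config:
        api_key: "YOUR_ARK_API_KEY"          # placeholder, replace with your own key
        model: "YOUR_MODEL_OR_ENDPOINT_ID"   # placeholder for an Ark Model ID or Endpoint ID
```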

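Step 2: Restart the service

After you edit the configuration file, run the following command to restart the service so that the configuration takes effect.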

docker compose --profile "*" restart coze-server

After the service starts successfully, open the agent editing page and select the configured model from the model drop-down list.

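Configuration reference

Coze Studio supports a range of common model services and model protocols. Choose the template file, protocol, and base_url according to the type of model service. For third-party services such as Volcengine Ark, see the "Third-party model services" section; for first-party services such as OpenAI, see the "Official model services" section. All template files are located in backend/conf/model/template. Copy the one you need to backend/conf/model, edit it, and restart the service for the change to take effect. The base_url values also apply to the Embedding configuration under Basic component configuration.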

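Third-party model services

| Platform | Base template file | protocol | base_url | Notes |
| --- | --- | --- | --- | --- |
| Volcengine Ark | model_template_ark.yaml | ark | China (Volcengine): https://ark.cn-beijing.volces.com/api/v3/ International (BytePlus): https://ark.ap-southeast.bytepluses.com/api/v3/ | None |
| Alibaba Cloud Bailian | model_template_openai.yaml or model_template_qwen.yaml | openai or qwen | https://dashscope.aliyuncs.com/compatible-mode/v1 | The qwen3 series does not support thinking in non-streaming calls. To use these models, set enable_thinking: false in conn_config. A later version of Coze Studio will adapt to this capability. |
| SiliconFlow | model_template_openai.yaml | openai | https://api.siliconflow.cn/v1 | None |
| Other third-party API relays | model_template_openai.yaml | openai | The address given in the provider's API documentation. The path usually ends with /v1 and does not include the /chat/completions suffix. | If the platform only relays or proxies model services, configure protocol for non-OpenAI models as described in the "Official model services" section. |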
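
To illustrate the Alibaba Cloud Bailian row, here is a minimal sketch of the connection settings when the qwen protocol is used. The API key is a placeholder and the model name is an assumption made for illustration; use the model name shown in your own console.

```yaml
# Fragment of a copy of model_template_qwen.yaml; fields not shown keep the template values.
meta:
    protocol: qwen
    conn_config:
        base_url: "https://dashscope.aliyuncs.com/compatible-mode/v1"
        api_key: "YOUR_API_KEY"    # placeholder
        model: "qwen3-32b"         # assumed model name
        enable_thinking: false     # per the table note for the qwen3 series
```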

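Open-source frameworks

| Framework | Base template file | protocol | base_url | Notes |
| --- | --- | --- | --- | --- |
| ollama | model_template_ollama.yaml | ollama | http://${ip}:11434 | 1. The container network mode is bridge, so localhost inside the coze-server container is not the host's localhost. Change it to the IP of the machine that runs Ollama, or use `http://host.docker.internal:11434`. 2. Check api_key: if no API key is set, leave this parameter empty. 3. Make sure the firewall on the Ollama host allows port 11434. 4. Make sure Ollama is configured to accept connections from outside the host. |
| vllm | model_template_openai.yaml | openai | http://${ip}:8000/v1 (the port is specified at startup) | None |
| xinference | model_template_openai.yaml | openai | http://${ip}:9997/v1 (the port is specified at startup) | None |
| sglang | model_template_openai.yaml | openai | http://${ip}:35140/v1 (the port is specified at startup) | None |
| LMStudio | model_template_openai.yaml | openai | http://${ip}:${port}/v1 | None |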
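
The Ollama notes in the table can be sketched as the following connection settings. The model tag is an assumption chosen for illustration; use a model that you have actually pulled on the Ollama host.

```yaml
# Fragment of a copy of model_template_ollama.yaml; fields not shown keep the template values.
meta:
    protocol: ollama
    conn_config:
        # localhost would point at the coze-server container itself, so use the
        # Docker host alias or the IP of the machine that runs Ollama.
        base_url: "http://host.docker.internal:11434"
        api_key: ""        # left empty because no API key is configured
        model: "gemma3"    # assumed model tag
```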

Official model services

| Model | Base template file | protocol | base_url | Notes |
| --- | --- | --- | --- | --- |
| Doubao | model_template_ark.yaml | ark | https://ark.cn-beijing.volces.com/api/v3/ | None |
| OpenAI | model_template_openai.yaml | openai | https://api.openai.com/v1 | Check the by_azure field. If the model is served by Microsoft Azure, set this parameter to true. |
| DeepSeek | model_template_deepseek.yaml | deepseek | https://api.deepseek.com/ | None |
| Qwen | model_template_qwen.yaml | qwen | https://dashscope.aliyuncs.com/compatible-mode/v1 | The qwen3 series does not support thinking in non-streaming calls. To use these models, set enable_thinking: false in conn_config. A later version of Coze Studio will adapt to this capability. |
| Gemini | model_template_gemini.yaml | gemini | https://generativelanguage.googleapis.com/ | None |
| Claude | model_template_claude.yaml | claude | https://api.anthropic.com/v1/ | None |
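
For the by_azure note in the OpenAI row, a configuration for an Azure-hosted model might look roughly like the sketch below. The endpoint, key, and model values are hypothetical placeholders, and treating the model field as the Azure deployment name is an assumption rather than something this page confirms. The api_version value is the example from the field reference.

```yaml
# Fragment of a copy of model_template_openai.yaml; fields not shown keep the template values.
meta:
    protocol: openai
    conn_config:
        base_url: "https://YOUR_RESOURCE.openai.azure.com"   # hypothetical Azure endpoint
        api_key: "YOUR_AZURE_OPENAI_KEY"                     # placeholder
        model: "YOUR_DEPLOYMENT_NAME"                        # assumption: the Azure deployment name
        openai:
            by_azure: true
            api_version: "2024-10-21"
```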

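Field reference

Model information

The model meta-information file describes the model's basic capabilities and its connection information.

The basic model meta template is backend/conf/model/template/model_template_basic.yaml.
The meta templates for individual models are the correspondingly named files under backend/conf/model/template.

The following table lists all model configuration fields and what each one means: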

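| Field | Required | Example | Description |
| --- | --- | --- | --- |
| id | Yes | 2002 | Model ID. A non-zero integer that is unique across all model files. |
| name | Yes | test_model | Model name displayed on the platform |
| icon_uri | No | default_icon/doubao_v2.png | URI of the model icon. The URI is a path relative to docker/volumes/minio. The URIs used in the template files are already uploaded by default and can be used as is. |
| icon_url | No | test_icon_url | URL of the model icon. Can be a static link. The URL takes precedence when the icon is displayed; the URI is used when the URL is empty. |
| description | No | - | Default model description |
| description.zh | No | 这是模型描述信息 | Chinese model description, shown on the platform |
| description.en | No | This is model description | English model description, shown on the platform |
| default_parameters | No | - | List of model parameters. For the overall shape of a list element, see the template files. |
| default_parameters.name | Yes | temperature | Parameter name. Allowed values: temperature, top_p, top_k, max_tokens, response_format, frequency_penalty, presence_penalty |
| default_parameters.label | Yes | - | Display name of the parameter on the platform |
| default_parameters.label.zh | Yes | 生成随机性 | Display name of the parameter (Chinese) |
| default_parameters.label.en | Yes | Temperature | Display name of the parameter (English) |
| default_parameters.desc | Yes | - | Display description of the parameter on the platform |
| default_parameters.desc.zh | Yes | temperature: 调高温度会使得模型的输出更多样性和创新性 | Display description of the parameter (Chinese) |
| default_parameters.desc.en | Yes | Temperature: When you increase this value, the model outputs more diverse and innovative content | Display description of the parameter (English) |
| default_parameters.type | Yes | int | Value type. Allowed values: int, float, boolean, string |
| default_parameters.min | No | '0' | Minimum value (numeric types only) |
| default_parameters.max | No | '1' | Maximum value (numeric types only) |
| default_parameters.default_val | Yes | - | Default values for the Precise / Balanced / Creative / Custom modes |
| default_parameters.default_val.default_val | Yes | '1.0' | Default value in Custom mode |
| default_parameters.default_val.creative | No | '1.0' | Default value in Creative mode |
| default_parameters.default_val.balance | No | '0.8' | Default value in Balanced mode |
| default_parameters.default_val.precise | No | '0.3' | Default value in Precise mode |
| default_parameters.precision | No | 2 | Precision (when type is float) |
| default_parameters.style | Yes | - | Display style |
| default_parameters.style.widget | Yes | slider | Widget type. Allowed values: slider, radio_buttons (as used in the configuration examples on this page) |
| default_parameters.style.label | Yes | - | Category |
| default_parameters.style.label.zh | Yes | 生成多样性 | Category label (Chinese) |
| default_parameters.style.label.en | Yes | Generation diversity | Category label (English) |
| meta | Yes | - | Model meta information |
| meta.name | Yes | test_model_name | Model name. Recorded only, not displayed. |
| meta.protocol | Yes | test_protocol | Model connection protocol |
| meta.capability | Yes | - | Basic model capabilities |
| meta.capability.function_call | No | true | Whether the model supports function calling |
| meta.capability.input_modal | No | ["text", "image", "audio", "video"] | Input modalities the model supports |
| meta.capability.input_tokens | No | 1024 | Input token limit |
| meta.capability.output_modal | No | ["text", "image", "audio", "video"] | Output modalities the model supports |
| meta.capability.output_tokens | No | 1024 | Output token limit |
| meta.capability.max_tokens | No | 2048 | Maximum number of tokens |
| meta.capability.json_mode | No | true | Whether JSON mode is supported |
| meta.capability.prefix_caching | No | false | Whether prefix caching is supported |
| meta.capability.reasoning | No | false | Whether reasoning is supported |
| meta.conn_config | Yes | - | Model connection parameters |
| meta.conn_config.base_url | Yes | https://localhost:1234/chat/completion | Base URL of the model service |
| meta.conn_config.api_key | Yes | qweasdzxc | API key |
| meta.conn_config.timeout | No | 100 | Timeout (nanoseconds) |
| meta.conn_config.model | Yes | model_name | Model name |
| meta.conn_config.temperature | No | 0.7 | Default temperature |
| meta.conn_config.frequency_penalty | No | 0 | Default frequency_penalty |
| meta.conn_config.presence_penalty | No | 0 | Default presence_penalty |
| meta.conn_config.max_tokens | No | 2048 | Default max_tokens |
| meta.conn_config.top_p | No | 0 | Default top_p |
| meta.conn_config.top_k | No | 0 | Default top_k |
| meta.conn_config.enable_thinking | No | false | Whether to enable the thinking process |
| meta.conn_config.stop | No | ["bye"] | List of stop words |
| meta.conn_config.openai | No | - | OpenAI-specific configuration |
| meta.conn_config.openai.by_azure | No | true | Whether to use Azure |
| meta.conn_config.openai.api_version | No | 2024-10-21 | API version |
| meta.conn_config.openai.response_format.type | No | text | Response format type |
| meta.conn_config.claude | No | - | Claude-specific configuration |
| meta.conn_config.claude.by_bedrock | No | true | Whether to use Bedrock |
| meta.conn_config.claude.access_key | No | bedrock_ak | Bedrock access key |
| meta.conn_config.claude.secret_access_key | No | bedrock_secret_ak | Bedrock secret access key |
| meta.conn_config.claude.session_token | No | bedrock_session_token | Bedrock session token |
| meta.conn_config.claude.region | No | bedrock_region | Bedrock region |
| meta.conn_config.ark | No | - | Ark-specific configuration |
| meta.conn_config.ark.region | No | region | Region |
| meta.conn_config.ark.access_key | No | ak | Access key |
| meta.conn_config.ark.secret_key | No | sk | Secret key |
| meta.conn_config.ark.retry_times | No | 123 | Number of retries |
| meta.conn_config.ark.custom_header | No | {"key": "val"} | Custom request headers |
| meta.conn_config.deepseek | No | - | DeepSeek-specific configuration |
| meta.conn_config.deepseek.response_format_type | No | text | Response format type |
| meta.conn_config.qwen | No | - | Qwen-specific configuration |
| meta.conn_config.qwen.response_format | No | - | Response format |
| meta.conn_config.gemini | No | - | Gemini-specific configuration |
| meta.conn_config.gemini.backend | No | 0 | Gemini backend. 0: default; 1: GeminiAPI; 2: VertexAI |
| meta.conn_config.gemini.project | No | test_project | GCP project ID for Vertex AI. Required when backend=2. |
| meta.conn_config.gemini.location | No | test_loc | GCP location/region for Vertex AI. Required when backend=2. |
| meta.conn_config.gemini.api_version | No | v1beta | API version |
| meta.conn_config.gemini.headers | No | - | HTTP headers (nested under gemini, as in the Gemini example on this page) |
| meta.conn_config.gemini.timeout_ms | No | - | HTTP timeout |
| meta.conn_config.gemini.include_thoughts | No | true | Whether the response includes thinking content |
| meta.conn_config.gemini.thinking_budget | No | 123 | Token budget for thinking |
| meta.status | No | 1 | Model status. 0: default when not configured, equivalent to 1. 1: active, can be used and newly selected. 5: pending retirement, can be used but not newly selected. 10: retired, can be neither used nor newly selected. |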
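
To show how the protocol-specific fields combine, here is a sketch of a Gemini configuration that selects the Vertex AI backend. The project ID and region are hypothetical examples; the table only states that both fields are required when backend is 2.

```yaml
# Fragment of a copy of model_template_gemini.yaml; fields not shown keep the template values.
meta:
    protocol: gemini
    conn_config:
        model: gemini-2.5-flash
        gemini:
            backend: 2                   # 2 = VertexAI
            project: "my-gcp-project"    # hypothetical GCP project ID
            location: "us-central1"      # example region, replace with your own
```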

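Configuration file examples

For a complete configuration example and field descriptions, see backend/conf/model/template/model_template_basic.yaml. You can also refer to the minimal configurations below. Most fields are configured the same way across models; the differences are mainly in protocol and conn_config.

Ark (Volcengine Ark)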

id: 2002
name: Doubao Model
icon_uri: default_icon/doubao_v2.png
icon_url: ""
description:
    zh: 豆包模型简介
    en: doubao model description
default_parameters:
    - name: temperature
      label:
        zh: 生成随机性
        en: Temperature
      desc:
        zh: '- **temperature**: 调高温度会使得模型的输出更多样性和创新性，反之，降低温度会使输出内容更加遵循指令要求但减少多样性。建议不要与“Top p”同时调整。'
        en: '**Temperature**:\n\n- When you increase this value, the model outputs more diverse and innovative content; when you decrease it, the model outputs less diverse content that strictly follows the given instructions.\n- It is recommended not to adjust this value with \"Top p\" at the same time.'
      type: float
      min: "0"
      max: "1"
      default_val:
        balance: "0.8"
        creative: "1"
        default_val: "1.0"
        precise: "0.3"
      precision: 1
      options: []
      style:
        widget: slider
        label:
            zh: 生成多样性
            en: Generation diversity
    - name: max_tokens
      label:
        zh: 最大回复长度
        en: Response max length
      desc:
        zh: 控制模型输出的Tokens 长度上限。通常 100 Tokens 约等于 150 个中文汉字。
        en: You can specify the maximum length of the tokens output through this value. Typically, 100 tokens are approximately equal to 150 Chinese characters.
      type: int
      min: "1"
      max: "4096"
      default_val:
        default_val: "4096"
      options: []
      style:
        widget: slider
        label:
            zh: 输入及输出设置
            en: Input and output settings
    - name: top_p
      label:
        zh: Top P
        en: Top P
      desc:
        zh: '- **Top p 为累计概率**: 模型在生成输出时会从概率最高的词汇开始选择，直到这些词汇的总概率累积达到Top p 值。这样可以限制模型只选择这些高概率的词汇，从而控制输出内容的多样性。建议不要与“生成随机性”同时调整。'
        en: '**Top P**:\n\n- An alternative to sampling with temperature, where only tokens within the top p probability mass are considered. For example, 0.1 means only the top 10% probability mass tokens are considered.\n- We recommend altering this or temperature, but not both.'
      type: float
      min: "0"
      max: "1"
      default_val:
        default_val: "0.7"
      precision: 2
      options: []
      style:
        widget: slider
        label:
            zh: 生成多样性
            en: Generation diversity
    - name: response_format
      label:
        zh: 输出格式
        en: Response format
      desc:
        zh: '- **文本**: 使用普通文本格式回复\n- **Markdown**: 将引导模型使用Markdown格式输出回复\n- **JSON**: 将引导模型使用JSON格式输出'
        en: '**Response Format**:\n\n- **Text**: Replies in plain text format\n- **Markdown**: Uses Markdown format for replies\n- **JSON**: Uses JSON format for replies'
      type: int
      min: ""
      max: ""
      default_val:
        default_val: "0"
      options:
        - label: Text
          value: "0"
        - label: Markdown
          value: "1"
        - label: JSON
          value: "2"
      style:
        widget: radio_buttons
        label:
            zh: 输入及输出设置
            en: Input and output settings
meta:
    name: Doubao
    protocol: ark
    capability:
        function_call: true
        input_modal:
            - text
            - image
        input_tokens: 128000
        json_mode: false
        max_tokens: 128000
        output_modal:
            - text
        output_tokens: 16384
        prefix_caching: false
        reasoning: false
        prefill_response: false
    conn_config:
        base_url: ""
        api_key: ""
        timeout: 0s
        model: ""
        temperature: 0.1
        frequency_penalty: 0
        presence_penalty: 0
        max_tokens: 4096
        top_p: 0.7
        top_k: 0
        stop: []
        openai: null
        claude: null
        ark:
            region: ""
            access_key: ""
            secret_key: ""
            retry_times: null
            custom_header: {}
        deepseek: null
        qwen: null
        gemini: null
        custom: {}
    status: 0

Claude

id: 2006
name: Claude-3.5-Sonnet
icon_uri: default_icon/claude_v2.png
icon_url: ""
description:
    zh: claude 模型简介
    en: claude model description
default_parameters:
    - name: temperature
      label:
        zh: 生成随机性
        en: Temperature
      desc:
        zh: '- **temperature**: 调高温度会使得模型的输出更多样性和创新性，反之，降低温度会使输出内容更加遵循指令要求但减少多样性。建议不要与“Top p”同时调整。'
        en: '**Temperature**:\n\n- When you increase this value, the model outputs more diverse and innovative content; when you decrease it, the model outputs less diverse content that strictly follows the given instructions.\n- It is recommended not to adjust this value with \"Top p\" at the same time.'
      type: float
      min: "0"
      max: "1"
      default_val:
        balance: "0.8"
        creative: "1"
        default_val: "1.0"
        precise: "0.3"
      precision: 1
      options: []
      style:
        widget: slider
        label:
            zh: 生成多样性
            en: Generation diversity
    - name: max_tokens
      label:
        zh: 最大回复长度
        en: Response max length
      desc:
        zh: 控制模型输出的Tokens 长度上限。通常 100 Tokens 约等于 150 个中文汉字。
        en: You can specify the maximum length of the tokens output through this value. Typically, 100 tokens are approximately equal to 150 Chinese characters.
      type: int
      min: "1"
      max: "4096"
      default_val:
        default_val: "4096"
      options: []
      style:
        widget: slider
        label:
            zh: 输入及输出设置
            en: Input and output settings
meta:
    name: Claude-3.5-Sonnet
    protocol: claude
    capability:
        function_call: true
        input_modal:
            - text
            - image
        input_tokens: 128000
        json_mode: false
        max_tokens: 128000
        output_modal:
            - text
        output_tokens: 16384
        prefix_caching: false
        reasoning: false
        prefill_response: false
    conn_config:
        base_url: ""
        api_key: ""
        timeout: 0s
        model: ""
        temperature: 0.7
        frequency_penalty: 0
        presence_penalty: 0
        max_tokens: 4096
        top_p: 1
        top_k: 0
        stop: []
        openai: null
        claude:
            by_bedrock: false
            access_key: ""
            secret_access_key: ""
            session_token: ""
            region: ""
        ark: null
        deepseek: null
        qwen: null
        gemini: null
        custom: {}
    status: 0

DeepSeek

id: 2004
name: DeepSeek-V3
icon_uri: default_icon/deepseek_v2.png
icon_url: ""
description:
    zh: deepseek 模型简介
    en: deepseek model description
default_parameters:
    - name: temperature
      label:
        zh: 生成随机性
        en: Temperature
      desc:
        zh: '- **temperature**: 调高温度会使得模型的输出更多样性和创新性，反之，降低温度会使输出内容更加遵循指令要求但减少多样性。建议不要与“Top p”同时调整。'
        en: '**Temperature**:\n\n- When you increase this value, the model outputs more diverse and innovative content; when you decrease it, the model outputs less diverse content that strictly follows the given instructions.\n- It is recommended not to adjust this value with \"Top p\" at the same time.'
      type: float
      min: "0"
      max: "1"
      default_val:
        balance: "0.8"
        creative: "1"
        default_val: "1.0"
        precise: "0.3"
      precision: 1
      options: []
      style:
        widget: slider
        label:
            zh: 生成多样性
            en: Generation diversity
    - name: max_tokens
      label:
        zh: 最大回复长度
        en: Response max length
      desc:
        zh: 控制模型输出的Tokens 长度上限。通常 100 Tokens 约等于 150 个中文汉字。
        en: You can specify the maximum length of the tokens output through this value. Typically, 100 tokens are approximately equal to 150 Chinese characters.
      type: int
      min: "1"
      max: "4096"
      default_val:
        default_val: "4096"
      options: []
      style:
        widget: slider
        label:
            zh: 输入及输出设置
            en: Input and output settings
    - name: response_format
      label:
        zh: 输出格式
        en: Response format
      desc:
        zh: '- **文本**: 使用普通文本格式回复\n- **JSON**: 将引导模型使用JSON格式输出'
        en: '**Response Format**:\n\n- **Text**: Replies in plain text format\n- **Markdown**: Uses Markdown format for replies\n- **JSON**: Uses JSON format for replies'
      type: int
      min: ""
      max: ""
      default_val:
        default_val: "0"
      options:
        - label: Text
          value: "0"
        - label: JSON Object
          value: "1"
      style:
        widget: radio_buttons
        label:
            zh: 输入及输出设置
            en: Input and output settings
meta:
    name: DeepSeek-V3
    protocol: deepseek
    capability:
        function_call: false
        input_modal:
            - text
        input_tokens: 128000
        json_mode: false
        max_tokens: 128000
        output_modal:
            - text
        output_tokens: 16384
        prefix_caching: false
        reasoning: false
        prefill_response: false
    conn_config:
        base_url: ""
        api_key: ""
        timeout: 0s
        model: ""
        temperature: 0.7
        frequency_penalty: 0
        presence_penalty: 0
        max_tokens: 4096
        top_p: 1
        top_k: 0
        stop: []
        openai: null
        claude: null
        ark: null
        deepseek:
            response_format_type: text
        qwen: null
        gemini: null
        custom: {}
    status: 0

Ollama

id: 2003
name: Gemma-3
icon_uri: default_icon/ollama.png
icon_url: ""
description:
    zh: ollama 模型简介
    en: ollama model description
default_parameters:
    - name: temperature
      label:
        zh: 生成随机性
        en: Temperature
      desc:
        zh: '- **temperature**: 调高温度会使得模型的输出更多样性和创新性，反之，降低温度会使输出内容更加遵循指令要求但减少多样性。建议不要与“Top p”同时调整。'
        en: '**Temperature**:\n\n- When you increase this value, the model outputs more diverse and innovative content; when you decrease it, the model outputs less diverse content that strictly follows the given instructions.\n- It is recommended not to adjust this value with \"Top p\" at the same time.'
      type: float
      min: "0"
      max: "1"
      default_val:
        balance: "0.8"
        creative: "1"
        default_val: "1.0"
        precise: "0.3"
      precision: 1
      options: []
      style:
        widget: slider
        label:
            zh: 生成多样性
            en: Generation diversity
    - name: max_tokens
      label:
        zh: 最大回复长度
        en: Response max length
      desc:
        zh: 控制模型输出的Tokens 长度上限。通常 100 Tokens 约等于 150 个中文汉字。
        en: You can specify the maximum length of the tokens output through this value. Typically, 100 tokens are approximately equal to 150 Chinese characters.
      type: int
      min: "1"
      max: "4096"
      default_val:
        default_val: "4096"
      options: []
      style:
        widget: slider
        label:
            zh: 输入及输出设置
            en: Input and output settings
meta:
    name: Gemma-3
    protocol: ollama
    capability:
        function_call: true
        input_modal:
            - text
        input_tokens: 128000
        json_mode: false
        max_tokens: 128000
        output_modal:
            - text
        output_tokens: 16384
        prefix_caching: false
        reasoning: false
        prefill_response: false
    conn_config:
        base_url: ""
        api_key: ""
        timeout: 0s
        model: ""
        temperature: 0.6
        frequency_penalty: 0
        presence_penalty: 0
        max_tokens: 4096
        top_p: 0.95
        top_k: 20
        stop: []
        openai: null
        claude: null
        ark: null
        deepseek: null
        qwen: null
        gemini: null
        custom: {}
    status: 0

OpenAI

id: 2001
name: GPT-4o
icon_uri: default_icon/openai_v2.png
icon_url: ""
description:
    zh: gpt 模型简介
    en: Multi-modal, 320ms, 88.7% MMLU, excels in education, customer support, health, and entertainment.
default_parameters:
    - name: temperature
      label:
        zh: 生成随机性
        en: Temperature
      desc:
        zh: '- **temperature**: 调高温度会使得模型的输出更多样性和创新性，反之，降低温度会使输出内容更加遵循指令要求但减少多样性。建议不要与“Top p”同时调整。'
        en: '**Temperature**:\n\n- When you increase this value, the model outputs more diverse and innovative content; when you decrease it, the model outputs less diverse content that strictly follows the given instructions.\n- It is recommended not to adjust this value with \"Top p\" at the same time.'
      type: float
      min: "0"
      max: "1"
      default_val:
        balance: "0.8"
        creative: "1"
        default_val: "1.0"
        precise: "0.3"
      precision: 1
      options: []
      style:
        widget: slider
        label:
            zh: 生成多样性
            en: Generation diversity
    - name: max_tokens
      label:
        zh: 最大回复长度
        en: Response max length
      desc:
        zh: 控制模型输出的Tokens 长度上限。通常 100 Tokens 约等于 150 个中文汉字。
        en: You can specify the maximum length of the tokens output through this value. Typically, 100 tokens are approximately equal to 150 Chinese characters.
      type: int
      min: "1"
      max: "4096"
      default_val:
        default_val: "4096"
      options: []
      style:
        widget: slider
        label:
            zh: 输入及输出设置
            en: Input and output settings
    - name: top_p
      label:
        zh: Top P
        en: Top P
      desc:
        zh: '- **Top p 为累计概率**: 模型在生成输出时会从概率最高的词汇开始选择，直到这些词汇的总概率累积达到Top p 值。这样可以限制模型只选择这些高概率的词汇，从而控制输出内容的多样性。建议不要与“生成随机性”同时调整。'
        en: '**Top P**:\n\n- An alternative to sampling with temperature, where only tokens within the top p probability mass are considered. For example, 0.1 means only the top 10% probability mass tokens are considered.\n- We recommend altering this or temperature, but not both.'
      type: float
      min: "0"
      max: "1"
      default_val:
        default_val: "0.7"
      precision: 2
      options: []
      style:
        widget: slider
        label:
            zh: 生成多样性
            en: Generation diversity
    - name: frequency_penalty
      label:
        zh: 重复语句惩罚
        en: Frequency penalty
      desc:
        zh: '- **frequency penalty**: 当该值为正时，会阻止模型频繁使用相同的词汇和短语，从而增加输出内容的多样性。'
        en: '**Frequency Penalty**: When positive, it discourages the model from repeating the same words and phrases, thereby increasing the diversity of the output.'
      type: float
      min: "-2"
      max: "2"
      default_val:
        default_val: "0"
      precision: 2
      options: []
      style:
        widget: slider
        label:
            zh: 生成多样性
            en: Generation diversity
    - name: presence_penalty
      label:
        zh: 重复主题惩罚
        en: Presence penalty
      desc:
        zh: '- **presence penalty**: 当该值为正时，会阻止模型频繁讨论相同的主题，从而增加输出内容的多样性'
        en: '**Presence Penalty**: When positive, it prevents the model from discussing the same topics repeatedly, thereby increasing the diversity of the output.'
      type: float
      min: "-2"
      max: "2"
      default_val:
        default_val: "0"
      precision: 2
      options: []
      style:
        widget: slider
        label:
            zh: 生成多样性
            en: Generation diversity
    - name: response_format
      label:
        zh: 输出格式
        en: Response format
      desc:
        zh: '- **文本**: 使用普通文本格式回复\n- **Markdown**: 将引导模型使用Markdown格式输出回复\n- **JSON**: 将引导模型使用JSON格式输出'
        en: '**Response Format**:\n\n- **Text**: Replies in plain text format\n- **Markdown**: Uses Markdown format for replies\n- **JSON**: Uses JSON format for replies'
      type: int
      min: ""
      max: ""
      default_val:
        default_val: "0"
      options:
        - label: Text
          value: "0"
        - label: Markdown
          value: "1"
        - label: JSON
          value: "2"
      style:
        widget: radio_buttons
        label:
            zh: 输入及输出设置
            en: Input and output settings
meta:
    name: GPT-4o
    protocol: openai
    capability:
        function_call: true
        input_modal:
            - text
            - image
        input_tokens: 128000
        json_mode: false
        max_tokens: 128000
        output_modal:
            - text
        output_tokens: 16384
        prefix_caching: false
        reasoning: false
        prefill_response: false
    conn_config:
        base_url: ""
        api_key: ""
        timeout: 0s
        model: ""
        temperature: 0.7
        frequency_penalty: 0
        presence_penalty: 0
        max_tokens: 4096
        top_p: 1
        top_k: 0
        stop: []
        openai:
            by_azure: true
            api_version: ""
            response_format:
                type: text
                jsonschema: null
        claude: null
        ark: null
        deepseek: null
        qwen: null
        gemini: null
        custom: {}
    status: 0

Qwen

id: 2005
name: Qwen3-32B
icon_uri: default_icon/qwen_v2.png
icon_url: ""
description:
    zh: 通义千问模型
    en: qwen model description
default_parameters:
    - name: temperature
      label:
        zh: 生成随机性
        en: Temperature
      desc:
        zh: '- **temperature**: 调高温度会使得模型的输出更多样性和创新性，反之，降低温度会使输出内容更加遵循指令要求但减少多样性。建议不要与“Top p”同时调整。'
        en: '**Temperature**:\n\n- When you increase this value, the model outputs more diverse and innovative content; when you decrease it, the model outputs less diverse content that strictly follows the given instructions.\n- It is recommended not to adjust this value with \"Top p\" at the same time.'
      type: float
      min: "0"
      max: "1"
      default_val:
        balance: "0.8"
        creative: "1"
        default_val: "1.0"
        precise: "0.3"
      precision: 1
      options: []
      style:
        widget: slider
        label:
            zh: 生成多样性
            en: Generation diversity
    - name: max_tokens
      label:
        zh: 最大回复长度
        en: Response max length
      desc:
        zh: 控制模型输出的Tokens 长度上限。通常 100 Tokens 约等于 150 个中文汉字。
        en: You can specify the maximum length of the tokens output through this value. Typically, 100 tokens are approximately equal to 150 Chinese characters.
      type: int
      min: "1"
      max: "4096"
      default_val:
        default_val: "4096"
      options: []
      style:
        widget: slider
        label:
            zh: 输入及输出设置
            en: Input and output settings
    - name: top_p
      label:
        zh: Top P
        en: Top P
      desc:
        zh: '- **Top p 为累计概率**: 模型在生成输出时会从概率最高的词汇开始选择，直到这些词汇的总概率累积达到Top p 值。这样可以限制模型只选择这些高概率的词汇，从而控制输出内容的多样性。建议不要与“生成随机性”同时调整。'
        en: '**Top P**:\n\n- An alternative to sampling with temperature, where only tokens within the top p probability mass are considered. For example, 0.1 means only the top 10% probability mass tokens are considered.\n- We recommend altering this or temperature, but not both.'
      type: float
      min: "0"
      max: "1"
      default_val:
        default_val: "0.95"
      precision: 2
      options: []
      style:
        widget: slider
        label:
            zh: 生成多样性
            en: Generation diversity
meta:
    name: Qwen3-32B
    protocol: qwen
    capability:
        function_call: true
        input_modal:
            - text
        input_tokens: 128000
        json_mode: false
        max_tokens: 128000
        output_modal:
            - text
        output_tokens: 16384
        prefix_caching: false
        reasoning: false
        prefill_response: false
    conn_config:
        base_url: ""
        api_key: ""
        timeout: 0s
        model: ""
        temperature: 0.7
        frequency_penalty: 0
        presence_penalty: 0
        max_tokens: 4096
        top_p: 1
        top_k: 0
        stop: []
        openai: null
        claude: null
        ark: null
        deepseek: null
        qwen:
            response_format:
                type: text
                jsonschema: null
        gemini: null
        custom: {}
    status: 0

Gemini

id: 2007
name: Gemini-2.5-Flash
icon_uri: default_icon/gemini_v2.png
icon_url: ""
description:
    zh: gemini 模型简介
    en: gemini model description
default_parameters:
    - name: temperature
      label:
        zh: 生成随机性
        en: Temperature
      desc:
        zh: '- **temperature**: 调高温度会使得模型的输出更多样性和创新性，反之，降低温度会使输出内容更加遵循指令要求但减少多样性。建议不要与“Top p”同时调整。'
        en: '**Temperature**:\n\n- When you increase this value, the model outputs more diverse and innovative content; when you decrease it, the model outputs less diverse content that strictly follows the given instructions.\n- It is recommended not to adjust this value with \"Top p\" at the same time.'
      type: float
      min: "0"
      max: "1"
      default_val:
        balance: "0.8"
        creative: "1"
        default_val: "1.0"
        precise: "0.3"
      precision: 1
      options: []
      style:
        widget: slider
        label:
            zh: 生成多样性
            en: Generation diversity
    - name: max_tokens
      label:
        zh: 最大回复长度
        en: Response max length
      desc:
        zh: 控制模型输出的Tokens 长度上限。通常 100 Tokens 约等于 150 个中文汉字。
        en: You can specify the maximum length of the tokens output through this value. Typically, 100 tokens are approximately equal to 150 Chinese characters.
      type: int
      min: "1"
      max: "4096"
      default_val:
        default_val: "4096"
      options: []
      style:
        widget: slider
        label:
            zh: 输入及输出设置
            en: Input and output settings
    - name: top_p
      label:
        zh: Top P
        en: Top P
      desc:
        zh: '- **Top p 为累计概率**: 模型在生成输出时会从概率最高的词汇开始选择，直到这些词汇的总概率累积达到Top p 值。这样可以限制模型只选择这些高概率的词汇，从而控制输出内容的多样性。建议不要与“生成随机性”同时调整。'
        en: '**Top P**:\n\n- An alternative to sampling with temperature, where only tokens within the top p probability mass are considered. For example, 0.1 means only the top 10% probability mass tokens are considered.\n- We recommend altering this or temperature, but not both.'
      type: float
      min: "0"
      max: "1"
      default_val:
        default_val: "0.7"
      precision: 2
      options: []
      style:
        widget: slider
        label:
            zh: 生成多样性
            en: Generation diversity
    - name: response_format
      label:
        zh: 输出格式
        en: Response format
      desc:
        zh: '- **文本**: 使用普通文本格式回复\n- **JSON**: 将引导模型使用JSON格式输出'
        en: '**Response Format**:\n\n- **JSON**: Uses JSON format for replies'
      type: int
      min: ""
      max: ""
      default_val:
        default_val: "0"
      options:
        - label: Text
          value: "0"
        - label: JSON
          value: "2"
      style:
        widget: radio_buttons
        label:
            zh: 输入及输出设置
            en: Input and output settings
meta:
    name: Gemini-2.5-Flash
    protocol: gemini
    capability:
        function_call: true
        input_modal:
            - text
            - image
            - audio
            - video
        input_tokens: 1048576
        json_mode: true
        max_tokens: 1114112
        output_modal:
            - text
        output_tokens: 65536
        prefix_caching: true
        reasoning: true
        prefill_response: true
    conn_config:
        base_url: ""
        api_key: ""
        timeout: 0s
        model: gemini-2.5-flash
        temperature: 0.7
        frequency_penalty: 0
        presence_penalty: 0
        max_tokens: 4096
        top_p: 1
        top_k: 0
        stop: []
        openai: null
        claude: null
        ark: null
        deepseek: null
        qwen: null
        gemini:
            backend: 0
            project: ""
            location: ""
            api_version: ""
            headers:
                key_1:
                    - val_1
                    - val_2
            timeout_ms: 0
            include_thoughts: true
            thinking_budget: null
        custom: {}
    status: 0
