Building a multi-turn dialogue + single-turn grounding (multi-turn dialogue) dataset in ms-swift 3.1 #3088

corkiyao · 2025-02-13T00:45:11Z

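When building a custom dataset, I am not sure what format ms-swift 3.1 expects for multi-turn grounding data; I only know the single-turn format. This is my first time using the library, so this part is unclear to me. I would appreciate a reply from the authors. Thank you very much.
Specifically, for example: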

This is the dataset format for supervised fine-tuning:
{"messages": [{"role": "system", "content": "<system>"}, {"role": "user", "content": "<query1>"}, {"role": "assistant", "content": "<response1>"}, {"role": "user", "content": "<query2>"}, {"role": "assistant", "content": "<response2>"}]}

I understand this is the multi-turn dialogue format. The grounding (object localization) format is shown below:

{"messages": [{"role": "system", "content": "You are a helpful assistant."}, {"role": "user", "content": "<image>Find the <ref-object> in the image"}, {"role": "assistant", "content": "<bbox><bbox>"}], "images": ["/xxx/x.jpg"], "objects": {"ref": ["sheep"], "bbox": [[90.9, 160.8, 135, 212.8], [360.9, 480.8, 495, 532.8]]}}

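However, I want to use an image description as the first turn and grounding as the second turn. The formats provided by the authors do not seem to cover this case.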

Would the following work?

{"messages": [{"role": "system", "content": "You are a helpful assistant."}, {"role": "user", "content": "<image>Is this a green grassland?"}, {"role": "assistant", "content": "Yes, this is a green grassland."}, {"role": "user", "content": "<image>Find the <ref-object> in the image"}, {"role": "assistant", "content": "<bbox><bbox>"}], "images": ["/xxx/x.jpg"], "objects": {"ref": ["sheep"], "bbox": [[90.9, 160.8, 135, 212.8], [360.9, 480.8, 495, 532.8]]}}

That is, first build an image-description turn, then add a grounding turn?
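
One way to sanity-check a combined record like this before training is to compare the placeholder tags in `messages` with the side fields. The sketch below rests on an assumption that this thread does not confirm: that each `<image>` tag pairs with one entry in `images`, each `<ref-object>` with one entry in `objects["ref"]`, and each `<bbox>` with one entry in `objects["bbox"]`. The helpers `build_record` and `check_record` are hypothetical names for illustration, not part of ms-swift.

```python
import json

TAGS = ("<image>", "<ref-object>", "<bbox>")


def build_record(repeat_image_tag: bool) -> dict:
    """Build a two-turn record: an image question first, then grounding.

    repeat_image_tag controls whether the second user turn repeats the
    <image> tag, as in the record proposed in the question.
    """
    prefix = "<image>" if repeat_image_tag else ""
    return {
        "messages": [
            {"role": "system", "content": "You are a helpful assistant."},
            {"role": "user", "content": "<image>Is this a green grassland?"},
            {"role": "assistant", "content": "Yes, this is a green grassland."},
            {"role": "user", "content": prefix + "Find the <ref-object> in the image"},
            {"role": "assistant", "content": "<bbox><bbox>"},
        ],
        "images": ["/xxx/x.jpg"],
        "objects": {
            "ref": ["sheep"],
            "bbox": [[90.9, 160.8, 135, 212.8], [360.9, 480.8, 495, 532.8]],
        },
    }


def check_record(record: dict) -> list:
    """Return mismatches between tag counts and field lengths.

    Assumption (not confirmed by ms-swift here): one tag per field entry.
    """
    text = "".join(m["content"] for m in record["messages"])
    counts = {tag: text.count(tag) for tag in TAGS}
    objects = record.get("objects", {})
    expected = {
        "<image>": len(record.get("images", [])),
        "<ref-object>": len(objects.get("ref", [])),
        "<bbox>": len(objects.get("bbox", [])),
    }
    return [
        f"{tag}: {counts[tag]} in messages, {n} in fields"
        for tag, n in expected.items()
        if counts[tag] != n
    ]


if __name__ == "__main__":
    for repeat in (False, True):
        record = build_record(repeat)
        print(repeat, check_record(record))
        # One JSON object per line, keeping non-ASCII text readable.
        print(json.dumps(record, ensure_ascii=False))
```

Under that assumption, the record proposed above would be flagged because it contains two `<image>` tags but only one path in `images`, while the variant with a single `<image>` tag in the first turn passes the check. Whether ms-swift 3.1 accepts either layout is for the maintainers to confirm.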
