In the output, the order of 'ta' is sometimes reversed as 'at'. #8015
Hello @wsbao! I'm here to help you with any bugs, questions, or contributions you have for the Dify project. Let's work together to solve this issue! There are no specific tests or validation checks documented in the repository for ensuring the correct order of characters in generated text [1]. Additionally, there are no known issues or bugs related to character ordering in the text generation module documented in the repository [1]. The issue might be related to the model or the way the agent framework in ReAct mode processes and generates responses.
@dosu, I created a YAML DSL for this App to reproduce this bug. I used LM Studio to host Qwen2-7B-Instruct-q2_k.gguf locally with CPU only.

```yaml
app:
  description: ''
  icon: speaking_head_in_silhouette
  icon_background: '#E0EAFF'
  mode: agent-chat
  name: Gossip
  use_icon_as_answer_icon: false
kind: app
model_config:
  agent_mode:
    enabled: true
    max_iteration: 5
    prompt: null
    strategy: react
    tools: []
  annotation_reply:
    enabled: false
  chat_prompt_config: {}
  completion_prompt_config: {}
  dataset_configs:
    datasets:
      datasets: []
    retrieval_model: multiple
  dataset_query_variable: ''
  external_data_tools: []
  file_upload:
    image:
      detail: high
      enabled: false
      number_limits: 3
      transfer_methods:
      - remote_url
      - local_file
  model:
    completion_params:
      max_tokens: 4096
      stop: []
      temperature: 0
    mode: chat
    name: Qwen2-7B-Instruct-q2_k.gguf
    provider: openai_api_compatible
  more_like_this:
    enabled: false
  opening_statement: ''
  pre_prompt: ''
  prompt_type: simple
  retriever_resource:
    enabled: true
  sensitive_word_avoidance:
    configs: []
    enabled: false
    type: ''
  speech_to_text:
    enabled: false
  suggested_questions: []
  suggested_questions_after_answer:
    enabled: false
  text_to_speech:
    enabled: false
    language: ''
    voice: ''
  user_input_form: []
version: 0.1.2
```

When we ask "What are ISO standard and stable diffusion?", the "rather" in the front-end output is misspelled as "athn". However, if we use the same JSON payload

```json
{
    "model": "Qwen2-7B-Instruct-q2_k.gguf",
    // "stream": true,
    "temperature": 0,
    "max_tokens": 4096,
    "messages": [
        {
            "role": "system",
            "content": "Respond to the human as helpfully and accurately as possible. \n\n\n\nYou have access to the following tools:\n\n[]\n\nUse a json blob to specify a tool by providing an action key (tool name) and an action_input key (tool input).\nValid \"action\" values: \"Final Answer\" or \n\nProvide only ONE action per $JSON_BLOB, as shown:\n\n```\n{\n \"action\": $TOOL_NAME,\n \"action_input\": $ACTION_INPUT\n}\n```\n\nFollow this format:\n\nQuestion: input question to answer\nThought: consider previous and subsequent steps\nAction:\n```\n$JSON_BLOB\n```\nObservation: action result\n... (repeat Thought/Action/Observation N times)\nThought: I know what to respond\nAction:\n```\n{\n \"action\": \"Final Answer\",\n \"action_input\": \"Final response to human\"\n}\n```\n\nBegin! Reminder to ALWAYS respond with a valid json blob of a single action. Use tools if necessary. Respond directly if appropriate. Format is Action:```$JSON_BLOB```then Observation:.\n"
        },
        {
            "role": "user",
            "content": "What are ISO standard and stable diffusion?"
        }
    ],
    "stop": [
        "Observation"
    ]
}
```

to query this model API, the result is correct:

```json
{
    "id": "chatcmpl-***h",
    "object": "chat.completion",
    "created": 17***01,
    "model": "Qwen2-7B-Instruct/Repository/qwen2-7b-instruct-q2_k.gguf",
    "choices": [
        {
            "index": 0,
            "message": {
                "role": "assistant",
                "content": "Thought: The user wants information about ISO standards and stable diffusion processes separately rather than asking for their relationship or comparison.\n\nAction:\n```\n{\n \"action\": \"Final Answer\",\n \"action_input\": {\n \"ISO Standards\": \"ISO standards refer to International Organization for Standardization guidelines that provide specifications for products, services, systems, processes, etc., ensuring quality assurance.\",\n \"Stable Diffusion Processes\": \"Stable diffusion refers to the process where substances move from areas of higher concentration to lower concentration until equilibrium is reached.\"\n}\n}\n```\n\n"
            },
            "logprobs": null,
            "finish_reason": "stop"
        }
    ],
    "usage": {
        "prompt_tokens": 242,
        "completion_tokens": 112,
        "total_tokens": 354
    },
    "system_fingerprint": "Qwen2-7B-Instruct/***/qwen2-7b-instruct-q2_k.gguf"
}
```
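For anyone who wants to replay the direct API call, here is a minimal sketch in Python. The endpoint URL assumes LM Studio's default local server port (1234), so adjust it to your setup; the `// "stream": true` line from the payload above is a comment and not valid JSON, so it is omitted here, and the system prompt is abbreviated.

```python
# Minimal sketch of the direct query (endpoint URL assumes LM Studio's
# default local server; adjust host/port to your setup).
import requests

payload = {
    "model": "Qwen2-7B-Instruct-q2_k.gguf",
    "temperature": 0,
    "max_tokens": 4096,
    "messages": [
        # paste the full ReAct system prompt from the payload above here
        {"role": "system", "content": "Respond to the human as helpfully and accurately as possible. ..."},
        {"role": "user", "content": "What are ISO standard and stable diffusion?"},
    ],
    "stop": ["Observation"],
}

resp = requests.post("http://localhost:1234/v1/chat/completions", json=payload, timeout=120)
print(resp.json()["choices"][0]["message"]["content"])
```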
@Yeuoly @JohnJyong @GarfieldDai I think I figured out the root cause of this bug. It arises from api/core/agent/output_parser/cot_output_parser.py, which is called by api/core/agent/cot_agent_runner.py at line 123. Basically, api/core/agent/output_parser/cot_output_parser.py transforms the output from the LLM by scanning it character by character: any run of characters that could be the start of the keywords `action:` or `thought:` is held back, and only released once the partial match fails.

In the implementation, the variables `action_cache` and `thought_cache` hold those partially matched characters, but they can be released back to the stream *after* the current character has already been yielded, so a sequence like 'ta' ('t' cached as a possible `thought:`, then 'a' breaking the match) comes out as 'at'. To rectify this undesired behavior, it is always good to release all the historically cached content back to the stream before the current letter stored in `delta` is yielded. With these observations, I suggest two improvements for api/core/agent/output_parser/cot_output_parser.py:

1. Track the previously processed character in `last_character`, and only start matching `action:` / `thought:` when the keyword begins at the start of the stream or right after a newline or space; a matching letter in the middle of a word is ordinary text (`yield_delta`).
2. Whenever a partial keyword match fails, flush `action_cache` / `thought_cache` back to the stream first, and only then yield the current `delta`.
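To make the failure mode concrete, here is a minimal sketch of my own (not the actual Dify code) of this kind of streaming keyword filter; yielding the current character before flushing the cache is exactly what turns 'ta' into 'at':

```python
# Minimal illustration (not the actual Dify code): a streaming filter that
# swallows a keyword such as "thought:" while passing everything else through.

def strip_keyword(stream, keyword="thought:"):
    cache = ""   # characters that so far look like the start of `keyword`
    idx = 0      # how much of `keyword` has been matched
    for ch in stream:
        if ch.lower() == keyword[idx]:
            cache += ch
            idx += 1
            if idx == len(keyword):  # full keyword matched: swallow it
                cache, idx = "", 0
            continue
        if cache:        # partial match failed:
            yield cache  # flush the cached characters FIRST...
            cache, idx = "", 0
        yield ch         # ...and only then the current character
    if cache:            # stream ended mid-match: release the leftovers
        yield cache

# 't' is cached as a possible "thought:"; the following 'a' breaks the match.
print("".join(strip_keyword("ta")))  # 'ta' -- the buggy variant, which yielded
                                     # `ch` before `cache`, printed 'at' instead
```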
Below is my proposed revision for api/core/agent/output_parser/cot_output_parser.py:

```python
import json
import re
from collections.abc import Generator
from typing import Union

from core.agent.entities import AgentScratchpadUnit
from core.model_runtime.entities.llm_entities import LLMResultChunk


class CotAgentOutputParser:
    @classmethod
    def handle_react_stream_output(
        cls, llm_response: Generator[LLMResultChunk, None, None], usage_dict: dict
    ) -> Generator[Union[str, AgentScratchpadUnit.Action], None, None]:
        def parse_action(json_str):
            try:
                action = json.loads(json_str)
                action_name = None
                action_input = None

                # cohere always returns a list
                if isinstance(action, list) and len(action) == 1:
                    action = action[0]

                for key, value in action.items():
                    if "input" in key.lower():
                        action_input = value
                    else:
                        action_name = value

                if action_name is not None and action_input is not None:
                    return AgentScratchpadUnit.Action(
                        action_name=action_name,
                        action_input=action_input,
                    )
                else:
                    return json_str or ""
            except:
                return json_str or ""

        def extra_json_from_code_block(code_block) -> Generator[Union[dict, str], None, None]:
            code_blocks = re.findall(r"```(.*?)```", code_block, re.DOTALL)
            if not code_blocks:
                return
            for block in code_blocks:
                json_text = re.sub(r"^[a-zA-Z]+\n", "", block.strip(), flags=re.MULTILINE)
                yield parse_action(json_text)

        code_block_cache = ""
        code_block_delimiter_count = 0
        in_code_block = False
        json_cache = ""
        json_quote_count = 0
        in_json = False
        got_json = False

        action_cache = ""
        action_str = "action:"
        action_idx = 0

        thought_cache = ""
        thought_str = "thought:"
        thought_idx = 0

        # last character seen; "" means the very beginning of the stream
        last_character = ""

        for response in llm_response:
            if response.delta.usage:
                usage_dict["usage"] = response.delta.usage
            response = response.delta.message.content
            if not isinstance(response, str):
                continue

            # stream
            index = 0
            while index < len(response):
                steps = 1
                delta = response[index : index + steps]
                # set when a keyword's first letter shows up mid-word and must
                # therefore be emitted as ordinary text
                yield_delta = False

                if delta == "`":
                    last_character = delta
                    code_block_cache += delta
                    code_block_delimiter_count += 1
                else:
                    if not in_code_block:
                        if code_block_delimiter_count > 0:
                            last_character = delta
                            yield code_block_cache
                            code_block_cache = ""
                    else:
                        last_character = delta
                        code_block_cache += delta
                    code_block_delimiter_count = 0

                if not in_code_block and not in_json:
                    if delta.lower() == action_str[action_idx] and action_idx == 0:
                        if last_character not in {"\n", " ", ""}:
                            # "a" in the middle of a word: not a keyword start
                            yield_delta = True
                        else:
                            last_character = delta
                            action_cache += delta
                            action_idx += 1
                            if action_idx == len(action_str):
                                action_cache = ""
                                action_idx = 0
                            index += steps
                            continue
                    elif delta.lower() == action_str[action_idx] and action_idx > 0:
                        last_character = delta
                        action_cache += delta
                        action_idx += 1
                        if action_idx == len(action_str):
                            action_cache = ""
                            action_idx = 0
                        index += steps
                        continue
                    else:
                        if action_cache:
                            # partial match failed: flush the cache before the
                            # current character is released
                            last_character = delta
                            yield action_cache
                            action_cache = ""
                            action_idx = 0

                    if delta.lower() == thought_str[thought_idx] and thought_idx == 0:
                        if last_character not in {"\n", " ", ""}:
                            # "t" in the middle of a word: not a keyword start
                            yield_delta = True
                        else:
                            last_character = delta
                            thought_cache += delta
                            thought_idx += 1
                            if thought_idx == len(thought_str):
                                thought_cache = ""
                                thought_idx = 0
                            index += steps
                            continue
                    elif delta.lower() == thought_str[thought_idx] and thought_idx > 0:
                        last_character = delta
                        thought_cache += delta
                        thought_idx += 1
                        if thought_idx == len(thought_str):
                            thought_cache = ""
                            thought_idx = 0
                        index += steps
                        continue
                    else:
                        if thought_cache:
                            # partial match failed: flush the cache before the
                            # current character is released
                            last_character = delta
                            yield thought_cache
                            thought_cache = ""
                            thought_idx = 0

                    if yield_delta:
                        # any cached content was flushed above, so order is kept
                        index += steps
                        last_character = delta
                        yield delta
                        continue

                if code_block_delimiter_count == 3:
                    if in_code_block:
                        last_character = delta
                        yield from extra_json_from_code_block(code_block_cache)
                        code_block_cache = ""

                    in_code_block = not in_code_block
                    code_block_delimiter_count = 0

                if not in_code_block:
                    # handle single json
                    if delta == "{":
                        json_quote_count += 1
                        in_json = True
                        last_character = delta
                        json_cache += delta
                    elif delta == "}":
                        last_character = delta
                        json_cache += delta
                        if json_quote_count > 0:
                            json_quote_count -= 1
                            if json_quote_count == 0:
                                in_json = False
                                got_json = True
                                index += steps
                                continue
                    else:
                        if in_json:
                            last_character = delta
                            json_cache += delta

                    if got_json:
                        got_json = False
                        last_character = delta
                        yield parse_action(json_cache)
                        json_cache = ""
                        json_quote_count = 0
                        in_json = False

                if not in_code_block and not in_json:
                    last_character = delta
                    yield delta.replace("`", "")

                index += steps

        if code_block_cache:
            yield code_block_cache

        if json_cache:
            yield parse_action(json_cache)
```
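As a quick sanity check of the revision, one can feed the parser hand-built chunks. This is a hypothetical test of mine, not part of the Dify test suite; `SimpleNamespace` merely mimics the two attributes the parser reads, `delta.usage` and `delta.message.content`:

```python
# Hypothetical smoke test: stand-in chunk objects in place of LLMResultChunk.
from types import SimpleNamespace

def fake_chunks(*texts):
    for text in texts:
        yield SimpleNamespace(
            delta=SimpleNamespace(usage=None, message=SimpleNamespace(content=text))
        )

usage_dict = {}
out = "".join(
    str(piece)
    for piece in CotAgentOutputParser.handle_react_stream_output(fake_chunks("t", "a"), usage_dict)
)
assert out == "ta", out  # the old parser could emit "at" here
```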
The issue you're experiencing with character ordering in the output seems to be specific to the ReAct mode in the Dify platform when using the Qwen2-7B-Instruct model. Here are a few steps you can take to investigate and potentially resolve this issue:
By following these steps, you should be able to narrow down the cause of the character ordering issue and implement a fix. If the problem persists, consider reaching out to the Dify community for further assistance.
Self Checks
Dify version
0.7.3
Cloud or Self Hosted
Self Hosted (Docker)
Steps to reproduce
Then, in the output, the order of 'ta' is sometimes reversed as 'at'.
✔️ Expected Behavior
Correct order of the output letters, especially for 'a' and 't'.
❌ Actual Behavior
The order of 'ta' is sometimes reversed as 'at'.