[DOC] ML-Commons agent framework updates (#10682)

pyek-bot · kolchfa-aws · natebower · web-flow · commit d6029c5d80f8 · 2025-08-15T06:27:31.000-04:00
* per_agent_updates

Signed-off-by: Pavan Yekbote <pybot@amazon.com>

* remove redundant versioning

Signed-off-by: Pavan Yekbote <pybot@amazon.com>

* fix: prompts and remove the 3.0 details

Signed-off-by: Pavan Yekbote <pybot@amazon.com>

* Apply suggestions from code review

Signed-off-by: kolchfa-aws <105444904+kolchfa-aws@users.noreply.github.com>

* Remove copy button from endpoints

Signed-off-by: Fanit Kolchina <kolchfa@amazon.com>

* Update _ml-commons-plugin/agents-tools/agents/plan-execute-reflect.md

Signed-off-by: kolchfa-aws <105444904+kolchfa-aws@users.noreply.github.com>

* Update _ml-commons-plugin/agents-tools/agents/plan-execute-reflect.md

Signed-off-by: Nathan Bower <nbower@amazon.com>

---------

Signed-off-by: Pavan Yekbote <pybot@amazon.com>
Signed-off-by: kolchfa-aws <105444904+kolchfa-aws@users.noreply.github.com>
Signed-off-by: Fanit Kolchina <kolchfa@amazon.com>
Signed-off-by: Nathan Bower <nbower@amazon.com>
Co-authored-by: kolchfa-aws <105444904+kolchfa-aws@users.noreply.github.com>
Co-authored-by: Fanit Kolchina <kolchfa@amazon.com>
Co-authored-by: Nathan Bower <nbower@amazon.com>
diff --git a/_ml-commons-plugin/agents-tools/agents/plan-execute-reflect.md b/_ml-commons-plugin/agents-tools/agents/plan-execute-reflect.md
@@ -12,9 +12,6 @@ grand_parent: Agents and tools
 **Introduced 3.0**
 {: .label .label-purple }
 
-This is an experimental feature and is not recommended for use in a production environment. For updates on the progress of the feature or if you want to leave feedback, see the associated [GitHub issue](https://github.com/opensearch-project/ml-commons/issues/3745).    
-{: .warning}
-
 Plan-execute-reflect agents are designed to solve complex tasks that require iterative reasoning and step-by-step execution. These agents use one large language model (LLM)---the _planner_---to create and update a plan and another LLM (or the same one by default) to execute each individual step using a built-in conversational agent.
 
 A plan-execute-reflect agent works in three phases:
@@ -255,66 +252,121 @@ The plan-execute-reflect agent uses the following predefined prompts. You can cu
 
 ### Planner template and prompt
 
-To create a custom planner prompt template, modify the `planner_prompt_template` parameter.
-The following template is used to ask the LLM to devise a plan for the given task:
+To create a custom planner prompt template, modify the `planner_prompt_template` parameter. The following template is used to ask the LLM to devise a plan for the given task:
 
 ```json
-${parameters.planner_prompt} \n Objective: ${parameters.user_prompt} \n ${parameters.plan_execute_reflect_response_format}
+${parameters.tools_prompt} \n${parameters.planner_prompt} \nObjective: ${parameters.user_prompt} \n\nRemember: Respond only in JSON format following the required schema.
 ```
 
 To create a custom planner prompt, modify the `planner_prompt` parameter.
 The following prompt is used to ask the LLM to devise a plan for the given task:
 
 ```
-For the given objective, come up with a simple step by step plan. This plan should involve individual tasks, that if executed correctly will yield the correct answer. Do not add any superfluous steps. The result of the final step should be the final answer. Make sure that each step has all the information needed - do not skip steps. At all costs, do not execute the steps. You will be told when to execute the steps.
+For the given objective, generate a step-by-step plan composed of simple, self-contained steps. The final step should directly yield the final answer. Avoid unnecessary steps.
 ```
 
 ### Planner prompt with a history template
 
-To create a custom planner prompt with a history template, modify the `planner_with_history_template` parameter.
-The following template is used when `memory_id` is provided during agent execution to give the LLM context about the previous task:
+To create a custom planner prompt with a history template, modify the `planner_with_history_template` parameter. The following template is used when `memory_id` is provided during agent execution to give the LLM context about the previous task:
 
 ```json
-${parameters.planner_prompt} \n Objective: ${parameters.user_prompt} \n\n You have currently executed the following steps: \n[${parameters.completed_steps}] \n\n \n ${parameters.plan_execute_reflect_response_format}
+${parameters.tools_prompt} \n${parameters.planner_prompt} \nObjective: ```${parameters.user_prompt}``` \n\nYou have currently executed the following steps: \n[${parameters.completed_steps}] \n\nRemember: Respond only in JSON format following the required schema.
 ```
 
 ### Reflection prompt and template
 
-To create a custom reflection prompt template, modify the `reflect_prompt_template` parameter.
-The following template is used to ask the LLM to rethink the original plan based on completed steps:
+To create a custom reflection prompt template, modify the `reflect_prompt_template` parameter. The following template is used to ask the LLM to rethink the original plan based on completed steps:
 
 ```json
-${parameters.planner_prompt} \n Objective: ${parameters.user_prompt} \n Original plan:\n [${parameters.steps}] \n You have currently executed the following steps: \n [${parameters.completed_steps}] \n ${parameters.reflect_prompt} \n ${parameters.plan_execute_reflect_response_format}
+${parameters.tools_prompt} \n${parameters.planner_prompt} \n\nObjective: ```${parameters.user_prompt}```\n\nOriginal plan:\n[${parameters.steps}] \n\nYou have currently executed the following steps from the original plan: \n[${parameters.completed_steps}] \n\n${parameters.reflect_prompt} \n\n.Remember: Respond only in JSON format following the required schema.
 ```
 
 To create a custom reflection prompt, modify the `reflect_prompt` parameter.
 The following prompt is used to ask the LLM to rethink the original plan:
 
 ```
-Update your plan accordingly. If no more steps are needed and you can return to the user, then respond with that. Otherwise, fill out the plan. Only add steps to the plan that still NEED to be done. Do not return previously done steps as part of the plan. Please follow the below response format
+Update your plan based on the latest step results. If the task is complete, return the final answer. Otherwise, include only the remaining steps. Do not repeat previously completed steps.
 ```
 
 ### Planner system prompt
 
-To create a custom planner system prompt, modify the `system_prompt` parameter.
-The following is the planner system prompt:
+To create a custom planner system prompt, modify the `system_prompt` parameter. The following is the planner system prompt:
 
 ```
-You are part of an OpenSearch cluster. When you deliver your final result, include a comprehensive report. This report MUST:\n1. List every analysis or step you performed.\n2. Summarize the inputs, methods, tools, and data used at each step.\n3. Include key findings from all intermediate steps — do NOT omit them.\n4. Clearly explain how the steps led to your final conclusion.\n5. Return the full analysis and conclusion in the 'result' field, even if some of this was mentioned earlier.\n\nThe final response should be fully self-contained and detailed, allowing a user to understand the full investigation without needing to reference prior messages. Always respond in JSON format.
+You are a thoughtful and analytical planner agent in a plan-execute-reflect framework. Your job is to design a clear, step-by-step plan for a given objective.
+
+Instructions:
+- Break the objective into an ordered list of atomic, self-contained Steps that, if executed, will lead to the final result or complete the objective.
+- Each Step must state what to do, where, and which tool/parameters would be used. You do not execute tools, only reference them for planning.
+- Use only the provided tools; do not invent or assume tools. If no suitable tool applies, use reasoning or observations instead.
+- Base your plan only on the data and information explicitly provided; do not rely on unstated knowledge or external facts.
+- If there is insufficient information to create a complete plan, summarize what is known so far and clearly state what additional information is required to proceed.
+- Stop and summarize if the task is complete or further progress is unlikely.
+- Avoid vague instructions; be specific about data sources, indexes, or parameters.
+- Never make assumptions or rely on implicit knowledge.
+- Respond only in JSON format.
+
+Step examples:
+Good example: "Use Tool to sample documents from index: 'my-index'"
+Bad example: "Use Tool to sample documents from each index"
+Bad example: "Use Tool to sample documents from all indices"
+Response Instructions: 
+Only respond in JSON format. Always follow the given response instructions. Do not return any content that does not follow the response instructions. Do not add anything before or after the expected JSON. 
+Always respond with a valid JSON object that strictly follows the below schema:
+{
+	"steps": array[string], 
+	"result": string 
+}
+Use "steps" to return an array of strings where each string is a step to complete the objective, leave it empty if you know the final result. Please wrap each step in quotes and escape any special characters within the string. 
+Use "result" return the final response when you have enough information, leave it empty if you want to execute more steps. Please escape any special characters within the result. 
+Here are examples of valid responses following the required JSON schema:
+
+Example 1 - When you need to execute steps:
+{
+	"steps": ["This is an example step", "this is another example step"],
+	"result": ""
+}
+
+Example 2 - When you have the final result:
+{
+	"steps": [],
+	"result": "This is an example result\n with escaped special characters"
+}
+Important rules for the response:
+1. Do not use commas within individual steps 
+2. Do not add any content before or after the JSON 
+3. Only respond with a pure JSON object 
+
+When you deliver your final result, include a comprehensive report. This report must:
+1. List every analysis or step you performed.
+2. Summarize the inputs, methods, tools, and data used at each step.
+3. Include key findings from all intermediate steps — do NOT omit them.
+4. Clearly explain how the steps led to your final conclusion. Only mention the completed steps.
+5. Return the full analysis and conclusion in the 'result' field, even if some of this was mentioned earlier. Ensure that special characters are escaped in the 'result' field.
+6. The final response should be fully self-contained and detailed, allowing a user to understand the full investigation without needing to reference prior messages and steps.
 ```
 
+We do not recommend modifying the response format instructions. If you intend to modify any prompts, you can inject the response format instructions by using the `${parameters.plan_execute_reflect_response_format}` parameter.
+{: .tip}
+
 ### Executor system prompt
 
-To create a custom executor system prompt, modify the `executor_system_prompt` parameter.
-The following is the executor system prompt:
+To create a custom executor system prompt, modify the `executor_system_prompt` parameter. The following is the executor system prompt:
 
 ```
-You are a dedicated helper agent working as part of a plan‑execute‑reflect framework. Your role is to receive a discrete task, execute all necessary internal reasoning or tool calls, and return a single, final response that fully addresses the task. You must never return an empty response. If you are unable to complete the task or retrieve meaningful information, you must respond with a clear explanation of the issue or what was missing. Under no circumstances should you end your reply with a question or ask for more information. If you search any index, always include the raw documents in the final result instead of summarizing the content. This is critical to give visibility into what the query retrieved.
+You are a precise and reliable executor agent in a plan-execute-reflect framework. Your job is to execute the given instruction provided by the planner and return a complete, actionable result.
+
+Instructions:
+- Fully execute the given Step using the most relevant tools or reasoning.
+- Include all relevant raw tool outputs (e.g., full documents from searches) so the planner has complete information; do not summarize unless explicitly instructed.
+- Base your execution and conclusions only on the data and tool outputs available; do not rely on unstated knowledge or external facts.
+- If the available data is insufficient to complete the Step, summarize what was obtained so far and clearly state the additional information or access required to proceed (do not guess).
+- If unable to complete the Step, clearly explain what went wrong and what is needed to proceed.
+- Avoid making assumptions and relying on implicit knowledge.
+- Your response must be self-contained and ready for the planner to use without modification. Never end with a question.
+- Break complex searches into simpler queries when appropriate.
 ```
 
-We recommend never modifying `${parameters.plan_execute_reflect_response_format}` and always including it toward the end of your prompt templates.
-{: .tip}
-
 ## Modifying default prompts
 
 To modify the prompts, provide them during agent registration:
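
Reviewer note on the prompt templates in this file: the templates in the hunk above are built from `${parameters.<name>}` placeholders. The following Python sketch illustrates how such a template could be filled in. It is only an illustration: ML Commons performs its own substitution internally, and the `render` helper, the sample values, and the choice to leave unknown placeholders untouched are all assumptions made for this example.

```python
import re

# Default planner template from the diff above, with the JSON "\n" escapes
# written as real newlines.
PLANNER_TEMPLATE = (
    "${parameters.tools_prompt} \n"
    "${parameters.planner_prompt} \n"
    "Objective: ${parameters.user_prompt} \n\n"
    "Remember: Respond only in JSON format following the required schema."
)

# Matches ${parameters.<name>} and captures <name>.
PLACEHOLDER = re.compile(r"\$\{parameters\.(\w+)\}")


def render(template, parameters):
    """Fill each ${parameters.<name>} placeholder from the given dict.

    Hypothetical helper: placeholders without a matching key are left as-is,
    which is a choice made for this sketch, not documented behavior.
    """
    def substitute(match):
        name = match.group(1)
        return str(parameters.get(name, match.group(0)))

    return PLACEHOLDER.sub(substitute, template)


if __name__ == "__main__":
    print(render(PLANNER_TEMPLATE, {
        "tools_prompt": "Available tools: ListIndexTool",  # sample value
        "planner_prompt": "For the given objective, generate a step-by-step plan.",
        "user_prompt": "How many indexes exist in the cluster?",
    }))
```

Seen this way, a custom template that omits a placeholder such as `${parameters.plan_execute_reflect_response_format}` simply never receives that text, which is why the tip in the diff suggests injecting it explicitly when you rewrite a prompt.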
diff --git a/_ml-commons-plugin/api/agent-apis/execute-agent.md b/_ml-commons-plugin/api/agent-apis/execute-agent.md
@@ -32,7 +32,7 @@ The following table lists the available request fields.
 
 Field | Data type | Required/Optional | Description
 :---  | :--- | :--- 
-`parameters`| Object | Required | The parameters required by the agent. 
+`parameters`| Object | Required | The parameters required by the agent. Any agent parameters configured during registration can be overridden using this field.
 `parameters.verbose`| Boolean | Optional | Provides verbose output. 
 
 ## Example request
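
Reviewer note on the `parameters` change above: the new sentence says that parameters configured at registration can be overridden per call. The following Python sketch builds (but does not send) such an execute request. The agent ID, the `question` key, and the `build_execute_request` helper are placeholders assumed for illustration; `max_steps` is taken from the Register Agent table in this PR.

```python
import json


def build_execute_request(agent_id, question, overrides=None):
    """Return (method, path, body) for an Execute Agent call.

    Hypothetical helper: it only assembles the request so the shape of the
    override is visible; sending it is left to your HTTP client.
    """
    parameters = {"question": question}
    # Values supplied here are intended to take precedence over the ones
    # stored with the agent at registration time.
    parameters.update(overrides or {})
    path = f"/_plugins/_ml/agents/{agent_id}/_execute"
    return "POST", path, {"parameters": parameters}


if __name__ == "__main__":
    method, path, body = build_execute_request(
        "your_agent_id",                       # placeholder ID
        "How many flights were delayed?",      # sample question
        {"max_steps": 5, "verbose": True},     # per-call overrides
    )
    print(method, path)
    print(json.dumps(body, indent=2))
```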
diff --git a/_ml-commons-plugin/api/agent-apis/register-agent.md b/_ml-commons-plugin/api/agent-apis/register-agent.md
@@ -26,7 +26,6 @@ For more information about agents, see [Agents]({{site.url}}{{site.baseurl}}/ml-
 ```json
 POST /_plugins/_ml/agents/_register
 ```
-{% include copy-curl.html %}
 
 ## Request body fields
 
@@ -47,7 +46,11 @@ Field | Data type | Required/Optional | Agent type | Description
 `parameters.executor_agent_id`| Integer | Optional | `plan_execute_and_reflect` | The `plan_execute_and_reflect` agent internally uses a `conversational` agent to execute each step. By default, this executor agent uses the same model as the planning model specified in the `llm` configuration. To use a different model for executing steps, create a `conversational` agent using another model and pass the agent ID in this field. This can be useful if you want to use different models for planning and execution.
 `parameters.max_steps` | Integer | Optional | `plan_execute_and_reflect` | The maximum number of steps executed by the LLM. Default is `20`.
 `parameters.executor_max_iterations` | Integer | Optional | `plan_execute_and_reflect` | The maximum number of messages sent to the LLM by the executor agent. Default is `20`.
+`parameters.message_history_limit` | Integer | Optional | `plan_execute_and_reflect` | The number of recent messages from conversation memory to include as context for the planner. Default is `10`. 
+`parameters.executor_message_history_limit` | Integer | Optional | `plan_execute_and_reflect` | The number of recent messages from conversation memory to include as context for the executor. Default is `10`.
 `parameters._llm_interface` | String | Required | `plan_execute_and_reflect`, `conversational` | Specifies how to parse the LLM output when using function calling. Valid values are: <br> - `bedrock/converse/claude`: Anthropic Claude conversational models hosted on Amazon Bedrock  <br> - `bedrock/converse/deepseek_r1`: DeepSeek-R1 models hosted on Amazon Bedrock <br> - `openai/v1/chat/completions`: OpenAI chat completion models hosted on OpenAI. Each interface defines a default response schema and function call parser.
+`inject_datetime` | Boolean | Optional | `conversational`, `plan_execute_and_reflect` | Whether to automatically inject the current date into the system prompt. Default is `false`.
+`datetime_format` | String | Optional | `conversational`, `plan_execute_and_reflect` | A format string for dates used when `inject_datetime` is enabled. Default is `"yyyy-MM-dd'T'HH:mm:ss'Z'"` (ISO format).
 
 The `tools` array contains a list of tools for the agent. Each tool contains the following fields.
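
Reviewer note on `datetime_format` above: the default value `"yyyy-MM-dd'T'HH:mm:ss'Z'"` reads like a Java date pattern in which `'T'` and `'Z'` are quoted literals. The following Python sketch shows what a timestamp in that shape looks like. It assumes Java pattern semantics and assumes the injected time is expressed in UTC; neither point is stated in the diff, and the helper name is hypothetical.

```python
from datetime import datetime, timedelta, timezone

JAVA_DEFAULT_PATTERN = "yyyy-MM-dd'T'HH:mm:ss'Z'"
# Rough strftime equivalent; "T" and "Z" are emitted as literal characters.
PYTHON_EQUIVALENT = "%Y-%m-%dT%H:%M:%SZ"


def format_like_default(moment):
    """Format a timezone-aware datetime in the shape of the default pattern.

    Assumption: the value is converted to UTC first, so the literal "Z"
    suffix is truthful.
    """
    return moment.astimezone(timezone.utc).strftime(PYTHON_EQUIVALENT)


if __name__ == "__main__":
    eastern = timezone(timedelta(hours=-4))
    print(format_like_default(datetime(2025, 8, 15, 6, 27, 31, tzinfo=eastern)))
```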
 
diff --git a/_tutorials/gen-ai/agents/build-plan-execute-reflect-agent.md b/_tutorials/gen-ai/agents/build-plan-execute-reflect-agent.md
@@ -81,22 +81,25 @@ Note the model ID; you'll use it in the following steps.
 
 ### Step 1(c): Configure a retry policy
 
-Because the agent is a long-running agent that executes multiple steps, we strongly recommend configuring a retry policy for your model. For more information, see the `client_config` parameter in [Configuration parameters]({{site.url}}{{site.baseurl}}/ml-commons-plugin/remote-models/blueprints/#configuration-parameters). For example, to configure unlimited retries, set `max_retry_times` to `-1`:
+Because the agent is a long-running agent that executes multiple steps, we strongly recommend configuring a retry policy for your connector. For more information, see the `client_config` parameter in [Configuration parameters]({{site.url}}{{site.baseurl}}/ml-commons-plugin/remote-models/blueprints/#configuration-parameters). For example, to configure unlimited retries, set `max_retry_times` to `-1`:
 
 ```json
-PUT /_plugins/_ml/models/your_model_id
+PUT /_plugins/_ml/connectors/<connector_id>
 {
-  "connector": {
-    "client_config": {
-      "max_retry_times": -1,
-      "retry_backoff_millis": 300,
-      "retry_backoff_policy": "exponential_full_jitter"
-    }
+  "client_config": {
+    "max_retry_times": -1,
+    "retry_backoff_millis": 300,
+    "retry_backoff_policy": "exponential_full_jitter"
   }
 }
 ```
 {% include copy-curl.html %}
 
+If you have deployed your model or made a predict call, you must undeploy your model before updating the `client_config`. For more information, see [Undeploy Model API]({{site.url}}{{site.baseurl}}/ml-commons-plugin/api/model-apis/undeploy-model/).
+
+For more information about deploying your model, see [Deploy Model API]({{site.url}}{{site.baseurl}}/ml-commons-plugin/api/model-apis/deploy-model/).
+
+
 ## Step 2: Create an agent
 
 Create a `plan_execute_and_reflect` agent configured with the following information:
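
Reviewer note on Step 1(c) above: the new text implies an order of operations when the model is already deployed, namely undeploy, update the connector, then deploy again. The following Python sketch lists those calls in order without sending them. The IDs and the `retry_policy_update_plan` helper are placeholders assumed for illustration; the endpoints are the ones named by the linked Undeploy Model, connector update, and Deploy Model pages.

```python
import json


def retry_policy_update_plan(model_id, connector_id, client_config):
    """Return the ordered (method, path, body) calls for changing client_config.

    Hypothetical helper: it only describes the sequence; run each call with
    your own client and check every response before moving on.
    """
    return [
        # 1. The model must be undeployed before the connector can be updated.
        ("POST", f"/_plugins/_ml/models/{model_id}/_undeploy", None),
        # 2. Update the retry policy on the connector itself.
        ("PUT", f"/_plugins/_ml/connectors/{connector_id}",
         {"client_config": client_config}),
        # 3. Deploy the model again so it picks up the new settings.
        ("POST", f"/_plugins/_ml/models/{model_id}/_deploy", None),
    ]


if __name__ == "__main__":
    plan = retry_policy_update_plan(
        "your_model_id",
        "your_connector_id",
        {
            "max_retry_times": -1,
            "retry_backoff_millis": 300,
            "retry_backoff_policy": "exponential_full_jitter",
        },
    )
    for method, path, body in plan:
        print(method, path, json.dumps(body) if body is not None else "")
```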