
Commit ee1e0d4

feat: implement scene update logic based on predicted action type (#61)
* fix: update import for Collaborator and pin compatible package versions
  - Fixed import path case issue for `Collaborator` to ensure cross-platform compatibility.
  - Replaced editable Git dependency with pinned `mcp==1.10.1` for stable installation.
* refactor: switch from stdio to HTTP MCP client and update robot status API
  - Replaced stdio-based MCP connection with HTTP-based streamable client using config-defined URL.
  - Updated `config.yaml` to include MCP server URL under `mcp.URL`.
  - Refactored `robot_status` endpoint to return structured list of registered robot states.
  - Removed unused Flask imports and simplified code structure.
* delete: redundant parameters
* feat(ui): add interactive tool config modal with dynamic parameter table and step delay
* docs: add advanced deployment settings and task publishing guide with images
* fix: Failed to switch tabs when clicking on tool page
* feat: add manual deployment guide, task submission options, and async subtask dispatching
* update
* update
* update README.md
* update: docker run command
* update
* fix(docker): improve formatting and fix volume mount in run command
  - Fixed inconsistent indentation in the Docker run command
  - Added missing container path for the `-v` volume mount
  - Removed trailing whitespace after `-p 8888:8888`
  - Improved readability with consistent parameter alignment
* update flag_scale to latest GitHub version for bug fixes
* pin FlagScale to specific commit a0687db for reproducibility
* Configure Git LFS to track mp4 files
* feat: add deployment video without LFS
* feat: update Readme.md
* feat: update video link
* docs(README): add GitHub Pages link for deployment tutorial video
* fix: avoid repeated tool calls and restrict actions to uncompleted, task-required steps
* feat: add scene update logic based on predicted action type (add/remove/position)
* refactor: encapsulated add/remove/position logic into SceneUpdater class for better modularity and maintainability
* update format
* feat: add read_all_environment method and enhance planning prompt with scene info
  - Implemented read_all_environment to fetch and parse all environment data from Redis
  - Updated MASTER_PLANNING_PLANNING prompt to include scene_info and require task decomposition to consider available objects in the scene
* docs: add key configuration guidance for slaver robot
  - document required fields in master/slaver config.yaml including model.cloud_server, collaborator, profile, tool, and robot
  - provide examples for slaver.robot in both local and remote call modes
  - clarify default values and usage to help users correctly configure RoboOS
* update commit id
1 parent 635f1c9 commit ee1e0d4
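The headline change of this commit, updating the shared scene according to the predicted action type (add/remove/position) via a `SceneUpdater` class, lives in files not visible in the diffs below. Purely as an illustration of that dispatch, a minimal sketch might look like the following; the class shape, method names, and scene layout here are assumptions, not the repository's actual implementation.

```python
from typing import Any, Dict


class SceneUpdater:
    """Apply predicted actions (add / remove / position) to a scene dict (sketch only)."""

    def __init__(self, scene: Dict[str, Dict[str, Any]]):
        # Hypothetical layout: {"apple": {"position": [x, y, z]}, ...}
        self.scene = scene

    def apply(self, action: Dict[str, Any]) -> None:
        kind = action.get("type")
        name = action.get("object")
        if kind == "add":
            # New object enters the scene with its predicted position.
            self.scene[name] = {"position": action.get("position")}
        elif kind == "remove":
            # Object leaves the scene; ignore if it was never tracked.
            self.scene.pop(name, None)
        elif kind == "position":
            # Existing object moved; only its position is refreshed.
            self.scene.setdefault(name, {})["position"] = action.get("position")
        else:
            raise ValueError(f"Unknown action type: {kind}")


updater = SceneUpdater({"apple": {"position": [0.0, 0.0, 0.0]}})
updater.apply({"type": "position", "object": "apple", "position": [1.0, 0.5, 0.0]})
print(updater.scene)
```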

File tree

14 files changed: +824 additions, -362 deletions


README.md

Lines changed: 118 additions & 2 deletions
@@ -17,6 +17,12 @@ RoboOS: A Hierarchical Embodied Framework for Cross-Embodiment and Multi-Agent C
 </a>&nbsp&nbsp🤖 <a href="https://github.com/FlagOpen/RoboBrain/">RoboBrain 1.0</a>: A Unified Brain Model for Robotic Manipulation from Abstract to Concrete.
 </p>
 
+💬 If you have any questions, feel free to contact us via WeChat.
+<p align="center">
+<img src="./assets/wechat.png" width="300" alt="WeChat QR Code">
+</p>
+
+
 ## 🔥 Overview
 The rise of embodied intelligence has intensified the need for robust multi-agent collaboration in industrial automation, service robotics, and smart manufacturing. However, current robotic systems struggle with critical limitations, including poor cross-embodiment adaptability, inefficient task scheduling, and inadequate dynamic error correction. While end-to-end vision-language-action (VLA) models (e.g., OpenVLA, RDT, Pi-0) exhibit weak long-horizon planning and task generalization, hierarchical VLA models (e.g., Helix, Gemini-Robotics, GR00T-N1) lack cross-embodiment compatibility and multi-agent coordination capabilities.
 To address these challenges, we present **RoboOS**, the first open-source embodied operating system based on a *Brain-Cerebellum* hierarchical architecture, facilitating a paradigm shift from single-agent to swarm intelligence. Specifically, RoboOS comprises three key components: **(1) the Embodied Cloud Model**, a multimodal large language model (MLLM) for global perception and high-level decision-making; **(2) the Cerebellum Skill Library**, a modular, plug-and-play toolkit for seamless multi-skill execution; and **(3) Real-Time Shared Memory**, a spatiotemporal synchronization mechanism for multi-agent state coordination. By integrating hierarchical information flow, RoboOS bridges the Embodied Brain and Cerebellum Skill Library, enabling robust planning, scheduling, and error correction for long-horizon tasks while ensuring efficient multi-agent collaboration by Real-Time Shared Memory. Moreover, we optimize edge-cloud communication and cloud-based distributed inference to support high-frequency interactions and scalable deployment.
@@ -68,7 +74,7 @@ docker run -itd \
 --shm-size=500g \
 --name agent \
 --hostname flagscale-agent \
--v {your_local_path}/BAAI/RoboBrain2.0-7B:/path/in/container \
+-v {your_local_path}/BAAI/RoboBrain2.0-7B:/workspace/RoboBrain2.0-7B \
 --network=host \
 -p 8888:8888 \
 -w /workspace/RoboOS \
@@ -100,7 +106,7 @@ pip install -r requirements.txt
 
 git clone https://github.com/FlagOpen/FlagScale
 cd FlagScale
-git checkout a0687db035ba1d9c7b2661d8142ee4e8348b1459
+git checkout 3fc2037f90917227bd4aebabd9d7b330523f437c
 
 # Install in editable mode with PYTHONPATH
 PYTHONPATH=./:$PYTHONPATH pip install . --verbose --no-build-isolation
@@ -146,6 +152,116 @@ python skill.py
 Visit the web UI at http://127.0.0.1:8888 and follow the on-screen instructions to complete configuration.
 Once finished, you can control the robot and trigger skills from the interface.
 
+
+### ⚡️ 5. Start vLLM Model Service
+
+RoboOS requires a large language model backend to handle reasoning and tool calls.
+We recommend using **vLLM** to serve the [RoboBrain2.0-7B](https://www.modelscope.cn/models/BAAI/RoboBrain2.0-7B/summary) model.
+
+
+#### 5.1 Install vLLM
+
+```bash
+pip install vllm
+```
+
+#### 5.2 Prepare Chat Template
+The tool_chat_template_hermes.jinja file must be provided for tool-call parsing.
+Place it in the following directory:
+
+```text
+RoboOS/deploy/templates/tool_chat_template_hermes.jinja
+```
+#### 5.3 Launch vLLM
+Run the following command to start the model service:
+
+```bash
+vllm serve RoboBrain2.0-7B \
+--gpu-memory-utilization=0.9 \
+--max-model-len=10000 \
+--max-num-seqs=256 \
+--port=4567 \
+--trust-remote-code \
+--enable-chunked-prefill \
+--enable-auto-tool-choice \
+--tool-call-parser hermes \
+--chat-template RoboOS/deploy/templates/tool_chat_template_hermes.jinja
+```
+
+### ⚙️ 6. Master & Slaver Configuration
+Before running the system, you need to configure both the **master** and **slaver** agents.
+Each agent requires a `config.yaml` file to define model connection, audio, and logging settings.
+
+#### 6.1 Configuration Files
+- `master/config.yaml`
+- `slaver/config.yaml`
+
+A default template is provided below (you may adjust it according to your environment):
+
+```yaml
+
+
+# Cloud Server (vLLM) Model Parameters
+model:
+  model_select: "/workspace/model/BAAI/RoboBrain2.0-7B"
+  model_retry_planning: 5
+  model_dict:
+    cloud_model: "/workspace/model/BAAI/RoboBrain2.0-7B"
+    cloud_type: "default"
+    cloud_api_key: "EMPTY"
+    cloud_server: "http://localhost:4567/v1/"
+  max_chat_message: 50
+
+# Redis Collaborator
+collaborator:
+  host: "127.0.0.1"
+  port: 6379
+  db: 0
+  clear: true
+  password: ""
+
+# Slaver Robot
+robot:
+  # "local" with a folder name such as "demo_robot"
+  # "remote" with a URL such as "http://127.0.0.1:8000", and run the Python script 'skill.py' on the robot itself.
+  # call_type: local
+  # path: "demo_robot_local"
+  name: demo_robot
+  call_type: remote
+  path: "http://127.0.0.1:8000"
+
+# Master Scene profile
+profile:
+  path: ./scene/profile.yaml
+
+# Slaver
+tool:
+  # Whether the model has undergone targeted training on tool_calls
+  support_tool_calls: false
+
+```
+
+
+#### 6.2 Key Parameters
+
++ model.cloud_server:
+  Must point to your vLLM service (default: http://localhost:4567/v1/)
+
++ collaborator:
+  Redis server configuration (default: 127.0.0.1:6379)
+
++ profile:
+  Path to the scene profile YAML file that defines environment and task settings (e.g., ./scene/profile.yaml)
+
++ tool:
+  Enable or disable tool-call support. Set `support_tool_calls: true` if your model has been trained for tool calls
++ robot:
+  Two modes of calling robot tools
+
+
+⚠️ Make sure these fields are correctly configured; otherwise, RoboOS may fail to connect to vLLM, Redis, or load scene/tool profiles.
+
+
 ## 🔧 Manual Deployment (Advanced)
 If you prefer to manually run RoboOS without using the deployment web UI, follow the steps below to start the system components directly from source.
 
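Once the vLLM service from section 5 and the `config.yaml` from section 6 are in place, a quick way to confirm that Hermes tool-call parsing works is to send one tool-enabled request through the OpenAI-compatible endpoint. This is only a sketch: the model name must match whatever was passed to `vllm serve`, and the `navigate_to` tool here is a made-up example rather than an actual RoboOS skill.

```python
from openai import OpenAI

# Endpoint and API key follow the config.yaml defaults above.
client = OpenAI(base_url="http://localhost:4567/v1", api_key="EMPTY")

tools = [{
    "type": "function",
    "function": {
        "name": "navigate_to",  # illustrative tool, not a RoboOS skill
        "description": "Navigate the robot to a named location.",
        "parameters": {
            "type": "object",
            "properties": {
                "location": {"type": "string", "description": "Target location"},
            },
            "required": ["location"],
        },
    },
}]

resp = client.chat.completions.create(
    model="RoboBrain2.0-7B",  # must match the name passed to `vllm serve`
    messages=[{"role": "user", "content": "Go to the kitchen table."}],
    tools=tools,
)
print(resp.choices[0].message.tool_calls)
```

If `--tool-call-parser hermes` and the chat template are wired up correctly, `tool_calls` is populated instead of raw `<tool_call>` text appearing in `message.content`.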
assets/wechat.png

205 KB
Lines changed: 130 additions & 0 deletions
@@ -0,0 +1,130 @@
+{%- macro json_to_python_type(json_spec) %}
+{%- set basic_type_map = {
+    "string": "str",
+    "number": "float",
+    "integer": "int",
+    "boolean": "bool"
+} %}
+
+{%- if basic_type_map[json_spec.type] is defined %}
+{{- basic_type_map[json_spec.type] }}
+{%- elif json_spec.type == "array" %}
+{{- "list[" + json_to_python_type(json_spec|items) + "]" }}
+{%- elif json_spec.type == "object" %}
+{%- if json_spec.additionalProperties is defined %}
+{{- "dict[str, " + json_to_python_type(json_spec.additionalProperties) + ']' }}
+{%- else %}
+{{- "dict" }}
+{%- endif %}
+{%- elif json_spec.type is iterable %}
+{{- "Union[" }}
+{%- for t in json_spec.type %}
+{{- json_to_python_type({"type": t}) }}
+{%- if not loop.last %}
+{{- "," }}
+{%- endif %}
+{%- endfor %}
+{{- "]" }}
+{%- else %}
+{{- "Any" }}
+{%- endif %}
+{%- endmacro %}
+
+
+{{- bos_token }}
+{{- "<|im_start|>system\nYou are a function calling AI model. You are provided with function signatures within <tools></tools> XML tags. You may call one or more functions to assist with the user query. Don't make assumptions about what values to plug into functions. Here are the available tools: <tools> " }}
+{%- if tools is iterable and tools | length > 0 %}
+{%- for tool in tools %}
+{%- if tool.function is defined %}
+{%- set tool = tool.function %}
+{%- endif %}
+{{- '{"type": "function", "function": ' }}
+{{- '{"name": "' + tool.name + '", ' }}
+{{- '"description": "' + tool.name + '(' }}
+{%- for param_name, param_fields in tool.parameters.properties|items %}
+{{- param_name + ": " + json_to_python_type(param_fields) }}
+{%- if not loop.last %}
+{{- ", " }}
+{%- endif %}
+{%- endfor %}
+{{- ")" }}
+{%- if tool.return is defined %}
+{{- " -> " + json_to_python_type(tool.return) }}
+{%- endif %}
+{{- " - " + tool.description + "\n\n" }}
+{%- for param_name, param_fields in tool.parameters.properties|items %}
+{%- if loop.first %}
+{{- " Args:\n" }}
+{%- endif %}
+{{- " " + param_name + "(" + json_to_python_type(param_fields) + "): " + param_fields.description|trim }}
+{%- endfor %}
+{%- if tool.return is defined and tool.return.description is defined %}
+{{- "\n Returns:\n " + tool.return.description }}
+{%- endif %}
+{{- '"' }}
+{{- ', "parameters": ' }}
+{%- if tool.parameters.properties | length == 0 %}
+{{- "{}" }}
+{%- else %}
+{{- tool.parameters|tojson }}
+{%- endif %}
+{{- "}" }}
+{%- if not loop.last %}
+{{- "\n" }}
+{%- endif %}
+{%- endfor %}
+{%- endif %}
+{{- " </tools>" }}
+{{- 'Use the following pydantic model json schema for each tool call you will make: {"properties": {"name": {"title": "Name", "type": "string"}, "arguments": {"title": "Arguments", "type": "object"}}, "required": ["name", "arguments"], "title": "FunctionCall", "type": "object"}}
+' }}
+{{- "For each function call return a json object with function name and arguments within <tool_call></tool_call> XML tags as follows:
+" }}
+{{- "<tool_call>
+" }}
+{{- '{"name": <function-name>, "arguments": <args-dict>}
+' }}
+{{- '</tool_call><|im_end|>' }}
+{%- for message in messages %}
+{%- if message.role == "user" or message.role == "system" or (message.role == "assistant" and message.tool_calls is not defined) %}
+{{- '<|im_start|>' + message.role + '\n' + message.content + '<|im_end|>' + '\n' }}
+{%- elif message.role == "assistant" and message.tool_calls is defined %}
+{{- '<|im_start|>' + message.role }}
+{%- for tool_call in message.tool_calls %}
+{{- '\n<tool_call>\n' }}
+{%- if tool_call.function is defined %}
+{%- set tool_call = tool_call.function %}
+{%- endif %}
+{{- '{' }}
+{{- '"name": "' }}
+{{- tool_call.name }}
+{{- '"' }}
+{%- if tool_call.arguments is defined %}
+{{- ', ' }}
+{{- '"arguments": ' }}
+{{- tool_call.arguments|tojson }}
+{%- endif %}
+{{- '}' }}
+{{- '\n</tool_call>' }}
+{%- endfor %}
+{{- '<|im_end|>\n' }}
+{%- elif message.role == "tool" %}
+{%- if loop.previtem and loop.previtem.role != "tool" %}
+{{- '<|im_start|>tool\n' }}
+{%- endif %}
+{{- '<tool_response>\n' }}
+{{- message.content }}
+{%- if not loop.last %}
+{{- '\n</tool_response>\n' }}
+{%- else %}
+{{- '\n</tool_response>' }}
+{%- endif %}
+{%- if not loop.last and loop.nextitem.role != "tool" %}
+{{- '<|im_end|>' }}
+{%- elif loop.last %}
+{{- '<|im_end|>' }}
+{%- endif %}
+{%- endif %}
+{%- endfor %}
+{%- if add_generation_prompt %}
+{{- '<|im_start|>assistant\n' }}
+{%- endif %}
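The file above is a Jinja chat template; vLLM (and `transformers`) render it to build the final prompt string. A small offline render like the one below can help when debugging formatting. It assumes the file is saved under the path from the README, uses a placeholder `<s>` BOS token and a made-up tool, and needs Jinja2 3.1 or newer for the `items` filter used in the template.

```python
from jinja2 import Environment

# Path assumed from the README section 5.2; adjust to wherever the template lives.
with open("RoboOS/deploy/templates/tool_chat_template_hermes.jinja", encoding="utf-8") as f:
    template = Environment().from_string(f.read())

tools = [{
    "type": "function",
    "function": {
        "name": "move_object",  # illustrative tool definition
        "description": "Move an object to a target location.",
        "parameters": {
            "type": "object",
            "properties": {
                "object": {"type": "string", "description": "Object to move"},
                "target": {"type": "string", "description": "Target location"},
            },
            "required": ["object", "target"],
        },
    },
}]
messages = [{"role": "user", "content": "Put the apple on the plate."}]

prompt = template.render(
    messages=messages,
    tools=tools,
    bos_token="<s>",              # placeholder; use the tokenizer's real BOS token
    add_generation_prompt=True,   # append the assistant header, as vLLM does at inference
)
print(prompt)
```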

master/agents/agent.py

Lines changed: 22 additions & 3 deletions
@@ -22,6 +22,8 @@ def __init__(self, config_path="config.yaml"):
 
         self.logger.info(f"Configuration loaded from {config_path} ...")
         self.logger.info(f"Master Configuration:\n{self.config}")
+
+        self._init_scene(self.config["profile"])
         self._start_listener()
 
     def _init_logger(self, logger_config):
@@ -54,9 +56,26 @@ def _init_config(self, config_path="config.yaml"):
         with open(config_path, "r", encoding="utf-8") as f:
             self.config = yaml.safe_load(f)
 
+    def _init_scene(self, scene_config):
+        """Initialize scene object"""
+        path = scene_config["path"]
+        if not os.path.exists(path):
+            self.logger.error(f"Scene config file {path} does not exist.")
+            raise FileNotFoundError(f"Scene config file {path} not found.")
+        with open(path, "r", encoding="utf-8") as f:
+            self.scene = yaml.safe_load(f)
+
+        scenes = self.scene.get("scene", [])
+        for scene_info in scenes:
+            scene_name = scene_info.pop("name", None)
+            if scene_name:
+                self.collaborator.record_environment(scene_name, json.dumps(scene_info))
+            else:
+                print("Warning: Missing 'name' in scene_info:", scene_info)
+
     def _handle_register(self, robot_name: Dict) -> None:
         """Listen for robot registrations."""
-        robot_info = self.collaborator.retrieve_agent(robot_name)
+        robot_info = self.collaborator.read_agent_info(robot_name)
         self.logger.info(
             f"AGENT_REGISTRATION: {robot_name} \n {json.dumps(robot_info)}"
         )
@@ -170,8 +189,8 @@ def reasoning_and_subtasks_is_right(self, reasoning_and_subtasks: dict) -> bool:
             if isinstance(subtask, dict) and "robot_name" in subtask
         }
 
-        # Retrieve list of all registered robots from the collaborator
-        robots_list = set(self.collaborator.retrieve_all_agents_name())
+        # Read list of all registered robots from the collaborator
+        robots_list = set(self.collaborator.read_all_agents_name())
 
         # Check if all workers are registered
         return worker_list.issubset(robots_list)
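`_init_scene` above expects the profile referenced by `profile.path` to contain a top-level `scene` list whose entries each carry a `name` plus arbitrary descriptive fields. The actual `scene/profile.yaml` is not part of this diff, so the example below only mirrors what the parsing loop requires; field names such as `objects` and `position` are illustrative assumptions.

```python
import json
import yaml

# Hypothetical profile matching the shape _init_scene parses.
example_profile = """
scene:
  - name: kitchen_table
    objects: [apple, plate]
    position: [1.2, 0.5, 0.0]
  - name: shelf
    objects: [cup]
"""

scene = yaml.safe_load(example_profile)
for scene_info in scene.get("scene", []):
    scene_name = scene_info.pop("name", None)
    if scene_name:
        # The real code forwards this pair to collaborator.record_environment(...)
        print(scene_name, "->", json.dumps(scene_info))
```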

master/agents/planner.py

Lines changed: 4 additions & 3 deletions
@@ -67,11 +67,12 @@ def display_profiling_info(self, description: str, message: any):
     def forward(self, task: str) -> str:
         """Get the sub-tasks from the task."""
 
-        all_robots_name = self.collaborator.retrieve_all_agents_name()
-        all_robots_info = self.collaborator.retrieve_all_agents()
+        all_robots_name = self.collaborator.read_all_agents_name()
+        all_robots_info = self.collaborator.read_all_agents_info()
+        all_environments_info = self.collaborator.read_environment()
 
         content = MASTER_PLANNING_PLANNING.format(
-            robot_name_list=all_robots_name, robot_tools_info=all_robots_info, task=task
+            robot_name_list=all_robots_name, robot_tools_info=all_robots_info, task=task, scene_info=all_environments_info
         )
 
         messages = [
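The `Collaborator` methods called here (`read_all_agents_name`, `read_all_agents_info`, `read_environment`) are not shown in this commit. Given the Redis settings in `config.yaml` and the `record_environment` call in `agent.py`, the shared-memory side plausibly reduces to a Redis hash per category; the sketch below is an assumption about that shape (key names and signatures invented for illustration), not the project's real `Collaborator`.

```python
import json
import redis

# Connection settings mirror the collaborator block in config.yaml.
r = redis.Redis(host="127.0.0.1", port=6379, db=0, decode_responses=True)

def record_environment(name: str, info_json: str) -> None:
    # One hash field per scene element, as written by Agent._init_scene.
    r.hset("environment", name, info_json)

def read_environment() -> dict:
    # Returns {scene_name: scene_info_dict}, suitable for the {scene_info}
    # slot in MASTER_PLANNING_PLANNING.
    return {k: json.loads(v) for k, v in r.hgetall("environment").items()}
```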

master/agents/prompts.py

Lines changed: 27 additions & 24 deletions
@@ -1,24 +1,27 @@
-MASTER_PLANNING_PLANNING = """
-
-Please only use {robot_name_list} with skills {robot_tools_info}.
-Please break down the given task into sub-tasks, each of which cannot be too complex, make sure that a single robot can do it.
-It can't be too simple either, e.g. it can't be a sub-task that can be done by a single step robot tool.
-Each sub-task in the output needs a concise name of the sub-task, which includes the robots that need to complete the sub-task.
-Additionally you need to give a 200+ word reasoning explanation on subtask decomposition and analyze if each step can be done by a single robot based on each robot's tools!
-
-## The output format is as follows, in the form of a JSON structure:
-{{
-    "reasoning_explanation": xxx,
-    "subtask_list": [
-        {{"robot_name": xxx, "subtask": xxx, "subtask_order": xxx}},
-        {{"robot_name": xxx, "subtask": xxx, "subtask_order": xxx}},
-        {{"robot_name": xxx, "subtask": xxx, "subtask_order": xxx}},
-    ]
-}}
-
-## Note: 'subtask_order' means the order of the sub-task.
-If the tasks are not sequential, please set the same 'task_order' for the same task. For example, if two robots are assigned to the two tasks, both of which are independance, they should share the same 'task_order'.
-If the tasks are sequential, the 'task_order' should be set in the order of execution. For example, if the task_2 should be started after task_1, they should have different 'task_order'.
-
-# The task to be completed is: {task}. Your output answer:
-"""
+MASTER_PLANNING_PLANNING = """
+
+Please only use {robot_name_list} with skills {robot_tools_info}.
+You must also consider the following scene information when decomposing the task:
+{scene_info}
+
+Please break down the given task into sub-tasks, each of which cannot be too complex, make sure that a single robot can do it.
+It can't be too simple either, e.g. it can't be a sub-task that can be done by a single step robot tool.
+Each sub-task in the output needs a concise name of the sub-task, which includes the robots that need to complete the sub-task.
+Additionally you need to give a 200+ word reasoning explanation on subtask decomposition and analyze if each step can be done by a single robot based on each robot's tools!
+
+## The output format is as follows, in the form of a JSON structure:
+{{
+    "reasoning_explanation": xxx,
+    "subtask_list": [
+        {{"robot_name": xxx, "subtask": xxx, "subtask_order": xxx}},
+        {{"robot_name": xxx, "subtask": xxx, "subtask_order": xxx}},
+        {{"robot_name": xxx, "subtask": xxx, "subtask_order": xxx}},
+    ]
+}}
+
+## Note: 'subtask_order' means the order of the sub-task.
+If the tasks are not sequential, please set the same 'task_order' for the same task. For example, if two robots are assigned to the two tasks, both of which are independance, they should share the same 'task_order'.
+If the tasks are sequential, the 'task_order' should be set in the order of execution. For example, if the task_2 should be started after task_1, they should have different 'task_order'.
+
+# The task to be completed is: {task}. Your output answer:
+"""
