haosulab · StoneT2000 · Oct 26, 2024 · Oct 25, 2024 · Oct 25, 2024 · Oct 25, 2024
diff --git a/docs/source/user_guide/tutorials/custom_tasks/intro.md b/docs/source/user_guide/tutorials/custom_tasks/intro.md
@@ -49,11 +49,11 @@ class PushCubeEnv(BaseEnv):
 
 At the start of any task, you must load in all objects (robots, assets, articulations, lighting etc.) into each parallel environment, also known as a sub-scene. This is also known as **reconfiguration** and generally only ever occurs once. Loading these objects is done in the `_load_scene` function of your custom task class. The objective is to simply load objects in with initial poses that ensure they don't collide on the first step, and nothing else. For GPU simulation at this stage you cannot change any object states (like velocities, qpos), only initial poses can be modified. Changing/randomizing states is done in the section on [episode initialization / randomization](#episode-initialization--randomization).
 
-Building objects in ManiSkill is nearly the exact same as it is in SAPIEN. You create an `ActorBuilder` via `self.scene.create_actor_builder` and via the actor builder add visual and collision shapes. Visual shapes only affect visual rendering processes while collision shapes affect the physical simulation. ManiSkill further will create the actor for you in every sub-scene (unless you use [scene-masks/scene-idxs](./advanced.md#scene-masks), a more advanced feature).
+Building objects in ManiSkill is nearly the exact same as it is in SAPIEN. You create an `ActorBuilder` via `self.scene.create_actor_builder` and via the actor builder add visual and collision shapes. Visual shapes only affect visual rendering processes while collision shapes affect the physical simulation. ManiSkill further will create the actor for you in every sub-scene (unless you use [scene-masks/scene-idxs](./advanced.md#scene-masks), a more advanced feature for enabling heterogeneous simulation).
 
 ### Building Robots
 
-This is the simplest part and requires almost no additional work here. Robots are added in for you automatically and have their base initialized at 0. You can specify the default robot(s) added in via the init function. In PushCube this is done as so by adding `SUPPORTED_ROBOTS` to ensure users can only run your task with the selected robots. You can further add typing if you wish to the `agent` class attribute. 
+This is the simplest part and requires almost no additional work here. Robots are added in for you automatically and by default are initialized at the origin. You can specify the default robot(s) added in via the init function. In PushCube this is done as so by adding `SUPPORTED_ROBOTS` to ensure users can only run your task with the selected robots. You can further add typing if you wish to the `agent` class attribute. 
 
 ```python
 from mani_skill.agents.robots import Fetch, Panda
@@ -68,15 +68,20 @@ class PushCubeEnv(BaseEnv):
         # robot_uids="fetch" is possible, or even multi-robot 
         # setups via robot_uids=("fetch", "panda")
         super().__init__(*args, robot_uids=robot_uids, **kwargs)
+
+    def _load_agent(self, options: dict):
+        super()._load_agent(options, sapien.Pose(p=[0, 0, 1]))
 ```
+To define the initial pose of the robot you ovverride the `_load_agent` function. This is done in PushCube above. It is recommended to set initial poses for all objects such that if they were spawned there they don't intersect other objects. Here we spawn the robot 1 meter above the ground which won't clash with anything else.
+
 
-Initializing these robots occurs in the initialization / randomization section covered later. With this setup you can later access agent data via `self.agent` and the specific articulation data of the robot via `self.agent.robot`. For multi-robot setups you can access each agent via `self.agent.agents`.
+Initializing/randomizing these robots occurs in the initialization / randomization section covered later. With this setup you can later access agent data via `self.agent` and the specific articulation data of the robot via `self.agent.robot`. For multi-robot setups you can access each agent via `self.agent.agents`.
 
 To create your own custom robots/agents, see the [custom robots tutorial](../custom_robots.md).
 
 ### Building Actors
 
-the `_load_scene` function must be implemented to build objects besides agents. It is also given an `options` dictionary which is the same options dictionary passed to `env.reset` and defaults to an empty dictionary (which may be useful for controlling how to load a scene with just reset arguments).
+The `_load_scene` function must be implemented to build objects besides agents. It is also given an `options` dictionary which is the same options dictionary passed to `env.reset` and defaults to an empty dictionary (which may be useful for controlling how to load a scene with just reset arguments).
 
 Building a **dynamic** actor like a cube in PushCube is done as so
 ```python
@@ -96,8 +101,8 @@ class PushCubeEnv(BaseEnv):
                 base_color=[1, 0, 0, 1],
             ),
         )
-        # optionally set an initial pose so that it initially spawns there
-        # self.builder.initial_pose = sapien.Pose(p=[1, 1, 1], q=[1, 0, 0, 0])
+        # strongly recommended to set initial poses for objects, even if you plan to modify them later
+        builder.initial_pose = sapien.Pose(p=[0, 0, 0.02], q=[1, 0, 0, 0])
         self.obj = builder.build(name="cube")
         # PushCube has some other code after this removed for brevity that 
         # spawns a goal object (a red/white target) stored at self.goal_region
@@ -109,8 +114,7 @@ You can build a **kinematic** actor with `builder.build_kinematic` and a **stati
 - Kinematic actors can have their pose changed at any time. Static actors must have an initial pose set before calling `build_static` via `builder.initial_pose = ...`
 - Use static instead of kinematic whenever possible as it saves a lot of GPU memory. Static objects must have an initial pose set before building via `builder.initial_pose = ...`.
 
-Note that by default, if an object does not have an initial pose set in its builder, ManiSkill tries its best to choose initial poses of objects such that they are unlikely to collide with each other during GPU simulation initialization (which uses the initial poses of the builders). These poses are definitely not what you want for task initialization but need to be far apart initially to ensure stable GPU simulation.
-
+Note that by default, if an object does not have an initial pose set in its builder, ManiSkill will set it to a default pose of `q=[1,0,0,0], p=[0,0,0]` and give a warning. For simple tasks this may not matter but when working with more complex objects and articulations, it is strongly recommended to set initial poses for all objects as GPU simulation might run into bugs/issues if the objects at their initial poses collide.
 
 We also provide some functions that build some more complex shapes that you can use by importing the following:
 ```python
@@ -144,7 +148,7 @@ We recommend you to first complete this tutorial before moving onto the next.
 
 ## Episode Initialization / Randomization
 
-Task initialization and randomization is handled in the `_initialize_episode` function and is called whenever `env.reset` is called. The objective here is to set the initial states of all non-static objects, including the robot. Note that objects that do not have initial poses set when building or set during episode initialization won't necessarily spawn at the origin.
+Task initialization and randomization is handled in the `_initialize_episode` function and is called whenever `env.reset` is called. The objective here is to set the initial states of all non-static objects, including the robot.
 
 As the task ideally should be simulatable on the GPU, batched code is unavoidable. Note that furthermore, by default everything in ManiSkill tries to stay batched, even if there is only one element. Finally, like `_load_scene` the options argument is also passed down here if needed.
 

diff --git a/docs/source/user_guide/tutorials/custom_tasks/loading_objects.md b/docs/source/user_guide/tutorials/custom_tasks/loading_objects.md
@@ -11,12 +11,15 @@ ManiSkill provides two ways to load actors, loading directly from existing simul
 ManiSkill supports easily loading assets from existing datasets such as the YCB dataset. In the beta release this is the only asset database available, more will be provided once we finish integrating a 3D asset database system.
 
 ```python
+import sapien
 from mani_skill.utils.building import actors
 def _load_scene(self, options):
     builder = actors.get_actor_builder(
         self.scene,
         id=f"ycb:{model_id}",
     )
+    # choose a reasonable initial pose that doesn't intersect other objects
+    builder.inital_pose = sapien.Pose(p=[0, 0, 0.5])
     builder.build(name="object")
 ```
 
@@ -45,6 +48,9 @@ def _load_scene(self, options):
     builder = articulations.get_articulation_builder(
         self.scene, f"partnet-mobility:{model_id}"
     )
+    # choose a reasonable initial pose that doesn't intersect other objects
+    # this matters a lot for articulations in GPU sim or else simulation bugs can occur
+    builder.inital_pose = sapien.Pose(p=[0, 0, 0.5])
     builder.build(name="object")
 ```
 
@@ -71,6 +77,9 @@ def _load_scene(self, options):
     # actors and cameras but we only use the articulations
     articulation_builders = loader.parse(str(urdf_path))["articulation_builders"]
     builder = articulation_builders[0]
+    # choose a reasonable initial pose that doesn't intersect other objects
+    # this matters a lot for articulations in GPU sim or else simulation bugs can occur
+    builder.initial_pose = sapien.Pose(p=[0, 0, 0.5])
     builder.build(name="my_articulation")
 ```
 

diff --git a/mani_skill/agents/base_agent.py b/mani_skill/agents/base_agent.py
@@ -18,6 +18,7 @@
 from mani_skill.sensors.base_sensor import BaseSensor, BaseSensorConfig
 from mani_skill.utils import assets, download_asset, sapien_utils
 from mani_skill.utils.structs import Actor, Array, Articulation
+from mani_skill.utils.structs.pose import Pose
 
 from .controllers.base_controller import (
     BaseController,
@@ -51,6 +52,8 @@ class BaseAgent:
         control_mode (str | None): uid of controller to use
         fix_root_link (bool): whether to fix the robot root link
         agent_idx (str | None): an index for this agent in a multi-agent task setup If None, the task should be single-agent
+        initial_pose (sapien.Pose | Pose | None): the initial pose of the robot. Important to set for GPU simulation to ensure robot
+        does not collide with other objects in the scene during GPU initialization which occurs before `env._initialize_episode` is called
     """
 
     uid: str
@@ -81,6 +84,7 @@ def __init__(
         control_freq: int,
         control_mode: Optional[str] = None,
         agent_idx: Optional[str] = None,
+        initial_pose: Optional[Union[sapien.Pose, Pose]] = None,
     ):
         self.scene = scene
         self._control_freq = control_freq
@@ -93,7 +97,7 @@ def __init__(
         self.sensors: Dict[str, BaseSensor] = dict()
         """The sensors that come with the robot."""
 
-        self._load_articulation()
+        self._load_articulation(initial_pose)
         self._after_loading_articulation()
 
         # Controller
@@ -141,7 +145,9 @@ def _controller_configs(
     def device(self):
         return self.scene.device
 
-    def _load_articulation(self):
+    def _load_articulation(
+        self, initial_pose: Optional[Union[sapien.Pose, Pose]] = None
+    ):
         """
         Loads the robot articulation
         """
@@ -185,7 +191,9 @@ def _load_articulation(self):
                     f"Exiting as assets for robot {self.uid} are not found. Check that this agent is properly registered with the appropriate download asset ids"
                 )
                 exit()
-        self.robot: Articulation = loader.load(asset_path)
+        builder = loader.parse(asset_path)["articulation_builders"][0]
+        builder.initial_pose = initial_pose
+        self.robot = builder.build()
         assert self.robot is not None, f"Fail to load URDF/MJCF from {asset_path}"
 
         # Cache robot link names

diff --git a/mani_skill/envs/minimal_template.py b/mani_skill/envs/minimal_template.py
@@ -1,6 +1,7 @@
 from typing import Any, Dict, Union
 
 import numpy as np
+import sapien
 import torch
 
 from mani_skill.agents.robots import Fetch, Panda
@@ -37,6 +38,9 @@ def _default_human_render_camera_configs(self):
         pose = sapien_utils.look_at([0.6, 0.7, 0.6], [0.0, 0.0, 0.35])
         return CameraConfig("render_camera", pose=pose, width=512, height=512, fov=1)
 
+    def _load_agent(self, options: dict):
+        super()._load_agent(options, sapien.Pose(p=[0, 0, 0]))
+
     def _load_scene(self, options: dict):
         pass
 

diff --git a/mani_skill/envs/sapien_env.py b/mani_skill/envs/sapien_env.py
@@ -1,6 +1,5 @@
 import copy
 import gc
-import logging
 import os
 from functools import cached_property
 from typing import Any, Dict, List, Optional, Sequence, Tuple, Union
@@ -35,6 +34,7 @@
 from mani_skill.sensors.depth_camera import StereoDepthCamera, StereoDepthCameraConfig
 from mani_skill.utils import common, gym_utils, sapien_utils
 from mani_skill.utils.structs import Actor, Articulation
+from mani_skill.utils.structs.pose import Pose
 from mani_skill.utils.structs.types import Array, SimConfig
 from mani_skill.utils.visualization.misc import tile_images
 from mani_skill.viewer import create_viewer
@@ -197,7 +197,7 @@ def __init__(
         self.reconfiguration_freq = reconfiguration_freq if reconfiguration_freq is not None else 0
         self._reconfig_counter = 0
         if shader_dir is not None:
-            logging.warn("shader_dir argument will be deprecated after ManiSkill v3.0.0 official release. Please use sensor_configs/human_render_camera_configs to set shaders.")
+            logger.warn("shader_dir argument will be deprecated after ManiSkill v3.0.0 official release. Please use sensor_configs/human_render_camera_configs to set shaders.")
             sensor_configs |= dict(shader_pack=shader_dir)
             human_render_camera_configs |= dict(shader_pack=shader_dir)
             viewer_camera_configs |= dict(shader_pack=shader_dir)
@@ -379,9 +379,11 @@ def gpu_sim_enabled(self):
     @property
     def _default_sim_config(self):
         return SimConfig()
-    def _load_agent(self, options: dict):
+    def _load_agent(self, options: dict, initial_agent_poses: Optional[Union[sapien.Pose, Pose]] = None):
         agents = []
         robot_uids = self.robot_uids
+        if not isinstance(initial_agent_poses, list):
+            initial_agent_poses = [initial_agent_poses]
         if robot_uids == "none" or robot_uids == ("none", ):
             self.agent = None
             return
@@ -402,6 +404,7 @@ def _load_agent(self, options: dict):
                     self._control_freq,
                     self._control_mode,
                     agent_idx=i if len(robot_uids) > 1 else None,
+                    initial_pose=initial_agent_poses[i] if initial_agent_poses is not None else None,
                 )
                 agents.append(agent)
         if len(agents) == 1:
@@ -640,7 +643,9 @@ def _reconfigure(self, options = dict()):
         self._clear()
         # load everything into the scene first before initializing anything
         self._setup_scene()
+
         self._load_agent(options)
+
         self._load_scene(options)
         self._load_lighting(options)
 

diff --git a/mani_skill/envs/tasks/control/ant.py b/mani_skill/envs/tasks/control/ant.py
@@ -1,5 +1,5 @@
 import os
-from typing import Any, Dict, Union
+from typing import Any, Dict, Optional, Union
 
 import numpy as np
 import sapien
@@ -59,7 +59,9 @@ def _controller_configs(self):
             )
         )
 
-    def _load_articulation(self):
+    def _load_articulation(
+        self, initial_pose: Optional[Union[sapien.Pose, Pose]] = None
+    ):
         """
         Load the robot articulation
         """
@@ -68,7 +70,9 @@ def _load_articulation(self):
 
         loader.name = self.uid
 
-        self.robot = loader.parse(asset_path)["articulation_builders"][0].build()
+        builder = loader.parse(asset_path)["articulation_builders"][0]
+        builder.initial_pose = initial_pose
+        self.robot = builder.build()
         assert self.robot is not None, f"Fail to load URDF/MJCF from {asset_path}"
         self.robot_link_ids = [link.name for link in self.robot.get_links()]
 

diff --git a/mani_skill/envs/tasks/control/cartpole.py b/mani_skill/envs/tasks/control/cartpole.py
@@ -1,7 +1,7 @@
 """Adapted from https://github.com/google-deepmind/dm_control/blob/main/dm_control/suite/cartpole.py"""
 
 import os
-from typing import Any, Dict, Union
+from typing import Any, Dict, Optional, Union
 
 import numpy as np
 import sapien
@@ -50,7 +50,9 @@ def _controller_configs(self):
             )
         )
 
-    def _load_articulation(self):
+    def _load_articulation(
+        self, initial_pose: Optional[Union[sapien.Pose, Pose]] = None
+    ):
         """
         Load the robot articulation
         """
@@ -60,7 +62,9 @@ def _load_articulation(self):
         loader.name = self.uid
 
         # only need the robot
-        self.robot = loader.parse(asset_path)["articulation_builders"][0].build()
+        builder = loader.parse(asset_path)["articulation_builders"][0]
+        builder.initial_pose = initial_pose
+        self.robot = builder.build()
         assert self.robot is not None, f"Fail to load URDF/MJCF from {asset_path}"
 
         # Cache robot link ids
@@ -111,17 +115,18 @@ def _load_scene(self, options: dict):
         loader = self.scene.create_mjcf_loader()
         actor_builders = loader.parse(MJCF_FILE)["actor_builders"]
         for a in actor_builders:
+            a.initial_pose = sapien.Pose()
             a.build(a.name)
 
         # background visual wall
         self.wall = self.scene.create_actor_builder()
         self.wall.add_box_visual(
             half_size=(1e-3, 20, 10),
-            pose=sapien.Pose(p=[0, 1, 1], q=euler2quat(0, 0, np.pi / 2)),
             material=sapien.render.RenderMaterial(
                 base_color=np.array([0.3, 0.3, 0.3, 1])
             ),
         )
+        self.wall.initial_pose = sapien.Pose(p=[0, 1, 1], q=euler2quat(0, 0, np.pi / 2))
         self.wall.build_static(name="wall")
 
     def evaluate(self):

diff --git a/mani_skill/envs/tasks/control/hopper.py b/mani_skill/envs/tasks/control/hopper.py
@@ -1,7 +1,7 @@
 """Adapted from https://github.com/google-deepmind/dm_control/blob/main/dm_control/suite/hopper.py"""
 
 import os
-from typing import Any, Dict, Union
+from typing import Any, Dict, Optional, Union
 
 import numpy as np
 import sapien
@@ -75,7 +75,9 @@ def _controller_configs(self):
             )
         )
 
-    def _load_articulation(self):
+    def _load_articulation(
+        self, initial_pose: Optional[Union[sapien.Pose, Pose]] = None
+    ):
         """
         Load the robot articulation
         """
@@ -84,7 +86,9 @@ def _load_articulation(self):
 
         loader.name = self.uid
 
-        self.robot = loader.parse(asset_path)["articulation_builders"][0].build()
+        builder = loader.parse(asset_path)["articulation_builders"][0]
+        builder.initial_pose = initial_pose
+        self.robot = builder.build()
         assert self.robot is not None, f"Fail to load URDF/MJCF from {asset_path}"
         self.robot_link_ids = [link.name for link in self.robot.get_links()]
 

diff --git a/mani_skill/envs/tasks/digital_twins/bridge_dataset_eval/base_env.py b/mani_skill/envs/tasks/digital_twins/bridge_dataset_eval/base_env.py
@@ -271,6 +271,11 @@ def _load_lighting(self, options: dict):
         self.scene.add_directional_light([-1, -0.5, -1], [0.7, 0.7, 0.7])
         self.scene.add_directional_light([1, 1, -1], [0.7, 0.7, 0.7])
 
+    def _load_agent(self, options: dict):
+        super()._load_agent(
+            options, sapien.Pose(p=[0.127, 0.060, 0.85], q=[0, 0, 0, 1])
+        )
+
     def _load_scene(self, options: dict):
         # original SIMPLER envs always do this? except for open drawer task
         for i in range(self.num_envs):