Parallel observations #1687

Merged
merged 153 commits into from
Apr 21, 2023

Conversation

Gamenot
Collaborator

@Gamenot Gamenot commented Oct 28, 2022

No description provided.

@Adaickalavan
Member

Something to consider in parallel computing: mpi4py, albeit this approach requires significant code changes.

@Gamenot Gamenot force-pushed the tucker/feature-parallel_observations branch 2 times, most recently from 1f45b2a to a15ca0b Compare November 4, 2022 18:51
@Gamenot
Collaborator Author

Gamenot commented Nov 4, 2022

This changeset has a lot of useful features, but the changes are not turning out quite as I hoped. The tests indicate that inter-process communication is slow.

@Gamenot Gamenot marked this pull request as ready for review November 28, 2022 20:29
@Gamenot Gamenot changed the title [WIP] Tucker/feature parallel observations Parallel observations Nov 28, 2022
@Gamenot Gamenot linked an issue Nov 29, 2022 that may be closed by this pull request
"""The default serialization for the road map."""
import cloudpickle

return cloudpickle.dumps(road_map)
Contributor

How did you get around the issues related to lanepoints that we were seeing with this?

Collaborator Author

I ended up using a proxy object to format and then reconstruct the road_map.

def dumps(__o):
    """Serializes the given object."""
    import cloudpickle

    _lazy_init()
    r = __o
    type_ = type(__o)
    # TODO: Add a formatter parameter instead of handling proxies internal to serialization
    proxy_func = _proxies.get(type_)
    if proxy_func:
        r = proxy_func(__o)
    return cloudpickle.dumps(r)


def loads(__o):
    """Deserializes the given object."""
    import cloudpickle

    r = cloudpickle.loads(__o)
    if hasattr(r, "deproxy"):
        r = r.deproxy()
    return r


class Proxy:
    """Defines a proxy object used to facilitate serialization of a non-serializable object."""

    def deproxy(self):
        """Convert the proxy back into the original object."""
        raise NotImplementedError()


@dataclass(frozen=True)
class _SimulationLocalConstantsProxy(Proxy):
    road_map_spec: Any
    road_map_hash: int

    def __eq__(self, __o: object) -> bool:
        if __o is None:
            return False
        return self.road_map_hash == getattr(__o, "road_map_hash")

    def deproxy(self):
        import smarts.sstudio.types
        from smarts.core.simulation_local_constants import SimulationLocalConstants

        assert isinstance(self.road_map_spec, smarts.sstudio.types.MapSpec)
        road_map, _ = self.road_map_spec.builder_fn(self.road_map_spec)
        return SimulationLocalConstants(road_map, self.road_map_hash)


def _proxy_slc(v):
    return _SimulationLocalConstantsProxy(v.road_map.map_spec, v.road_map_hash)
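
For readers who want to try the proxy idea in isolation, here is a minimal sketch. It assumes the standard-library `pickle` in place of `cloudpickle`, and `HeavyMap` and `HeavyMapProxy` are hypothetical stand-ins, not SMARTS classes. The proxy carries only the data needed to rebuild the object on the receiving side.

```python
import pickle
import threading
from dataclasses import dataclass


class HeavyMap:
    """Hypothetical stand-in for an object that cannot be pickled directly."""

    def __init__(self, spec: str):
        self.spec = spec
        self._lock = threading.Lock()  # lock objects cannot be pickled


@dataclass(frozen=True)
class HeavyMapProxy:
    """Carries only the picklable data needed to rebuild a HeavyMap."""

    spec: str

    def deproxy(self) -> HeavyMap:
        # Rebuild the real object from the lightweight description.
        return HeavyMap(self.spec)


# Maps a type to the function that turns an instance into its proxy.
_proxies = {HeavyMap: lambda m: HeavyMapProxy(m.spec)}


def dumps(obj) -> bytes:
    """Serialize obj, swapping in a proxy when one is registered for its type."""
    proxy_func = _proxies.get(type(obj))
    return pickle.dumps(proxy_func(obj) if proxy_func else obj)


def loads(data: bytes):
    """Deserialize data, converting a proxy back into the original type."""
    obj = pickle.loads(data)
    return obj.deproxy() if hasattr(obj, "deproxy") else obj


restored = loads(dumps(HeavyMap("maps/loop")))
```

Objects without a registered proxy pass through `dumps` and `loads` unchanged, so the wrapper can be used for every payload.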

@qianyi-sun qianyi-sun force-pushed the tucker/feature-parallel_observations branch from dac8642 to 4fa2b58 Compare December 15, 2022 21:53
smarts/core/agent_manager.py Outdated Show resolved Hide resolved
smarts/core/agent_manager.py Outdated Show resolved Hide resolved
# {sensor_id, ...}
self._discarded_sensors: Set[str] = set()

def step(self, sim_frame, renderer):
Contributor

Type hints would be nice here.

Collaborator Author

I can add the sim_frame type hint, but not one for the renderer until we extract an interface.

smarts/core/sensor_manager.py Show resolved Hide resolved
"""Clean up resources, resetting the index."""
self._controlled_by = VehicleIndex._build_empty_controlled_by()

 for vehicle in self._vehicles.values():
-    vehicle.teardown(exclude_chassis=True)
+    vehicle.teardown(renderer=renderer, exclude_chassis=True)
Contributor

Passing the renderer to each of these calls seems a bit surprising. Is there a reasonable way to refactor?

Collaborator Author

@Gamenot Gamenot Jan 4, 2023

It is difficult. I honestly want to strip out the renderer entirely from the main systems. This is a halfway step towards that.

Collaborator Author

The end intention is to use the simulation frame to update the state of the renderer on all threads.

smarts/core/vehicle_state.py Outdated Show resolved Hide resolved
smarts/core/vehicle.py Show resolved Hide resolved
@Gamenot Gamenot force-pushed the tucker/feature-parallel_observations branch 2 times, most recently from 25b75b4 to fc1c9e2 Compare December 30, 2022 20:19
@Gamenot
Collaborator Author

Gamenot commented Feb 8, 2023

Changes to be made:

  • Split up serial and parallel implementations of sensors.
  • Isolate parallelism to smarts.ray
  • Use dill serialisation to avoid circular dependency chains
  • Integrate engine configuration
  • Reduce parallel implementation code using ray
  • Extract renderer interface
  • Add issue for extracting physics

@Gamenot Gamenot added this to the `develop` branch close-down milestone Feb 14, 2023
@Gamenot Gamenot force-pushed the tucker/feature-parallel_observations branch from edd90db to 4482b0c Compare February 16, 2023 19:26
@Gamenot Gamenot changed the base branch from develop to master February 16, 2023 19:27
@Gamenot Gamenot force-pushed the tucker/feature-parallel_observations branch 2 times, most recently from 1da17ea to 55fd879 Compare February 22, 2023 16:21
@Gamenot Gamenot force-pushed the tucker/feature-parallel_observations branch 2 times, most recently from 4021e6e to 431dc5e Compare March 6, 2023 20:43
@Gamenot Gamenot force-pushed the tucker/feature-parallel_observations branch 3 times, most recently from 80a0839 to f4e8634 Compare March 13, 2023 14:57
Comment on lines +25 to +26
class ActionSpaceType(Enum):
    """Available vehicle action spaces."""
Collaborator Author

I moved this to its own file to simplify imports.

@@ -72,7 +72,7 @@ def is_specific(self) -> bool:
"""If the goal is reachable at a specific position."""
return False

-def is_reached(self, vehicle) -> bool:
+def is_reached(self, vehicle_state) -> bool:
Collaborator Author

I tried to restrict passing the vehicle around to data structures because the vehicle has methods that can mutate the engine state.
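
A small sketch of the idea, assuming a read-only snapshot type: `VehicleStateSketch` and `PositionalGoalSketch` are hypothetical names for illustration, not the SMARTS classes. Because the goal check receives only frozen data, it has no way to mutate the engine.

```python
import math
from dataclasses import dataclass


@dataclass(frozen=True)
class VehicleStateSketch:
    """Hypothetical read-only snapshot; it has no methods that touch the engine."""

    x: float
    y: float


@dataclass(frozen=True)
class PositionalGoalSketch:
    """Hypothetical goal that is reached within `radius` of (x, y)."""

    x: float
    y: float
    radius: float

    def is_reached(self, vehicle_state: VehicleStateSketch) -> bool:
        """True when the snapshot lies within `radius` of the goal."""
        distance = math.hypot(vehicle_state.x - self.x, vehicle_state.y - self.y)
        return distance <= self.radius


goal = PositionalGoalSketch(x=10.0, y=0.0, radius=2.0)
near = goal.is_reached(VehicleStateSketch(x=9.0, y=0.5))
far = goal.is_reached(VehicleStateSketch(x=0.0, y=0.0))
```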

Comment on lines +392 to +404
def frame(self) -> PlanFrame:
    """Get the state of this plan."""
    assert self._mission
    return PlanFrame(
        road_ids=self._route.road_ids if self._route else [], mission=self._mission
    )

@classmethod
def from_frame(cls, plan_frame: PlanFrame, road_map: RoadMap) -> "Plan":
    """Generate the plan from a frame."""
    new_plan = cls(road_map=road_map, mission=plan_frame.mission, find_route=False)
    new_plan.route = road_map.route_from_road_ids(plan_frame.road_ids)
    return new_plan
Collaborator Author

This is an attempt to avoid passing around the road map on simple sets of data like the sensor state. If you need the utility of the plan object, you must rebuild it from the frame and the road map.
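
The round trip can be sketched as follows. This is an illustration under assumptions: `RoadMapSketch`, `PlanFrameSketch`, and `PlanSketch` are hypothetical simplifications, and the standard-library `pickle` stands in for whatever serializer carries the data between processes. Only the frame is serialized; the receiving side rebuilds the plan with its own road map.

```python
import pickle
from dataclasses import dataclass
from typing import List, Optional


class RoadMapSketch:
    """Hypothetical stand-in for a large road map that stays in each process."""

    def route_from_road_ids(self, road_ids: List[str]) -> List[str]:
        # A real road map would build a route object; the sketch copies the ids.
        return list(road_ids)


@dataclass(frozen=True)
class PlanFrameSketch:
    """Plain data describing a plan; it holds no reference to the road map."""

    road_ids: List[str]
    mission: str


class PlanSketch:
    def __init__(self, road_map: RoadMapSketch, mission: str):
        self._road_map = road_map
        self._mission = mission
        self.route: Optional[List[str]] = None

    def frame(self) -> PlanFrameSketch:
        """Capture the plan as plain data."""
        return PlanFrameSketch(road_ids=list(self.route or []), mission=self._mission)

    @classmethod
    def from_frame(cls, plan_frame: PlanFrameSketch, road_map: RoadMapSketch):
        """Rebuild a plan from a frame and a locally available road map."""
        plan = cls(road_map=road_map, mission=plan_frame.mission)
        plan.route = road_map.route_from_road_ids(plan_frame.road_ids)
        return plan


# Sending side: only the frame is serialized, never the road map.
plan = PlanSketch(RoadMapSketch(), mission="reach_east_exit")
plan.route = ["road_1", "road_2"]
payload = pickle.dumps(plan.frame())

# Receiving side: rebuild the plan with the local road map.
rebuilt = PlanSketch.from_frame(pickle.loads(payload), RoadMapSketch())
```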

Comment on lines +96 to +103
class SensorState:
    """Sensor state information"""

    def __init__(self, max_episode_steps: int, plan_frame: PlanFrame):
        self._max_episode_steps = max_episode_steps
        self._plan_frame = plan_frame
        self._step = 0
Collaborator Author

This object can then be passed between processes without serialising the road map, because it holds a plan frame rather than the plan itself.

Comment on lines 148 to 153
try:
    self.destroy()
except TypeError:
    pass
Collaborator Author

This silences the program exit race condition error if self is somehow already None.

Contributor

Wouldn't that throw an AttributeError rather than a TypeError?

Collaborator Author

Right, it should. I am unsure why I am using TypeError here. The current Renderer class uses Panda3D underneath, and I believe the error was related to those resources.
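
The difference between the two exceptions can be checked with a short sketch. `classify_destroy_error` and `HolderWithNoneMethod` are hypothetical helpers written for this illustration. Looking up `destroy` on `None` raises `AttributeError`, while calling a name that has itself become `None` raises `TypeError`. Whether the second case is what happened inside the renderer is an assumption, not something the thread establishes.

```python
def classify_destroy_error(obj) -> str:
    """Call obj.destroy() and report which exception type, if any, was raised."""
    try:
        obj.destroy()
    except AttributeError:
        return "AttributeError"
    except TypeError:
        return "TypeError"
    return "ok"


class HolderWithNoneMethod:
    """Hypothetical object whose `destroy` attribute has been replaced by None."""

    destroy = None


# The object itself is None: the attribute lookup fails.
none_object = classify_destroy_error(None)

# The attribute exists but is None: the call fails instead.
none_method = classify_destroy_error(HolderWithNoneMethod())
```

Catching `(AttributeError, TypeError)` together would cover both cases if the intent is only to silence errors during interpreter exit.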

@@ -45,7 +45,7 @@ class SignalState(ActorState):

state: Optional[SignalLightState] = None
stopping_pos: Optional[Point] = None
-controlled_lanes: Optional[List[RoadMap.Lane]] = None
+controlled_lanes: Optional[List[str]] = None
Collaborator Author

This was to remove road map (via RoadMap.Lane) from signal actor state.

Comment on lines 76 to +77
 def unpack(obj):
-    """A helper that can be used to print `nestedtuples`. For example,
+    """A helper that can be used to print nested data objects (`tuple`, `dataclass`, `namedtuple`, ...). For example,
Collaborator Author

This utility ends up being useful for comparison in many cases.
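
To illustrate why such a helper is handy for comparison, here is a rough sketch. `unpack_sketch` is a hypothetical reimplementation written for this example, not the SMARTS `unpack` function, and its output format is an assumption. It flattens nested dataclasses, named tuples, and sequences into plain dicts and lists, which compare with `==` regardless of the container types involved.

```python
from dataclasses import dataclass, fields, is_dataclass
from typing import Any, NamedTuple, Tuple


def unpack_sketch(obj: Any) -> Any:
    """Recursively convert nested data objects into plain dicts and lists."""
    if is_dataclass(obj) and not isinstance(obj, type):
        return {f.name: unpack_sketch(getattr(obj, f.name)) for f in fields(obj)}
    if isinstance(obj, tuple) and hasattr(obj, "_fields"):  # namedtuple instance
        return {name: unpack_sketch(getattr(obj, name)) for name in obj._fields}
    if isinstance(obj, (list, tuple)):
        return [unpack_sketch(value) for value in obj]
    if isinstance(obj, dict):
        return {key: unpack_sketch(value) for key, value in obj.items()}
    return obj


class PointSketch(NamedTuple):
    x: float
    y: float


@dataclass
class ObservationSketch:
    position: PointSketch
    speeds: Tuple[float, ...]


unpacked = unpack_sketch(ObservationSketch(PointSketch(1.0, 2.0), (3.0, 4.0)))
```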

@@ -199,7 +199,6 @@ def large_observation():
),
drivable_area_grid_map=DrivableAreaGridMap(
metadata=GridMapMetadata(
-created_at=1649853761,
resolution=0.1953125,
Collaborator Author

I had to remove created_at because it made the GridMapMetadata object non-deterministic.
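
The effect can be reproduced with a small sketch. The three dataclasses below are hypothetical simplifications, not the real `GridMapMetadata`. A timestamp field makes two otherwise identical objects compare unequal. Dropping the field restores equality, and excluding it from comparison with `field(compare=False)` is shown as an alternative that the PR did not use.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class MetadataWithTimestamp:
    resolution: float
    created_at: int  # differs between runs, so equality checks fail


@dataclass(frozen=True)
class MetadataWithoutTimestamp:
    resolution: float


@dataclass(frozen=True)
class MetadataIgnoringTimestamp:
    resolution: float
    # Alternative: keep the field but leave it out of __eq__.
    created_at: int = field(default=0, compare=False)


first = MetadataWithTimestamp(0.1953125, created_at=1649853761)
second = MetadataWithTimestamp(0.1953125, created_at=1649853762)
timestamps_equal = first == second

removed_equal = MetadataWithoutTimestamp(0.1953125) == MetadataWithoutTimestamp(0.1953125)
ignored_equal = MetadataIgnoringTimestamp(0.1953125, 1) == MetadataIgnoringTimestamp(0.1953125, 2)
```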

Comment on lines 2 to +5
[core]
debug = false
observation_workers = 0
reset_retries = 0
Collaborator Author

Configuration appears to be very easy to add now.
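
As an illustration of how such a `[core]` section can be read, here is a sketch that uses the standard-library `configparser`. How SMARTS itself loads its engine configuration is not shown in this thread, so the loading code below is an assumption; only the option names and values come from the snippet above.

```python
import configparser

# The same options as in the reviewed snippet, inlined for the example.
ENGINE_INI = """
[core]
debug = false
observation_workers = 0
reset_retries = 0
"""

parser = configparser.ConfigParser()
parser.read_string(ENGINE_INI)

debug = parser.getboolean("core", "debug")
observation_workers = parser.getint("core", "observation_workers")
# `fallback` supplies a default when an option is missing from the file.
reset_retries = parser.getint("core", "reset_retries", fallback=0)
missing_option = parser.getint("core", "not_present", fallback=-1)
```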

engine.ini Outdated
Comment on lines 2 to 6
[core]
debug = false
observation_workers = 2
reset_retries = 1
[controllers]
Collaborator Author

This second config file is used just for testing out separate configuration and will be removed.

@Gamenot Gamenot force-pushed the tucker/feature-parallel_observations branch from fb9835c to 0c6b139 Compare April 11, 2023 14:18
smarts/core/agent_manager.py Outdated Show resolved Hide resolved
try:
    junction_check_proc.start()
except AssertionError:
    cls._check_junctions(net_file)
Contributor

What assertion was being raised?

Collaborator Author

It has been a while, but I believe it was an assertion related to generating a daemon process from a daemon process. This gives a fallback.

or serial_total > parallel_2_total
or serial_total > parallel_3_total
or serial_total > parallel_4_total
), f"{serial_total}, {parallel_1_total}, {parallel_2_total}, {parallel_3_total}, {parallel_4_total}"
Contributor

Would it be useful to add a check for the correctness of the returned observations?

Collaborator Author

@Gamenot Gamenot Apr 17, 2023

Honestly, I should scrap the time check and just test that the results are the same between the two resolvers.
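
A correctness test of that kind might look like the sketch below. It is an illustration under assumptions: `observe`, `observe_serial`, and `observe_parallel` are hypothetical functions, and a thread pool stands in for the worker processes used by the real parallel resolver. The test asserts that both paths return identical results instead of comparing timings.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Dict, List


def observe(vehicle_id: str, step: int) -> Dict[str, int]:
    """Hypothetical pure observation function for one vehicle."""
    return {"id_length": len(vehicle_id), "step": step}


def observe_serial(vehicle_ids: List[str], step: int) -> Dict[str, Dict[str, int]]:
    """Compute observations one vehicle at a time."""
    return {vid: observe(vid, step) for vid in vehicle_ids}


def observe_parallel(
    vehicle_ids: List[str], step: int, workers: int = 2
) -> Dict[str, Dict[str, int]]:
    """Compute the same observations on a pool of workers."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # map() preserves input order, so results line up with vehicle_ids.
        results = pool.map(lambda vid: observe(vid, step), vehicle_ids)
        return dict(zip(vehicle_ids, results))


vehicle_ids = ["agent-a", "agent-bb", "agent-ccc"]
serial_result = observe_serial(vehicle_ids, step=3)
parallel_result = observe_parallel(vehicle_ids, step=3, workers=4)
```

Comparing results this way is deterministic, whereas a timing assertion can fail on a loaded machine even when the code is correct.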

smarts/core/sensor_manager.py Outdated Show resolved Hide resolved
smarts/core/sensors/__init__.py Outdated Show resolved Hide resolved
smarts/core/simulation_frame.py Outdated Show resolved Hide resolved
smarts/core/smarts.py Show resolved Hide resolved
smarts/core/smarts.py Outdated Show resolved Hide resolved
@Gamenot
Collaborator Author

Gamenot commented Apr 17, 2023

I am going to stop rebasing and switch to merging because the history is now too long.

@Gamenot Gamenot merged commit 5e1af42 into master Apr 21, 2023
@Gamenot Gamenot deleted the tucker/feature-parallel_observations branch April 21, 2023 13:21
Development

Successfully merging this pull request may close these issues.

Parallelize Vehicle Observations
4 participants