-
Notifications
You must be signed in to change notification settings - Fork 190
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Parallel observations #1687
Parallel observations #1687
Conversation
Something to consider in parallel computing: mpi4py, albeit this approach requires significant code changes. |
1f45b2a
to
a15ca0b
Compare
This changeset has a lot of useful features but the changes are not turning out quite as I hoped. Process communication seems to be slow as determined by the tests. |
smarts/core/road_map.py
Outdated
"""The default serialization for the road map.""" | ||
import cloudpickle | ||
|
||
return cloudpickle.dumps(road_map) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
How did you get around the issues related to lanepoints that we were seeing with this?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I ended up using a proxy object to format and then reconstruct the road_map
.
SMARTS/smarts/core/serialization/default.py
Lines 28 to 80 in edd90db
def dumps(__o): | |
"""Serializes the given object.""" | |
import cloudpickle | |
_lazy_init() | |
r = __o | |
type_ = type(__o) | |
# TODO: Add a formatter parameter instead of handling proxies internal to serialization | |
proxy_func = _proxies.get(type_) | |
if proxy_func: | |
r = proxy_func(__o) | |
return cloudpickle.dumps(r) | |
def loads(__o): | |
"""Deserializes the given object.""" | |
import cloudpickle | |
r = cloudpickle.loads(__o) | |
if hasattr(r, "deproxy"): | |
r = r.deproxy() | |
return r | |
class Proxy: | |
"""Defines a proxy object used to facilitate serialization of a non-serializable object.""" | |
def deproxy(self): | |
"""Convert the proxy back into the original object.""" | |
raise NotImplementedError() | |
@dataclass(frozen=True) | |
class _SimulationLocalConstantsProxy(Proxy): | |
road_map_spec: Any | |
road_map_hash: int | |
def __eq__(self, __o: object) -> bool: | |
if __o is None: | |
return False | |
return self.road_map_hash == getattr(__o, "road_map_hash") | |
def deproxy(self): | |
import smarts.sstudio.types | |
from smarts.core.simulation_local_constants import SimulationLocalConstants | |
assert isinstance(self.road_map_spec, smarts.sstudio.types.MapSpec) | |
road_map, _ = self.road_map_spec.builder_fn(self.road_map_spec) | |
return SimulationLocalConstants(road_map, self.road_map_hash) | |
def _proxy_slc(v): | |
return _SimulationLocalConstantsProxy(v.road_map.map_spec, v.road_map_hash) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I looked around, it looks like gymnasium
has a potentially better idea for this: https://github.com/Farama-Foundation/Gymnasium/blob/c2a387702c48d2e50f499a4f47d30e293ad75240/gymnasium/utils/ezpickle.py#L4
dac8642
to
4fa2b58
Compare
smarts/core/sensor_manager.py
Outdated
# {sensor_id, ...} | ||
self._discarded_sensors: Set[str] = set() | ||
|
||
def step(self, sim_frame, renderer): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Type hints would be nice here.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I can add the sim_frame
type-hint but not the renderer
until we extract an interface.
"""Clean up resources, resetting the index.""" | ||
self._controlled_by = VehicleIndex._build_empty_controlled_by() | ||
|
||
for vehicle in self._vehicles.values(): | ||
vehicle.teardown(exclude_chassis=True) | ||
vehicle.teardown(renderer=renderer, exclude_chassis=True) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Passing the renderer to each of these calls seems a bit surprising. Is there a reasonable way to refactor?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It is difficult. I honestly want to strip out the renderer
entirely from the main systems. This is a halfway step towards that.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The end intention is to use the simulation frame to update the state of the renderer
on all threads.
25b75b4
to
fc1c9e2
Compare
Changes to be made:
|
edd90db
to
4482b0c
Compare
1da17ea
to
55fd879
Compare
4021e6e
to
431dc5e
Compare
80a0839
to
f4e8634
Compare
class ActionSpaceType(Enum): | ||
"""Available vehicle action spaces.""" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I moved this to its own file to simplify imports.
@@ -72,7 +72,7 @@ def is_specific(self) -> bool: | |||
"""If the goal is reachable at a specific position.""" | |||
return False | |||
|
|||
def is_reached(self, vehicle) -> bool: | |||
def is_reached(self, vehicle_state) -> bool: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I tried to restrict passing the vehicle around to data structures because the vehicle has methods that can mutate the engine state.
def frame(self) -> PlanFrame: | ||
"""Get the state of this plan.""" | ||
assert self._mission | ||
return PlanFrame( | ||
road_ids=self._route.road_ids if self._route else [], mission=self._mission | ||
) | ||
|
||
@classmethod | ||
def from_frame(cls, plan_frame: PlanFrame, road_map: RoadMap) -> "Plan": | ||
"""Generate the plan from a frame.""" | ||
new_plan = cls(road_map=road_map, mission=plan_frame.mission, find_route=False) | ||
new_plan.route = road_map.route_from_road_ids(plan_frame.road_ids) | ||
return new_plan |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is an attempt to avoid passing around the road map on simple sets of data like the sensor state. If you need the utility of the plan object you must rebuilt it from the frame and the road map.
class SensorState: | ||
"""Sensor state information""" | ||
|
||
def __init__(self, max_episode_steps: int, plan_frame: PlanFrame): | ||
self._max_episode_steps = max_episode_steps | ||
self._plan_frame = plan_frame | ||
self._step = 0 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This object can then be passed between processes without serialising the road map because of using the plan frame.
smarts/core/renderer.py
Outdated
try: | ||
self.destroy() | ||
except TypeError: | ||
pass |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This silences the program exit race condition error if self
is somehow already None
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Wouldn't that throw an AttributeError
rather than a TypeError
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Right, it should, I am unsure why I am using TypeError
here. The current Renderer class uses Panda3D underneath. I believe it was related to those resources.
@@ -45,7 +45,7 @@ class SignalState(ActorState): | |||
|
|||
state: Optional[SignalLightState] = None | |||
stopping_pos: Optional[Point] = None | |||
controlled_lanes: Optional[List[RoadMap.Lane]] = None | |||
controlled_lanes: Optional[List[str]] = None |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This was to remove road map (via RoadMap.Lane) from signal actor state.
def unpack(obj): | ||
"""A helper that can be used to print `nestedtuples`. For example, | ||
"""A helper that can be used to print nested data objects (`tuple`, `dataclass`, `namedtuple`, ...). For example, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This utility ends up being useful for comparison in many cases.
@@ -199,7 +199,6 @@ def large_observation(): | |||
), | |||
drivable_area_grid_map=DrivableAreaGridMap( | |||
metadata=GridMapMetadata( | |||
created_at=1649853761, | |||
resolution=0.1953125, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I had to remove created_at
because it made the GridMapMetadata
object non-deterministic.
[core] | ||
debug = false | ||
observation_workers = 0 | ||
reset_retries = 0 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Configuration appears to be very easy to add now.
engine.ini
Outdated
[core] | ||
debug = false | ||
observation_workers = 2 | ||
reset_retries = 1 | ||
[controllers] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This second config file is used just for testing out separate configuration and will be removed.
6eee8cc
to
a085b1a
Compare
fb9835c
to
0c6b139
Compare
smarts/core/renderer.py
Outdated
try: | ||
self.destroy() | ||
except TypeError: | ||
pass |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Wouldn't that throw an AttributeError
rather than a TypeError
?
try: | ||
junction_check_proc.start() | ||
except AssertionError: | ||
cls._check_junctions(net_file) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What assertion was being raised?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It has been a while but I believe it was an assertion related to generating a daemon process from a daemon process. This gives a fallback.
or serial_total > parallel_2_total | ||
or serial_total > parallel_3_total | ||
or serial_total > parallel_4_total | ||
), f"{serial_total}, {parallel_1_total}, {parallel_2_total}, {parallel_3_total} {parallel_4_total}" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Would it be useful to add a check for the correctness of the returned observations?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Honestly, I should scrap the time check and just test that the results are the same between the two resolvers.
I am going to stop rebasing and switch to merging because the history is now too long. |
No description provided.