traffic history improvements for imitation learning #741

Merged: sah-huawei merged 10 commits into develop from traffic-history-sqlite on Apr 12, 2021

Contributor

@sah-huawei sah-huawei commented Apr 6, 2021

Changes traffic history file format from JSON to an opaque .shf format (currently a SQLite file database).

This reduces the size of history files by over 60%, speeds up stepping from the TrafficHistoryProvider to under 2ms, and fixes bugs related to issue #407 (missed and skipped samples).

This obsoletes the Traffic_history_service (and its associated unit tests).

Traffic history database files are automatically created when a Scenario specifies them, as before, like:

gen_scenario(
    t.Scenario(
        traffic_histories=["i80-0400.yaml", "i80-0500.yaml"],
    ),
    output_dir=Path(__file__).parent,
)

However, we now support a YAML "dataset spec" file as input that contains a pointer to the original dataset to be converted, as well as parameters related to the conversion (examples have been added to Issue #732). The data is converted and imported into the SQLite database when scenarios are first built, which obsoletes the previous conversion scripts (tools/interaction_dataset_converter.py and ngsim_dataset_converter.py).

Along the way:

  • added ability to set the default agent speed in the imitation learning "agent replacement" example
  • added ability to support different lane widths in off-road checks (as required by the NGSIM dataset, whose lanes are ~6x)
  • added ability to convert old JSON history files to the new format
  • added ability to run SMARTS in "near real-time" such that traffic can be watched in sumo-gui at correct speed

Closes #732, #407
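
For readers who want a feel for why a SQLite-backed history can be stepped quickly, here is a minimal sketch. The table name, column names, and helper functions are hypothetical and are not taken from the actual .shf schema or from genhistories.py. It assumes one row per vehicle per sample, an index on the sample time, and a half-open time window per step so that each sample is returned by exactly one step.

```python
import sqlite3


def build_history_db(rows, path=":memory:"):
    """Create a small trajectory table indexed by sim_time (hypothetical schema)."""
    conn = sqlite3.connect(path)
    conn.execute(
        "CREATE TABLE trajectory ("
        " vehicle_id INTEGER, sim_time REAL,"
        " position_x REAL, position_y REAL, heading_rad REAL, speed REAL)"
    )
    # The index lets each step look up its time window without scanning the table.
    conn.execute("CREATE INDEX trajectory_time ON trajectory (sim_time)")
    conn.executemany("INSERT INTO trajectory VALUES (?, ?, ?, ?, ?, ?)", rows)
    conn.commit()
    return conn


def states_between(conn, start_time, end_time):
    """Return rows with start_time < sim_time <= end_time, ordered by time."""
    cur = conn.execute(
        "SELECT vehicle_id, sim_time, position_x, position_y, heading_rad, speed"
        " FROM trajectory WHERE sim_time > ? AND sim_time <= ?"
        " ORDER BY sim_time, vehicle_id",
        (start_time, end_time),
    )
    return cur.fetchall()


rows = [
    (1, 0.0, 0.0, 0.0, 0.0, 10.0),
    (1, 0.1, 1.0, 0.0, 0.0, 10.0),
    (2, 0.1, 5.0, 3.0, 0.0, 8.0),
    (1, 0.2, 2.0, 0.0, 0.0, 10.0),
]
conn = build_history_db(rows)
step_rows = states_between(conn, 0.0, 0.1)  # the two samples stamped 0.1
```

Consecutive calls with adjacent windows, such as (0.0, 0.1] followed by (0.1, 0.2], neither repeat nor drop a sample. That is one plausible way to avoid the missed and skipped samples mentioned above, though the actual fix in this PR may differ.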

- better smoothing of heading for very low vehicle speeds to reduce "wiggle"
- added ability to set agent speed to imitation learning replacement example
- added option to convert old JSON traffic histories to new sqlite .shf files
- quiet netconvert's "Success" output when shifting maps
- minor refactor of genhistories database creation code
@@ -58,6 +58,7 @@ def __new__(cls):
# disable vsync otherwise we are limited to refresh-rate of screen
loadPrcFileData("", "sync-video false")
loadPrcFileData("", "model-path %s" % os.getcwd())
loadPrcFileData("", "model-cache-dir %s/.panda3d_cache" % os.getcwd())
Contributor

@JingfeiPeng JingfeiPeng Apr 8, 2021

is this related to replaying traffic data?

Contributor Author

@sah-huawei sah-huawei Apr 8, 2021

Oops. No, it's not. I accidentally left that in after an unrelated test. I'll take it back out. Thanks for catching that!

Contributor Author

(It did speed up rendering though, so we might consider adding it at some point.)

self._log = logging.getLogger(self.__class__.__name__)
self._graph = graph
self._net_file = net_file
self._default_lane_width = (
default_lane_width if default_lane_width is not None else 3.2
Contributor

should we add some explanation for 3.2 here?

Contributor Author

Yes, I will add a comment.

Collaborator

This is fine too but I might have just preferred a descriptive constant.

Contributor Author

ok, done.
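
As an illustration of the "descriptive constant" suggestion in this thread, a minimal sketch follows. The constant name, the class name, and the assumption that the value is in meters are all hypothetical; only the fallback value 3.2 comes from the diff above.

```python
from typing import Optional

# Hypothetical constant name; 3.2 is the fallback value shown in the diff.
# The unit (meters) is an assumption made for this sketch.
DEFAULT_LANE_WIDTH_M = 3.2


class RoadNetworkSketch:
    """Stand-in class showing only the lane-width fallback logic."""

    def __init__(self, default_lane_width: Optional[float] = None):
        # Compare against None rather than truthiness so that an explicit
        # 0.0 is not silently replaced by the default.
        self._default_lane_width = (
            default_lane_width
            if default_lane_width is not None
            else DEFAULT_LANE_WIDTH_M
        )

    @property
    def default_lane_width(self) -> float:
        return self._default_lane_width
```

A dataset with wider lanes would pass its own value, for example `RoadNetworkSketch(default_lane_width=3.7)`, while everything else picks up the named default.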

genhistories_py = os.path.join(
os.path.dirname(os.path.realpath(__file__)), "genhistories.py"
)
for hdsr in histories_datasets:
Contributor

I am wondering what hdsr and hds stand for?

Contributor Author

@sah-huawei sah-huawei Apr 8, 2021

hds -> "history data set" and hdsr -> "history data set ref" (or something like that!) :)

@@ -146,6 +146,7 @@ def _clean(scenario):
"social_agents/*",
"traffic/*",
"history_mission.pkl",
Contributor

Since we are using .shf files now we can probably remove "history_mission.pkl" here

Contributor Author

Yeah, definitely we need to eventually, but I wanted to leave it there a little bit to get rid of any existing files on the next clean.

Contributor

@JingfeiPeng JingfeiPeng Apr 10, 2021

Cool! One other thing we can remove is the ijson dependency: lines 64 and 65 in setup.py and line 48 in requirements.txt. ijson was only used for imitation learning.

Contributor Author

funny, I did, but I had to add it back. It's still being used in genhistories.py to convert old json files.

Contributor

@JingfeiPeng JingfeiPeng left a comment

Nice change!

@JingfeiPeng
Contributor

Where could I take a look at how the NGSIM and INTERACTION dataset scenario folders look now?

@sah-huawei
Contributor Author

> Where could I take a look at how the NGSIM and INTERACTION dataset scenario folders look now?

I have them locally if you want me to zip them up and send them. (I don't think we should push them here though.)
Or, you can download NGSIM from the link in Issue #407 (and INTERACTION from zbzhu99's imitation learning fork) and try out the yaml file examples I put in Issue #732. That's probably better b/c then you can see if I forgot anything in my instructions.

@sah-huawei
Contributor Author

> Nice change!

Thanks. Your NGSIM jupyter notebook gave me a huge head start on genhistories.py! Thanks again for that.

observations = smarts.reset(scenario)

dones = {agent_id: False}
-while not dones[agent_id]:
+while not dones.get(agent_id, False):
Collaborator

I think this is better practice but is there any case where the default should be necessary?

Contributor Author

Yeah, I hit that in my testing. It happened early on and I can't clearly remember the circumstances, but I suspect it was related to my having the mission.start_time wrong initially (leading to agent_id not being started yet). So I guess, when things are correct, it's not necessary, but I added it so I wouldn't crash immediately with a key error and could debug other things first.
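
To make the failure mode concrete, here is a minimal sketch of a loop that tolerates an agent that has not been started yet. The `fake_step` function is a hypothetical stand-in for the real step call, and the `max_steps` guard is an addition for this sketch, not something from the PR.

```python
def run_episode(step_fn, agent_id, max_steps=100):
    """Step until agent_id reports done, tolerating its absence from dones."""
    dones = {}
    steps = 0
    # dones[agent_id] would raise KeyError while the agent is missing;
    # dones.get(agent_id, False) treats "not present yet" as "not done".
    while not dones.get(agent_id, False) and steps < max_steps:
        dones = step_fn(steps)
        steps += 1
    return steps


def fake_step(step_index):
    # Hypothetical stand-in for the simulator: the agent is not reported
    # at all until step 2 and finishes at step 4.
    if step_index < 2:
        return {}
    return {"Agent-007": step_index >= 4}


steps_taken = run_episode(fake_step, "Agent-007")
```

The guard matters with `.get`: if the agent never appears, for instance because of a wrong mission start time as described above, the loop would otherwise run forever instead of crashing.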

Comment on lines +61 to +62
# TODO: the following speeds up rendering a bit... might consider it.
# loadPrcFileData("", "model-cache-dir %s/.panda3d_cache" % os.getcwd())
Collaborator

Does this speed up rendering or model loading?

Contributor Author

oh, right, model loading. (However, as vehicle models can be loaded during the simulation, it can still affect the step time.)

@@ -224,7 +226,12 @@ def _step(self, agent_actions):
extras = dict(scores=scores)

# 8. Advance the simulation clock.
-self._elapsed_sim_time += dt
+self._elapsed_sim_time = round(self._elapsed_sim_time + dt, 3)
Collaborator

I would say rounding here is wrong since it means that people will be unable to run the simulation at any step less than 5e-4. While such a small step would not normally be used, I do not think it is on us to try to limit a user from attempting a micro-scale time-step.

Contributor Author

Good point. (I added it because the addition of dt sometimes caused floating point precision issues, like an _elapsed_sim_time equal to, say, 1.9999999 instead of 2.0.) I just changed it to never "round too much".
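
One way to read "never round too much" is to choose the rounding precision relative to the step size rather than fixing it at 3 decimal places. The sketch below takes that approach. It is not the code that was committed: the helper names and the choice of 6 extra digits are assumptions made for illustration.

```python
import math


def rounding_digits(dt, extra_digits=6):
    """Decimal places to keep: extra_digits beyond the leading digit of dt."""
    return max(0, -math.floor(math.log10(dt))) + extra_digits


def advance_clock(elapsed, dt):
    """Add dt to elapsed and trim accumulated floating point noise."""
    return round(elapsed + dt, rounding_digits(dt))


# Repeatedly adding 0.1 drifts away from exact decimal values,
# while the rounded clock stays on them.
naive = 0.0
rounded = 0.0
for _ in range(20):
    naive += 0.1
    rounded = advance_clock(rounded, 0.1)

# A fixed precision of 3 digits swallows a small step entirely;
# the scaled precision does not.
swallowed = round(0.0 + 1e-4, 3)
preserved = advance_clock(0.0, 1e-4)
```

An alternative that avoids rounding altogether is to count steps as an integer and compute the elapsed time as `step_count * dt`, although that only works when dt is constant.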

Comment on lines 313 to 315
self._use_realtime_clock = scenario.traffic_history and (
not self._traffic_sim.headless or not self._envision.headless
)
Collaborator

I see why the real-time option might be useful for imitation learning but I have a few concerns. My first inclination here is that it should not be attached to the traffic history because there are other uses for a realtime clock option and this binds the option to the traffic history.

Secondly, Envision is already supposed to play back at real-time and I believe this complicates fixing Envision so I do not think Envision should be included with this option.

Otherwise this is useful to sumo-gui but there is another alternative. It is possible to send the --gui-settings-file option to sumo-gui to add step delay and breakpoints via a configuration file: https://sumo.dlr.de/docs/sumo-gui.html#configuration_files.

Contributor Author

Re the traffic-history limitation, I agree with you. I just was hesitant to auto-set it and change the behavior for other things that might have inadvertently set headless to False but where no one watches the gui/envision.

Re Envision, there is still a bug there, but you're right that this "masks" the problem.

I will look into the --gui-settings-file option...

Contributor Author

@sah-huawei sah-huawei Apr 11, 2021

Using step delay via a --gui-settings-file could work assuming we know (approximately) how long our steps really take. But if these will be variable, or if we'll be improving our step time, then we may have to tweak this occasionally.

I'm willing to do this, but let's chat about it on Monday first. I do see other advantages to having a near-real-time mode (in particular, asynchronously running alongside ROS nodes), but there may be better ways to achieve that, so I can be persuaded to leave this out for now (and/or use the sumo-gui delay setting).
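
The difference between a fixed step delay and a near-real-time mode can be sketched as follows. A fixed delay has to guess how long each step takes, whereas a pacer measures the wall clock and sleeps only for the remaining time. The class names here are hypothetical and this is not the implementation from the PR. The clock and sleep functions are injectable so that the demonstration is deterministic.

```python
import time


class RealtimePacer:
    """Sleep after each step so simulated time does not outrun the wall clock."""

    def __init__(self, clock=time.monotonic, sleep=time.sleep):
        self._clock = clock
        self._sleep = sleep
        self._start_wall = clock()
        self._sim_elapsed = 0.0

    def after_step(self, dt):
        """Account for one step of length dt; return the seconds slept."""
        self._sim_elapsed += dt
        wall_elapsed = self._clock() - self._start_wall
        lag = self._sim_elapsed - wall_elapsed
        if lag > 0:
            self._sleep(lag)
            return lag
        return 0.0  # the step was slower than real time; nothing to wait for


class FakeClock:
    """Deterministic clock for demonstration; sleeping just advances it."""

    def __init__(self):
        self.now = 0.0

    def __call__(self):
        return self.now

    def sleep(self, seconds):
        self.now += seconds


fake = FakeClock()
pacer = RealtimePacer(clock=fake, sleep=fake.sleep)
fake.now += 0.25                    # pretend the step took 0.25 s to compute
slept_fast = pacer.after_step(0.5)  # sim advanced 0.5 s, so wait 0.25 s
fake.now += 1.0                     # pretend the next step took 1.0 s
slept_slow = pacer.after_step(0.5)  # already behind real time, so no wait
```

Because the pacer compares cumulative simulated time with cumulative wall time, it adapts when step times vary or improve, which is the concern raised above about tuning a fixed delay.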

self.start_time_offset = 0
self._replaced_vehicle_ids = set()
self._start_time_offset = 0
self._histories_db = None
Collaborator

Looks like you ordered the instance variables but then added one to put it back out of order. 🤣

Contributor Author

hehe, yep! :)

Comment on lines +90 to +92
# Options from NGSIM and INTERACTION currently include:
# 1=motorcycle, 2=auto, 3=truck, 4=pedestrian/bicycle
# But we don't yet have glb models for 1 and 4.
Collaborator

I will add this as an issue.

Contributor Author

Thanks.

# But we don't yet have glb models for 1 and 4.
if vehicle_type == 3:
return "truck"
return "passenger"
Collaborator

I see this as a potential issue that we would default to a passenger vehicle when provided a pedestrian since we do not support non-road actors yet nor do we yet want to evaluate pedestrians.

Contributor Author

Yeah, I agree. I'm adding a change to skip pedestrians for now.

Contributor Author

... although on second thought, I think we should probably just add a pedestrian model fairly soon because, in an imitation learning scenario, it could cause problems to train from a vehicle that swerves or stops suddenly for seemingly no reason (if a pedestrian that triggered this behavior is not part of the simulation).
So I guess I'm not going to skip them after all. We should just do Issue #756 relatively soon instead.

Collaborator

Yes, I was more worried about imitation learning attempting to take over a pedestrian because it thinks it is a passenger vehicle.

Contributor Author

Ah oh. I just added something to prevent that: only passenger cars will now be included in the agent missions used by history_vehicle_replacement_for_imitation_learning.py.

float(self.column_val_in_row(row, "speed")) * self.scale,
self.column_val_in_row(row, "lane_id"),
)
if not any(a is not None and np.isnan(a) for a in traj_args):
Collaborator

I think the presence of NaN values would be something useful to warn about rather than silently skipping.

Contributor Author

@sah-huawei sah-huawei Apr 11, 2021

yeah, in general I agree. Unfortunately, the rolling window method used to calculate the moving averages for position_x and position_y leaves 1-window-width of NaNs at the end. This was the easiest way to ignore those. I'll add a comment to explain that.
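
The effect described here can be reproduced in a few lines. The sketch below uses a plain-Python forward-looking moving average instead of the rolling-window call in genhistories.py, so the function name and the exact number of trailing NaNs (window minus one in this version) are assumptions of the sketch, not a description of the real code.

```python
import math


def forward_moving_average(values, window):
    """Average each value with the next window-1 values.

    Positions near the end lack a full window and are marked NaN.
    """
    out = []
    for i in range(len(values)):
        chunk = values[i:i + window]
        out.append(sum(chunk) / window if len(chunk) == window else math.nan)
    return out


xs = [0.0, 1.0, 2.0, 3.0, 4.0, 5.0]
smoothed = forward_moving_average(xs, window=3)

# Keep only the samples whose smoothed value is defined.
kept = [(i, x) for i, x in enumerate(smoothed) if not math.isnan(x)]
skipped = len(smoothed) - len(kept)
```

Since the number of trailing NaNs is predictable, one option that addresses both comments would be to skip those silently and log a warning only when `skipped` exceeds the expected count.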

Comment on lines 200 to 206
# Try to match the NGSIM types...
if agent_type == "car":
return 2
elif agent_type == "truck":
return 3
elif agent_type == "pedestrian/bicycle":
return 4
Collaborator

Is motorcycle also supposed to be here?

Contributor Author

oops, yes! good catch!
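
Pulling the type-handling threads together, here is a table-driven sketch of the mapping with the motorcycle case included. The numeric codes come from the code comment quoted earlier in this review (1=motorcycle, 2=auto, 3=truck, 4=pedestrian/bicycle). The "motorcycle" string, the default for unknown types, and all function names are assumptions for illustration.

```python
# NGSIM-style type codes, as listed in the code comment quoted in this review.
# The "motorcycle" key is an assumed spelling of the dataset's agent_type.
NGSIM_TYPE_CODES = {
    "motorcycle": 1,
    "car": 2,
    "truck": 3,
    "pedestrian/bicycle": 4,
}


def ngsim_type_code(agent_type, default=2):
    """Map an agent_type string to a type code (the default is an assumption)."""
    return NGSIM_TYPE_CODES.get(agent_type, default)


def vehicle_config_type(type_code):
    """Mirror the quoted fallback: trucks get 'truck', all else 'passenger'."""
    return "truck" if type_code == 3 else "passenger"


def is_replaceable_by_agent(type_code):
    """Only passenger cars (code 2) are eligible for agent replacement."""
    return type_code == 2
```

Keeping the rendering fallback (`vehicle_config_type`) separate from mission eligibility (`is_replaceable_by_agent`) means a pedestrian can still appear in the replay as a placeholder model without ever being handed to a learning agent.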

@sah-huawei sah-huawei merged commit f9c0f08 into develop Apr 12, 2021
@sah-huawei sah-huawei deleted the traffic-history-sqlite branch April 12, 2021 23:38
Successfully merging this pull request may close these issues.

Traffic Histories interface bug and scalability
NGSIM dataset abnormal replay