Competition submission failed with some unexpected errors #1652

ZibinDong · 2022-10-01T05:41:41Z

High Level Description
Hi! I am trying to submit my model to Track 1 of the 2022 NeurIPS Driving SMARTS Competition. The submission failed with some unexpected errors, which seem to be caused by SMARTS rather than by my code.

The error log is too long, so I have picked the last few lines that I think show how the submission fails. (I have also uploaded the complete error log in case you need it.)

ERROR:/tmp/codalab/tmpKxmG8L/run/program/evaluate.py:Evaluation failed due to error. Attempting retry: evaluate..ProcessContext(process_builder=functools.partial(.process_builder_func at 0x7f44cb3bb550>, env_ctor=functools.partial(, env_type='smarts.env:multi-scenario-v0', scenario='/tmp/codalab/tmpKxmG8L/run/input/ref/eval_scenarios/naturalistic/mrg4-agents_2', shared_configs={'action_space': 'TargetPose', 'img_meters': 50, 'img_pixels': 112, 'sumo_headless': True}, seed=42, wrapper_ctors=), policy_type=), env_name='/tmp/codalab/tmpKxmG8L/run/input/ref/eval_scenarios/naturalistic/mrg4-agents_2', retries=1, last_reply=2560374.018741618)
ERROR:/tmp/codalab/tmpKxmG8L/run/program/evaluate.py:Scoring skipped for evaluation because retries expended: evaluate..ProcessContext(process_builder=functools.partial(.process_builder_func at 0x7f44cb3bb550>, env_ctor=functools.partial(, env_type='smarts.env:multi-scenario-v0', scenario='/tmp/codalab/tmpKxmG8L/run/input/ref/eval_scenarios/naturalistic/mrg4-agents_2', shared_configs={'action_space': 'TargetPose', 'img_meters': 50, 'img_pixels': 112, 'sumo_headless': True}, seed=42, wrapper_ctors=), policy_type=), env_name='/tmp/codalab/tmpKxmG8L/run/input/ref/eval_scenarios/naturalistic/mrg4-agents_2', retries=1, last_reply=2560374.018741618)
Traceback (most recent call last):
  File "/tmp/codalab/tmpKxmG8L/run/program/evaluate.py", line 500, in 
    rank = evaluate(config)
  File "/tmp/codalab/tmpKxmG8L/run/program/evaluate.py", line 222, in evaluate
    kill_all(*running_processes)
  File "/tmp/codalab/tmpKxmG8L/run/program/evaluate.py", line 146, in kill_all
    if process.is_alive():
AttributeError: 'tuple' object has no attribute 'is_alive'
Process SpawnProcess-130:
Traceback (most recent call last):
  File "/usr/lib/python3.8/multiprocessing/process.py", line 315, in _bootstrap
    self.run()
  File "/usr/lib/python3.8/multiprocessing/process.py", line 108, in run
    self._target(*self._args, **self._kwargs)
  File "/tmp/codalab/tmpKxmG8L/run/program/evaluate.py", line 353, in _worker
    output_result["response_tick"] = time.monotonic()
  File "", line 2, in __setitem__
  File "/usr/lib/python3.8/multiprocessing/managers.py", line 834, in _callmethod
    conn.send((self._id, methodname, args, kwds))
  File "/usr/lib/python3.8/multiprocessing/connection.py", line 206, in send
    self._send_bytes(_ForkingPickler.dumps(obj))
  File "/usr/lib/python3.8/multiprocessing/connection.py", line 411, in _send_bytes
    self._send(header + buf)
  File "/usr/lib/python3.8/multiprocessing/connection.py", line 368, in _send
    n = write(self._handle, buf)
BrokenPipeError: [Errno 32] Broken pipe
Exception ignored in: 
Traceback (most recent call last):
  File "/opt/.venv/lib/python3.8/site-packages/smarts/core/smarts.py", line 856, in __del__
TypeError: 'NoneType' object is not callable
:device(error): Error adding inotify watch on /dev/input: No such file or directory
:device(error): Error opening directory /dev/input: No such file or directory
Process SpawnProcess-127:
Traceback (most recent call last):
  File "/usr/lib/python3.8/multiprocessing/process.py", line 315, in _bootstrap
    self.run()
  File "/usr/lib/python3.8/multiprocessing/process.py", line 108, in run
    self._target(*self._args, **self._kwargs)
  File "/tmp/codalab/tmpKxmG8L/run/program/evaluate.py", line 353, in _worker
    output_result["response_tick"] = time.monotonic()
  File "", line 2, in __setitem__
  File "/usr/lib/python3.8/multiprocessing/managers.py", line 834, in _callmethod
    conn.send((self._id, methodname, args, kwds))
  File "/usr/lib/python3.8/multiprocessing/connection.py", line 206, in send
    self._send_bytes(_ForkingPickler.dumps(obj))
  File "/usr/lib/python3.8/multiprocessing/connection.py", line 411, in _send_bytes
    self._send(header + buf)
  File "/usr/lib/python3.8/multiprocessing/connection.py", line 368, in _send
    n = write(self._handle, buf)
BrokenPipeError: [Errno 32] Broken pipe
Process SpawnProcess-129:
Traceback (most recent call last):
  File "/usr/lib/python3.8/multiprocessing/managers.py", line 827, in _callmethod
    conn = self._tls.connection
AttributeError: 'ForkAwareLocal' object has no attribute 'connection'

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/usr/lib/python3.8/multiprocessing/process.py", line 315, in _bootstrap
    self.run()
  File "/usr/lib/python3.8/multiprocessing/process.py", line 108, in run
    self._target(*self._args, **self._kwargs)
  File "/tmp/codalab/tmpKxmG8L/run/program/evaluate.py", line 353, in _worker
    output_result["response_tick"] = time.monotonic()
  File "", line 2, in __setitem__
  File "/usr/lib/python3.8/multiprocessing/managers.py", line 831, in _callmethod
    self._connect()
  File "/usr/lib/python3.8/multiprocessing/managers.py", line 818, in _connect
    conn = self._Client(self._token.address, authkey=self._authkey)
  File "/usr/lib/python3.8/multiprocessing/connection.py", line 502, in Client
    c = SocketClient(address)
  File "/usr/lib/python3.8/multiprocessing/connection.py", line 630, in SocketClient
    s.connect(address)
FileNotFoundError: [Errno 2] No such file or directory
Exception ignored in: 
Traceback (most recent call last):
  File "/opt/.venv/lib/python3.8/site-packages/smarts/core/smarts.py", line 856, in __del__
TypeError: 'NoneType' object is not callable
Exception ignored in: 
Traceback (most recent call last):
  File "/opt/.venv/lib/python3.8/site-packages/smarts/core/smarts.py", line 856, in __del__
TypeError: 'NoneType' object is not callable

Could you please let me know why and how this happens?

I also noticed another unexpected error:

File "/tmp/codalab/tmpKxmG8L/run/input/res/wrappers.py", line 258, in pack_observation
    return np.concatenate(packed_obs, axis=0)
  File "<__array_function__ internals>", line 180, in concatenate
ValueError: need at least one array to concatenate

The pack_observation function in wrappers.py looks roughly like this (pseudocode):

from typing import Any, Dict
import numpy as np

def pack_observation(self, obs: Dict[str, Any]) -> np.ndarray:
    """Pack the raw SMARTS observation into a numpy observation."""
    packed_obs = []
    for agent_id, raw_obs in obs.items():
        # Pack observation ...
        packed_obs.append(np.array([...]).reshape(1, -1))
    # Raises ValueError when packed_obs is empty, i.e. when obs has no agents.
    return np.concatenate(packed_obs, axis=0)

As you can see, this error can only occur when there is no agent to control while the env is not done, which is strange. Is this a bug, or am I misunderstanding something?
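For reference, the failure can be reproduced in isolation. This is a minimal sketch, assuming only that the list passed to np.concatenate ends up empty; the helper name pack_rows is hypothetical and stands in for the last line of pack_observation.

```python
import numpy as np


def pack_rows(rows):
    """Stack per-agent rows, like the final line of pack_observation."""
    return np.concatenate(rows, axis=0)


try:
    # An empty observation dict yields an empty list of per-agent rows.
    pack_rows([])
except ValueError as err:
    message = str(err)
    print(message)
```

The message printed here matches the ValueError in the log above, which is consistent with obs being empty on that step.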

P.S. My submission passed the validation phase, and I have never seen these errors before, either in training or in the validation phase.

Desired SMARTS version
[0.6.1]

Operating System
ubuntu 20.04
python 3.8
gym 0.19.0
eclipse-sumo 1.10.0

ZibinDong · 2022-10-01T05:42:42Z

stderr (6).txt

Gamenot · 2022-10-03T23:59:29Z

Hello @GrandpaDZB, it looks like the main cause of the exception is that, as you mentioned, there were no agents for some reason. The evaluation failed because the hidden evaluation retries a limited number of times, and the error appears to have occurred consistently on each attempt.

We are attempting to debug the issue. Your submission will be re-run when the fix is in.

Adaickalavan · 2022-10-11T22:11:19Z

Hi @GrandpaDZB,

It appears that your submission failed in a multiagent scenario. In multiagent scenarios, the agents may start at different time points in the simulation. Consider the following multiagent scenario with 3 agents.

| Time (s) | 0-10 | 11-20 | 21-30 | 31-40 |
|---|---|---|---|---|
| Active agents | Agent_0, Agent_1 | Agent_1 | None | Agent_2 |
| Observation.keys() | Agent_0, Agent_1 | Agent_1 | None | Agent_2 |

Agent_0 and Agent_1 start at time 0s, whereas Agent_2 starts at time 31s. Since both Agent_0 and Agent_1 are done by time 20s, the observation returned by SMARTS from 21s to 30s is an empty dictionary, even though the environment has not ended, because Agent_2 has not yet started or finished. This is the expected behaviour of SMARTS.

Please ensure that your policy is capable of handling complex multiagent scenarios.
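One way to handle such steps is to guard against an empty observation dictionary before packing. The following is a minimal sketch, not the competition's reference implementation: OBS_DIM, ACTION_DIM, encode_agent, and the zero-action placeholder policy are all hypothetical, and it assumes the environment accepts an empty action dictionary on steps with no active agents.

```python
from typing import Any, Dict

import numpy as np

OBS_DIM = 4     # hypothetical number of features per agent
ACTION_DIM = 3  # hypothetical action size per agent


def encode_agent(raw_obs: Any) -> np.ndarray:
    """Hypothetical encoder standing in for the elided packing logic."""
    return np.asarray(raw_obs, dtype=np.float32).reshape(1, OBS_DIM)


def pack_observation(obs: Dict[str, Any]) -> np.ndarray:
    """Pack per-agent observations; return a (0, OBS_DIM) array if none."""
    if not obs:
        # No active agents on this step: avoid np.concatenate on an empty list.
        return np.empty((0, OBS_DIM), dtype=np.float32)
    return np.concatenate([encode_agent(o) for o in obs.values()], axis=0)


def act(obs: Dict[str, Any]) -> Dict[str, np.ndarray]:
    """Return one action per active agent; an empty dict if there are none."""
    packed = pack_observation(obs)
    # Placeholder policy: a zero action for each packed row.
    actions = np.zeros((packed.shape[0], ACTION_DIM), dtype=np.float32)
    return {agent_id: actions[i] for i, agent_id in enumerate(obs)}
```

With this guard, act({}) returns {} during a gap such as 21s to 30s in the table above, and the policy resumes normally once Agent_2 appears in the observation.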

ZibinDong · 2022-10-12T08:43:47Z

Hi @Adaickalavan ,

Thanks for your reply and clear demonstration. Now I know how to modify the policy to handle these situations and thus I think this issue could be closed. Thanks again for your help.

ZibinDong added the "help wanted" label Oct 1, 2022

ZibinDong closed this as completed Oct 12, 2022

Adaickalavan mentioned this issue Apr 15, 2023

Update docs: Multiple agents may spawn at different times #1961

Merged
