Competition submission failed with some unexpected errors #1652

Closed
ZibinDong opened this issue Oct 1, 2022 · 4 comments
Labels
help wanted Extra attention is needed

Comments

ZibinDong commented Oct 1, 2022

High Level Description
Hi! I am trying to submit my model to Track 1 of the 2022 NeurIPS Driving SMARTS Competition. The submission failed with unexpected errors that appear to come from SMARTS rather than from my code.

The error log is too long to paste in full, so I have picked the last few lines, which I think show how the submission fails. (I have also uploaded the complete error log in case you need it.)

ERROR:/tmp/codalab/tmpKxmG8L/run/program/evaluate.py:Evaluation failed due to error. Attempting retry: evaluate..ProcessContext(process_builder=functools.partial(.process_builder_func at 0x7f44cb3bb550>, env_ctor=functools.partial(, env_type='smarts.env:multi-scenario-v0', scenario='/tmp/codalab/tmpKxmG8L/run/input/ref/eval_scenarios/naturalistic/mrg4-agents_2', shared_configs={'action_space': 'TargetPose', 'img_meters': 50, 'img_pixels': 112, 'sumo_headless': True}, seed=42, wrapper_ctors=), policy_type=), env_name='/tmp/codalab/tmpKxmG8L/run/input/ref/eval_scenarios/naturalistic/mrg4-agents_2', retries=1, last_reply=2560374.018741618)
ERROR:/tmp/codalab/tmpKxmG8L/run/program/evaluate.py:Scoring skipped for evaluation because retries expended: evaluate..ProcessContext(process_builder=functools.partial(.process_builder_func at 0x7f44cb3bb550>, env_ctor=functools.partial(, env_type='smarts.env:multi-scenario-v0', scenario='/tmp/codalab/tmpKxmG8L/run/input/ref/eval_scenarios/naturalistic/mrg4-agents_2', shared_configs={'action_space': 'TargetPose', 'img_meters': 50, 'img_pixels': 112, 'sumo_headless': True}, seed=42, wrapper_ctors=), policy_type=), env_name='/tmp/codalab/tmpKxmG8L/run/input/ref/eval_scenarios/naturalistic/mrg4-agents_2', retries=1, last_reply=2560374.018741618)
Traceback (most recent call last):
  File "/tmp/codalab/tmpKxmG8L/run/program/evaluate.py", line 500, in 
    rank = evaluate(config)
  File "/tmp/codalab/tmpKxmG8L/run/program/evaluate.py", line 222, in evaluate
    kill_all(*running_processes)
  File "/tmp/codalab/tmpKxmG8L/run/program/evaluate.py", line 146, in kill_all
    if process.is_alive():
AttributeError: 'tuple' object has no attribute 'is_alive'
Process SpawnProcess-130:
Traceback (most recent call last):
  File "/usr/lib/python3.8/multiprocessing/process.py", line 315, in _bootstrap
    self.run()
  File "/usr/lib/python3.8/multiprocessing/process.py", line 108, in run
    self._target(*self._args, **self._kwargs)
  File "/tmp/codalab/tmpKxmG8L/run/program/evaluate.py", line 353, in _worker
    output_result["response_tick"] = time.monotonic()
  File "", line 2, in __setitem__
  File "/usr/lib/python3.8/multiprocessing/managers.py", line 834, in _callmethod
    conn.send((self._id, methodname, args, kwds))
  File "/usr/lib/python3.8/multiprocessing/connection.py", line 206, in send
    self._send_bytes(_ForkingPickler.dumps(obj))
  File "/usr/lib/python3.8/multiprocessing/connection.py", line 411, in _send_bytes
    self._send(header + buf)
  File "/usr/lib/python3.8/multiprocessing/connection.py", line 368, in _send
    n = write(self._handle, buf)
BrokenPipeError: [Errno 32] Broken pipe
Exception ignored in: 
Traceback (most recent call last):
  File "/opt/.venv/lib/python3.8/site-packages/smarts/core/smarts.py", line 856, in __del__
TypeError: 'NoneType' object is not callable
:device(error): Error adding inotify watch on /dev/input: No such file or directory
:device(error): Error opening directory /dev/input: No such file or directory
Process SpawnProcess-127:
Traceback (most recent call last):
  File "/usr/lib/python3.8/multiprocessing/process.py", line 315, in _bootstrap
    self.run()
  File "/usr/lib/python3.8/multiprocessing/process.py", line 108, in run
    self._target(*self._args, **self._kwargs)
  File "/tmp/codalab/tmpKxmG8L/run/program/evaluate.py", line 353, in _worker
    output_result["response_tick"] = time.monotonic()
  File "", line 2, in __setitem__
  File "/usr/lib/python3.8/multiprocessing/managers.py", line 834, in _callmethod
    conn.send((self._id, methodname, args, kwds))
  File "/usr/lib/python3.8/multiprocessing/connection.py", line 206, in send
    self._send_bytes(_ForkingPickler.dumps(obj))
  File "/usr/lib/python3.8/multiprocessing/connection.py", line 411, in _send_bytes
    self._send(header + buf)
  File "/usr/lib/python3.8/multiprocessing/connection.py", line 368, in _send
    n = write(self._handle, buf)
BrokenPipeError: [Errno 32] Broken pipe
Process SpawnProcess-129:
Traceback (most recent call last):
  File "/usr/lib/python3.8/multiprocessing/managers.py", line 827, in _callmethod
    conn = self._tls.connection
AttributeError: 'ForkAwareLocal' object has no attribute 'connection'

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/usr/lib/python3.8/multiprocessing/process.py", line 315, in _bootstrap
    self.run()
  File "/usr/lib/python3.8/multiprocessing/process.py", line 108, in run
    self._target(*self._args, **self._kwargs)
  File "/tmp/codalab/tmpKxmG8L/run/program/evaluate.py", line 353, in _worker
    output_result["response_tick"] = time.monotonic()
  File "", line 2, in __setitem__
  File "/usr/lib/python3.8/multiprocessing/managers.py", line 831, in _callmethod
    self._connect()
  File "/usr/lib/python3.8/multiprocessing/managers.py", line 818, in _connect
    conn = self._Client(self._token.address, authkey=self._authkey)
  File "/usr/lib/python3.8/multiprocessing/connection.py", line 502, in Client
    c = SocketClient(address)
  File "/usr/lib/python3.8/multiprocessing/connection.py", line 630, in SocketClient
    s.connect(address)
FileNotFoundError: [Errno 2] No such file or directory
Exception ignored in: 
Traceback (most recent call last):
  File "/opt/.venv/lib/python3.8/site-packages/smarts/core/smarts.py", line 856, in __del__
TypeError: 'NoneType' object is not callable
Exception ignored in: 
Traceback (most recent call last):
  File "/opt/.venv/lib/python3.8/site-packages/smarts/core/smarts.py", line 856, in __del__
TypeError: 'NoneType' object is not callable

Could you please let me know why and how this happens?
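
One possible reading of the first traceback, offered only as an assumption because evaluate.py is not shown here: `kill_all` received tuples (for example `(process, context)` pairs) rather than bare process objects, so `process.is_alive()` raised the `AttributeError`. The later `BrokenPipeError` and `FileNotFoundError` in the worker processes are consistent with the parent process having exited first. The sketch below uses a hypothetical `kill_all` that accepts either form; it is not the competition's actual code.

```python
from typing import Any


def kill_all(*entries: Any) -> int:
    """Terminate every live process in entries; return how many were terminated.

    Each entry may be a process-like object or a tuple whose first element is
    one. The tuple layout is an assumption, chosen only to mirror the
    AttributeError in the traceback.
    """
    terminated = 0
    for entry in entries:
        # Unwrap (process, ...) tuples; pass bare process objects through.
        process = entry[0] if isinstance(entry, tuple) else entry
        if process.is_alive():
            process.terminate()
            terminated += 1
    return terminated
```

Any object with `is_alive()` and `terminate()` methods, such as `multiprocessing.Process`, can be passed in either form.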

I also noticed another unexpected error.

File "/tmp/codalab/tmpKxmG8L/run/input/res/wrappers.py", line 258, in pack_observation
    return np.concatenate(packed_obs, axis=0)
  File "<__array_function__ internals>", line 180, in concatenate
ValueError: need at least one array to concatenate

The pseudocode of the pack_observation function in wrappers.py is as follows:

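import numpy as np
from typing import Any, Dict

def pack_observation(self, obs: Dict[str, Any]) -> np.ndarray:
    """Pack the raw SMARTS observation into a NumPy observation."""
    packed_obs = []
    for agent_id, raw_obs in obs.items():
        # Pack this agent's observation into one row (details elided).
        packed_obs.append(np.array([...]).reshape(1, -1))
    # Raises ValueError when obs is empty, since packed_obs has no arrays.
    return np.concatenate(packed_obs, axis=0)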

As you can see, this error can only occur when there is no agent to control while the env is not done, which seems odd. Is this a bug, or am I misunderstanding something?
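
The failure can be reproduced in isolation: `np.concatenate` raises this `ValueError` when it is given an empty list. Below is a minimal sketch of a guarded variant. It assumes a fixed per-agent feature length (`OBS_DIM`, a hypothetical constant) and uses a placeholder `encode_agent` helper that stands in for the packing logic elided above.

```python
from typing import Any, Dict

import numpy as np

OBS_DIM = 4  # hypothetical per-agent feature length, for illustration only


def encode_agent(raw_obs: Any) -> np.ndarray:
    """Placeholder for the real per-agent packing logic (not shown in the issue)."""
    return np.asarray(raw_obs, dtype=np.float32).reshape(1, -1)


def pack_observation(obs: Dict[str, Any]) -> np.ndarray:
    """Pack per-agent observations into an array of shape (num_agents, OBS_DIM)."""
    packed_obs = [encode_agent(raw_obs) for raw_obs in obs.values()]
    if not packed_obs:
        # No active agents this step: return an empty batch instead of
        # letting np.concatenate raise on an empty list.
        return np.zeros((0, OBS_DIM), dtype=np.float32)
    return np.concatenate(packed_obs, axis=0)
```

Returning a zero-row array keeps the output type stable, so downstream code only needs to cope with a batch size of zero rather than a special sentinel value.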

P.S. My submission passed the validation phase, and I have never seen these errors before, either in training or in validation.

Desired SMARTS version
[0.6.1]

Operating System
ubuntu 20.04
python 3.8
gym 0.19.0
eclipse-sumo 1.10.0

ZibinDong added the "help wanted" (Extra attention is needed) label Oct 1, 2022
@ZibinDong
Author

stderr (6).txt

Collaborator

Gamenot commented Oct 3, 2022

Hello @GrandpaDZB, it looks like the main cause of the exception is that, as you mentioned, there were no agents for some reason. The hidden evaluation retries a failed scenario a limited number of times, and the failure appears to have recurred on every retry, so the evaluation failed.

We are attempting to debug the issue. Your submission will be re-run when the fix is in.

Member

Adaickalavan commented Oct 11, 2022

Hi @GrandpaDZB,

It appears that your submission failed in a multiagent scenario. In multiagent scenarios, the agents may start at different time points in the simulation. Consider the following multiagent scenario with 3 agents.

| Time (s) | 0-10 | 11-20 | 21-30 | 31-40 |
| --- | --- | --- | --- | --- |
| Active agents | Agent_0, Agent_1 | Agent_1 | None | Agent_2 |
| Observation.keys() | Agent_0, Agent_1 | Agent_1 | None | Agent_2 |

Agent_0 and Agent_1 start at time 0s, whereas Agent_2 starts at time 31s. Since both Agent_0 and Agent_1 are done by time 20s, the observation returned by SMARTS from 21s to 30s is an empty dictionary, even though the environment has not ended, because Agent_2 is not yet done. This is the expected behaviour of SMARTS.

Please ensure that your policy is capable of handling complex multiagent scenarios.
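
The timeline described above can be sketched with a toy stand-in for the environment. Nothing here calls SMARTS: `TIMELINE`, `active_agents`, `act`, `rollout`, and the `"keep_lane"` action are hypothetical names for illustration, and one step per second is assumed for brevity. The point is only that the policy maps an empty observation dictionary to an empty action dictionary and keeps stepping.

```python
from typing import Dict, List, Tuple

# Toy timeline: (first_second, last_second, active_agent_ids).
TIMELINE: List[Tuple[int, int, List[str]]] = [
    (0, 10, ["Agent_0", "Agent_1"]),
    (11, 20, ["Agent_1"]),
    (21, 30, []),
    (31, 40, ["Agent_2"]),
]


def active_agents(t: int) -> List[str]:
    """Return the agents that are active at second t."""
    for start, end, agents in TIMELINE:
        if start <= t <= end:
            return agents
    return []


def act(obs: Dict[str, float]) -> Dict[str, str]:
    """Hypothetical policy: one action per observed agent, none if obs is empty."""
    return {agent_id: "keep_lane" for agent_id in obs}


def rollout() -> List[int]:
    """Step through the toy timeline and record the seconds with no active agent."""
    empty_steps = []
    for t in range(0, 41):
        obs = {agent_id: 0.0 for agent_id in active_agents(t)}
        actions = act(obs)  # an empty dict in gives an empty dict out
        assert actions.keys() == obs.keys()
        if not obs:
            empty_steps.append(t)
        # The episode ends only after the last agent is done (t == 40 here),
        # so the loop keeps stepping through the empty interval.
    return empty_steps


print(rollout())  # seconds 21 through 30
```

A policy written this way never needs to special-case the gap between agents; any per-agent packing step simply receives zero agents.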

@ZibinDong
Author

Hi @Adaickalavan,

Thanks for your reply and the clear demonstration. I now know how to modify the policy to handle these situations, so I think this issue can be closed. Thanks again for your help.
