From time to time I see the following error when using the IBM backend. I would like to know whether you are experiencing it too and whether you have an idea what the root cause is.
While running a circuit that normally executes just fine, I get the following exception log:
../../.local/lib/python3.5/site-packages/projectq/cengines/_main.py:304: in flush
self.receive([Command(self, FlushGate(), ([WeakQubitRef(self, -1)],))])
../../.local/lib/python3.5/site-packages/projectq/cengines/_main.py:266: in receive
self.send(command_list)
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _
self = <projectq.cengines._main.MainEngine object at 0x7fbfb2ee2ac8>, command_list = [<projectq.ops._command.Command object at 0x7fbfb2ee2630>]
def send(self, command_list):
"""
Forward the list of commands to the next engine in the pipeline.
It also shortens exception stack traces if self.verbose is False.
"""
try:
self.next_engine.receive(command_list)
except:
if self.verbose:
raise
else:
exc_type, exc_value, exc_traceback = sys.exc_info()
# try:
last_line = traceback.format_exc().splitlines()
compact_exception = exc_type(str(exc_value) +
'\n raised in:\n' +
repr(last_line[-3]) +
"\n" + repr(last_line[-2]))
compact_exception.__cause__ = None
> raise compact_exception # use verbose=True for more info
E Exception: Failed to run the circuit. Aborting.
E raised in:
E ' File "/home/cgogolin/.local/lib/python3.5/site-packages/projectq/backends/_ibm/_ibm.py", line 295, in _run'
E ' raise Exception("Failed to run the circuit. Aborting.")'
../../.local/lib/python3.5/site-packages/projectq/cengines/_main.py:288: Exception
and on the console I then see:
- There was an error running your code:
502 Server Error: Bad Gateway for url: https://quantumexperience.ng.bluemix.net/api/users/login
The error occurs from time to time regardless of the type of circuit I run and regardless of the internet connection I use, so I can rule out simple connection problems on my end.
Running with verbose=True reveals that the error originates in _run(self), around line 260 of _ibm.py, namely:
> counts = res['data']['counts']
E TypeError: 'NoneType' object is not subscriptable
i.e., res = send(...) returned None instead of actual results.
A straightforward workaround for me is to simply call send(...) until it returns a non-None result, e.g., as follows:
if self._retrieve_execution is None:
    res = None
    retries = 10
    while res is None and retries > 0:
        retries -= 1
        res = send(info, device=self.device,
                   user=self._user, password=self._password,
                   shots=self._num_runs, verbose=self._verbose)
In practice I virtually never need more than a second attempt to get a result. This makes me believe that the problem is not related to me sending too many queries or to other rate-limiting mechanisms.
I have found other people reporting similar spurious 502 errors on Bluemix. Their application is (probably) not at all quantum related, so maybe we are just suffering from some classical middleware misconfiguration?
Could/Should ProjectQ handle such errors more gracefully?
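To make the question concrete, here is a minimal sketch of what "more graceful" handling could look like: a bounded retry that raises a descriptive error instead of letting a None result surface later as a TypeError. The helper name call_with_retries and its parameters are hypothetical and not part of ProjectQ; the callable passed in stands in for the send(...) call from the workaround above.

```python
import time


def call_with_retries(func, max_attempts=3, delay=0.0):
    """Call func() until it returns a non-None value.

    Hypothetical helper (not part of ProjectQ). func stands in for a
    call such as send(...) that sometimes returns None on a transient
    server error.
    """
    for attempt in range(1, max_attempts + 1):
        result = func()
        if result is not None:
            return result
        # Wait before the next attempt, but not after the last one.
        if attempt < max_attempts:
            time.sleep(delay)
    # Fail with a clear message rather than returning None.
    raise RuntimeError(
        "No result after {} attempts".format(max_attempts))
```

With something like this, _run() could wrap the send(...) call in a lambda and would either get a usable result or fail with a message that names the real problem.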
Digging a little deeper, I see that send() calls _get_result(), which already has a retry mechanism built in. The only way I can see in which _get_result(), and then send(), can return None without raising an Exception is if the JSON returned by requests.get() contains the element r_json['qasms'][0]['result'] and this element is None. I can thus also fix the problem by adding the condition "and qasm['result'] is not None" to the second-to-last line of the following code in _get_result():
for retries in range(num_retries):
    r = requests.get(urljoin(_api_url, suffix),
                     params={"access_token": access_token})
    r.raise_for_status()
    r_json = r.json()
    if 'qasms' in r_json:
        qasm = r_json['qasms'][0]
        if 'result' in qasm and qasm['result'] is not None:
            return qasm['result']
On a related note: aren't the default values num_retries=3000 and interval=1 of _get_result(), which result in a total waiting time of nearly one hour before the timeout, a bit long? Wouldn't it be nice to make them user-customizable?
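As a rough illustration of both points (skipping None results and letting the caller choose the polling budget), a polling loop could be sketched as follows. This is not the actual ProjectQ code: poll_for_result, fetch_json, and the sleep parameter are hypothetical names, fetch_json stands in for the requests.get(...).json() call, and the default values are placeholders chosen for the example.

```python
import time


def poll_for_result(fetch_json, num_retries=300, interval=1.0,
                    sleep=time.sleep):
    """Poll until a non-None result is available or the budget runs out.

    Hypothetical sketch. fetch_json is a callable returning the decoded
    JSON reply as a dict; num_retries and interval are caller-supplied,
    so the worst-case wait is roughly num_retries * interval seconds.
    """
    for _ in range(num_retries):
        r_json = fetch_json()
        qasms = r_json.get('qasms') or []
        if qasms:
            result = qasms[0].get('result')
            # Treat a present-but-None result like a missing one.
            if result is not None:
                return result
        sleep(interval)
    raise TimeoutError(
        "No result after {} polls".format(num_retries))
```

Raising on timeout, rather than falling through and returning None, keeps the failure close to its cause.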