qstat results can't be parsed by JSON decoder #324

Closed
cms21 opened this issue Mar 3, 2023 · 3 comments · Fixed by #326

Comments

@cms21
Contributor

cms21 commented Mar 3, 2023

Error:
2023-03-03 03:45:15.717 | 37017 | ERROR | balsam:120] Uncaught Exception <class 'json.decoder.JSONDecodeError'>: Invalid \escape: line 20613 column 407 (char 1728196)
Traceback (most recent call last):
  File "/home/csimpson/polaris/env/lib/python3.8/site-packages/balsam/util/process.py", line 17, in run
    self._run()
  File "/home/csimpson/polaris/env/lib/python3.8/site-packages/balsam/site/service/service_base.py", line 23, in _run
    self.run_cycle()
  File "/home/csimpson/polaris/env/lib/python3.8/site-packages/balsam/site/service/scheduler.py", line 128, in run_cycle
    scheduler_jobs = self.scheduler.get_statuses(user=self.username)
  File "/home/csimpson/polaris/env/lib/python3.8/site-packages/balsam/platform/scheduler/scheduler.py", line 142, in get_statuses
    stat_dict = cls._parse_status_output(stdout)
  File "/home/csimpson/polaris/env/lib/python3.8/site-packages/balsam/platform/scheduler/pbs_sched.py", line 180, in _parse_status_output
    j = json.loads(raw_output)
  File "/soft/datascience/conda/2022-09-08/mconda3/lib/python3.8/json/__init__.py", line 357, in loads
    return _default_decoder.decode(s)
  File "/soft/datascience/conda/2022-09-08/mconda3/lib/python3.8/json/decoder.py", line 337, in decode
    obj, end = self.raw_decode(s, idx=_w(s, 0).end())
  File "/soft/datascience/conda/2022-09-08/mconda3/lib/python3.8/json/decoder.py", line 353, in raw_decode
    obj, end = self.scan_once(s, idx)
json.decoder.JSONDecodeError: Invalid \escape: line 20613 column 407 (char 1728196)

I found this error in the service log of an unstable site on Polaris (according to the server, the site kept going down). Every time the site was restarted, the error occurred again. At some point the site seemed to regain stability: the error stopped appearing and the site remained active.
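For anyone trying to locate the offending text in a large qstat dump, here is a minimal sketch that reproduces the same class of error and prints the characters around the failure. The helper name `describe_json_error` and the sample record are hypothetical, made up for illustration. They are not taken from Balsam or from the actual qstat output.

```python
import json


def describe_json_error(raw_output: str, context: int = 40) -> str:
    """Return a description of where decoding failed, or "" if the text parsed."""
    try:
        json.loads(raw_output)
    except json.JSONDecodeError as exc:
        # exc.pos is the character offset reported as "(char N)" in the traceback.
        start = max(exc.pos - context, 0)
        snippet = raw_output[start : exc.pos + context]
        return f"{exc.msg} at line {exc.lineno} column {exc.colno}: {snippet!r}"
    return ""


# Hypothetical record: a backslash followed by a character that is not a
# valid JSON escape (here "\q") triggers "Invalid \escape".
bad_record = r'{"Jobs": {"1.host": {"Variable_List": {"SOME_VAR": "C:\qtmp"}}}}'

print(describe_json_error(bad_record))
```

Running the helper on the raw stdout of the failing qstat call should show which job attribute contains the stray backslash.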

@cms21
Contributor Author

cms21 commented Mar 7, 2023

Working on this in branch pbs-qstat-bugs.

@cms21 cms21 mentioned this issue Mar 8, 2023
@cms21 cms21 reopened this Mar 23, 2023
@cms21
Contributor Author

cms21 commented Mar 23, 2023

A wrinkle:
I ran a script on Polaris that called qstat -f -F json every minute and decoded the returned JSON, stopping at the first output that failed to decode. Decoding failed on this line, where the value is an unquoted run of zeros (JSON numbers cannot have leading zeros):
"CI_COMMIT_BEFORE_SHA":0000000000000000000000000000000000000000,
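A small check of why that line breaks a decoder, as a sketch: the line is wrapped in braces here (an assumption, only so that it forms a complete JSON document), and `is_valid_json` is a hypothetical helper name.

```python
import json

# The failing line from the qstat output, wrapped in braces so it forms a
# complete JSON document (the wrapping is an assumption for illustration).
unquoted = '{"CI_COMMIT_BEFORE_SHA":0000000000000000000000000000000000000000}'

# The same value as a quoted string, which is what a valid encoder would emit.
quoted = '{"CI_COMMIT_BEFORE_SHA":"0000000000000000000000000000000000000000"}'


def is_valid_json(text: str) -> bool:
    """Return True if text decodes as JSON, False otherwise."""
    try:
        json.loads(text)
    except json.JSONDecodeError:
        return False
    return True


print(is_valid_json(unquoted))  # False
print(is_valid_json(quoted))    # True
```

So this failure is a different malformation from the `Invalid \escape` in the original report. Both come from values that were written into the JSON without proper quoting or escaping.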

@cms21
Contributor Author

cms21 commented Apr 13, 2023

We think that, with the merged PR that staged the qstat query, a Balsam site is unlikely to encounter this problem with PBS. We've reported this PBS issue to operations.
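Separately from the merged PR (whose changes are not shown in this thread), one defensive option is to treat an undecodable qstat result as "no data this cycle" rather than letting the exception kill the service. The sketch below only illustrates that idea. `parse_qstat_json` is a hypothetical name, not a Balsam function.

```python
import json
import logging
from typing import Any, Optional

logger = logging.getLogger(__name__)


def parse_qstat_json(raw_output: str) -> Optional[Any]:
    """Decode qstat JSON output, returning None if the output is malformed."""
    try:
        return json.loads(raw_output)
    except json.JSONDecodeError as exc:
        # Log and skip instead of raising, so the caller can retry next cycle.
        logger.warning("Skipping unparseable qstat output: %s", exc)
        return None


# Hypothetical inputs: one well-formed result and one with an invalid escape.
good = parse_qstat_json('{"Jobs": {}}')
bad = parse_qstat_json(r'{"Jobs": {"1.host": {"SOME_VAR": "\q"}}}')
```

A caller would then check for `None` and skip the status update for that cycle. The trade-off is that a persistently malformed job record would stall updates until that job leaves the queue.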

@cms21 cms21 closed this as completed Apr 13, 2023