-
Notifications
You must be signed in to change notification settings - Fork 7.1k
add hook to automatically use ipython3 lexer on examples #59984
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: master
Are you sure you want to change the base?
Conversation
Signed-off-by: Aydin Abiar <[email protected]>
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Code Review
This pull request introduces a Sphinx hook to automatically set the ipython3 lexer for specified notebook files. This is a great solution to prevent build failures caused by unsupported syntax in notebooks, like shell commands. The implementation is clean and correctly uses Sphinx's event system. I have a couple of suggestions to improve the code's robustness: one regarding more specific exception handling and another about adhering to file formatting conventions.
|
cc @aslonnie |
Signed-off-by: Aydin Abiar <[email protected]>
doc/source/conf.py
Outdated
| source[0] = json.dumps(notebook) | ||
| except Exception as e: | ||
| logger.warning(f"Failed to add ipython3 lexer to {docname}: {e}") | ||
| # don't fail the build for this |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
could you explain why?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
i'll remove the try/catch, sphinx seems to catch/logs hooks errors good enough :
catch JSONDecodeError example
Traceback (most recent call last):
--
| File "/home/docs/checkouts/readthedocs.org/user_builds/anyscale-ray/envs/59984/lib/python3.10/site-packages/sphinx/events.py", line 97, in emit
| results.append(listener.handler(self.app, *args))
| File "/home/docs/checkouts/readthedocs.org/user_builds/anyscale-ray/checkouts/59984/doc/source/conf.py", line 805, in apply_ipython3_lexer
| notebook = json.loads(source[0])
| File "/home/docs/.asdf/installs/python/3.10.17/lib/python3.10/json/__init__.py", line 346, in loads
| return _default_decoder.decode(s)
| File "/home/docs/.asdf/installs/python/3.10.17/lib/python3.10/json/decoder.py", line 337, in decode
| obj, end = self.raw_decode(s, idx=_w(s, 0).end())
| File "/home/docs/.asdf/installs/python/3.10.17/lib/python3.10/json/decoder.py", line 353, in raw_decode
| obj, end = self.scan_once(s, idx)
| json.decoder.JSONDecodeError: Expecting property name enclosed in double quotes: line 20 column 3 (char 1176)
|
| The above exception was the direct cause of the following exception:
|
| Traceback (most recent call last):
| File "/home/docs/checkouts/readthedocs.org/user_builds/anyscale-ray/envs/59984/lib/python3.10/site-packages/sphinx/cmd/build.py", line 337, in build_main
| app.build(args.force_all, args.filenames)
| File "/home/docs/checkouts/readthedocs.org/user_builds/anyscale-ray/envs/59984/lib/python3.10/site-packages/sphinx/application.py", line 351, in build
| self.builder.build_update()
| File "/home/docs/checkouts/readthedocs.org/user_builds/anyscale-ray/envs/59984/lib/python3.10/site-packages/sphinx/builders/__init__.py", line 293, in build_update
| self.build(to_build,
| File "/home/docs/checkouts/readthedocs.org/user_builds/anyscale-ray/envs/59984/lib/python3.10/site-packages/sphinx/builders/__init__.py", line 313, in build
| updated_docnames = set(self.read())
| File "/home/docs/checkouts/readthedocs.org/user_builds/anyscale-ray/envs/59984/lib/python3.10/site-packages/sphinx/builders/__init__.py", line 419, in read
| self._read_serial(docnames)
| File "/home/docs/checkouts/readthedocs.org/user_builds/anyscale-ray/envs/59984/lib/python3.10/site-packages/sphinx/builders/__init__.py", line 440, in _read_serial
| self.read_doc(docname)
| File "/home/docs/checkouts/readthedocs.org/user_builds/anyscale-ray/envs/59984/lib/python3.10/site-packages/sphinx/builders/__init__.py", line 497, in read_doc
| publisher.publish()
| File "/home/docs/checkouts/readthedocs.org/user_builds/anyscale-ray/envs/59984/lib/python3.10/site-packages/docutils/core.py", line 234, in publish
| self.document = self.reader.read(self.source, self.parser,
| File "/home/docs/checkouts/readthedocs.org/user_builds/anyscale-ray/envs/59984/lib/python3.10/site-packages/sphinx/io.py", line 106, in read
| self.input = self.read_source(settings.env)
| File "/home/docs/checkouts/readthedocs.org/user_builds/anyscale-ray/envs/59984/lib/python3.10/site-packages/sphinx/io.py", line 116, in read_source
| env.events.emit('source-read', env.docname, arg)
| File "/home/docs/checkouts/readthedocs.org/user_builds/anyscale-ray/envs/59984/lib/python3.10/site-packages/sphinx/events.py", line 108, in emit
| raise ExtensionError(
| sphinx.errors.ExtensionError: Handler <function apply_ipython3_lexer at 0x7e68f6114670> for event 'source-read' threw an exception (exception: Expecting property name enclosed in double quotes: line 20 column 3 (char 1176))
|
| Extension error:
| Handler <function apply_ipython3_lexer at 0x7e68f6114670> for event 'source-read' threw an exception (exception: Expecting property name enclosed in double quotes: line 20 column 3 (char 1176))
doc/source/conf.py
Outdated
| notebook.setdefault('metadata', {}) \ | ||
| .setdefault('language_info', {})['pygments_lexer'] = 'ipython3' | ||
| source[0] = json.dumps(notebook) | ||
| except Exception as e: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
is it possible to catch a more explicit exception type? what exception are we expecting? it is just json parsing, right? I do not thing the source[0] = json.dumps() line is expected to fail.
could you just wrap the json parsing with try..except and only catch json parsing errors (rather than all exception type)?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
yes these are just JSON errors, I also don't expect them to happen. I’ve removed the try/except since any exception here is caught by Sphinx, which will fail the build and show explicit error logs (what we want to happen in our CI checks)
Signed-off-by: Aydin Abiar <[email protected]>
doc/source/data/examples/unstructured_data_ingestion/content/unstructured_data_ingestion.ipynb
Outdated
Show resolved
Hide resolved
Signed-off-by: Aydin Abiar <[email protected]>
Issue
python3lexer on the entire notebook unless the writer specifically set one in the notebook metadata (which frankly never happens)python3lexer can not handle cells that starts with!such as!pip install ...or other command cells.ipython3that wraps aroundpython3and is able to parse command cellsIdeally, sphinx should be able to pick a lexer for each cell but it only supports notebook-level lexer
Solution
This PR automatically adds a
ipython3lexer to any notebook listed in any of the glob patterns inipython3_lexer_patterns. For more precision, they can also exclude files with other glob patterns inipython3_lexer_exclude_patterns. This ensures:exclude_patternsfor example)