Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add configurable format for notebook visualization #138

Closed
boeddeker opened this issue Jun 15, 2021 · 10 comments
Closed

Add configurable format for notebook visualization #138

boeddeker opened this issue Jun 15, 2021 · 10 comments

Comments

@boeddeker
Copy link
Contributor

In Ubuntu 20.04, the default graphviz (i.e. C code, not python) version is 2.43.0.
In https://gitlab.com/graphviz/graphviz/-/issues/1605 someone reported a bug,
that the scale was wrongly calculated (introduced in 2.42.0 and fixed in 2.42.4, not sure, how the versioning works, but 2.43.0 is between them).

I am not in the position to update the C graphviz, but I can easily update the Python graphviz (i.e. this package).
Are you interested in a PR, that has a workaround to fix this?

I observed this bug, when I tried to visualize a Graph in a Jupyter Lab Notebook.
While the small examples worked, the larger ones got broken.

Here is the image from https://gitlab.com/graphviz/graphviz/-/issues/1605 it
should give you an idea, what happened in my notebook (Sorry, my examples were too complicated to post them here):
image
When the graph gets to large, I see something like the first image, while small examples produce the correct output.

At the moment I use the following wrapper to fix the visualization, because I cannot replace the C lib.

class FixDotViewer:
    def __init__(self, dot):
        # SVG backend of dot broken since 2.42.0 and fixed in 2.42.4.
        # Not sure, how the versioning works, but it was reported for
        # 2.43.0, which is also broken. 
        # Ubuntu 20.04 has 2.43.0, hence, we need a workaround.
        # Bug reported in https://gitlab.com/graphviz/graphviz/-/issues/1605
        
        backend = 'svg fix scale'

        if backend == 'svg':
            def _repr_svg_():
                return dot._repr_svg_()
            self._repr_svg_ = _repr_svg_ 
        elif backend == 'svg fix scale':
            def _repr_svg_():
                svg = dot._repr_svg_()
                r = re.compile('scale\((\d+\.\d*) (\d+\.\d*)\)')

                m = r.search(svg)
                if m:
                    scale_x, scale_y = m.groups()
                    scale_x = float(scale_x)
                    scale_y = float(scale_y)
                    if scale_x > 1 or scale_y > 1:
                        if scale_x > 1:
                            scale_x = 1 / scale_x
                        if scale_y > 1:
                            scale_y = 1 / scale_y
                        svg = re.sub('scale\(\d+\.\d* \d+\.\d*\)', f'scale({scale_x} {scale_y})', svg)
                
                return svg
            self._repr_svg_ = _repr_svg_ 
            
        elif backend == 'pdf->svg':
            def _repr_svg_():
                import tempfile
                from pathlib import Path
                import subprocess
                pdf_data = dot.pipe(format='pdf')
                with tempfile.TemporaryDirectory() as tmpdir:
                    tmpdir = Path(tmpdir)
                    pdf = tmpdir / 'image.pdf'
                    pdf.write_bytes(pdf_data)
                    subprocess.run([
                        'inkscape',
                        '--without-gui',
                        f'--file={pdf}',
                        '--export-plain-svg=main.svg',
                    ], cwd=tmpdir)
                    return Path(tmpdir / 'main.svg').read_text()
            self._repr_svg_ = _repr_svg_ 
        elif backend == 'png':
            def _repr_png_():
                return dot.pipe(format='png')
            self._repr_png_ = _repr_png_ 
        elif backend == 'jpeg':
            def _repr_jpeg_():
                return dot.pipe(format='jpeg')
            self._repr_jpeg_ = _repr_jpeg_ 
        else:
            raise ValueError(dot_backend)

I am not sure, which workaround would be the best solution.
Examples:

  • Add global option for default Jupyter visualization backend, e.g. SVG, PDF, PNG, JPEG (Not sure, if more works)
    • Raise a warning for some graphviz versions, that the visualization may be broken and report how to fix it.
  • Fix the SVG Code with regex for some versions or, when a global flag is set.
@xflr6
Copy link
Owner

xflr6 commented Jun 15, 2021

From upstream https://gitlab.com/graphviz/graphviz/-/issues/1605 I understand that the upstream bug was fixed in upstream 2.42.4 but https://launchpad.net/ubuntu/+source/graphviz is still on 2.42.2 for the latest three Ubuntu versions starting with focal. So IIUC this would be something to report to the maintainer(s) of the Ubuntu package at https://bugs.launchpad.net (so it can be fixed for all Ubuntu users).

cannot control the installed graphviz version (i.e. the machines have Ubuntu 20.04 with graphviz 2.43.0).

I assume that you mean that the machines are tied to the latest version of focal (i.e. that they are updated regularly).

Add global option for default Jupyter visualization backend, e.g. SVG, PDF, PNG, JPEG

Independent of the upstream bug, I think we could still add configureable backend for notebook visualization (not sure though if e.g. PDF would work).

@boeddeker
Copy link
Contributor Author

I assume that you mean that the machines are tied to the latest version of focal (i.e. that they are updated regularly).

Yes.

Independent of the upstream bug, I think we could still add configureable backend for notebook visualization (not sure though if e.g. PDF would work).

For PDF I had this in my mind, but it is not well suited for this package.

In IPython.display.display they don't mention PDF, but since IPython.display.display_pdf exists, I tried _repr_pdf_ and it gets called. So PDF is also possible.

Here, some ideas, for a PR that I could create, where I need your opinion:

Implement _repr_png_, _repr_jpeg_ and _repr_pdf_ (Not sure, if _repr_json_ should be added, because I haven't checked the graphviz dot result and from IPython side it has a to high priority to be used)
Drawback: By default, IPython calculates all representations and the frontend decides which to use. This introduces an overhead.
The overhead can be reduced from user side, when the user uses display_svg.

An alternative would be to define _ipython_display_, but when this method is defined, all _repr_*_ are ignored. This would allow a parameter to control, which display style should be used.

A further alternative would be to dynamically define the _repr_*_ functions, but this is a bad code.

I must say, all of them have disadvantages. Defining _repr_*_ introduces an overhead, _ipython_display_ ignores the requested repr (e.g. display_svg and display_png will both display the same thing).

@xflr6 xflr6 changed the title Workaround for SVG bug in Graphviz backend version 2.43.0 Add configurable format for notebook visualization Oct 30, 2021
@xflr6
Copy link
Owner

xflr6 commented Oct 30, 2021

Thanks for your notes.

How about we implement https://ipython.readthedocs.io/en/stable/config/integrating.html#MyObject._repr_mimebundle_ returning a dict with one entry that then is configurable?

@xflr6
Copy link
Owner

xflr6 commented Nov 14, 2021

Made some room for this feature (feel free to send a PR):

class JupyterSvgIntegration(piping.Pipe):
"""Display rendered graph as SVG in Jupyter Notebooks and QtConsole."""
def _repr_svg_(self):
return self.pipe(format='svg', encoding=self._encoding)

@xflr6
Copy link
Owner

xflr6 commented Nov 16, 2021

@boeddeker
Copy link
Contributor Author

Thank you. I used that notebook for the PR.
Does the code of the PR goes in the right direction?
If yes, I will fix the tests (_repr_svg_ is not allowed, when _repr_mimebundle_ should be used) and add new ones for _repr_mimebundle_.

@xflr6
Copy link
Owner

xflr6 commented Nov 16, 2021

Does the code of the PR goes in the right direction?

Thanks. Yes, see my review: #150 (review)

Fixing/adapting the tests SG. Let me know if you have questions.

@xflr6
Copy link
Owner

xflr6 commented Nov 19, 2021

Merged #150 and applied more improvements, maybe you want to take a look.

@boeddeker
Copy link
Contributor Author

Your changes look good.
I tried now to add a new format, but I am not sure if it is good, that the old gets replaced.
(Has advantages and disadvantages)

Nevertheless, for documentation, if someone wants to replace a repr:

def repr_pdf_to_svg(dot):
    import tempfile
    from pathlib import Path
    import subprocess
    pdf_data = dot.pipe(format='pdf')
    with tempfile.TemporaryDirectory() as tmpdir:
        tmpdir = Path(tmpdir)
        pdf = tmpdir / 'image.pdf'
        pdf.write_bytes(pdf_data)
        subprocess.run([
            'inkscape',
            '--without-gui',
            f'--file={pdf}',
            '--export-plain-svg=main.svg',
        ], cwd=tmpdir)
        return Path(tmpdir / 'main.svg').read_text()

import graphviz
from IPython.display import display, display_svg, display_png, display_jpeg

graphviz.jupyter_integration.JupyterIntegration._repr_pdf_to_svg = repr_pdf_to_svg
# graphviz.jupyter_integration.JUPYTER_FORMATS['pdf_svg'] = 'image/svg+xml'
graphviz.jupyter_integration.MIME_TYPES['image/svg+xml'] = '_repr_pdf_to_svg'
# graphviz.set_jupyter_format('pdf_svg')
    
g = graphviz.Digraph('G', filename='hello.gv')
g.edge('Hello', 'World')
display(g)

@xflr6
Copy link
Owner

xflr6 commented Nov 28, 2021

Included in 0.19 release, closing. Thanks for the contribution.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

2 participants