RFC: add multimedia I/O mechanism (display, mm_write, and friends); fixes #3817 #3932

Merged on Aug 8, 2013 (2 commits)


stevengj
Member

@stevengj stevengj commented Aug 4, 2013

This patch addresses #3817 to provide a general mechanism in Julia to display objects via rich multimedia representations, via three components:

  • A function display(x) to request the richest available multimedia display of a Julia object x (with a text/plain fallback).
  • Overloading writemime allows one to indicate arbitrary multimedia representations (keyed by standard MIME types) of user-defined types.
  • Multimedia-capable display backends may be registered by subclassing a generic Display type and pushing them onto a stack of display backends via pushdisplay.

This design has gone through several iterations since the inception of #3817, and is in part inspired by the design of IPython's custom display logic. It is especially critical for the development of IJulia, where I have tested this technique and found that it works very well.

The main difference from my most recent proposal in #3817 is that I generalized the mechanism to be extensible to arbitrary MIME types without modifying Base, using a parametric singleton-type technique similar to @StefanKarpinski's beautiful MathConst trick in order to allow Julia to dispatch on the MIME type. For example, if I have a type MyImage that I know how to write as PNG, I can simply define a new writemime method:

import Base.writemime
writemime(stream, ::@MIME("image/png"), x::MyImage) = ...write x as PNG to stream...

and any MyImage object will automatically display as a PNG image in IJulia or in any other display device supporting image/png.
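The singleton-type dispatch idea can be sketched in a few lines. This is only an illustration under assumptions: `MimeTag`, `writeas`, and the `pixels` field are hypothetical stand-ins invented here, not the definitions added by this patch.

```julia
# Hypothetical stand-in for the patch's MIME type: one distinct type per MIME string.
struct MimeTag{S} end
MimeTag(s::AbstractString) = MimeTag{Symbol(s)}()

# Hypothetical user type holding already-encoded bytes.
struct MyImage
    pixels::Vector{UInt8}
end

# One method per (MIME type, object type) pair; dispatch selects the format.
writeas(io::IO, ::MimeTag{Symbol("image/png")}, x::MyImage) = write(io, x.pixels)
writeas(io::IO, ::MimeTag{Symbol("text/plain")}, x::MyImage) =
    print(io, "MyImage with ", length(x.pixels), " bytes")

img = MyImage(UInt8[1, 2, 3])

text_buf = IOBuffer()
writeas(text_buf, MimeTag("text/plain"), img)
text = String(take!(text_buf))

png_buf = IOBuffer()
writeas(png_buf, MimeTag("image/png"), img)
png = take!(png_buf)
```

Because the MIME string is a type parameter, a package can add a method for a new format without touching the code that defines `MimeTag`.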

See the changes to doc/stdlib/base.rst for the complete documentation.

I also added and documented a base64(x) function to base64-encode binary data, as this is generically useful for sending MIME data to backends over string-based transport protocols (and is essential for IJulia), based on some code by @StefanKarpinski.
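As a rough illustration of why base64 matters for string-based transports, here is a sketch using the `Base64` standard library that ships with current Julia releases. Its `base64encode` plays the role of the `base64(x)` function described above; the four bytes are just the start of the PNG signature, used as sample data.

```julia
using Base64

# First four bytes of a PNG file signature, standing in for real image data.
png_bytes = UInt8[0x89, 0x50, 0x4e, 0x47]

# Encode to an ASCII string that is safe to embed in a JSON message.
encoded = base64encode(png_bytes)

# A receiver can recover the original bytes.
decoded = base64decode(encoded)
```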

cc: @JeffBezanson, @viralshah, @timholy, @fperez, @loladiro

@stevengj
Member Author

stevengj commented Aug 4, 2013

There is also a redisplay(x) function that defaults to simply calling display, but which a backend may optionally override in order to modify an existing display for x (as opposed to, for example, opening a new image window). IJulia uses this to defer display until an entire input cell is executed.

This is useful for Pylab/Matlab-like stateful plotting modules in which a plot is created (e.g. by plot) and then modified many times (e.g. by xlabel, title, etcetera); each of these functions would simply call redisplay.
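A minimal sketch of the deferral idea, assuming a hypothetical backend: `DeferredDisplay`, `toy_redisplay`, `flush_cell!`, `ToyFigure`, and `set_title!` are names invented for this illustration and are not IJulia's implementation.

```julia
# Hypothetical stateful figure; mutable so that identity (===) is meaningful.
mutable struct ToyFigure
    title::String
end

# Hypothetical backend that queues objects instead of showing them immediately.
struct DeferredDisplay
    pending::Vector{Any}
end
DeferredDisplay() = DeferredDisplay(Any[])

# Queue x at most once, no matter how often it is modified.
function toy_redisplay(d::DeferredDisplay, x)
    any(y -> y === x, d.pending) || push!(d.pending, x)
    return nothing
end

# Called once at the end of an input cell: hand back everything queued.
function flush_cell!(d::DeferredDisplay)
    shown = copy(d.pending)
    empty!(d.pending)
    return shown
end

# A plotting-style function mutates the figure and then requests a redisplay.
set_title!(d, fig, t) = (fig.title = t; toy_redisplay(d, fig))

d = DeferredDisplay()
fig = ToyFigure("")
toy_redisplay(d, fig)          # the initial "plot" call
set_title!(d, fig, "Draft")
set_title!(d, fig, "Final")
shown = flush_cell!(d)
```

The figure is shown once, in its final state, even though three calls requested a redisplay.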

stevengj added a commit to JuliaLang/IJulia.jl that referenced this pull request Aug 4, 2013
@ViralBShah
Member

I like the simplicity of this approach. @dcjones - IIRC, you have a base64 function in Codecs.jl. Could you take a look?

@timholy
Member

timholy commented Aug 4, 2013

Let me test my understanding on a concrete example. Now we have two ways of showing profiling data: as text (using Profile.print()) and as an image (defined in ProfileView.jl). Let's say I've gotten my profiling data all wrapped up in a ProfileData type (which doesn't currently exist, but it easily could), with that variable called p.

Now IIUC I would register my two methods something like this:

mm_write(io, ::@MIME("text/plain"), p::ProfileData) = Profile.print(io, p)
mm_write(io, ::@MIME("image/png"), p::ProfileData) = ProfileView.view(io, p)

Really the last one pops up a GUI, so "image/png" doesn't seem quite correct. How does one distinguish between popping up a Tk GUI vs. base64-encoding a bytestream to send to IPython? That's the job of io? What if I don't supply io, and I haven't yet popped up any Tk windows?

@stevengj
Member Author

stevengj commented Aug 4, 2013

@timholy, mm_write does not "pop up a GUI". It simply writes p in image/png format to io, which is just an ordinary binary stream (io::IO). So, you'd probably want it to call some kind of ProfileView.export_png(io, p) function.

The GUI is "popped up" (if at all) during display(p) by an instance of a Display subtype which has been pushed onto the display stack. For instance, IJulia pushes its own InlineDisplay instance onto the display stack. When display(p) is called, it invokes display(d, p) in turn for each display d in the stack until one succeeds. display(d, p), in turn, calls mm_writable(mime, typeof(p)) (true if the corresponding mm_write method exists) for each MIME type mime that d knows how to display. So, for example, the IJulia InlineDisplay knows how to display any object that is writable as image/png, image/jpeg, image/svg+xml, text/html, application/x-latex, or text/plain, and will succeed for any p such that mm_write is defined for one of these types.

For a supported MIME type, display(d, p) will normally call mm_repr(mime, p) or mm_string_repr(mime, p), which allocates a memory buffer (IOBuffer) and calls mm_write(io, mime, p) to write p to that buffer in the requested format. Given that data, the display is then responsible for displaying it (by popping up a GUI, writing it to disk, tossing it in the trash, or whatever it wants to do ... in IJulia's case, we send it to the IPython front-end, which sends it to the browser).

The key is that we separate multimedia export of Julia types from multimedia display of the exported data.
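The separation between export and display can be sketched with a toy model. Everything below is hypothetical (`ToyDisplay`, `export_as`, `can_export`, `show_on`, `show_any`, and the two display types); it mirrors the roles of mm_write, mm_writable, and the display stack described above rather than reproducing the patch's code.

```julia
# Toy model of the display stack; every name here is hypothetical.
abstract type ToyDisplay end
struct TextConsole <: ToyDisplay end    # only understands text/plain
struct ImageWindow <: ToyDisplay end    # only understands image/png

supported(::TextConsole) = ["text/plain"]
supported(::ImageWindow) = ["image/png"]

struct Plot end
struct Note end

# Export layer: write x to io in a given format (the role mm_write plays).
export_as(io::IO, ::Val{Symbol("image/png")}, ::Plot) = write(io, UInt8[0x89, 0x50, 0x4e, 0x47])
export_as(io::IO, ::Val{Symbol("text/plain")}, x) = print(io, "a ", nameof(typeof(x)))

# True if an export method exists for this format and type (the role mm_writable plays).
can_export(mime, x) = hasmethod(export_as, Tuple{IO, Val{Symbol(mime)}, typeof(x)})

# Display layer: export into a memory buffer, then "show" the bytes.
function show_on(d::ToyDisplay, x)
    for mime in supported(d)
        if can_export(mime, x)
            buf = IOBuffer()
            export_as(buf, Val(Symbol(mime)), x)
            return (display = d, mime = mime, data = take!(buf))
        end
    end
    throw(MethodError(show_on, (d, x)))
end

# Try each display, most recently pushed first, until one succeeds.
function show_any(stack, x)
    for d in reverse(stack)
        try
            return show_on(d, x)
        catch err
            err isa MethodError || rethrow()
        end
    end
    throw(MethodError(show_any, (stack, x)))
end

stack = ToyDisplay[TextConsole()]
push!(stack, ImageWindow())

plot_result = show_any(stack, Plot())   # handled by ImageWindow as image/png
note_result = show_any(stack, Note())   # falls through to TextConsole as text/plain
```

Neither `Plot` nor `Note` knows anything about windows or consoles, and neither display knows how the bytes were produced.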

PS. Also, you don't have to define a text/plain mm_write, since a fallback text/plain mm_write is already defined for all types, which simply calls repl_show.

@timholy
Member

timholy commented Aug 4, 2013

OK, I think I see now. Let's say a user loads the Images package (which does not handle any kind of graphical output, it's simply image types, I/O, and algorithms). The Images package should presumably define mm_write for image/png and image/jpeg (currently we use ImageMagick to handle that). Then, if the user is working from IJulia, display(img) will route it to IJulia, because IJulia has registered a Display.

But let's say instead I'm working from IJulia and then load ImageView. (Let's say I love IJulia as a "better REPL" but need the features and/or performance of ImageView for my specific task.) Are you envisioning that I'll short-circuit all this with my specific display(img::Image) function that takes precedence over yours? Anyone who wants to send an image back to IJulia while also using ImageView can do so by saying display(disp_ijulia, img). Honestly, I'm fine with this as a solution, at least until we discover that it causes problems.

Alternatively, are you hoping ImageView will integrate more closely with this mechanism? If so, presumably it should push a Display instance onto the stack when it loads, even though it hasn't yet opened a window, so that when a user says display(img) it will default to the ImageView one. So in some senses, the ImageView Display instance has to be a mini window manager? There may be a few other interesting details related to display state, which are briefly described here and which may be clearer if you refer to the definition of an ImageCanvas

@stevengj
Member Author

stevengj commented Aug 4, 2013

@timholy, what I'm hoping is that ImageView will integrate with this mechanism, by defining a Display subtype that knows how to display image/png etcetera. You may or may not want to push this onto the display stack when ImageView is loaded, but if you do then it will be used to display those image types instead of IJulia (if IJulia is being used). How you manage your window state is up to you — whether you put all of the state into a Display subtype, or you continue to maintain global state as now and your Display subtype is just a dispatch hook that turns around and calls your existing display code.

You are free, of course, to define additional display-like functions that provide more control over the display; display is just the baseline.

Furthermore, the return value of display is implementation-defined (you throw a MethodError exception if you are handed a type you can't display), so if you want to return some kind of display "handle" you are free to do so, and the user can take advantage of this if they know they are using your package.

Finally, if you want to provide even richer display of a certain Julia type (say you provide interactive editing, not just display, of a MyImage type), you can do so: just provide a display(d::MyDisplay, x::MyImage) method to do whatever you want for that specific type, and Julia's method dispatch will call it automatically whenever display(x) is invoked for x::MyImage.

The whole point of putting this into Julia Base is that both multimedia export and multimedia backends are basic functionalities that can be provided in lots of ways, by lots of packages — this is not something that can or should be limited to IJulia!

@timholy
Member

timholy commented Aug 4, 2013

OK, I think I'm slowly catching on. My current thought is that Tk and Gtk are perhaps the right places to put the parts dealing with low-level window state management.

Anyway, I'm good to merge on this. We'll figure any issues out as they come up.

@ivarne
Member

ivarne commented Aug 4, 2013

Should the mm_write function get a hint about the display size? For non-vector images it makes a huge difference, and I think the display should be able to ask for a specific height and/or width, even if the mm_write implementation might ignore the hint.

Does this interface support display devices that start showing the first frames of an animation before the last frame is generated and mm_write returns?

@stevengj
Member Author

stevengj commented Aug 4, 2013

@ivarne, with regards to the first question, my inclination is to think of that as a more application-specific thing; e.g. the default height and width you might want for plotting output would be very different from the default height and width for emoticon output, and since the display knows nothing about the source of the data it doesn't seem like the best place to put that information.

With regards to the second question, it should be perfectly possible in Julia for the display to read and display the output of mm_write asynchronously in a coroutine (thanks to Julia's asynchronous I/O support) in order to display the data in a streaming fashion. It could even discard the output of mm_write as it displays it, so that the whole animation need not be stored in memory at once. @loladiro can probably comment more intelligently on how asynchronous I/O would fit in here.

@Keno
Member

Keno commented Aug 5, 2013

For asynchronous I/O all that's needed is some kind of IO object that can handle the data and block the writing task when appropriate.
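A rough sketch of the streaming idea, with one loud assumption: a bounded `Channel` stands in here for the blocking IO object mentioned above, and the "frames" are dummy byte vectors rather than real encoded images.

```julia
# Capacity 1: the writer task blocks until the reader has taken the previous frame.
frames = Channel{Vector{UInt8}}(1)

# Writer task: stand-in for an export function producing an animation frame by frame.
writer = @async begin
    for i in 1:3
        put!(frames, fill(UInt8(i), 4))   # dummy 4-byte "frame"
    end
    close(frames)
end

# Reader: stand-in for a display that shows frames as they arrive and then drops them.
shown = Int[]
for frame in frames
    push!(shown, Int(frame[1]))
end
wait(writer)
```

At no point is more than a couple of frames held in memory, which is the property the streaming display would want.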

@stevengj
Member Author

stevengj commented Aug 5, 2013

Note that the IPython protocol does incorporate a metadata field that can be used to pass arbitrary data along with the MIME data, and their metadata field is used to pass image-size hints (see ipython/ipython#3190), but it is used in the opposite direction, as I understand it (the data source sends the full image but sometimes hints that it should be shown as a different size, e.g. a thumbnail). However, even though they support arbitrary metadata, these thumbnailing hints are still essentially the only thing their metadata is used for. It seemed to me that this was a problematic design, and in any case there should probably be a more global way to indicate thumbnailing preferences for large images.

@JeffBezanson
Member

Maybe it should be called writemm or writemime, after writecsv?

@ViralBShah
Member

+1 for writemime.

@stevengj
Member Author

stevengj commented Aug 5, 2013

Okay, so mm_write, mm_repr, mm_string_repr, and mm_writable become writemime, reprmime, stringmime, and mimewritable, respectively?

(I agree that "mime" is more readable, although it may be technically incorrect. We are using MIME types (or rather, "Internet media types"), but we are not writing MIME email attachments. I don't really care if you don't, though.)

@stevengj
Member Author

stevengj commented Aug 5, 2013

Okay, renamed, and also renamed {push/pop}_display to {push/pop}display

@stevengj
Member Author

stevengj commented Aug 6, 2013

Okay to merge?

@Keno
Member

Keno commented Aug 6, 2013

Fine with me

@StefanKarpinski
Member

@JeffBezanson and I have both been away over the weekend and are still catching up. Could you wait a bit so we can review and comment? Sorry for the delay!

@stevengj
Member Author

stevengj commented Aug 7, 2013

@StefanKarpinski, no problem.

@Carreau

Carreau commented Aug 7, 2013

Hi,

Sorry, I'm not a Julia expert and I have a few questions.

I didn't fully understand how the richest representation is decided in display.

Going back to image/png, image/jpeg, image/svg+xml, text/html, application/x-latex:
if my object knows how to write itself as image/png, image/svg+xml, and application/x-latex,
would only the first match be sent to the frontend, or all three?

The "for each display d in the stack **until** one succeeds" makes me think that only the first will.

In that case (only the first), how will this work when multiple frontends with different display capabilities are hooked to the same kernel?

Note that the IPython protocol does incorporate a metadata field that can be used to pass arbitrary data along with the MIME data, and their metadata field is used to pass image-size hints (see ipython/ipython#3190), but it is used in the opposite direction, as I understand it (the data source sends the full image but sometimes hints that it should be shown as a different size, e.g. a thumbnail). However, even though they support arbitrary metadata, these thumbnailing hints are still essentially the only thing their metadata is used for. It seemed to me that this was a problematic design, and in any case there should probably be a more global way to indicate thumbnailing preferences for large images.

The metadata will mainly be used by JavaScript plugins to receive extra information about the MIME type they are supposed to handle. Right now the only JavaScript widget we have is the one that deals with images and resizing, so it only gets width and height metadata fields, but if the image was read from disk you could send along permissions, last-modified time, size on disk... In my case I have some biological images, and I can send a scale with them.

Brian did some D3.js network graph animations using a JSON representation, and some 3D VTK in the browser. Obviously you want to send the pure data representation to the JavaScript, but all the extra meta-information that might be plugin-specific, like the speed of the animation or the original orientation of the 3D model, has its place in metadata.

The point is that since data are stored in the notebook format, they should be handler-independent, hence the optional metadata field.

I hope this clarifies my understanding of how metadata will be used, which is not only for thumbnailing.

@stevengj
Member Author

stevengj commented Aug 7, 2013

@Carreau, whichever frontend is called has access to all of the available representations; it can call as many writemime methods as it wants. For example, the IJulia InlineDisplay sends multiple representations to IPython in a JSON message. The loop you were referring to means that a given object is sent to only one display. For example, if someone pushes a specialized display for images, then images will be shown in that viewer instead of in the IJulia notebook.
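How a single display might collect several representations can be sketched as follows. The sketch is written against current Julia, where `show(io, mime, x)`, `showable`, and `repr(mime, x)` fill the roles that writemime, mimewritable, and reprmime have in this patch; `Temperature` and `representations` are hypothetical names for the illustration, and this is not IJulia's actual code.

```julia
# Hypothetical user type with two text-based representations.
struct Temperature
    celsius::Float64
end

Base.show(io::IO, ::MIME"text/plain", t::Temperature) = print(io, t.celsius, " °C")
Base.show(io::IO, ::MIME"text/html", t::Temperature) = print(io, "<b>", t.celsius, "</b> &deg;C")

# Collect every representation the object can produce from a list of candidates.
# Only text-based MIME types are requested here, so each value is a String.
function representations(x, candidates)
    bundle = Dict{String,String}()
    for m in candidates
        mime = MIME(m)
        showable(mime, x) && (bundle[m] = repr(mime, x))
    end
    return bundle
end

bundle = representations(Temperature(21.5), ["text/latex", "text/html", "text/plain"])
```

The resulting dictionary, keyed by MIME type, is the kind of bundle a frontend can pick from; `text/latex` is skipped because `Temperature` defines no method for it.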

I don't understand why a separate metadata field is needed for the uses you mentioned, as opposed to embedding the metadata in the Javascript or elsewhere in the HTML DOM (with Javascript embedded via <script>).

@Carreau

Carreau commented Aug 7, 2013

@Carreau, whichever frontend is called has access to all of the available representations; it can call as many writemime methods as it wants. For example, the IJulia InlineDisplay sends multiple representations to IPython in a JSON message.

OK, I'm starting to get it. I still have difficulty separating the Julia layer from the IJulia layer, but it makes sense.

I don't understand why a separate metadata field is needed for the uses you mentioned, as opposed to embedding the metadata in the Javascript or elsewhere in the HTML DOM (with Javascript embedded via <script>).

This is because we try as much as possible to avoid assuming that the frontend will be JavaScript/HTML.
The .ipynb format does not assume it will be attached to the DOM; nothing from the DOM is actually stored when saving. The JavaScript "Cell" object receives exactly the same data at load time as when you executed the notebook. Right now the .ipynb JSON structure is on the JS side, but we plan on moving it to the Python (server, not kernel) side, so that you could actually launch some computation, close your browser, and come back.

For example, Emacs-IPython-Notebook speaks with the server through a websocket and also dispatches on MIME type. It might interpret the size attribute of the metadata to scale images (maybe it does, maybe not; I have no clue), but it still needs to be able to store it as JSON, since the .ipynb file it saves might be loaded in the HTML notebook and vice versa.

Another type of "frontend" is nbconvert. It actually reads the JSON file, which includes mimetype/data/metadata, and uses it to build a converted document such as a PDF, where there is no DOM. Metadata can be used to scale an image, or to assign a \ref{<somename>} to a figure. In the same way, nbconvert could be used to generate .ipynb files in a headless manner, and metadata needs to be stored in the file format so that other frontends can read it.

We could have sent everything in the following form:

"mimetype-key" : {
  metadata :<whatever>, 
  content: <actual content> 
}

But this would have involved a significant refactor on one side. Also, JSON does not support binary data, and keeping metadata separate should allow the content to stay binary as long as possible.

You could, in some ways, see metadata as EXIF, but for all MIME types, even those that don't support it.

@stevengj
Member Author

stevengj commented Aug 7, 2013

I'm still confused about the justification for your metadata:

  • Having a separate metadata makes it much harder to send the data to anything that is not IPython (or IPython-aware) without discarding the metadata, since the metadata is not part of the respective MIME format.
  • Any program that would want to make use of the metadata would have to know about the specific program that created the data (both because the metadata is not part of the MIME data and because its contents are not defined by IPython either). But in that case, you could use almost any parseable mechanism you want to embed the metadata directly in the data. For example, specially formatted comments in HTML/Javascript or EXIF-like tags in an image would work just as well—better, in fact, because of the first point above.

@Carreau

Carreau commented Aug 7, 2013

OK, let's step back; I think we probably misunderstand each other.

Right now we have (from [here]):

content = {
     ...
    'data' : dict, # key are mimetype
    'metadata' : dict #key are mimetype
}

You would like something more like

content = {
     ...
    'dataAndMetaData' : {
         'mimetype' : value
     }
}

Where value is some way of carrying data and metadata. Right? One mimetype -> one object?
I don't disagree with you. I think we went the other route for compatibility reasons, as it is much easier to add a field than to completely modify the message spec.

Any program that would want to make use of the metadata would have to know about the specific program that created the data (both because the metadata is not part of the MIME data and because its contents are not defined by IPython either)

Yes, and this should mostly be true of any metadata. Nobody should expect metadata to be there; it might not be.
We will, though, document best practices for what people might expect in metadata. For images that might be size; cell metadata will have proposed name and tags fields, which would respectively contain a string and a list of strings.

But in that case, you could use almost any parseable mechanism you want to embed the metadata directly in the data. For example, specially formatted comments in HTML/Javascript or EXIF-like tags in an image would work just as well—better, in fact, because of the first point above.

That would be creating yet another mimetype for each existing mimetype. For me, the data of the mimetype should be untouched, as many libraries already know how to process those files. Moreover, embedding the metadata into the data itself would require writing an extractor for that metadata for each mimetype. Consider also the fact that each embedded piece of data could be extracted from the .ipynb file. What assures you that the clever embedding you did in some mimetype will not choke another piece of software? Sure, it might work in HTML or JavaScript, where you know the comment syntax, but what about application/octet-stream?

@stevengj
Member Author

stevengj commented Aug 7, 2013

No. I just want you to deprecate the metadata field, and leave data as-is (a dictionary from MIME type to corresponding data). (Or actually, I don't care if you keep the metadata field or not; we can always just pass empty metadata.) In any case, the wire format of IPython is not the topic here.

If a particular application needs to embed some additional meta-information in an application-specific way, that will be read and used by an application-specific front-end, it can do that by embedding it within the existing data as I suggested.

That would be creating yet another mimetype for each existing mimetype.

No it wouldn't, because my suggestions (comments in Javascript source or EXIF-like tags in images) utilize existing metadata mechanisms in those formats that would simply be ignored by front-ends that don't know to look for the application-specific metadata. (Alternatively, in a binary format with predictable/detectable length but no designated metadata fields, you could just append application-specific binary metadata beyond the end of the MIME binary data, which would be ignored by readers that didn't know to look for it. But most formats these days already have ways to embed arbitrary metadata within them.)

Yes, it would be hard to embed metadata in application/octet-stream. But will you really lose any sleep over this? Give me a concrete IPython situation that would need application/octet-stream + metadata, and wouldn't simply use application/x-mycustomtype.

@JeffBezanson
Member

I like it.
Looks like the functions were not renamed in the exports list?

@stevengj
Member Author

stevengj commented Aug 8, 2013

@JeffBezanson, fixed the exports & flushing. I'm actually not sure in retrospect why I bothered flushing, or why I ignored errors from the flush. Maybe I should just omit the flush?

@stevengj
Member Author

stevengj commented Aug 8, 2013

Not sure I understand the Travis failure; at first glance, it has nothing to do with this patch...

@JeffBezanson
Member

I'm ready to merge this. @StefanKarpinski ?

@staticfloat
Member

@stevengj We've got a couple of bugs floating around, some of which seem to manifest only on Travis builds, and others which intermittently show up on our machines as well. In general, if only one of the builds fails (e.g. gcc fails while the clang build passes), it's one of those problems. If both fail, then you should start to suspect your changes.

I've restarted the Travis build, it should show up green if we're lucky. (How I hate intermittent failures)

JeffBezanson added a commit that referenced this pull request Aug 8, 2013
RFC: add multimedia I/O mechanism (display, mm_write, and friends); fixes #3817
@JeffBezanson JeffBezanson merged commit ad1b0de into JuliaLang:master Aug 8, 2013
@stevengj stevengj mentioned this pull request Aug 8, 2013