Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

MSC2448: Using BlurHash as a Placeholder for Matrix Media #2448

Open
wants to merge 52 commits into
base: old_master
Choose a base branch
from

Conversation

anoadragon453
Copy link
Member

@anoadragon453 anoadragon453 commented Feb 27, 2020

@anoadragon453 anoadragon453 added the proposal A matrix spec change proposal label Feb 27, 2020
@anoadragon453 anoadragon453 changed the title MSC2445: Using Blurhash in Media Events MSC2448: Using Blurhash in Media Events Feb 27, 2020
@turt2live turt2live self-requested a review February 27, 2020 23:39
@turt2live turt2live self-requested a review February 27, 2020 23:41
Copy link
Member

@turt2live turt2live left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Generally seems sane if we can litter it with optional.

proposals/2448-blurhash-for-media.md Show resolved Hide resolved
proposals/2448-blurhash-for-media.md Show resolved Hide resolved
proposals/2448-blurhash-for-media.md Outdated Show resolved Hide resolved
@anoadragon453
Copy link
Member Author

All review comments addressed.

@Half-Shot
Copy link
Contributor

I love this! This is awesome. I want it now!

Copy link
Contributor

@Half-Shot Half-Shot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is cool. Couple of nits

proposals/2448-blurhash-for-media.md Outdated Show resolved Hide resolved
proposals/2448-blurhash-for-media.md Outdated Show resolved Hide resolved
proposals/2448-blurhash-for-media.md Outdated Show resolved Hide resolved
Comment on lines 90 to 131
### m.room.avatar

Room avatars having BlurHashes available will be especially useful when
viewing a server's Public Rooms directory.

An optional field is added to `m.room.avatar`'s `content.info` dictionary with the
key `blurhash`. Its value is a BlurHash of the media that is pointed to by
`url`.

Example `m.room.avatar` content:

```json
{
"url": "mxc://amorgan.xyz/a59ee02f180677d83d1b57d366127f8e1afdd4ed",
"info": {
"blurhash": "JadR*.7kCMdnj"
}
}
```

### m.room.member

Much like room avatars, user avatars can have BlurHashes as well. There is a
little more required to implement this, but the outcome of no longer having
missing avatars upon opening a room is worthwhile.

An optional field is added to `m.room.member`'s `content` dictionary with
the key `blurhash`. Its value is a BlurHash of the media that is pointed
to by `avatar_url`.

Note that `blurhash` MUST be omitted if `avatar_url` is not present.

Example `m.room.member` event content:

```json
{
"avatar_url": "mxc://example.org/SEsfnsuifSDFSSEF",
"displayname": "Alice Margatroid",
"membership": "join",
"blurhash": "JadR*.7kCMdnj"
}
```
Copy link
Member

@t3chguy t3chguy Jul 30, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@ara4n raised a point internally that Element Web at least probably wouldn't want to waste CPU cycles on rendering the blurhashes for many tens of avatars on-screen at a time given the server thumbnail is so quick to load and heavily cached, will other clients actually want this?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sometimes avatars can take a while to load, and if there are many on-screen, then the client's HTTP connection limit can be reached, making it even slower. Clients can use some sort of heuristic to decide when to use the blurhash, e.g. if it the avatar has been loading for >1s, then use the blurhash.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It may also be useful when showing an avatar on someone's profile when you click on them from the timeline. Of course it's up to the client to pull that avatar from /profile or the membership event.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@t3chguy I'm curious if Element Web still holds the same opinion now that some time has passed.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It wasn't EWT's opinion, but that of @ara4n

@erikjohnston
Copy link
Member

@anoadragon453 (or anyone) do you have any bandwidth to make the changes to the MSC? We should land the stuff that is already implemented and in use at the very least

@ginnyTheCat ginnyTheCat mentioned this pull request Jul 31, 2022
12 tasks
Copy link
Member Author

@anoadragon453 anoadragon453 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Paging this all back in now - intending to see it through!

"avatar_url": "mxc://example.org/SEsfnsuifSDFSSEF",
"displayname": "Alice Margatroid",
"membership": "join",
"blurhash": "JadR*.7kCMdnj"
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The justification for it is that m.room.avatar currently contains an info field (which the spec defines as type ImageInfo, which is similar to the info field of the m.image msgtype for an m.room.message event.

The inconsistency is indeed a bit awkward. If we move blurhash to the top-level of the m.room.avatar event type, then we're inconsistent with the m.image msgtype. We could go the other way and put an info (image_info?) field in the m.room.member event type, but that feels out of place for an event type whose primary purpose isn't media-related.

I'm curious what others think the best way forward is.

Comment on lines 90 to 131
### m.room.avatar

Room avatars having BlurHashes available will be especially useful when
viewing a server's Public Rooms directory.

An optional field is added to `m.room.avatar`'s `content.info` dictionary with the
key `blurhash`. Its value is a BlurHash of the media that is pointed to by
`url`.

Example `m.room.avatar` content:

```json
{
"url": "mxc://amorgan.xyz/a59ee02f180677d83d1b57d366127f8e1afdd4ed",
"info": {
"blurhash": "JadR*.7kCMdnj"
}
}
```

### m.room.member

Much like room avatars, user avatars can have BlurHashes as well. There is a
little more required to implement this, but the outcome of no longer having
missing avatars upon opening a room is worthwhile.

An optional field is added to `m.room.member`'s `content` dictionary with
the key `blurhash`. Its value is a BlurHash of the media that is pointed
to by `avatar_url`.

Note that `blurhash` MUST be omitted if `avatar_url` is not present.

Example `m.room.member` event content:

```json
{
"avatar_url": "mxc://example.org/SEsfnsuifSDFSSEF",
"displayname": "Alice Margatroid",
"membership": "join",
"blurhash": "JadR*.7kCMdnj"
}
```
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@t3chguy I'm curious if Element Web still holds the same opinion now that some time has passed.

* `m.image`
* `m.video`

### m.sticker
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The current state of blurhash not supporting transparency (and little movement on this front from the upstream library) is a good sign to potentially back away from supporting blurhashes for m.sticker events at the moment, and leave it as a separate optimisation as @sumnerevans mentioned.

The path data sounds like a potential solution, and other messages services such as Telegram indeed use that method to great effect. Some thought will need to be given to animated stickers though, and what frame should be used when capturing a silhouette of the media (the first frame would not always be effective).

Blurhashes in a sticker picker also seems less useful than silhouettes when searching for a known sticker on a slow network connection.

Comment on lines +253 to +257
BlurHashes are inserted into events by the client, however some clients may not
be able to implement the BlurHash library for whatever reason. In this case, it
would be nice to allow the media repository to calculate the BlurHash of a piece
of media for the client, similar to how thumbnails are calculated by media
repositories today.
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's roughly two years on. Are there any valid use cases that we're aware of for this endpoint today? I don't know of any clients that have needed to rely on it.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No. Element sometimes relies on the media repo for thumbnails, but that doesn't work in encrypted rooms anyway and I haven't seen any client using that endpoint nor any server supporting it.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Right. I'm going to rip it out of this MSC then. cc @turt2live who originally(?) proposed it.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've been chasing memory allocation issues related to blurhashes in matrix-media-repo (MMR), and my conclusion is I agree with this conclusion.

Element is not using the returned field, and most/all of the bridges I can test quickly aren't using it either. This makes it fairly pointless for servers to support.


For added context, the allocation issue is with the input for calculating a blurhash. The blurhash itself is a non-issue.

image

Specifically, jpegs are awful.

Comment on lines +323 to +326
## Alternatives

We could include a base64 thumbnail of the image in the event, but blurhash
produces much more efficient textual representations.
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Discuss the alternative algorithm (and self-proclaimed improvement over BlurHash) ThumbHash.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The alpha support is nice.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A comparison of ThumbHash vs. BlurHash.

Pros

Alpha support

ThumbHash handles images with transparent backgrounds in a much nicer fashion than BlurHash. While the obvious use case for this is stickers, as mentioned elsewhere, I think that vector outlines would be the best way to represent stickers before they are downloaded in full resolution.

Still, being able to somewhat make out what an image with a transparent background is before it has loaded is valuable.

image

Better quality

ThumbHash appears to generally create a better representation of the image than BlurHash (examples taken from https://evanw.github.io/thumbhash/):

image

image

image

Cons

Limited Library Support

BlurHash was one of the first to widely publicise this use case, and thus it is a lot more popular than ThumbHash. Compare the number of implementations for ThumbHash versus BlurHash.

Still, the algorithm is so simple that you could presumably translate it into your chosen language in about 30m.

BlurHash is already widely used in Matrix

Again, due to BlurHash coming out much earlier than ThumbHash, Matrix clients have already implemented BlurHash (through this MSC) widely. If we switch, clients with concern for backwards-compatibility will likely need to implement both BlurHash and ThumbHash.

However, currently BlurHash mostly applies to media sent in the timeline, which quickly becomes stale. Element Web only supports m.image events in the timeline. Nheko's BlurHash support extends to both stickers and map previews on location events. I'd be most concerned about room/user avatars, which rarely change. But I've not yet seen clients implement support for that yet.

The more pressing concern would be interacting with older clients that still only send BlurHashes instead of ThumbHashes. However the failure mode here during the transition period wouldn't be too bad - you just won't see blurred thumbnails.

Bandwidth

TODO: I'd like to conduct a test of both algorithms over a range of, say, 100 images. Using base83 encoding for both. Currently I'm not sure whether BlurHash or ThumbHash generally produces smaller encoding sizes.

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ThumbHash is also around 10x faster to generate, at least on iOS. I can run a little experiment and report the results if there's interest.

The Circles client is no longer generating BlurHashes. We will continue to display them but we are moving entirely to ThumbHash for the future. The slow performance was a big part of this, but also ThumbHash just seems to work better all around. For example, we were having issues on Android where the BlurHash code crashed the app on an invalid input.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As the person who implemented Blurhash based on this MSC in Element Web I'm all for switching to something which supports alpha and has better performance, maintaining blurhash rendering similar to how @cvwright described another client handling it for some time is acceptable.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thumbhash shouldn't be significantly faster. What you are probably seeing is that most of the thumbhash libraries downscale images to 32x32 pixels before generating the hash, while blurhash libs expect the library user to do that. If you do such an experiment, make sure you take that into account.

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No, I downsample the source image once to 100x100. Then from that 100x100 I create the ThumbHash and the BlurHash.

Maybe the Swift ThumbHash implementation is just more optimized than the BlurHash version? The author went to some pretty great lengths to make it fast.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It doesn't look particularly optimized, but neither was the blurhash implementation, I guess. But feel free to bench it. Seems like I was also wrong, the hashing doesn't downscale first, but rendering the blurhash does create a 32x32 pixels image at best, which would be 10x faster than creating a 100x100 pixels image. So would be interesting to see comparisons, especially if you include my lib: https://github.com/Nheko-Reborn/blurhash :)

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@cvwright If you are doing a benchmark, reporting on the resulting filesizes of the hashes would be appreciated!

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@anoadragon453 if this is to progress we probably need to update the MSC to define the field for the thumbhash to live. The lack of alpha channel in blurhash makes it insufficient in my opinion.

I have a PoC impl for EW which sends & renders both blurhash & thumbhash, preferring the latter for rendering when available.

Blurhash
image
Thumbhash
image
Original
image

PoC: content["xyz.amorgan.thumbhash"] base64 encoded matrix-org/matrix-react-sdk#12164

@turt2live
Copy link
Member

we're about to hit the 2 year mark since this could have entered FCP, but it's still missing implementations and needs cleaning up. I'm cancelling FCP to bring this back to the near-FCP state, sorry.

@mscbot fcp cancel

@mscbot mscbot added proposal-in-review and removed proposed-final-comment-period Currently awaiting signoff of a majority of team members in order to enter the final comment period. disposition-merge labels Dec 21, 2023
@turt2live turt2live removed unresolved-concerns This proposal has at least one outstanding concern proposal-in-review labels Dec 21, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
kind:feature MSC for not-core and not-maintenance stuff needs-implementation This MSC does not have a qualifying implementation for the SCT to review. The MSC cannot enter FCP. proposal A matrix spec change proposal
Projects
Status: Scheduled - v1.8
Development

Successfully merging this pull request may close these issues.