MSC2448: Using BlurHash as a Placeholder for Matrix Media #2448
Slight hurdle: BlurHash doesn't support transparency, so it doesn't seem appropriate for stickers.
The reference site breaks when given an image with transparency:
woltapp/blurhash#100
random thought: it might be worth defining a "shape" next to the blurhash where a polygon of max 12 points (or whatever) can be defined
Sounds like you want to maintain cross-platform libraries which extend blurhash with generic path masking :D
I now realize this is a no-go. A square sticker isn't that bad: clients might want to put a circle or something around it as a border to match the rough shape of most stickers.
One hacky solution is to generate a blurhash on black and a blurhash on white; the client can then pick the one closest to its theme. But that doubles the storage space. If you wanted to get super fancy, a client could probably diff the two and tease the transparency info out. But I don't want to mandate that in the spec (I would rather wait for a blurhashv2 to come out with transparency support).
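To illustrate the "diff the two" idea, here is a minimal Python sketch for a single decoded pixel. It is not part of the MSC or of any BlurHash library. The function name `recover_alpha` is hypothetical, and it assumes plain "over" compositing done in the same colour space as the pixel values, so that a pixel on black is `a*F` and the same pixel on white is `a*F + (1 - a)*255`.

```python
def recover_alpha(on_black, on_white):
    """Estimate alpha and the original colour of one pixel.

    on_black / on_white: (r, g, b) values in 0..255 for the same pixel,
    decoded from the placeholder rendered over black and over white.
    Hypothetical helper, for illustration only.
    """
    # on_white - on_black = (1 - a) * 255 for every channel, so the
    # per-channel difference tells us how transparent the pixel was.
    diffs = [w - b for b, w in zip(on_black, on_white)]
    mean_diff = sum(diffs) / len(diffs)
    alpha = min(1.0, max(0.0, 1.0 - mean_diff / 255.0))
    if alpha == 0.0:
        # Fully transparent: the original colour is unrecoverable.
        return 0.0, (0, 0, 0)
    # on_black = a * F, so divide by alpha to undo the premultiplication.
    colour = tuple(min(255, round(b / alpha)) for b in on_black)
    return alpha, colour
```

Since both hashes are lossy, the recovered alpha would only ever be a rough estimate, which is one more reason not to mandate this in the spec.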
tbh I disagree, only because stickers look broken with a full-frame blurhash.
The current state of blurhash not supporting transparency (and little movement on this front from the upstream library) is a good sign to potentially back away from supporting blurhashes for `m.sticker` events at the moment, and leave it as a separate optimisation as @sumnerevans mentioned.
The path data sounds like a potential solution, and other messaging services such as Telegram indeed use that method to great effect. Some thought will need to be given to animated stickers though, and what frame should be used when capturing a silhouette of the media (the first frame would not always be effective).
Blurhashes in a sticker picker also seem less useful than silhouettes when searching for a known sticker on a slow network connection.
I'm curious whether ThumbHash would address this sufficiently: https://github.com/evanw/thumbhash
@cvwright thanks for bringing that up, I'd not heard of it before, and the comparison looks good visually: https://evanw.github.io/thumbhash/
It only has a few implementations currently, though.
I still think the "rough path outline" solution might be more visually appealing for stickers? Observe how Telegram employs it while stickers are loading in the picker. I think this would be much more appealing than blurry thumbnails:
We shouldn't discount using ThumbHash elsewhere though (see the other discussion on it #2448 (comment)).
hmm: why is this on the top level but the room avatar isn't? should we just move the room avatar one up a level and call it good?
The justification for it is that `m.room.avatar` currently contains an `info` field (which the spec defines as type `ImageInfo`), which is similar to the `info` field of the `m.image` `msgtype` for an `m.room.message` event.
The inconsistency is indeed a bit awkward. If we move `blurhash` to the top-level of the `m.room.avatar` event type, then we're inconsistent with the `m.image` msgtype. We could go the other way and put an `info` (`image_info`?) field in the `m.room.member` event type, but that feels out of place for an event type whose primary purpose isn't media-related.
I'm curious what others think the best way forward is.
It's roughly two years on. Are there any valid use cases that we're aware of for this endpoint today? I don't know of any clients that have needed to rely on it.
No. Element sometimes relies on the media repo for thumbnails, but that doesn't work in encrypted rooms anyway and I haven't seen any client using that endpoint nor any server supporting it.
Right. I'm going to rip it out of this MSC then. cc @turt2live who originally(?) proposed it.
I've been chasing memory allocation issues related to blurhashes in matrix-media-repo (MMR), and I agree with this conclusion.
Element is not using the returned field, and most/all of the bridges I can test quickly aren't using it either. This makes it fairly pointless for servers to support.
For added context, the allocation issue is with the input for calculating a blurhash. The blurhash itself is a non-issue.
Specifically, jpegs are awful.
Discuss the alternative algorithm (and self-proclaimed improvement over BlurHash) ThumbHash.
The alpha support is nice.
A comparison of ThumbHash vs. BlurHash.
Pros
Alpha support
ThumbHash handles images with transparent backgrounds in a much nicer fashion than BlurHash. While the obvious use case for this is stickers, as mentioned elsewhere, I think that vector outlines would be the best way to represent stickers before they are downloaded in full resolution.
Still, being able to somewhat make out what an image with a transparent background is before it has loaded is valuable.
Better quality
ThumbHash appears to generally create a better representation of the image than BlurHash (examples taken from https://evanw.github.io/thumbhash/):
Cons
Limited Library Support
BlurHash was one of the first to widely publicise this use case, and thus it is a lot more popular than ThumbHash. Compare the number of implementations for ThumbHash versus BlurHash.
Still, the algorithm is so simple that you could presumably translate it into your chosen language in about 30 minutes.
BlurHash is already widely used in Matrix
Again, due to BlurHash coming out much earlier than ThumbHash, Matrix clients have already implemented BlurHash (through this MSC) widely. If we switch, clients with concern for backwards-compatibility will likely need to implement both BlurHash and ThumbHash.
However, currently BlurHash mostly applies to media sent in the timeline, which quickly becomes stale. Element Web only supports `m.image` events in the timeline. Nheko's BlurHash support extends to both stickers and map previews on location events. I'd be most concerned about room/user avatars, which rarely change. But I've not yet seen clients implement support for that.
The more pressing concern would be interacting with older clients that still only send BlurHashes instead of ThumbHashes. However, the failure mode here during the transition period wouldn't be too bad: you just won't see blurred thumbnails.
Bandwidth
TODO: I'd like to conduct a test of both algorithms over a range of, say, 100 images. Using base83 encoding for both. Currently I'm not sure whether BlurHash or ThumbHash generally produces smaller encoding sizes.
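As a starting point for that comparison, the encoded sizes can be estimated without running either algorithm. The sketch below is illustrative only and its function names are hypothetical. It relies on the BlurHash string layout (one size character, one maximum-AC character, four DC characters, then two characters per remaining component) and on standard base64 sizing. The byte length of a ThumbHash varies per image, so it is left as an input here instead of being assumed.

```python
def blurhash_length(x_components, y_components):
    """Length in characters of a base83 BlurHash string.

    1 size flag + 1 max-AC + 4 DC + 2 per AC component
    = 4 + 2 * x_components * y_components.
    """
    if not (1 <= x_components <= 9 and 1 <= y_components <= 9):
        raise ValueError("BlurHash allows 1 to 9 components per axis")
    return 4 + 2 * x_components * y_components


def base64_length(num_bytes, padded=True):
    """Length in characters of num_bytes of binary data in base64."""
    if num_bytes < 0:
        raise ValueError("num_bytes must be non-negative")
    if padded:
        # Every started group of 3 bytes becomes 4 characters.
        return 4 * ((num_bytes + 2) // 3)
    # Without '=' padding: ceil(4 * n / 3) characters.
    return (4 * num_bytes + 2) // 3
```

For example, `blurhash_length(4, 3)` gives 28 characters, which can be set against `base64_length(n)` for whatever hash sizes `n` the test images actually produce.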
ThumbHash is also around 10x faster to generate, at least on iOS. I can run a little experiment and report the results if there's interest.
The Circles client is no longer generating BlurHashes. We will continue to display them but we are moving entirely to ThumbHash for the future. The slow performance was a big part of this, but also ThumbHash just seems to work better all around. For example, we were having issues on Android where the BlurHash code crashed the app on an invalid input.
As the person who implemented Blurhash based on this MSC in Element Web, I'm all for switching to something which supports alpha and has better performance. Maintaining blurhash rendering for some time, similar to how @cvwright described another client handling it, is acceptable.
Thumbhash shouldn't be significantly faster. What you are probably seeing is that most of the thumbhash libraries downscale images to 32x32 pixels before generating the hash, while blurhash libs expect the library user to do that. If you do such an experiment, make sure you take that into account.
No, I downsample the source image once to 100x100. Then from that 100x100 I create the ThumbHash and the BlurHash.
Maybe the Swift ThumbHash implementation is just more optimized than the BlurHash version? The author went to some pretty great lengths to make it fast.
It doesn't look particularly optimized, but neither was the blurhash implementation, I guess. But feel free to bench it. Seems like I was also wrong: the hashing doesn't downscale first, but rendering the thumbhash creates a 32x32-pixel image at best, which would be roughly 10x faster than creating a 100x100-pixel image. So it would be interesting to see comparisons, especially if you include my lib: https://github.com/Nheko-Reborn/blurhash :)
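The "10x" figure follows from the pixel counts alone, assuming render cost scales linearly with the number of output pixels (an assumption; real implementations may differ). A quick check, with a hypothetical helper name:

```python
def pixel_ratio(large_side, small_side):
    """How many times more pixels a large square image has than a small one."""
    return (large_side * large_side) / (small_side * small_side)


# 100x100 = 10,000 pixels versus 32x32 = 1,024 pixels.
ratio = pixel_ratio(100, 32)
print(f"{ratio:.2f}")  # prints 9.77
```

So any fair benchmark should render both hashes at the same output size.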
@cvwright If you are doing a benchmark, reporting on the resulting filesizes of the hashes would be appreciated!
@anoadragon453 if this is to progress, we probably need to update the MSC to define the field where the thumbhash should live. The lack of an alpha channel in blurhash makes it insufficient in my opinion.
I have a PoC impl for EW which sends & renders both blurhash & thumbhash, preferring the latter for rendering when available.
[Comparison images: Blurhash, Thumbhash, Original]
PoC: `content["xyz.amorgan.thumbhash"]`, base64 encoded: matrix-org/matrix-react-sdk#12164
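The "prefer thumbhash when available" behaviour of the PoC could look roughly like the Python sketch below. This is not the PoC's actual code (that lives in matrix-react-sdk, in TypeScript). The thumbhash key is the one quoted above. The blurhash key and its placement under `info` are assumptions based on this MSC's unstable field, and `pick_placeholder` is a hypothetical name.

```python
THUMBHASH_KEY = "xyz.amorgan.thumbhash"  # top-level content key from the PoC
BLURHASH_KEY = "xyz.amorgan.blurhash"    # assumed: unstable key under "info"


def pick_placeholder(content):
    """Return (kind, hash_string) for an event's content dict, or None.

    Prefers the thumbhash when present and falls back to the blurhash,
    so events from older clients still get a placeholder.
    """
    thumbhash = content.get(THUMBHASH_KEY)
    if isinstance(thumbhash, str) and thumbhash:
        return ("thumbhash", thumbhash)

    info = content.get("info")
    if isinstance(info, dict):
        blurhash = info.get(BLURHASH_KEY)
        if isinstance(blurhash, str) and blurhash:
            return ("blurhash", blurhash)

    # Neither hash present (or malformed): render no placeholder.
    return None
```

Treating non-string values as absent means a malformed event degrades to "no placeholder" instead of raising.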