Skip to content
This repository has been archived by the owner on Apr 26, 2024. It is now read-only.

Inline URL preview documentation. #13261

Merged
merged 3 commits into from
Jul 12, 2022
Merged
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
3 changes: 1 addition & 2 deletions changelog.d/13233.doc
Original file line number Diff line number Diff line change
@@ -1,2 +1 @@
Add a link to configuration instructions in the URL preview documentation.

Move the documentation for how URL previews work to the URL preview module.
1 change: 1 addition & 0 deletions changelog.d/13261.doc
Original file line number Diff line number Diff line change
@@ -0,0 +1 @@
Move the documentation for how URL previews work to the URL preview module.
1 change: 0 additions & 1 deletion docs/SUMMARY.md
Original file line number Diff line number Diff line change
Expand Up @@ -35,7 +35,6 @@
- [Application Services](application_services.md)
- [Server Notices](server_notices.md)
- [Consent Tracking](consent_tracking.md)
- [URL Previews](development/url_previews.md)
- [User Directory](user_directory.md)
- [Message Retention Policies](message_retention_policies.md)
- [Pluggable Modules](modules/index.md)
Expand Down
2 changes: 1 addition & 1 deletion docs/admin_api/user_admin_api.md
Original file line number Diff line number Diff line change
Expand Up @@ -544,7 +544,7 @@ Gets a list of all local media that a specific `user_id` has created.
These are media that the user has uploaded themselves
([local media](../media_repository.md#local-media)), as well as
[URL preview images](../media_repository.md#url-previews) requested by the user if the
[feature is enabled](../development/url_previews.md).
[feature is enabled](../usage/configuration/config_documentation.md#url_preview_enabled).

By default, the response is ordered by descending creation date and ascending media ID.
The newest media is on top. You can change the order with parameters
Expand Down
62 changes: 0 additions & 62 deletions docs/development/url_previews.md

This file was deleted.

5 changes: 1 addition & 4 deletions docs/media_repository.md
Original file line number Diff line number Diff line change
Expand Up @@ -7,8 +7,7 @@ The media repository
users.
* caches avatars, attachments and their thumbnails for media uploaded by remote
users.
* caches resources and thumbnails used for
[URL previews](development/url_previews.md).
* caches resources and thumbnails used for URL previews.

All media in Matrix can be identified by a unique
[MXC URI](https://spec.matrix.org/latest/client-server-api/#matrix-content-mxc-uris),
Expand Down Expand Up @@ -59,8 +58,6 @@ remote_thumbnail/matrix.org/aa/bb/cccccccccccccccccccc/128-96-image-jpeg
Note that `remote_thumbnail/` does not have an `s`.

## URL Previews
See [URL Previews](development/url_previews.md) for documentation on the URL preview
process.

When generating previews for URLs, Synapse may download and cache various
resources, including images. These resources are assigned temporary media IDs
Expand Down
62 changes: 58 additions & 4 deletions synapse/rest/media/v1/preview_url_resource.py
Original file line number Diff line number Diff line change
Expand Up @@ -109,10 +109,64 @@ class MediaInfo:

class PreviewUrlResource(DirectServeJsonResource):
"""
Generating URL previews is a complicated task which many potential pitfalls.

See docs/development/url_previews.md for discussion of the design and
algorithm followed in this module.
The `GET /_matrix/media/r0/preview_url` endpoint provides a generic preview API
for URLs which outputs Open Graph (https://ogp.me/) responses (with some Matrix
specific additions).

This does have trade-offs compared to other designs:

* Pros:
* Simple and flexible; can be used by any clients at any point
* Cons:
* If each homeserver provides one of these independently, all the homeservers in a
room may needlessly DoS the target URI
* The URL metadata must be stored somewhere, rather than just using Matrix
itself to store the media.
* Matrix cannot be used to distribute the metadata between homeservers.

When Synapse is asked to preview a URL it does the following:

1. Checks against a URL blacklist (defined as `url_preview_url_blacklist` in the
config).
2. Checks the URL against an in-memory cache and returns the result if it exists. (This
is also used to de-duplicate processing of multiple in-flight requests at once.)
3. Kicks off a background process to generate a preview:
1. Checks URL and timestamp against the database cache and returns the result if it
has not expired and was successful (a 2xx return code).
2. Checks if the URL matches an oEmbed (https://oembed.com/) pattern. If it
does, update the URL to download.
3. Downloads the URL and stores it into a file via the media storage provider
and saves the local media metadata.
4. If the media is an image:
1. Generates thumbnails.
2. Generates an Open Graph response based on image properties.
5. If the media is HTML:
1. Decodes the HTML via the stored file.
2. Generates an Open Graph response from the HTML.
3. If a JSON oEmbed URL was found in the HTML via autodiscovery:
1. Downloads the URL and stores it into a file via the media storage provider
and saves the local media metadata.
2. Convert the oEmbed response to an Open Graph response.
3. Override any Open Graph data from the HTML with data from oEmbed.
4. If an image exists in the Open Graph response:
1. Downloads the URL and stores it into a file via the media storage
provider and saves the local media metadata.
2. Generates thumbnails.
3. Updates the Open Graph response based on image properties.
6. If the media is JSON and an oEmbed URL was found:
1. Convert the oEmbed response to an Open Graph response.
2. If a thumbnail or image is in the oEmbed response:
1. Downloads the URL and stores it into a file via the media storage
provider and saves the local media metadata.
2. Generates thumbnails.
3. Updates the Open Graph response based on image properties.
7. Stores the result in the database cache.
4. Returns the result.

The in-memory cache expires after 1 hour.

Expired entries in the database cache (and their associated media files) are
deleted every 10 seconds. The default expiration time is 1 hour from download.
"""

isLeaf = True
Expand Down