Server-wide media retention policy #6832
Comments
there are open issues/MSCs around deleting media: we should hunt them down and link them |
I've updated the issue's description with the few issues I could find. I didn't find any MSC relating to that though. |
Thanks. MSC2278 seems to be the MSC I was thinking of. |
After chatting with @neilisfragile and @lampholder about the quarantine concern, the conclusion is that a media retention policy should ignore quarantined media. |
I would like to add that avatars and room avatars should probably be stored in a different way. They mostly seem to be cached, so they are probably not accessed all that often in smaller groups, which means they would be purged too. Maybe add a tag or something to all user and room avatars? Or is that what you meant by quarantined data? |
Never mind, as it turns out I cannot read; you have already thought of that. |
Great points made by babolivier. There are a few options for dealing with media: But they don't seem to do the main thing, which is to delete it on the local server. I would agree that there should be a server-wide media retention policy available, one that could be set in the config in the same way as message retention is now. |
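To make the "set in config" idea a bit more concrete, here is a minimal sketch of how such a policy could be parsed. Everything in it is an assumption for illustration: the `MediaRetentionConfig` class, the `local_media_lifetime` / `remote_media_lifetime` keys, and the set of duration suffixes are hypothetical names, not options taken from this thread.

```python
from dataclasses import dataclass
from typing import Any, Mapping, Optional

# Hypothetical duration suffixes, expressed in milliseconds.
_UNITS_MS = {
    "s": 1000,
    "m": 60 * 1000,
    "h": 60 * 60 * 1000,
    "d": 24 * 60 * 60 * 1000,
    "w": 7 * 24 * 60 * 60 * 1000,
}


def parse_duration_ms(value: str) -> int:
    """Turn a string such as "90d" into milliseconds; a bare number is taken as ms."""
    value = value.strip()
    if not value:
        raise ValueError("empty duration")
    suffix = value[-1]
    if suffix in _UNITS_MS:
        return int(value[:-1]) * _UNITS_MS[suffix]
    return int(value)


@dataclass(frozen=True)
class MediaRetentionConfig:
    """Hypothetical server-wide policy; None means 'keep forever'."""

    local_media_lifetime_ms: Optional[int] = None
    remote_media_lifetime_ms: Optional[int] = None

    @classmethod
    def from_dict(cls, raw: Mapping[str, Any]) -> "MediaRetentionConfig":
        def lifetime(key: str) -> Optional[int]:
            value = raw.get(key)
            return None if value is None else parse_duration_ms(str(value))

        return cls(
            local_media_lifetime_ms=lifetime("local_media_lifetime"),
            remote_media_lifetime_ms=lifetime("remote_media_lifetime"),
        )
```

Keeping local and remote lifetimes separate reflects that remote media can usually be fetched again from its origin server, while purged local media is gone for good.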
Not sure if this is the correct place to note this. |
The This endpoint calls the Of course, note that this will only work with media uploaded to Synapse's media repository, rather than using tools such as matrix-media-repo. |
Deleting media automatically based on TTL is going to be troublesome as Matrix is used in new ways. Let's say that the problem we actually want to solve is to remove media that has no real references left (a reasonable request, given the disk space this must take up). Unfortunately the homeserver doesn't know which media are pointed to by encrypted events... As an idea, I wonder if we could implement a hybrid scheme:
Finally, an MSC could be written to allow some encrypted events to 'opt-in' to declaring which media they point to, in their unsigned portion. This would be useful for applications such as file storage on Matrix, where you don't want your infrequently-accessed files to just go missing one day, but as it's optional, chat clients (etc) can still choose to be secretive and withhold that reference — they will just have to accept that the media can be deleted whilst it's still potentially accessible. |
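A rough sketch of the opt-in idea, assuming events are plain dicts: it gathers the media that events declare in their unsigned portion and reports which known media have no declared reference. The key name `org.example.media_refs` is purely hypothetical; no MSC defines it, and what to do with the undeclared media (for example, falling back to a TTL) is left open.

```python
from typing import Any, Iterable, Mapping, Set

# Hypothetical key; an actual MSC would have to define the real name and shape.
HYPOTHETICAL_REFS_KEY = "org.example.media_refs"


def declared_media(events: Iterable[Mapping[str, Any]]) -> Set[str]:
    """Collect every mxc:// URI that events opted in to declaring."""
    refs: Set[str] = set()
    for event in events:
        unsigned = event.get("unsigned")
        if not isinstance(unsigned, dict):
            continue
        declared = unsigned.get(HYPOTHETICAL_REFS_KEY)
        if not isinstance(declared, list):
            continue
        for uri in declared:
            if isinstance(uri, str) and uri.startswith("mxc://"):
                refs.add(uri)
    return refs


def undeclared_media(
    all_media: Iterable[str], events: Iterable[Mapping[str, Any]]
) -> Set[str]:
    """Media URIs that no event has declared a reference to."""
    return set(all_media) - declared_media(events)
```

Events that withhold the reference simply contribute nothing, so their media ends up in the undeclared set, which matches the trade-off described above.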
@reivilibre very good point. I think user and room avatars are also in the media repo (mxc://), so garbage collecting those would cause problems. Are there any other uses for media that aren't suitable for garbage collection? Access time is better than creation/modification time, but still problematic in many use cases. |
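One way to keep avatars out of garbage collection would be to build an exclusion set from the avatar URLs the server knows about. The sketch below assumes avatar URLs are available as plain `mxc://server/media_id` strings; the helper names `parse_mxc` and `protected_media_ids` are made up for this example.

```python
from typing import Iterable, Optional, Set, Tuple
from urllib.parse import urlparse


def parse_mxc(uri: str) -> Optional[Tuple[str, str]]:
    """Split mxc://server/media_id into (server, media_id); None if malformed."""
    parsed = urlparse(uri)
    if parsed.scheme != "mxc":
        return None
    media_id = parsed.path.lstrip("/")
    if not parsed.netloc or not media_id or "/" in media_id:
        return None
    return parsed.netloc, media_id


def protected_media_ids(
    avatar_urls: Iterable[Optional[str]], local_server: str
) -> Set[str]:
    """IDs of local media currently used as a user or room avatar."""
    protected: Set[str] = set()
    for url in avatar_urls:
        if not url:
            continue
        parsed = parse_mxc(url)
        if parsed is not None and parsed[0] == local_server:
            protected.add(parsed[1])
    return protected
```

A purge job could then skip any media ID in this set, regardless of how long ago it was last accessed.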
Note that this API does operate on last access time, which is updated as users browse encrypted rooms.
This endpoint won't delete room and user avatars if
We also won't be aware of events referencing local media in rooms that our homeserver isn't in. To that end, last_access time (with a suitably conservative threshold) is probably the best media TTL solution we have available at the moment. |
It would be unfortunate if media from MSC2545: Image Packs would get deleted by this. |
I filed #12928 about this. |
A feature that's both interesting to have and frequently requested is the ability to configure a media retention policy at the server level.
A first approach would be to base the TTL of a media item on its upload date, but then we'd likely delete still-in-use avatars, media used in community descriptions, etc.
Therefore, a preferred approach is to base that TTL on the date the media was last accessed, to ensure we don't delete media that is still being used. FTR, that date of last access is stored by Synapse for both remote and local media, so it's technically doable.
Another thing to consider is that we currently don't have any way of deleting media in Synapse, so that would need to be added.
Also, we'd need to figure out how this feature would handle quarantined media.
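As a minimal sketch of the last-access approach, the function below picks the media whose last access is older than a configured lifetime. It skips quarantined media, in line with the conclusion reached in the comments. The `MediaRecord` shape and its field names are assumptions for illustration and do not mirror Synapse's actual schema; falling back to the upload time for media that has never been accessed is also an assumption.

```python
from dataclasses import dataclass
from typing import Iterable, List, Optional


@dataclass(frozen=True)
class MediaRecord:
    """Hypothetical view of one stored media item (timestamps in ms)."""

    media_id: str
    created_ts: int
    last_access_ts: Optional[int] = None
    quarantined: bool = False


def expired_media(
    records: Iterable[MediaRecord], now_ms: int, lifetime_ms: int
) -> List[str]:
    """Return IDs of media last used more than lifetime_ms ago."""
    cutoff = now_ms - lifetime_ms
    expired: List[str] = []
    for record in records:
        # The retention policy ignores quarantined media entirely.
        if record.quarantined:
            continue
        # Assumption: media that was never accessed is aged from its upload time.
        last_used = (
            record.last_access_ts
            if record.last_access_ts is not None
            else record.created_ts
        )
        if last_used < cutoff:
            expired.append(record.media_id)
    return expired
```

With a conservative `lifetime_ms`, an avatar or file that is still being fetched keeps refreshing its last access time and is never selected.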
cc @rxl881
Related: #6459, #3479, https://github.com/matrix-org/matrix-doc/issues/790