Redact all events older than a certain time #1730

rubo77 · 2016-12-30T08:32:09Z

We could add this to the prune API. In addition, when you prune a room, you could also redact all those events, so that all content is removed on federated rooms too.

Half-Shot · 2016-12-30T18:17:52Z

I'm not happy with this. Pruning is intended to be used to save space, not to erase history. Redaction and pruning are entirely different things. By pruning, you are saving space on your server, not everyone's.

I would consider it counterproductive to the history-first nature of Matrix to start allowing mass history removal across servers.

rubo77 · 2016-12-30T19:48:23Z

It depends on the use case.

The history-first directive isn't always what's needed.

Maybe make it configurable

Half-Shot · 2016-12-31T00:38:34Z

I want to be clear that what you're asking is also against the use case of the Prune API.
Its sole purpose is to free up space on the host. It is not intended for redaction. Redaction is a different concept designed for removing events across servers.

What you are asking for is a separate API to remove events in bulk, as a redaction. I don't see this happening due to my reasons given before.

TL;DR - The prune API is a local admin API to free space, not to delete history.

rubo77 · 2016-12-31T02:06:48Z

I clarified the title and the initial message, so it is clear what I aim for.

There are several use cases where it is desirable to clear old messages for the sake of data avoidance and data parsimony.

This was already requested before, here: #1480

kythyria · 2016-12-31T02:52:17Z

If you want to delete old messages from your server, that's fine. You have exactly no right--and no ability to enforce--that I also delete them.

Half-Shot · 2016-12-31T02:53:18Z

I don't understand: what are you hoping to achieve by redacting rather than pruning? The only difference is that everyone's server gets affected vs. just yours. Redaction will leave some data, and as Erik explained here, you will still lose more or less the same amount of information in both cases. On top of the fact that deleting events is literally impossible to do entirely due to Matrix's design, so signatures are always left.

The only pro I can see to your argument is that by redacting, people can't paginate to get it back, which causes issues for people like me who want to retain all our history where we can.

TL;DR

If you want to delete old messages from your server, that's fine. You have exactly no right--and no ability to enforce--that I also delete them.

EDIT: I would like to clear up that #1480 is a bug that predates pruning and later turned into it.

kythyria · 2016-12-31T02:55:02Z

You can't unsend email, or unsend paper mail, or unsay things in general.

rubo77 · 2016-12-31T03:01:22Z

that is true, you cannot unsay it, but you could globally flag it as redacted

kythyria · 2016-12-31T03:02:17Z

You can demand people treat it as unsaid, but you can't actually unsay it.

rubo77 · 2016-12-31T03:08:59Z

yes, that is what I meant.

And additionally you could prevent an email from being sent, if it is still in your outbox. In Matrix terms: you could prevent a redacted message from being federated to other servers, if no other servers are connected to your room yet. This is actually wanted a lot in private rooms by people who are trying to avoid excessive data collection on the internet.

kythyria · 2016-12-31T03:11:51Z

> And additionally you could prevent an email from being sent, if it is still in your outbox.

Well, yes. Things that haven't happened yet are generally easy to undo. However, as soon as that message, matrix or email, touches another server, you've lost control of it. Period. No take-backs. So you have to run that bulk redact before anyone from another server enters the room. After that point the window for redaction is, uh, tiny.

rubo77 · 2016-12-31T03:21:02Z

I am perfectly fine with that.

What I'm aiming for here is an option to automatically redact all messages older than a certain timeframe: not deleted from the database (that's what #1621 is about), but redacted the way it already works right now.

So if that option is turned on, all old messages are not shown in the clients (Riot) any more (although they theoretically still exist as "redacted" in the database) so normal users cannot scroll back in history more than this time.

If the room is federated, this redaction-flag should be federated too, so the admin has full control over the history of the room

rubo77 · 2016-12-31T03:23:40Z

A script to redact the history would start like this:

#!/bin/bash

# this script will purge all messages of a given room older than a definable age

DOMAIN=yourserver.tld
# add this user as admin in your home server:
ADMIN="@username:$DOMAIN"

#choose the room to prune old messages from
ROOM='!cURbafjkfsMDVwdRDQ:matrix.org' # for example: "Matrix HQ"

# choose a time before which the messages should be pruned:
# TIME='2016-08-31 23:59:59'
TIME='3 months ago'

# creates a timestamp from the given time string:
UNIX_TIMESTAMP=$(date +%s%3N --date='TZ="UTC+2" '"$TIME")

BUSY="pragma busy_timeout=20000"
BUFFER=$(sqlite3 homeserver.db "$BUSY;select event_id from events where type='m.room.message' and received_ts<'$UNIX_TIMESTAMP' and room_id='$ROOM' order by received_ts;")

for line in $BUFFER; do
  # use the api to redact those events
  # ...
  :  # no-op placeholder so the loop parses until the API call is filled in
done
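
The loop body is left open in the script. As a rough sketch only, the redaction call could go through the client-server redact endpoint (`PUT /_matrix/client/r0/rooms/{roomId}/redact/{eventId}/{txnId}`; newer servers expose the same path under `v3`). `HOMESERVER`, `ACCESS_TOKEN`, and the helper names `encode_id`, `redact_url`, and `redact_event` are placeholders made up for this example, and the acting user is assumed to have enough power level in the room to redact other people's events.

```shell
#!/bin/sh
# Sketch only: HOMESERVER and ACCESS_TOKEN are placeholders, not values from this thread.
HOMESERVER="${HOMESERVER:-https://yourserver.tld}"
ACCESS_TOKEN="${ACCESS_TOKEN:-changeme}"

# Percent-encode the characters that commonly appear in Matrix room and event IDs.
# This is deliberately not a general-purpose URL encoder.
encode_id() {
  printf '%s' "$1" | sed -e 's/%/%25/g' -e 's/!/%21/g' -e 's/\$/%24/g' \
    -e 's/:/%3A/g' -e 's/#/%23/g' -e 's/+/%2B/g' -e 's|/|%2F|g'
}

# Build the redaction URL for one event.
# Arguments: room ID, event ID, transaction ID (must be unique per request).
redact_url() {
  printf '%s/_matrix/client/r0/rooms/%s/redact/%s/%s' \
    "$HOMESERVER" "$(encode_id "$1")" "$(encode_id "$2")" "$3"
}

# Send the redaction request for one event (same arguments as redact_url).
redact_event() {
  curl --silent --fail -X PUT \
    -H "Authorization: Bearer $ACCESS_TOKEN" \
    -H 'Content-Type: application/json' \
    --data '{"reason": "older than the configured history length"}' \
    "$(redact_url "$1" "$2" "$3")"
}
```

Inside the loop this would be called once per event ID, for example `redact_event "$ROOM" "$line" "autoredact-$n"` with a counter `n` that is incremented on every iteration. Other servers in a federated room receive the redaction event but, as discussed in this thread, are free to ignore it.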

kythyria · 2016-12-31T03:27:26Z

> If the room is federated, this redaction-flag should be federated too, so the admin has full control over the history of the room

And then you hit a server run by someone like me that's been patched to ignore the flag, or never implemented it to begin with. Oops.

The admin does not, and cannot, have that kind of control. This is a fundamental property of any distributed system whose nodes are owned by unrelated entities. Imagine going up to Google and demanding they remove from the inboxes of their users every message older than X days. That's basically what you're asking for here.

And the point of redact is that stuff is deleted from the database leaving only a tombstone whose sole function is to prevent the room from becoming broken.

rubo77 · 2016-12-31T03:30:19Z

I don't understand what the problem is with flagging a message as "redacted by ....". And yes, every federated server can choose how to handle that flag, which is fine.

kythyria · 2016-12-31T03:34:27Z

The problem isn't a "redacted by" thing, it's that having an auto-redact state entry creates a false sense of security, and an even falser sense of control.

rubo77 · 2016-12-31T03:36:11Z

The "sense of security" wouldn't be false, if the history length would be visible in the head of the room.

Look at Telegram: there are rooms that delete everything after some minutes, and this is very visible to the user.

And moderated rooms are a fine option in chat systems like Slack and Matrix. There just has to be a fine-grained configuration option for who is allowed to delete messages, or whether it is allowed at all.

And it has to be transparent.

4nd3r · 2016-12-31T09:21:57Z

please see my comment here: #1621 (comment)

kythyria · 2016-12-31T10:29:12Z

> The "sense of security" wouldn't be false, if the history length would be visible in the head of the room.

It would be entirely a lie if any server in the room ignored the history length. Which they will. So the only non-wishful history length it would be valid to display is "messages might be retained forever".

To put it another way, I can put Delete-after: 2d in my emails, and write a client that advertises the option to set that header, but that means absolutely nothing if your mail system doesn't honour it. It just means people will incorrectly think the messages will self-destruct.

Telegram can do this because it's a closed system where one party controls all the servers, and using a third-party client is difficult. Neither of those applies to Matrix, except in a strictly non-federated context.

rubo77 · 2016-12-31T15:27:43Z

#1621 (comment) by @kythyria

> If and only if the room is completely unfederated, and the server honours the relevant messages, will redaction do what @rubo77 seems to think it does.

So is this all true?

- Redacting "flags" a message in the database so that it should not be shown in clients (but it still could be shown anyway).
- This "flag" is federated to other servers too.
- If every client obeyed and stopped showing redacted messages, they would no longer be visible anywhere (Riot does obey this).
- If clients don't obey, they can still show the content of messages that were redacted.
- If a server is not federated with other servers, complete deletion of a message's content could be a possibility to implement in the future.

kfatehi · 2017-04-18T01:28:59Z

I also would like the ability to clear history in a room on my homeserver.

In the interim I just run these three SQL queries on my homeserver.db...

delete from events where room_id = '...';
delete from event_json where room_id = '...';
delete from event_push_actions where room_id = '...';

If I were to go a step further I would parse each event type, and if it's a media message, go and delete the appropriate resources from the content repository, and then expose this feature in the UI to admins. But for now this + clear caches in riot is sufficient for my needs.

rubo77 · 2017-04-18T13:27:30Z

@kfatehi I think you are causing havoc in your database like this. A lot more tables are affected, and federation breaks completely if you just delete stuff directly.

Please use the implemented prune functionality for this.

kfatehi · 2017-04-18T21:51:42Z

@rubo77 Thanks for the comment. I am not familiar with prune -- reading the thread above it sounds like it doesn't actually delete messages, and for that I'd have to redact. A script that goes through and redacts everything might be good, but I'm not sure how effective redaction is in a situation like seizure of a homeserver. I'd need to audit these mechanisms and find out for sure.

Had this room been anything but a private direct-chat without federation, I'd have been more cautious!

Keeping an eye on element-hq/element-web#3104 -- thanks for creating these.

rubo77 · 2017-04-19T05:15:45Z

There is a purge feature that really deletes the messages: #911

This request was a different idea: redact instead of prune.

ghost · 2017-09-20T09:37:54Z

@rubo77

> On top of the fact that deleting events is literally impossible to do entirely due to Matrix's design, so signatures are always left.

Care to elaborate on this?

rubo77 · 2017-09-20T21:30:24Z

The problem is the following: in some rooms, the history simply needs to be deleted after a certain time. Truly deleting the messages is not possible if the room is federated, because you can only delete them on your own homeserver and they will be federated back to life from other homeservers.

The only solution at the moment is to redact all old posts, and the redactions will then be federated. (I am aware that some homeservers could be modified not to obey the redact flag, but the solution would be "best effort".)

It would be easy to create a script that redacts all posts older than a certain time, so it would be a nice feature if this existed directly in the room configuration.

Such an option should be completely transparent to all members, so that you can see that anything you write in that room will only last for that long.

ghost · 2017-09-22T16:25:33Z

@rubo77 What about "signatures are always left"? I don't understand this part.

rubo77 · 2017-09-22T21:07:00Z

@Half-Shot said:

> signatures are always left.

I can only guess what he meant: if you redact messages, some remnants are left in the database, for example the date of the posts and who posted them. But "signature" is not the correct term for these "relics".

ghost · 2017-09-25T20:33:22Z

@rubo77 so metadata?

I don't think metadata should be left on servers forever, that's a privacy nightmare and there's no reason for that.

MurzNN · 2017-09-26T06:13:53Z

After redacting, can we keep only the signature on servers, without metadata (message text content, etc.)? As I understand it, the server validates message content via the signature. But if a message is redacted, can we skip validation and accept the cleaned-up message with a 'redact' flag and the signature kept?

kythyria · 2017-09-29T09:23:54Z

Redacted messages contain a copy of the redaction message, the id, timestamp, and sender, as far as I can tell. The content is gone (this is all assuming that redaction is correctly implemented, which of course there are zero guarantees about).

The signature validation is designed so that this works (and the redaction message isn't part of the signature, nor could it be). Matrix relies on the signatures chaining together in order for a room to stay coherent, so there needs to be enough for the validation to work.

MurzNN · 2017-09-29T09:41:34Z

@kythyria thanks for the description!

I want to describe the privacy problem with deletion in more detail. Matrix developers very often warn about privacy issues in feature requests about deleting rooms and messages: with federation we can't control other servers and can't be sure that they remove messages and rooms, so they don't want to implement deletion (full room removal, self-destructing messages, etc.) in the Matrix protocol. But most rooms on a server are usually not federated and can be successfully cleaned up on one homeserver with full guarantees. Users still miss this feature, even if the room is not federated.

So a good approach to deletion would be to check whether the room is federated and show a "large red warning" on the client side when a user tries to clean something up, explaining that the data of this room is removed only on this homeserver and may be kept on other federated servers. And add a per-room "Disable federation" option.

This is better than dismissing all deletion feature requests from users with "this is insecure, so it will not be implemented".

4nd3r · 2017-09-29T10:32:20Z

@MurzNN what you describe is basically what i described here: #1621 (comment)

rubo77 · 2017-09-29T11:21:51Z

Yes, great conclusion! So could someone please implement this behaviour?

What can we do to help accelerate the development in this direction, so we get these options?

MurzNN · 2017-09-29T11:28:40Z

We can already implement this feature now via a bot; here is the issue: turt2live/matrix-wishlist#82
This is not too much work, so anybody with free time or programming resources could write the bot, based on Go-NEB for example.

MurzNN · 2017-10-07T10:36:02Z

It seems there is now an admin command in Synapse for purging rooms: https://github.com/matrix-org/synapse/blob/master/docs/admin_api/purge_history_api.rst

4nd3r · 2017-10-07T14:36:48Z

@MurzNN this API doesn't delete events, but just some state related stuff, AFAIK

see https://github.com/matrix-org/synapse/blob/master/synapse/storage/events.py#L2014

rubo77 · 2018-02-15T14:46:04Z

Any news here? An optional per-room auto-deletion feature is strongly needed!

rubo77 · 2018-11-12T23:45:42Z

I added a script to the contrib section, that you can use: https://github.com/matrix-org/synapse/tree/develop/contrib/purge_api
This script only purges the history, so if the rooms are federated, the messages are not gone (unless purged everywhere)
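
For comparison with redaction, a purge request against the Synapse admin API might look roughly like the sketch below. This is not the contrib script itself. It assumes the `/_synapse/admin/v1/purge_history/<room_id>` path and the `purge_up_to_ts` and `delete_local_events` body fields that recent Synapse versions document (older versions used a different path, so check the docs for your version), and GNU `date`. `HOMESERVER`, `ADMIN_TOKEN`, and the helper names are placeholders invented for this example.

```shell
#!/bin/sh
# Sketch only: HOMESERVER and ADMIN_TOKEN are placeholders; the token must belong to a server admin.
HOMESERVER="${HOMESERVER:-https://yourserver.tld}"
ADMIN_TOKEN="${ADMIN_TOKEN:-changeme}"

# Percent-encode the characters that commonly appear in Matrix room IDs.
encode_id() {
  printf '%s' "$1" | sed -e 's/%/%25/g' -e 's/!/%21/g' -e 's/\$/%24/g' \
    -e 's/:/%3A/g' -e 's/#/%23/g'
}

# Convert a date string (anything GNU date understands) to milliseconds since the epoch.
cutoff_ms() {
  echo "$(( $(date -d "$1" +%s) * 1000 ))"
}

# Build the JSON body. delete_local_events=false keeps events sent by local users.
purge_body() {
  printf '{"delete_local_events": false, "purge_up_to_ts": %s}' "$1"
}

# Ask the homeserver to purge history in a room up to the given point in time.
# Arguments: room ID, date string such as '3 months ago'.
purge_history() {
  curl --silent --fail -X POST \
    -H "Authorization: Bearer $ADMIN_TOKEN" \
    -H 'Content-Type: application/json' \
    --data "$(purge_body "$(cutoff_ms "$2")")" \
    "$HOMESERVER/_synapse/admin/v1/purge_history/$(encode_id "$1")"
}
```

A call such as `purge_history '!cURbafjkfsMDVwdRDQ:matrix.org' '3 months ago'` only affects the local database; nothing is sent to other servers in the room.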

mehturt · 2018-11-13T07:43:12Z

@rubo77 thanks.. what is the best way to discuss if the script does not work for me?

rubo77 · 2018-11-13T07:57:38Z

If you have enhancements to the script then create a pull request here.

Or contact me in https://riot.im/app/#/room/#synapse-admins:yuhu.ddns.net as user rubo77

cuongnv · 2018-11-20T06:34:45Z

@rubo77: I created a simple Python script that can remove messages after a predefined timeout.
#4206

richvdh · 2019-08-19T11:38:50Z

I think it unlikely this is a feature we will add to synapse.

MurzNN · 2019-08-19T11:48:01Z

But we have MSC2228: Self destructing events in proposed-final-comment-period. Isn't that related to this feature?

richvdh · 2019-08-19T12:51:13Z

that's about events which get redacted after a certain period (eg '1 hour') which is different to an API which redacts all events older than a certain point in time (eg '06:00 today')
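
To make that distinction concrete, here is a small sketch of the two checks. The function names and the use of millisecond timestamps are illustrative assumptions and are not taken from MSC2228 or Synapse.

```shell
#!/bin/sh
# Per-event lifetime: each event expires relative to its own timestamp.
# Arguments: event timestamp (ms), lifetime (ms), current time (ms).
expired_by_lifetime() {
  [ "$(( $1 + $2 ))" -le "$3" ]
}

# Absolute cutoff: one fixed point in time applies to every event in the request.
# Arguments: event timestamp (ms), cutoff timestamp (ms).
older_than_cutoff() {
  [ "$1" -lt "$2" ]
}
```

With a lifetime, the moment of redaction moves with each event; with a cutoff, a single request decides for all events at once which side of the line they fall on.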

rubo77 · 2019-08-19T13:56:13Z

Please reopen.

I plan to create a contribution, that adds this as an external script

rubo77 changed the title ~~Prune API: redact all events before pruning~~ Redact all events oder than a certain time Dec 31, 2016

rubo77 changed the title ~~Redact all events oder than a certain time~~ Redact all events older than a certain time Dec 31, 2016

This was referenced Dec 31, 2016

Purge History API doesn't remove events from database #1621

Closed

delete old data to clean up database #890

Open

rubo77 mentioned this issue Feb 1, 2017

Add option to prune history after a certain time element-hq/element-web#3104

Closed

4nd3r mentioned this issue Jun 29, 2017

Limit size of room storage (files and history) in Matrix server #2315

Open

saintger mentioned this issue Jul 12, 2017

Deleting a room with associated contents (messages and uploaded files) matrix-org/matrix-spec-proposals#948

Closed

This was referenced Sep 21, 2017

Room purge bot turt2live/matrix-wishlist#82

Open

Matrix room history purge bot (for delete old messages) turt2live/matrix-wishlist#85

Closed

4nd3r mentioned this issue Oct 14, 2017

make it absolutely clear that purge doesn't remove everything #2540

Merged

ghost mentioned this issue Feb 6, 2019

[WIP] Utilize Riot Instead of Wire privacytools/privacytools.io#562

Merged

richvdh closed this as completed Aug 19, 2019

rubo77 mentioned this issue Oct 29, 2019

add purge_history.sh and purge_remote_media.sh scripts #4155

Merged
