Crypto: Fixed a bug where room keys would be rotated unnecessarily in the presence of blacklisted/withheld devices #4954
Previously, `is_session_overshared_for_user` did not take into account that `shared_with_set` also contains the IDs of withheld devices, which explicitly have never received the session keys. As a result, it mistakenly reported oversharing for those devices on every event sent while blacklisted/withheld devices were present in the room, and rotated the group session each time. The fix is to exclude devices with `ShareInfo::Withheld` from the enumeration.
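The check described above can be illustrated with a minimal sketch. It uses simplified stand-in types, not the crate's real ones: `ShareInfo`, `shared_with_set` and `is_session_overshared_for_user` are names taken from the description, while the variant payloads, the `String` device IDs and the function signature are assumptions made for illustration only.

```rust
use std::collections::{BTreeMap, BTreeSet};

/// Simplified stand-in for the crate's `ShareInfo` (assumption: the real type carries more data).
#[derive(Debug, Clone, PartialEq)]
enum ShareInfo {
    /// The room key was actually sent to this device.
    Shared,
    /// The device only received a withheld notice, e.g. because it is blacklisted.
    Withheld,
}

/// Hypothetical signature: returns `true` if some device that really received
/// the key is no longer among the user's intended recipient devices.
fn is_session_overshared_for_user(
    shared_with_set: &BTreeMap<String, ShareInfo>,
    recipient_devices: &BTreeSet<String>,
) -> bool {
    shared_with_set
        .iter()
        // The fix: a device that never held the key cannot be "overshared" with.
        .filter(|(_, info)| **info == ShareInfo::Shared)
        .any(|(device_id, _)| !recipient_devices.contains(device_id))
}
```

In this model a blacklisted device has a `Withheld` entry and is never a recipient. Without the `filter` step it would be reported as overshared on every send, which matches the unnecessary rotation described above.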
I don't really understand the failing test: It seems to withhold room keys for Dan's devices because of `m.no_olm`, so it doesn't actually share the keys with Dan, and hence correctly determines that no rotation is necessary after Dan's second device logs out. The correct test case should be to actually share the keys with both devices first, so that a rotation is in fact necessary.
#2729 (which is kind of the opposite of the behavior I was trying to fix here) would indicate that there are probably other considerations about which devices may actually eventually get the key, e.g.
poljar
left a comment
> I don't really understand the failing test: It seems to withhold room keys for Dan's devices because of `m.no_olm`, so it doesn't actually share the keys with Dan, and hence correctly determines that no rotation is necessary after Dan's second device logs out.
I think the test might have been written before the support for the m.room_key.withheld messages was added. It therefore didn't set up the required Olm sessions for the room key to be shared even though it obviously wanted Dan's devices to be in the shared set.
To fix the test we'll probably want to rewrite the test so it doesn't call share_room_key() and mark_request_as_sent() to modify the state of the outbound group session. Instead we can just call OutboundGroupSession::mark_shared_with(). This won't require any setup of the Olm sessions.
The bug you discovered was likely introduced with the .withheld messages as well.
I think that this PR makes sense, could you just fix the failing test and introduce a regression test which checks the case where we don't want to rotate as well.
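The two test cases requested here (one where rotation is needed, one where it is not) could look roughly like the following. This is a toy model, not the crate's test code: `ToySession`, `ShareState`, `mark_withheld` and `needs_rotation` are hypothetical, and the `mark_shared_with` signature is an assumption that only borrows the name of `OutboundGroupSession::mark_shared_with()`.

```rust
use std::collections::BTreeMap;

#[derive(Clone, Copy, PartialEq)]
enum ShareState {
    Shared,
    Withheld,
}

/// Toy stand-in for an outbound group session, keyed by (user ID, device ID).
#[derive(Default)]
struct ToySession {
    shared_with: BTreeMap<(String, String), ShareState>,
}

impl ToySession {
    /// Record that the key was sent, without setting up any Olm session.
    fn mark_shared_with(&mut self, user: &str, device: &str) {
        self.shared_with.insert((user.to_owned(), device.to_owned()), ShareState::Shared);
    }

    /// Record that the key was withheld from this device.
    fn mark_withheld(&mut self, user: &str, device: &str) {
        self.shared_with.insert((user.to_owned(), device.to_owned()), ShareState::Withheld);
    }

    /// Rotation is needed if a device that holds the key is no longer present.
    fn needs_rotation(&self, current_devices: &[(&str, &str)]) -> bool {
        self.shared_with.iter().any(|((user, device), state)| {
            *state == ShareState::Shared
                && !current_devices.contains(&(user.as_str(), device.as_str()))
        })
    }
}

/// Both of Dan's devices received the key, then DAN2 logs out.
fn dan_logs_out_after_key_was_shared() -> bool {
    let mut session = ToySession::default();
    session.mark_shared_with("@dan", "DAN1");
    session.mark_shared_with("@dan", "DAN2");
    session.needs_rotation(&[("@dan", "DAN1")])
}

/// DAN2 only ever got a withheld notice, then logs out.
fn dan_logs_out_after_key_was_withheld() -> bool {
    let mut session = ToySession::default();
    session.mark_shared_with("@dan", "DAN1");
    session.mark_withheld("@dan", "DAN2");
    session.needs_rotation(&[("@dan", "DAN1")])
}
```

The first scenario is the repaired original test, where a rotation is expected. The second is the regression case, where the session should be kept.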
@poljar Thank you for your feedback. Added the two tests. I believe there was an additional bug in the previous implementation: it did not take pending to-device requests into account. I can't prove to myself why it should not be possible for pending requests to exist before share_room_key is called, especially because it specifically handles this case after calling
There is some potential for refactoring, especially to properly encapsulate the
Codecov Report
All modified and coverable lines are covered by tests ✅

Additional details and impacted files

```
@@            Coverage Diff             @@
##             main    #4954      +/-   ##
==========================================
- Coverage   85.95%   85.94%   -0.02%
==========================================
  Files         325      325
  Lines       35649    35651       +2
==========================================
- Hits        30641    30639       -2
- Misses       5008     5012       +4
```
On second read, I believe the other bug is pretty much exactly #2729, so this should potentially be a fix for that IIUC
They can exist if you call
They can't exist if you call
The way to add the pending requests is to call
Both of those cases happen after the call to
You are right. The reason why this works is that we discard the room key if we don't manage to mark all the pending requests as sent: matrix-rust-sdk/crates/matrix-sdk/src/room/mod.rs Lines 1774 to 1785 in a60e336
Now, there is still an edge case where #2729 is important, if the future gets cancelled and we fail to discard the room key. This can happen if the application closes unexpectedly, then we are left with an
I was imagining that we should have a stateful API for this, i.e.
For example it would contain a per-room lock, users can't attempt to share a room key for the same room concurrently, it would also contain logic that would only persist and "activate" the room key if we managed to send out all the pending requests. So something like:

```rust
async fn share_room_key() -> ShareRoomKeyState {
    ...
}

impl Drop for ShareRoomKeyState {
    fn drop(&mut self) {
        if !self.is_done() {
            self.session_manager.discard_room_key(&self.room_id);
        }
    }
}

let state = olm_machine.share_room_key().await;

for request in state.pending_requests {
    // The `?` is fine because `drop()` will discard the room key if we fail.
    let response = client.send_request(request).await?;
    // This will set the `done` flag for the state if this is our final request.
    state.mark_request_as_sent(response).await;
}
```

This still has a problem with cancel safety, unless we also change the way the
All that being said, your approach might be better since it removes the cancel-safety problem.
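For readers who want to try the drop-guard idea from the sketch above, here is a small synchronous model of it. Everything in it is hypothetical: `ToySessionManager`, `next_request` and `share_and_send` are invented for illustration, requests are plain strings, and sending is a caller-supplied closure. Because it is synchronous, it does not touch the async cancel-safety concern raised in the comment.

```rust
use std::cell::RefCell;
use std::rc::Rc;

/// Hypothetical stand-in for the component that owns room keys.
#[derive(Default)]
struct ToySessionManager {
    discarded_rooms: RefCell<Vec<String>>,
}

impl ToySessionManager {
    fn discard_room_key(&self, room_id: &str) {
        self.discarded_rooms.borrow_mut().push(room_id.to_owned());
    }
}

/// Guard that discards the room key unless every pending request was sent.
struct ShareRoomKeyState {
    session_manager: Rc<ToySessionManager>,
    room_id: String,
    pending_requests: Vec<String>,
}

impl ShareRoomKeyState {
    fn is_done(&self) -> bool {
        self.pending_requests.is_empty()
    }

    /// Peek at the next request; it stays pending until marked as sent.
    fn next_request(&self) -> Option<String> {
        self.pending_requests.first().cloned()
    }

    fn mark_request_as_sent(&mut self, request: &str) {
        self.pending_requests.retain(|r| r != request);
    }
}

impl Drop for ShareRoomKeyState {
    fn drop(&mut self) {
        if !self.is_done() {
            self.session_manager.discard_room_key(&self.room_id);
        }
    }
}

/// Sends each request with `send` and stops at the first failure.
fn share_and_send(
    manager: Rc<ToySessionManager>,
    room_id: &str,
    requests: Vec<String>,
    mut send: impl FnMut(&str) -> Result<(), String>,
) -> Result<(), String> {
    let mut state = ShareRoomKeyState {
        session_manager: manager,
        room_id: room_id.to_owned(),
        pending_requests: requests,
    };
    while let Some(request) = state.next_request() {
        // An early return via `?` drops `state`, which discards the room key.
        send(&request)?;
        state.mark_request_as_sent(&request);
    }
    Ok(())
}
```

If `send` fails partway through, the guard still has pending requests when it is dropped, so the room key is discarded. If all requests succeed, `drop` does nothing.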
poljar
left a comment
…equests into account" This reverts commit c36d512.
poljar
left a comment
Thanks, looks good.
@poljar Done.
FYI there is at least one test which fails an
That doesn't surprise me, as mentioned, the API of the crypto crate doesn't prevent this state. The
For js-sdk users, this includes the following:
- Send stable identifier `sender_device_keys` for MSC4147 (Including device keys with Olm-encrypted events).
([#4964](matrix-org/matrix-rust-sdk#4964))
- Check the `sender_device_keys` field on _all_ incoming Olm-encrypted to-device messages and ignore any to-device messages which include the field but whose data is invalid (as per [MSC4147](matrix-org/matrix-spec-proposals#4147)).
([#4922](matrix-org/matrix-rust-sdk#4922))
- Fix bug which caused room keys to be unnecessarily rotated on every send in the presence of blacklisted/withheld devices in the room.
([#4954](matrix-org/matrix-rust-sdk#4954))
- Fix [matrix-rust-sdk#2729](matrix-org/matrix-rust-sdk#2729) which in rare cases can cause room key oversharing.
([#4975](matrix-org/matrix-rust-sdk#4975))
Previously, `is_session_overshared_for_user` did not take into account that `shared_with_set` also contains withheld device IDs who explicitly have never received the session keys. This would lead to it mistakenly determining oversharing for those devices for every event being sent in the presence of blacklisted/withheld devices in the room, and rotating the group session accordingly.

The fix is to correctly exclude devices with `ShareInfo::Withheld` from the enumeration.

Signed-off-by: Niklas Baumstark [email protected]