-
-
Notifications
You must be signed in to change notification settings - Fork 2.1k
Query missing cross-signing keys on local sig upload #7289
Query missing cross-signing keys on local sig upload #7289
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Does this save the key in the cache after fetching it and if not, should it? I can see above in the /keys/query path that it claims to be storing keys in the cache although failing to find where it actually does, and also not sure if this applies to cross-signing keys.
synapse/handlers/e2e_keys.py
Outdated
user = UserID.from_string(user_id) | ||
try: | ||
remote_result = yield self.federation.query_client_keys( | ||
user.domain, {"device_keys": {user_id: []}}, timeout=10 * 1000 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should the user ID we're querying be in here?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I assume so? That's what the user_id
in {"device_keys": {user_id: []}}
is. I left the value as []
as according to the MSC that should query all of their keys?
Does this save the key in the cache after fetching it and if not, should it?
It's not, currently, but I suspect we want to. However atm I don't see any cache involved here. It looks like it's just pulling them directly from the db. I can save them to the db instead.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
oh right yes, I was completely failing to read the object literal correctly.
and yeah, I think cache === database according to this code (
synapse/synapse/handlers/e2e_keys.py
Line 281 in d2a9b45
def get_cross_signing_keys_from_cache(self, query, from_user_id): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Well, if there's other code that relies on these keys being available then we'd also need the same fix there, and in that case it'd be more efficient to update the database in each instance.
But, changing the state of the db here may break other assumptions, so there's an argument against it. Unfortunately I don't have a grasp on everything Synapse is doing related to cross-signing.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Although, looking at what code calls get_e2e_cross_signing_key
, it's only the sig upload servlet and federation key queries, both of which shouldn't be harmed by having the extra keys in there.
Let me see what the code for saving them after retrieving them looks like.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks plausible, but yes, unsure if you would need to notify clients.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Probably best to notify clients.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
we seem to have a very similar situation in E2eKeysHandler.query_devices
, which I don't think does any notification in this situation. After all, we're assuming that the user does not share any rooms with the remote user, so why would we notify clients of the new key.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Indeed, I wonder if we should not be sharing code with E2eKeysHandler.query_devices
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I originally pulled do_remote_query
out of that method to try and share it between query_devices
and _get_e2e_cross_signing_verify_key
, but decided it was doing too much and just plucked self.federation.query_client_keys
out of it instead.
Now we've transitioned to self.federation.query_user_devices
which is a lot simpler. I'm just now worried/wondering whether we should be taking the device_keys
returned by this method and doing anything with them (which query_devices
seems to do, but it does so much else as well).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm a bit concerned that a client could easily cause a DoS here by uploading a bunch of signatures which it claims are for a user on some remote server. AFAICT we would then retrieve the key for every single signature?
synapse/handlers/e2e_keys.py
Outdated
user = UserID.from_string(user_id) | ||
try: | ||
remote_result = yield self.federation.query_client_keys( | ||
user.domain, {"device_keys": {user_id: []}}, timeout=10 * 1000 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
we seem to have a very similar situation in E2eKeysHandler.query_devices
, which I don't think does any notification in this situation. After all, we're assuming that the user does not share any rooms with the remote user, so why would we notify clients of the new key.
synapse/handlers/e2e_keys.py
Outdated
user = UserID.from_string(user_id) | ||
try: | ||
remote_result = yield self.federation.query_client_keys( | ||
user.domain, {"device_keys": {user_id: []}}, timeout=10 * 1000 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Indeed, I wonder if we should not be sharing code with E2eKeysHandler.query_devices
Correct, which is a problem that kind of invalidates this whole approach. Hm. |
@dbkr Is the problem that the race condition occurs between the client creating a DM with the remote user and trying to upload key signatures from that user? How can you upload signatures if your homeserver has ignored device updates from that user? How did you have their public keys to sign? Is the client waiting until the room is created before uploading signatures? If so, then there's definitely an issue with us querying for the rooms a remote user is in on the server. |
I don't think so: I've been creating a direct chat with the user, accepting, then starting a verification, so I don't think any of this could race when it's happening in the amount of time it takes me to click on stuff.
So in
It definitely is doing this: I'm not starting the verification process until the other user has joined (although this is also a good point - we've not done anything in particular to prevent you from trying to verify a user you aren't in a room with, but that's not the problem in this case). |
@dbkr Thanks. My working theory on this is that |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks roughly fine, other than a little nit. Please do try to test it in some way before merging, though.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
a few more thoughts and questions from me
Looking at the signature update handling code, we may want to update clients about new devices as well: synapse/synapse/handlers/e2e_keys.py Lines 1261 to 1262 in 40042de
Edit: Added some code to handle this. |
synapse/handlers/e2e_keys.py
Outdated
try: | ||
key_id, verify_key = get_verify_key_from_cross_signing_key(key) | ||
except ValueError as e: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm afraid this still feels like spaghetti. Why are we returning key_id and verify_key from _retrieve_cross_signing_keys_for_remote_user
if we immediately throw them away?
how about:
key = yield self.store.get_e2e_cross_signing_key(
user_id, key_type, from_user_id
)
if key:
# we found a copy of this key in our database. decode it and return it.
# XXX is a try/except needed here? If so, why? we didn't have one before.
key_id, verify_key = get_verify_key_from_cross_signing_key(key)
return key, key_id, verify_key
# some words about the edge case
if not self.is_mine(user) and key_type in ["master", "self_signing"]:
(
key,
key_id,
verify_key,
) = yield self._retrieve_cross_signing_keys_for_remote_user(user, key_type)
if key is None:
logger.debug("No %s key found for %s", key_type, user_id) # XXX or wrning?
raise NotFoundError("No %s key found for %s" % (key_type, user_id))
return key, key_id, verify_key
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
XXX is a try/except needed here? If so, why? we didn't have one before.
I suppose one wouldn't be if we trust the data in our database, which we probably should be.
synapse/handlers/e2e_keys.py
Outdated
# At the same time, store this key in the db for | ||
# subsequent queries | ||
yield self.store.set_e2e_cross_signing_key( | ||
user.to_string(), key_type, key_content | ||
) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
is it correct that we store it before we check that it can be parsed? (In short, is the data in the table supposed to have been validated or not? If not, we need to be careful when we extract it). What does the existing code do?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Considering the above call from the database did not have a try/except, I believe we're operating on the assumption that the data in the database is valid.
Yes, we should validate this data in some way. Some level of such is done in do_remote_query
:
synapse/synapse/handlers/e2e_keys.py
Lines 236 to 248 in 461f01a
for user_id, keys in remote_result["device_keys"].items(): | |
if user_id in destination_query: | |
results[user_id] = keys | |
if "master_keys" in remote_result: | |
for user_id, key in remote_result["master_keys"].items(): | |
if user_id in destination_query: | |
cross_signing_keys["master_keys"][user_id] = key | |
if "self_signing_keys" in remote_result: | |
for user_id, key in remote_result["self_signing_keys"].items(): | |
if user_id in destination_query: | |
cross_signing_keys["self_signing_keys"][user_id] = key |
I'd hate to duplicate but that could be lifted and modified.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Speaking of verification, one thing that's confusing me is that in the loop of _retrieve_cross_signing_keys_for_remote_user
, remote_result
should look something like this:
synapse/synapse/federation/transport/client.py
Lines 404 to 415 in fd8d154
Response: | |
{ | |
"device_keys": { | |
"<user_id>": { | |
"<device_id>": {...} | |
} } | |
"master_keys": { | |
"<user_id>": {...} | |
} } | |
"self_signing_keys": { | |
"<user_id>": {...} | |
} } } |
Thus key_content
should look something like:
synapse/synapse/federation/transport/client.py
Lines 407 to 408 in fd8d154
"<user_id>": { | |
"<device_id>": {...} |
And thus we should need to extract the key_contents
from that dictionary before we pass it to get_verify_key_from_cross_signing_key
.
And yet this PR works, so something isn't lining up here.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Already, so my docstring update is incorrect, as the thing I actually get back from query_client_keys
is in the form:
{
"user_id": "@test18:workerstest.amorgan.xyz",
"stream_id": 213,
"devices": [
{
"device_id": "MLYKYOAKEZ"
},
{
"device_id": "NBICDLRVHV",
"keys": {
"algorithms": [
"m.olm.v1.curve25519-aes-sha2",
"m.megolm.v1.aes-sha2"
],
"device_id": "NBICDLRVHV",
"keys": {
"curve25519:NBICDLRVHV": "E9ryKjylPorQIxyRA4FsJwi9d2bPECLvWKjgcuqmamU",
"ed25519:NBICDLRVHV": "yk5AMkCyhogBCQt1CrVbFL9weRb/5iExmIDYd7JmBhE"
},
"signatures": {
"@test18:workerstest.amorgan.xyz": {
"ed25519:NBICDLRVHV": "JTAL+uIduDDJjdQneocTbjqjIDMljqDOq8bJ6ZgAYOQ01wMHrX4qRmQufQT9KAiYoXQ+b6UY+ZNEIFjyi1rjCg",
"ed25519:cEYsFkRyHrfj1fLLgq4rA+JyvOwHda2mO092N4scL/E": "RQSm3aYjvWMWHx0GHZPRRKvj90TEdlqjOGXulXBUSDEuI2P64A7N/7/uSdx//BhqOgcmZSS/9o1Awlpb+KiiCg"
}
},
"user_id": "@test18:workerstest.amorgan.xyz"
},
"device_display_name": "https://riot.im/develop/ via Firefox on Linux"
}
],
"master_key": {
"user_id": "@test18:workerstest.amorgan.xyz",
"usage": [
"master"
],
"keys": {
"ed25519:RPg3wmdeZOvjMCmi3/pP9+4ndSW9W2dhE/RkLyCdr+E": "RPg3wmdeZOvjMCmi3/pP9+4ndSW9W2dhE/RkLyCdr+E"
},
"signatures": {
"@test18:workerstest.amorgan.xyz": {
"ed25519:NBICDLRVHV": "YLdKUD4aDmJx/vqIWjbqXTOZXv+cU1ba8Z9hw44vVn32uhLdJAiGC+L/6HeBTz7+Wh+NdWGBiy+usea3408bDA"
}
}
},
"self_signing_key": {
"user_id": "@test18:workerstest.amorgan.xyz",
"usage": [
"self_signing"
],
"keys": {
"ed25519:cEYsFkRyHrfj1fLLgq4rA+JyvOwHda2mO092N4scL/E": "cEYsFkRyHrfj1fLLgq4rA+JyvOwHda2mO092N4scL/E"
},
"signatures": {
"@test18:workerstest.amorgan.xyz": {
"ed25519:RPg3wmdeZOvjMCmi3/pP9+4ndSW9W2dhE/RkLyCdr+E": "1H9aaL9623L764uGAezZM5jH3Gp5DLwto4s9rz+IP3bGQqLo0Cyi2bWK54aJbVrJ6akwxahZRs/n9vLDn3nIDQ"
}
}
}
}
Lemme update that.
In terms of verification, I think we just need to check the user_id matches the user we're querying for, and that there's a keys
field under each *_keys
key.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Aaand I was looking at the output of query_user_devices
, not query_client_keys
. Have updated the right docstring now.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
So to conclude, we now validate the key by checking it belongs to the right user, and has a valid keys
key dict in it with get_verify_key_from_cross_signing_key
.
Co-Authored-By: Richard van der Hoff <[email protected]>
…:matrix-org/synapse into anoa/query_cross_signing_keys_key_upload * 'anoa/query_cross_signing_keys_key_upload' of github.com:matrix-org/synapse: Update changelog.d/7289.bugfix
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
a few nits left. looking pretty good though.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
lgtm otherwise. please fix up and if it passes CI, merge.
Synapse 1.12.4rc1 (2020-04-22) ============================== Features -------- - Always send users their own device updates. ([\#7160](#7160)) - Add support for handling GET requests for `account_data` on a worker. ([\#7311](#7311)) Bugfixes -------- - Fix a bug that prevented cross-signing with users on worker-mode synapses. ([\#7255](#7255)) - Do not treat display names as globs in push rules. ([\#7271](#7271)) - Fix a bug with cross-signing devices belonging to remote users who did not share a room with any user on the local homeserver. ([\#7289](#7289))
Synapse v1.12.4 Features: * Always send users their own device updates. (#7160) * Add support for handling GET requests for account_data on a worker. (#7311) Bugfixes: * Fix a bug that prevented cross-signing with users on worker-mode synapses. (#7255) * Do not treat display names as globs in push rules. (#7271) * Fix a bug with cross-signing devices belonging to remote users who did not share a room with any user on the local homeserver. (#7289)
…anoa/temp_working_cache_config * 'release-v1.12.4' of github.com:matrix-org/synapse: (123 commits) 1.12.4 formatting for the changelog 1.12.4rc1 1.12.4rc1 Do not treat display names as globs for push rules. (#7271) Query missing cross-signing keys on local sig upload (#7289) Fix changelog file Support GET account_data requests on a worker (#7311) Revert "Query missing cross-signing keys on local sig upload" Always send the user updates to their own device list (#7160) Query missing cross-signing keys on local sig upload Only register devices edu handler on the master process (#7255) tweak changelog 1.12.3 Fix the debian build in a better way. (#7212) Fix changelog wording 1.12.2 Pin Pillow>=4.3.0,<7.1.0 to fix dep issue 1.12.1 Note where bugs were introduced ...
…cache-config-without-synctl * 'develop' of github.com:matrix-org/synapse: (76 commits) 1.12.4 Revert "Revert "Merge pull request #7315 from matrix-org/babolivier/request_token"" Revert "Merge pull request #7315 from matrix-org/babolivier/request_token" Stop the master relaying USER_SYNC for other workers (#7318) Config option to inhibit 3PID errors on /requestToken Fix replication metrics when using redis (#7325) formatting for the changelog Another go at fixing one-word commands (#7326) 1.12.4rc1 1.12.4rc1 fix changelog name Extend StreamChangeCache to support multiple entities per stream ID (#7303) Extend room admin api with additional attributes (#7225) Add ability to run replication protocol over redis. (#7040) Do not treat display names as globs for push rules. (#7271) Reduce logging verbosity of URL cache cleanup. (#7295) Query missing cross-signing keys on local sig upload (#7289) import urllib.parse when using urllib.parse.quote (#7319) Reduce federation logging on success (#7321) Fix changelog file ...
Mitigates #7276
So we have a race condition that appears in the following scenario: You want to cross-sign with a remote user who doesn't share a room with any users on your homeserver. When you start cross-signing, a DM is created between you and the remote user.
When the remote user sends you their keys, the DM room may not have completed creation yet. In this case, your homeserver ignores the new keys. When you then upload your signatures of the remote user, your homeserver rejects them as it doesn't have the remote user's keys.
This seems to only occur with workerised Synapse setups. A possible solution to this is to fix the race condition, but that still leaves a number of people with broken setups. Another solution, and one we should probably do anyways, is to query the remote user's keys when a local user signs them, if the homeserver doesn't have them already.
So that's what this PR does. At least I think it's what it does. This is untested and I used MSC1756 as a reference for all of this so it may or may not be right. Review appreciated!