anoadragon453 (Member) commented Sep 12, 2025

This test started to fail when run against element-hq/synapse#18899. The failure appeared to be due to the test not waiting for signatures to propagate over federation: the loop would exit as soon as it saw a 'signatures' key, even though its value was still an empty dict.

A few moments later, the dict would be populated with the key we were waiting for. But while that was being sent over federation, the test would fail and exit abruptly.

This commit changes the loop to actually check for the signature we're waiting for, instead of exiting upon seeing the 'signatures' key. This seems like it would be less flaky in general, and prevents the test from failing in this case.
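
To illustrate the difference between the two exit conditions, here is a minimal standalone sketch. It does not use the real test helpers: `has_expected_signature`, `first_index_where`, the user ID, the key ID and the simulated responses are all hypothetical names and values made up for this example.

```perl
use strict;
use warnings;

# Hypothetical helper: true only once the specific signature we are
# waiting for is present, not merely when a 'signatures' key exists.
sub has_expected_signature {
    my ( $sigs, $user_id, $key_id, $expected ) = @_;

    my $ok = ref($sigs) eq 'HASH'
        && exists $sigs->{$user_id}
        && exists $sigs->{$user_id}{$key_id}
        && $sigs->{$user_id}{$key_id} eq $expected;

    return $ok ? 1 : 0;
}

# Hypothetical helper: index of the first item that passes the check, or -1.
sub first_index_where {
    my ( $check, @items ) = @_;

    for my $i ( 0 .. $#items ) {
        return $i if $check->( $items[$i] );
    }
    return -1;
}

# Simulated successive responses seen while polling (made-up values).
my @responses = (
    {},                      # no 'signatures' key yet
    { signatures => {} },    # key present, but the dict is still empty
    { signatures => { '@user2:example.org' => { 'ed25519:KEYID' => 'SIG' } } },
);

# Old condition: stop as soon as a 'signatures' key shows up.
my $old = first_index_where( sub { exists $_[0]{signatures} }, @responses );

# New condition: stop only when the expected signature is visible.
my $new = first_index_where(
    sub {
        has_expected_signature(
            $_[0]{signatures}, '@user2:example.org', 'ed25519:KEYID', 'SIG',
        );
    },
    @responses,
);

print "old check stops at response $old, new check at response $new\n";
```

With these simulated responses, the old condition stops on the empty dict (index 1), while the new one keeps polling until the signature has arrived (index 2).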

anoadragon453 (Member, Author) commented Sep 13, 2025

Looks like this is failing on Synapse workers - cache invalidation not propagating properly? I need to look through the logs.

Edit: Solved. It was indeed cache invalidation replication not working.

anoadragon453 force-pushed the anoa/cache_device_key_signatures branch from 9c7bec9 to 46be345 September 16, 2025 13:50
anoadragon453 marked this pull request as ready for review September 17, 2025 10:49
anoadragon453 requested a review from a team as a code owner September 17, 2025 10:49
MadLittleMods added the Z-Flaky (Tests which seem to fail at random) label Sep 17, 2025
Comment on lines +566 to +567
my $user2_device_key_id_hash = "EmkqvokUn8p+vQAGZitOk4PWjp7Ukp3txV2TbMPEiBQ";
my $user2_device_key_id = "ed25519:$user2_device_key_id_hash";
Contributor

Looks like prior art but how/why are these static?

I would've thought auto-generated device names, etc. would make these change.

Member Author

The key IDs are determined by the client (typically derived from a public key, as shown here). We could generate keys to use in these tests, but it's not necessary (and would add a smidge more time to running the tests).

I found it actually helps with debugging (being able to reuse grep commands) to have a static key ID.
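
As a rough sketch of what "derived from a public key" can look like: the snippet below assumes the hash part of the static key ID is the unpadded base64 encoding of a 32-byte ed25519 public key. That is an assumption based on its 43-character length, not something the test itself asserts. The helper names are hypothetical and are not part of the test suite.

```perl
use strict;
use warnings;
use MIME::Base64 qw( encode_base64 decode_base64 );

# Hypothetical helper: base64 without line breaks or trailing '=' padding.
sub encode_unpadded_base64 {
    my ($bytes) = @_;

    my $b64 = encode_base64( $bytes, '' );    # '' disables line breaks
    $b64 =~ s/=+\z//;                         # strip the padding
    return $b64;
}

# Hypothetical helper: restore the padding before decoding.
sub decode_unpadded_base64 {
    my ($b64) = @_;

    $b64 .= '=' x ( ( 4 - length($b64) % 4 ) % 4 );
    return decode_base64($b64);
}

# Hypothetical helper: build a key ID of the form "<algorithm>:<unpadded base64>".
sub key_id_for {
    my ( $algorithm, $public_key_bytes ) = @_;

    return $algorithm . ':' . encode_unpadded_base64($public_key_bytes);
}

# The static value from the test, treated here as an encoded public key.
my $static_hash = "EmkqvokUn8p+vQAGZitOk4PWjp7Ukp3txV2TbMPEiBQ";
my $public_key  = decode_unpadded_base64($static_hash);
my $key_id      = key_id_for( 'ed25519', $public_key );

print length($public_key), " bytes -> $key_id\n";
```

Because the ID is a pure function of the key bytes, hardcoding it gives the same string on every run, which is what makes reusing grep commands practical.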

exists $sigs->{$user2_id}
&& exists $sigs->{$user2_id}{$user2_device_key_id}
&& $sigs->{$user2_id}{$user2_device_key_id} eq $cross_signature
or die "Expected cross-signature ($user2_device_key_id}->$cross_signature not visible";
Contributor

The braces/parentheses are unbalanced and seem wrong here. I don't know which way they should go.

Suggested change
or die "Expected cross-signature ($user2_device_key_id}->$cross_signature not visible";
or die "Expected cross-signature {$user2_device_key_id}->$cross_signature not visible";

Member Author

Ah, this is just string formatting (literally printing (...) in the error). But I see now that using -> made it seem like we're pulling data out of a structure.

Co-authored-by: Eric Eastwood <[email protected]>
anoadragon453 (Member, Author) left a comment

Thanks for the review @MadLittleMods! 🐛

anoadragon453 merged commit 9fa1325 into develop Sep 18, 2025
6 of 7 checks passed
anoadragon453 deleted the anoa/cache_device_key_signatures branch September 18, 2025 11:33