[meta] Synapse does not currently pass all of Complement's tests #8421
Comments
@anoadragon453 Was this just using the |
From some discussion with @kegsay he thinks the error above is due to a VPN. I was able to get an actual error out (at least from a particular test):
So it looks like something is going wrong with logins, which would not be great. :) |
I took a look at this. I think what is happening is that the test run is starting with a fresh database (instead of re-using the database from the blueprint run that creates users). This is with a sqlite database, and @kegsay said that the Docker flow makes a full copy of the file system, so he suggested looking into whether the data is being flushed to sqlite3 before the registration endpoint returns a response. I believe that Python flushes when you commit, which we do before the endpoint returns, so I was a bit surprised by this not working. I was running the following to only run a subset of tests:
I also tried making some local changes and running the following in my synapse dir:
|
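On the flushing question above, here is a minimal sketch of how commit visibility behaves with Python's built-in `sqlite3` module. It is not Synapse code: the `users` table, the file name and the user ID are made up for illustration. It only shows that a row written by one connection is not visible to a second connection on the same file until `commit()` is called.

```python
import os
import sqlite3
import tempfile
from typing import Tuple


def count_users(path: str) -> int:
    """Open a fresh connection and count the rows it can see."""
    conn = sqlite3.connect(path)
    try:
        return conn.execute("SELECT COUNT(*) FROM users").fetchone()[0]
    finally:
        conn.close()


def commit_visibility_demo() -> Tuple[int, int]:
    """Return the row counts seen by a second connection before and after commit."""
    with tempfile.TemporaryDirectory() as tmp:
        # Hypothetical database file, standing in for a homeserver's SQLite db.
        path = os.path.join(tmp, "example.db")
        writer = sqlite3.connect(path)
        writer.execute("CREATE TABLE users (name TEXT)")
        writer.commit()

        # The INSERT opens an implicit transaction that is not yet committed.
        writer.execute("INSERT INTO users VALUES ('@alice:example')")
        before = count_users(path)

        # After commit() the row is part of the database file.
        writer.commit()
        after = count_users(path)

        writer.close()
        return before, after
```

If the registration path really does commit before responding, the data should already be in the database file at that point, which suggests looking elsewhere for the cause.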
Sorry yes, just this. Thanks for providing some more commands.
This sounds plausible. |
So I believe this is indeed happening due to the container being copied before the SQLite3 database is written to. This likely won't be a problem with Postgres as we now autocommit while running. I've tracked down the meat of this to the following steps.
We create all necessary homeserver containers from the provided image: https://github.com/matrix-org/complement/blob/0e345aa81e79dbf147a8a4e92198f2a62223457d/internal/docker/builder.go#L220-L224
and run all blueprint commands in them (creating users, rooms, etc): https://github.com/matrix-org/complement/blob/0e345aa81e79dbf147a8a4e92198f2a62223457d/internal/docker/builder.go#L262-L288
The homeserver in each container is ready and has all the information we need to run the requested test against it. However, this information has not actually been committed to the db yet. It's just sitting inside of Synapse - in RAM effectively.
Information about each created container is stored in
Each of the containers is committed, which creates a new image from the changes in the container. However, only the filesystem is copied here. Everything in RAM is thrown away.
I believe stopping the container before committing, thus shutting down Synapse and allowing it time to flush all changes to the db, will help here.
Now excuse me as I broke my docker in the meantime and can't finish debugging this :^)
Edit: Rebooted to fix docker. Though unfortunately it seems stopping the container before committing it still doesn't carry the necessary database changes over to the deployed image that's used in the test 🤔 |
Right, forget all of the above. It's because the database is put in This is defined at: https://github.com/matrix-org/synapse/blob/develop/docker/Dockerfile#L75 This won't be saved on committing to an image, so you end up with an entirely new database on startup. |
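To make the volume behaviour concrete: `docker commit` does not include data stored in volumes mounted inside the container, so anything written under a declared volume is missing from the resulting image. The helper below is only a toy model of that rule, not Docker or Complement code, and the paths used with it are hypothetical placeholders rather than the real Synapse paths.

```python
from pathlib import PurePosixPath
from typing import Iterable


def survives_commit(file_path: str, volumes: Iterable[str]) -> bool:
    """Toy model: a file is kept by a commit only if it lives outside every volume."""
    path = PurePosixPath(file_path)
    for volume in volumes:
        vol = PurePosixPath(volume)
        # Files at or below a volume mount point are not part of the image layer.
        if path == vol or vol in path.parents:
            return False
    return True
```

For example, `survives_commit("/example-volume/example.db", ["/example-volume"])` is false, while the same file under a non-volume directory would be kept.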
DB fix is at matrix-org/complement#29 |
I seem to get:
randomly, which fails all tests and requires a reboot to fix once it starts. Anyone have a solution handy? :) According to JC this is due to the container not starting correctly, which Go's HTTP client then reports as "page not found". So... why isn't the container getting started? I don't know, problem for tomorrow :) |
So it turns out Synapse can sometimes take some time to start, and Complement gives up waiting for it after 2500ms. Setting the |
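The startup race above boils down to polling a readiness check against a deadline. Here is a rough sketch of that pattern in Python. It is not Complement's implementation (which is in Go); the function name and parameters are invented for illustration, and `check` stands in for whatever probe is used to decide the homeserver is up.

```python
import time
from typing import Callable


def wait_until_ready(
    check: Callable[[], bool],
    timeout_s: float,
    interval_s: float = 0.05,
) -> bool:
    """Poll `check` until it returns True or `timeout_s` seconds have elapsed."""
    deadline = time.monotonic() + timeout_s
    while True:
        if check():
            return True
        if time.monotonic() >= deadline:
            # Gave up: the caller sees a failure even if the server
            # would have come up shortly afterwards.
            return False
        time.sleep(interval_s)
```

With a deadline of 2.5 seconds, a server that needs longer than that to start is reported as failed, which matches the symptom described here. A longer timeout simply moves the deadline.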
The next problem to solve is certificate verification failure:
We need to configure Synapse - much like in the demo scripts - to be able to allow self-signed certificates for federation traffic. |
It looks like we actually try to create certificates that each homeserver will trust through a common CA - so I'll aim to get that working rather than just disabling verification everywhere. |
Looks like getting certificate validation working won't be possible until matrix-org/complement#28 or similar lands. This is because we're not testing Synapse against itself, but a fake complement federation server. This server has its own dummy CA which is generating certificates we can't trust. The above linked PR will expose the certificate of that dummy CA to the homeserver containers to trust - at that point we can turn federation certificate verification back on. After this there seem to be no more fundamental issues plaguing Synapse support for Complement 🎉 Just actual issues which will take Synapse and spec work to solve :) |
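For anyone less familiar with the trade-off, this is roughly what "trust a specific CA" versus "disable verification" looks like with Python's standard `ssl` module. It is a generic client-side sketch, not Synapse's federation code or configuration; `make_context` and its parameters are made up for illustration, and `ca_file` would be the path to the dummy CA's certificate once it is exposed to the containers.

```python
import ssl
from typing import Optional


def make_context(ca_file: Optional[str] = None, verify: bool = True) -> ssl.SSLContext:
    """Build a client TLS context that verifies peers, optionally against one CA."""
    # PROTOCOL_TLS_CLIENT enables certificate and hostname checks by default.
    ctx = ssl.SSLContext(ssl.PROTOCOL_TLS_CLIENT)
    if not verify:
        # Stopgap: accept any certificate. check_hostname must be
        # switched off before verify_mode can be set to CERT_NONE.
        ctx.check_hostname = False
        ctx.verify_mode = ssl.CERT_NONE
        return ctx
    if ca_file is not None:
        # Preferred: trust only certificates signed by the given CA.
        ctx.load_verify_locations(cafile=ca_file)
    return ctx
```

The `verify=False` branch corresponds to the temporary workaround, and the `ca_file` branch to what becomes possible once the CA certificate is available.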
PR for certification verification disabling: matrix-org/complement#30 |
There have been a few PRs landing in Complement to fix some more of the wide-sweeping test failures Synapse is experiencing:
Testing with these all merged, I've updated the test results in the original comment. |
Note that the original comment is being continuously updated with the current state of the tests and the PRs that will allow them to pass. |
Synapse is currently failing to pass all of the tests in Complement. Fixing this is a requirement for removing the "soft-fail" attribute when running it in CI.
As of now, the tests Synapse is failing are:
Anecdotally, it's worth mentioning that all tests failed when my personal VPN was on, failing with: