-
Notifications
You must be signed in to change notification settings - Fork 0
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Improve tear down speed #671
Merged
Merged
Conversation
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
The SQL was originally grabbed from calling `toString()` on the original Knex queries. They have all now been run through a SQL formatter to make them more readable.
Cruikshanks
requested review from
Demwunz,
robertparkinson,
Jozzey and
Beckyrose200
January 19, 2024 09:51
robertparkinson
approved these changes
Jan 19, 2024
Cruikshanks
added a commit
that referenced
this pull request
Jan 19, 2024
https://eaflood.atlassian.net/browse/WATER-4336 In [Improve tear down speed](#671) we made sweeping changes to the tear-down service which is used to support our [water-abstraction-acceptance-tests](https://github.com/DEFRA/water-abstraction-acceptance-tests). The new changes improved the performance of the clean down. However, we've spotted that one of the delete statements got lost in the whirlwind of our strategy testing; licence agreements not marked as 'test' but linked to 'test' licences. This a common issue the tear-down service has to deal with due to the way the tests are currently written and the inherited test seed process works. This changes adds the missing statement.
Cruikshanks
added a commit
that referenced
this pull request
Jan 19, 2024
https://eaflood.atlassian.net/browse/WATER-4336 In [Improve tear down speed](#671) we made sweeping changes to the tear-down service which is used to support our [water-abstraction-acceptance-tests](https://github.com/DEFRA/water-abstraction-acceptance-tests). The new changes improved the performance of the clean-down. However, we've spotted that one of the delete statements got lost in the whirlwind of our strategy testing; licence agreements not marked as 'test' but linked to 'test' licences. This a common issue the tear-down service has to deal with due to the way the tests are currently written and the inherited test seed process works. This change adds the missing statement.
Cruikshanks
added a commit
that referenced
this pull request
Jan 19, 2024
https://eaflood.atlassian.net/browse/WATER-4336 In [Improve tear down speed](#671) we made sweeping changes to the tear-down service which is used to support our [water-abstraction-acceptance-tests](https://github.com/DEFRA/water-abstraction-acceptance-tests). The new changes improved the performance of the clean-down. One of the critical changes was disabling and enabling table triggers before deleting records. Where a table has a foreign key constraint these would be checked on every deletion. We know we eventually intend to delete _everything_ related to the test data so these checks are redundant in the context of the tear-down. Disabling the checks before running the delete got a big bump in performance. We therefore wrapped any delete statement for a table with a foreign key constraint with `ALTER TABLE my_table DISABLE TRIGGER ALL;` and `ALTER TABLE my_table ENABLE TRIGGER ALL;`. All seemed well, until we started seeing random failures in the tear-down when running the full suite. A further dive into triggers found that these checks are also being done the other way. For example, `water.regions` has no foreign key constraint. But `water.licences` has one that links to `water.regions`. If you try and delete a region that is linked to a licence before deleting the licence you get an error. So, to avoid these issues we need to disable all triggers first, _then_ run the delete statements before finishing with re-enabling them all! 😩🤦
Cruikshanks
added a commit
that referenced
this pull request
Jan 21, 2024
In [Improve tear down speed](#671) we made sweeping changes to the tear-down service which is used to support our [water-abstraction-acceptance-tests](https://github.com/DEFRA/water-abstraction-acceptance-tests). The new changes improved the performance of the clean-down. One of the critical changes was disabling and enabling table triggers before deleting records. Where a table has a foreign key constraint these would be checked on every deletion. We know we eventually intend to delete _everything_ related to the test data so these checks are redundant in the context of the tear-down. Disabling the checks before running the delete got a big bump in performance. We therefore wrapped any delete statement for a table with a foreign key constraint with `ALTER TABLE my_table DISABLE TRIGGER ALL;` and `ALTER TABLE my_table ENABLE TRIGGER ALL;`. All seemed well until we started seeing random failures in the tear-down when running the full suite. A further dive into triggers found that these checks are also being done the other way. For example, `water.regions` has no foreign key constraint. But `water.licences` has one that links to `water.regions`. If you try and delete a region that is linked to a licence before deleting the licence you get an error. To avoid these issues we need to disable all triggers first, _then_ run the delete statements before finishing with re-enabling them all. 😩🤦 But that's not all! 😱 As we started refactoring the tear-down services we realised there were also occasions where we needed to delete test data from tables in a specific order, for example, billing data. To determine which billing transactions to delete you have to be able to link from them to the 'test' region via billing invoice licences, billing invoices, and then billing batches. If we delete the test data from one of these linking tables before the delete billing transactions statement gets called nothing will get cleansed. This might be the real reason why the **Promise all** strategy came out tops; it wasn't always deleting everything. So, this change switches the tear-down strategy to **Single query**.
Cruikshanks
added a commit
that referenced
this pull request
Jan 21, 2024
https://eaflood.atlassian.net/browse/WATER-4336 Ok, we've made a bit of a hash with trying to [improve the tear-down speed](#671). We realised in the re-write we'd dropped a licence agreement DELETE statement that needed including so [fixed it](https://github.com/DEFRA/water-abstraction-system/pull/673/files). Then we realised the tear-down was now faster but more flaky so [switched to the single query strategy](#674). Only in the switch what do you know, we lost the licence agreement delete statement again. Hopefully for the last time, we fix this!
Cruikshanks
added a commit
that referenced
this pull request
Jan 21, 2024
https://eaflood.atlassian.net/browse/WATER-4336 Ok, we've made a bit of a hash with trying to [improve the tear-down speed](#671). We realised in the rewrite we'd dropped a licence agreement DELETE statement that needed including so [fixed it](https://github.com/DEFRA/water-abstraction-system/pull/673/files). Then we realised the tear-down was now faster but more flaky so [switched to the single query strategy](#674). Only in the switch what do you know, we lost the licence agreement DELETE statement again. Hopefully, for the last time, we fix this!
Cruikshanks
added a commit
that referenced
this pull request
Feb 20, 2024
https://eaflood.atlassian.net/browse/WATER-4336 We made changes to [Improve tear down speed](#671). What we didn't spot when doing this is a table that needs clearing down has been missing from the `TearDownService` from the very start. This change ensures `water.charge_purposes` is cleared down when the `/tear-down` endpoint is hit.
Cruikshanks
added a commit
that referenced
this pull request
Feb 20, 2024
https://eaflood.atlassian.net/browse/WATER-4336 We made changes to [Improve tear down speed](#671). What we didn't spot when doing this is a table that needs clearing down has been missing from the `TearDownService` from the very start. This change ensures `water.charge_purposes` is cleared down when the `/tear-down` endpoint is hit.
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.
This suggestion is invalid because no changes were made to the code.
Suggestions cannot be applied while the pull request is closed.
Suggestions cannot be applied while viewing a subset of changes.
Only one suggestion per line can be applied in a batch.
Add this suggestion to a batch that can be applied as a single commit.
Applying suggestions on deleted lines is not supported.
You must change the existing code in this line in order to create a valid suggestion.
Outdated suggestions cannot be applied.
This suggestion has been applied or marked resolved.
Suggestions cannot be applied from pending reviews.
Suggestions cannot be applied on multi-line comments.
Suggestions cannot be applied while the pull request is queued to merge.
Suggestion cannot be applied right now. Please check back later.
https://eaflood.atlassian.net/browse/WATER-4336
The tear-down feature supports our water-abstraction-acceptance-tests. At the start of each test, it will make an HTTP call to
POST /data/tear-down
, resulting inTearDownService.go()
being called.The service and those it calls are responsible for clearing the various schemas we use of test data. Some tables have a
is_test
field that immediately identifies the record as being created for a test.But each test will create data that won't be flagged this way. So, the tear-down services also remove records we can identify as being test data.
is_test
recordname
, that is specific to the acceptance testsIdeally, we wouldn't need to remove anything between the tests. We could then avoid this step and the delay it causes. But until we re-architect fully the acceptance tests we've inherited we have to work with what we've got.
When we took on WRLS water-abstraction-service had a tear-down endpoint. But it would take more than a minute to complete. This caused the tests to take a long time to finish and often caused them to fail.
We built a more performant version. It would typically get the job done in 15 to 20 seconds. But sometimes this would creep to 40 seconds which again would cause test runs to fail.
So, we've gone back and tried a series of alternate strategies for clearing the data. Luckily, we've been able to identify a faster solution. This change updates tear-down to use the new strategy. We've also included a README covering what we tried and the results that led us to this decision (repeated below).