-
Notifications
You must be signed in to change notification settings - Fork 0
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Restructure 'seeding' to be Knex based #1230
Merged
Merged
Conversation
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
https://eaflood.atlassian.net/browse/WATER-4575 We have been working on replicating how the [water-abstraction-import](https://github.com/DEFRA/water-abstraction-import) service imports a licence. See [Import Licence versions](#1195) for example. This is in readiness for moving to ReSP managing licences rather than NALD. What it exposed was that our test data setup needed expanding. We started with [create wrapper for licence-version-purpose-condition table](#1181) which added the ability to seed licence purpose condition types. This was then expanded by [Add new test reference data seeders](#1206) which added some more reference data, for example, purposes. The issue we've got is the existing changes are straddling two different mechanisms. We've included them as a seed file in our [Knex seeding capabilities](https://knexjs.org/guide/migrations.html#seed-files). This is reference data so why not seed it?! And we are seeding it whenever we perform a Database clean. Our intent is to have real reference data when unit testing, and until we've removed calls to `DatabaseSupport.clean()` from all our tests we need to do this each time. That has led to the logic and data itself living in the `test/` folder. There is a conceptual problem with this; something we depend on 'for real' lives in our `test/` folder. Plus the functionality is split into 2 places. It also means our `test/support/seeders` has a mix of types. Original 'unit test seeders' which were intended to help tidy up bulky test setup away from within the test itself, with 'knex seeders', things we'll also call when we run `npm run seed` against the non-test DB. So, on reflection we think we need to carry out a little restructuring. _All_ the 'seeding' functionality needs to move to `db/seeds` but with the ability to still call it from `clean()` until such time as we have updated all tests and can drop the command. Whilst we're doing this housekeeping we'll also update our existing users seed to match some of the conventions that have been introduced, whilst also re-implementing some existing ones. The end result is that we can seed both the real and test databases with reference data, even when a DB already has existing records and this functionality will live in one place and be built on **Knex**.
Cruikshanks
added
the
housekeeping
Refactoring, tidying up or other work which supports the project
label
Aug 2, 2024
Rather than bundling _all_ the reference data into a single seeder we'll be splitting them out. This will hopefully make it easy to see what data is being seeded, and maintain in the future. We're going to start with regions.
We move the data into a `db/seeds` and away from `test/support/seeders/data` so that the seeding and the seed data will live together. We also make some changes to the contents. Though possible to export values directly we think it is clearer to first declare them then export them. We also make changes to the seed data itself. We shouldn't de hard coding the created and updated at times as these will be set when the records are created. In our tests we sometime do need to set these. But only ever when dealing with transactional data. There is no scenario where the created or updated at values of a reference record matters.
The original seeder was using raw SQL to perform the update/insert. However, there is no reason why we cannot use Objection.js as we do in the majority of our database interactions. We've also changed what kind of queries get run. With no constraint on the table, we'll always need to run 2 queries. The original ran both the update and insert, each with clauses that would mean only one would actually make any changes. On reflection, we think it is clearer and safer to only run the 'mutating' query needed. So, instead our first query is to determine if a region already exists. If it does, we perform an update. If it doesn't we perform the insert. We've also extended the seeder to cater for the fact `water.regions` has an `is_test` column we don't expose in `public.regions`. Our existing tear down and acceptance tests depend on the 'Test Region' having this flag set. So, we cater for it in our new seeder. The final change is the file name. Knex runs the seed files in the order found. Because of this it is considered good practise to name your seed files in such a way that you control the order they are run in. Hence the pattern of prefixing seed file names with numbers.
Needed else Lab fails. Another change to highlight is moving where we have referenced the regions data into its own section in the test file header. We think this will make it clearer that this is test data we are referencing, as opposed to a standard test helper.
Cruikshanks
force-pushed
the
restructure-seeders
branch
from
August 5, 2024 10:00
5285d3c
to
84308df
Compare
Closed
The need to group them was more a helper/simplifier for the call we need to make in Database.clean(). Now we have the files split Knex seeding can call them as normal and we can move that simplification to where it is actually used.
These tables all default to the current date and time when inserting a new record for the `createdAt` and `updatedAt` fields. But unlike our tables they don't have triggers that will ensure `updatedAt` gets set whenever an `UPDATE` is called. So, this change ensures the seeders set `updatedAt`.
Spotted when we update it to reference the new location for purpose data. We were still generating an external ID using the helper generate code methods but then selecting values from the source reference data. An external ID is built from the codes of the purposes the return requirement purpose is linked to. So, it _should_ be in sync with the purposes actually linked. We think there are other places where selecting a random entry from the reference data would also be needed. So, we also refactor some of the logic to a new helper method.
We're about to update all references to `Region.add()` to be `Region.select` so that we are using the pre-existing reference data rather than adding duplicate records solely for a test. This one test currently relies heavily on injecting data into the DB. However, it is calling 2 other services that handle fetching the data which have their own tests. So, unlike the changes we're about to make, we can update this test to be 100% stub dependent.
test/services/idm/fetch-user-roles-and-groups.service.test.js
Dismissed
Show dismissed
Hide dismissed
test/services/idm/fetch-user-roles-and-groups.service.test.js
Dismissed
Show dismissed
Hide dismissed
This is an env var we don't default in our config and now we've included seeding users in our test setup we depend on it existing for CI to work.
This means, should we ever have to, we could run the seeds in `production` and not add a bunch of users we've identified in our source code!
We got intermittent failures in CI with the user testing, especially where we are having to create user-role records. This change reduces the number of times we are having to create them by moving all the setup of the user tests into a `before()` rather than `beforeEach()`. We also remove random record selection and ensure a spread of users, groups and roles across the tests.
Having ensured they stuck around for the UserHelper and UserGroupHelper because it made sense, it made us think again about their use in reference-only helpers. They weren't being used and would be unlikely to. We've also found that in most cases we've been defining indexes in the tests. So, `DEFAULT_INDEX` as also not really needed.
Cruikshanks
added a commit
that referenced
this pull request
Aug 20, 2024
https://eaflood.atlassian.net/browse/WATER-4579 We recently started [seeding the test DB with reference data](#1206). We then [Restructured that seeding to be Knex based](#1230). We thought we'd covered all reference data with the seeding. But we've just spotted that we missed charge version change reasons (`ChangeReasonModel`). We're about to add some new unit tests in [Add history attributes to ChargeVersionModel](#1272) that depend on them. So, rather than insert yet more into the DB, this change switches change reasons to be 'reference' based, i.e. tests should select from pre-seeded reasons rather than adding generated ones.
Cruikshanks
added a commit
that referenced
this pull request
Aug 20, 2024
https://eaflood.atlassian.net/browse/WATER-4579 We recently started [seeding the test DB with reference data](#1206). We then [Restructured that seeding to be Knex based](#1230). We thought we'd covered all reference data with the seeding. But we've just spotted that we missed charge version change reasons (`ChangeReasonModel`). We're about to add some new unit tests in [Add history attributes to ChargeVersionModel](#1272) that depend on them. So, rather than inserting more into the DB, this change switches change reasons to be 'reference' based, i.e., tests should select from pre-seeded reasons rather than adding generated ones.
Cruikshanks
added a commit
that referenced
this pull request
Aug 20, 2024
https://eaflood.atlassian.net/browse/WATER-4579 We recently started [seeding the test DB with reference data](#1206). We then [Restructured that seeding to be Knex based](#1230). We thought we'd covered all reference data with the seeding. But we've just spotted that we missed charge version change reasons (`ChangeReasonModel`). We're about to add some new unit tests in [Add history attributes to ChargeVersionModel](#1272) that depend on them. So, rather than inserting more into the DB, this change switches change reasons to be 'reference' based, i.e., tests should select from pre-seeded reasons rather than adding generated ones.
Jozzey
added a commit
that referenced
this pull request
Sep 5, 2024
Currently with the way data is being inserted into `charge_categories` using the helper to create records during unit testing. We are getting intermittent failures when running the unit tests when the charge `reference` gets duplicated due to it's lack of randomness. We have therefore decided that since the data in `charge_categories` is reference data we shouldn't really be creating a new charge category records each time a unit test requires one. Instead we should just be seeding the data once and then writing the unit tests to use this seeded data. The bulk of this work has been done for other tables in this PR #1230
Jozzey
added a commit
that referenced
this pull request
Sep 11, 2024
Currently, with the way data is being inserted into `charge_categories` using the helper to create records during unit testing. We are getting intermittent failures when running the unit tests when the charge `reference` gets duplicated due to its lack of randomness. We have therefore decided that since the data in `charge_categories` is reference data we shouldn't be creating new charge category records each time a unit test requires one. Instead, we should just be seeding the data once and then writing the unit tests to use this seeded data. The bulk of this work has been done for other tables in this PR #1230 This PR will create the functions required to seed the `charge_categories` and then fix any code affected by this change.
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.
This suggestion is invalid because no changes were made to the code.
Suggestions cannot be applied while the pull request is closed.
Suggestions cannot be applied while viewing a subset of changes.
Only one suggestion per line can be applied in a batch.
Add this suggestion to a batch that can be applied as a single commit.
Applying suggestions on deleted lines is not supported.
You must change the existing code in this line in order to create a valid suggestion.
Outdated suggestions cannot be applied.
This suggestion has been applied or marked resolved.
Suggestions cannot be applied from pending reviews.
Suggestions cannot be applied on multi-line comments.
Suggestions cannot be applied while the pull request is queued to merge.
Suggestion cannot be applied right now. Please check back later.
https://eaflood.atlassian.net/browse/WATER-4575
We have been working on replicating how the water-abstraction-import service imports a licence. See Import Licence versions for example.
This is in readiness for moving to ReSP managing licences rather than NALD.
What it exposed was that our test data setup needed expanding. The import relies on 'real' reference data being present. So, it makes testing easier if this is the case in our test database when writing unit tests.
We started with create wrapper for licence-version-purpose-condition table which added the ability to seed licence-purpose condition types. This was then expanded by Add new test reference data seeders which added some more reference data, for example, purposes.
The issue we've had is that the existing changes are straddling two different mechanisms. We've included them as a seed file in our Knex seeding capabilities. This is reference data, so why not seed it?!
And we are seeding it whenever we perform a database clean. Our intent is to have real reference data when unit testing, and until we've removed calls to
DatabaseSupport.clean()
from all our tests we need to do this each time. That has led to the logic and data itself living in thetest/
folder.There is a conceptual problem with this: something we depend on 'for real' lives in our
test/
folder. Plus, the functionality is split into 2 places.It also means our
test/support/seeders
have a mix of types. Original 'unit test seeders' that were intended to help move bulky test setup away from within the test itself, with 'knex seeders', things we'll also call when we runnpm run seed
against the non-test DB.So, on reflection, we felt the need to carry out a little restructuring. All the 'seeding' functionality is moved to
db/seeds
but with the ability to still call it fromclean()
until such time as we have updated all tests and can drop the command.While we're doing this housekeeping, we also update our existing user seed to match the conventions that have been introduced while also re-implementing some existing ones, for example, preferring Objection.js to raw SQL queries.
The end result is that we can seed both the real and test databases with reference data, even when a database already has existing records, and this functionality will live in one place and be built on Knex.
But there's more!
Having made these changes to ensure both the real and test databases are correctly populated with reference data, it no longer made any sense to be adding new reference records within our tests.
So, we updated the relevant test helpers to provide a
select()
in place of theadd()
method. This means you can select the details of a reference record without having to first create or query for it.We then went through the affected tests and 'test-seeders' and swapped calls to
Helper.add()
withHelper.select()
. We also performed some housekeeping on some of the tests we touched.