Skip to content
This repository has been archived by the owner on May 15, 2024. It is now read-only.

Backwards compatibility validator #3

Merged
merged 6 commits into from
May 25, 2021
Merged

Backwards compatibility validator #3

merged 6 commits into from
May 25, 2021

Conversation

sldblog
Copy link
Contributor

@sldblog sldblog commented May 20, 2021

Validates that during change to reference data CSVs,

  • entries are not deleted (instead, add a start_date/end_date column pair)
  • data structure changes are backwards-compatible in the same file:
    • cannot remove existing columns
    • cannot rename existing columns
    • can add new columns

To bypass these validations, a new file has to be created


Currently, there's no check that

each entry still has a unique ID, which has to be in the first column

because it's failing, as it looks like we may need composite PK columns:

OFFENCES.csv:
  ❗ error: duplicate key: ID `LM82008` appears more than once
  ❗ error: duplicate key: ID `LM82009` appears more than once
<snip>

How does it work?

  1. Retrieves the *.csv files from a git reference, by default origin/main (VersionedRegisters)
  2. Then compares all of them against the version in the local filesystem (CheckCompatibility)

(The thing is orchestrated by App)

Why?

To make it extremely easy to update reference data for anyone (github text editor, basically). It also gives us history 🎉

And to provide guarantees for applications loading this data up that there won't be suddenly missing entries or incompatible payloads.

Notes

As the app already ignores test files, there's scope to introduce a convention to ignore certain files that are currently experimental, i.e. ones with -draft in them, or similar.

sldblog added 2 commits May 20, 2021 17:26
The intent with this module is to compare:
- the original file (likely from version control)
- the changed file.

For now, the code will assume the first column is the primary key of all
register CSVs (it's true for all of them).

Documents the assertions checked in unit tests in the readme.
@sldblog sldblog requested review from Mjwillis and a team May 20, 2021 16:41
steps:
- uses: actions/checkout@v2
with:
fetch-depth: 0
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

because the app logic relies on git history; fetch-depth: 0 means fetch the entire history, not just a single commit

@sldblog sldblog force-pushed the validator branch 3 times, most recently from 2ee64b3 to 8b6da1d Compare May 21, 2021 12:26
sldblog added 3 commits May 21, 2021 13:41
We need to

- verify that new versions of a CSV is backwards compatible
- checking columns and rows are compatible

The first point is done via this new `VersionedRegisters` class, the
latter is by the previous `CheckVersion` class
Using `fetch-depth: 0` on checkout because the code relies on the
history of the repository
I've run down a rabbit hole when a test was not working and it turns out
it had a non-existing git SHA as a starting point

This check will ensure we abort as soon as possible
@sldblog
Copy link
Contributor Author

sldblog commented May 25, 2021

I'll consider the silence as 👍 and sneak this in -- we can always revert

@sldblog sldblog merged commit d22f610 into main May 25, 2021
@sldblog sldblog deleted the validator branch May 25, 2021 22:13
@sldblog sldblog mentioned this pull request Jul 2, 2021
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

1 participant