feat: no more user-required-changes to migration group loaders #6047

kylemumma · 2024-06-21T15:19:53Z

Users no longer need to change the group loader when creating a new migration. They are now only responsible for naming the migration correctly (xxxx_migration_name.py) and putting it in the correct folder (snuba_migrations/dataset/). The group loader finds new migrations in these folders by itself.

Migrations run in order of their migration numbers.

edge cases and validation

Migration numbers must begin at 0001 and increase by exactly 1 from one migration to the next; this is validated. The following would cause errors:

0000_migration_name.py
0001_migration.py, 0002_migration.py, 0004_migration.py
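
A minimal sketch of the strict check described above, for illustration only. The helper name `validate_strict_sequence` and the `ValueError` are assumptions, not the code in this PR; the error text loosely follows the message visible in the group_loader.py diff.

```python
import re
from typing import Iterable, List

# Hypothetical helper, not the actual Snuba implementation.
MIGRATION_RE = re.compile(r"^[0-9]{4}_.*\.py$")


def validate_strict_sequence(filenames: Iterable[str]) -> List[str]:
    """Return migration filenames in order, or raise if the numbers
    do not run 0001, 0002, 0003, ... without gaps."""
    # Zero-padded four-digit prefixes sort correctly as plain strings.
    ordered = sorted(f for f in filenames if MIGRATION_RE.match(f))
    for i, fname in enumerate(ordered):
        expected = str(i + 1).zfill(4)
        if fname[:4] != expected:
            raise ValueError(
                f"Expected migration number {expected} for migration "
                f"{fname} but got {fname[:4]}"
            )
    return ordered
```

Under this check, both examples above fail: the first because numbering starts at 0000, the second because 0003 is missing.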

how did i test

(see commit ac0f0a5 for this code)
I added an assert to every get_migrations to verify that the new function generates the same list as the hard-coded one, then ran snuba migrations list with the asserts in place to validate.

I verified edge-case / validation behavior by hand.

considerations

If users slightly misname their migration, it will be ignored, and they might not realize it or, if they do, might not understand why. For example, 001_migration.py will be ignored, and the only indication to the user is that it won't show up in snuba migrations list.
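
To make the silent-ignore behavior concrete, here is a small sketch using the regex quoted in the group_loader.py diff from this PR. The `is_discovered` helper and the sample filenames are hypothetical, and it assumes the pattern is applied with `re.match` (anchored at the start of the filename).

```python
import re

# Pattern as it appears in the group_loader.py diff in this PR.
pattern = re.compile(r"[0-9][0-9][0-9][0-9]_.*\.py")


def is_discovered(fname: str) -> bool:
    # Assumption: the loader matches from the start of the filename.
    return pattern.match(fname) is not None


candidates = [
    "0001_migration.py",   # well-formed
    "001_migration.py",    # only three digits
    "0001-migration.py",   # hyphen instead of underscore
    "0001_migration.txt",  # wrong extension
]
discovered = [f for f in candidates if is_discovered(f)]
ignored = [f for f in candidates if not is_discovered(f)]
```

Only the first filename is picked up; the other three are skipped without any error being raised.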

next steps

This will make my migration autogeneration easier, since it no longer needs to modify the group loaders either.

MeredithAnya · 2024-06-21T17:53:06Z

snuba/migrations/group_loader.py

+                    f"Migrations in folder {migration_folder} were not formatted as expected. Expected migration number {str(i+1).zfill(4)} for migration {fname} but got {fname[:4]}"
+                )
+
+        return migration_filenames


Should we be caching the results of the migration filenames? Or is this called infrequently enough that it wouldn't provide much benefit?

kylemumma · 2024-06-21

I'm not sure what calls this function, but my guess (based on what the function does) is that it is not called very frequently or under time-sensitive circumstances.

Additionally, since caching adds complexity, maybe it would be better to leave it out unless we find it necessary in the future? Unless you have reason to believe that it may be necessary.
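
For reference, a rough sketch of what caching the directory scan could look like, and of the complexity it brings. `list_migration_files` is a hypothetical stand-in for the loader's scan, not a function from this PR.

```python
import functools
import os


@functools.lru_cache(maxsize=None)
def list_migration_files(migration_folder: str) -> tuple:
    # Hypothetical stand-in for the loader's directory scan.
    # Returns a tuple so the cached value cannot be mutated by callers.
    return tuple(
        sorted(f for f in os.listdir(migration_folder) if f.endswith(".py"))
    )
```

The cost is staleness: a migration file added after the first call is not seen until `list_migration_files.cache_clear()` is called, which is the kind of extra bookkeeping the reply above is wary of.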

MeredithAnya · 2024-06-21T17:58:56Z

> if users slightly misname their migration it will be ignored and they might not realize or if they do, understand why
> such as 001_migration.py will be ignored and the only indication to user is that it wont show up in snuba list migrations

This might be out of scope, but it could be cool to do something similar to Django migrations https://docs.djangoproject.com/en/5.0/topics/migrations/#workflow where the user can pass a name but the numerical prefix is auto-generated.
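
One way the suggested prefix generation could work, as a sketch only. `next_migration_filename` is a hypothetical helper, and picking "highest existing number plus one" is an assumption about the desired behavior.

```python
import re

_PREFIX = re.compile(r"^([0-9]{4})_")


def next_migration_filename(existing, name):
    """Build a filename for a new migration called `name`, numbered one
    past the highest four-digit prefix found in `existing`."""
    numbers = [int(m.group(1)) for f in existing if (m := _PREFIX.match(f))]
    next_number = max(numbers, default=0) + 1
    return f"{next_number:04d}_{name}.py"
```

With this, the user supplies only the descriptive name, so a mistyped prefix such as 001_ cannot happen.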

onewland · 2024-06-21T18:34:14Z

snuba/migrations/group_loader.py

+            )
+        )
+        migration_filenames = []
+        pattern = re.compile(r"[0-9][0-9][0-9][0-9]_.*\.py")


it's a nit but I think you could avoid the separate regex match plus file listing (os.listdir) by using glob: https://docs.python.org/3/library/glob.html

kylemumma · 2024-06-21

Thanks Oliver, I didn't know you could do this with glob. I updated it to use glob.
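
A sketch of the glob approach suggested above, replacing os.listdir plus a regex with a single pattern. The function name `find_migration_filenames` is hypothetical, and the merged code may differ.

```python
import glob
import os


def find_migration_filenames(migration_folder):
    """Return basenames of files named like 0001_name.py, sorted."""
    # glob.escape guards against special characters in the folder path.
    pattern = os.path.join(
        glob.escape(migration_folder), "[0-9][0-9][0-9][0-9]_*.py"
    )
    return sorted(os.path.basename(p) for p in glob.glob(pattern))
```

The glob character classes play the same role as the `[0-9]` groups in the regex, and files that do not fit the pattern are simply not returned.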

onewland

I think I'm OK with this but I'd like us to consider the following:

I'm curious if we want the constraint that makes 0001_migration.py, 0002_migration.py, 0004_migration.py impossible. My thought: let's say 0003_migration.py makes an ALTER TABLE change that works in ClickHouse 21.x but not in 23.x. But the table works fine without it in both versions. This constraint forces us to keep the file around, maybe modifying it to be a no-op?

Also, consider the glob comment above.

kylemumma · 2024-06-21T20:00:53Z

> This might be out of scope but could be cool to do something similar to django migrations https://docs.djangoproject.com/en/5.0/topics/migrations/#workflow where the user can pass a name but it auto generates the numerical prefix

@MeredithAnya Good idea, I'll make sure to have this in the autogeneration UI.

> I think I'm OK with this but I'd like us to consider the following:
>
> I'm curious if we want the constraint that makes 0001_migration.py, 0002_migration.py, 0004_migration.py impossible. My thought: let's say 0003_migration.py makes an ALTER TABLE change that works in ClickHouse 21.x but not in 23.x. But the table works fine without it in both versions. This constraint forces us to keep the file around, maybe modifying it to be a no-op?
>
> Also, consider the glob comment above.

@onewland Good point, it is an unnecessary constraint. I modified it so that the only constraint is non-duplicate migration numbers. Now something like this is valid: 0000, 0004, 0008, and they are ordered by migration number.
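
A sketch of the relaxed rule described in this reply: gaps are allowed, duplicates are not, and ordering follows the migration number. `order_migrations` is a hypothetical helper that assumes every filename already starts with a four-digit prefix.

```python
from collections import Counter


def order_migrations(filenames):
    """Sort migration filenames by number, rejecting duplicate numbers."""
    # Assumes each filename begins with a four-digit migration number.
    counts = Counter(f[:4] for f in filenames)
    duplicates = sorted(n for n, c in counts.items() if c > 1)
    if duplicates:
        raise ValueError(
            f"Duplicate migration numbers: {', '.join(duplicates)}"
        )
    return sorted(filenames, key=lambda f: int(f[:4]))
```

Under this rule the 0000, 0004, 0008 sequence from the reply is accepted, while two files sharing the prefix 0004 would be rejected.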

github-actions bot added the migrations label Jun 21, 2024

kylemumma marked this pull request as ready for review June 21, 2024 15:40

kylemumma requested a review from a team as a code owner June 21, 2024 15:40

kylemumma changed the title ~~verify that the new method produces correct results in all cases~~ feat: no more user-changes to migration group loaders Jun 21, 2024

kylemumma changed the title ~~feat: no more user-changes to migration group loaders~~ feat: no more user-required-changes to migration group loaders Jun 21, 2024

MeredithAnya reviewed Jun 21, 2024

onewland reviewed Jun 21, 2024

onewland approved these changes Jun 21, 2024

kylemumma force-pushed the krm/autogrouploader branch from dd8d4aa to 00de8ac June 21, 2024 19:44

kylemumma added 5 commits June 24, 2024 09:55

verify that the new method produces correct results in all cases

2aefb1a

auto group load

4a647e6

add glob to replace regex and verify it produces same results

04cdad9

pr feedback

8e26406

sdf

3591977

kylemumma force-pushed the krm/autogrouploader branch from 00de8ac to 3591977 June 24, 2024 14:57

kylemumma merged commit 869e1a7 into master Jun 24, 2024
28 checks passed

kylemumma deleted the krm/autogrouploader branch June 24, 2024 16:04

kylemumma mentioned this pull request Jun 25, 2024

feat: autogenerate addcolumn migrations #6053

Merged
