feat: no more user-required-changes to migration group loaders #6047
f"Migrations in folder {migration_folder} were not formatted as expected. Expected migration number {str(i+1).zfill(4)} for migration {fname} but got {fname[:4]}"
)

return migration_filenames
Should we be caching the results of the migration filenames? Or is this called infrequently enough that caching wouldn't provide much benefit?
I'm not sure what calls this function, but my guess (based on what the function does) would be that it is not called very frequently or under time-sensitive circumstances.
Additionally, since caching adds complexity, maybe it would be better to leave it out unless we find it necessary in the future? Unless you have reason to believe that it may be necessary.
This might be out of scope, but it could be cool to do something similar to Django migrations (https://docs.djangoproject.com/en/5.0/topics/migrations/#workflow), where the user can pass a name but the numerical prefix is auto-generated.
snuba/migrations/group_loader.py
Outdated
)
)
migration_filenames = []
pattern = re.compile(r"[0-9][0-9][0-9][0-9]_.*\.py")
It's a nit, but I think you could avoid the separate regex match plus file listing (os.listdir) by using glob: https://docs.python.org/3/library/glob.html
Thanks Oliver, I didn't know you could do this with glob. I updated it to use glob.
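For readers following the thread, here is a minimal sketch of what the glob-based listing might look like. It is not the code from this PR: the function name `list_migration_files` is hypothetical, and the glob pattern is only an approximation of the regex shown in the diff above.

```python
import glob
import os


def list_migration_files(migration_folder: str) -> list[str]:
    """Return the sorted basenames of files named like 0001_something.py.

    Hypothetical helper for illustration; not the function from the PR.
    """
    # Glob character classes stand in for the regex
    # [0-9][0-9][0-9][0-9]_.*\.py, so no separate os.listdir + re.match
    # step is needed. glob.escape protects special characters in the
    # folder path itself.
    pattern = os.path.join(
        glob.escape(migration_folder), "[0-9][0-9][0-9][0-9]_*.py"
    )
    return sorted(os.path.basename(path) for path in glob.glob(pattern))
```

Because the pattern anchors on exactly four digits followed by an underscore, files such as `__init__.py` or `001_migration.py` are simply not returned.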
I think I'm OK with this but I'd like us to consider the following:
I'm curious if we want the constraint that makes 0001_migration.py, 0002_migration.py, 0004_migration.py impossible. My thought: let's say 0003_migration.py makes an ALTER TABLE change that works in ClickHouse 21.x but not in 23.x. But the table works fine without it in both versions. This constraint forces us to keep the file around, maybe modifying it to be a no-op?
Also, consider the glob comment above.
dd8d4aa to 00de8ac
@MeredithAnya Good idea, I'll make sure to have this in the autogeneration UI.
@onewland Good point, it is an unnecessary constraint. I modified it so that the only constraint is non-duplicate migration numbers. Now something like this is valid: 0000, 0004, 0008
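A rough sketch of the relaxed check described above, assuming the migration number is the first four characters of the filename (as in the diff's `fname[:4]`). The function name `check_unique_migration_numbers` is hypothetical and the real implementation may differ.

```python
from collections import Counter


def check_unique_migration_numbers(filenames: list[str]) -> None:
    """Raise ValueError if two migration files share a numeric prefix.

    Hypothetical helper: gaps such as 0000, 0004, 0008 are allowed,
    only duplicates are rejected.
    """
    # Count how often each four-character prefix appears.
    prefix_counts = Counter(fname[:4] for fname in filenames)
    duplicates = sorted(p for p, count in prefix_counts.items() if count > 1)
    if duplicates:
        raise ValueError(
            f"Duplicate migration numbers found: {', '.join(duplicates)}"
        )
```

With this shape, deleting an obsolete migration such as 0003 leaves a gap but does not trip validation.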
00de8ac to 3591977
Users no longer need to make changes to the group loader when creating a new migration. They are now only responsible for naming the migration correctly (xxxx_migration_name.py) and putting it in the correct folder (snuba_migrations/dataset/). The group loader will find the new migrations in these folders by itself. The order in which migrations run is based on migration numbers.
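As an illustration of the ordering rule, the sketch below sorts discovered filenames by their numeric prefix and strips the `.py` extension. It assumes the loader returns migration IDs without the extension; the function name `ordered_migration_ids` is hypothetical.

```python
import os


def ordered_migration_ids(filenames: list[str]) -> list[str]:
    """Return migration IDs (filename without .py) in run order.

    Hypothetical helper; assumes every filename starts with a
    four-digit migration number.
    """
    stems = [os.path.splitext(fname)[0] for fname in filenames]
    # Sort on the integer value of the prefix rather than on the raw
    # string, so the order does not depend on the rest of the name.
    return sorted(stems, key=lambda stem: int(stem[:4]))
```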
edge cases and validation
Migration numbers must begin at 0001 and strictly increase by 1, which is validated. The following would cause errors:
how did i test
(see commit ac0f0a5 for this code)
I added an assert to every single get_migrations to verify that the new function generated the same list as the hard-coded one. I ran snuba migrations list with the asserts in place to validate. I verified edge-case / validation behavior by hand.
considerations
If users slightly misname their migration, it will be ignored, and they might not realize it or, if they do, understand why. For example, 001_migration.py will be ignored, and the only indication to the user is that it won't show up in snuba migrations list.
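One possible mitigation, sketched here as an idea rather than something this PR implements: flag files that look like migrations (digits, underscore, `.py`) but do not have exactly four digits, so they can be reported instead of silently skipped. Both patterns and the name `find_suspicious_names` are assumptions for illustration.

```python
import re

# Matches the naming convention from the PR: exactly four digits.
VALID_NAME = re.compile(r"^[0-9]{4}_.*\.py$")
# Matches anything that starts with digits and an underscore, which
# suggests the author intended it to be a migration.
NEAR_MISS = re.compile(r"^[0-9]+_.*\.py$")


def find_suspicious_names(filenames: list[str]) -> list[str]:
    """Return files that look like misnamed migrations, sorted by name."""
    return sorted(
        fname
        for fname in filenames
        if NEAR_MISS.match(fname) and not VALID_NAME.match(fname)
    )
```

The loader could log a warning for each returned name, which would give users a hint beyond the migration being absent from the list output.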
next steps
This will make my migration autogeneration easier, since it doesn't need to modify the group loaders either.