Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Storage paths must be unique #26380

Closed
wants to merge 7 commits into from

Conversation

jolheiser
Copy link
Member

@jolheiser jolheiser commented Aug 7, 2023

I propose reverting #26318 in 1.20 and implementing this PR as-is, and also frontporting this enhancement for future versions.

#26318 would need to be reverted, otherwise the derivation would be "fixed" and this PR wouldn't have the intended effect.

This PR aims to help mitigate the bug being fixed by #26318 in a way that doesn't lead to silent data loss for existing users affected, while also being a valuable sanity check in the future.

/cc @go-gitea/technical-oversight-committee

Also /cc @earl-warren for the initial idea implemented here


For anyone directed here because of the log message

You'll want to follow the message and explicitly set your storage paths. Be aware that this will mean Gitea looks in new paths, so for any data that still lives in the old path, you will need to move it to the correct spot, such as repositories, LFS, attachments, etc.

Feel free to stop by any of our support avenues if you need further assistance.

@jolheiser jolheiser added the type/enhancement An improvement of existing functionality label Aug 7, 2023
@jolheiser jolheiser added this to the 1.20.3 milestone Aug 7, 2023
@GiteaBot GiteaBot added the lgtm/need 2 This PR needs two approvals by maintainers to be considered for merging. label Aug 7, 2023
@pull-request-size pull-request-size bot added the size/XS Denotes a PR that changes 0-9 lines, ignoring generated files. label Aug 7, 2023
@GiteaBot GiteaBot added lgtm/need 1 This PR needs approval from one additional maintainer to be merged. and removed lgtm/need 2 This PR needs two approvals by maintainers to be considered for merging. labels Aug 7, 2023
Signed-off-by: jolheiser <[email protected]>
@delvh delvh mentioned this pull request Aug 7, 2023
@delvh delvh added the frontport/main This PR is not targeting the main branch. Still, its changes should also be added there label Aug 7, 2023
Signed-off-by: jolheiser <[email protected]>
@GiteaBot GiteaBot added lgtm/done This PR has enough approvals to get merged. There are no important open reservations anymore. and removed lgtm/need 1 This PR needs approval from one additional maintainer to be merged. labels Aug 7, 2023
lunny
lunny previously requested changes Aug 8, 2023
Copy link
Member

@lunny lunny left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I like this PR's idea but it will not conflict with #26318 whatever we will revert #26318

modules/setting/storage.go Outdated Show resolved Hide resolved
@GiteaBot GiteaBot added lgtm/blocked A maintainer has reservations with the PR and thus it cannot be merged and removed lgtm/done This PR has enough approvals to get merged. There are no important open reservations anymore. labels Aug 8, 2023
@jolheiser
Copy link
Member Author

I like this PR's idea but it will not conflict with #26318 whatever we will revert #26318

#26318 would need to be reverted, otherwise the derivation would be "fixed" and this PR wouldn't have the intended effect.

@wxiaoguang
Copy link
Contributor

wxiaoguang commented Aug 8, 2023

I agree to make sure every path to be unique, and #26318 is not ideal.

While I do not see why #26318 needs to be replaced by a single fatal message, I do not know how the "fatal" message would help end users.

Imagine a real case (this example is from #26264)

A user set [repository].ROOT=/data/git/repositories, [storage].PATH=/data/git, then every thing is mixed in the /data/git, then they sees this "fatal message", what can they do?

The user's /data/git might be like this:

repositories/owner/repo.git  # for repositories
ab/cd/...   # for packages, lfs ...
12345/ab/...  # for repo archiver
754d63406bd4e450d338562da7a50741f49d6b27b9ef7064f99791929c760d89 # for avatars ...
# and for actions, etc ....

Then DeleteRepositoryArchives calls storage.Clean(storage.RepoArchives), it deletes everything in /data/git

What's the problem?

IMO the only missing thing is to teach the poor users how to separate these files into different folders, if the files could be separated, then #26318 is good enough.

ps: Every path should be unique (and they shouldn't overlap in most cases). The DeleteRepositoryArchives function's dangerous behavior should also be fixed.

@jolheiser
Copy link
Member Author

The purpose of reverting the other PR is to avoid silent data loss.
This PR will cause a fatal, yes, which is noisy. Annoying, but better than silently losing data imo.

With the other PR in place, the derivation naturally fixes this noise, which is good in future (unreleased) versions, but less good in this (released) version because affected users may suddenly be missing data rather than a noisy message telling them something is wrong.

It's my opinion that a noisy failure to start is better than the potential for data to be lost in a patch upgrade. (Although neither is ideal)

@wxiaoguang
Copy link
Contributor

wxiaoguang commented Aug 8, 2023

Yup, see the real example #26380 (comment), if Gitea "fails" to start, then what should be done (how to "fix" the fatal)? That's the same question as #26318 . If the question can be answered, I think #26318 can be kept by a "detection"

@wxiaoguang
Copy link
Contributor

wxiaoguang commented Aug 8, 2023

ps: I approved #26318/#26271 because I guess it could be a rare case for users who set [storage].PATH manually. Either "breaking" or "fatal" wouldn't make too much difference (both need the unavoidable "fix" operations), so just let it go. While if there could be some more better solutions, eg: more detailed documents to help the users to fix the problem manually, more smart detection, more smart doctor fix helper, I would also approve.

@jolheiser
Copy link
Member Author

The difference is the breaking version is silent and this one is not.

I would maybe be convinced to change this to error in this version, but future versions should still likely retain fatal.

I agree that mixed data isn't going to be simple to extricate, but at least affected admins should be made aware of it, not just silently start experiencing bugs.

@wxiaoguang
Copy link
Contributor

wxiaoguang commented Aug 8, 2023

The difference is the breaking version is silent and this one is not.

IMO the difference is that:

  • Use the "breaking": the instance still starts, the repositories still exist, if any file is missing (avatar/lfs/packages), the user could re-upload them (or if there is a document, they could manually move them)
  • Use the "fatal": the end users couldn't continue if they see this fatal message, no guideline yet and it's nearly impossible for an end user to separate the files into correct directories, the end user will be blocked and give up?

If it needs to keep 1.20 using legacy behavior, there could be a more strict change:

  1. Check if the [storage].PATH is set, if no, no problem
  2. If [storage].PATH is set, tell user that the serious problem and:
    • Let them set another option like I_KNOW_STORAGE_PATH_MIXTURE_DANGEROUS=NoSupportIn1.21
    • Disable the DeleteRepositoryArchives (it is the only Clean Storage caller)
    • Then they can bypass the fatal to continue, until 1.21 comes. (TBH, I think they should create a new clear instance in this case, otherwise it's impossible for them to upgrade)

@lunny lunny dismissed their stale review August 8, 2023 04:19

Concern resolved

@GiteaBot GiteaBot added lgtm/done This PR has enough approvals to get merged. There are no important open reservations anymore. and removed lgtm/blocked A maintainer has reservations with the PR and thus it cannot be merged labels Aug 8, 2023
@lunny
Copy link
Member

lunny commented Aug 8, 2023

I think this PR could work well with #26318 but not without it. Maybe it should also be sent to main branch?

@wxiaoguang
Copy link
Contributor

wxiaoguang commented Aug 8, 2023

I think this PR could work well with #26318 but not without it. Maybe it should also be sent to main branch?

You misunderstood, this PR doesn't work with #26318 for existing users who set [storage].PATH manually.

@lunny
Copy link
Member

lunny commented Aug 8, 2023

[repository].ROOT

Yes. The path [repository].ROOT should also be considered and also all other PATH configurations.

@delvh
Copy link
Member

delvh commented Aug 13, 2023

So, what do we do with this PR?
Merge or no merge?

@lunny
Copy link
Member

lunny commented Aug 13, 2023

I would like to merge but not revert #26318 and if we can include other storages into the map(git storage path and etc.), that's better.

@delvh
Copy link
Member

delvh commented Aug 13, 2023

Yes, I've canceled the revert: #26381

@lunny lunny added the reviewed/wait-merge This pull request is part of the merge queue. It will be merged soon. label Aug 13, 2023
@GiteaBot
Copy link
Contributor

@jolheiser please fix the merge conflicts. 🍵

@GiteaBot GiteaBot removed the reviewed/wait-merge This pull request is part of the merge queue. It will be merged soon. label Aug 14, 2023
@techknowlogick
Copy link
Member

Should this be closed now that the other PR was merged?

@lunny
Copy link
Member

lunny commented Aug 14, 2023

Should this be closed now that the other PR was merged?

I think this could be replaced by #26484

@lunny lunny closed this Aug 16, 2023
@lunny lunny removed this from the 1.20.3 milestone Aug 16, 2023
@go-gitea go-gitea locked as resolved and limited conversation to collaborators Nov 14, 2023
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
frontport/main This PR is not targeting the main branch. Still, its changes should also be added there lgtm/done This PR has enough approvals to get merged. There are no important open reservations anymore. size/XS Denotes a PR that changes 0-9 lines, ignoring generated files. type/enhancement An improvement of existing functionality
Projects
None yet
Development

Successfully merging this pull request may close these issues.

7 participants