-
-
Notifications
You must be signed in to change notification settings - Fork 2.4k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Fix several issues with beatmap online ID management #30453
Conversation
Both online and offline using the cache. The rationale behind this change is that in the current state of affairs, `TestPartiallyMaliciousSet()` fails in a way that cannot be reconciled without this sort of change. The test exercises a scenario where the beatmap being imported has an online ID in the `.osu` file, but its hash does not match the online hash of the beatmap. This turns out to be a more frequent scenario than envisioned because of users doing stupid things with manual file editing rather than reporting issues properly. The scenario is realistic only because the behaviour of the endpoint responsible for looking up beatmaps is such that if multiple parameters are given (e.g. all three of beatmap MD5, online ID, and filename), it will try the three in succession: https://github.com/ppy/osu-web/blob/f6b341813be270de59ec8f9698428aa6422bc8ae/app/Http/Controllers/BeatmapsController.php#L260-L266 and the local metadata cache implementation reflected this implementation. Because online ID and filename are inherently unreliable in this scenario due to being directly manipulable by clueless or malicious users, neither should not be used as a fallback.
After dd06dd0e699311494412e36bc3f37bb055a01477 the behaviour set up on the mock in the test in question is no longer realistic. Online metadata lookups will no longer fall back to online ID or filename.
… importing This fell out while investigating why the issue with online IDs mismatching in the `.osu` could be worked around by importing the map three times in total when starting from it not being available locally. Here follows an explanation of why that "helped". Import 1: - The beatmap set is imported normally. - Online metadata population sees the online ID mismatch and resets it on the problematic beatmap. Import 2: - The existing beatmap set is found, but deemed not reusable because of the single beatmap having its ID reset to -1. - The existing beatmap set is marked deleted, and all the IDs of its beatmaps are reset to -1. - The beatmap set is reimported afresh. - Online metadata population still sees the online ID mismatch and resets it on the problematic beatmap. Note that at this point the first import *is still physically present in the database* but marked deleted. Import 3: - When trying to find the existing beatmap set to see if it can be reused, *the one pending deletion and with its IDs reset - - the remnant from import 1 - is returned*. - Because of this, `validateOnlineIds()` resets online IDs *on the model representing the current reimport*. - The beatmap set is reimported yet again. - With the online ID reset, the online metadata population check for online ID mismatch does not run because *the IDs were reset to -1* earlier. Preferring undeleted models when picking the model instance for reuse prevents this scenario.
// unfortunately it appears that historically stable mappers would apply crude hacks to fix unspecified "issues" with submission | ||
// which would amount to reusing online IDs of other beatmaps. | ||
// this means that the online ID in the `.osu` file is not reliable, and this cannot be fixed server-side | ||
// because updating the maps retroactively would break stable (by losing all of users' local scores). | ||
|
||
if (beatmapInfo.OnlineID == -1 && result.MD5Hash != beatmapInfo.MD5Hash) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
On a quick read of the code, I'm not sure that we can ever hit this scenario anymore. The code run prior to this is a tryLookup
which is based on md5 matching alone, so at this point it should be guaranteed that the MD5 matches.
I may be wrong, but we can probably do away with this method altogether?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good point, and the assertion holds up as far as I can tell. Removed in 2b0fd35 (alongside a test that made no sense anymore due to the same exact thing - the mock behaved in a way incongruous with the actual reality).
The scenario that remaining guard was trying to protect against is obviated by and no longer possible after 776fabd.
Not 100% about the filename removal. Stable still uses this as a fallback in order to let users update to the latest version in-game. Thoughts? |
My thoughts are that online ID in the beatmap file and the beatmap filename are similarly reliable, i.e. not at all, and if we're bringing back one, we may as well bring both back, and just rely on md5 checks at score submission time to prevent funny business. Just depends on how many layers of safety you're comfortable with I guess. I prefer more, which is why I deleted anything that isn't md51. I am not sure how to conclusively determine whether excluding filename is going to be a problem or not. Footnotes
|
My thoughts are that it hasn't ever presented an issue to date in stable, so we could probably still keep it around. I think it's the only way we can recover the case where a user downloads a beatmap with
|
I don't have willpower to fight on that point, but you do realise I will have to revert 2b0fd35 etc and bring the hash equality checks back if I'm to allow filename again? Because otherwise I can take any set and rename any difficulty file to any old thing and it will show as ranked so long as it matches something else. Unless you're again saying that that's fine and that the onus of checking the hash should be on the score submission process? |
I think this is valid thinking, yeah. After all, a user could just edit things in realm post-import. I feel we were doing far too much on import to try and get things in a good state, and probably need to fall back to "what worked in stable" for now |
Alright, filename lookup restored with no guardrails, online ID still excluded. I have tested that the submission process correctly detects this and prevents score token issuance when attempting to play a doctored beatmap. |
RFC.
This is an all-in-one diff with assorted changes after today's daily challenge fiasco. Caution, wall of text incoming.
Taking it commit by commit:
Adjust test to match desired reality
Self-explanatory.
Remove problematic online ID check
Would "close" #27332 by essentially throwing up hands and deciding that it's probably fine not to check the garbage IDs in the
.osu
file.Only use MD5 when performing metadata lookups
Both online and offline using the cache.
The rationale behind this change is that after 1a2e323,
TestPartiallyMaliciousSet()
fails in a way that cannot be reconciled without this sort of change.The test exercises a scenario where the beatmap being imported has an online ID in the
.osu
file, but its hash does not match the online hash of the beatmap. This turns out to be a more frequent scenario than envisioned because of users doing stupid things with manual file editing rather than reporting issues properly.The scenario is realistic only because the behaviour of the endpoint responsible for looking up beatmaps is such that if multiple parameters are given (e.g. all three of beatmap MD5, online ID, and filename), it will try the three in succession, and the local metadata cache implementation reflected this implementation.
Because online ID and filename are inherently unreliable in this scenario due to being directly manipulable by clueless or malicious users, neither should not be used as a fallback.
Remove no longer applicable test
After 2c2f307 the behaviour set up on the mock in the test in question is no longer realistic. Online metadata lookups will no longer fall back to online ID or filename.
Prefer not deleted models when picking model instances for reuse when importing
This fell out while investigating why the issue with online IDs mismatching in the
.osu
could be worked around by importing the map three times in total when starting from it not being available locally.Here follows an explanation of why that "helped".
Import 1:
Import 2:
Note that at this point the first import is still physically present in the database but marked deleted.
Import 3:
validateOnlineIds()
resets online IDs on the model representing the current reimport.Preferring undeleted models when picking the model instance for reuse prevents this scenario.