Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add possible ReplayGain values for loudgain #19

Open
wants to merge 2 commits into
base: master
Choose a base branch
from

Conversation

Moonbase59
Copy link

As discussed in the forum, here’s the PR to add the ReplayGain tags for loudgain into MediaFile.

For REPLAYGAIN_REFERENCE_LOUDNESS, if we find an old-style positive dB value ("89 dB"), we convert to LUFS so beets can show a uniform reference loudness. Upon writing, the reference loudness will also be stored as LUFS ("-18.00 LUFS").

@sampsyo
Copy link
Member

sampsyo commented Sep 30, 2019

Again, thanks for getting this started!

Let's tread carefully with respect to the case sensitivity thing. I note that you've created a capital and lower-case StorageStyle for each field. But MP3DescStorageStyle is case insensitive already, so I don't think this adds anything to the logic that's already there, right? Could you explain a little more about why you're not just relying on the case-insensitivity in the matching process?

@Moonbase59
Copy link
Author

Ok, but how does it write the values back? It seems to me, it compares the keys, so the lowercase variant would be written back as lowercase, the uppercase variant written back as uppercase. Apart from the matching (from the field list) functioning, this might turn out crucial in order not to mess up the tags in the file. Needed, because, some programs still require upper- or lowercase tags, which the user might have generated correctly using loudgain, and we wouldn’t want to destroy what might be needed.

At least, my cursory testing showed that with the current (PR) setup, lowercase tags would be written back as lowercase, and uppercase as uppercase if both entries were there. If only one was there, it could be written back in the wrong case.

This is also the reason why I added some more upper-/lowercase variants, i.e. for MP4.

@sampsyo
Copy link
Member

sampsyo commented Sep 30, 2019

Yeah, that's the idea—the loop you see in MP3DescStorageStyle.store tries to find an existing frame to store the data rather than blowing away the old one and creating a new one. So with that strategy, the case gets preserved by default, which is probably what we want.

I guess I don't 100% understand the case where this goes wrong… can you clarify when this strategy would replace a "correct" tag with an "incorrect" one?

@Moonbase59
Copy link
Author

Moonbase59 commented Sep 30, 2019

Whenever you specify, say only a lowercase StorageStyle and have (and need) an uppercase tag (or vice versa). MediaFile would go ahead, check for a match (case-insensitively), and upon writing, detect that it has no uppercase StorageStyle, so remove the existing uppercase tag and replace it with the variant from the StorageStyle (lowercase). Boom!

Without looking at the code more closely, it appears to do the matching for read (well, and delete) from the list of frames/fields case-insensitively; but for writing matches up the tag and the StorageStyle entry "as-is" to find out whether an in-place update would be ok. (A good thing.)

@sampsyo
Copy link
Member

sampsyo commented Oct 1, 2019

and upon writing, detect that it has no uppercase StorageStyle, so remove the existing uppercase tag and replace it with the variant from the StorageStyle (lowercase).

Hmm; I don't think that's right. The algorithm in MP3DescStorageStyle.store looks for an existing ID3 frame case-insensitively and updates its contents. It wouldn't remove the existing one or replace it with a new one.

mediafile/mediafile.py

Lines 850 to 856 in b6b3520

# Try modifying in place.
found = False
for frame in frames:
if frame.desc.lower() == self.description.lower():
frame.text = value
frame.encoding = mutagen.id3.Encoding.UTF8
found = True

@Moonbase59
Copy link
Author

Moonbase59 commented Oct 1, 2019

Hmm. This would actually mean we could possibly de-clutter the code a bit since it would check case-insensitively anyway. A good thing. So it would use whatever case we put in the StorageStyle only when creating a fresh tag. Is that true for other tag types as well?

Leaves the question if we should introduce some global config items that state how to write the tags for different file formats. (Like with "id3v23", to state we want ID3v2.3 instead of ID3v2.4.)

Personally, I’d prefer keeping the code small, and maybe just do an .upper() or .lower() on it. Too bad not everyone reads this stuff case-insensitively.

Even if the "standard" proposes always uppercase, to reach most of the players out there, loudgain currently recommends lowercase for mp3, m4a, mp4, wma, wav, aiff and uppercase for flac, ogg, opus, WavPack and Monkey’s Audio. See https://github.com/Moonbase59/loudgain#loudgain-makes-it-easy-following-the-gold-standard. (All the ones using -L force lowercase tags, the others are uppercase.)

N.B.: Seeing above, do you rely on Mutagen handling the UTF8-to-UTF16 conversion if "id3v23" is enabled? (UTF8-encoded frames are not possible in ID3v2.3.)

@sampsyo
Copy link
Member

sampsyo commented Oct 1, 2019

So it would use whatever case we put in the StorageStyle only when creating a fresh tag. Is that true for other tag types as well?

In the spirit of getting familiar with the code, any chance you'd be willing to check the other storage formats yourself? 😃 It could be a good way to understand more deeply about exactly what's going on and what needs to be changed.

Leaves the question if we should introduce some global config items that state how to write the tags for different file formats.

I would love to avoid creating more global switches if possible.

Even if the "standard" proposes always uppercase, to reach most of the players out there, loudgain currently recommends lowercase for mp3, m4a, mp4, wma, wav, aiff and uppercase for flac, ogg, opus, WavPack and Monkey’s Audio.

Seems like a good idea to make that the default then.

do you rely on Mutagen handling the UTF8-to-UTF16 conversion if "id3v23" is enabled?

Again, diving into the code to see if you can answer this yourself might be a great way to get more familiar with all the machinery here. But the basic answer is yes; Mutagen manages the encoding most of the time.

@Moonbase59
Copy link
Author

Moonbase59 commented Oct 1, 2019

Ok ok … 😀 I shouldn’t have started all this in the first place 🤣

But I have to insist we find a means for the user to specify the casing of the RG tags. Unfortunately, with the player situation out there, this is currently needed, We could of course provide above as default. As an example, Picard even goes as far as remembering the original case (provided by loudgain or any other RG tool), so even things like "rEpLAyGain_Track_Gain" would be restored exactly as they were. Was hard enough putting all this crap into loudgain, just to do it right in every thinkable situation.

I don’t think we have to go that far (upper/lower is enough), and we probably wouldn’t care for the tag types that have to be read case-insensitively per their specs (thinking of Vorbis Comments and APE tags—even loudgain writes these uppercase). So a selection method must only be found for the MP3, ASF and MP4 storage styles (I think).

I’ll try to streamline my changes a little (avoiding unneeded dups) and you think about how we best set lower/upper?

@sampsyo
Copy link
Member

sampsyo commented Oct 1, 2019

Ok ok … 😀 I shouldn’t have started all this in the first place 🤣

No no, it's cool! I'm glad we're exploring this. 😃

Picard even goes as far as remembering the original case (provided by loudgain or any other RG tool), so even things like "rEpLAyGain_Track_Gain" would be restored exactly as they were.

Hmm, I think I remain a little confused… because this is exactly what MediaFile does too, at least for ID3. The case-insensitive matching I was describing before has precisely this effect, where the case stays the same if it was set in some special way.

So when you say this:

But I have to insist we find a means for the user to specify the casing of the RG tags.

Are you saying that we should not do the "preserve the current casing" thing and we should instead always override it? Or are you just making a point about new tags, not about existing ones?

@Moonbase59
Copy link
Author

Moonbase59 commented Oct 2, 2019

Interesting thought, good you brought that up. I think both have their merits. To be honest, I wouldn’t know which one to prefer here. Let’s think out loud. Leaving them unchanged follows the logic "don’t change more than neccessary" (which I kinda prefer). Then again, people might want to do their ReplayGaining using beets and expect a certain form of case, and their files uniformely tagged.

Using beets’ replaygain backend could then possibly create new tags that weren’t "in sync" with existing ones, case-wise.

I think this is a design decision, somehow, and also heavily depends on a user’s workflow. (Does he replaygain before or with beets?)

After much thinking, I decided for loudgain that it always removes all existing ReplayGain tags and replaces them with new ones, following the user-set options (lower/uppercase, write extended tags, etc.), in the interest of getting uniformely tagged files.

I’ve also seen cases where users stored both uppercase and lowercase tags into files to provide values for all player type requirements. Since it can absolutely not be expected by any software to keep such in sync, loudgain prohibits this and removes all such duplicate entries in favour of correct ones.

A user would be shocked to experience sudden loudness jumps in his favourite player even if he had replaygained all his files, and this only happening because on some of them the case was not changed. He often wouldn’t know where this "bug" comes from and what he did wrong.

Thinking about this, beets is most often used to repair and complete tags, so it might actually be best to rewrite the ReplayGain tags to whatever case the user specified in his configuration. Just because we otherwise might mix exisiting (wrong case) with new (correct case) tags.

Beets is clearly the best tool around for finding omissions and mass-tagging on large libraries, so people (including myself) might want to use it to switch from one RG case variant to the other (say because they stopped using VLC and started using KODI), or re-replaygain because they want to upgrade their "old" files tagged using the ReplayGain 1 algorithm with ReplayGain 2 values.

That said, I come to the conclusion that—at least for ReplayGain—we should have config options (at least for MP3, MP4, ASF) and always rewrite all ReplayGain tags, probably not only the changed ones. Since MediaFile is its own library (great decision btw), how would we make it respect config options from beets? (I’m thinking of passing a list as an additional parameter, in addittion to "id3v23".)

Thinking further, it might be best to let MediaFile handle one tag at a time (writing RG tags in a configured case), and leave the more complicated stuff (like removing duplicates and taking care of all RG tags being the same case) to the ReplayGain backend, what do you think?

(Memo to myself: Overhaul of RG backend probably needed.)

@sampsyo
Copy link
Member

sampsyo commented Oct 2, 2019

That said, I come to the conclusion that—at least for ReplayGain—we should have config options (at least for MP3, MP4, ASF) and always rewrite all ReplayGain tags, probably not only the changed ones.

For what it's worth, beets already does have a way to let people specify that they want their tags blown away and rewritten from scratch: the scrub plugin. That plugin deletes everything from files' tags and rewrites them using the default formats. I think the natural thing to do here would be to stick with preserving the current settings on the tags by default and letting scrub do the normalization, when users want it.

Since MediaFile is its own library (great decision btw), how would we make it respect config options from beets? (I’m thinking of passing a list as an additional parameter, in addittion to "id3v23".)

Therein lies the problem, I think—managing the id3v23 setting for MediaFile is already a huge pain. The beets config option needs to be threaded through to MediaFile at every point where we create it. That's the consequence of making MediaFile a discrete component with a simple, low-configuration interface: it becomes proportionately harder to add global configuration switches like this.

What do you think about proceeding for now with a sensible default setting (e.g., lower case for new tags) and coming back later to explore a configurable option—especially if there's enough demand to justify the added complexity?

@Moonbase59
Copy link
Author

I think the natural thing to do here would be to stick with preserving the current settings on the tags by default and letting scrub do the normalization, when users want it.

What do you think about proceeding for now with a sensible default setting (e.g., lower case for new tags) and coming back later to explore a configurable option—especially if there's enough demand to justify the added complexity?

Looks like this would be the only way to proceed for now, not changing existing tags, and "defining" a default for new ones (even it may be wrong). This should at least allow for a workflow where they set up the tags beforehand, using external tools like loudgain or whatever else.

Enough demand might be there already, just users not recognizing it … Actually took me a while to diagnose users’ problems with ReplayGain and testing lots of players on different platforms to see that it often just boils down to the RG tags’ case.

In general, I love beets’ extensibility using plugins, but I also feel the core application should provide a feasible standard functionality without needing plugins. I wouldn’t like the idea of a user having to scrub all tags just to get a few right. Unfortunately, with ReplayGain, it seems a tagging application will never totally solve the problem—this could only be solved if the "standard" defined something like "tags MUST be read case-insensitively in a player". Well, the world isn’t perfect …

So would you be happy not changing a tags’ casing and adopting loudgain’s recommendation for new ones, just so that we can proceed in this matter?

@sampsyo
Copy link
Member

sampsyo commented Oct 8, 2019

Great summary!

So would you be happy not changing a tags’ casing and adopting loudgain’s recommendation for new ones, just so that we can proceed in this matter?

Yeah, absolutely. Let's start there and see how it goes. 👍

@Moonbase59
Copy link
Author

Moonbase59 commented Oct 10, 2019

Working on the code and testing, I only now realize that MediaFile seems to follow a "read any, write all" strategy (i.e., read any variant of a tag known but write all of them if changed). For my own library, I’d be quite unhappy with that, but it will have its use cases.

Anyway, if this strategy was closely followed, storing both upper- and lowercase variants of a tag should be possible, which works on a few file types like MP4 but not with MP3 (due to the case-insensitive compare). It would also mean a lot of tag pollution but might make a file compatible with players that’d only read one case. Then again, one could never be certain if another (tagging) software would remove or change only one of the case variants.

I’d favour not storing both upper- and lowercase variants but decide on one (as we’ve been talking about), and hope players catch up on reading these case-insensitively. (This is how loudgain does it: It removes all case variants then stores the one the user decided upon.) Also, storing tags that only differ in case would be possible but unwanted in Vorbis Comments and totally illegal in APEv2.

On the other hand, we could store everything in duplicate (thus making a file compatible with every player out there) and remove the case-insensitive checking in MP3StorageStyle. (I’d not like following this path—it will introduce lots of unwanted side effects.)

Your decision (since it’s your software)?

@sampsyo
Copy link
Member

sampsyo commented Oct 12, 2019

if this strategy was closely followed, storing both upper- and lowercase variants of a tag should be possible

To be perfectly pedantic, a strict "write all" policy would use all possible capitalizations of a tag name, just to be sure. But that's obviously intractable. 😃

I think the "use lowercase by default" policy is probably the right one here. It's a trade-off between compatibility and annoyance, obviously, and it seems like those two forces imply that a middle road is probably best.

@Moonbase59
Copy link
Author

Yeah, my brain often kicks in, trying to think out each and every possibility and making me over-pedantic, hee hee. Bloody brain! Don’t they say "publish early, publish often"? So maybe time to start hacking, merging and checking what comes out of it. :-)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants