Cases where -e makes compression worse; does -e really do everything it should? #728

Open
H2Swine opened this issue Jul 18, 2024 · 12 comments

Comments

@H2Swine
Contributor

H2Swine commented Jul 18, 2024

Usual reservation: I do not know whether -e is even supposed to handle this, and even if it is, I am not saying it is worth the effort of "fixing" it. Rather, I am reporting it because it could be a symptom of a bug with bigger consequences. (At "worst", waking -e up from the dead ...)

But the situation is, sometimes I encounter signals where -e even makes for a bigger encode. I found a near-silent file where it happened, and got it down to the first 16384 samples, mono. That is quite manageable, so I ran it through -l <0 to 32>, -b <1024, 2048, 4096, 8192, 16384>, -r <n,n for n=0 to 9>, and then with/without -p and -e; that is 33 × 5 × 10 × 4 = 6600 files, or 3300 pairs "-e or not".
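For anyone wanting to reproduce a grid like this, here is a minimal sketch of how the 6600 command lines could be generated. The input file name, the output naming scheme and the use of `--lax` and `-f` are assumptions for illustration; they are not the exact invocation used for the attached files.

```python
from itertools import product

BLOCK_SIZES = (1024, 2048, 4096, 8192, 16384)

def build_commands(wav="16384samples.wav"):
    """Return (output_name, argv) for every combination in the test grid."""
    jobs = []
    for b, l, r, p, e in product(BLOCK_SIZES, range(33), range(10),
                                 (False, True), (False, True)):
        # Hypothetical naming scheme, not the one used in the attached zip.
        out = f"b{b}_l{l:02d}_r{r}{'_p' if p else ''}{'_e' if e else ''}.flac"
        # -r n,n forces the partition order instead of only capping it.
        argv = ["flac", "--lax", "-f", f"-b{b}", f"-l{l}", f"-r{r},{r}"]
        if p:
            argv.append("-p")
        if e:
            argv.append("-e")
        argv += ["-o", out, wav]
        jobs.append((out, argv))
    return jobs

jobs = build_commands()
print(len(jobs))  # 33 * 5 * 10 * 4 combinations
```

Each `argv` can be handed to `subprocess.run(argv, check=True)`; the sketch only builds the list and does not run flac.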

I think 256 of them did worse with -e. That's more than 7.5 percent. (For a further 196 pairs, they were the same size. I didn't check whether those were bit-identical.)
And it seems that when -e does worse, the "no-e" uses prediction order 1 while -e uses more.

Surely there are a lot of duplicates among those 3300; fewer than you might think because I used -r n,n forcing partition order, rather than -r n. The filenames indicate setting used, and the duplicates after the equals sign. Examples:
16384samples_-b2048_-l02_-r¨09.flac=-l03.flac means it was encoded with -b2048 -l2 -r9,9 and is bit-identical to the file encoded with -b2048 -l3 -r9,9
16384samples_-b1024_-l04_-r¨00-e.flac=-l_up_to_32.flac means it was encoded with -b1024 -l4 -r0,0 -e, and is bit-identical to all files with -l up to 32 and everything else equal.
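Checking which encodes are bit-identical (including the same-size pairs that were not checked above) can be done by hashing the file contents. A minimal sketch; the function name and the sample data are made up for illustration:

```python
import hashlib
from collections import defaultdict

def group_identical(named_blobs):
    """Group names whose contents are byte-for-byte identical.

    named_blobs: iterable of (name, bytes) pairs.
    Returns a list of sorted name lists, one per group of two or more.
    """
    groups = defaultdict(list)
    for name, data in named_blobs:
        groups[hashlib.sha256(data).hexdigest()].append(name)
    return sorted(sorted(names) for names in groups.values() if len(names) > 1)

# Hypothetical contents standing in for real .flac files.
demo = [("l02.flac", b"abc"), ("l03.flac", b"abc"), ("l04.flac", b"abd")]
duplicates = group_identical(demo)
```

For real files, the pairs could come from something like `((p.name, p.read_bytes()) for p in pathlib.Path(".").glob("*.flac"))`.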

where_-e_loses.zip

All done with a recent compile from git. Discovered it with 1.4.x.

@ktmf01
Collaborator

ktmf01 commented Jul 19, 2024

Without having done any tests, I assume this has to do with the inexact residual bits calculation.

The code has two methods of calculating the size of the residual; which one is used is a compile-time option, not a run-time option. When running with exhaustive model search (-e), the program chooses which order (i.e. which model) to use based on how many bits it thinks the residual and the model together are going to take up. As this estimate is inexact, it can choose a model that does not result in the smallest subframe.
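To illustrate how an inexact estimate can pick the wrong candidate, here is a toy sketch. It compares an exact Rice code length with a cheap estimate that only looks at the sum of absolute residuals. The estimate formula is an assumption chosen for illustration and is not claimed to be the one libFLAC uses; model and parameter header bits are ignored, and the two residual sets are made up.

```python
def zigzag(r):
    """Fold a signed residual to unsigned: 0, -1, 1, -2 -> 0, 1, 2, 3."""
    return 2 * r if r >= 0 else -2 * r - 1

def exact_rice_bits(residuals, k):
    """Exact length: unary quotient, one stop bit and k remainder bits each."""
    return sum((zigzag(r) >> k) + 1 + k for r in residuals)

def estimated_rice_bits(residuals, k):
    """Cheap estimate from the sum of absolute values only (signs ignored)."""
    n = len(residuals)
    abs_sum = sum(abs(r) for r in residuals)
    unary = (abs_sum >> (k - 1)) if k > 0 else (abs_sum << 1)
    # Subtract n/2 to compensate for folding: negatives cost one bit less.
    return n * (1 + k) + unary - (n >> 1)

# Two hypothetical residual sets from two candidate predictors, Rice parameter 0.
candidate_a = [1, 1, 1, 1]
candidate_b = [-2, -1, -1, -1]

for name, res in (("a", candidate_a), ("b", candidate_b)):
    print(name, estimated_rice_bits(res, 0), exact_rice_bits(res, 0))
```

With these made-up numbers the estimate ranks candidate a as smaller while the exact count ranks candidate b as smaller, because the estimate cannot see that b's residuals are mostly negative and therefore fold to shorter codes.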

Maybe these signals can help to find a better fast way to estimate the amount of bits needed.

@ktmf01
Collaborator

ktmf01 commented Jul 19, 2024

Here are two Win64 compiles to experiment with, one standard, one with exact rice calculation turned on: flac-test-exact-rice.zip

@H2Swine
Contributor Author

H2Swine commented Jul 19, 2024

Trying those compiles, I got system errors that libwinpthread-1.dll was not found and then that libogg-0.dll was not found.

@ktmf01
Collaborator

ktmf01 commented Jul 19, 2024

Ah, yes, I didn't do the usual checks.

Attempt no. 2: flac-test-exact-rice.zip

@H2Swine
Contributor Author

H2Swine commented Jul 19, 2024

Here are a couple of "really bad" examples in that -e increases size by fifty percent, both on the "exact" build and the "inexact" build:
b1024-r9,9.zip ran with --no-padding -b1024 --lax -r9,9 with and without -e.
-r9,9 means out of subset, but for -r8,8 the files were still like 15% bigger with -e than without. Of course you can argue that -r n,n is "merely a pathological setting that nobody uses"; -e improves size when -r8 is used.

For those parameters where -e lost in the previous round (highly biased selection!), I reran with these two builds. With the "inexact" build, -e outcompresses --no-exhaustive-model-search more often. But with the "exact" build, -e loses for most of the settings where it lost in the previous round.

Edit: this sounds dramatic. Note again to other readers, this is only 16384 samples with values -1, 0, 1. Issue opened just in case it is a symptom of something.

@H2Swine
Contributor Author

H2Swine commented Jul 21, 2024

The plot thickens, and I do suspect it is a bug.
It is very much related to (dual) mono, and it is apparently new with flac 1.4.

I discovered it on the "near-mono" tracks of Miles Davis "The Complete Birth of the Cool"; the 1998 edition, but taking only the first eleven tracks that appeared on the original Birth of the Cool LP (which apparently served as a master!).
For some strange reason the channels differ in the LSB only, looks like they were independently dithered. (Which was why I got curious enough to look into those; the live tracks are truly "mono as stereo", the side channel is zero.)

Here is what happens - the audio is available on request, but I cannot upload it here.

  • Compressing the CDDA with presets 4 to 8: -e does what it is supposed to
  • Compressing the CDDA with preset 3: -e does harm
  • So therefore: -4 --no-mid-side to -8 --no-mid-side. With and without -p and -e. Turns out, -e does harm with 1.4.3. It does not with 1.3.4. (Curiously, with 1.2.1 and 1.1.4, -e improves but -p is harmful. Edit: I managed to find a signal where -p hurts compression with 1.4.3 too. -5 vs -5p, but not much.)

Sizes with --no-mid-side, 1.4.3, smallest to largest:

197050659 Miles1to11-143.-8p--no-mid-side.flac
197063494 Miles1to11-143.-8ep--no-mid-side.flac
197083289 Miles1to11-143.-7p--no-mid-side.flac
197089542 Miles1to11-143.-7ep--no-mid-side.flac
197133685 Miles1to11-143.-8--no-mid-side.flac
197148500 Miles1to11-143.-8e--no-mid-side.flac
197163269 Miles1to11-143.-7--no-mid-side.flac
197174153 Miles1to11-143.-7e--no-mid-side.flac
197317528 Miles1to11-143.-6p--no-mid-side.flac
197331275 Miles1to11-143.-6ep--no-mid-side.flac
197346712 Miles1to11-143.-5p--no-mid-side.flac
197347389 Miles1to11-143.-4p--no-mid-side.flac
197357560 Miles1to11-143.-5ep--no-mid-side.flac
197358339 Miles1to11-143.-4ep--no-mid-side.flac
197383143 Miles1to11-143.-6--no-mid-side.flac
197397815 Miles1to11-143.-6e--no-mid-side.flac
197413525 Miles1to11-143.-5--no-mid-side.flac
197414102 Miles1to11-143.-4--no-mid-side.flac
197425671 Miles1to11-143.-5e--no-mid-side.flac
197426321 Miles1to11-143.-4e--no-mid-side.flac
197920145 Miles1to11-143.-3p--no-mid-side.flac
197925004 Miles1to11-143.-3ep--no-mid-side.flac
197971049 Miles1to11-143.-3--no-mid-side.flac
197977252 Miles1to11-143.-3e--no-mid-side.flac

It should be noted that each of these settings improves over 1.3.4:

197103898 Miles1to11-134.-8ep--no-mid-side.flac
197108320 Miles1to11-134.-8p--no-mid-side.flac
197138650 Miles1to11-134.-7ep--no-mid-side.flac
197155596 Miles1to11-134.-7p--no-mid-side.flac
197192610 Miles1to11-134.-8e--no-mid-side.flac
197201804 Miles1to11-134.-8--no-mid-side.flac
197226583 Miles1to11-134.-7e--no-mid-side.flac
197240522 Miles1to11-134.-7--no-mid-side.flac
197360756 Miles1to11-134.-6p--no-mid-side.flac
197364644 Miles1to11-134.-6ep--no-mid-side.flac
197417984 Miles1to11-134.-5ep--no-mid-side.flac
197418760 Miles1to11-134.-4ep--no-mid-side.flac
197432027 Miles1to11-134.-6--no-mid-side.flac
197435197 Miles1to11-134.-6e--no-mid-side.flac
197477031 Miles1to11-134.-5p--no-mid-side.flac
197477713 Miles1to11-134.-4p--no-mid-side.flac
197486494 Miles1to11-134.-5e--no-mid-side.flac
197487190 Miles1to11-134.-4e--no-mid-side.flac
197542614 Miles1to11-134.-5--no-mid-side.flac
197543246 Miles1to11-134.-4--no-mid-side.flac
197958871 Miles1to11-134.-3ep--no-mid-side.flac
197977721 Miles1to11-134.-3p--no-mid-side.flac
198010968 Miles1to11-134.-3e--no-mid-side.flac
198028602 Miles1to11-134.-3--no-mid-side.flac

Allowing for stereo decorrelation, -e works as it should:

120050974 Miles1to11-143.-8ep.flac
120055033 Miles1to11-143.-8p.flac
120063818 Miles1to11-143.-7ep.flac
120071018 Miles1to11-143.-7p.flac
120091527 Miles1to11-143.-8e.flac
120095166 Miles1to11-143.-8.flac
120104345 Miles1to11-143.-7e.flac
120109975 Miles1to11-143.-7.flac
120185424 Miles1to11-143.-6ep.flac
120189074 Miles1to11-143.-6p.flac
120197902 Miles1to11-143.-5ep.flac
120203056 Miles1to11-143.-5p.flac
120204126 Miles1to11-143.-4ep.flac
120209398 Miles1to11-143.-4p.flac
120216340 Miles1to11-143.-6e.flac
120220068 Miles1to11-143.-6.flac
120230515 Miles1to11-143.-5e.flac
120235421 Miles1to11-143.-5.flac
120235756 Miles1to11-143.-4e.flac
120240837 Miles1to11-143.-4.flac

Curiously, for 1.2.1 (and 1.1.4), -p did misbehave (with or without --no-mid-side). -4 to -6:

120255935 Miles1to11-121.-6e.flac
120255961 Miles1to11-121.-5e.flac
120267836 Miles1to11-121.-6.flac
120267848 Miles1to11-121.-5.flac
120274039 Miles1to11-121.-4e.flac
120307066 Miles1to11-121.-4.flac
120312827 Miles1to11-121.-6ep.flac
120312864 Miles1to11-121.-5ep.flac
120335473 Miles1to11-121.-4ep.flac
120385579 Miles1to11-121.-5p.flac
120385582 Miles1to11-121.-6p.flac
120454126 Miles1to11-121.-4p.flac

The uncompressed CDDA .wav is 345320684 bytes.

Extracting a channel to a mono file and compressing it works like --no-mid-side. Indeed, that is how I discovered it.
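The size listings above can be compared pairwise with a small script. This is a sketch that assumes the `<size> <name>-<version>.-<preset>[e][p][--no-mid-side].flac` layout seen in the listings; the function name is made up, and only four lines from the listings are included as sample data.

```python
import re

LINE = re.compile(
    r"(\d+)\s+\S+-(\d+)\.-(\d)(e?)(p?)((?:--no-mid-side)?)\.flac$"
)

def e_deltas(listing):
    """Return {(version, preset, p, no_mid_side): size_with_e - size_without_e}."""
    sizes = {}
    for line in listing.strip().splitlines():
        m = LINE.match(line.strip())
        if not m:
            continue  # skip anything that is not a size/filename line
        size, version, preset, e, p, nms = m.groups()
        sizes[(version, preset, bool(p), bool(nms), bool(e))] = int(size)
    deltas = {}
    for (version, preset, p, nms, e), size in sizes.items():
        partner = (version, preset, p, nms, False)
        if e and partner in sizes:
            deltas[(version, preset, p, nms)] = size - sizes[partner]
    return deltas

sample = """
197050659 Miles1to11-143.-8p--no-mid-side.flac
197063494 Miles1to11-143.-8ep--no-mid-side.flac
120050974 Miles1to11-143.-8ep.flac
120055033 Miles1to11-143.-8p.flac
"""
deltas = e_deltas(sample)
```

A positive delta means -e produced the larger file. For the four sample lines, -8p with --no-mid-side gets a positive delta and plain -8p a negative one.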

@ktmf01
Collaborator

ktmf01 commented Jul 22, 2024

Here are a couple of "really bad" examples in that -e increases size by fifty percent, both on the "exact" build and the "inexact" build: b1024-r9,9.zip ran with --no-padding -b1024 --lax -r9,9 with and without -e. -r9,9 means out of subset, but for -r8,8 the files were still like 15% bigger with -e than without. Of course you can argue that -r n,n is "merely a pathological setting that nobody uses"; -e improves size when -r8 is used.

Maybe I misunderstood your naming scheme, but the files with -e at the end are smallest. Are these the files with e or specifically without e?

@H2Swine
Contributor Author

H2Swine commented Jul 22, 2024

Embarrassing mistake, and you are right.

Did it over again with that particular short signal.

  • The exact-rice build resolves it.
  • Max adverse impact (for the inexact-build) is much smaller.

Tab-separated file: exact and inexact builds -e and -p impact.txt
Includes lots of dumb option combinations too, and going -l that high was no use, but I left them in because it would be more work deleting. I did delete the -b8192 run that produced no "bad" situations.

@H2Swine
Contributor Author

H2Swine commented Aug 8, 2024

A few more tests on the "Miles tracks 1 to 11" set confirm that the exact-rice build "fixes it": With that build, -e always improves, and same for -p.

I got only one instance (or two) where -p worsens: -5r6 -l1 with/without -e.

Leaving -p aside and focusing on -e:
For the "problem" (to the extent that 0.01 percent is a "problem") to manifest, I need to use --no-mid-side. Recall that the signal is near-mono: the difference channel ranges -3 to 3 or something. So -e makes for enough impact on those small numbers to offset the bad impact on the large numbers.
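For readers unfamiliar with the difference channel: below is a sketch of the usual lossless mid/side transform, assuming mid = (left + right) >> 1 and side = left - right. The sample values are made up to mimic channels that differ only by a small dither-sized amount.

```python
def to_mid_side(left, right):
    """Forward transform: mid drops one bit, side keeps the full difference."""
    mid = [(l + r) >> 1 for l, r in zip(left, right)]
    side = [l - r for l, r in zip(left, right)]
    return mid, side

def from_mid_side(mid, side):
    """Inverse transform: the parity of side restores the bit dropped from mid."""
    left, right = [], []
    for m, s in zip(mid, side):
        total = (m << 1) | (s & 1)  # equals left + right
        left.append((total + s) >> 1)
        right.append((total - s) >> 1)
    return left, right

# Hypothetical near-mono samples: the channels differ by at most 1.
left = [100, -50, 7]
right = [101, -50, 6]
mid, side = to_mid_side(left, right)
print(side)
```

The side channel only holds tiny values here, which is why the choice of predictor for it can matter more, relatively speaking, than for the full-scale channel.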

I ran

  • with and without -p, with and without -e (all four combinations)
  • exact-rice vs inexact-rice
  • with -m vs --no-mid-side

and varying the following parameters (yeah there was a --lax always):

  • varying -l: -5r6 -l <0 to 15>
  • varying -r: -5l8 -r <0 to 9>
  • varying the number of apodization functions to try: -7 A "subdivide_tukey(<1 to 7>)"

With the inexact-rice build and --no-mid-side, -e makes it worse for all runs except "-l0" and "-l1" in the "varying -l" section.

Attaching some results (the cases with "undesired impact" are farthest down) even if the cause is kinda found.
Miles1..11_varying_-l.txt
Miles1..11_varying_-r.txt
Miles1..11_varying_-7A_varying_subdivide_tukey.txt
The leftmost column explains settings; the "f" and even "ff" at the end is nothing deeper than for getting filenames collated consistently. I FOR-looped a variable ranging (ef,ff,pe,pf).

@gabriele2000

Is this going to eventually be merged?
The rice-build changes, I mean...

@ktmf01
Collaborator

ktmf01 commented Nov 3, 2024

There is nothing to be merged, this has been in the code for many years, it is just disabled by default.

@gabriele2000

gabriele2000 commented Nov 3, 2024

There is nothing to be merged, this has been in the code for many years, it is just disabled by default.

How can I enable it?
I'm using autotools, or whatever it's called, to build flac.

Building it with cmake spams stuff and it's not good for my use.

EDIT: I edited stream_encoder.c, changing #undef to #define
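For anyone who wants to script that edit instead of doing it by hand, here is a minimal sketch. The macro name `EXACT_RICE_BITS_CALCULATION` is an assumption (the thread does not name it), so check the actual `#undef` line in stream_encoder.c before relying on it. The helper name is made up.

```python
import re

def enable_macro(source_text, macro):
    """Turn '#undef MACRO' into '#define MACRO'; return (new_text, replacements)."""
    pattern = re.compile(
        rf"^([ \t]*)#[ \t]*undef[ \t]+{re.escape(macro)}\b", re.MULTILINE
    )
    return pattern.subn(lambda m: f"{m.group(1)}#define {macro}", source_text)

# Assumed macro name and a made-up snippet standing in for the real file.
snippet = "/* off by default */\n#undef EXACT_RICE_BITS_CALCULATION\n"
patched, count = enable_macro(snippet, "EXACT_RICE_BITS_CALCULATION")
print(count)
```

Applied to the real file, read the text, call `enable_macro`, confirm that exactly one replacement was made, write the result back, then rebuild.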
