-
Notifications
You must be signed in to change notification settings - Fork 14
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
gain #91
Comments
Thanks for taking time to read an review the code.
Thus the formula to compute the gain when replacing the pattern with a token :
Testing for |
Peter piper picked a peck of pickled peppers. A peck of pickled peppers Peter Piper picked. If Peter Piper picked a peck of pickled peppers, Where's the peck of pickled peppers Peter Piper picked above text is RegPack'ed(using -1): 195B to 148B |
All right, you found a very interesting edge case. Let's dive into the clockwork to explain what is happening here.
var gain = count*(packerData.matchesLookup[j].len-tokenCost)-packerData.matchesLookup[j].len-2*tokenCost;
var score = options.crushGainFactor*gain+options.crushLengthFactor*packerData.matchesLookup[j].len+options.crushCopiesFactor*count;
if (gain>=0) { Notice the -2 in the expression (multiplied by The reason is that the computed gain itself is off by one in the case of the packer. Replacing -2 with -1 in the packer is not a perfect solution either, I had an example from the benchmarks (need to retrieve it, coming later) where the last match, added thanks to the -1, needs a token from a new range, effectively growing the regular expression by one character. Suggested solution : count the first three characters in a range as one byte each (thus -2 for them), then the subsequent ones in the same range are free (-1). What do you think of that ? |
regPack.js
line 210 & 249 & 266: Z=R*j-R-j-2
line 267: Z1=(R+1)*j-(R+1)-j-2
I think that -2 may be -1. because sometimes better result. It's should be changed by options.
The text was updated successfully, but these errors were encountered: