Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Make sure large numbers compute and factor reasonably in Formula #475

Open
katestange opened this issue Oct 26, 2024 · 9 comments
Open

Make sure large numbers compute and factor reasonably in Formula #475

katestange opened this issue Oct 26, 2024 · 9 comments
Assignees
Labels
bug Something isn't working

Comments

@katestange
Copy link
Member

katestange commented Oct 26, 2024

This URL:
http://localhost:5173/?name=Formula&viz=FactorFence&highlight=17&seq=Formula&formula=2%5En-1&first=1
Produces this erroneous image (factorizations are wrong from the 54th term onward):
Screenshot from 2024-10-26 14-35-57
The problem appears to be that we can't factor beyond a certain size and instead of reporting this (factorfence is supposed to have a way to show this), we are drawing the false factorization. Mousing over the columns beyond 54 or so shows a factorization as a power of 2.

@katestange katestange added the bug Something isn't working label Oct 26, 2024
@katestange
Copy link
Member Author

Interestingly it is fine on n factorial? It greys out the bars in that case, as it should:
Screenshot from 2024-10-26 14-44-47

@katestange
Copy link
Member Author

katestange commented Oct 26, 2024

Also, this may or may not be related, but when you click "choose visualizer" from the n factorial view above, you see Turtle doing something bad:
Screenshot from 2024-10-26 15-01-05
However, if you click turtle it comes up okay.

@gwhitney
Copy link
Collaborator

Yes, definitely a bug. And a weird one, too, as the in-browser factorizer is supposed to know its own limitations and get this sorted out. Marking for alpha.

@katestange
Copy link
Member Author

katestange commented Oct 27, 2024

This may be related and might provide clues:

http://localhost:5173/?name=FactorFence&viz=FactorFence&seq=Formula&formula=gcd%285%5En-1%2C3%5En-1%29

This just doesn't look right. There seem to be three phases going on. On the left, it looks correct to me, varying height, varying factorizations (e.g. sage verifies a(22) = 12328 = 2^3 * 23 * 67). In the middle, we see a bunch of probably fake powers of two (e.g. a(44) should be 24656 = 2^4 * 23 * 67). At the right, we see reported failures to factor, but they seem strangely large (bars too high and uniform), seems wrong (e.g. we should get a(71) = 2 and a(72) = 149440928 = 2^5 * 7 * 13 * 19 * 37 * 73 ).

Sage can be used to verify values, e.g. code below put into https://sagecell.sagemath.org/

n = 32
g = gcd(5^n-1,3^n-1)
print(g, factor(g))

Screenshot from 2024-10-26 19-51-11

@gwhitney
Copy link
Collaborator

Well, on this one, it's clear why the GCDs are becoming incorrect: currently mathjs is computing with number type, which can only encode integers as distinct bit patterns up to about 2^53. Every "integer" number higher than that is divisible by two, for example; and the excess divisibilities just get worse as they get larger, and the computations just lose more and more accuracy. So there is basically no point in re-examining this gcd of gigantic numbers issue until after #343 is resolved to get a mathjs that handles bigints. Since that has a beta milestone, not alpha, this part will have to split off to another issue. My guess for the main issue here is that in alpha, we will need a check against MAX_SAFE_INTEGER in the internal factorizer, which could perhaps be removed after the mathjs update.

@gwhitney gwhitney self-assigned this Oct 31, 2024
@gwhitney
Copy link
Collaborator

First step will be to add some unit tests for simpleFactor. My guess is they will show all of these issues and getting the tests to pass will solve all of the things that can be solved before the switch to using bigint more. But there could be more going on.

@gwhitney
Copy link
Collaborator

This URL:
http://localhost:5173/?name=Formula&viz=FactorFence&highlight=17&seq=Formula&formula=2%5En-1&first=1
Produces this erroneous image (factorizations are wrong from the 54th term onward):

Yes, 2^53-1 is the largest "safe integer" that can be exactly represented in a 64-bit IEEE floating-point number. So 2^54-1 loses accuracy inside the Formula and then when the result is converted to BigInt, it appears to equal 2^54. Similarly with all greater powers of 2. I have put a check for the time being for simpleFactor to just refuse to factor any numbers larger than 2^53-1.

@gwhitney
Copy link
Collaborator

Interestingly it is fine on n factorial? It greys out the bars in that case, as it should:

This was luck, in that the incorrect factorials computed just happened to have some too-large prime factors for simpleFactor to deal with. Now it just refuses to factorize any factorial after 18! just because they look too big (even though there were a couple more that were absolutely correct, because they were not far beyond 2^53-1 and the fact that they did have a lot of factors of 2 made their IEEE float representations happen to be exact. But that pattern breaks down not much farther along, so I don't think it's a big loss to not have the factorizations of 19! through 22! for a while, until we upgrade to mathjs 13 and bigints in formulas.

@gwhitney gwhitney changed the title Handling of large formula numbers in factorfence Make sure large numbers compute and factor reasonably in Formula Nov 1, 2024
@gwhitney
Copy link
Collaborator

gwhitney commented Nov 1, 2024

With #485 merged, we should for the moment not be reporting erroneous factorizations. This issue needs to revisited once #343 is resolved, and the tests changed to reflect the new higher expectations for computing with bigints.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug Something isn't working
Projects
None yet
Development

No branches or pull requests

2 participants