@kzrnm kzrnm commented Feb 24, 2025

https://en.wikipedia.org/wiki/Toom%E2%80%93Cook_multiplication

I updated BigInteger multiplication to use the Toom-3 algorithm.
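For readers unfamiliar with Toom-3, here is a minimal sketch of a single level of the algorithm. It is written on top of `BigInteger` itself purely for illustration and is not the code in this PR, which works on raw digit spans. The class name `Toom3Sketch`, the `partBits` parameter, and the choice of evaluation points (0, 1, -1, -2, ∞, with the interpolation sequence from the Wikipedia article linked above) are assumptions made for this example.

```csharp
using System;
using System.Numerics;

// Illustrative only: one level of Toom-3 for non-negative operands.
// Names and the partBits parameter are hypothetical, not taken from the PR.
public static class Toom3Sketch
{
    public static BigInteger Multiply(BigInteger a, BigInteger b, int partBits)
    {
        if (a.Sign < 0 || b.Sign < 0)
            throw new ArgumentException("This sketch handles non-negative operands only.");

        BigInteger mask = (BigInteger.One << partBits) - 1;

        // Split each operand into three pieces: x = x0 + x1*B + x2*B^2, B = 2^partBits.
        // The top piece is left unmasked so it absorbs any remaining high bits.
        BigInteger a0 = a & mask, a1 = (a >> partBits) & mask, a2 = a >> (2 * partBits);
        BigInteger b0 = b & mask, b1 = (b >> partBits) & mask, b2 = b >> (2 * partBits);

        // Evaluate both polynomials at 0, 1, -1, -2 and infinity, then multiply
        // pointwise. These five products replace the nine of the schoolbook method;
        // a real implementation would compute them recursively.
        BigInteger r0 = a0 * b0;
        BigInteger r1 = (a0 + a1 + a2) * (b0 + b1 + b2);
        BigInteger rm1 = (a0 - a1 + a2) * (b0 - b1 + b2);
        BigInteger rm2 = (a0 - 2 * a1 + 4 * a2) * (b0 - 2 * b1 + 4 * b2);
        BigInteger rInf = a2 * b2;

        // Interpolate the coefficients c1..c3 of the product polynomial.
        // Every division here is exact.
        BigInteger c3 = (rm2 - r1) / 3;
        BigInteger c1 = (r1 - rm1) / 2;
        BigInteger c2 = rm1 - r0;
        c3 = (c2 - c3) / 2 + 2 * rInf;
        c2 = c2 + c1 - rInf;
        c1 = c1 - c3;

        // Recompose: result = r0 + c1*B + c2*B^2 + c3*B^3 + rInf*B^4.
        return r0
            + (c1 << partBits)
            + (c2 << (2 * partBits))
            + (c3 << (3 * partBits))
            + (rInf << (4 * partBits));
    }
}
```

Calling `Toom3Sketch.Multiply(x, y, 32)` should return the same value as `x * y` for any non-negative `x` and `y`; choosing `partBits` as roughly one third of the longer operand's bit length keeps the three pieces balanced.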

The current Karatsuba algorithm has a time complexity of $O(n^{\log_2{3}}) \simeq O(n^{1.58})$, while Toom-3 runs in $O(n^{\log_3{5}}) \simeq O(n^{1.46})$, so multiplication of large operands is expected to become faster.
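As a brief sketch of where these exponents come from (standard divide-and-conquer analysis, not something specific to this PR): a Toom-$k$ step splits each operand into $k$ pieces and needs $2k-1$ multiplications of size $n/k$, plus linear-time additions and shifts.

$$
T(n) = (2k-1)\,T\!\left(\frac{n}{k}\right) + O(n)
\quad\Longrightarrow\quad
T(n) = O\!\left(n^{\log_k (2k-1)}\right)
$$

$$
k = 2 \text{ (Karatsuba)}: \ \log_2 3 \approx 1.585,
\qquad
k = 3 \text{ (Toom-3)}: \ \log_3 5 \approx 1.465
$$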

in other languages:

About the Implementation

  • Merged SquareThreshold and MultiplyKaratsubaThreshold.
    • Since both had the same value, this improves testability.
  • Added [MethodImpl(MethodImplOptions.AggressiveInlining)] to avoid stack consumption when determining the algorithm.
  • In some cases, the Toom-2.5 algorithm is used.
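The threshold-based selection described above could look roughly like the following sketch. This is not the PR's code: the `Algorithm` enum, the `Choose` method, the rule of comparing the shorter operand's length against each threshold, and the Karatsuba threshold value of 32 are all assumptions for illustration. Only the value 256 for `MultiplyToom3Threshold` comes from this PR. Toom-2.5 is left out because the conditions under which it is chosen are not stated here.

```csharp
using System;
using System.Runtime.CompilerServices;

// Illustrative only: hypothetical dispatch between multiplication algorithms.
public static class MultiplyDispatchSketch
{
    // Assumed value, shared by multiply and square after the merge described above.
    public const int MultiplyKaratsubaThreshold = 32;

    // Value chosen in this PR based on the benchmark results.
    public const int MultiplyToom3Threshold = 256;

    public enum Algorithm { Schoolbook, Karatsuba, Toom3 }

    // Inlining the selection keeps it from adding a stack frame
    // at every level of a recursive multiplication.
    [MethodImpl(MethodImplOptions.AggressiveInlining)]
    public static Algorithm Choose(int leftLength, int rightLength)
    {
        // Assumption: the shorter operand decides which algorithm pays off.
        int shorter = Math.Min(leftLength, rightLength);

        if (shorter < MultiplyKaratsubaThreshold)
            return Algorithm.Schoolbook;
        if (shorter < MultiplyToom3Threshold)
            return Algorithm.Karatsuba;
        return Algorithm.Toom3;
    }
}
```

With these assumed values, `Choose(1000, 100)` selects Karatsuba and `Choose(1000, 300)` selects Toom-3.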

Why is MultiplyToom3Threshold 256?

Based on the benchmark results, I decided to set MultiplyToom3Threshold to 256.

Benchmark

When the number of digits is small, the overhead of algorithm selection is relatively high, leading to a slight regression; for example, a computation that previously took 19 μs now takes 20 μs.

However, as the number of digits increases, performance improves; for instance, a multiplication that used to take 750 μs is now completed in 690 μs.

Code
using BenchmarkDotNet.Attributes;
using BenchmarkDotNet.Running;
using System;
using System.Collections.Generic;
using System.Numerics;

[MemoryDiagnoser(false)]
[HideColumns("Job", "Error", "StdDev", "Median", "RatioSD")]
public class MultiplySomeSizeTests
{
    public IEnumerable<object> GetMultiplyArgs()
    {
        var rnd = new Random(227);
        var bytes = new byte[1000000];
        var lengths = new int[] { 100, 500, 1000, 10000, 100000, 1000000 };
        for (int i = lengths.Length - 1; i >= 0; i--)
        {
            var largeLength = lengths[i];
            var large = Make(largeLength);
            foreach (var p in new double[] { 0.999999999999999, 0.75, 0.5, 0.25 })
            {
                var smallLength = (int)(p * lengths[i]);
                var small = Make(smallLength);
                yield return new Data($"{largeLength:D7}-{smallLength:D7}", large, small);
            }

            yield return new Data($"Square{largeLength:D7}", large, large);
        }
        BigInteger Make(int length)
        {
            var b = bytes.AsSpan().Slice(0, length);
            rnd.NextBytes(b);
            return new BigInteger(b, isUnsigned: true);
        }
    }

    public record Data(string Name, BigInteger Large, BigInteger Small)
    {
        public override string ToString() => Name;
    }

    [Benchmark]
    [ArgumentsSource(nameof(GetMultiplyArgs))]
    public BigInteger Multiply(Data data)
    {
        return data.Large * data.Small;
    }
}

BenchmarkDotNet v0.13.12, Windows 11 (10.0.26100.3194)
13th Gen Intel Core i5-13500, 1 CPU, 20 logical and 14 physical cores
.NET SDK 10.0.100-alpha.1.25077.2
  [Host]   : .NET 10.0.0 (10.0.25.7313), X64 RyuJIT AVX2
  ShortRun : .NET 10.0.0 (42.42.42.42424), X64 RyuJIT AVX2

Job=ShortRun  IterationCount=3  LaunchCount=1  
WarmupCount=3  

Method Toolchain data Mean Ratio Gen0 Gen1 Gen2 Allocated Alloc Ratio
Multiply \main\corerun.exe 0000100-0000025 132.2 ns 1.00 0.0119 - - 152 B 1.00
Multiply \pr0128\corerun.exe 0000100-0000025 132.6 ns 1.00 0.0119 - - 152 B 1.00
Multiply \pr0256\corerun.exe 0000100-0000025 134.3 ns 1.02 0.0119 - - 152 B 1.00
Multiply \pr0512\corerun.exe 0000100-0000025 133.3 ns 1.01 0.0119 - - 152 B 1.00
Multiply \pr1024\corerun.exe 0000100-0000025 133.2 ns 1.01 0.0119 - - 152 B 1.00
Multiply \main\corerun.exe 0000100-0000050 223.6 ns 1.00 0.0138 - - 176 B 1.00
Multiply \pr0128\corerun.exe 0000100-0000050 241.7 ns 1.08 0.0138 - - 176 B 1.00
Multiply \pr0256\corerun.exe 0000100-0000050 232.8 ns 1.04 0.0138 - - 176 B 1.00
Multiply \pr0512\corerun.exe 0000100-0000050 231.1 ns 1.03 0.0138 - - 176 B 1.00
Multiply \pr1024\corerun.exe 0000100-0000050 228.1 ns 1.02 0.0138 - - 176 B 1.00
Multiply \main\corerun.exe 0000100-0000075 322.2 ns 1.00 0.0157 - - 200 B 1.00
Multiply \pr0128\corerun.exe 0000100-0000075 322.2 ns 1.00 0.0157 - - 200 B 1.00
Multiply \pr0256\corerun.exe 0000100-0000075 336.5 ns 1.04 0.0157 - - 200 B 1.00
Multiply \pr0512\corerun.exe 0000100-0000075 323.3 ns 1.00 0.0157 - - 200 B 1.00
Multiply \pr1024\corerun.exe 0000100-0000075 324.6 ns 1.01 0.0157 - - 200 B 1.00
Multiply \main\corerun.exe 0000100-0000099 399.4 ns 1.00 0.0176 - - 224 B 1.00
Multiply \pr0128\corerun.exe 0000100-0000099 445.1 ns 1.11 0.0176 - - 224 B 1.00
Multiply \pr0256\corerun.exe 0000100-0000099 436.7 ns 1.09 0.0176 - - 224 B 1.00
Multiply \pr0512\corerun.exe 0000100-0000099 407.8 ns 1.02 0.0176 - - 224 B 1.00
Multiply \pr1024\corerun.exe 0000100-0000099 438.0 ns 1.10 0.0176 - - 224 B 1.00
Multiply \main\corerun.exe 0000500-0000125 2,540.3 ns 1.00 0.0496 - - 656 B 1.00
Multiply \pr0128\corerun.exe 0000500-0000125 2,650.3 ns 1.04 0.0496 - - 656 B 1.00
Multiply \pr0256\corerun.exe 0000500-0000125 2,621.6 ns 1.03 0.0496 - - 656 B 1.00
Multiply \pr0512\corerun.exe 0000500-0000125 2,648.0 ns 1.04 0.0496 - - 656 B 1.00
Multiply \pr1024\corerun.exe 0000500-0000125 2,644.0 ns 1.04 0.0496 - - 656 B 1.00
Multiply \main\corerun.exe 0000500-0000250 3,945.4 ns 1.00 0.0610 - - 776 B 1.00
Multiply \pr0128\corerun.exe 0000500-0000250 4,185.3 ns 1.06 0.0610 - - 776 B 1.00
Multiply \pr0256\corerun.exe 0000500-0000250 4,546.5 ns 1.15 0.0610 - - 776 B 1.00
Multiply \pr0512\corerun.exe 0000500-0000250 4,777.1 ns 1.21 0.0610 - - 776 B 1.00
Multiply \pr1024\corerun.exe 0000500-0000250 4,604.5 ns 1.17 0.0610 - - 776 B 1.00
Multiply \main\corerun.exe 0000500-0000375 5,344.4 ns 1.00 0.0687 - - 904 B 1.00
Multiply \pr0128\corerun.exe 0000500-0000375 6,243.9 ns 1.17 0.0687 - - 904 B 1.00
Multiply \pr0256\corerun.exe 0000500-0000375 6,226.8 ns 1.17 0.0687 - - 904 B 1.00
Multiply \pr0512\corerun.exe 0000500-0000375 6,180.4 ns 1.16 0.0687 - - 904 B 1.00
Multiply \pr1024\corerun.exe 0000500-0000375 5,654.4 ns 1.06 0.0687 - - 904 B 1.00
Multiply \main\corerun.exe 0000500-0000499 6,060.4 ns 1.00 0.0763 - - 1024 B 1.00
Multiply \pr0128\corerun.exe 0000500-0000499 7,032.1 ns 1.16 0.0763 - - 1024 B 1.00
Multiply \pr0256\corerun.exe 0000500-0000499 6,844.1 ns 1.13 0.0763 - - 1024 B 1.00
Multiply \pr0512\corerun.exe 0000500-0000499 6,821.3 ns 1.13 0.0763 - - 1024 B 1.00
Multiply \pr1024\corerun.exe 0000500-0000499 6,821.3 ns 1.13 0.0763 - - 1024 B 1.00
Multiply \main\corerun.exe 0001000-0000250 7,894.7 ns 1.00 0.0916 - - 1280 B 1.00
Multiply \pr0128\corerun.exe 0001000-0000250 9,196.4 ns 1.16 0.0916 - - 1280 B 1.00
Multiply \pr0256\corerun.exe 0001000-0000250 9,301.5 ns 1.18 0.0916 - - 1280 B 1.00
Multiply \pr0512\corerun.exe 0001000-0000250 9,215.9 ns 1.17 0.0916 - - 1280 B 1.00
Multiply \pr1024\corerun.exe 0001000-0000250 9,514.8 ns 1.21 0.0916 - - 1280 B 1.00
Multiply \main\corerun.exe 0001000-0000500 12,588.0 ns 1.00 0.1068 - - 1528 B 1.00
Multiply \pr0128\corerun.exe 0001000-0000500 13,960.0 ns 1.11 0.1068 - - 1528 B 1.00
Multiply \pr0256\corerun.exe 0001000-0000500 13,931.5 ns 1.11 0.1068 - - 1528 B 1.00
Multiply \pr0512\corerun.exe 0001000-0000500 14,277.4 ns 1.13 0.1068 - - 1528 B 1.00
Multiply \pr1024\corerun.exe 0001000-0000500 13,925.9 ns 1.11 0.1068 - - 1528 B 1.00
Multiply \main\corerun.exe 0001000-0000750 16,363.0 ns 1.00 0.1221 - - 1776 B 1.00
Multiply \pr0128\corerun.exe 0001000-0000750 17,010.7 ns 1.04 0.1221 - - 1776 B 1.00
Multiply \pr0256\corerun.exe 0001000-0000750 17,728.1 ns 1.08 0.1221 - - 1776 B 1.00
Multiply \pr0512\corerun.exe 0001000-0000750 17,987.1 ns 1.10 0.1221 - - 1776 B 1.00
Multiply \pr1024\corerun.exe 0001000-0000750 17,786.7 ns 1.09 0.1221 - - 1776 B 1.00
Multiply \main\corerun.exe 0001000-0000999 19,115.1 ns 1.00 0.1526 - - 2024 B 1.00
Multiply \pr0128\corerun.exe 0001000-0000999 19,084.0 ns 1.00 0.1526 - - 2024 B 1.00
Multiply \pr0256\corerun.exe 0001000-0000999 19,898.4 ns 1.04 0.1526 - - 2024 B 1.00
Multiply \pr0512\corerun.exe 0001000-0000999 20,120.3 ns 1.05 0.1526 - - 2024 B 1.00
Multiply \pr1024\corerun.exe 0001000-0000999 20,073.3 ns 1.05 0.1526 - - 2024 B 1.00
Multiply \main\corerun.exe 0010000-0002500 331,491.7 ns 1.00 0.9766 - - 12528 B 1.00
Multiply \pr0128\corerun.exe 0010000-0002500 345,360.3 ns 1.04 0.9766 - - 12528 B 1.00
Multiply \pr0256\corerun.exe 0010000-0002500 331,238.7 ns 1.00 0.9766 - - 12528 B 1.00
Multiply \pr0512\corerun.exe 0010000-0002500 328,670.5 ns 0.99 0.9766 - - 12528 B 1.00
Multiply \pr1024\corerun.exe 0010000-0002500 346,899.5 ns 1.05 0.9766 - - 12528 B 1.00
Multiply \main\corerun.exe 0010000-0005000 505,027.1 ns 1.00 0.9766 - - 15025 B 1.00
Multiply \pr0128\corerun.exe 0010000-0005000 476,146.5 ns 0.94 0.9766 - - 15024 B 1.00
Multiply \pr0256\corerun.exe 0010000-0005000 450,121.5 ns 0.89 0.9766 - - 15024 B 1.00
Multiply \pr0512\corerun.exe 0010000-0005000 486,532.8 ns 0.96 0.9766 - - 15024 B 1.00
Multiply \pr1024\corerun.exe 0010000-0005000 490,601.3 ns 0.97 0.9766 - - 15024 B 1.00
Multiply \main\corerun.exe 0010000-0007500 672,133.9 ns 1.00 0.9766 - - 17529 B 1.00
Multiply \pr0128\corerun.exe 0010000-0007500 639,168.1 ns 0.95 0.9766 - - 17529 B 1.00
Multiply \pr0256\corerun.exe 0010000-0007500 593,343.2 ns 0.88 0.9766 - - 17529 B 1.00
Multiply \pr0512\corerun.exe 0010000-0007500 588,610.1 ns 0.88 0.9766 - - 17528 B 1.00
Multiply \pr1024\corerun.exe 0010000-0007500 655,762.0 ns 0.98 0.9766 - - 17528 B 1.00
Multiply \main\corerun.exe 0010000-0009999 752,860.1 ns 1.00 0.9766 - - 20025 B 1.00
Multiply \pr0128\corerun.exe 0010000-0009999 655,242.7 ns 0.87 0.9766 - - 20025 B 1.00
Multiply \pr0256\corerun.exe 0010000-0009999 690,172.2 ns 0.92 0.9766 - - 20025 B 1.00
Multiply \pr0512\corerun.exe 0010000-0009999 662,638.8 ns 0.88 0.9766 - - 20024 B 1.00
Multiply \pr1024\corerun.exe 0010000-0009999 739,939.7 ns 0.98 0.9766 - - 20025 B 1.00
Multiply \main\corerun.exe 0100000-0025000 13,501,335.9 ns 1.00 31.2500 31.2500 31.2500 125052 B 1.00
Multiply \pr0128\corerun.exe 0100000-0025000 10,430,180.2 ns 0.77 31.2500 31.2500 31.2500 125056 B 1.00
Multiply \pr0256\corerun.exe 0100000-0025000 10,917,271.4 ns 0.81 31.2500 31.2500 31.2500 125052 B 1.00
Multiply \pr0512\corerun.exe 0100000-0025000 10,978,839.6 ns 0.81 31.2500 31.2500 31.2500 125052 B 1.00
Multiply \pr1024\corerun.exe 0100000-0025000 11,131,788.0 ns 0.82 31.2500 31.2500 31.2500 125052 B 1.00
Multiply \main\corerun.exe 0100000-0050000 19,529,617.7 ns 1.00 31.2500 31.2500 31.2500 150068 B 1.00
Multiply \pr0128\corerun.exe 0100000-0050000 15,097,175.5 ns 0.77 46.8750 46.8750 46.8750 150062 B 1.00
Multiply \pr0256\corerun.exe 0100000-0050000 14,365,115.1 ns 0.74 46.8750 46.8750 46.8750 150062 B 1.00
Multiply \pr0512\corerun.exe 0100000-0050000 15,798,947.9 ns 0.81 31.2500 31.2500 31.2500 150068 B 1.00
Multiply \pr1024\corerun.exe 0100000-0050000 15,769,881.2 ns 0.81 31.2500 31.2500 31.2500 150068 B 1.00
Multiply \main\corerun.exe 0100000-0075000 26,245,361.5 ns 1.00 31.2500 31.2500 31.2500 175068 B 1.00
Multiply \pr0128\corerun.exe 0100000-0075000 20,914,966.7 ns 0.80 31.2500 31.2500 31.2500 175067 B 1.00
Multiply \pr0256\corerun.exe 0100000-0075000 18,830,658.3 ns 0.72 31.2500 31.2500 31.2500 175067 B 1.00
Multiply \pr0512\corerun.exe 0100000-0075000 18,699,003.1 ns 0.71 31.2500 31.2500 31.2500 175058 B 1.00
Multiply \pr1024\corerun.exe 0100000-0075000 21,151,182.3 ns 0.81 31.2500 31.2500 31.2500 175058 B 1.00
Multiply \main\corerun.exe 0100000-0099999 29,509,821.9 ns 1.00 31.2500 31.2500 31.2500 200068 B 1.00
Multiply \pr0128\corerun.exe 0100000-0099999 20,654,077.1 ns 0.70 31.2500 31.2500 31.2500 200067 B 1.00
Multiply \pr0256\corerun.exe 0100000-0099999 21,727,724.0 ns 0.74 31.2500 31.2500 31.2500 200058 B 1.00
Multiply \pr0512\corerun.exe 0100000-0099999 20,679,755.2 ns 0.70 31.2500 31.2500 31.2500 200058 B 1.00
Multiply \pr1024\corerun.exe 0100000-0099999 23,118,738.5 ns 0.78 31.2500 31.2500 31.2500 200067 B 1.00
Multiply \main\corerun.exe 1000000-0250000 509,747,866.7 ns 1.00 - - - 1250760 B 1.00
Multiply \pr0128\corerun.exe 1000000-0250000 321,428,916.7 ns 0.63 - - - 1250392 B 1.00
Multiply \pr0256\corerun.exe 1000000-0250000 314,692,916.7 ns 0.62 - - - 1250392 B 1.00
Multiply \pr0512\corerun.exe 1000000-0250000 316,530,200.0 ns 0.62 - - - 1250472 B 1.00
Multiply \pr1024\corerun.exe 1000000-0250000 354,507,166.7 ns 0.70 - - - 1250472 B 1.00
Multiply \main\corerun.exe 1000000-0500000 778,175,866.7 ns 1.00 - - - 1500472 B 1.00
Multiply \pr0128\corerun.exe 1000000-0500000 453,160,500.0 ns 0.58 - - - 1500472 B 1.00
Multiply \pr0256\corerun.exe 1000000-0500000 433,646,666.7 ns 0.56 - - - 1500760 B 1.00
Multiply \pr0512\corerun.exe 1000000-0500000 438,128,166.7 ns 0.56 - - - 1500472 B 1.00
Multiply \pr1024\corerun.exe 1000000-0500000 467,433,166.7 ns 0.60 - - - 1500760 B 1.00
Multiply \main\corerun.exe 1000000-0750000 1,016,370,266.7 ns 1.00 - - - 1750760 B 1.00
Multiply \pr0128\corerun.exe 1000000-0750000 605,555,000.0 ns 0.60 - - - 1750472 B 1.00
Multiply \pr0256\corerun.exe 1000000-0750000 562,233,733.3 ns 0.55 - - - 1750472 B 1.00
Multiply \pr0512\corerun.exe 1000000-0750000 553,515,400.0 ns 0.54 - - - 1750472 B 1.00
Multiply \pr1024\corerun.exe 1000000-0750000 564,205,566.7 ns 0.56 - - - 1750472 B 1.00
Multiply \main\corerun.exe 1000000-0999999 1,194,190,333.3 ns 1.00 - - - 2000472 B 1.00
Multiply \pr0128\corerun.exe 1000000-0999999 621,246,233.3 ns 0.52 - - - 2000760 B 1.00
Multiply \pr0256\corerun.exe 1000000-0999999 617,586,733.3 ns 0.52 - - - 2000472 B 1.00
Multiply \pr0512\corerun.exe 1000000-0999999 613,818,600.0 ns 0.51 - - - 2000760 B 1.00
Multiply \pr1024\corerun.exe 1000000-0999999 613,492,666.7 ns 0.51 - - - 2000088 B 1.00
Multiply \main\corerun.exe Square0000100 275.5 ns 1.00 0.0176 - - 224 B 1.00
Multiply \pr0128\corerun.exe Square0000100 275.3 ns 1.00 0.0176 - - 224 B 1.00
Multiply \pr0256\corerun.exe Square0000100 280.0 ns 1.02 0.0176 - - 224 B 1.00
Multiply \pr0512\corerun.exe Square0000100 277.2 ns 1.01 0.0176 - - 224 B 1.00
Multiply \pr1024\corerun.exe Square0000100 276.6 ns 1.00 0.0176 - - 224 B 1.00
Multiply \main\corerun.exe Square0000500 4,054.4 ns 1.00 0.0763 - - 1024 B 1.00
Multiply \pr0128\corerun.exe Square0000500 4,121.7 ns 1.02 0.0763 - - 1024 B 1.00
Multiply \pr0256\corerun.exe Square0000500 4,123.9 ns 1.02 0.0763 - - 1024 B 1.00
Multiply \pr0512\corerun.exe Square0000500 4,142.7 ns 1.02 0.0763 - - 1024 B 1.00
Multiply \pr1024\corerun.exe Square0000500 4,175.9 ns 1.03 0.0763 - - 1024 B 1.00
Multiply \main\corerun.exe Square0001000 12,664.4 ns 1.00 0.1526 - - 2024 B 1.00
Multiply \pr0128\corerun.exe Square0001000 12,840.2 ns 1.01 0.1526 - - 2024 B 1.00
Multiply \pr0256\corerun.exe Square0001000 13,198.4 ns 1.04 0.1526 - - 2024 B 1.00
Multiply \pr0512\corerun.exe Square0001000 13,049.2 ns 1.03 0.1526 - - 2024 B 1.00
Multiply \pr1024\corerun.exe Square0001000 13,170.3 ns 1.04 0.1526 - - 2024 B 1.00
Multiply \main\corerun.exe Square0010000 507,761.8 ns 1.00 0.9766 - - 20025 B 1.00
Multiply \pr0128\corerun.exe Square0010000 449,839.6 ns 0.89 1.4648 - - 20024 B 1.00
Multiply \pr0256\corerun.exe Square0010000 446,288.1 ns 0.88 1.4648 - - 20024 B 1.00
Multiply \pr0512\corerun.exe Square0010000 433,092.4 ns 0.85 1.4648 - - 20024 B 1.00
Multiply \pr1024\corerun.exe Square0010000 491,795.1 ns 0.97 0.9766 - - 20024 B 1.00
Multiply \main\corerun.exe Square0100000 20,070,908.3 ns 1.00 31.2500 31.2500 31.2500 200059 B 1.00
Multiply \pr0128\corerun.exe Square0100000 14,622,159.4 ns 0.73 46.8750 46.8750 46.8750 200066 B 1.00
Multiply \pr0256\corerun.exe Square0100000 14,373,196.9 ns 0.72 46.8750 46.8750 46.8750 200062 B 1.00
Multiply \pr0512\corerun.exe Square0100000 14,284,324.5 ns 0.71 46.8750 46.8750 46.8750 200066 B 1.00
Multiply \pr1024\corerun.exe Square0100000 15,955,562.5 ns 0.79 31.2500 31.2500 31.2500 200068 B 1.00
Multiply \main\corerun.exe Square1000000 780,524,800.0 ns 1.00 - - - 2000472 B 1.00
Multiply \pr0128\corerun.exe Square1000000 445,885,933.3 ns 0.57 - - - 2000472 B 1.00
Multiply \pr0256\corerun.exe Square1000000 440,426,600.0 ns 0.56 - - - 2000760 B 1.00
Multiply \pr0512\corerun.exe Square1000000 428,211,366.7 ns 0.55 - - - 2000472 B 1.00
Multiply \pr1024\corerun.exe Square1000000 430,626,800.0 ns 0.55 - - - 2000760 B 1.00

@ghost ghost added the area-System.Numerics label Feb 24, 2025
@dotnet-policy-service dotnet-policy-service bot added the community-contribution Indicates that the PR has been added by a community member label Feb 24, 2025

Tagging subscribers to this area: @dotnet/area-system-numerics
See info in area-owners.md if you want to be subscribed.

@tecel007

Hey @kzrnm, the current BigInteger is quite the mess. I developed a clean, 64-bit, two's-complement version which is roughly 3x as fast.

BigInt Fibonacci(1,073,741,824) => 650 seconds
BigInteger Fibonacci(1,073,741,824) => 1825 seconds

My version also uses ToomCook3 (as well as other multiplication algorithms).

https://github.com/tecel007/BigInt

I would like a faster BigInteger, but I don't have time to integrate it.

Seems like you are doing performance stuff... I really do think we need to start fresh, clean slate!

KISS

@kzrnm

kzrnm commented Apr 11, 2025

@tecel007 @IDisposable Please discuss topics unrelated to this Pull Request elsewhere (perhaps #114516).

@jeffhandley

@tannergooding Another BigInteger change for your consideration. I'm more hesitant on this one as it starts to veer more toward substantial changes that would be irrelevant if we pursue a different implementation such as a 64-bit, two's-complement implementation like @tecel007 has done. What do you think?
