
Buffered PRNG #45

Open
obazoud opened this issue Nov 23, 2024 · 2 comments

Comments

@obazoud

obazoud commented Nov 23, 2024

Hello,

I'm using ulidx in my project and was concerned about its performance based on benchmarks. Since the built-in generation seemed slow (compared to UUID v7, for example), I decided to investigate further.

Local Machine Benchmark

Here are the results of the benchmark on my local machine, a Mac M1 with 16 GB of RAM:

Simple ulid: 26,597 ops/sec ±0.79% (95 runs)
ulid with timestamp: 26,823 ops/sec ±2.37% (74 runs)

Profiling and Bottleneck Identification:

I profiled ulidx (for Node.js only with --prof) to pinpoint the performance bottleneck. The results indicated that the encodeRandom method was the primary culprit. Upon closer inspection, I observed that this method calls crypto.getRandomValues 16 times per ulid generation, which seemed excessive.

Proposed Solution: Buffered PRNG

To address this, I implemented a custom PRNG that reduces the number of calls to crypto.getRandomValues while using a larger buffer size. Here's the code:

class BufferedPRNG {
  constructor(size) {
    this.buffer = new Uint8Array(size);
    this.cursor = 0xffff; // start past the end so the first next() refills the buffer
  }

  next = () => {
    if (this.cursor >= this.buffer.length) {
      // Refill the whole buffer in one call instead of one call per byte
      crypto.getRandomValues(this.buffer);
      this.cursor = 0;
    }
    return this.buffer[this.cursor++] / 0xff; // scale to [0, 1], matching ulidx's PRNG contract
  };
}

This custom PRNG generates random values from a pre-filled buffer, reducing the number of calls to crypto.getRandomValues.

Integration with Benchmark

I incorporated the BufferedPRNG class into the benchmark to compare performance:

const bufferedPRNG_2 = new BufferedPRNG(2);
const bufferedPRNG_4 = new BufferedPRNG(4);
const bufferedPRNG_16 = new BufferedPRNG(16);
const bufferedPRNG_32 = new BufferedPRNG(32);
const bufferedPRNG_64 = new BufferedPRNG(64);

suite.add("ulid with bufferedPRNG 2", function () {
  ulid(undefined, bufferedPRNG_2.next);
});

// ... similar tests for other buffer sizes

Benchmark Results:

Simple ulid: 27,561 ops/sec ±0.76% (90 runs)
ulid with timestamp: 26,556 ops/sec ±0.85% (93 runs)
ulid with bufferedPRNG 2: 74,244 ops/sec ±0.69% (93 runs)
ulid with bufferedPRNG 4: 137,098 ops/sec ±0.38% (97 runs)
ulid with bufferedPRNG 16: 381,865 ops/sec ±0.35% (97 runs)
ulid with bufferedPRNG 32: 531,374 ops/sec ±0.52% (96 runs)
ulid with bufferedPRNG 64: 668,991 ops/sec ±0.20% (98 runs)

The results demonstrate significant performance improvements. Using a buffer size of 16 (which is the minimum required for a single ulid) yields a roughly 14x speedup.
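As a quick sanity check on the "roughly 14x" figure, dividing the buffered-16 throughput by the unbuffered baseline from the run above:

```javascript
// Back-of-the-envelope check of the speedup claim, using the ops/sec
// numbers reported in the benchmark results above.
const simpleUlid = 27_561;  // Simple ulid
const buffered16 = 381_865; // ulid with bufferedPRNG 16
const speedup = buffered16 / simpleUlid;
console.log(speedup.toFixed(1)); // "13.9" — roughly 14x
```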

Request for Feedback

Please let me know if there are any errors in my implementation or approach. I'm open to any suggestions because I plan to use this optimization in my project.

@MarcelWaldvogel
Contributor

I also looked into the PRNG behavior, as the need for an `==` comparison here looked like it might indicate a bias in the resulting character frequencies:

ulidx/source/ulid.ts

Lines 283 to 289 in bb685ba

export function randomChar(prng: PRNG): string {
    let rand = Math.floor(prng() * ENCODING_LEN);
    if (rand === ENCODING_LEN) {
        rand = ENCODING_LEN - 1;
    }
    return ENCODING.charAt(rand);
}

… together with the use of 0xff instead of 0x100 here:

return () => crypto.randomBytes(1).readUInt8() / 0xff;

But it turned out just to be an unusual construct for someone used to either

  1. doing integer arithmetic for this or
  2. thinking of the standard Math.random() behavior of returning a value strictly smaller than 1.
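To make that concrete, here is a sketch enumerating all 256 possible byte values through the /0xff scaling and the clamp from randomChar (ENCODING is the Crockford base32 alphabet specified by ULID); every character turns out to be hit exactly 8 times, i.e. there is no bias:

```javascript
// Sketch: push every possible byte through the /0xff scaling and the
// `rand === ENCODING_LEN` clamp quoted above, then count hits per char.
const ENCODING = "0123456789ABCDEFGHJKMNPQRSTVWXYZ"; // Crockford base32
const ENCODING_LEN = ENCODING.length; // 32

const counts = new Array(ENCODING_LEN).fill(0);
for (let byte = 0; byte <= 0xff; byte++) {
  let rand = Math.floor((byte / 0xff) * ENCODING_LEN);
  if (rand === ENCODING_LEN) {
    rand = ENCODING_LEN - 1; // only byte === 255 yields exactly 1.0
  }
  counts[rand]++;
}
// Every character is selected for exactly 8 of the 256 byte values:
console.log(counts.every((c) => c === 8)); // true
```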

For anyone interested in the frequency analysis, here is a branch supporting it: https://github.com/MarcelWaldvogel/ulidx/tree/frequency-analysis

(I am not creating a PR for this, as it would increase the public API unnecessarily. Even though such a test might be a good one to have included.)

@MarcelWaldvogel
Contributor

This is the benchmark output on my development machine after applying #46:

Simple ulid x 23,766 ops/sec ±1.57% (84 runs sampled)
ulid with timestamp x 20,757 ops/sec ±0.71% (75 runs sampled)

I compared a few pieces of code that would result in relatively small code changes but big performance improvements. The code run by npm run bench is here; ulid() uses "randomness wasteful bits" (8 bits of randomness per character) with "string concatenation from back" (normal append).

Simple ulid x 335,049 ops/sec ±0.46% (94 runs sampled)
ulid with timestamp x 318,268 ops/sec ±1.00% (91 runs sampled)
string concatenation from front x 7,057,222 ops/sec ±0.78% (94 runs sampled)
string concatenation from back x 7,484,660 ops/sec ±1.02% (87 runs sampled)
string concatenation from back with forEach x 3,676,662 ops/sec ±0.75% (95 runs sampled)
string concatenation by map/join x 998,182 ops/sec ±0.56% (93 runs sampled)
randomness 1by1 x 23,247 ops/sec ±4.41% (77 runs sampled)
randomness wasteful bits x 307,968 ops/sec ±0.85% (72 runs sampled)
randomness more computation x 380,385 ops/sec ±2.15% (90 runs sampled)

The result is about a 14-fold improvement with comparable code and complexity as before.

According to the benchmark, the "more computation" alternative would be some 20% faster. However, the difference was unusually high in this particular benchmark run; it is often only half as large and sometimes even reversed:

randomness wasteful bits x 352,064 ops/sec ±0.83% (83 runs sampled)
randomness more computation x 308,060 ops/sec ±2.42% (73 runs sampled)

Given that the extra complexity would require additional careful testing of the logic (that each random bit is really used only once and that all 5 bits are always fresh from the random pool), I suggest sticking to the simple code.
