Optimize base16 encoding #7274

Merged

ThatSpaceGuy merged 7 commits into mattw/just-jwes from margolis-optimize-base16-encode on Nov 2, 2022
Conversation

@zachmargolis (Contributor)

  • << turns out to be faster than [].join

See "encode_3" for some improvements over the existing implementation. Normally I would not try to pre-optimize this, but I know we were planning to run this over multiple megabytes of data at a time.
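The `<<`-vs-`[].join` point above can be sketched in miniature (variable names here are mine, just for illustration):

```ruby
# Appending to one String with << mutates a single buffer, while
# map { ... }.join first builds an intermediate Array of 2-char strings
# and then traverses it again to concatenate.
bytes = "abc".bytes

via_join = bytes.map { |b| b.to_s(16).rjust(2, '0') }.join

via_append = +''  # unary + gives an unfrozen string literal
bytes.each { |b| via_append << b.to_s(16).rjust(2, '0') }

# Both produce the same hex; the << version just allocates less.
```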

Here's what I did to benchmark this against #7204

require 'benchmark/ips'
require 'base16'
require 'securerandom'

class PrBase16
  def self.encode16(str)
    str.bytes.map { |char| char.to_s(16).upcase.rjust(2, "0") }.join
  end

  def self.decode16(str)
    output = ''
    str.chars.each_slice(2) do |chars|
      output << chars.join.to_i(16).chr
    end
    output
  end
end

class Base16V3
  def self.encode16(str)
    output = ''
    str.bytes.each { |char| output << char.to_s(16).upcase.rjust(2, "0") }
    output
  end

  def self.decode16(str)
    str.chars.each_slice(2).map do |pair|
      pair.join.to_i(16).chr
    end.join
  end
end

random_10k_bytes = SecureRandom.random_bytes(10_000)
random_10k_hex = SecureRandom.hex(10_000)

Benchmark.ips do |x|
  x.report('encode_gem') do
    Base16.encode16(random_10k_bytes)
  end

  x.report('encode_pr') do
    PrBase16.encode16(random_10k_bytes)
  end

  x.report('encode_3') do
    Base16V3.encode16(random_10k_bytes)
  end

  x.compare!
end

Benchmark.ips do |x|
  x.report('decode_gem') do
    Base16.decode16(random_10k_hex)
  end

  x.report('decode_pr') do
    PrBase16.decode16(random_10k_hex)
  end

  x.report('decode_3') do
    Base16V3.decode16(random_10k_hex)
  end

  x.compare!
end

@zachmargolis (Contributor, Author) commented on the diff:

    # The IRS has requested data be encoded this way. Loosely emulate the Base64 class.
    def self.encode16(str)

I think it will matter more in context as well, but another option is for us to shell out to xxd. Once we start experimenting with encrypting large payloads, we might just write the raw data to a file and shell out to xxd | gzip or something; I think that will be faster than doing it in Ruby.

@zachmargolis (Contributor, Author) commented Nov 2, 2022

Updated benchmark: going from 10k bytes to 100k bytes, xxd starts to win. For prototyping and the early phases I think we should continue in Ruby, but once our background jobs and such start to solidify, I 100% think we should start shelling out.

Warming up --------------------------------------
      encode_gem_10k     3.000  i/100ms
       encode_pr_10k    22.000  i/100ms
        encode_3_10k    25.000  i/100ms
        xxd_file_10k    18.000  i/100ms
       xxd_stdin_10k    20.000  i/100ms
Calculating -------------------------------------
      encode_gem_10k     39.006  (±23.1%) i/s -    183.000  in   5.031732s
       encode_pr_10k    170.964  (±25.7%) i/s -    814.000  in   5.176560s
        encode_3_10k    200.373  (±16.0%) i/s -    975.000  in   5.081257s
        xxd_file_10k    151.883  (± 4.6%) i/s -    756.000  in   4.989180s
       xxd_stdin_10k    175.054  (± 6.9%) i/s -    880.000  in   5.051354s

Comparison:
        encode_3_10k:      200.4 i/s
       xxd_stdin_10k:      175.1 i/s - same-ish: difference falls within error
       encode_pr_10k:      171.0 i/s - same-ish: difference falls within error
        xxd_file_10k:      151.9 i/s - 1.32x  (± 0.00) slower
      encode_gem_10k:       39.0 i/s - 5.14x  (± 0.00) slower

Warming up --------------------------------------
      xxd_stdin_100k     5.000  i/100ms
     encode_gem_100k     1.000  i/100ms
      encode_pr_100k     1.000  i/100ms
       encode_3_100k     2.000  i/100ms
       xxd_file_100k     5.000  i/100ms
      xxd_stdin_100k     6.000  i/100ms
Calculating -------------------------------------
      xxd_stdin_100k     58.977  (± 3.4%) i/s -    295.000  in   5.005833s
     encode_gem_100k      0.180  (± 0.0%) i/s -      1.000  in   5.561398s
      encode_pr_100k     14.688  (± 0.0%) i/s -     74.000  in   5.043406s
       encode_3_100k     25.216  (± 0.0%) i/s -    128.000  in   5.077671s
       xxd_file_100k     55.823  (± 1.8%) i/s -    280.000  in   5.016903s
      xxd_stdin_100k     59.306  (± 1.7%) i/s -    300.000  in   5.059981s

Comparison:
      xxd_stdin_100k:       59.3 i/s
       xxd_file_100k:       55.8 i/s - 1.06x  (± 0.00) slower
       encode_3_100k:       25.2 i/s - 2.35x  (± 0.00) slower
      encode_pr_100k:       14.7 i/s - 4.04x  (± 0.00) slower
     encode_gem_100k:        0.2 i/s - 329.82x  (± 0.00) slower

Warming up --------------------------------------
          decode_gem     5.000  i/100ms
           decode_pr    15.000  i/100ms
            decode_3    13.000  i/100ms
Calculating -------------------------------------
          decode_gem     56.166  (± 5.3%) i/s -    280.000  in   5.003975s
           decode_pr    151.081  (± 2.6%) i/s -    765.000  in   5.066810s
            decode_3    122.092  (±14.7%) i/s -    585.000  in   5.030036s

Comparison:
           decode_pr:      151.1 i/s
            decode_3:      122.1 i/s - 1.24x  (± 0.00) slower
          decode_gem:       56.2 i/s - 2.69x  (± 0.00) slower
require 'benchmark/ips'
require 'base16'
require 'securerandom'
require 'open3'
require 'tempfile'

class PrBase16
  def self.encode16(str)
    str.bytes.map { |char| char.to_s(16).upcase.rjust(2, "0") }.join
  end

  def self.decode16(str)
    output = ''
    str.chars.each_slice(2) do |chars|
      output << chars.join.to_i(16).chr
    end
    output
  end
end

class Base16V3
  def self.encode16(str)
    output = ''
    str.bytes.each { |char| output << char.to_s(16).upcase.rjust(2, "0") }
    output
  end

  def self.decode16(str)
    str.chars.each_slice(2).map do |pair|
      pair.join.to_i(16).chr
    end.join
  end
end

random_10k_bytes = SecureRandom.random_bytes(10_000)
random_10k_hex = SecureRandom.hex(10_000)
random_10k_bytes_file = Tempfile.new
File.open(random_10k_bytes_file.path, 'wb') { |f| f.write(random_10k_bytes) }
random_10k_hex_file = Tempfile.new
File.open(random_10k_hex_file.path, 'w') { |f| f.write(random_10k_hex) }

random_100k_bytes = SecureRandom.random_bytes(100_000)
random_100k_hex = SecureRandom.hex(100_000)
random_100k_bytes_file = Tempfile.new
File.open(random_100k_bytes_file.path, 'wb') { |f| f.write(random_100k_bytes) }
random_100k_hex_file = Tempfile.new
File.open(random_100k_hex_file.path, 'w') { |f| f.write(random_100k_hex) }

outfile = Tempfile.new

Benchmark.ips do |x|
  x.report('encode_gem_10k') do
    Base16.encode16(random_10k_bytes)
  end

  x.report('encode_pr_10k') do
    PrBase16.encode16(random_10k_bytes)
  end

  x.report('encode_3_10k') do
    Base16V3.encode16(random_10k_bytes)
  end

  x.report('xxd_file_10k') do
    system('xxd', '-u', '-plain', random_10k_bytes_file.path, outfile.path)
  end

  x.report('xxd_stdin_10k') do
    Open3.popen3('xxd', '-u', '-plain') do |stdin, stdout|
      stdin.write(random_10k_bytes)
      stdin.close
      stdout.read
    end
  end

  x.compare!
end

Benchmark.ips do |x|
  x.report('xxd_stdin_100k') do
    Open3.popen3('xxd', '-u', '-plain') do |stdin, stdout|
      stdin.write(random_100k_bytes)
      stdin.close
      stdout.read
    end
  end

  x.report('encode_gem_100k') do
    Base16.encode16(random_100k_bytes)
  end

  x.report('encode_pr_100k') do
    PrBase16.encode16(random_100k_bytes)
  end

  x.report('encode_3_100k') do
    Base16V3.encode16(random_100k_bytes)
  end

  x.report('xxd_file_100k') do
    system('xxd', '-u', '-plain', random_100k_bytes_file.path, outfile.path)
  end

  x.report('xxd_stdin_100k') do
    Open3.popen3('xxd', '-u', '-plain') do |stdin, stdout|
      stdin.write(random_100k_bytes)
      stdin.close
      stdout.read
    end
  end

  x.compare!
end

Benchmark.ips do |x|
  x.report('decode_gem') do
    Base16.decode16(random_10k_hex)
  end

  x.report('decode_pr') do
    PrBase16.decode16(random_10k_hex)
  end

  x.report('decode_3') do
    Base16V3.decode16(random_10k_hex)
  end

  x.compare!
end


random_10k_bytes_file.unlink
random_10k_hex_file.unlink
random_100k_bytes_file.unlink
random_100k_hex_file.unlink
outfile.unlink

@zachmargolis (Contributor, Author)

💡 Turns out pack and unpack literally do all of this for us, and since they're implemented in C instead of Ruby, they're leagues faster.

ruby base16.rb 
Warming up --------------------------------------
      encode_gem_10k     3.000  i/100ms
       encode_pr_10k    21.000  i/100ms
        encode_3_10k    23.000  i/100ms
     encode_pack_10k     5.975k i/100ms
        xxd_file_10k    19.000  i/100ms
       xxd_stdin_10k    21.000  i/100ms
Calculating -------------------------------------
      encode_gem_10k     50.141  (±25.9%) i/s -    237.000  in   5.047424s
       encode_pr_10k    235.691  (±12.7%) i/s -      1.155k in   5.079293s
        encode_3_10k    226.123  (±10.2%) i/s -      1.127k in   5.048677s
     encode_pack_10k     62.379k (±11.0%) i/s -    310.700k in   5.058291s
        xxd_file_10k    183.512  (± 4.4%) i/s -    931.000  in   5.084147s
       xxd_stdin_10k    209.037  (± 5.7%) i/s -      1.050k in   5.041609s

Comparison:
     encode_pack_10k:    62379.2 i/s
       encode_pr_10k:      235.7 i/s - 264.67x  (± 0.00) slower
        encode_3_10k:      226.1 i/s - 275.86x  (± 0.00) slower
       xxd_stdin_10k:      209.0 i/s - 298.41x  (± 0.00) slower
        xxd_file_10k:      183.5 i/s - 339.92x  (± 0.00) slower
      encode_gem_10k:       50.1 i/s - 1244.08x  (± 0.00) slower

Warming up --------------------------------------
          decode_gem     5.000  i/100ms
           decode_pr    15.000  i/100ms
            decode_3    12.000  i/100ms
         decode_pack   493.000  i/100ms
Calculating -------------------------------------
          decode_gem     55.712  (± 7.2%) i/s -    280.000  in   5.052216s
           decode_pr    160.648  (± 3.1%) i/s -    810.000  in   5.047056s
            decode_3    145.287  (±10.3%) i/s -    720.000  in   5.040730s
         decode_pack      4.576k (± 9.4%) i/s -     22.678k in   5.009005s

Comparison:
         decode_pack:     4575.8 i/s
           decode_pr:      160.6 i/s - 28.48x  (± 0.00) slower
            decode_3:      145.3 i/s - 31.49x  (± 0.00) slower
          decode_gem:       55.7 i/s - 82.13x  (± 0.00) slower
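The pack-based variants (encode_pack / decode_pack) aren't shown in the thread; a minimal sketch of what they presumably look like, using the C-backed `'H*'` directive (high nibble first, for the whole string) — the `PackBase16` name is invented here for illustration:

```ruby
# Hypothetical reconstruction of the encode_pack / decode_pack variants
# benchmarked above. Both directives run in C inside MRI, which is where
# the ~250x speedup over the byte-by-byte Ruby loops comes from.
module PackBase16
  # "\x0f\xab" -> "0fab" (lowercase hex at this point in the thread)
  def self.encode16(str)
    str.unpack1('H*')
  end

  # "0fab" -> "\x0f\xab"
  def self.decode16(hex)
    [hex].pack('H*')
  end
end
```

Round-tripping any binary string through `encode16` and `decode16` returns the original bytes.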

@zachmargolis zachmargolis requested a review from a team November 2, 2022 16:25
@zachmargolis (Contributor, Author)

OK, one last time: updated because we needed an upcase in there. Slightly slower, but still faster than the existing implementation.

Warming up --------------------------------------
      encode_gem_10k     1.000  i/100ms
       encode_pr_10k    10.000  i/100ms
        encode_3_10k    10.000  i/100ms
     encode_pack_10k   801.000  i/100ms
        xxd_file_10k    12.000  i/100ms
       xxd_stdin_10k    12.000  i/100ms
Calculating -------------------------------------
      encode_gem_10k     38.267  (±23.5%) i/s -    179.000  in   4.999714s
       encode_pr_10k    167.452  (±21.5%) i/s -    790.000  in   5.078897s
        encode_3_10k    177.296  (±24.3%) i/s -    810.000  in   5.040526s
     encode_pack_10k      8.288k (±17.6%) i/s -     40.050k in   5.039900s
        xxd_file_10k    135.442  (±14.8%) i/s -    660.000  in   5.023229s
       xxd_stdin_10k    126.034  (±27.0%) i/s -    552.000  in   5.057803s

Comparison:
     encode_pack_10k:     8288.4 i/s
        encode_3_10k:      177.3 i/s - 46.75x  (± 0.00) slower
       encode_pr_10k:      167.5 i/s - 49.50x  (± 0.00) slower
        xxd_file_10k:      135.4 i/s - 61.19x  (± 0.00) slower
       xxd_stdin_10k:      126.0 i/s - 65.76x  (± 0.00) slower
      encode_gem_10k:       38.3 i/s - 216.60x  (± 0.00) slower

Warming up --------------------------------------
          decode_gem     4.000  i/100ms
           decode_pr    12.000  i/100ms
            decode_3    11.000  i/100ms
         decode_pack   331.000  i/100ms
Calculating -------------------------------------
          decode_gem     46.118  (±17.3%) i/s -    216.000  in   5.034708s
           decode_pr    127.556  (± 9.4%) i/s -    636.000  in   5.040090s
            decode_3    101.360  (±11.8%) i/s -    506.000  in   5.086260s
         decode_pack      3.665k (±20.2%) i/s -     17.212k in   5.031577s

Comparison:
         decode_pack:     3664.6 i/s
           decode_pr:      127.6 i/s - 28.73x  (± 0.00) slower
            decode_3:      101.4 i/s - 36.15x  (± 0.00) slower
          decode_gem:       46.1 i/s - 79.46x  (± 0.00) slower
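The "one last time" version with the upcase presumably amounts to the following sketch (module name is my assumption; the extra `upcase` pass over the output is what makes it slightly slower than the plain pack version, while staying far ahead of the Ruby loops):

```ruby
# Hypothetical final form after adding the upcase the gem's uppercase
# output requires; still pack/unpack-based.
module PackBase16Upcased
  # "\xab" -> "AB"
  def self.encode16(str)
    str.unpack1('H*').upcase
  end

  # pack('H*') accepts both upper- and lowercase hex digits on decode,
  # so decode16 is unchanged.
  def self.decode16(hex)
    [hex].pack('H*')
  end
end
```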

@ThatSpaceGuy ThatSpaceGuy merged this pull request into mattw/just-jwes Nov 2, 2022
@ThatSpaceGuy ThatSpaceGuy deleted the margolis-optimize-base16-encode branch November 2, 2022 21:42
ThatSpaceGuy pushed a commit that referenced this pull request Nov 8, 2022
* Optimize base16 encoding