This repository has been archived by the owner on Jan 20, 2022. It is now read-only.

Bumps version of other projects. Changes to compile with latest scalding... #533

Merged
jcoveney merged 1 commit into develop from bumpVersions on Jul 13, 2014

Conversation

ianoc
Collaborator

@ianoc ianoc commented Jul 13, 2014

....

jcoveney added a commit that referenced this pull request Jul 13, 2014
Bumps version of other projects. Changes to compile with latest scalding...
@jcoveney jcoveney merged commit eff3460 into develop Jul 13, 2014
@jcoveney jcoveney deleted the bumpVersions branch July 13, 2014 02:13
snoble pushed a commit to snoble/summingbird that referenced this pull request Sep 8, 2017
* Update CMSBenchmark for performance testing.

This commit copies the previous benchmark to TopCMSBenchmark (since
it was actually using the TopCMS implementation) and then converts
CMSBenchmark to benchmark CMS itself. It also standardizes and
simplifies the code a little bit to make it easier to see what's
happening, and removes some unused benchmark parameters.

On the author's machine (MBP retina), here are the current benchmark
results:

[info] Benchmark                         (delta)  (eps)  (size)   Mode  Cnt    Score     Error  Units
[info] CMSBenchmark.sumLargeBigIntCms  0.0000001    0.1    1000  thrpt    5   49.939 ±   1.152  ops/s
[info] CMSBenchmark.sumLargeBigIntCms  0.0000001  0.005    1000  thrpt    5   48.309 ±   2.453  ops/s
[info] CMSBenchmark.sumLargeStringCms  0.0000001    0.1    1000  thrpt    5   50.682 ±  14.959  ops/s
[info] CMSBenchmark.sumLargeStringCms  0.0000001  0.005    1000  thrpt    5   51.256 ±   3.108  ops/s
[info] CMSBenchmark.sumSmallBigIntCms  0.0000001    0.1    1000  thrpt    5  556.247 ±  34.430  ops/s
[info] CMSBenchmark.sumSmallBigIntCms  0.0000001  0.005    1000  thrpt    5  377.775 ±  34.519  ops/s
[info] CMSBenchmark.sumSmallLongCms    0.0000001    0.1    1000  thrpt    5  594.712 ±  26.725  ops/s
[info] CMSBenchmark.sumSmallLongCms    0.0000001  0.005    1000  thrpt    5  449.726 ± 110.672  ops/s

* Optimize the CMS implementation and machinery.

This commit does a number of things:

 1. Split CMSHasher into its own file.
 2. Optimize the CMS monoid's .sum and .sumOption.
 3. Reduce allocations for +, ++, frequency, etc.
 4. Fix CMS law-checking so it runs during testing.
 5. Speed up the CMS hashing functions.
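
As a rough illustration of items 2 and 3, the sketch below contrasts a pairwise fold, which allocates a new counter table for every merge, with a batch sum that copies the first table once and then adds the rest in place. It is not the code from this commit: it is written in Python rather than the project's Scala, it models a count-min sketch as a plain depth-by-width list of counters, and the names `Table`, `plus`, `sum_pairwise`, and `sum_option` are hypothetical.

```python
from typing import Iterable, List, Optional

# Assumption: a count-min sketch is modeled as depth rows x width columns of counters.
Table = List[List[int]]


def plus(a: Table, b: Table) -> Table:
    """Pairwise merge: allocates a fresh table on every call."""
    return [[x + y for x, y in zip(row_a, row_b)] for row_a, row_b in zip(a, b)]


def sum_pairwise(tables: Iterable[Table]) -> Optional[Table]:
    """Baseline: fold with plus, creating one intermediate table per merge."""
    acc = None
    for t in tables:
        acc = t if acc is None else plus(acc, t)
    return acc


def sum_option(tables: Iterable[Table]) -> Optional[Table]:
    """Batch sum: copy the first table once, then add the others in place."""
    acc = None
    for t in tables:
        if acc is None:
            # Single allocation; the input tables are never mutated.
            acc = [row[:] for row in t]
        else:
            for acc_row, row in zip(acc, t):
                for j, count in enumerate(row):
                    acc_row[j] += count
    # None for empty input, loosely mirroring an Option-style result.
    return acc
```

Both functions return the same counters; the batch version only changes how many intermediate tables are created, which is the kind of saving a throughput benchmark over many summed sketches would be expected to reflect.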

Running on the author's machine (MBP retina), after this commit the
CMS benchmarks are:

[info] Benchmark                         (delta)  (eps)  (size)   Mode  Cnt     Score      Error  Units
[info] CMSBenchmark.sumLargeBigIntCms  0.0000001    0.1    1000  thrpt    5   626.961 ±   32.379  ops/s
[info] CMSBenchmark.sumLargeBigIntCms  0.0000001  0.005    1000  thrpt    5   573.082 ±  186.829  ops/s
[info] CMSBenchmark.sumLargeStringCms  0.0000001    0.1    1000  thrpt    5   173.149 ±   64.034  ops/s
[info] CMSBenchmark.sumLargeStringCms  0.0000001  0.005    1000  thrpt    5   146.868 ±  136.613  ops/s
[info] CMSBenchmark.sumSmallBigIntCms  0.0000001    0.1    1000  thrpt    5  1369.887 ±  188.736  ops/s
[info] CMSBenchmark.sumSmallBigIntCms  0.0000001  0.005    1000  thrpt    5  1144.827 ±  238.539  ops/s
[info] CMSBenchmark.sumSmallLongCms    0.0000001    0.1    1000  thrpt    5  7998.298 ± 6520.702  ops/s
[info] CMSBenchmark.sumSmallLongCms    0.0000001  0.005    1000  thrpt    5  4708.305 ± 1749.729  ops/s

* Clean up TopCMS benchmark.

This removes the (now unnecessary) import for CMSHasher instances.
It also cleans things up a little bit.

* Improve TopCMS performance.

* Respond to review comments.

Specifically, this removes some dead code (which we had added
to tests), fixes an out-of-date comment, and reduces the range
of integers we generate to ensure more collisions.

* Revert to the older CMSHasher[BigInt] implementation.

I still would like to improve this, but will do so in a follow-on PR.
2 participants