Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Proposal: Add a benchmark. #235

Open
martinfrances107 opened this issue Apr 18, 2021 · 5 comments
Open

Proposal: Add a benchmark. #235

martinfrances107 opened this issue Apr 18, 2021 · 5 comments

Comments

@martinfrances107
Copy link
Contributor

In my spare time I am porting this module over to rust ( rustlang ) so I can have a wasm binary that I run in the browser.

As I work I am judging my process by how many test I have ported over .. but now I need something else... I need to start profiling. ( it is still very much pre-alpha level code but it is showing signs of working.)

What I need is a benchmark .. that is

The simplest code that exercises the largest amount of code paths a produces a number so that I can measure improvement or degradations in performance.

Also what I need is a javascript equivalent for comparison.

So , I have converted this tutorial

https://www.d3-graph-gallery.com/graph/backgroundmap_canvas_basic.html

to run d3-geo with the orthographic projection.

I have a PR .. that I am about to post.

In terms of discussion ... perhaps future work will have the projection selected from a pull-down menu

I am open to suggestions .. If anyone has an idea for other benchmarks ... I am wiling to do the work.

Currently the RenderTime on my laptop for the benchmark is 64ms

If the rust code runs 10 times faster, as I hope, then we can have a detailed globe spinning well below the 16ms needed for jank free animation.

martinfrances107 added a commit to martinfrances107/d3-geo that referenced this issue Apr 18, 2021
martinfrances107 added a commit to martinfrances107/d3-geo that referenced this issue Apr 18, 2021
martinfrances107 added a commit to martinfrances107/d3-geo that referenced this issue Apr 18, 2021
@jrus
Copy link
Contributor

jrus commented Apr 19, 2021

The time spent in projecting a map is a combination of function call overhead, expensive arithmetic (mostly a lot of trigonometric function evaluations), and probably some branch mispredictions. Some parts of the code can be more careful about avoiding unnecessary branches, but I would expect the other two to be more significant.

If you want to speed everything up, you probably first want to re-architect the code to do tight loops of pure arithmetic over large homogeneous arrays of data, re-using already allocated blocks of memory whenever possible. Switching programming languages may make that more convenient, but it can also be done in javascript if you are careful and write in a low-level C style (the javascript JITs should usually do a pretty good job with such code). But it is a substantially different code style than the current "stream" concept.

Breaking data into chunks and processing them in parallel on different CPU cores would also speed everything up roughly linearly with the number of cores you have available. (Unfortunately SharedArrayBuffer has been disabled in some browsers for security reasons.) I’m not sure what the latest wasm multicore story is.

Second, once you have cut the overhead down, you can save most of the trigonometry (in some map projections a bit of trig is still unavoidable, but fast approximate trig functions could be used) by switching from using longitude/latitude to using stereographic coordinates as a canonical internal data representation (I have been working out the details in the past couple years at https://observablehq.com/@jrus/planisphere). You can save a large amount of computation on rotation, interpolation, clipping, etc.

Once wasm reliably supports SIMD, adopting SIMD could speed some parts of the pipeline by another 2x or more, depending on how wide the SIMD support is. But every other kind of optimization should theoretically be accessible from plain javascript.


If you just want a spinning globe in orthographic projection and don't need arbitrary map projections etc., you can make something more specialized than d3-geo which will be a whole lot faster. You can use lower precision arithmetic, switch everything to 3d cartesian coordinates, and do rotation by a matrix product. You could even use a GPU.

@martinfrances107
Copy link
Contributor Author

I think I should start with a alternate explanation of why I think the benchmark is valid.

I want to spend 90% of my time writing application level code...

d3 is a great eco-system for this .. from application to application I say can just bolt one of the many other packages as needed.

( I am simultaneously porting
https://github.com/martinfrances107/rust_d3_delaunay
https://github.com/martinfrances107/rust_d3_geo
https://github.com/martinfrances107/rust_d3_geo_voronoi
)

As you say I could refactor the loops, suffer the browser incompatibilities of shared buffer arrays. and write GPU centric code ... but I that goes against the spirit of a rapid development.

Let me express my expectations ...

if I port to rust then my expectation is to a raw x10 speedup...
but my experience is different the FFI between rust and javascript is a major bottleneck

if for example as I think will happen in this benchmark : -
Approximately 240kb will be passed down into the rust library via the shared memory buffer to rust and then 240kb of data will be passed up again via the shared memory buffer ..

Well If you don't take the time to mitigate the bottlenecks then speedup will drop to only x2.
( if the output is via a "2d" context rather than rendering to SVG the data can be drawn on the screen without the return data path)

From my blinkered rapid-development mindset .. if I can rewrite a small part of my call to call the equivalent FFI function ... and just get a doubling in speed. then I don't care about the waste. So even writing to an SVG and suffering the bottleneck is maybe valid.

I can only do that will confidence if I know there is : -

The API's look identical ..
The underlying implementations are s close a possible.
A one-to-one mapping between d3's javascript tests and the rust equivalent.

Anyway that is what I mean by port.

@jrus
Copy link
Contributor

jrus commented Apr 19, 2021

Sorry, I didn’t mean to suggest that making a benchmark is ‘invalid’. Indeed it sounds pretty useful: Go for it!

I would be pretty surprised if you see anything close to 10x improvement if you port the code 1:1 to Rust. You might even see a slowdown. But indeed, try and see!

If you want to significantly beat the current version, I think the place to start is putting all of the data in a single large homogeneous data array (think of a numpy or Matlab type array here), and then doing pure arithmetic in simple loops. You'll save overhead from chasing down pointers, allocating small arrays, nested function calls in the inner loops, and so on. I could imagine getting a several times speedup from that, though perhaps not 10x. (Again, someone would have to try to find out.) This could be done from Javascript if you write in a low-level C style; Rust is probably a more convenient language for writing at higher level and still emitting tight low-overhead assembly for numerical workloads. (It’s also a lot easier in a compiled language to avoid some of the traps that can accidentally make javascript jump out of the fully JITed version and start running an order of magnitude slower than necessary.)

After that the bottleneck becomes expensive arithmetic, mostly a bunch of trigonometry.

Fully-JITed Javascript can be pretty darn fast though. See e.g. this recent thing, https://surma.dev/things/js-to-asc/

@martinfrances107
Copy link
Contributor Author

The benchmark I want to add to this module ready...

#236

I really don't want to refactor my set of applications into performance tuned loops...

Can I speak up for the modular approach of the cached stream pipeline

  projection.stream = function(stream) {
    return cache && cacheStream === stream ? cache : cache = transformRadians(transformRotate(rotate)(preclip(projectResample(postclip(cacheStream = stream)))));
  };

src/projection/index.js

I think this is a well considered approach ( all credit to whoever wrote it ) and once you unravelled it a bit and can see the in-depth molecularity it becomes easy to maintain and debug ... it looks like enterprise grade library code.
I have learnt to appreciate easy to maintain and debug over performance ... it is easier to trust it.

I appreciate good code like this takes hundreds of hours to get right...
So if I port that code .. I am getting that goodness for free.

Regarding rust

If you read anything about rust and its zero cost abstraction ...which is new compared to C

If all type information is determined statically .. ie known at compile time .. then the compiler's optimiser can remove the branch checks on certain "if-then" statements.

I just want to say the modular approach fits in really nicely with the "zero-cost" language improvement.
So I am expecting good performance and good readability

@jrus
Copy link
Contributor

jrus commented Apr 20, 2021

easy to maintain and debug ... it looks like enterprise grade library code.
I have learnt to appreciate easy to maintain and debug over performance

Yeah, Mike Bostock makes excellent modular code, and d3-geo is certainly elegant, carefully thought-out, and pleasant to work with.

It’s just a different style than often seen in e.g. scientific computing, image processing, or video game code where people are aiming for peak numerical performance or have tight per-frame compute budgets.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Development

No branches or pull requests

2 participants