Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Would a smallvec optimization be useful? #88

Open
adeschamps opened this issue Feb 7, 2018 · 4 comments
Open

Would a smallvec optimization be useful? #88

adeschamps opened this issue Feb 7, 2018 · 4 comments

Comments

@adeschamps
Copy link

Here are the results of a failed experiment.

Currently, the Position type is defined as type Position = Vec<f64>;. Since the majority of instances of Position are going to be 2D or 3D points, this seems like a good candidate for a SmallVec optimization.

The memory use (stack, heap, and total) for 2D and 3D points with various storage types are:

Vec SmallVec<[f64; 2]> SmallVec<[f64; 3]>
2D 24s + 16h = 40 32s + 0h = 32 40s + 0h = 40
3D 24s + 24h = 48 32s + 24h = 56 40s + 0h = 40

So, compared to Vec, a SmallVec with a backing store of [f64; 3] is the same size for 2D points, smaller for 3D points, and never uses the heap. If you go past 3 elements, then it heap allocates and there is a constant overhead of 16 bytes per point, but I think that's a rare use case.

This seemed like a no-brainer, so I did some benchmarks, and unfortunately it didn't make a noticeable difference. If you're interested, you can check out the smallvec branch of my fork. Note that the unit tests don't pass, since it's a breaking API change.

# Benchmark current implementation:
$ cargo bench --bench encode --bench decode
# Benchmark with type Position = SmallVec<[f64; 3]>:
$ cargo bench --bench encode --bench decode --features smallvec

In conclusion, it would be a breaking API change for no apparent gain, but since I measured it I figured it's worth sharing.

@adeschamps
Copy link
Author

For what it's worth, I'm curious why there was no apparent performance gain. My best guess is that when lots of small allocations are made in quick succession you still end up with good cache locality, so the benefits of smallvec are kind of null.

@KodrAus
Copy link
Contributor

KodrAus commented Feb 12, 2018

Thanks for sharing this @adeschamps! I guess we'll need to give it a prod with perf to see what proportion of the time is spent allocating Vecs for positions.

Your changes do highlight something important though: we probably shouldn't use a type alias for Position. We should be able to change the underlying storage without requiring a breaking change.

@frewsxcv
Copy link
Member

closing in favor of #130 and #131 which has some promising performance improvements

@michaelkirk
Copy link
Member

michaelkirk commented Aug 26, 2022

I saw some modest improvements (~5-10%) to serialization/deserialization with my tinyvec branch. And much bigger improvements (78%) with strictly geometry parsing, which I suppose makes sense, since this only affects geometry parsing.

bench: `type Postion=TinyVec<[f64; 2]>`
parse (countries.geojson)                                                                                                                                                    
                        time:   [1.6394 ms 1.6474 ms 1.6557 ms]                                                                                                              
                        change: [-14.271% -13.882% -13.521%] (p = 0.00 < 0.05)                                                                                               
                        Performance has improved.                                                                                                                            
                                                                                                                                                                             
FeatureReader::features (countries.geojson)                                                                                                                                  
                        time:   [5.6254 ms 5.6481 ms 5.6772 ms]                                                                                                              
                        change: [-3.8240% -3.4138% -2.9515%] (p = 0.00 < 0.05)                                                                                               
                        Performance has improved.                                                                                                                            
Found 11 outliers among 100 measurements (11.00%)                                                                                                                            
  5 (5.00%) high mild                                                                                                                                                        
  6 (6.00%) high severe                                                                                                                                                      
                                                                                                                                                                             
FeatureReader::deserialize (countries.geojson)                                                                                                                               
                        time:   [6.9135 ms 6.9234 ms 6.9338 ms]                                                                                                              
                        change: [-3.4371% -3.2617% -3.0886%] (p = 0.00 < 0.05)                                                                                               
                        Performance has improved.                                                                                                                            
Found 5 outliers among 100 measurements (5.00%)                                                                                                                              
  4 (4.00%) high mild                                                                                                                                                        
  1 (1.00%) high severe                                                                                                                                                      
                                                                                                                                                                             
FeatureReader::deserialize to geo-types (countries.geojson)                                                                                                                  
                        time:   [6.9075 ms 6.9218 ms 6.9378 ms]                                                                                                              
                        change: [-3.8702% -3.6317% -3.3951%] (p = 0.00 < 0.05)                                                                                               
                        Performance has improved.                                                                                                                            
Found 8 outliers among 100 measurements (8.00%)                                                                                                                              
  6 (6.00%) high mild                                                                                                                                                        
  2 (2.00%) high severe                                                                                                                                                      
                                                                                                                                                                             
parse (geometry_collection.geojson)                                                                                                                                          
                        time:   [15.260 us 15.277 us 15.296 us]                                                                                                              
                        change: [-7.0382% -6.8279% -6.6148%] (p = 0.00 < 0.05)                                                                                               
                        Performance has improved.                                                                                                                                           
                                                                                                                                                                             
     Running benches/serialize.rs (target/release/deps/serialize-f64dc76123159536)                                                                                           
Gnuplot not found, using plotters backend                                                                                                                                    
serialize geojson::FeatureCollection struct (countries.geojson)                                                                                                              
                        time:   [2.9583 ms 2.9651 ms 2.9724 ms]                                                                                                              
                        change: [-6.8222% -6.5692% -6.3168%] (p = 0.00 < 0.05)                                                                                               
                        Performance has improved.                                                                                                                            
Found 1 outliers among 100 measurements (1.00%)                                                                                                                              
  1 (1.00%) high mild                                                                                                                                                        
                                                                                      
serialize custom struct (countries.geojson)                                                                                                                                  
                        time:   [2.2263 ms 2.2307 ms 2.2356 ms]                       
                        change: [-8.0275% -7.8191% -7.5871%] (p = 0.00 < 0.05)        
                        Performance has improved.                                     
Found 11 outliers among 100 measurements (11.00%)                                                                                                                            
  8 (8.00%) high mild                                                                 
  3 (3.00%) high severe                                                               
                                                                                      
serialize custom struct to geo-types (countries.geojson)                                                                                                                     
                        time:   [2.2887 ms 2.2928 ms 2.2970 ms]                       
                        change: [-15.768% -15.578% -15.378%] (p = 0.00 < 0.05)        
                        Performance has improved.  

Geometry parsing

quick_collection        time:   [72.078 us 72.556 us 73.132 us]                                                                                                              
                        change: [-78.177% -78.042% -77.875%] (p = 0.00 < 0.05)        
                        Performance has improved.                           

Our benches aren't necessarily representative of real-world-use, but I think this is worth re-investigating.

One reason to consider it separately from #130, #131, is that tinyvec (and probably small_vec, I don't really know the difference) is pretty much a drop-in replacement. It would still be a breaking change for anyone relying on the type of Position to be a Vec, but in terms of functionality, people can still parse GeoJSON with 17-Dimensional points or whatever 🙃.

I think ultimately a more targeted approach where someone can say "I'm only ever dealing with 2-d geometry, so just use [f64; 2]" could be optimal for those cases. And indeed, I tested that too, and I saw a bit additional improvement beyond what was possible with tinyvec (another 1-3%) and, again, a larger improvement (another 30%) for strictly geometry parsing), but I feel like with tiny_vec we avoid most of the cost (code changes are straight forward, no loss in functionality) and we get most of the speedup.

bench: `type Postion = [f64; 2]` vs. tiny_vec
parse (countries.geojson)                                                                                                                                                    
                        time:   [1.6047 ms 1.6109 ms 1.6167 ms]                                                                                                              
                        change: [-2.5399% -2.1493% -1.7514%] (p = 0.00 < 0.05)                                                                                               
                        Performance has improved.  
Found 11 outliers among 100 measurements (11.00%)                                                                                                                            
  7 (7.00%) low mild                                                                                                                                                         
  2 (2.00%) high mild                                                                                                                                                        
  2 (2.00%) high severe                                                                                                                                                      
                                                                                                                                                                             
FeatureReader::features (countries.geojson)                                                                                                                                  
                        time:   [5.6560 ms 5.6640 ms 5.6725 ms]                                                                                                              
                        change: [-0.2560% +0.2813% +0.7181%] (p = 0.27 > 0.05)                                                                                               
                        No change in performance detected.                                                                                                                   
Found 10 outliers among 100 measurements (10.00%)                                                                                                                            
  10 (10.00%) high mild                                                                                                                                                      
                                                                                                                                                                             
FeatureReader::deserialize (countries.geojson)                                                                                                                               
                        time:   [6.8874 ms 6.9013 ms 6.9162 ms]                                                                                                              
                        change: [-0.5756% -0.3192% -0.0535%] (p = 0.01 < 0.05)                                                                                               
                        Change within noise threshold.                                                                                                                       
Found 8 outliers among 100 measurements (8.00%)                                                                                                                              
  7 (7.00%) high mild                                                                                                                                                        
  1 (1.00%) high severe                                                                                                                                                      
                                                                                                                                                                             
FeatureReader::deserialize to geo-types (countries.geojson)                                                                                                                  
                        time:   [6.8948 ms 6.9070 ms 6.9204 ms]                                                                                                              
                        change: [-0.4976% -0.2135% +0.0816%] (p = 0.15 > 0.05)                                                                                               
                        No change in performance detected.                                                                                                                   
Found 7 outliers among 100 measurements (7.00%)                                                                                                                              
  5 (5.00%) high mild                                                                                                                                                        
  2 (2.00%) high severe                                                                                                                                                      
                                                                                                                                                                             
parse (geometry_collection.geojson)                                                                                                                                          
                        time:   [15.168 us 15.186 us 15.206 us]                                                                                                              
                        change: [-1.0050% -0.7554% -0.5476%] (p = 0.00 < 0.05)                                                                                               
                        Change within noise threshold.                                                                                                                       
Found 13 outliers among 100 measurements (13.00%)                                                                                                                            
  5 (5.00%) low severe                                                                                                                                                       
  2 (2.00%) high mild                                                                                                                                                        
  6 (6.00%) high severe                                                                                                                                                      
                                                                                                                                                                             
     Running benches/serialize.rs (target/release/deps/serialize-f64dc76123159536)  
serialize geojson::FeatureCollection struct (countries.geojson)                                                                                                              
                        time:   [2.8957 ms 2.9020 ms 2.9086 ms]                                                                                                              
                        change: [-2.4377% -2.1261% -1.8135%] (p = 0.00 < 0.05)                                                                                               
                        Performance has improved.                                                                                                                            
Found 14 outliers among 100 measurements (14.00%)                                                                                                                            
  9 (9.00%) low mild                                                                                                                                                         
  3 (3.00%) high mild                                                                                                                                                        
  2 (2.00%) high severe                                                                                                                                                      
                                                                                                                                                                             
serialize custom struct (countries.geojson)                                                                                                                                  
                        time:   [2.2066 ms 2.2115 ms 2.2170 ms]                                                                                                              
                        change: [-1.1709% -0.8626% -0.5750%] (p = 0.00 < 0.05)                                                                                               
                        Change within noise threshold.                                                                                                                       
Found 7 outliers among 100 measurements (7.00%)                                                                                                                                3 (3.00%) high mild                                                                                                                                                          4 (4.00%) high severe                                                               
                                                                                      
serialize custom struct to geo-types (countries.geojson)                                                                                                                     
                        time:   [2.2413 ms 2.2436 ms 2.2459 ms]                                                                                                              
                        change: [-2.3487% -2.1464% -1.9416%] (p = 0.00 < 0.05)                                                                                               
                        Performance has improved.                                     
Found 6 outliers among 100 measurements (6.00%)                                                                                                                              
  3 (3.00%) low mild                                                                  
  3 (3.00%) high mild                                                                                                                                                                                                                                                                                                                                          Running benches/to_geo_types.rs (target/release/deps/to_geo_types-1acf3672c00f0e60)                                                                                     Gnuplot not found, using plotters backend                                                                                                                                    quick_collection        time:   [51.479 us 51.549 us 51.630 us]                                                                                                              
                        change: [-29.677% -29.207% -28.760%] (p = 0.00 < 0.05)                                                                                               
                        Performance has improved.                                                                                                                            
Found 5 outliers among 100 measurements (5.00%)                                                                                                                              
  3 (3.00%) high mild                                                                                                                                                        
  2 (2.00%) high severe                                                                                                                                                      
                             

@michaelkirk michaelkirk reopened this Aug 26, 2022
@michaelkirk michaelkirk mentioned this issue Sep 2, 2022
2 tasks
bors bot added a commit that referenced this issue Sep 6, 2022
206: stream reading features r=frewsxcv a=michaelkirk

- [x] I agree to follow the project's [code of conduct](https://github.com/georust/geo/blob/master/CODE_OF_CONDUCT.md).
- [x] I added an entry to `CHANGES.md` if knowledge of this change could be valuable to users.
---

This greatly reduces the peak memory usage needed to parse a GeoJSON FeatureCollection by utilizing the FeatureIterator's trick of seeking to the `"features": [` array without actually parsing the FeatureCollection. Note there are [some robustness issues](#207) with this approach, some of which can be improved, but overall I think it's worth the trade-off.

There is a small (0-1%) CPU regression in our deserialization bench due to this change, but the peak memory profile of the `stream_reader_writer` examples decreased 90% - from 2.22MiB to 237Kib.

Before:
<img width="1624" alt="Screen Shot 2022-09-02 at 2 42 04 PM" src="https://user-images.githubusercontent.com/217057/188239657-ade76677-f3e2-4750-b71d-bed0effbe215.png">

After:
<img width="1624" alt="Screen Shot 2022-09-02 at 2 40 29 PM" src="https://user-images.githubusercontent.com/217057/188239645-5274e0ff-71b2-4da6-8ac9-1f72d288c2bb.png">

Notice that overall memory allocations stayed the same (see #88 (comment)), but the peak usage is much lower since we never hold the entire data structure in memory at once.


Co-authored-by: Michael Kirk <[email protected]>
@michaelkirk michaelkirk changed the title Would a smallvec optimization be useful? (my benchmarks say it wouldn't) Would a smallvec optimization be useful? Feb 28, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

4 participants