SIMD integration #217
@phaazon are you still interested in doing this? I think @sebcrozet has been busy lately, but I would not mind looking at your code and eventually merging it. I assume this is not going to change any public API, but only change implementation details to make use of SIMD, right? It would be nice to go into a bit more detail about what you have in mind.
Hi. No, I’m not. I should have updated the issue. nalgebra has changed too much and I don’t like the new design (the infinite type aliasing based on |
I've actually been working a lot on the performance of nalgebra lately (so much that I have not been actively maintaining existing issues/PRs). My observation is that explicit SIMD integration is currently not worth the effort, so I'd prefer to postpone this until SIMD becomes stable in Rust. I'll communicate more about this next week, but the next major version of nalgebra will be as fast as the SIMD version of cgmath (except that we don't need to use SIMD intrinsics nor the
I’ll be waiting for the benchmarks ;)
What is the unoptimized performance like, though? One advantage of explicit SIMD might be faster debug builds.
@brendanzab You're right, the performance difference is significant for debug builds (SIMD cgmath is faster):
test lowdim::inverse::mat2_inverse_cgmath ... bench: 618 ns/iter (+/- 24)
test lowdim::inverse::mat2_inverse_na ... bench: 2,293 ns/iter (+/- 1,027)
test lowdim::inverse::mat3_inverse_cgmath ... bench: 2,282 ns/iter (+/- 768)
test lowdim::inverse::mat3_inverse_na ... bench: 4,793 ns/iter (+/- 853)
test lowdim::inverse::mat4_inverse_cgmath ... bench: 13,300 ns/iter (+/- 404)
test lowdim::inverse::mat4_inverse_na ... bench: 17,899 ns/iter (+/- 2,758)
test lowdim::product::mat2_mul_mat2_cgmath ... bench: 848 ns/iter (+/- 156)
test lowdim::product::mat2_mul_mat2_na ... bench: 5,468 ns/iter (+/- 426)
test lowdim::product::mat2_mul_vec2_cgmath ... bench: 499 ns/iter (+/- 111)
test lowdim::product::mat2_mul_vec2_na ... bench: 2,924 ns/iter (+/- 318)
test lowdim::product::mat3_mul_mat3_cgmath ... bench: 1,857 ns/iter (+/- 27)
test lowdim::product::mat3_mul_mat3_na ... bench: 12,486 ns/iter (+/- 1,639)
test lowdim::product::mat3_mul_vec3_cgmath ... bench: 636 ns/iter (+/- 12)
test lowdim::product::mat3_mul_vec3_na ... bench: 4,378 ns/iter (+/- 165)
test lowdim::product::mat4_mul_mat4_cgmath ... bench: 2,126 ns/iter (+/- 639)
test lowdim::product::mat4_mul_mat4_na ... bench: 26,624 ns/iter (+/- 5,408)
test lowdim::product::mat4_mul_vec4_cgmath ... bench: 1,123 ns/iter (+/- 108)
test lowdim::product::mat4_mul_vec4_na ... bench: 7,282 ns/iter (+/- 1,195)
test lowdim::product::vec2_dot_vec2_cgmath ... bench: 155 ns/iter (+/- 27)
test lowdim::product::vec2_dot_vec2_na ... bench: 982 ns/iter (+/- 113)
test lowdim::product::vec3_dot_vec3_cgmath ... bench: 121 ns/iter (+/- 29)
test lowdim::product::vec3_dot_vec3_na ... bench: 1,439 ns/iter (+/- 133)
test lowdim::product::vec4_dot_vec4_cgmath ... bench: 119 ns/iter (+/- 17)
test lowdim::product::vec4_dot_vec4_na ... bench: 1,898 ns/iter (+/- 183)
Here is the same benchmark with optimizations turned on:
I will post the source code of those benchmarks at the same time as my communication next week.
Okay, given that optimizations take care of this anyway, I would question whether explicit SIMD code is worth the additional code complexity.
@aepsil0n I don't think the additional code complexity is worth it for now, so I'm closing this. We might want to revisit the question when SIMD becomes stable in Rust. Here is the communication I was mentioning earlier in this thread: #274.
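To illustrate the trade-off discussed in this thread, here is a minimal sketch of the kind of plain scalar code that relies on the optimizer rather than on SIMD intrinsics. It is not taken from nalgebra or cgmath; the function names `dot4` and `mat4_mul_vec4`, the array-based types, and the column-major layout are all assumptions made for illustration. In an optimized build the compiler may be able to vectorize code like this on its own, whereas in a debug build (opt-level 0) it is compiled as written, which is consistent with the gap seen in the debug benchmarks above.

```rust
/// Dot product of two 4-component vectors, written as plain scalar code.
/// Hypothetical helper for illustration; not part of nalgebra or cgmath.
pub fn dot4(a: [f32; 4], b: [f32; 4]) -> f32 {
    a[0] * b[0] + a[1] * b[1] + a[2] * b[2] + a[3] * b[3]
}

/// Product of a 4x4 matrix and a 4-component vector.
/// Assumption: the matrix is stored column-major, so `m[col][row]`.
pub fn mat4_mul_vec4(m: [[f32; 4]; 4], v: [f32; 4]) -> [f32; 4] {
    let mut out = [0.0f32; 4];
    // Accumulate each column scaled by the matching vector component.
    // No intrinsics are used; any vectorization is left to the optimizer.
    for col in 0..4 {
        for row in 0..4 {
            out[row] += m[col][row] * v[col];
        }
    }
    out
}
```

Calling `mat4_mul_vec4` with an identity matrix returns the input vector unchanged. Whether a given loop is actually vectorized depends on the compiler version, target, and optimization level, so measurements like the ones in this thread remain the only reliable guide.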
Hey, not sure if it's alright to revive this issue or if I should be posting a new one, but I wanted to poke this topic now that so much time has passed. I've recently rerun some of the benchmarks that appeared earlier in this thread, and the performance difference between this crate and cgmath is still staggering: Especially at opt-level 0, the difference is pretty huge. I've been working with the amethyst engine a bit, and we've been getting pretty significant performance drops in debug mode, which is probably related to this. SIMD hasn't fully stabilized, but it has come a long way, and I feel it is time to take another look at this.
Thank you for sharing your concerns @happenslol! I have created a new issue to discuss this.
Ohai!
Are you interested in having SIMD support? Because I truly am, and I might work on it and push a PR if you feel it’s an interesting feature to add to nalgebra.