-
-
Notifications
You must be signed in to change notification settings - Fork 35.3k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
exr piz wavelet decompression #13346
Conversation
This is awesome! I am concerned it is slow code though, this function, wdec14, is in an inner loop and does two new calls: https://github.com/mrdoob/three.js/pull/13346/files#diff-8c3168ce268d9c60ad2c1e022e579e3aR307 Remember that you can create views into the same data set. It may be possible to use a preallocated view and just access it repeatedly. you can also create multiple views of the same data set. I believe that such a strategy will speed this code up like 10x. Same with parseUint16 - use a preallocated view and then calculate offsets into it. It should be straight forward to implement this and a huge time saving. Also remember there is already a half conversion function here that was used for HDR half conversion: https://github.com/mrdoob/three.js/blob/master/examples/js/loaders/HDRCubeTextureLoader.js#L27 But I guess this is the wrong way. There was a half class a while back, but I guess it got refactored out along the way, which is too bad. |
examples/js/loaders/EXRLoader.js
Outdated
|
||
if ( EXRHeader.channels[ channelID ].pixelType == 1 ) { | ||
var val = parseFloat16( buffer, offset ); | ||
var cOff = channelOffsets[ EXRHeader.channels[ channelID ].name ]; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
move cOff outside of inner loop.
examples/js/loaders/EXRLoader.js
Outdated
// HALF | ||
for ( var x = 0; x < width; x ++ ) { | ||
|
||
var cOff = channelOffsets[ EXRHeader.channels[ channelID ].name ]; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
move cOff outisde of inner loop.
Hey @bhouston, thank you for your comments! I have taken your advice and refactored in a few places to either use a function for conversion, or preallocated a DataView in an effort to improve the performance. Using your suggestions, I have reduced the parsing time from ~2149ms to ~327ms. Even more can be done, but I think this puts us in acceptable performance bracket to start with. Regarding the shared half conversion, I definitely agree we don't want to repeat this type of code all over the place, and that we should put this into some kind of class or helper tool chain, but perhaps that can come in a follow up refactor, since it would require also touching the code in other examples? |
One more thought: as it stands, I am passing around ArrayBuffer/DataView along with a separate offset. I think this could be cleaned up even more by introducing a small object to carry the buffer and offset together, which hopefully I can do in a clean up refactor follow up. |
Nice performance improvement. I would go a little further and instead replace this pattern (Which applies to others such as parseFloat16, etc):
Replace the above with the following preallocated DataView strategy. It should remove nearly all allocations while being fairly simple code.
This will get rid of nearly all the rest of the allocations.... |
Sure, but I wouldn't worry about it. Creating objects is the slowest thing in JavaScript by far. In fact often destructuring temporary objects into just individual primitive types is an optimization in JavaScript. Thus I would not try to clean up the code by introducing temporary JavaScript objects, even if they are created like You basically never want to touch the GC when writing performance oriented JavaScript code -- which is what all my feedback is in regards to. |
Went back over the code again, and I've got the time down yet again to ~240ms, from ~327ms. I've got a single preallocated Note the profiler seems to show things a bit slower than the actual run time. I saw the 🔥hot path is actually around I also explored some variations where instead of doing |
I know I am being annoying but two more changes will basically fix the last remaining memory issues: Replace this pattern with this:
The above pattern of passing in a single allocated object to get the results of the call, instead of returning a new JAvaScript object on each innovation should be a big speed up. This pattern can be applied to wdec15, getChar and getCode -- all of which are in your inner loops. |
The last remaining issues for memory are these - these are just killer costly and unnecessary: three.js/examples/js/loaders/EXRLoader.js Line 308 in 090ce2f
three.js/examples/js/loaders/EXRLoader.js Line 278 in 090ce2f
three.js/examples/js/loaders/EXRLoader.js Line 145 in 090ce2f
three.js/examples/js/loaders/EXRLoader.js Line 335 in 090ce2f
|
I see 'em, will fix 👍 |
Updated! |
@bhouston looks good? |
Hoping to follow this one up using this to implement IBL with latlong/equiangular EXR HDR light probes! |
Will merge. If @bhouston finds something we can tweak afterwards. |
Thanks! |
Adds support for reading PIZ wavelet compressed EXR images (in addition to the previously supported uncompressed reader.) It's been quite the task, but I have worked back from the C/C++ reference OpenEXR and TinyEXR implementations, making the appropriate changes and translations into Javascript. (Ever converted pointer arithmetic to a language that only has floating point data type?! 😨) I'd like to keep following this up with better handling for various channel configs, 32bit floating point, etc. and more edge cases, but I think this is useful enough to check it in, and hopefully others can also help out, since the major hurdle is passed.
@mrdoob @WestLangley
related #10652