Pr/batched gen ops #1413

Merged

Conversation

AlexMWells
Contributor

Description

Implement batched llvm code generation for ops:

  • andor
  • bitwise_binary_op
  • clamp
  • get_simple_SG_field
  • isconstant
  • select
  • unary_op
  • mix

NOTE: clamp may be unreachable because stdosl.h removed the builtin in 2010, but unless clamp is removed as an operation throughout the oso and scalar implementations, I thought it should be here.

NOTE: andor may be unreachable because the "and" and "or" built-ins were removed in commit f7e8ed3 (May 2011, release 0.5.4), but unless "and" and "or" are removed as operations throughout the oso and scalar implementations, I thought it should be here.

Fixed a bug in BatchedAnalysis where the complement operator was treated as always having a boolean result. In reality, the result of complement or any other bitwise operation is boolean only if its input parameters were forced to be boolean (left a TODO note for future improvement).
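
To illustrate why the result type of complement matters, here is a minimal sketch. It is not the BatchedAnalysis code, and both function names are hypothetical. It only shows that complementing a full-width integer and complementing a value held as a boolean give different truth values for the same input.

```cpp
#include <cassert>

// Complement of a full-width int flips every bit.
// For an input of 1 the result is -2, which is still nonzero ("true").
inline int complement_as_int(int v) { return ~v; }

// If the value is held as a single boolean, complement behaves like
// logical not, so an input of true becomes false.
inline bool complement_as_bool(bool v) { return !v; }
```

Assuming the result is always boolean would therefore change the outcome whenever the operand is an ordinary integer.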

Fixed a bug in printf where an integer that is forced_llvm_bool() and is formatted as a float was not converted to an integer first.

Tests

Enabled BATCHED execution for tests:

  • and-or-not-synonyms
  • isconstant
  • logic
  • select

Expanded testsuite/shaderglobals to exercise/access all shader global data members.

Add regression tests:

  • andor-reg
  • bitwise-and-reg
  • bitwise-or-reg
  • bitwise-shl-reg
  • bitwise-shr-reg
  • bitwise-xor-reg
  • complement-reg
  • mix-reg
  • select-reg

Checklist:

  • I have read the contribution guidelines.
  • I have previously submitted a Contributor License Agreement.
  • I have updated the documentation, if applicable.
  • I have ensured that the change is tested somewhere in the testsuite (adding new test cases if necessary).
  • My code follows the prevailing code style of this project.

Added batched llvm code gen for: llvm_gen_blackbody, llvm_gen_luminance, llvm_gen_transformc (including code gen of loop to identify unique "from" and "to" space combinations to call library function with).
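
The idea of looping over lanes to find unique "from" and "to" space combinations can be sketched in plain C++. This is only an illustration of the grouping technique, not the generated LLVM code. The batch width, the `SpaceIds` and `Mask` types, and `group_by_spaces` are all hypothetical names introduced here.

```cpp
#include <array>
#include <cassert>
#include <cstdint>
#include <vector>

constexpr int kWidth = 8;  // hypothetical batch width

// Hypothetical stand-ins: one space id per lane, and a lane mask stored as bits.
using SpaceIds = std::array<int, kWidth>;
using Mask     = std::uint32_t;

struct Group {
    int from;
    int to;
    Mask lanes;  // lanes that share this (from, to) pair
};

// Partition the active lanes into groups sharing a (from, to) pair, so a
// library function could be called once per group instead of once per lane.
inline std::vector<Group> group_by_spaces(const SpaceIds& from,
                                          const SpaceIds& to, Mask active)
{
    std::vector<Group> groups;
    // Ignore any bits beyond the batch width.
    Mask remaining = active & ((Mask(1) << kWidth) - 1u);
    while (remaining != 0) {
        // Use the lowest remaining lane as the representative of a new group.
        int lead = 0;
        while (((remaining >> lead) & 1u) == 0)
            ++lead;
        assert(lead < kWidth);
        Group g { from[lead], to[lead], 0u };
        // Collect every remaining lane with the same (from, to) pair.
        for (int lane = lead; lane < kWidth; ++lane) {
            bool is_remaining = ((remaining >> lane) & 1u) != 0;
            if (is_remaining && from[lane] == g.from && to[lane] == g.to)
                g.lanes |= (Mask(1) << lane);
        }
        remaining &= ~g.lanes;
        groups.push_back(g);
    }
    return groups;
}
```

Each resulting group carries a mask, so a masked library call would touch only the lanes that share that pair. When all lanes use the same spaces, the loop runs once.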

Made ColorSystem const correct.
Pulled much of ColorSystem's implementation into src/liboslexec/opcolor_impl.h so it could be inlined in both opcolor.cpp and wide_opcolor.cpp.
To enable SIMD fast path for ColorSystem::blackbody_rgb added new methods to expose underlying lookup table optimization.
    bool can_lookup_blackbody(float T /*Kelvin*/) const;

    Color3 lookup_blackbody_rgb (float T /*Kelvin*/) const;

    Color3 compute_blackbody_rgb (float T /*Kelvin*/) const;
When can_lookup_blackbody() returns true
then lookup_blackbody_rgb can be safely called,
otherwise the expensive compute_blackbody_rgb is required.
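
A minimal sketch of how the three methods can combine into a single entry point is shown below. `ColorSystemSketch` is a hypothetical stand-in, not OSL's ColorSystem. The table range, the nearest-entry lookup, and the placeholder curve are assumptions made only so the example is self-contained.

```cpp
#include <cassert>
#include <cmath>

struct Color3 { float r, g, b; };

// Hypothetical stand-in for ColorSystem. The table bounds and the math are
// placeholders, not OSL's actual values or its blackbody model.
class ColorSystemSketch {
public:
    ColorSystemSketch()
    {
        // Precompute the table from the expensive path.
        for (int i = 0; i < kEntries; ++i)
            m_table[i] = compute_blackbody_rgb(kTableMin + i * kStep);
    }

    bool can_lookup_blackbody(float T /*Kelvin*/) const
    {
        return T >= kTableMin && T <= kTableMax;
    }

    Color3 lookup_blackbody_rgb(float T /*Kelvin*/) const
    {
        assert(can_lookup_blackbody(T));
        // Nearest table entry (assumption: the real table may interpolate).
        int i = static_cast<int>(std::floor((T - kTableMin) / kStep + 0.5f));
        return m_table[i];
    }

    Color3 compute_blackbody_rgb(float T /*Kelvin*/) const
    {
        // Placeholder curve standing in for the expensive computation.
        float t = T / 10000.0f;
        return Color3 { t, t * t, t * t * t };
    }

    // Single entry point: take the cheap path when the table covers T.
    Color3 blackbody_rgb(float T /*Kelvin*/) const
    {
        return can_lookup_blackbody(T) ? lookup_blackbody_rgb(T)
                                       : compute_blackbody_rgb(T);
    }

private:
    static constexpr float kTableMin = 1000.0f;
    static constexpr float kStep     = 1000.0f;
    static constexpr int kEntries    = 5;
    static constexpr float kTableMax = kTableMin + kStep * (kEntries - 1);
    Color3 m_table[kEntries];
};
```

Exposing the predicate separately lets a batched caller test all lanes first and use the lookup path for the whole batch when every lane qualifies.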

Updated the implementation of ColorSystem to successfully vectorize on multiple compilers, and optimized lookup table data types and indexed dereferences to generate the preferred 32-bit indexed gathers instead of 64-bit gathers, which require multiple instructions and registers.

To improve code generation and reduce the chance of aliasing issues, changed nested M[i][j] operator access to Matrix in ImathMatrix.h to directly access the underlying 2D array M.x[i][j]. Also changed Vector operator V[i] to directly access V.x, V.y, V.z in OSL/IMathx/IMath*.h.

Moved the helper testIfAnyLaneIsNonZero from wide_opmatrix.cpp to OSL/wide.h.

Added sfm::min_val to provide an implementation of min that returns a value rather than a reference to one of its arguments through a ternary (which can cause issues with vectorization). Added Clang-specific sfm::min_val and sfm::max_val implementations to assist in vectorization.
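
For context, std::min returns a const reference to one of its arguments. A by-value variant might look like the sketch below. The `sketch` namespace is hypothetical, and the actual sfm implementations (including the Clang-specific ones) may differ.

```cpp
#include <cassert>

namespace sketch {

// Return the smaller argument by value instead of by reference,
// so the result never aliases either input.
template <typename T>
inline T min_val(T a, T b)
{
    return (b < a) ? b : a;
}

// Return the larger argument by value.
template <typename T>
inline T max_val(T a, T b)
{
    return (a < b) ? b : a;
}

}  // namespace sketch
```

Taking and returning `T` by value is cheap for scalar types such as float and int, which is the case that matters inside vectorized loops.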

Enabled BATCHED execution of existing testsuites: blackbody, color, transformc, wavelength_color
Added new testsuite BATCHED regression tests:  blackbody-reg, color-reg, luminance-reg, transformc-reg, wavelength_color-reg

Signed-off-by: Alex M. Wells <[email protected]>
Use single precision for bb_spectrum object/operator and manually implement std::pow(wlm,-5.0f) to better enable vectorization.
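
One way to replace a call such as std::pow(wlm, -5.0f) with explicit arithmetic is sketched below. The function names are hypothetical and the PR's actual code is not shown here. The sketch only demonstrates computing a fixed negative integer power with multiplies and a single divide.

```cpp
#include <cassert>
#include <cmath>

// Compute x^-5 with three multiplies and one divide, avoiding a libm call
// that a compiler may not be able to vectorize.
inline float pow_neg5(float x)
{
    float x2 = x * x;
    float x5 = x2 * x2 * x;
    return 1.0f / x5;
}

// Reference version using the standard library, for comparison.
inline float pow_neg5_reference(float x)
{
    return std::pow(x, -5.0f);
}
```

The two versions can differ in the last few bits because the rounding steps are not the same, so comparisons against the reference should use a tolerance.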
Reduced masked code generation for rgb_to_hsl.
Removed duplicated code by implementing ColorSystem::blackbody_rgb(T) in terms of ColorSystem::can_lookup_blackbody(T), ColorSystem::lookup_blackbody_rgb(T), and ColorSystem::compute_blackbody_rgb(T).  Removed some casting to uint16_t that was unnecessary.

Removed some test shaders that were not intended to be promoted.

Signed-off-by: Alex M. Wells <[email protected]>
…_op, clamp, get_simple_SG_field, isconstant, select, unary_op, mix

Fixed a bug in BatchedAnalysis where the complement operator was treated as always having a boolean result. In reality, the result of complement or any other bitwise operation is boolean only if its input parameters were forced to be boolean (left a TODO note for future improvement).

Fixed a bug in printf where an integer that is forced_llvm_bool() and is formatted as a float was not converted to an integer first.

Enabled BATCHED execution for tests: and-or-not-synonyms, isconstant, logic, select.
Expanded testsuite/shaderglobals to exercise/access all shader global data members.

Add regression tests: andor-reg, bitwise-and-reg, bitwise-or-reg, bitwise-shl-reg, bitwise-shr-reg, bitwise-xor-reg, complement-reg, mix-reg, select-reg

Signed-off-by: Alex M. Wells <[email protected]>
Collaborator

@lgritz lgritz left a comment


LGTM

@lgritz lgritz merged commit 54727f9 into AcademySoftwareFoundation:master Oct 5, 2021