gallickgunner edited this page Mar 29, 2020 · 2 revisions

Using Yune is fairly easy. You only need to know a bit of OpenCL and be familiar with parallel programming and GPGPU computing. Start from the template kernel file, which itself contains some instructions to get you started.

Custom Preprocessor Directives

Every kernel file you load in Yune should have a .cl extension and begin with these lines:

  1. #yune-preproc compiler-opts -cl-std=CL2.0,cl-mad-enable,etc (Optional. A comma-separated list of compiler options.)

  2. #yune-preproc kernel-name mykernel (Mandatory. The name of the kernel function in your file.)
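
To make the shape of these directives concrete, here is a minimal sketch of how such a line could be split into a key and a value. This is not Yune's actual parser: the function name `parse_yune_directive` and its buffer-based interface are hypothetical, and it simply assumes the layout shown above (the `#yune-preproc` prefix, then a key, then the rest of the line as the value).

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper, NOT Yune's real parser. Splits a line of the form
 *   "#yune-preproc <key> <value>"
 * into key and value. Returns 1 on success, 0 if the line is not a
 * yune-preproc directive or a buffer is too small. */
int parse_yune_directive(const char *line,
                         char *key, size_t key_cap,
                         char *value, size_t value_cap)
{
    const char *prefix = "#yune-preproc";
    size_t n = strlen(prefix);
    const char *p;
    size_t len;

    /* The line must start with the prefix followed by a blank. */
    if (strncmp(line, prefix, n) != 0 || (line[n] != ' ' && line[n] != '\t'))
        return 0;

    p = line + n;
    p += strspn(p, " \t");            /* skip blanks before the key */
    len = strcspn(p, " \t\r\n");      /* key ends at the next blank */
    if (len == 0 || len >= key_cap)
        return 0;
    memcpy(key, p, len);
    key[len] = '\0';

    p += len;
    p += strspn(p, " \t");            /* skip blanks before the value */
    len = strcspn(p, "\r\n");         /* value is the rest of the line */
    while (len > 0 && (p[len - 1] == ' ' || p[len - 1] == '\t'))
        len--;                        /* trim trailing blanks */
    if (len == 0 || len >= value_cap)
        return 0;
    memcpy(value, p, len);
    value[len] = '\0';
    return 1;
}
```

Called on `#yune-preproc kernel-name mykernel`, this fills `key` with `kernel-name` and `value` with `mykernel`. Any line that does not start with the prefix is rejected.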

OBJ and MTL Files

Yune can load simple OBJ model files. Groups of objects should be defined with the o tag. Yune ignores g and any other irrelevant commands in the OBJ file and extracts only the relevant information. The MTL commands mtllib and usemtl are supported as they are used in normal OBJ files; usemtl should appear before the face descriptions begin. If the OBJ file has no reference to any MTL file, Yune creates a default one automatically and assigns the default material to everything in the OBJ. If you wish to refer to different materials later on, however, you need to specify the MTL commands explicitly.

The MTL file itself has been greatly simplified and the format differs from that of a normal one. You can check the template of both OBJ and the custom MTL files in the repository.

Yune Architecture

Yune is based on a mega-kernel approach. This means a single kernel is responsible for doing everything. This has its own set of pros and cons. While it makes setting up and sharing the kernel easier, it has issues with kernel occupancy, divergence and other performance-related problems. Since Yune was intended as an educational raytracer/pathtracer, we chose the simpler mega-kernel approach over the multi-kernel wavefront approach.

However, Yune supports launching the same mega-kernel in multiple instances. By default, Yune divides the image into 4 blocks and launches the provided kernel once per block, hence 4 kernel launches. This has its own overhead and almost always decreases FPS, but it has one significant advantage. If a single kernel instance takes more than 16.666 ms (that is, runs below 60 FPS), the GUI will normally start to become unresponsive, and in extreme cases (more than 3-5 seconds) Windows may terminate the program altogether on single-GPU systems, since the GPU is busy executing the kernel and has no time to update the display. In this case, increasing the block size (the number of blocks the image is divided into) reduces the execution time of a single kernel instance and avoids this scenario. However, this feature is only supported for the rendering kernel, not for post-processing.
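
The block splitting described above can be sketched as follows. How Yune actually partitions the image is not stated here, so this example assumes horizontal strips of rows; the `BlockRange` type and `block_range` function are hypothetical names used for illustration only.

```c
#include <stddef.h>

/* Hypothetical description of one block: a horizontal strip of rows. */
typedef struct {
    size_t row_offset; /* first image row covered by this block */
    size_t row_count;  /* number of rows in this block */
} BlockRange;

/* Assumption: the image is split into `num_blocks` horizontal strips.
 * When `height` is not divisible by `num_blocks`, the first
 * (height % num_blocks) strips each get one extra row, so the strips
 * cover every row exactly once. */
BlockRange block_range(size_t height, size_t num_blocks, size_t block_index)
{
    BlockRange r;
    size_t base = height / num_blocks;
    size_t extra = height % num_blocks;

    r.row_count = base + (block_index < extra ? 1 : 0);
    r.row_offset = block_index * base
                 + (block_index < extra ? block_index : extra);
    return r;
}
```

With a 1080-row image and 4 blocks, each strip covers 270 rows, and each strip would correspond to one kernel launch. One way to issue such launches in OpenCL is the `global_work_offset` argument of `clEnqueueNDRangeKernel`, though whether Yune does it this way is not confirmed here.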

Both the rendering and post-processing kernels' workgroup sizes can be set through the GUI. This can be used to come up with different ways to run the same kernel. The default kernels launch as many work items as there are pixels in the image, or in a single block (if block size > 1). This means a local workgroup deals with an even smaller block of pixels (64 or 256, for example).
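
As a rough illustration of the relationship between pixels, work items and workgroups, the sketch below computes how many workgroups a launch needs for a given local size. The helper names are hypothetical. The padding step is an assumption: it reflects the common OpenCL practice of rounding the global size up to a multiple of the local size, not a confirmed detail of Yune.

```c
#include <stddef.h>

/* Number of workgroups needed so that every pixel gets one work item.
 * Rounds up when the pixel count is not a multiple of local_size. */
size_t work_group_count(size_t work_items, size_t local_size)
{
    return (work_items + local_size - 1) / local_size;
}

/* Global size rounded up to the next multiple of local_size.
 * Assumption: a kernel launched this way bounds-checks its global id
 * so that the padding work items do nothing. */
size_t padded_global_size(size_t work_items, size_t local_size)
{
    return work_group_count(work_items, local_size) * local_size;
}
```

For a 1920x1080 image launched as one block with a local size of 256, this gives 8100 workgroups and no padding. With 4 blocks, each launch covers a quarter of the pixels and needs a quarter of the workgroups.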

Note: Yune will probably not be able to exploit multiple GPUs to avoid the above scenario. This is because the OpenCL context is shared with the OpenGL one, which happens to be the one that drives your display. So Yune will always select the primary GPU, the one driving the display. This may or may not be true, as I haven't checked it.

Misc Options

Yune also creates a SAH-based BVH whenever you load an OBJ file. You can change the number of bins used in the algorithm, or disable the creation of BVH altogether through the settings panel.
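
For readers unfamiliar with the surface area heuristic (SAH), here is a minimal sketch of the cost a binned builder typically evaluates for each candidate split. It uses a common simplified form of the heuristic (surface area times primitive count on each side, with constant factors dropped). Yune's actual cost function and data layout may differ, and the `Aabb` type and function names are hypothetical.

```c
/* Hypothetical axis-aligned bounding box. */
typedef struct {
    float min[3];
    float max[3];
} Aabb;

/* Surface area of a box: 2 * (dx*dy + dy*dz + dz*dx). */
float aabb_surface_area(const Aabb *b)
{
    float dx = b->max[0] - b->min[0];
    float dy = b->max[1] - b->min[1];
    float dz = b->max[2] - b->min[2];
    return 2.0f * (dx * dy + dy * dz + dz * dx);
}

/* Simplified SAH cost of one candidate split: each side contributes
 * its surface area times the number of primitives it contains.
 * Traversal cost and normalization by the parent's area are omitted
 * because they do not change which candidate is cheapest. */
float sah_split_cost(const Aabb *left, int n_left,
                     const Aabb *right, int n_right)
{
    return aabb_surface_area(left) * (float)n_left
         + aabb_surface_area(right) * (float)n_right;
}
```

In a binned builder, the number of bins determines how many candidate split positions are evaluated per axis: the builder evaluates a cost of this kind at each bin boundary and keeps the cheapest split. More bins generally means a finer search at a higher build cost.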

The Save Image at Samples option allows you to save images at specific samples per pixel. Currently, Yune advances the sample count by 1 after each whole frame is processed. This means that if you use more than 1 sample per pixel per frame, the meter will show incorrect readings (TODO).
