Support graphics cards without `ARB_half_float_vertex` #1179

illwieckz · 2024-06-04T03:14:21Z

This addresses these two issues:

Some graphics devices don't support ARB_half_float_vertex, like the Intel GMA Gen3 architecture with the Mesa driver.
Some graphics devices don't support ARB_half_float_vertex but their drivers may provide an unplayably slow emulation, like the ATi r300 architecture with the Mesa driver.
- Huge performance drop on r300 with default Thunder and Vega scene #1172

These changes provide an alternative code path for when ARB_half_float_vertex is not available, or when the r_arb_half_float_vertex cvar is disabled on purpose to avoid a slow emulation provided by the driver.
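As a rough illustration of that decision (not the engine's actual detection code, which is not shown in this thread), the choice between the two code paths could be sketched as below. `HasExtension` and `UseHalfFloatVertex` are hypothetical names; the extension list is assumed to be the space-separated string a GL 2.1 context returns from `glGetString(GL_EXTENSIONS)`.

```cpp
#include <sstream>
#include <string>

// Hypothetical helper: true when `name` appears as a whole token in a
// space-separated extension list. Matching whole tokens avoids false
// positives on extensions that merely share a prefix.
bool HasExtension(const std::string& extensionList, const std::string& name)
{
    std::istringstream stream(extensionList);
    std::string token;
    while (stream >> token)
    {
        if (token == name)
        {
            return true;
        }
    }
    return false;
}

// Hypothetical decision: half floats are used only when the driver
// advertises the extension AND the user has not disabled the cvar.
bool UseHalfFloatVertex(const std::string& extensionList, bool cvarEnabled)
{
    return cvarEnabled && HasExtension(extensionList, "GL_ARB_half_float_vertex");
}
```

With this shape, a driver with a slow emulation (extension advertised, cvar disabled) and a driver with no support at all (extension missing) both end up on the float code path.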

It is the next step after preparatory work that was already merged:

Some improvements to our type system, helping the compiler report more type errors and so making the conversion easier, with less risk of introducing silent errors:
Implement f16_t, f16vec2_t, f16vec4_t half float types for safety #728
Some enablement of GL 2.1 on the said Intel graphics chip, so that we know GL 2.1 is supported but catch the lack of the half-float vertex extension. This meant that if we implemented a workaround one day (that day is today), we would already have everything else in place for it:
renderer/glimp: new GL detection and selection code #478

This branch doesn't break compatibility with the game but I implemented it over the illwieckz/image-fistcreen/sync branch anyway:

renderer: introduce IF_FITSCREEN and RSF_2D #1145

I did it that way because this other branch also helps to run the game on this kind of hardware (by not uploading textures too big for such hardware), and we had better test this hardware over for-0.55.0/sync because of all the compatibility-breaking changes intended for performance that were merged there. The illwieckz/image-fistcreen/sync branch itself is already implemented over for-0.55.0/sync because it breaks compatibility.

This branch also includes the GL_DEPTH_CLAMP disablement when GL_ARB_depth_clamp is not available, to avoid spamming the console with warnings when running the game on an ATi r300 graphics card, which makes debugging easier:

renderer: use GL_DEPTH_CLAMP only if GL_ARB_depth_clamp is available #1173

~~⚠️🚧️ This branch is in very early state! 🚧️⚠️~~
~~⚠️🚧️ It's even not ready to review! 🚧️⚠️~~
~~⚠️🚧️ This may be full of useless halffloat-to-float-to-halffloat conversions! 🚧️⚠️~~
~~⚠️🚧️ I even haven't tested if it doesn't break the usual code path! 🚧️⚠️~~

~~I push this branch now because it starts to work and this is so precious I don't want the team to lose that work if something happens to me!~~

It's now ready to be reviewed, it works for me with both code paths, and is rebased on master.

illwieckz · 2024-06-11T01:30:30Z

For unknown reasons, this is broken:

- mouse pointer
- minimap
- console text

illwieckz · 2024-06-11T06:17:14Z

> For unknown reasons, this is broken:
>
> - mouse pointer
> - minimap
> - console text

It was just some stupid copy-paste mistake.

Now it looks like everything is working.

Also fixes the version printing.

illwieckz · 2024-06-11T07:29:20Z

Now that I verified the branch works, the branch is now targeting master.

It is ready to review.

VReaperV

Same question w.r.t. vector2Copy() vs floatToHalf() usage: the original code only has the former, and the branches in this PR seem switched up.

src/engine/client/hunk_allocator.cpp

src/engine/renderer/tr_backend.cpp

src/engine/renderer/tr_surface.cpp

src/engine/renderer/tr_vbo.cpp

src/engine/renderer/tr_local.h

illwieckz · 2024-06-11T17:58:22Z

src/engine/renderer/tr_shade_calc.cpp

+			}
+			else
+			{
+				factor = (int16_t) v1->texCoords[ 3 ];


@slipher may have a comment about those lines.

illwieckz · 2024-06-11T18:05:04Z

The changes are a bit verbose because I renamed all related half float variables by prefixing them with f16 (for example, half-float texCoords becomes f16TexCoords, while float texCoords is named texCoords). Not only does it help to avoid confusion when reading, but it also made this code far easier for me to implement, because I just had to rename texCoords to f16TexCoords to have my compiler report all the lines requiring an alternate code path.
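To illustrate why the rename combined with a distinct type surfaces every affected line, here is a minimal sketch. The `f16_t` below is a hypothetical stand-in written for this example, not the engine's definition from #728, and `ExampleVertex` and `RawHalfBits` are made-up names.

```cpp
#include <cstdint>
#include <type_traits>

// Hypothetical stand-in for a half-float type: wrapping the raw bits in a
// struct means there is no implicit conversion to or from float.
struct f16_t
{
    uint16_t bits;
};

// The compiler refuses to mix the two representations silently.
static_assert(!std::is_convertible<float, f16_t>::value,
              "float must not convert implicitly to f16_t");
static_assert(!std::is_convertible<f16_t, float>::value,
              "f16_t must not convert implicitly to float");

// Hypothetical vertex carrying both representations; the f16 prefix
// tells the reader which one a given line is touching.
struct ExampleVertex
{
    float texCoords[2];     // full-precision coordinates
    f16_t f16TexCoords[2];  // half-precision coordinates
};

// Accepts half floats only; passing `texCoords[0]` is a compile error.
inline uint16_t RawHalfBits(const f16_t& value)
{
    return value.bits;
}
```

Renaming a member from `texCoords` to `f16TexCoords` then makes every stale use fail to compile, either because the name no longer exists or because the types no longer match.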

slipher · 2024-06-11T23:30:14Z

As an experiment, I implemented a proof of concept of automatic vertex translation at VBO upload time in the branch slipher/vertex-translation. I'm not sure if that approach is better, but I was curious if it would work.

It doesn't translate every VBO because so far it only acts on "static" VBOs (ones fixed at load time). But I imagine it would be enough to get good performance on the cards with poor implementation quality of the half-float extension.

If we were to productionize this, the vertex translation process should probably be combined into R_CopyVertexData which already appears to do some similar work.

Usage of half floats in the VBOs can be toggled with r_avoidHalfFloatVertex.

illwieckz · 2024-06-12T00:32:21Z

The slipher/vertex-translation branch is enough to recover performance on ATI r300:

As a reminder, I would like not only to accelerate r300 (which has a slow driver emulation of half-float vertex) but also to enable some Intel cards (which also support GL 2.1 but neither provide half-float vertex nor emulate it).

A good way to emulate with Mesa a card that only has GL 2.1 and lacks half-float vertex is to set some environment variables this way:

MESA_GL_VERSION_OVERRIDE='2.1' MESA_EXTENSION_OVERRIDE='-GL_ARB_half_float_vertex' ./daemon +devmap vega

This should work with any Mesa driver, even llvmpipe on Windows:

https://github.com/pal1000/mesa-dist-win

More details about Mesa environment variables:

https://docs.mesa3d.org/envvars.html

illwieckz · 2024-06-12T00:37:55Z

I don't mind that much the implementation, as long as it works.

What I like in my branch is that I renamed the half float variables with an explicit prefix. I find it makes the code easier to read and debug, so this may be something I would redo even if we merge another implementation.

We may even decide in the future to convert some current half float code to float, to avoid useless round trips between float and half float. I am especially thinking about the model code: some parts of the model CPU fallback code may process half floats, but I want this code to be as fast as possible since it's already a fallback for the GPU not being powerful enough…

illwieckz · 2024-06-12T01:07:23Z

What is currently broken in slipher/vertex-translation, if I make the extension not required and it is missing, is some 2D things like text and images, which are either garbage or distorted in wrong ways. I'm fine with the idea of translating the data when needed, especially if it prevents us from duplicating code everywhere, and if it even allows us to add new stuff using half float while relying on a generic converter when needed.

illwieckz · 2024-07-30T18:20:41Z

@slipher do you have some update on your slipher/vertex-translation branch? I like your approach, but it was not complete as far as I know.

I will have access to the machine with the Intel card without half-float vertex for two weeks starting with the next week.

If you're not available to work on this for now, I may merge my implementation and we can revert it in the future to merge your implementation instead. I would prefer to have yours, but we can take time to improve it.

slipher · 2024-07-31T18:51:32Z

> @slipher do you have some update on your slipher/vertex-translation branch? I like your approach, but it was not complete as far as I know.

I made some progress in implementing a more complete version of the idea, which eliminates some of the unnecessary layers in preparing vertex data (it's often copied about 5 times before reaching the VBO). IIRC I finished 2 of the 3 model formats. Got sidetracked by all the bugs I was finding in the renderer. I'll try to dig out that code in the next few days.

illwieckz · 2024-07-31T19:13:51Z

> Got sidetracked by all the bugs I was finding in the renderer.

I know what it is. 🤣️

Thanks for the update! 🙂️

illwieckz · 2024-08-03T19:35:33Z

This branch makes the Intel GMA 3 work:

I needed some other minor patches to not compile some shaders for disabled features like bloom, but nothing more related to half-float vertex was needed.

The output of glxinfo -B:

name of display: :0
display: :0  screen: 0
direct rendering: Yes
Extended renderer info (GLX_MESA_query_renderer):
    Vendor: Mesa Project (0x8086)
    Device: i915 (chipset: G33) (0x29c2)
    Version: 24.0.9
    Accelerated: yes
    Video memory: 384MB
    Unified memory: yes
    Preferred profile: compat (0x2)
    Max core profile version: 0.0
    Max compat profile version: 2.1
    Max GLES1 profile version: 1.1
    Max GLES[23] profile version: 2.0
OpenGL vendor string: Mesa Project
OpenGL renderer string: i915 (chipset: G33)
OpenGL version string: 2.1 Mesa 24.0.9-0ubuntu0.1
OpenGL shading language version string: 1.20

OpenGL ES profile version string: OpenGL ES 2.0 Mesa 24.0.9-0ubuntu0.1
OpenGL ES profile shading language version string: OpenGL ES GLSL ES 1.0.16

Generate the layout for interleaved vertex attribute data at runtime. The motivation for this is to support OpenGL implementations that don't provide half float support (#1179). The vertex "struct" may contain a 16-bit or 32-bit float, depending on the graphics card. Now, instead of defining a struct for the data to be uploaded into a VBO, one must separately specify inputs for each attribute. The input is defined by a type, base address, stride, etc., very similarly to the arguments of glVertexAttribPointer itself. The new version of R_CreateStaticVBO takes these inputs and writes them to an interleaved format, performing any needed type conversions along the way. In this commit just skeletal models (IQM and MD5) are migrated to the new method.
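The idea described in that commit message can be sketched roughly as follows. This is an illustration only, not the code from slipher/runtime-vbo-layout: `AttributeInput`, `InterleaveVertices` and `FloatToHalfBits` are hypothetical names, every input is assumed to be 32-bit float data, and the float-to-half conversion is deliberately simplified (it truncates the mantissa, flushes values below the normal half range to zero, and maps overflow, infinity and NaN to infinity).

```cpp
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <vector>

// Simplified float -> half conversion (see the assumptions stated above).
uint16_t FloatToHalfBits(float value)
{
    uint32_t bits;
    std::memcpy(&bits, &value, sizeof(bits));
    const uint16_t sign = static_cast<uint16_t>((bits >> 16) & 0x8000u);
    // Re-bias the exponent from float (127) to half (15).
    const int32_t exponent = static_cast<int32_t>((bits >> 23) & 0xFFu) - 127 + 15;
    const uint32_t mantissa = bits & 0x7FFFFFu;
    if (exponent <= 0)
    {
        return sign; // too small for a normal half: signed zero
    }
    if (exponent >= 31)
    {
        return static_cast<uint16_t>(sign | 0x7C00u); // too large: infinity
    }
    return static_cast<uint16_t>(sign | (exponent << 10) | (mantissa >> 13));
}

// Hypothetical per-attribute input, loosely mirroring the arguments of
// glVertexAttribPointer: where the data starts and how it is strided.
struct AttributeInput
{
    const uint8_t* base; // first component of the first vertex
    size_t strideBytes;  // distance between consecutive vertices
    int components;      // number of floats per vertex
    bool storeAsHalf;    // output as 16-bit half instead of 32-bit float
};

// Writes all attributes of each vertex next to each other, converting to
// half float where requested. Alignment padding is omitted for brevity.
std::vector<uint8_t> InterleaveVertices(const std::vector<AttributeInput>& inputs,
                                        size_t vertexCount)
{
    size_t vertexSize = 0;
    for (const AttributeInput& input : inputs)
    {
        const size_t componentSize = input.storeAsHalf ? sizeof(uint16_t) : sizeof(float);
        vertexSize += static_cast<size_t>(input.components) * componentSize;
    }

    std::vector<uint8_t> out(vertexSize * vertexCount);
    size_t offset = 0;
    for (size_t v = 0; v < vertexCount; v++)
    {
        for (const AttributeInput& input : inputs)
        {
            const uint8_t* src = input.base + v * input.strideBytes;
            for (int c = 0; c < input.components; c++)
            {
                float value;
                std::memcpy(&value, src + c * sizeof(float), sizeof(value));
                if (input.storeAsHalf)
                {
                    const uint16_t half = FloatToHalfBits(value);
                    std::memcpy(out.data() + offset, &half, sizeof(half));
                    offset += sizeof(half);
                }
                else
                {
                    std::memcpy(out.data() + offset, &value, sizeof(value));
                    offset += sizeof(value);
                }
            }
        }
    }
    return out;
}
```

Because the output format is a per-attribute flag rather than a fixed struct, the same call site can produce either a half-float or a full-float layout depending on what the graphics card supports.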

slipher · 2024-08-09T06:38:36Z

You can see my current progress on the branch slipher/runtime-vbo-layout. The dynamically generated vertex attribute layout, which may be configured to use half float or not, is used for IQM, MD3, and MD5 models. It's not ready to test as other VBOs are not ported yet. Remaining steps:

- Investigate the vertex attribute-related warning spam on station15. IIRC this is caused by some code which creates some verts with the qtangent attribute, then later overwrites that data with the incompatible orientation attribute. I should be careful to avoid breaking this.
- Use the new vertex attribute layout method for the other static VBOs.
- Change the dynamic VBO's vertex struct to (always) use float instead of f16_t. This is probably better anyway. Data in the dynamic VBO is only used once and (I believe; will double check) is usually small. Trying to compress it seems pointless. Also tess.verts is sometimes used for data which is not actually sent to the GPU, so avoiding useless conversions should be a win there.

illwieckz · 2024-08-09T18:12:58Z

Great! I like the way it looks!

As a side note, I prefer the cvar being named r_arb_half_float_vertex (not r_ext_half_float_vertex) as the extension is ARB_half_float_vertex, and ordered alphabetically like the others.

slipher · 2024-08-10T05:32:36Z

I spelled out my thoughts on the extension cvar naming. To me it looks like a case of cargo culting gone out of control. Maybe some of the first extensions were EXT ones, which makes some sense as part of the cvar name. But ARB really doesn't.

renderer: add alternate code path without ARB_half_float_vertex

c5643df

illwieckz added A-Renderer T-Feature-Request Proposed new feature labels Jun 4, 2024

illwieckz marked this pull request as draft June 4, 2024 03:15

illwieckz changed the title ~~WIP: support graphis cards without ARB_half_float_vertex~~ WIP: support graphics cards without ARB_half_float_vertex Jun 4, 2024

illwieckz mentioned this pull request Jun 4, 2024

Huge performance drop on r300 with default Thunder and Vega scene #1172

Open

illwieckz force-pushed the illwieckz/no-half-float-vertex branch 2 times, most recently from 08bd535 to e6ec633 Compare June 9, 2024 02:18

tr_shade_calc: optimize half-float round trip

7273bf8

illwieckz force-pushed the illwieckz/no-half-float-vertex branch 2 times, most recently from 5cd8db6 to c1d888b Compare June 11, 2024 01:29

illwieckz force-pushed the illwieckz/no-half-float-vertex branch 3 times, most recently from 1898c46 to 38c78e7 Compare June 11, 2024 06:16

sdl_glimp: update the error message when OpenGL is too old

a46d2c6

Also fixes the version printing.

illwieckz force-pushed the illwieckz/no-half-float-vertex branch from b5c0125 to 1b58b39 Compare June 11, 2024 07:04

illwieckz changed the title ~~WIP: support graphics cards without ARB_half_float_vertex~~ Support graphics cards without ARB_half_float_vertex Jun 11, 2024

illwieckz force-pushed the illwieckz/no-half-float-vertex branch from 1b58b39 to 2f8e40f Compare June 11, 2024 07:27

illwieckz marked this pull request as ready for review June 11, 2024 07:27

illwieckz changed the base branch from for-0.55.0/sync to master June 11, 2024 07:28

illwieckz force-pushed the illwieckz/no-half-float-vertex branch from 2f8e40f to 4a01370 Compare June 11, 2024 07:28

renderer: rename halfToFloat and floatToHalf as halfToFloat4 and floa…

ce8361d

…tToHalf4 to avoid confusion

illwieckz force-pushed the illwieckz/no-half-float-vertex branch 4 times, most recently from 8d8d9f5 to 0d6b114 Compare June 11, 2024 08:05

VReaperV reviewed Jun 11, 2024

View reviewed changes

illwieckz mentioned this pull request Jun 11, 2024

hunk_allocator: break on hunk error #1187

Open

illwieckz force-pushed the illwieckz/no-half-float-vertex branch 2 times, most recently from 8b162ce to ada8fbd Compare June 11, 2024 17:42

illwieckz commented Jun 11, 2024

View reviewed changes

illwieckz force-pushed the illwieckz/no-half-float-vertex branch from ada8fbd to 7c8a8bd Compare June 11, 2024 18:34

tr_init: let gfxinfo tell about half-float usage

7f896cb

illwieckz force-pushed the illwieckz/no-half-float-vertex branch from f70345c to e86ea98 Compare August 3, 2024 19:30

illwieckz mentioned this pull request Aug 5, 2024

renderer: query tex indirections and ALU instructions when possible #1229

Merged

illwieckz force-pushed the illwieckz/no-half-float-vertex branch from e86ea98 to 7f896cb Compare August 7, 2024 14:49
