Support graphics cards without ARB_half_float_vertex #1179

Open, wants to merge 5 commits into master


illwieckz
Member

@illwieckz illwieckz commented Jun 4, 2024

This addresses two issues:

  1. Some graphics devices don't support ARB_half_float_vertex, like the Intel GMA Gen3 architecture with the Mesa driver.
  2. Some graphics devices don't support ARB_half_float_vertex in hardware, but their drivers may provide an emulation that is unplayably slow, like the ATi r300 architecture with the Mesa driver.

These changes provide an alternative code path for when ARB_half_float_vertex is not available, or when the r_arb_half_float_vertex cvar is deliberately disabled to avoid a slow emulation provided by the driver.
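
The selection between the two code paths can be sketched as follows. This is only an illustration: `GLCaps`, `VertexFloatFormat` and `ChooseVertexFloatFormat` are hypothetical names, not the engine's actual API, and the boolean stands in for the value of the r_arb_half_float_vertex cvar.

```cpp
#include <cstddef>

// Hypothetical stand-in for what the renderer learned from the driver.
struct GLCaps {
    bool hasHalfFloatVertex; // driver advertises ARB_half_float_vertex
};

enum class VertexFloatFormat { Half16, Float32 };

// Use half floats only when the driver advertises the extension AND the
// user has not disabled it (for example because the emulation is slow).
inline VertexFloatFormat ChooseVertexFloatFormat(const GLCaps& caps, bool cvarEnabled) {
    return (caps.hasHalfFloatVertex && cvarEnabled) ? VertexFloatFormat::Half16
                                                    : VertexFloatFormat::Float32;
}

// Size in bytes of one vertex attribute component in the chosen format.
inline std::size_t BytesPerComponent(VertexFloatFormat format) {
    return format == VertexFloatFormat::Half16 ? 2 : 4;
}
```

With this shape, an r300 user would keep the extension advertised by the driver but set the cvar to 0, while a GMA Gen3 user would get the float path automatically.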

It is the next step over preparatory work that was already merged:

This branch doesn't break compatibility with the game, but I implemented it on top of the illwieckz/image-fistcreen/sync branch anyway:

I did it that way because that other branch also helps run the game on this kind of hardware (by not uploading textures too big for it), and it is better to test this hardware on for-0.55.0/sync because of all the compatibility-breaking performance changes that were merged there. The illwieckz/image-fistcreen/sync branch itself is already based on for-0.55.0/sync because it breaks compatibility.

This branch also disables GL_DEPTH_CLAMP when GL_ARB_depth_clamp is not available, to avoid spamming the console with warnings when running the game on an ATi r300 graphics card, which makes debugging easier:

⚠️🚧️ This branch is in very early state! 🚧️⚠️
⚠️🚧️ It's not even ready for review! 🚧️⚠️
⚠️🚧️ This may be full of useless halffloat-to-float-to-halffloat conversions! 🚧️⚠️
⚠️🚧️ I haven't even tested whether it breaks the usual code path! 🚧️⚠️

I'm pushing this branch now because it is starting to work, and this is so precious that I don't want the team to lose that work if something happens to me!

It's now ready to be reviewed, it works for me with both code paths, and is rebased on master.

@illwieckz illwieckz added A-Renderer T-Feature-Request Proposed new feature labels Jun 4, 2024
@illwieckz illwieckz marked this pull request as draft June 4, 2024 03:15
@illwieckz illwieckz changed the title WIP: support graphis cards without ARB_half_float_vertex WIP: support graphics cards without ARB_half_float_vertex Jun 4, 2024
@illwieckz illwieckz force-pushed the illwieckz/no-half-float-vertex branch 2 times, most recently from 08bd535 to e6ec633 Compare June 9, 2024 02:18
@illwieckz illwieckz force-pushed the illwieckz/no-half-float-vertex branch 2 times, most recently from 5cd8db6 to c1d888b Compare June 11, 2024 01:29
@illwieckz
Member Author

For unknown reasons, this is broken:

  • mouse pointer
  • minimap
  • console text

@illwieckz illwieckz force-pushed the illwieckz/no-half-float-vertex branch 3 times, most recently from 1898c46 to 38c78e7 Compare June 11, 2024 06:16
@illwieckz
Member Author

illwieckz commented Jun 11, 2024

For unknown reasons, this is broken:

  • mouse pointer
  • minimap
  • console text

It was just some stupid copy-paste mistake.

Now it looks like everything is working.

@illwieckz illwieckz force-pushed the illwieckz/no-half-float-vertex branch from b5c0125 to 1b58b39 Compare June 11, 2024 07:04
@illwieckz illwieckz changed the title WIP: support graphics cards without ARB_half_float_vertex Support graphics cards without ARB_half_float_vertex Jun 11, 2024
@illwieckz illwieckz force-pushed the illwieckz/no-half-float-vertex branch from 1b58b39 to 2f8e40f Compare June 11, 2024 07:27
@illwieckz illwieckz marked this pull request as ready for review June 11, 2024 07:27
@illwieckz illwieckz changed the base branch from for-0.55.0/sync to master June 11, 2024 07:28
@illwieckz illwieckz force-pushed the illwieckz/no-half-float-vertex branch from 2f8e40f to 4a01370 Compare June 11, 2024 07:28
@illwieckz
Member Author

Now that I have verified the branch works, it targets master.

It is ready for review.

@illwieckz illwieckz force-pushed the illwieckz/no-half-float-vertex branch 4 times, most recently from 8d8d9f5 to 0d6b114 Compare June 11, 2024 08:05
Contributor

@VReaperV VReaperV left a comment

Same question w.r.t. vector2Copy() vs floatToHalf() usage: the original code only has the former, and the branches in this PR seem to be switched.

src/engine/client/hunk_allocator.cpp (outdated)
src/engine/renderer/tr_backend.cpp (outdated)
src/engine/renderer/tr_surface.cpp
src/engine/renderer/tr_vbo.cpp
src/engine/renderer/tr_local.h (outdated)
@illwieckz illwieckz force-pushed the illwieckz/no-half-float-vertex branch 2 times, most recently from 8b162ce to ada8fbd Compare June 11, 2024 17:42
}
else
{
factor = (int16_t) v1->texCoords[ 3 ];
Member Author

@illwieckz illwieckz Jun 11, 2024

@slipher may have a comment about those lines.

@illwieckz
Member Author

The changes are a bit verbose because I renamed all related half-float variables by prefixing them with f16 (for example, the half-float texCoords becomes f16TexCoords, while the float one is named texCoords). Not only does it help avoid confusion when reading, but it also made this code far easier for me to implement, because I just had to rename texCoords to f16TexCoords to have my compiler report all the lines requiring an alternate code path.
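
A minimal sketch of that naming convention, assuming a distinct half-float type. The `f16_t` name appears elsewhere in this thread, but its definition here and the two struct names are hypothetical and only serve the illustration.

```cpp
#include <cstdint>

// Hypothetical stand-in for the engine's half-float type: a distinct struct,
// so a half float cannot be silently used where a float is expected.
struct f16_t {
    std::uint16_t bits;
};

// Vertex layout for the half-float code path: the field carries the f16 prefix.
struct HalfFloatVertex {
    f16_t f16TexCoords[4];
};

// Vertex layout for the float code path: the field keeps the plain name.
struct FloatVertex {
    float texCoords[4];
};

// Code written for one layout names its field explicitly. If it is handed the
// other layout, the field does not exist and the compiler flags the line.
inline float FirstTexCoord(const FloatVertex& v) {
    return v.texCoords[0];
}
```

Renaming a field this way turns every remaining use of the old name into a compile error, which is what makes the compiler list the places that need the alternate path.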

@illwieckz illwieckz force-pushed the illwieckz/no-half-float-vertex branch from ada8fbd to 7c8a8bd Compare June 11, 2024 18:34
@slipher
Member

slipher commented Jun 11, 2024

As an experiment, I implemented a proof of concept of automatic vertex translation at VBO upload time in the branch slipher/vertex-translation. I'm not sure if that approach is better, but I was curious if it would work.

It doesn't translate every VBO because so far it only acts on "static" VBOs (ones fixed at load time). But I imagine it would be enough to get good performance on the cards with poor implementation quality of the half-float extension.

If we were to productionize this, the vertex translation process should probably be combined into R_CopyVertexData which already appears to do some similar work.

Usage of half floats in the VBOs can be toggled with r_avoidHalfFloatVertex.
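
To illustrate the idea of translating vertex data at upload time, here is a minimal sketch that widens a buffer of IEEE 754 half floats to 32-bit floats on the CPU. It is not the code from slipher/vertex-translation; `HalfToFloat` and `TranslateHalfToFloat` are hypothetical names, and the conversion favors clarity over speed.

```cpp
#include <cmath>
#include <cstdint>
#include <limits>
#include <vector>

// Decode one IEEE 754 binary16 value (1 sign, 5 exponent, 10 mantissa bits).
inline float HalfToFloat(std::uint16_t h) {
    const float sign = (h & 0x8000u) ? -1.0f : 1.0f;
    const int exponent = (h >> 10) & 0x1F;
    const int mantissa = h & 0x3FF;
    if (exponent == 0) {
        // Zero or subnormal: mantissa * 2^-24.
        return sign * std::ldexp(static_cast<float>(mantissa), -24);
    }
    if (exponent == 31) {
        // Infinity or NaN.
        return mantissa == 0 ? sign * std::numeric_limits<float>::infinity()
                             : std::numeric_limits<float>::quiet_NaN();
    }
    // Normal number: (1024 + mantissa) * 2^(exponent - 25).
    return sign * std::ldexp(static_cast<float>(1024 + mantissa), exponent - 25);
}

// Widen a whole attribute buffer before it is uploaded to a float VBO.
inline std::vector<float> TranslateHalfToFloat(const std::vector<std::uint16_t>& halves) {
    std::vector<float> floats;
    floats.reserve(halves.size());
    for (std::uint16_t h : halves) {
        floats.push_back(HalfToFloat(h));
    }
    return floats;
}
```

Since static VBOs are filled once at load time, the cost of such a conversion would be paid only once per buffer.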

@illwieckz
Member Author

illwieckz commented Jun 12, 2024

The slipher/vertex-translation branch is enough to recover performance on ATI r300:

unvanquished_2024-06-12_022513_000

As a reminder, I would like not only to accelerate r300 (which has a slow driver emulation of half-float vertex) but also to enable some Intel cards (which also support GL 2.1 but provide neither half-float vertex nor an emulation).

A good way to emulate with Mesa a card that only has GL 2.1 and no half-float vertex support is to set some environment variables this way:

MESA_GL_VERSION_OVERRIDE='2.1' MESA_EXTENSION_OVERRIDE='-GL_ARB_half_float_vertex' ./daemon +devmap vega

This should work with any Mesa driver, even llvmpipe on Windows:

More details about Mesa environment variables:

@illwieckz
Member Author

illwieckz commented Jun 12, 2024

I don't mind much about the implementation, as long as it works.

What I like in my branch is that I renamed the half-float variables with an explicit prefix, and I find it makes the code easier to read and debug. This may be something I would redo even if we merge another implementation.

We may even decide in the future to convert some current half-float code to float, to avoid useless round trips between float and half float. I am thinking especially of the model code: some parts of the model CPU fallback code may process half floats, but I want this code to be as fast as possible since it's already a fallback for the GPU not being powerful enough…

@illwieckz
Member Author

What is currently broken in slipher/vertex-translation, if I make the extension optional and it is missing, is some 2D things like text and images, which are either garbage or distorted. I'm fine with the idea of translating the data when needed, especially if it saves us from duplicating code everywhere, and if it even allows us to add new code using half floats while relying on a generic converter when needed.

@illwieckz
Member Author

@slipher do you have some update on your slipher/vertex-translation branch? I like your approach, but it was not complete as far as I know.

I will have access to the machine with the Intel card without half-float vertex support for two weeks, starting next week.

If you're not available to work on this for now, I may merge my implementation and we can revert it in the future to merge your implementation instead. I would prefer to have yours, but we can take time to improve it.

@slipher
Member

slipher commented Jul 31, 2024

@slipher do you have some update on your slipher/vertex-translation branch? I like your approach, but it was not complete as far as I know.

I made some progress in implementing a more complete version of the idea, which eliminates some of the unnecessary layers in preparing vertex data (it's often copied about 5 times before reaching the VBO). IIRC I finished 2 of the 3 model formats. I got sidetracked by all the bugs I was finding in the renderer. I'll try to dig out that code in the next few days.

@illwieckz
Member Author

Got sidetracked by all the bugs I was finding in the renderer.

I know what it is. 🤣️

Thanks for the update! 🙂️

@illwieckz illwieckz force-pushed the illwieckz/no-half-float-vertex branch from f70345c to e86ea98 Compare August 3, 2024 19:30
@illwieckz
Member Author

This branch makes the Intel GMA 3 work:

unvanquished_2024-08-03_195448_000

I needed some other minor patches to avoid compiling some shaders for disabled features like bloom, but nothing more related to vertex floats was needed.

The output of glxinfo -B:

name of display: :0
display: :0  screen: 0
direct rendering: Yes
Extended renderer info (GLX_MESA_query_renderer):
    Vendor: Mesa Project (0x8086)
    Device: i915 (chipset: G33) (0x29c2)
    Version: 24.0.9
    Accelerated: yes
    Video memory: 384MB
    Unified memory: yes
    Preferred profile: compat (0x2)
    Max core profile version: 0.0
    Max compat profile version: 2.1
    Max GLES1 profile version: 1.1
    Max GLES[23] profile version: 2.0
OpenGL vendor string: Mesa Project
OpenGL renderer string: i915 (chipset: G33)
OpenGL version string: 2.1 Mesa 24.0.9-0ubuntu0.1
OpenGL shading language version string: 1.20

OpenGL ES profile version string: OpenGL ES 2.0 Mesa 24.0.9-0ubuntu0.1
OpenGL ES profile shading language version string: OpenGL ES GLSL ES 1.0.16

@illwieckz illwieckz force-pushed the illwieckz/no-half-float-vertex branch from e86ea98 to 7f896cb Compare August 7, 2024 14:49
slipher added a commit that referenced this pull request Aug 9, 2024
Generate the layout for interleaved vertex attribute data at runtime. The
motivation for this is to support OpenGL implementations that don't
provide half float support
(#1179). The vertex "struct"
may contain a 16-bit or 32-bit float, depending on the graphics card.

Now, instead of defining a struct for the data to be uploaded into a
VBO, one must separately specify inputs for each attribute. The input is
defined by a type, base address, stride, etc.; very similarly to the
arguments of glVertexAttribPointer itself. The new version of
R_CreateStaticVBO takes these inputs and writes them to an interleaved
format, performing any needed type conversions along the way.

In this commit just skeletal models (IQM and MD5) are migrated to the
new method.
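
The approach described in this commit message can be sketched roughly as below. This is not the code from the commit: `AttributeInput`, `InterleavedVBO`, `BuildInterleaved` and `FloatToHalfSimple` are hypothetical names, every input is assumed to be float data, the stride is counted in floats rather than bytes, and the float-to-half conversion is deliberately simplified (it truncates the mantissa, flushes tiny values to zero and maps overflow, infinity and NaN to infinity).

```cpp
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <vector>

// Simplified float -> half conversion (assumption: truncation is acceptable).
inline std::uint16_t FloatToHalfSimple(float f) {
    std::uint32_t bits;
    std::memcpy(&bits, &f, sizeof bits);
    const std::uint32_t sign = (bits >> 16) & 0x8000u;
    const std::int32_t exponent = static_cast<std::int32_t>((bits >> 23) & 0xFFu) - 127 + 15;
    const std::uint32_t mantissa = (bits >> 13) & 0x3FFu;
    if (exponent <= 0) return static_cast<std::uint16_t>(sign);            // flush to zero
    if (exponent >= 31) return static_cast<std::uint16_t>(sign | 0x7C00u); // infinity
    return static_cast<std::uint16_t>(sign | (static_cast<std::uint32_t>(exponent) << 10) | mantissa);
}

// One attribute input, described much like glVertexAttribPointer arguments.
struct AttributeInput {
    const float* base;        // first component of the first vertex
    std::size_t strideFloats; // distance between consecutive vertices, in floats
    std::size_t components;   // number of components per vertex
};

struct InterleavedVBO {
    std::vector<std::uint8_t> bytes; // data ready to be uploaded
    std::size_t vertexSize;          // bytes per interleaved vertex
};

// Write all inputs into one interleaved buffer, as 16-bit or 32-bit floats.
inline InterleavedVBO BuildInterleaved(const std::vector<AttributeInput>& inputs,
                                       std::size_t numVerts, bool useHalfFloat) {
    const std::size_t componentSize = useHalfFloat ? sizeof(std::uint16_t) : sizeof(float);
    InterleavedVBO out;
    out.vertexSize = 0;
    for (const AttributeInput& in : inputs) out.vertexSize += in.components * componentSize;
    out.bytes.resize(out.vertexSize * numVerts);

    std::uint8_t* dst = out.bytes.data();
    for (std::size_t v = 0; v < numVerts; ++v) {
        for (const AttributeInput& in : inputs) {
            const float* src = in.base + v * in.strideFloats;
            for (std::size_t c = 0; c < in.components; ++c) {
                if (useHalfFloat) {
                    const std::uint16_t h = FloatToHalfSimple(src[c]);
                    std::memcpy(dst, &h, sizeof h);
                } else {
                    std::memcpy(dst, &src[c], sizeof(float));
                }
                dst += componentSize;
            }
        }
    }
    return out;
}
```

A real implementation would also have to take attribute alignment into account and pass the resulting offsets and stride to glVertexAttribPointer. Both are left out of this sketch.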
@slipher
Member

slipher commented Aug 9, 2024

You can see my current progress on the branch slipher/runtime-vbo-layout. The dynamically generated vertex attribute layout, which may be configured to use half float or not, is used for IQM, MD3, and MD5 models. It's not ready to test as other VBOs are not ported yet. Remaining steps:

  • Investigate the vertex attribute-related warning spam on station15. IIRC this is caused by some code which creates some verts with the qtangent attribute, then later overwrites that data with the incompatible orientation attribute. I should be careful to avoid breaking this.
  • Use the new vertex attribute layout method for the other static VBOs.
  • Change the dynamic VBO's vertex struct to (always) use float instead of f16_t. This is probably better anyway. Data in the dynamic VBO is only used once and (I believe; will double check) is usually small. Trying to compress it seems pointless. Also tess.verts is sometimes used for data which is not actually sent to the GPU, so avoiding useless conversions should be a win there.

@illwieckz
Member Author

Great! I like the way it looks!

As a side note, I prefer the cvar being named r_arb_half_float_vertex (not r_ext_half_float_vertex) as the extension is ARB_half_float_vertex, and ordered alphabetically like the others.

@slipher
Member

slipher commented Aug 10, 2024

I spelled out my thoughts on the extension cvar naming. To me it looks like a case of cargo culting gone out of control. Maybe some of the first extensions were EXT ones, which makes some sense as part of the cvar name. But ARB really doesn't.

3 participants