Depth raster TODO list #19757
Comments
Not sure if this comment belongs here, but.
Yes, that will probably be fine, because both multiplicands are small. Certainly better than the horror of the workaround function :) Or maybe it's okay to do the triangle setup in float? Although I'm sure Fabian has a good reason to stick to int. Btw, in your https://rextester.com/GDHNO44482, for the hierarchical traversal, I'm pretty sure that you don't have to do four tests like in your test_rect; it should be possible to bias the edge functions instead and do a single test even at the upper level. Have not tried that, though :)
You can even do:

    #include <emmintrin.h> // SSE2 intrinsics

    // Returns (a-b)*(c-d)-(e-f)*(g-h) per int32 lane,
    // assuming all (...)'s fit into int16.
    static __m128i edge_function(__m128i a, __m128i b, __m128i c, __m128i d,
                                 __m128i e, __m128i f, __m128i g, __m128i h)
    {
        __m128i p = _mm_sub_epi32(a, b);
        __m128i q = _mm_sub_epi32(c, d);
        __m128i r = _mm_sub_epi32(e, f);
        __m128i s = _mm_sub_epi32(h, g); // flipped order, since _mm_madd_epi16 is p*q+r*s, not p*q-r*s.
        // Pack p (low 16 bits) with r (high 16 bits), and q with s, in each 32-bit lane.
        __m128i x = _mm_or_si128(_mm_and_si128(p, _mm_set1_epi32(0xFFFF)), _mm_slli_epi32(r, 16));
        __m128i y = _mm_or_si128(_mm_and_si128(q, _mm_set1_epi32(0xFFFF)), _mm_slli_epi32(s, 16));
        return _mm_madd_epi16(x, y);
    }

Tested it, seems to work fine. Triangle setup and rasterization are done in int pretty much for reasons of exactness: you want to make sure pixels on the common edge of two triangles are rendered exactly once. You mean computing the edge function at the rect center, and comparing it to the sum of absolute values of the increments for the rect half-sides? Oh, yeah, that should work, nice.
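The rect-center test described above can be sketched in scalar C. This is only an illustration, not code from the thread: the `Edge` struct, `edge_eval`, `classify_rect`, and the convention that `E >= 0` means "inside" are all assumptions made for the example. Because an edge function is linear, its extremes over a rect are the center value plus or minus `|a|*hx + |b|*hy`, so one comparison per outcome replaces four corner evaluations.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical edge function E(x, y) = a*x + b*y + c; E >= 0 counts as inside. */
typedef struct { int32_t a, b, c; } Edge;

static int32_t edge_eval(const Edge *e, int32_t x, int32_t y) {
    return e->a * x + e->b * y + e->c;
}

typedef enum { RECT_OUTSIDE, RECT_PARTIAL, RECT_INSIDE } RectClass;

/* Classify a rect with center (cx, cy) and half extents (hx, hy) against one edge. */
static RectClass classify_rect(const Edge *e, int32_t cx, int32_t cy,
                               int32_t hx, int32_t hy) {
    assert(hx >= 0 && hy >= 0);
    int32_t center = edge_eval(e, cx, cy);
    /* Largest possible change of E when moving from the center to any corner. */
    int32_t bias = abs(e->a) * hx + abs(e->b) * hy;
    if (center >= bias)  return RECT_INSIDE;   /* minimum over the rect is >= 0 */
    if (center < -bias)  return RECT_OUTSIDE;  /* maximum over the rect is < 0  */
    return RECT_PARTIAL;
}
```

For a whole triangle, a rect would be rejected if any of the three edges reports `RECT_OUTSIDE` and accepted as fully covered only if all three report `RECT_INSIDE`. The sketch assumes the rect center has integer coordinates and that the products fit in 32 bits.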
Nice! That'll come in handy. I'll stick to integer... By the way, I was driving today and thinking of rasterization hehe. I had two thoughts: In your 8x8 raster, if there is a block to the left of the current one, you can reuse its top-right and bottom-right samples as the new top-left and bottom-left. But also, I don't understand how your 8x8 method with checking corners doesn't miss small triangles, like a tiny one entirely enclosed by a block... So it feels like rasterizing at 8x8 centres with bias, and then, in "inside" blocks, checking corners with the 1x1 biases to see if a block is full or partial would be the way to go?
The block test looks at 12 bits of data: signs of 3 edge functions at block corners. So a tiny triangle enclosed by a block poses no problem: the signs of individual edge functions at the corners would be different.
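A minimal sketch of such a 12-bit corner test, in scalar C. The names (`Edge`, `corner_sign_mask`, `classify_block`), the bit layout, and the `E >= 0` inside convention are assumptions for illustration, not the layout used in the linked code. A block is rejected only when a single edge is negative at all four corners, so a tiny triangle enclosed by the block ends up as partial rather than being missed.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical edge function E(x, y) = a*x + b*y + c; E >= 0 counts as inside. */
typedef struct { int32_t a, b, c; } Edge;

static int32_t edge_eval(const Edge *e, int32_t x, int32_t y) {
    return e->a * x + e->b * y + e->c;
}

/* Bit (4*i + k) is set when edge i is negative (outside) at corner k.
   Corner order: 0 = top-left, 1 = top-right, 2 = bottom-left, 3 = bottom-right. */
static uint32_t corner_sign_mask(const Edge edges[3], int32_t x0, int32_t y0,
                                 int32_t size) {
    assert(size > 0);
    const int32_t xs[4] = { x0, x0 + size, x0, x0 + size };
    const int32_t ys[4] = { y0, y0, y0 + size, y0 + size };
    uint32_t mask = 0;
    for (int i = 0; i < 3; i++)
        for (int k = 0; k < 4; k++)
            if (edge_eval(&edges[i], xs[k], ys[k]) < 0)
                mask |= 1u << (4 * i + k);
    return mask;
}

typedef enum { BLOCK_OUTSIDE, BLOCK_PARTIAL, BLOCK_INSIDE } BlockClass;

static BlockClass classify_block(uint32_t mask) {
    if (mask == 0) return BLOCK_INSIDE;             /* every corner inside every edge */
    for (int i = 0; i < 3; i++)
        if (((mask >> (4 * i)) & 0xF) == 0xF)       /* one edge excludes all corners */
            return BLOCK_OUTSIDE;
    return BLOCK_PARTIAL;                           /* mixed signs: rasterize per pixel */
}
```

For example, the triangle (1,1), (3,1), (1,3) lies entirely within the block covering [0,8]x[0,8]; each of its edges has mixed signs at the block corners, so the block is classified as partial.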
Ahhh ok, I understand now :) With reuse, checking the corners may practically be as fast as using a rect-center biased check I guess, since with that method we still need to check the corners to see if a block is fully in or partial. I'm going to play around with this later.
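The corner reuse idea can be sketched for a single edge along one row of blocks. This is a hypothetical illustration (the function `row_corner_masks` and its bit layout are made up for the example): the right-hand corner signs of each block are carried over as the left-hand corner signs of the next, so a row of `count` blocks needs `2 * (count + 1)` evaluations instead of `4 * count`.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical edge function E(x, y) = a*x + b*y + c; E >= 0 counts as inside. */
typedef struct { int32_t a, b, c; } Edge;

static int32_t edge_eval(const Edge *e, int32_t x, int32_t y) {
    return e->a * x + e->b * y + e->c;
}

/* Fills masks[i] with the "negative at corner" bits of block i in a row:
   bit 0 = top-left, bit 1 = top-right, bit 2 = bottom-left, bit 3 = bottom-right.
   Returns the number of edge evaluations performed. */
static int row_corner_masks(const Edge *e, int32_t x0, int32_t y0, int32_t size,
                            int count, uint8_t *masks) {
    assert(masks != NULL && count >= 0 && size > 0);
    int evals = 2;
    int left_top = edge_eval(e, x0, y0) < 0;
    int left_bot = edge_eval(e, x0, y0 + size) < 0;
    for (int i = 0; i < count; i++) {
        int32_t xr = x0 + (i + 1) * size;           /* right side of block i */
        int right_top = edge_eval(e, xr, y0) < 0;
        int right_bot = edge_eval(e, xr, y0 + size) < 0;
        evals += 2;
        masks[i] = (uint8_t)(left_top | (right_top << 1) |
                             (left_bot << 2) | (right_bot << 3));
        left_top = right_top;                       /* reuse for the next block */
        left_bot = right_bot;
    }
    return evals;
}
```

A real rasterizer would more likely step the edge values incrementally (adding `a * size` per block) rather than call `edge_eval` each time; the sketch only shows the sharing of corner samples between neighbouring blocks.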
This is about #19748, which solves a number of lens flare issues across various games, at the cost of running an extra Z-only software renderer.
Ideally other games should run and render good depth buffers too, so we get the bugs out of the system, even when they don't have any use for them.
Problematic things (done)
Features:
Planned optimizations:
raster time is so dominant that maybe it's ok to just send all draws to all threads.