(Vulkan) Lack of Vsync and severe stuttering during Texture Scaling #10105
Comments
Lack of vsync is backend-dependent. Which ones have problems? I certainly have vsync in Vulkan for example. Don't know what's up with the texture scaling in Vulkan, needs to be looked into. But ideally it should be moved to the GPU anyway. |
Pretty sure you can turn vsync on or off in your driver, and this will likely affect Vulkan. Texture scaling on Vulkan currently scales even empty or flat textures. They're still cached, but in GLES these are skipped. It'd require reordering some logic (because of how it allocates) to make it able to skip scaling empty/flat textures. I bet we could make IsEmptyOrFlat faster, call it earlier, and potentially even use it to send a 1x1 texture. Those are the main differences I'm aware of. -[Unknown] |
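To make the flat-texture idea above concrete, here is a minimal sketch. It is not PPSSPP's IsEmptyOrFlat: the names IsFlatTexture, ChooseUploadSize and UploadSize are hypothetical, and it assumes 32-bit pixels with a row stride given in pixels. A flat texture is uploaded as 1x1 and never reaches the scaler.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical helper (not PPSSPP's IsEmptyOrFlat): returns true when every
// pixel of a 32-bit texture has the same value.
// strideInPixels is the distance between the starts of two rows, in pixels.
bool IsFlatTexture(const uint32_t *pixels, int w, int h, int strideInPixels) {
    assert(pixels != nullptr && w > 0 && h > 0 && strideInPixels >= w);
    const uint32_t first = pixels[0];
    for (int y = 0; y < h; ++y) {
        const uint32_t *row = pixels + static_cast<size_t>(y) * strideInPixels;
        for (int x = 0; x < w; ++x) {
            if (row[x] != first)
                return false;  // early out on the first differing pixel
        }
    }
    return true;
}

// Hypothetical result type: the dimensions that would actually be uploaded.
struct UploadSize {
    int w;
    int h;
};

// Flat textures collapse to 1x1 and skip scaling; everything else is scaled.
UploadSize ChooseUploadSize(const uint32_t *pixels, int w, int h,
                            int strideInPixels, int scaleFactor) {
    if (IsFlatTexture(pixels, w, h, strideInPixels))
        return {1, 1};
    return {w * scaleFactor, h * scaleFactor};
}
```

Running the check before any allocation is what would let the Vulkan path avoid both the scaling work and the larger allocation for such textures.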
I only get tearing with Vulkan, all the other backends are fine (OpenGL, D3D11, D3D9). My Vsync driver configuration is "application controlled". I know I can force "Vsync on" via driver, but it's much better to control it by the application. And I wouldn't be able to get past 60 fps that way, even with unthrottle. |
We can choose between multiple present modes in Vulkan. On Windows we currently use MAILBOX, which lets us submit frames as fast as we can (to support non-frame-skipped unthrottle). It is supposed to be vsynced, but maybe it isn't always. There's also FIFO, which we use on Android. It probably guarantees better results, but it requires us to skip frames during unthrottle. We could use the vsync checkbox to select between those. |
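A rough sketch of what "use the vsync checkbox to select between those" could look like. To keep it compilable without the Vulkan headers, PresentMode is a stand-in enum for the two VkPresentModeKHR values discussed here, and ChoosePresentMode is a hypothetical name, not a function from the codebase.

```cpp
#include <algorithm>
#include <cassert>
#include <vector>

// Stand-in for VkPresentModeKHR (assumption: only the two modes named in
// this thread are modelled, so the sketch builds without Vulkan headers).
enum class PresentMode { Mailbox, Fifo };

// Hypothetical selection logic: vsync checked -> FIFO; vsync unchecked ->
// MAILBOX when the surface supports it, otherwise fall back to FIFO.
PresentMode ChoosePresentMode(const std::vector<PresentMode> &supported, bool vsync) {
    assert(!supported.empty());
    const bool hasMailbox =
        std::find(supported.begin(), supported.end(), PresentMode::Mailbox) != supported.end();
    if (!vsync && hasMailbox)
        return PresentMode::Mailbox;
    // The Vulkan spec requires FIFO support on every implementation,
    // so it is a safe default.
    return PresentMode::Fifo;
}
```

With FIFO selected, unthrottle would still need to skip frames, as noted above; the sketch only covers the mode choice.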
Try this again now. |
Well, I think it has improved, but it's still quite behind D3D11 and OpenGL. As for Vsync, it hasn't changed, as expected. |
I don't see this at all, plenty fast for me. Strange. |
DonelBueno, what GPU do you have? Also, that little change could maybe help if it's not nvidia... |
GTX 770, so no, that change didn't help. It still stutters heavily in some cases, strange. |
What cases? |
I'm using Windows 7, GTX 970, i5, all PPSSPP settings default. I get constant stuttering in Project Diva 2nd, Project Diva Extend, and Castlevania: The Dracula X Chronicles. Even with the resolution at native, it still stutters slightly. I should also mention the stuttering is present in all backends. I don't think it's a Vulkan-only problem. |
Doesn't matter how much I crank up texture scaling, I don't get any terrible stutters in any game I try. GTX 770, i7-3770K 3.5GHz, Windows 10, Vulkan. |
Just tried the very latest build. The stuttering is greatly reduced, but still present. |
Hmm. None of the scaling methods write memory sequentially, which is not great for discrete GPUs. It may be worth testing the performance on integrated memory GPUs. I doubt they care much about sequential access, but they may. Then we could add a flag (maybe assume mobile/desktop except on Vulkan where we can detect?) to force a temp buffer when scaling, rather than scaling directly? With the exception of swizzled textures (common) and DXT (uncommon), we decode a lot of textures sequentially. Might be worth benching to see if it's better to use a temp buffer in the non-sequential cases. But for texture scaling perf issues that happen on OpenGL (where we always use a temp buffer currently), this won't help. I think there the issue is simple: texture scaling takes time. Until someone invents magic, somebody else's problem field generators, or a spectacular texture scaling algorithm that looks amazing and is super fast... I guess that problem ain't going away. Best workaround in those cases is to try creating an HD texture pack, I suppose. -[Unknown] |
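A small sketch of the "flag to force a temp buffer" idea, under stated assumptions. ScaleNearest is a stand-in nearest-neighbour scaler (not xBRZ or any scaler from the codebase) that deliberately writes column by column to mimic non-sequential output. ScaleInto and forceTempBuffer are hypothetical names. With the flag set, the scattered writes land in ordinary heap memory and the destination only sees one sequential memcpy.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <vector>

// Stand-in scaler (assumption: nearest-neighbour, 32-bit pixels). It walks
// the output column by column, so its writes are not sequential in memory.
static void ScaleNearest(uint32_t *out, const uint32_t *src, int w, int h, int factor) {
    const int outW = w * factor;
    const int outH = h * factor;
    for (int x = 0; x < outW; ++x) {
        for (int y = 0; y < outH; ++y) {
            out[static_cast<size_t>(y) * outW + x] =
                src[static_cast<size_t>(y / factor) * w + (x / factor)];
        }
    }
}

// Hypothetical wrapper: dst stands for the mapped upload memory.
// forceTempBuffer == false: scale straight into dst (scattered writes).
// forceTempBuffer == true: scale into heap memory, then copy sequentially.
void ScaleInto(uint32_t *dst, const uint32_t *src, int w, int h, int factor,
               bool forceTempBuffer) {
    assert(dst != nullptr && src != nullptr && w > 0 && h > 0 && factor >= 1);
    if (!forceTempBuffer) {
        ScaleNearest(dst, src, w, h, factor);
        return;
    }
    std::vector<uint32_t> temp(static_cast<size_t>(w) * factor * h * factor);
    ScaleNearest(temp.data(), src, w, h, factor);
    std::memcpy(dst, temp.data(), temp.size() * sizeof(uint32_t));
}
```

Both paths produce identical pixels; which one is faster depends on the memory type behind dst, which is what the benchmark suggested above would have to settle.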
Write sequentially? That doesn't really matter too much, we write textures (at least in Vulkan) to a cached region of memory that's also mapped into the GPU's memory space (pushbuffer), then have the GPU copy it out of there into local vram, so performance of the upload itself should be reasonably good regardless. What we really should do is to copy the original size texture into this buffer instead, then have a compute shader perform the actual scaling. This will be way faster than running the scaling algorithm on the CPU, and it can write directly into the texture's storage. |
I could be wrong, but I thought coherent memory was uncached, and therefore suffered from random access. I thought #10108 improved Vulkan more than say OpenGL in part because of uncached memory. We write textures directly to mapped coherent push buffer memory, right? -[Unknown] |
Oh, right, confused myself a little. Example of available memory types: http://vulkan.gpuinfo.org/displayreport.php?id=2223#memory Yeah, we use coherent and not cached memory. We could also use cached memory, but then we'd have to manually call vkFlushMappedMemoryRanges, I think. Either way, a compute shader would spank whatever we are doing. We could also unswizzle and depalettize in a compute shader first, for even less memory copying from the CPU side. Of course this could also be done in a pixel shader, but compute shaders have much less launch overhead than render passes, so they're more suitable for texture uploads, of which there might be many in a frame. |
Hmm, maybe it's worth preferring cached and coherent: I don't have a good device to test coherent vs non-coherent cached. But my desktop GPU does have coherent and cached coherent options. Interestingly, using cached memory generally slowed things slightly (even though it was still coherent): Maybe we actually want to avoid cached memory (since we currently go by order, we might accidentally select cached memory unintentionally - for example on the Adreno 530 linked above.) Agreed that it'd be better to perform upscaling on the GPU. -[Unknown] |
Digging up this old thread. Regarding coherent vs cached, I've learned that for writes from the CPU it doesn't really matter much whether the memory is cached or not. Write combining is generally performed, and in practice mostly whole cachelines are written out anyway when you write large contiguous chunks of data like images. For uploads to the GPU we should thus indeed prefer COHERENT to CACHED. Reads are a whole other matter though; for those, CACHED can be very beneficial. |
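As an illustration of "prefer COHERENT to CACHED" for uploads, here is a sketch of a memory type picker. It does not use the Vulkan headers: the three constants mirror the numeric values of the VkMemoryPropertyFlagBits of the same meaning, and PickUploadMemoryType is a hypothetical name. The input is one property-flags value per memory type, in the order the device reports them.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Local copies of VkMemoryPropertyFlagBits values (assumption: defined here
// only so the sketch builds without the Vulkan headers).
constexpr uint32_t kHostVisible = 0x2;   // VK_MEMORY_PROPERTY_HOST_VISIBLE_BIT
constexpr uint32_t kHostCoherent = 0x4;  // VK_MEMORY_PROPERTY_HOST_COHERENT_BIT
constexpr uint32_t kHostCached = 0x8;    // VK_MEMORY_PROPERTY_HOST_CACHED_BIT

// Hypothetical picker for CPU-to-GPU upload memory. Returns the index of the
// first host-visible, coherent, uncached type; falls back to the first
// coherent cached type; returns -1 when no host-visible coherent type exists.
int PickUploadMemoryType(const std::vector<uint32_t> &typeFlags) {
    assert(typeFlags.size() <= 32);  // VK_MAX_MEMORY_TYPES
    int cachedFallback = -1;
    for (size_t i = 0; i < typeFlags.size(); ++i) {
        const uint32_t flags = typeFlags[i];
        if ((flags & kHostVisible) == 0 || (flags & kHostCoherent) == 0)
            continue;  // cannot be mapped, or would need manual flushes
        if ((flags & kHostCached) == 0)
            return static_cast<int>(i);  // coherent and uncached: preferred
        if (cachedFallback < 0)
            cachedFallback = static_cast<int>(i);
    }
    return cachedFallback;
}
```

Going purely by order, as mentioned earlier in the thread, could pick the cached type when it happens to be listed first; checking the cached bit explicitly avoids that. A readback path would want the opposite preference.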
Anyway regarding the actual issue here, we have two conflated ones, VSYNC and supposed extra stuttering when upscaling. Still no idea about the latter, does it still appear to be an issue? Regarding VSYNC we already use a mode that's supposed to be synced so I don't know what more we can do. |
Stutter still happens in 1.7.5.415. Monster Hunter Portable 3rd HD ver, texture scaling x5 xBRZ. |
Since the CPU cost of texture scaling should be the same, something else has to be different. Probably D3D11 has preallocated space while Vulkan runs out of buffers and has to go to the OS to get more with vkAllocateMemory. We might be able to fix this by increasing the size of our allocations when 5x texture scaling is used... but I'm putting that off for after the 1.8.0 release. |
About VSync, here are some clues. |
By the way, the Cemu emulator has just fixed their broken VSync option (though they use OpenGL), and Dolphin's Vulkan VSync also works properly on my NVIDIA card. I think there must be something wrong. |
Both problems are still present. I'm now using a GTX 1070 with driver version 430.53. |
Lack of vsync will be fixed in the next NV drivers if it isn't already. As for scaling stutter, yeah it's gonna be like that until we implement GPU texture upscaling. It will happen eventually... |
This allows the setting to be changed at runtime in Vulkan too. Should help hrydgard#10105.
Has this improved? -[Unknown] |
Confirmed, NVIDIA has fixed this. |
Does that mean we can close this, or is texture scaling still performing poorly? -[Unknown] |
Another test: Reducing tex scale factor to 2x. |
For me, if I change:
scaler.ScaleAlways((u32 *)writePtr, pixelData, fmt, w, h, scaleFactor);
To:
uint8_t *rearrange = (uint8_t *)AllocateAlignedMemory(w * scaleFactor * h * scaleFactor * 4, 16);
scaler.ScaleAlways((u32 *)rearrange, pixelData, fmt, w, h, scaleFactor);
memcpy(writePtr, rearrange, w * h * 4);
FreeAlignedMemory(rearrange);
The speed is improved to near OpenGL speeds. To me, this indicates it is definitely something about the memory it's writing to. Adding For me, Direct3D 11 is comparable in speed to OpenGL already (even with the mapRowPitch part.) -[Unknown] |
These are the remaining generalized issues on Vulkan for me.
Lack of Vsync is self-explanatory, there is screen tearing even with vsync turned on.
The stuttering with texture scaling is MUCH worse in Vulkan than in any other backend, I don't know why. It happens even with Hybrid at 2x.