(Vulkan) Lack of Vsync and severe stuttering during Texture Scaling #10105

Closed
DonelBueno opened this issue Nov 12, 2017 · 32 comments · Fixed by #12687

@DonelBueno

DonelBueno commented Nov 12, 2017

These are the remaining generalized issues on Vulkan for me.

Lack of Vsync is self-explanatory: there is screen tearing even with vsync turned on.

The stuttering with texture scaling is MUCH worse in Vulkan than in any other backend, I don't know why. It happens even with Hybrid at 2x.

@hrydgard
Owner

Lack of vsync is backend-dependent. Which ones have problems? I certainly have vsync in Vulkan for example.

Don't know what's up with the texture scaling in Vulkan, needs to be looked into. But ideally it should be moved to the GPU anyway.

@hrydgard hrydgard added this to the v1.6.0 milestone Nov 12, 2017
@unknownbrackets
Collaborator

Pretty sure you can turn on or off vsync in your driver, and this will likely affect Vulkan.

Texture scaling on Vulkan - currently - scales even empty or flat textures. They're still cached, but in GLES these are skipped.

It'd require reordering some logic (because of how it allocates) to make it able to skip scaling empty/flat textures. I bet we could make IsEmptyOrFlat faster, call it earlier, and potentially even use it to send a 1x1 texture.
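
To illustrate the idea of detecting flat textures before scaling, here is a minimal sketch. `IsFlat32`, `UploadSize`, and `ChooseUploadSize` are hypothetical names made up for this example, not PPSSPP's actual `IsEmptyOrFlat`, and the sketch assumes 32-bit pixels with a row pitch given in pixels.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical helper (not PPSSPP's IsEmptyOrFlat): returns true when every
// 32-bit pixel in the w x h image equals the first pixel.
// `pitch` is the row stride in pixels and must be at least `w`.
bool IsFlat32(const uint32_t *pixels, int w, int h, int pitch) {
	if (w <= 0 || h <= 0)
		return true;  // Nothing to scale.
	assert(pitch >= w);
	const uint32_t ref = pixels[0];
	for (int y = 0; y < h; ++y) {
		const uint32_t *row = pixels + (size_t)y * pitch;
		for (int x = 0; x < w; ++x) {
			if (row[x] != ref)
				return false;  // Early out on the first differing pixel.
		}
	}
	return true;
}

// Hypothetical result type: the dimensions to upload and whether the
// scaler needs to run at all.
struct UploadSize {
	int w;
	int h;
	bool scaled;
};

// A flat texture can be uploaded as a single pixel; anything else is scaled.
UploadSize ChooseUploadSize(const uint32_t *pixels, int w, int h, int pitch, int scaleFactor) {
	if (IsFlat32(pixels, w, h, pitch))
		return {1, 1, false};
	return {w * scaleFactor, h * scaleFactor, true};
}
```

The early return on the first differing pixel keeps the check cheap for ordinary textures, which is the case that matters if it is called before every scale.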

Those are the main differences I'm aware of.

-[Unknown]

@DonelBueno
Author

DonelBueno commented Nov 12, 2017

I only get tearing with Vulkan, all the other backends are fine (OpenGL, D3D11, D3D9).

My Vsync driver configuration is "application controlled". I know I can force "Vsync on" via driver, but it's much better to control it by the application. And I wouldn't be able to get past 60 fps that way, even with unthrottle.

@hrydgard
Owner

We can choose between multiple present modes in Vulkan. On Windows we currently use MAILBOX, which lets us submit frames as fast as we can (to support non-frame-skipped unthrottle). It should still vsync, but maybe doesn't always. There's also FIFO, which we use on Android; it probably guarantees better results but requires us to skip frames during unthrottle. We could use the vsync checkbox to select between those.
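
A rough sketch of what selecting the present mode from the vsync checkbox could look like. This is an assumption about one possible policy, not PPSSPP's actual code: `PresentMode` is a stand-in enum (with the same numeric values as `VkPresentModeKHR`) so the example compiles without the Vulkan headers, and `ChoosePresentMode` is a hypothetical name.

```cpp
#include <algorithm>
#include <cassert>
#include <vector>

// Stand-in for VkPresentModeKHR; values match VK_PRESENT_MODE_*_KHR.
enum class PresentMode { Immediate = 0, Mailbox = 1, Fifo = 2, FifoRelaxed = 3 };

// Hypothetical policy: vsync on -> FIFO; vsync off -> MAILBOX if available,
// then IMMEDIATE, otherwise fall back to FIFO.
PresentMode ChoosePresentMode(const std::vector<PresentMode> &supported, bool vsync) {
	auto has = [&](PresentMode m) {
		return std::find(supported.begin(), supported.end(), m) != supported.end();
	};
	// The Vulkan spec requires FIFO to be supported on every surface.
	assert(has(PresentMode::Fifo));
	if (!vsync) {
		if (has(PresentMode::Mailbox))
			return PresentMode::Mailbox;
		if (has(PresentMode::Immediate))
			return PresentMode::Immediate;
	}
	return PresentMode::Fifo;
}
```

In real code the `supported` list would come from `vkGetPhysicalDeviceSurfacePresentModesKHR`, and changing the mode means recreating the swapchain.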

@hrydgard
Owner

Try this again now.

@DonelBueno
Author

DonelBueno commented Nov 13, 2017

Well, I think it has improved, but it's still quite behind D3D11 and OpenGL.

As for Vsync, it hasn't changed, as expected.

@hrydgard
Owner

I don't see this at all, plenty fast for me. Strange.

@hrydgard
Owner

DonelBueno, what GPU do you have? Also, that little change could maybe help if it's not nvidia...

@DonelBueno
Author

GTX 770, so no, that change didn't help.

It still stutters heavily in some cases, strange.

@hrydgard
Owner

What cases?

@ghost

ghost commented Nov 13, 2017

I'm using Windows 7, GTX 970, i5, all PPSSPP settings default. I get constant stuttering in Project Diva 2nd, Project Diva Extend, and Castlevania: The Dracula X Chronicles. Even with the resolution at native, it still stutters slightly.

I should also mention the stuttering is present in all backends. I don't think it's a Vulkan only problem.

@hrydgard
Owner

Doesn't matter how much I crank up texture scaling, I don't get any terrible stutters in any game I try.

GTX 770, i7-3770K 3.5GHz, Windows 10, Vulkan.

@ghost

ghost commented Nov 13, 2017

Just tried the very latest build. The stuttering is greatly reduced, but still present.

@DonelBueno DonelBueno changed the title from "(Vulkan) Lack of Vsync and severe stuttering with Texture Scaling" to "(Vulkan) Lack of Vsync and severe stuttering during Texture Scaling" Nov 16, 2017
@unknownbrackets
Collaborator

unknownbrackets commented Dec 11, 2017

Hmm. None of the scaling methods will write memory sequentially, which is not great for discrete GPUs.

It may be worth testing the performance on integrated memory GPUs, since I doubt it cares much about sequential access, but it may. Then we could add a flag (maybe assume mobile/desktop except on Vulkan where we can detect?) to force a temp buffer when scaling, rather than scaling directly?

With the exception of swizzled textures (common) and DXT (uncommon), we decode a lot of textures sequentially. Might be worth benching to see if it's better to use a temp buffer in the non-sequential cases.

But for texture scaling perf issues that happen on OpenGL (where we always use a temp buffer currently), this won't help. I think there the issue is simple: texture scaling takes time. Until someone invents magic, somebody else's problem field generators, or a spectacular texture scaling algorithm that looks amazing and is super fast... I guess that problem ain't going away.

Best workaround in those cases is to try creating a HD texture pack, I suppose.

-[Unknown]

@hrydgard
Owner

hrydgard commented Dec 11, 2017

Write sequentially? That doesn't really matter too much, we write textures (at least in Vulkan) to a cached region of memory that's also mapped into the GPU's memory space (pushbuffer), then have the GPU copy it out of there into local vram, so performance of the upload itself should be reasonably good regardless.

What we really should do is to copy the original size texture into this buffer instead, then have a compute shader perform the actual scaling. This will be way faster than running the scaling algorithm on the CPU, and it can write directly into the texture's storage.

@unknownbrackets
Collaborator

unknownbrackets commented Dec 11, 2017

I could be wrong, but I thought coherent memory was uncached, and therefore suffered from random access. I thought #10108 improved Vulkan more than say OpenGL in part because of uncached memory.

We write textures directly to mapped coherent push buffer memory, right?

-[Unknown]

@hrydgard
Owner

hrydgard commented Dec 11, 2017

Oh, right, confused myself a little. Example of available memory types: http://vulkan.gpuinfo.org/displayreport.php?id=2223#memory

Yeah, we use coherent and not cached memory - we could also use cached, but then we'd have to manually call vkFlushMappedMemoryRanges, I think.

Either way, a compute shader would spank whatever we are doing. We could also unswizzle and depalettize in a compute shader first, for even less memory copying from the CPU side. Of course, this could also be done in a pixel shader, but compute shaders have much less launch overhead than renderpasses, so they're more suitable for texture uploads, of which there might be many in a frame.

@unknownbrackets
Collaborator

Hmm, maybe it's worth preferring cached and coherent:
http://vulkan.gpuinfo.org/displayreport.php?id=2202#memory

I don't have a good device to test coherent vs non-coherent cached. But my desktop GPU does have coherent and cached coherent options. Interestingly, using cached memory generally slowed things slightly (even though it was still coherent):
https://github.com/hrydgard/ppsspp/compare/master...unknownbrackets:vulkan-mem?expand=1

Maybe we actually want to avoid cached memory (since we currently go by order, we might accidentally select cached memory unintentionally - for example on the Adreno 530 linked above.)

Agreed that it'd be better to perform upscaling on the GPU.

-[Unknown]

@unknownbrackets unknownbrackets modified the milestones: v1.6.0, v1.7.0 Mar 24, 2018
@hrydgard hrydgard modified the milestones: v1.7.0, v1.8.0 Oct 6, 2018
@hrydgard
Owner

hrydgard commented Feb 6, 2019

Digging up this old thread: regarding coherent vs cached, I've learned that for writes from the CPU it doesn't really matter much whether the memory is cached, since write combining is generally performed. In practice, mostly whole cachelines are written out regardless when you write large contiguous chunks of data like images. For uploads to the GPU we should thus indeed prefer COHERENT to CACHED.

Reads are a whole other matter though, for those CACHED can be very beneficial.
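
The preference described above (uncached coherent memory for uploads, cached memory for reads) could be expressed as a small scoring function over the device's memory types. This is only a sketch under assumptions: the flag constants mirror the numeric values of `VkMemoryPropertyFlagBits` so it compiles without Vulkan headers, `PickMemoryType` and `Usage` are hypothetical names, and for simplicity it only considers coherent types so no manual flush or invalidate is needed.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Stand-ins mirroring VkMemoryPropertyFlagBits values.
constexpr uint32_t kDeviceLocal = 0x1;
constexpr uint32_t kHostVisible = 0x2;
constexpr uint32_t kHostCoherent = 0x4;
constexpr uint32_t kHostCached = 0x8;

enum class Usage { Upload, Readback };

// Returns the index of the preferred memory type, or -1 if none qualifies.
// `typeFlags[i]` holds the property flags of memory type i, and
// `allowedTypeBits` plays the role of VkMemoryRequirements::memoryTypeBits.
int PickMemoryType(const std::vector<uint32_t> &typeFlags, uint32_t allowedTypeBits, Usage usage) {
	assert(typeFlags.size() <= 32);
	int best = -1;
	int bestScore = -1;
	for (size_t i = 0; i < typeFlags.size(); ++i) {
		if (!(allowedTypeBits & (1u << i)))
			continue;
		const uint32_t f = typeFlags[i];
		// This sketch only considers mappable, coherent types.
		if (!(f & kHostVisible) || !(f & kHostCoherent))
			continue;
		const bool cached = (f & kHostCached) != 0;
		// Uploads prefer uncached memory; readbacks prefer cached memory.
		const int score = (usage == Usage::Readback) ? (cached ? 1 : 0) : (cached ? 0 : 1);
		if (score > bestScore) {
			best = static_cast<int>(i);
			bestScore = score;
		}
	}
	return best;
}
```

Scoring instead of taking the first match avoids the "we go by order, so we might select cached memory unintentionally" problem mentioned earlier in the thread.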

@hrydgard
Owner

hrydgard commented Feb 6, 2019

Anyway regarding the actual issue here, we have two conflated ones, VSYNC and supposed extra stuttering when upscaling. Still no idea about the latter, does it still appear to be an issue?

Regarding VSYNC we already use a mode that's supposed to be synced so I don't know what more we can do.

@Narugakuruga
Contributor

Stutter still happens in 1.7.5.415.
i5-8400 GTX 1050 Ti Windows 10

Monster Hunter Portable 3rd HD ver, texture scaling x5 xBRZ.
Vulkan: When I enter the village and immediately move the character, a stutter occurs. I'm very sure it's a stutter because both audio and graphics pause for a short time.
D3D11: When I enter the village and immediately move the character, there is no stutter at all.
OpenGL: When I enter the village and immediately move the character, there may be a stutter, but it is not nearly as obvious as in Vulkan because the audio stays smooth.

@hrydgard
Owner

hrydgard commented Feb 6, 2019

Since the CPU cost of texture scaling should be the same, something else has to be different. Probably D3D11 has preallocated space while Vulkan runs out of buffers and has to go to the OS to get more with vkAllocateMemory. We might be able to fix this by increasing the size of our allocations when 5x texture scaling is used... but I'm putting that off for after the 1.8.0 release.
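
As a back-of-the-envelope illustration of sizing allocations by scale factor: a texture scaled by N in each dimension holds N * N times as many pixels, so 5x scaling needs 25 times the space. The helper below is a hypothetical sketch (`ScaledBlockSize` and its parameters are made up for this example and do not reflect PPSSPP's allocator).

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical: grow an allocation block with the square of the texture
// scale factor, clamped to `maxBlockSize` so it never grows without bound.
size_t ScaledBlockSize(size_t baseBlockSize, int scaleFactor, size_t maxBlockSize) {
	assert(scaleFactor >= 1);
	const size_t growth = (size_t)scaleFactor * (size_t)scaleFactor;
	// Compare by division to avoid overflowing the multiplication.
	if (baseBlockSize > maxBlockSize / growth)
		return maxBlockSize;
	return baseBlockSize * growth;
}
```

For example, a 1 MiB base block at 5x scaling becomes 25 MiB, while a 64 MiB base block with a 256 MiB cap is clamped to the cap.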

@hrydgard hrydgard modified the milestones: v1.8.0, v1.9.0 Feb 6, 2019
@Narugakuruga
Contributor

About VSync, here are some clues.
The NVIDIA driver offers 4 kinds of VSync: "On", "Adaptive", "Adaptive (Half refresh rate)", and "Fast".
Tested on 1.7.5.415 with the Cube test program.
On: Nothing special, works properly on PPSSPP Vulkan.
Adaptive: Works at first. But after I went into the graphics settings and came back (not on purpose), the screen tears.
Adaptive (Half refresh rate): Nothing special, works properly on PPSSPP Vulkan.
Fast: Doesn't work at all on PPSSPP Vulkan.

@Narugakuruga
Contributor

Narugakuruga commented Feb 6, 2019

By the way, the Cemu emulator just fixed its broken VSync option (though it uses OpenGL), and Dolphin's Vulkan VSync also works properly on my NVIDIA card. I think there must be something wrong.
Hope you can fix this small issue so I can continue to feel good about buying Gold.

@DonelBueno
Author

Both problems are still present. I'm now using a GTX 1070 with driver 430.53.

@hrydgard
Owner

hrydgard commented Aug 8, 2019

Lack of vsync will be fixed in the next NV drivers if it isn't already. As for scaling stutter, yeah it's gonna be like that until we implement GPU texture upscaling. It will happen eventually...

@hrydgard hrydgard modified the milestones: v1.9.0, v1.10.0 Aug 8, 2019
unknownbrackets added a commit to unknownbrackets/ppsspp that referenced this issue Mar 1, 2020
This allows the setting to be changed at runtime in Vulkan too.

Should help hrydgard#10105.
@unknownbrackets
Collaborator

Has this improved?

-[Unknown]

@Narugakuruga
Contributor

Confirmed, NVIDIA has fixed this.

@unknownbrackets
Collaborator

Does that mean we can close this, or is texture scaling still performing poorly?

-[Unknown]

@Narugakuruga
Contributor

I don't think we can close it yet.
I ran a simple test using the frametime graph. The result shows OpenGL has the best performance when handling texture scaling, while Vulkan and D3D11 perform worse (especially Vulkan).
I used 5x texture scaling with xBRZ to test Monster Hunter Portable 3rd HD, entering the same area, which has a large number of textures, each time. I recorded the frametime graph (the recording itself had no influence on frametime).

[Screenshot: Annotation 2020-03-07 113241]
(OpenGL) The frametime almost instantly stabilizes after loading into the area.

[Screenshot: Annotation 2020-03-07 113530]
(Vulkan) The frametime is in an unstable pattern after loading into the area.

[Screenshot: Annotation 2020-03-07 113602]
But stabilizes after a few seconds.

[Screenshot: Annotation 2020-03-07 113403]
(D3D11) The frametime is unstable too, but somewhat better than Vulkan.

@Narugakuruga
Contributor

Another test: reducing the texture scale factor to 2x.
This helps lower the frametime fluctuations in all backends, but Vulkan and D3D11 still have heavy stuttering.

@unknownbrackets
Collaborator

For me, if I change:

			scaler.ScaleAlways((u32 *)writePtr, pixelData, fmt, w, h, scaleFactor);

To:

			uint8_t *rearrange = (uint8_t *)AllocateAlignedMemory(w * scaleFactor * h * scaleFactor * 4, 16);
			scaler.ScaleAlways((u32 *)rearrange, pixelData, fmt, w, h, scaleFactor);
			memcpy(writePtr, rearrange, w * h * 4);
			FreeAlignedMemory(rearrange);

The speed is improved to near OpenGL speeds. To me, this indicates it is definitely something about the memory it's writing to.

Adding VK_MEMORY_PROPERTY_HOST_CACHED_BIT to the texture push buffer only also significantly improved speed, making it comparable to OpenGL. The scaling code actually reads from the output buffer during its scaling, which is why cached memory helps.
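
The temp-buffer approach can be shown in a self-contained form. This is a sketch, not the PPSSPP code path: `ScaleNearest` is a trivial nearest-neighbor stand-in for the real scalers, `ScaleViaTempBuffer` is a hypothetical name, and `writePtr` stands for the mapped (possibly uncached) destination. The point is that all scaler reads and writes hit ordinary system memory, and the mapped memory only sees one sequential `memcpy`.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <vector>

// Stand-in scaler (nearest neighbor). The real scalers are far more complex
// and, per the comment above, also read back from their output buffer.
void ScaleNearest(uint32_t *out, const uint32_t *src, int w, int h, int factor) {
	const int outW = w * factor;
	const int outH = h * factor;
	for (int y = 0; y < outH; ++y) {
		for (int x = 0; x < outW; ++x) {
			out[(size_t)y * outW + x] = src[(size_t)(y / factor) * w + (x / factor)];
		}
	}
}

// Scale into a temporary buffer in regular (cached) system memory, then copy
// the finished image to the destination in one sequential pass.
void ScaleViaTempBuffer(void *writePtr, const uint32_t *src, int w, int h, int factor) {
	assert(w > 0 && h > 0 && factor >= 1);
	std::vector<uint32_t> temp((size_t)w * factor * h * factor);
	ScaleNearest(temp.data(), src, w, h, factor);
	std::memcpy(writePtr, temp.data(), temp.size() * sizeof(uint32_t));
}
```

The cost is one extra copy of the scaled image, which the measurements in this comment suggest is cheaper than letting the scaler work directly in uncached memory.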

For me, Direct3D 11 is comparable in speed to OpenGL already (even with the mapRowPitch part.)

-[Unknown]
