Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Vulkan Swapchain and Submit Synchronization Incorrect #5559

Closed
cwfitzgerald opened this issue Apr 18, 2024 · 4 comments · Fixed by #5681
Closed

Vulkan Swapchain and Submit Synchronization Incorrect #5559

cwfitzgerald opened this issue Apr 18, 2024 · 4 comments · Fixed by #5681
Labels
api: vulkan Issues with Vulkan area: correctness We're behaving incorrectly area: wsi Issues with swapchain management or windowing help required We need community help to make this happen. type: bug Something isn't working

Comments

@cwfitzgerald
Copy link
Member

cwfitzgerald commented Apr 18, 2024

We are currently using a single ping-pong semaphore (I'm calling this set of two semaphores a "relay semaphore") for all of our synchronization between acquire/present and between submissions. This is wrong for multiple reasons.

First using the same relay semaphore for both the submit -> present barrier and the submit -> submit barrier means that a single semaphore is being waited on twice. The act of waiting on a semaphore also resets the semaphore, so waiting on a single semaphore twice results in undefined behavior.

Second we are using the same relay semaphore for all acquire -> submit -> present cycles. This would be fine if we could guarantee that the previously used submit had finished its wait and the semaphore was reset by the time we scheduled the following acquire, but we can't. We need to have N relay semaphores, one per acquire cycle, until we can prove that the least-recently-used semaphore is finished being used.

I have provided the diagram of how submission synchronization needs to work with lines. Blue lines are swapchain relay semaphores (one set per frame). Orange lines are submit -> submit relay semaphore (a single one on the device.

flowchart LR
    A1[Acquire 1] --> S1[Submit 1]
    S1 --> P1[Present 1]
    S1 --> S21

    A2[Acquire 2] --> S21[Submit 2]
    S21 --> S22[Submit 3]
    S22 --> P2[Present 2]
    S22 --> S3[Submit 4]
    S3 --> S4
    
    A3[Acquire 3] --> S4[Submit 5]
    S4 --> P3[Present 3]

    subgraph Frame 3
        A3
        S4
        P3
    end

    subgraph Free Submission
        S3
    end

    subgraph Frame 2
        A2
        S21
        S22
        P2
    end

    subgraph Frame 1
        A1
        S1
        P1
    end

    linkStyle 2,4,6,7 stroke:#ffa600,whoops;
    linkStyle 0,1,3,5,8,9 stroke:#019297,whoops;
Loading

Possibly related:

@cwfitzgerald cwfitzgerald added type: bug Something isn't working help required We need community help to make this happen. area: correctness We're behaving incorrectly api: vulkan Issues with Vulkan area: wsi Issues with swapchain management or windowing labels Apr 18, 2024
@jimblandy
Copy link
Member

jimblandy commented Apr 27, 2024

The use of relaxed atomics in there seems very strange to me. I can't imagine how one could get away with not caring about which values concurrent threads would see for relay_index.

We need to have N relay semaphores, one per acquire cycle, until we can prove that the least-recently-used semaphore is finished being used.

This sounds like something akin to wgpu_hal::vulkan::Fence::FencePool. In fact, perhaps we could literally just replace Queue::relay_semaphores and relax_index with a vulkan::Fence?? Since vkQueueSubmit2 can specify the semaphore value to wait for, a single timeline semaphore should be sufficient for all submit-submit synchronization.

@cwfitzgerald
Copy link
Member Author

So I thought about that, and you can wait and signal the same timeline semaphore, but we can't assume timeline semaphores exist, so we'd need a binary semaphore fallback anyway.

@jimblandy
Copy link
Member

jimblandy commented May 6, 2024

So I thought about that, and you can wait and signal the same timeline semaphore, but we can't assume timeline semaphores exist, so we'd need a binary semaphore fallback anyway.

Right - my point being, vulkan::Fence is exactly a timeline semaphore with a vkFence collection as a fallback. I added some doc comments to vulkan:: Fence about this just recently.

@cwfitzgerald
Copy link
Member Author

We need a timeline semaphore with a vk::Semaphore fallback, so I'm not sure we can share the infra.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
api: vulkan Issues with Vulkan area: correctness We're behaving incorrectly area: wsi Issues with swapchain management or windowing help required We need community help to make this happen. type: bug Something isn't working
Projects
Status: Done
Development

Successfully merging a pull request may close this issue.

2 participants