Skip to content

Conversation

@bob80905
Copy link
Contributor

@bob80905 bob80905 commented Oct 17, 2025

This PR is the first step in bringing support to testing with partially mapped and unmapped resources.
It lays out the infrastructure necessary to initialize resources with only a certain number of tiles mapped.
The Load overload that takes the 2nd status argument is implemented in DXC, so we take advantage of that and test that when accessing data in an unmapped resource, we get 0'd data and the status integer is also 0.

The test is currently unsupported in Clang and Vulkan, but will be supported in the future.
Fixes #182

Copy link
Collaborator

@bogner bogner left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is great. We may want to revisit and see if we can do some code deduplication in a nice way at some point, but I think this makes sense as is for now.

A few of minor comments.

std::optional<VulkanBinding> VKBinding;
Buffer *BufferPtr = nullptr;
bool HasCounter;
int TilesMapped = -1;
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

std::optional<int> would be clearer than using -1 as a sigil here, and isn't really that much more expensive.

Comment on lines 12 to 13


Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Some extra whitespace here

Comment on lines 16 to 17
// 32 S structs inside X or Y occupy 64KB of data. (32 * 512 ints * 4 bytes per int)
// So, any index into the buffer >= [32] will access a new "tile"
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks like this might be over 80-column. It's a bit annoying to clang-format within these .test files (easiest is to paste in another file and clang-format that...), but please try to follow LLVM style regardless.

Comment on lines 756 to 1137
memset(ExtraData, 0, CBVSize - R.size() - 1);
memset(ExtraData, 0, CBVSize - R.size());
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is this an unrelated bug fix? I'm nervous about a change like this without accompanying tests.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, it is, but I won't be in the business of writing a test for this change here in this PR.
I'll undo this for now.
#477

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Are you still intending to undo this change, or did you find it is actually required?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I am intending to undo this change, it should already be undone right now.

Format: Int32
Stride: 2048 # S is 512 ints, 512*4 = 2048.
FillSize: 131072
FillValue: 1
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's easier to see what parts of the data are read if we use something more noticeable here (like 42, or 9001 or whatever)

Comment on lines 127 to 128
# XFAIL: Clang
# XFAIL: Vulkan
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

These need comments like Unimplemented <link to issue> or Bug <link to issue> so that we can track why they're XFAIL'd and loop back appropriately.

@bob80905 bob80905 force-pushed the add_unmapped_resources_tests branch from 8d9cbbb to d3cbf91 Compare October 22, 2025 20:12
Comment on lines 735 to 740
// Tile mapping setup (optional if NumTiles > 0)
const UINT NumTiles = getNumTiles(R.TilesMapped, ResDesc.Width);

ComPtr<ID3D12Heap> Heap; // optional, only created if NumTiles > 0

if (NumTiles > 0) {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This whole block (up to the updateTileMappings call) looks identical to the SRV version. Can we pull out a helper function to call in both cases?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Turns out I'll have to leave unmapped UAVs for another time, so I'll pull out a helper function in a future PR.

Comment on lines 580 to 581
// Tile mapping setup (optional if NumTiles > 0)
const UINT NumTiles = getNumTiles(R.TilesMapped, ResDesc.Width);
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This comment feels a bit confusing - NumTiles being zero is the edge case given that it can only happen if we explicitly set zero tiles. "optional if NumTiles > 0" sounds to me like NumTiles will usually be zero.

@bob80905 bob80905 force-pushed the add_unmapped_resources_tests branch from afcbf1e to 3e78f39 Compare October 24, 2025 17:46
addUploadEndBarrier(IS, Destination, R.isReadWrite());
}

UINT getNumTiles(std::optional<int> NumTiles, UINT64 Width) {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

should this just contain a uint instead of an int? I assume the number of tiles can never be negative

else
// Map the entire buffer by computing how many 64KB tiles cover it
Ret = static_cast<UINT>(
(Width + D3D12_TILED_RESOURCE_TILE_SIZE_IN_BYTES - 1) /
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

could this operation overflow? Why is width a uint64?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Currently the arg that feeds into the width parameter is an int datatype. So it doesn't need to be a uint64, at least not yet. I'll change it.

Copy link
Collaborator

@spall spall Nov 7, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

could width or d3d12_tiled_resource_tile_size_in_bytes be uint32 max or half that amount; could this operation overflow after you change width to be a uint32?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't believe D3D12_TILED_RESOURCE_TILE_SIZE_IN_BYTES can be any larger than 65536. However, width could be any arbitrary size, depending on how the developer specified it. So yes, I think this operation could overflow, though in exceedingly specific and pathological circumstances.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

FWIW: in D3D12, Width is a uint64 because it needs to be able to support describing buffers that are larger than 4gb.

Copy link
Contributor Author

@bob80905 bob80905 Nov 7, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, though for this API:
UpdateTileMappings
The pRangeTileCounts param is a UINT. Its value in my code comes directly from counting the number of tiles. So I don't think this should ever be > uint32_max.

if (R.size() < CBVSize) {
void *ExtraData = static_cast<char *>(ResDataPtr) + R.size();
memset(ExtraData, 0, CBVSize - R.size() - 1);
memset(ExtraData, 0, CBVSize - R.size());
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

should this line be unmodified? I see a long unresolvd comment thread about it and the conclusion is the change should be undone, but I still see a modification?

ComPtr<ID3D12Resource> Readback)
: Upload(Upload), Buffer(Buffer), Readback(Readback) {}
ComPtr<ID3D12Resource> Readback,
ComPtr<ID3D12Heap> Heap = nullptr)
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

why is this defaulted to null but the others are not?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Both CBV and UAV don't need to pass the Heap to the Resource bundle (yet) because only SRVs are using reserved resources by default, and that's the only case where the lifetime of the Heap need to be extended. The other cases will need non-null values, especially for functions like Bind* or CreateComputeCommands.

@damyanp
Copy link
Collaborator

damyanp commented Nov 10, 2025

Code LGTM, on assumption that all these tests failing aren't related to this change.

@bob80905
Copy link
Contributor Author

Code LGTM, on assumption that all these tests failing aren't related to this change.

llvm/llvm-project#166449
Actually, the above PR is a prerequisite for this PR, otherwise the new file will fail.

@bogner
Copy link
Collaborator

bogner commented Nov 11, 2025

Code LGTM, on assumption that all these tests failing aren't related to this change.

llvm/llvm-project#166449 Actually, the above PR is a prerequisite for this PR, otherwise the new file will fail.

The test will only fail with clang without that change, it should pass with dxc - we generally mark such tests as XFAIL: Clang waiting on the unimplemented feature.

Copy link
Collaborator

@bogner bogner left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

API generally looks good to me. A few comments on improving / simplifying the test.

if (CAFMResult){
CAFM[idx] = CAFMResult;
Out[idx] = Out0.x;
Out[idx + 4] = status;
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In practice using status directly is the same as using the result of CheckAccessFullyMapped, but at least according to the docs it's illegal to access status directly. We should probably use CAFMResult here if it's useful (or just omit this - the CAFM buffer records the information already).


uint status;
int4 Out0 = X.Load(0, status);
int CAFMResult = CheckAccessFullyMapped(status) ? 1 : 0;
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why not bool CAFMResult = CheckAccessFullyMapped(status);? Also why not make CAFM an RWStructuredBuffer<bool> instead of casting this to int at all?

int4 Out0 = X.Load(0, status);
int CAFMResult = CheckAccessFullyMapped(status) ? 1 : 0;
if (CAFMResult){
CAFM[idx] = CAFMResult;
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It would be clearer to assign to CAFM[idx] unconditionally - that way we're not relying on the initialization of the CAFM buffer at all. It might also make sense to explicitly assign a known constant (like 0x5a5a5a5a) to Out[idx] when CAFMResult is false

if (CAFMResult)
Out[idx] = Result.x;
else {
if (all(Result == int4(0,0,0,0)))
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is Result even defined if CheckAccessFullyMapped returns false? We've read from an unmapped tile in that case...

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think reading from unmapped depends on the tiled resource tier, and is undefined in some cases: https://learn.microsoft.com/en-us/windows/win32/api/d3d12/ne-d3d12-d3d12_tiled_resources_tier

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Add support for reserved resources to the offload tester

4 participants