Change mip level padding requirements to support mmapping #116

Closed
MarkCallow opened this issue Jan 15, 2020 · 15 comments · Fixed by #117

Comments

@MarkCallow
Contributor

There is a desire to make sure mip levels are aligned to meet the needs of mmapping. See KhronosGroup/KTX-Software#161 for initial discussion.

The proposal is to require the first mip level in the file to be aligned on a texel block-sized boundary. For block-compressed textures, subsequent mip levels will naturally fall on block-sized boundaries, so mipPadding will always be 0 bytes.

For uncompressed textures, it is necessary to be cognizant of the Vulkan requirement for bufferOffset in a VkBufferImageCopy to be both a multiple of 4 and of the texel block size. This means that 1, 2 and 3 component uncompressed textures may still require mipPadding.
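The bufferOffset rule described above can be checked with a few lines of arithmetic. This is only a sketch: the helper names and the texture sizes are made up for illustration, and the check covers just the two conditions mentioned in this comment (multiple of 4, multiple of the texel block size).

```python
def is_valid_copy_offset(offset: int, texel_block_size: int) -> bool:
    """True if offset is a multiple of both 4 and the texel block size."""
    return offset % 4 == 0 and offset % texel_block_size == 0


def tightly_packed_offsets(level_sizes):
    """Byte offset of each level when levels are stored back to back."""
    offsets, position = [], 0
    for size in level_sizes:
        offsets.append(position)
        position += size
    return offsets


# Hypothetical RGB8 texture (3-byte texel block): 5x5, 2x2 and 1x1 levels.
rgb8_offsets = tightly_packed_offsets([5 * 5 * 3, 2 * 2 * 3, 1 * 1 * 3])
rgb8_valid = [is_valid_copy_offset(o, 3) for o in rgb8_offsets]

# Hypothetical compressed texture with 8-byte blocks: 4, 1, 1 and 1 blocks.
bc_offsets = tightly_packed_offsets([4 * 8, 8, 8, 8])
bc_valid = [is_valid_copy_offset(o, 8) for o in bc_offsets]

print(rgb8_offsets, rgb8_valid)
print(bc_offsets, bc_valid)
```

With these made-up sizes, the tightly packed RGB8 levels after the first land on offsets that are not multiples of 4, while every 8-byte-block level stays valid without any padding.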

If this spec change is acceptable, the trick now is to find a way to express all this simply in the specification. One complexity to be expressed is that the padding following the key/value data has to be align(8) if followed by supercompressionGlobalData; otherwise it has to be align(texel block size).
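A rough sketch of that conditional rule, as it stands at this point in the discussion. The function names and the example offsets are hypothetical, not taken from the specification or from libktx.

```python
def padding_to(offset: int, alignment: int) -> int:
    """Bytes needed to advance offset to the next multiple of alignment."""
    return (-offset) % alignment


def kvd_trailing_padding(kvd_end: int, has_sgd: bool, texel_block_size: int) -> int:
    """Padding after the key/value data under the rule proposed above.

    has_sgd: whether supercompressionGlobalData follows the key/value data.
    """
    alignment = 8 if has_sgd else texel_block_size
    return padding_to(kvd_end, alignment)


# Key/value data assumed to end at byte 100 of the file.
print(kvd_trailing_padding(100, True, 16))   # followed by supercompressionGlobalData
print(kvd_trailing_padding(100, False, 16))  # followed directly by 16-byte-block mip data
```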

@alecazam

Thinking about this some more. Just having any mips that follow header + key-value data always be aligned to 16 would be enough for current formats. rgba32f is 16 bytes, rgba16f is 8 bytes, and BC/ASTC formats are all 8 or 16 bytes in size. Any mips that follow would be aligned to block size.

It's great that the image length in the mip data was left out from the KTX spec, since that was a lot of what threw off alignment. That said, many KTX (v1) textures are written out with KTXorientation fields, which likely also throw off the first mip alignment.

If not aligned, then mip data must be copied, and can't be mmap-ed directly from the file.

One interesting tidbit is that Microsoft states that, with block-compressed formats, DX12 may need the top mip (though not lower mips) to be a multiple of the block dimensions. With ASTC block dimensions ranging from 4x4 to 12x12, that's a complicating factor in generating valid mip textures.

@MarkCallow
Contributor Author

Thinking about this some more. Just having any mips that follow header + key-value data always be aligned to 16

Unfortunately, certain uncompressed formats have block sizes of which 16 is not a multiple, and some have block sizes greater than 16, so we can't simplify like this. In libktx's case it is not hard to use the actual block size, as this is given by the DFD, which is initialized very early in the process of creating a .ktx2 file. In other words, the hard part of the work has already been done.

MarkCallow added a commit to MarkCallow/KTX-Specification that referenced this issue Jan 31, 2020
@MarkCallow
Contributor Author

Please review PR #117.

@alecazam

Thanks for addressing this. I think using max(blockSize, 4), as you specify, is a reasonable amount of pre-padding. Callers can always copy the block if it's not aligned. Mostly, 8 and 16-byte block-compressed textures should upload nicely directly by mmap of the ktx file, or via copying into a staging buffer for blitting to a discrete gpu texture or heap or sparse texture for twiddling. The 4-byte mip post-padding, as long as it doesn't kick in, should also not throw off the block formats.

@zeux

zeux commented Jan 31, 2020

To clarify, does this change the existing 8 byte alignment to 4 byte alignment for Basis compressed textures?

@MarkCallow
Contributor Author

Yes the alignment will change. I am now thinking to remove mipPadding for supercompressed textures as it is only necessary for mmapping into a staging buffer which will never be done for supercompressed textures.

Separately I've realized that align(max(blockSize, 4)) and align(4) are wrong. The padding needs to be to the least common multiple of 4 and the texel block size (at least for Vulkan staging buffers). I'll post a note here when I've updated the PR.
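The difference between the two rules shows up for any block size that is neither a multiple nor a divisor of 4. A small sketch, with hypothetical helper names and a made-up starting offset:

```python
import math


def lcm(a: int, b: int) -> int:
    """Least common multiple, written with gcd so it runs on older Pythons."""
    return a * b // math.gcd(a, b)


def align_up(offset: int, alignment: int) -> int:
    """Smallest multiple of alignment that is >= offset."""
    return -(-offset // alignment) * alignment


block_size = 6     # e.g. a 3-component format with 16-bit components
first_free = 50    # made-up offset at which padding starts

with_max = align_up(first_free, max(block_size, 4))
with_lcm = align_up(first_free, lcm(block_size, 4))

# max() yields a multiple of 6 that is not a multiple of 4;
# lcm() yields a multiple of both.
print(with_max, with_max % 4, with_max % block_size)
print(with_lcm, with_lcm % 4, with_lcm % block_size)
```

For block sizes that are already multiples of 4 (8, 16, and so on) the two rules agree, which is presumably why the problem only surfaces with formats such as the 6-byte ones.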

@MarkCallow
Contributor Author

@alecazam is initialMipPadding really necessary for mmapping? Can you not mmap the offset of the start of the mip levels into the start of, e.g., a Vulkan staging buffer? The mmap man page seems to say that with MAP_FIXED the specified file offset will be mapped to the provided address, though there are the weasel words "try to" in there.

@alecazam

alecazam commented Feb 1, 2020

In Metal, the pixels for the mips must start on a multiple of the block size. So if you just mmap the entire file from the start, then that initial mip padding is needed. I don't think that starting address helps since the start of the file is mapped to that, or you can specify a page-multiple offset to mmap.

If you copy the mips into a staging buffer, then you can obviously do anything, but you incur a copy then. I've never used the mmap starting address, but the offset must be a multiple of the page size, so that limits where the data is placed.
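One common way to live with a granularity-restricted mmap offset is to map from the nearest allowed boundary at or below the data and then skip the remainder inside the mapping. The sketch below uses Python's `mmap` module, which requires `offset` to be a multiple of `mmap.ALLOCATIONGRANULARITY`. The function name `mapped_region` and the offsets are hypothetical, and nothing here is specific to KTX, Vulkan or Metal.

```python
import contextlib
import mmap


@contextlib.contextmanager
def mapped_region(fileobj, start: int, length: int):
    """Yield a read-only, zero-copy view of file bytes [start, start + length).

    The mapping begins at the allocation-granularity boundary at or below
    start, so the view begins `delta` bytes into the mapping. length must be
    greater than zero.
    """
    granularity = mmap.ALLOCATIONGRANULARITY
    aligned_start = (start // granularity) * granularity
    delta = start - aligned_start
    mapping = mmap.mmap(fileobj.fileno(), delta + length,
                        access=mmap.ACCESS_READ, offset=aligned_start)
    try:
        # Views are released before the mapping is closed.
        with memoryview(mapping) as whole:
            with whole[delta:delta + length] as region:
                yield region
    finally:
        mapping.close()
```

Usage would look like `with mapped_region(f, mip_start, mip_length) as view: ...`, where `mip_start` and `mip_length` are assumed to come from the file's level index. The address of the view inside the mapping is then `delta` bytes past a page boundary, so whether it meets a block-size alignment requirement still depends on where the mip data sits in the file.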

My recommendation was for 16-byte (instead of 8-byte) alignment of the starting mip. I think the block formats all upload with 8 or 16 bytes, and the explicit formats likely need a multiple of 4 bytes in general. RGBA32F is 16 bytes, RGBA16F is 8 bytes. I'm not familiar with bigger formats, but maybe there are double-based textures now.

And yes, for supercompressed textures, none of this applies. I anticipate, given the size of compressed formats, that most users will opt for supercompressed textures.

@MarkCallow
Contributor Author

In Vulkan all single plane, non-depth/stencil formats must start on a multiple of the "texel block size", which is the block size for block formats and numComponents * componentBytes for other formats. There are 6, 12, 24 and 32 byte formats for which 16 would not work.
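A quick way to see which block sizes a fixed 16-byte alignment fails to cover: a 16-aligned offset is guaranteed to be block-aligned only when 16 is a multiple of the block size. The helper name and the list of sizes are illustrative only.

```python
def fixed_alignment_covers(fixed_alignment: int, texel_block_size: int) -> bool:
    """True if every multiple of fixed_alignment is also a multiple of the block size."""
    return fixed_alignment % texel_block_size == 0


block_sizes = [1, 2, 4, 6, 8, 12, 16, 24, 32]
not_covered = [b for b in block_sizes if not fixed_alignment_covers(16, b)]
print(not_covered)
```

The sizes left in `not_covered` are exactly the 6, 12, 24 and 32 byte cases mentioned above.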

mmap on macOS does not have the requirement for offset to be a multiple of page size. The man page says

If offset or len is not a multiple of the pagesize, the mapped region may extend past the specified range. Any extension beyond the end of the mapped object will be zero-filled

So I thought you could just mmap the mips part of the file to the start of the texture buffer.

Does Metal not need staging or other GPU allocated buffers mapped to host memory in order to upload textures? Does it sample them directly from host memory? How do you get the data from an mmapped whole KTX2 file into a Metal texture?

My thought was to use mmapping to avoid the step of copying the data from the file to the staging buffer or the actual texture buffer for linear-tiled textures. However I'm not even sure it is possible to have GPU memory mapped to the host to which you then mmap a file. Anyone know?

@MarkCallow
Contributor Author

Pinging @alecazam again. Please explain how you would use mmapping to load a texture in Metal and why some value of initialMipPadding helps.

@alecazam

alecazam commented Feb 4, 2020

Yes, offsetting the mmap to the start of the mips would work. But if you wanted to concatenate several KTX or KTX2 files together, then you'd need more than one mmap, or you'd need to copy the mip data out to another file and then mmap that once. So the thought was just that, instead of 8-byte initial mip padding, 16 bytes would suffice for a wide variety of texture formats. It just seems strange to pick 8 when all of the ASTC blocks are 16 bytes, as are most of the newer BC and ETC formats. Block formats, 2/4-element fp16/fp32 formats and most pow2 4-byte formats would all naturally be block aligned.

Metal can pull from a staging buffer, and that staging buffer can be a no-copy mmap, but the mips must be block aligned and a multiple of the page size. GPU memory won't be mapped to that, it will point to a private copy that is formatted appropriately for the gpu. But if you want to purge or restore the content, then the mmap is a "free" backing store. iOS only treats read-only mmap pages as not counting towards overall memory pressure.

@MarkCallow
Contributor Author

Metal can pull from a staging buffer, and that staging buffer can be a no-copy mmap, but the mips must be block aligned and a multiple of the page size.

What mmap flags does "no-copy mmap" correspond to? I.e., what do you mean by it?

At the moment I am trying to understand how, in Vulkan, you ensure proper alignment of the physical address of the start of a staging buffer or linear-tiled texture. The spec only talks about alignment of the offsets within the staging buffer. 0 is clearly a valid offset regardless of block size.

VkMemoryRequirements returns the required alignment, but only if you are allocating memory for an image will the image format have been passed in when querying the requirements. Regardless, VkMemoryAllocateInfo, which is what is passed to the allocator, has no field for alignment. I need to study further. Do you know? It is because of the apparent lack of any way to ensure the alignment that I asked if it was necessary for the very first mip level to be aligned on a block-size boundary, which, on reflection, seems to be a stupid question.

I think I'll require an alignment of lcm(texel block size, 4) for the start of all mip levels, i.e., the least common multiple.
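Applied to every level, that rule gives a simple layout loop. This is a sketch only: `layout_levels`, the starting offset of 100 and the RGB8 level sizes are assumptions for illustration, and the order in which levels appear in a real file is not addressed.

```python
import math


def layout_levels(first_free: int, level_sizes, texel_block_size: int):
    """Return (offset, padding_before) for each level.

    Every level starts on a multiple of lcm(texel_block_size, 4).
    """
    alignment = texel_block_size * 4 // math.gcd(texel_block_size, 4)
    placed, position = [], first_free
    for size in level_sizes:
        padding = (-position) % alignment
        position += padding
        placed.append((position, padding))
        position += size
    return placed


# Hypothetical RGB8 texture (3-byte texel block): 5x5, 2x2 and 1x1 levels,
# with the preceding file contents assumed to end at byte 100.
rgb8_layout = layout_levels(100, [75, 12, 3], 3)
print(rgb8_layout)
```

Each resulting offset is a multiple of 12, so it satisfies both the 4-byte and the 3-byte block requirement at once.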

@MarkCallow
Contributor Author

@alecazam, in Metal is the texel block size alignment only required for block compressed textures or is it for any texture? Is there any requirement also for 4 byte alignment?

@alecazam

alecazam commented Feb 4, 2020

I don't think this is well documented, and it's only something that fires as a Metal validation. The validator may be different for macOS and iOS as well. But the validation message is typically that the base address needs to be a multiple of the block size, in order to blit from staging to private. I'm sure formats like RGB8 also need 4 byte alignment, but I typically only use compressed textures.

Also, the mmap used as a backing store can use the no-copy version of MTLBuffer. But the base address and the mmap length must be a multiple of the page size. This means fudging the mmap area of smaller files, but if you could just map them all in a single file, then all that is required is that all mip content be aligned.

https://developer.apple.com/documentation/metal/mtldevice/1433382-makebuffer

There are various other approaches to achieve this. The mip data could be split off from the header on KTX/KTX2/DDS files. Then supercompressed textures would unpack to the mip data, and be rewritten to disk that way. KTX (v1) is thrown off since the data has the 4-byte size field on each mip level, so there you need to strip those, or copy individual mips to another location without that offset. KTX2 will be easier to manipulate since it doesn't include that.
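The effect of the KTX (v1) per-level size field on alignment can be seen with a little arithmetic. The sketch assumes 16-byte blocks, a made-up data start of 64, and ignores every other detail of either file layout.

```python
def level_offsets(data_start: int, level_sizes, size_field_bytes: int):
    """Offset of each level's image data when every level is preceded by a
    size field of size_field_bytes bytes (0 means no such field)."""
    offsets, position = [], data_start
    for size in level_sizes:
        position += size_field_bytes
        offsets.append(position)
        position += size
    return offsets


BLOCK = 16
sizes = [4 * BLOCK, BLOCK, BLOCK]  # three hypothetical levels

with_size_field = level_offsets(64, sizes, 4)     # KTX (v1) style
without_size_field = level_offsets(64, sizes, 0)  # no per-level size field

print([o % BLOCK for o in with_size_field])
print([o % BLOCK for o in without_size_field])
```

Even though the data region starts on a 16-byte boundary, each 4-byte size field shifts the following level off block alignment, which is why those fields have to be stripped or the levels copied elsewhere.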

@MarkCallow
Contributor Author

Please review PR #117 again.

mipPadding is now only required for non-supercompressed files. There is no longer a separate initialMipPadding and mipPadding. The function to calculate the required padding has been changed from max to lcm, fixing a bug in the previous version of the PR.

I think this is now as simple as it can be.
