Optimised COG layout for masks #4931

Closed
tbonfort opened this issue Dec 3, 2021 · 2 comments
Comments

@tbonfort
Member

tbonfort commented Dec 3, 2021

Not sure if this should be a (documentation) issue or a mailing list message, but here goes...

From the COG ghost area spec https://gdal.org/drivers/raster/cog.html#header-ghost-area it is unclear to me how masks are/should be handled in these 2 cases:

  • when using band interleaved data (planarconfiguration=2): after which band should the mask be interleaved?
  • when there is more than one mask (in which case it seems that at least masks[n].TileByteCounts still needs to be read)
@rouault
Member

rouault commented Dec 3, 2021

  • when using band interleaved data (planarconfiguration=2)

For now the COG definition excludes band interleaved data (see https://gdal.org/drivers/raster/cog.html#high-level). I had forgotten about this until recently, when I was wondering how to create a 5-band JPEG-compressed COG, which would have required band interleaved data, and I remembered this limitation in the current definition. One justification might be that Baseline TIFF is only pixel interleaved ( https://www.awaresystems.be/imaging/tiff/tifftags/planarconfiguration.html ), but tiled TIFF is not in Baseline either. I've raised it in opengeospatial/CloudOptimizedGeoTIFF#3 . Nothing would prevent having band interleaved data. But even without speaking about masks, I can imagine scenarios where the layout might be Tile1_Band1, Tile2_Band1, ... TileN_Band1, Tile1_Band2, Tile2_Band2, .... TileN_Band2 or Tile1_Band1, Tile1_Band2, ... Tile1_BandB, Tile2_Band1, Tile2_Band2, .... Tile2_BandB
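
To make the two candidate orderings concrete, here is a minimal Python sketch that enumerates them for a small file. The function names `band_major_order` and `tile_major_order` are hypothetical and chosen only for illustration; this is not GDAL code and does not write any TIFF data.

```python
from typing import List, Tuple

Block = Tuple[int, int]  # (tile index, band index), both 1-based


def band_major_order(n_tiles: int, n_bands: int) -> List[Block]:
    """All tiles of band 1, then all tiles of band 2, and so on."""
    return [(t, b) for b in range(1, n_bands + 1) for t in range(1, n_tiles + 1)]


def tile_major_order(n_tiles: int, n_bands: int) -> List[Block]:
    """All bands of tile 1, then all bands of tile 2, and so on."""
    return [(t, b) for t in range(1, n_tiles + 1) for b in range(1, n_bands + 1)]


def label(order: List[Block]) -> str:
    """Render an ordering using the TileX_BandY naming from the comment."""
    return ", ".join(f"Tile{t}_Band{b}" for t, b in order)


if __name__ == "__main__":
    print(label(band_major_order(2, 2)))
    print(label(tile_major_order(2, 2)))
```

The first ordering would presumably favour readers that fetch a whole band at a time, the second readers that fetch every band of a given tile.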

when there is more than one mask (in which case it seems that at least masks[n].TileByteCounts still need to be read)

Are you speaking about several IFDs of the same dimension tagged with NewSubfileType = FILETYPE_MASK? GDAL typically doesn't handle this. My understanding of the code is that it only takes the first one into account and ignores the following ones. The GDAL API is limited to one mask per band/dataset.
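
As a rough illustration of the "first mask only" behaviour described above, the sketch below picks the first full-resolution mask from a list of already-parsed IFDs. The `NewSubfileType` bit values are the standard TIFF ones; the helper `first_full_res_mask` and the dict-based IFD representation are assumptions made for this example and do not reflect how GDAL is actually implemented.

```python
from typing import Dict, List, Optional

# NewSubfileType bit flags as defined by the TIFF 6.0 specification.
FILETYPE_REDUCEDIMAGE = 0x1
FILETYPE_PAGE = 0x2
FILETYPE_MASK = 0x4


def first_full_res_mask(ifds: List[Dict[str, int]], width: int, height: int) -> Optional[int]:
    """Return the index of the first mask IFD matching the full-resolution
    image size, or None. Any later matching mask IFD is ignored."""
    for index, ifd in enumerate(ifds):
        subfile = ifd.get("NewSubfileType", 0)
        is_mask = bool(subfile & FILETYPE_MASK)
        is_overview = bool(subfile & FILETYPE_REDUCEDIMAGE)
        same_size = ifd["ImageWidth"] == width and ifd["ImageLength"] == height
        if is_mask and not is_overview and same_size:
            return index
    return None


# Hypothetical file: imagery, two full-resolution masks, one overview and its mask.
example_ifds = [
    {"NewSubfileType": 0, "ImageWidth": 1024, "ImageLength": 1024},
    {"NewSubfileType": FILETYPE_MASK, "ImageWidth": 1024, "ImageLength": 1024},
    {"NewSubfileType": FILETYPE_MASK, "ImageWidth": 1024, "ImageLength": 1024},
    {"NewSubfileType": FILETYPE_REDUCEDIMAGE, "ImageWidth": 512, "ImageLength": 512},
    {"NewSubfileType": FILETYPE_REDUCEDIMAGE | FILETYPE_MASK, "ImageWidth": 512, "ImageLength": 512},
]

if __name__ == "__main__":
    print(first_full_res_mask(example_ifds, 1024, 1024))
```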

@tbonfort
Member Author

tbonfort commented Dec 3, 2021

For now the COG definition excludes band interleaved data (snip) I've raised it to [opengeospatial/CloudOptimizedGeoTIFF#3]

OK, I wasn't aware of that, and it is indeed an oversight.

But even without speaking about masks, I can imagine scenarios where the layout might be Tile1_Band1, Tile2_Band1, ... TileN_Band1, Tile1_Band2, Tile2_Band2, .... TileN_Band2 or Tile1_Band1, Tile1_Band2, ... Tile1_BandB, Tile2_Band1, Tile2_Band2, .... Tile2_BandB

Yes, and that decision is left to the user creating the file, to optimize for the most common data access pattern. We can also create some even more esoteric layouts for e.g. 4 band data where we know the 4th band will be rarely accessed:
T1B1,T1B2,T1B3,T2B1,T2B2,T2B3, ... TNB1,TNB2,TNB3,T1B4,T2B4,...TNB4
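
A layout like this one can be described as an ordered list of band groups, with tiles interleaved inside each group. The following is a minimal Python sketch under that assumption; `grouped_layout` is a hypothetical helper name, not an existing GDAL function or creation option.

```python
from typing import List, Sequence


def grouped_layout(n_tiles: int, band_groups: Sequence[Sequence[int]]) -> List[str]:
    """List block labels group by group: within a group, every band of tile 1,
    then every band of tile 2, and so on; groups follow each other in order."""
    order: List[str] = []
    for group in band_groups:
        for tile in range(1, n_tiles + 1):
            for band in group:
                order.append(f"T{tile}B{band}")
    return order


if __name__ == "__main__":
    # Bands 1-3 interleaved per tile, rarely accessed band 4 stored afterwards.
    print(",".join(grouped_layout(2, [[1, 2, 3], [4]])))
```

With `n_tiles=2` this yields `T1B1,T1B2,T1B3,T2B1,T2B2,T2B3,T1B4,T2B4`, so a reader interested only in bands 1 to 3 could stop before the band 4 blocks.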

Are you speaking about several IFDs of the same dimension tagged with NewSubfileType = FILETYPE_MASK

OK thanks, then the GDAL-specific ghost area is irrelevant in this case.
