Empty section handling #50

Open
DanW97 opened this issue Aug 30, 2024 · 2 comments

DanW97 commented Aug 30, 2024

I've noticed that PyVista includes all possible sections for PolyData files, and it isn't readily apparent how this can be prevented in a sane manner.

As an example, this is a dataset with the following attributes: NumberOfPoints="6" NumberOfVerts="6" NumberOfLines="0" NumberOfStrips="0" NumberOfPolys="0"

In binary:

      <Verts>
        <DataArray type="Int64" Name="connectivity" format="binary" RangeMin="0" RangeMax="5">
          AQAAAACAAAAwAAAAGAAAAA==eJxjYIAARijNBKWZoTQLlGaF0gABSAAQ
        </DataArray>
        <DataArray type="Int64" Name="offsets" format="binary" RangeMin="1" RangeMax="6">
          AQAAAACAAAAwAAAAGAAAAA==eJxjZIAAJijNDKVZoDQrlGaD0gAB8AAW
        </DataArray>
      </Verts>
      <Lines>
        <DataArray type="Int64" Name="connectivity" format="binary" RangeMin="1e+299" RangeMax="-1e+299">
          AAAAAACAAAAAAAAA
        </DataArray>
        <DataArray type="Int64" Name="offsets" format="binary" RangeMin="1e+299" RangeMax="-1e+299">
          AAAAAACAAAAAAAAA
        </DataArray>
      </Lines>
      <Strips>
        <DataArray type="Int64" Name="connectivity" format="binary" RangeMin="1e+299" RangeMax="-1e+299">
          AAAAAACAAAAAAAAA
        </DataArray>
        <DataArray type="Int64" Name="offsets" format="binary" RangeMin="1e+299" RangeMax="-1e+299">
          AAAAAACAAAAAAAAA
        </DataArray>
      </Strips>
      <Polys>
        <DataArray type="Int64" Name="connectivity" format="binary" RangeMin="1e+299" RangeMax="-1e+299">
          AAAAAACAAAAAAAAA
        </DataArray>
        <DataArray type="Int64" Name="offsets" format="binary" RangeMin="1e+299" RangeMax="-1e+299">
          AAAAAACAAAAAAAAA
        </DataArray>
      </Polys>

And in ASCII (as you can see, the sections are empty; I'm not sure what the AAAAAACAAAAAAAAA entries in the binary file mean):

      <Verts>
        <DataArray type="Int64" Name="connectivity" format="ascii" RangeMin="0" RangeMax="5">
          0 1 2 3 4 5
        </DataArray>
        <DataArray type="Int64" Name="offsets" format="ascii" RangeMin="1" RangeMax="6">
          1 2 3 4 5 6
        </DataArray>
      </Verts>
      <Lines>
        <DataArray type="Int64" Name="connectivity" format="ascii" RangeMin="1e+299"
          RangeMax="-1e+299">
        </DataArray>
        <DataArray type="Int64" Name="offsets" format="ascii" RangeMin="1e+299" RangeMax="-1e+299">
        </DataArray>
      </Lines>
      <Strips>
        <DataArray type="Int64" Name="connectivity" format="ascii" RangeMin="1e+299"
          RangeMax="-1e+299">
        </DataArray>
        <DataArray type="Int64" Name="offsets" format="ascii" RangeMin="1e+299" RangeMax="-1e+299">
        </DataArray>
      </Strips>
      <Polys>
        <DataArray type="Int64" Name="connectivity" format="ascii" RangeMin="1e+299"
          RangeMax="-1e+299">
        </DataArray>
        <DataArray type="Int64" Name="offsets" format="ascii" RangeMin="1e+299" RangeMax="-1e+299">
        </DataArray>
      </Polys>
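
For reference, the AAAAAACAAAAAAAAA entries look like a base64-encoded compression header with no data following it. The sketch below is not vtkio code: `base64_decode` and `header_words` are hypothetical helpers, and it assumes the header is a sequence of little-endian UInt32 words (number of blocks, block size, last block size, then one compressed size per block), which is consistent with the "header bytes: 4" line in the trace output further down this thread.

```rust
/// Decode standard base64 (optionally '='-padded) into bytes.
/// Returns None if a character outside the standard alphabet is found.
fn base64_decode(input: &str) -> Option<Vec<u8>> {
    let mut out = Vec::new();
    let mut acc: u32 = 0; // bit accumulator
    let mut bits: u32 = 0; // number of pending bits held in `acc`
    for c in input.bytes() {
        let v = match c {
            b'A'..=b'Z' => c - b'A',
            b'a'..=b'z' => c - b'a' + 26,
            b'0'..=b'9' => c - b'0' + 52,
            b'+' => 62,
            b'/' => 63,
            b'=' => break, // padding marks the end of the data
            _ => return None,
        };
        acc = (acc << 6) | u32::from(v);
        bits += 6;
        if bits >= 8 {
            bits -= 8;
            out.push((acc >> bits) as u8);
            acc &= (1 << bits) - 1; // keep only the bits not yet emitted
        }
    }
    Some(out)
}

/// Interpret a base64-encoded header as little-endian u32 words.
/// Assumption: the file uses a UInt32 header type.
fn header_words(b64: &str) -> Option<Vec<u32>> {
    let bytes = base64_decode(b64)?;
    if bytes.len() % 4 != 0 {
        return None;
    }
    Some(
        bytes
            .chunks_exact(4)
            .map(|w| u32::from_le_bytes([w[0], w[1], w[2], w[3]]))
            .collect(),
    )
}
```

Under those assumptions, `header_words("AAAAAACAAAAAAAAA")` gives `[0, 32768, 0]`, i.e. zero blocks with a block size of 32768, while the header of the non-empty Verts array, `header_words("AQAAAACAAAAwAAAAGAAAAA==")`, gives `[1, 32768, 48, 24]`: one block holding 48 uncompressed bytes (six Int64 values) compressed to 24 bytes.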

Running in debug mode, I see that xml.rs, line 1900 tries to execute an overflowing subtraction. In release mode the overflow is silent, which doesn't seem like a good thing to me. So I am wondering what the best approach to handling this case is. I've added a couple of lines to xml.rs that make decompress() return early with a vector containing a single u8 set to 0. From what I can tell, no other code needs modification. From some quick testing with the example files that the snippets above come from, it appears to work as intended: the output is polys: Some(XML { connectivity: [], offsets: [] }), for instance.

My reasoning is that if a read is requested, we should try to read everything that we can correctly and return the exact contents of the file (unless invalid data is present). Full disclosure: I'm in two minds about whether this should be an error or whether it should be left to the user to handle the fact that they have empty sections. I'm leaning towards the latter, because an empty section doesn't necessarily seem like invalid data to me.

I'm curious about your thoughts on this type of case.

Owner

elrnv commented Sep 4, 2024

Thank you for bringing this up, @DanW97! I want to clarify: is the issue in reading base64 encoded VTP files produced by PyVista using vtkio? If so, could you paste the output when running vtkio with logging set to "trace"?

Author

DanW97 commented Sep 5, 2024

Yes, that's correct.

> If so, could you paste the output when running vtkio with logging set to "trace"?

I couldn't find the original file I had when creating this issue, so I've created another that reproduces it, this time with <Piece NumberOfPoints="5" NumberOfVerts="0" NumberOfLines="0" NumberOfStrips="0" NumberOfPolys="3">.

Debug:

[2024-09-05T13:42:43Z TRACE vtkio::xml] [decompress]: Num blocks: 0
[2024-09-05T13:42:43Z TRACE vtkio::xml] [decompress]: header bytes: 4
[2024-09-05T13:42:43Z TRACE vtkio::xml] [decompress]: full header bytes: 12
[2024-09-05T13:42:43Z TRACE vtkio::xml] [decompress]: full header bytes in base 64: 16
[2024-09-05T13:42:43Z TRACE vtkio::xml] [decompress]: Decoded header length: 12
[2024-09-05T13:42:43Z TRACE vtkio::xml] [decompress]: Block size: 32768
[2024-09-05T13:42:43Z TRACE vtkio::xml] [decompress]: Last block size: 0
[2024-09-05T13:42:43Z TRACE vtkio::xml] [decompress]: Compressed offsets: [0]
[2024-09-05T13:42:43Z TRACE vtkio::xml] [decompress]: Total number of bytes: 0
thread 'main' panicked at /home/dan/.cargo/git/checkouts/vtkio-b6ae8ec40896763b/c4feb50/src/xml.rs:1900:34:
attempt to subtract with overflow
note: run with `RUST_BACKTRACE=1` environment variable to display a backtrace

Release:

[2024-09-05T13:44:25Z TRACE vtkio::xml] [decompress]: Num blocks: 0
[2024-09-05T13:44:25Z TRACE vtkio::xml] [decompress]: header bytes: 4
[2024-09-05T13:44:25Z TRACE vtkio::xml] [decompress]: full header bytes: 12
[2024-09-05T13:44:25Z TRACE vtkio::xml] [decompress]: full header bytes in base 64: 16
[2024-09-05T13:44:25Z TRACE vtkio::xml] [decompress]: Decoded header length: 12
[2024-09-05T13:44:25Z TRACE vtkio::xml] [decompress]: Block size: 32768
[2024-09-05T13:44:25Z TRACE vtkio::xml] [decompress]: Last block size: 0
[2024-09-05T13:44:25Z TRACE vtkio::xml] [decompress]: Compressed offsets: [0]
[2024-09-05T13:44:25Z TRACE vtkio::xml] [decompress]: Total number of bytes: 0
thread 'main' panicked at library/alloc/src/raw_vec.rs:25:5:
capacity overflow
note: run with `RUST_BACKTRACE=1` environment variable to display a backtrace

In xml.rs:1900 you have vec![0u8; nu * (nb - 1) + np], where nb is inferred to be usize. From the trace, an empty section has zero blocks ("Num blocks: 0"), so nb is 0 and subtracting 1usize from 0usize overflows.
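
To make the failure mode concrete, here is a minimal sketch of a guarded length computation. It is not the vtkio implementation: `decompressed_len` and `wrapping_len` are hypothetical names, and it assumes (from the trace labels) that nb is the number of blocks, nu the block size and np the last block size.

```rust
/// Buffer length for `nb` blocks of size `nu`, where the last block holds `np` bytes.
/// Returns Some(0) for an empty array (nb == 0) and None if the size overflows usize.
fn decompressed_len(nb: usize, nu: usize, np: usize) -> Option<usize> {
    if nb == 0 {
        // No blocks at all: nothing to decompress, so avoid computing `nb - 1`.
        return Some(0);
    }
    nu.checked_mul(nb - 1)?.checked_add(np)
}

/// What `nu * (nb - 1) + np` evaluates to when overflow checks are disabled,
/// as in a default release build.
fn wrapping_len(nb: usize, nu: usize, np: usize) -> usize {
    nu.wrapping_mul(nb.wrapping_sub(1)).wrapping_add(np)
}
```

With the values from the trace (nb = 0, nu = 32768, np = 0), `decompressed_len` returns `Some(0)`, whereas `wrapping_len` returns a value larger than `isize::MAX`, which would explain the "capacity overflow" panic from the allocator in the release build.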

With a quick

for piece in pieces {
    // Print each parsed piece using its Debug representation.
    println!("{:?}", piece);
}

it shows that the sections of the ASCII file (identical to the binary file, but written with binary=False in PyVista) are correctly interpreted as empty when they are empty:

Inline(PolyDataPiece { points: F64([0.0, 0.0, 0.0, 1.0, 0.0, 0.0, 1.0, 1.0, 0.0, 0.0, 1.0, 0.0, 0.5, 0.5, -1.0]), verts: Some(XML { connectivity: [], offsets: [] }), lines: Some(XML { connectivity: [], offsets: [] }), polys: Some(XML { connectivity: [0, 1, 2, 3, 0, 1, 4, 1, 2, 4], offsets: [4, 7, 10] }), strips: Some(XML { connectivity: [], offsets: [] }), data: Attributes { point: [DataArray(DataArrayBase { name: "scalars", elem: Scalars { num_comp: 1, lookup_table: None }, data: F64([0.11370923041857384, 0.45998515979005794, 0.15322062479736465, 0.19034519863052368, 0.5868799896571429]) })], cell: [] } })

In case you want to try and reproduce locally, I've attached the script and dependencies I was using to generate the files.
Archive.zip
