Fix bug in size estimation of array buffers #2991
Conversation
An off-by-one bug in `estimated_bytes_size` caused the size estimations of Tensors to completely ignore the actual payload, causing the memory consumption of images to be vastly underestimated. I decided to rewrite the code to be more clear for future readers.
Force-pushed from b0f91ba to cae4e4e.
👍
Nice catch
This seems to be catching a bug in the e2e tests now! That's good, I guess :) But it needs debugging… I should probably also add a unit test to make sure the new code is correct and won't regress.
Co-authored-by: Jeremy Leibs <[email protected]>
So I've found the difference: the Python version writes validity bitmaps where the Rust version doesn't. Maybe the Rust arrow library omits them when they are all set, or something. I will dig deeper. Still, this is a real difference in the data.
So on the Rust side:

```rust
DataType::Struct(vec![
    Field {
        name: "translation".to_owned(),
        data_type: <crate::datatypes::Vec3D>::to_arrow_datatype(),
        is_nullable: true,
        metadata: [].into(),
    },
    Field {
        name: "matrix".to_owned(),
        data_type: <crate::datatypes::Mat3x3>::to_arrow_datatype(),
        is_nullable: true,
        metadata: [].into(),
    },
    Field {
        name: "from_parent".to_owned(),
        data_type: DataType::Boolean,
        is_nullable: false,
        metadata: [].into(),
    },
])
```

while Python has this:

```python
pa.struct(
    [
        pa.field(
            "translation",
            pa.list_(pa.field("item", pa.float32(), nullable=False, metadata={}), 3),
            nullable=True,
            metadata={},
        ),
        pa.field(
            "matrix",
            pa.list_(pa.field("item", pa.float32(), nullable=False, metadata={}), 9),
            nullable=True,
            metadata={},
        ),
        pa.field("from_parent", pa.bool_(), nullable=False, metadata={}),
    ]
)
```

EDIT: @jleibs pointed out this is not wrong.
I think you're misreading the Python. The field that's set non-nullable there is the inner nested type, not the outer type.
For clarity it would be nice if this were re-implemented recursively. But unless the inner types don't match, those look like they're all doing the same thing.
Oh, you're right… I got confused by how Rust and C++ call out to other functions for the inner datatype, while Python repeats/inlines it. Never mind, then.
The Python encoder is outputting validities with all zeroes, but the Rust and C++ arrow encoders omit the validity. Maybe it is an optimization in Rust and C++ to only output the validity if it is non-zero? That works if the decoder interprets a missing validity as "no nulls". From https://arrow.apache.org/docs/format/Columnar.html#sparse-union:

This sounds to me like an omitted validity map means "all set = no nulls"… I'm giving up for today.
This doesn't match what I'm seeing. For me, Python and C++ both output the same encoding, and Rust is the one that's different.
```
# Conflicts:
#	crates/re_types/source_hash.txt
#	crates/re_types_builder/src/codegen/rust/serializer.rs
```
An off-by-one bug in `estimated_bytes_size` caused the size estimations of Tensors to completely ignore the actual payload, causing the memory consumption of images to be vastly underestimated. I decided to rewrite the code to be more clear for future readers.

This discovered a difference in how validity bitmaps are written by Rust and Python+C++, fixed in a828441.

To help find this problem, a lot of work was also put into improving the output of `scripts/ci/run_e2e_roundtrip_tests.py`.

Checklist