Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Don't leak arrow headers into Rerun C++ API headers #2873

Closed
Wumpf opened this issue Jul 31, 2023 · 0 comments · Fixed by #3041
Closed

Don't leak arrow headers into Rerun C++ API headers #2873

Wumpf opened this issue Jul 31, 2023 · 0 comments · Fixed by #3041
Labels
😤 annoying Something in the UI / SDK is annoying to use 🌊 C++ API C/C++ API specific codegen/idl

Comments

@Wumpf
Copy link
Member

Wumpf commented Jul 31, 2023

Currently the codegen includes some arrow headers into our own headers. This is unfortunate since it drags in arrow into the user's global namespace when we could avoid it fairly easily. The main issues are:

  • utility functions on generated headers that are needed by aren't useful to the user (ideally we only expose to_data_cell which returns our own type)
  • arrow::Result/arrow::Status - we should have a different error value

We should strive to more clearly separate concerns

Note that we can't avoid dragging in arrow itself for building the SDK (and binary distribution of the C++ SDK remains very challenging as we depend on C++ standard library types which are notoriously poisonous to binary interfaces)

@Wumpf Wumpf added 😤 annoying Something in the UI / SDK is annoying to use codegen/idl 🌊 C++ API C/C++ API specific labels Jul 31, 2023
Wumpf added a commit that referenced this issue Jul 31, 2023
* Part of #2647
* Next step after #2820

### What

Introduces APIs for component and archetype logging as well as the
necessary methods in codegen to do so. A very basic example is included
in `main.cpp`.
Largely unrelated to the rest:
* Makes C++ tagged union types copy-able
* `to_arrow_datatype` constructs/allocates the datatype lazily exactly
once (using C++'s crazy local `static` variable guarantees)

Commit by commit review possible.


![image](https://github.com/rerun-io/rerun/assets/1220815/8e7fdf3a-4fe7-45b1-a835-463b13fc34d8)

Next steps:
* Add roundtrip tests
* Add C++ to ci (roundtrip, linting, etc.)
* Add codegen custom code injection
* Improve API usability in varous places, including #2873 
* add other tests
* serialize unions
* serialize datatypes nested in unions, structs and lists
* more testing & roundtripping

### Checklist
* [x] I have read and agree to [Contributor
Guide](https://github.com/rerun-io/rerun/blob/main/CONTRIBUTING.md) and
the [Code of
Conduct](https://github.com/rerun-io/rerun/blob/main/CODE_OF_CONDUCT.md)
* [x] I've included a screenshot or gif (if applicable)
* [x] I have tested [demo.rerun.io](https://demo.rerun.io/pr/2874) (if
applicable)

- [PR Build Summary](https://build.rerun.io/pr/2874)
- [Docs
preview](https://rerun.io/preview/pr%3Aandreas%2Fcpp-codegen%2Farchetype-builder-and-logging/docs)
- [Examples
preview](https://rerun.io/preview/pr%3Aandreas%2Fcpp-codegen%2Farchetype-builder-and-logging/examples)
Wumpf added a commit that referenced this issue Aug 2, 2023
…t APIs & introduce C++ SDK tests (#2890)

* Part of #2516
(SDK! Not codegen! :))
* Next in the cpp series after #2874

### What

Adds a test dependency to the
[Catch2](https://github.com/catchorg/Catch2/) testing framework in order
to start testing all the new RecordingStream features added here.

C++ tests can be conveniently run via
`./rerun_cpp/build_and_run_tests.sh` now!

For quick api overview start with the `rerun.h` and `recording_stream.h`
headers.

Fixes a range of compiler warnings as a consequence of improving some of
the CMake setup, more to do there!
Adds lots more documentation to RecordingStream as well.


Next steps:
* Add C++ to ci (linting, running this test suite)
* Add roundtrip tests
*  Add codegen custom code injection
    * Improve API usability in varous places, including 
* #2873
* add other tests
* serialize unions
* serialize datatypes nested in unions, structs and lists
* more testing & roundtripping

### Checklist
* [x] I have read and agree to [Contributor
Guide](https://github.com/rerun-io/rerun/blob/main/CONTRIBUTING.md) and
the [Code of
Conduct](https://github.com/rerun-io/rerun/blob/main/CODE_OF_CONDUCT.md)
* [x] I've included a screenshot or gif (if applicable)
* [x] I have tested [demo.rerun.io](https://demo.rerun.io/pr/2890) (if
applicable)

- [PR Build Summary](https://build.rerun.io/pr/2890)
- [Docs
preview](https://rerun.io/preview/pr%3Aandreas%2Fcpp-api%2Fbetter-recording-stream/docs)
- [Examples
preview](https://rerun.io/preview/pr%3Aandreas%2Fcpp-api%2Fbetter-recording-stream/examples)
Wumpf added a commit that referenced this issue Aug 11, 2023
### What

As kindly supplied by a user on Discord. Didn't get to test them myself
yet but makes lot of sense to me regardless.

* Part of #2919
* requires additionally C++20 fixes from #2957
* currently every dependend binary has to know about arrow, should be
fixed as part of #2873

### Checklist
* [x] I have read and agree to [Contributor
Guide](https://github.com/rerun-io/rerun/blob/main/CONTRIBUTING.md) and
the [Code of
Conduct](https://github.com/rerun-io/rerun/blob/main/CODE_OF_CONDUCT.md)
* [x] I've included a screenshot or gif (if applicable)
* [x] I have tested [demo.rerun.io](https://demo.rerun.io/pr/2959) (if
applicable)

- [PR Build Summary](https://build.rerun.io/pr/2959)
- [Docs
preview](https://rerun.io/preview/pr%3Aandreas%2Fcpp%2Fwindows-fixes/docs)
- [Examples
preview](https://rerun.io/preview/pr%3Aandreas%2Fcpp%2Fwindows-fixes/examples)
Wumpf added a commit that referenced this issue Aug 17, 2023
### What

Introduces error handling to C/C++.
* C status object
* utilities for dealing safely with pointers in rerun_c and exposing any
errors on the way
* C++ wrapper with configurable logging
* bunch of tests for various error cases on recording stream (not 100%
coverage but deemed "good enough")
* depend on loguru in tests but not in the sdk itself

Missing (and already in the works!) is a custom result type that will
allow us to remove dependency on arrow status/result which right now
doesn't translate to `rerun::Status`

* Part of #2919
* Almost solves #2917
* Prerequisite for #2873

### Checklist
* [x] I have read and agree to [Contributor
Guide](https://github.com/rerun-io/rerun/blob/main/CONTRIBUTING.md) and
the [Code of
Conduct](https://github.com/rerun-io/rerun/blob/main/CODE_OF_CONDUCT.md)
* [x] I've included a screenshot or gif (if applicable)
* [x] I have tested [demo.rerun.io](https://demo.rerun.io/pr/3001) (if
applicable)

- [PR Build Summary](https://build.rerun.io/pr/3001)
- [Docs
preview](https://rerun.io/preview/pr%3Aandreas%2Fcpp%2Ferrorhandling/docs)
- [Examples
preview](https://rerun.io/preview/pr%3Aandreas%2Fcpp%2Ferrorhandling/examples)
Wumpf added a commit that referenced this issue Aug 17, 2023
* Depends on #3001
* Part of #2919
* Fixes #2917
* Solves a big chunk of #2873

### What

Introduces a very simple `rerun::Result`. I decided to keep it **a lot**
more simple than the arrow Result type and refrained from the typical
"return or assign" etc. macro since I noticed that they get quite
complicated quickly.

The idea is that the rerun result type won't be needed by a lot of
manual code, so erring on the too lightweight side should be in our
favor.

We use this now accross the entire public serialization path - meaning
that once we move out array->arrow serialization helper into separate
headers we should be pretty much done with not exposing arrow headers!


### Checklist
* [x] I have read and agree to [Contributor
Guide](https://github.com/rerun-io/rerun/blob/main/CONTRIBUTING.md) and
the [Code of
Conduct](https://github.com/rerun-io/rerun/blob/main/CODE_OF_CONDUCT.md)
* [x] I've included a screenshot or gif (if applicable)
* [x] I have tested [demo.rerun.io](https://demo.rerun.io/pr/3005) (if
applicable)

- [PR Build Summary](https://build.rerun.io/pr/3005)
- [Docs
preview](https://rerun.io/preview/pr%3Aandreas%2Fcpp%2Fcustom-result-type/docs)
- [Examples
preview](https://rerun.io/preview/pr%3Aandreas%2Fcpp%2Fcustom-result-type/examples)
Wumpf added a commit that referenced this issue Aug 19, 2023
…l set (#3041)

* Part of #2919
* Fixes #2873

### What

We now use our own status type everywhere, allowing us to predeclare
everything arrow.
On top of that I made sure that we include a much more minimal set of
arrow headers for the serialization code itself.

There's a static assertion in the unit test that ensures that we don't
start leaking arrow headers into the future

Haven't measured, but it feels like this also improved compile times of
the sdk which were already pretty good.

-----------

The one thing that's noted in #2873 but is not happening here yet (?) is
splitting out the utility methods. The idea would be this (actual
generated code I already have on a test branch!) :
```cpp
        struct LineStrip2D {
// .........
            /// Creates a Rerun DataCell from an array of LineStrip2D components.
            static Result<rerun::DataCell> to_data_cell(
                const LineStrip2D* instances, size_t num_instances
            );
        };

        /// Serialization utilities for arrays of LineStrip2D
        namespace LineStrip2DSerializationUtils {
            /// Returns the arrow data type this type corresponds to.
            static const std::shared_ptr<arrow::DataType>& to_arrow_datatype();

            /// Creates a new array builder with an array of this type.
            static Result<std::shared_ptr<arrow::ListBuilder>> new_arrow_array_builder(
                arrow::MemoryPool* memory_pool
            );

            /// Fills an arrow array builder with an array of this type.
            static Error fill_arrow_array_builder(
                arrow::ListBuilder* builder, const LineStrip2D* elements, size_t num_elements
            );
        } // namespace LineStrip2DSerializationUtils
```

The problem is that this makes calling these helpers unnecessary
complicated sicne I need to append `SerializationUtils` to the typename
of some previously static method calls which is suprisingly cumbersome 🤔
EDIT: Briefly chatted with Emil about it. Let's do it, it's actually not
that hard

### Checklist
* [x] I have read and agree to [Contributor
Guide](https://github.com/rerun-io/rerun/blob/main/CONTRIBUTING.md) and
the [Code of
Conduct](https://github.com/rerun-io/rerun/blob/main/CODE_OF_CONDUCT.md)
* [x] I've included a screenshot or gif (if applicable)
* [x] I have tested [demo.rerun.io](https://demo.rerun.io/pr/3041) (if
applicable)

- [PR Build Summary](https://build.rerun.io/pr/3041)
- [Docs
preview](https://rerun.io/preview/pr%3Aandreas%2Fcpp%2Fdont-leak-arrow-headers/docs)
- [Examples
preview](https://rerun.io/preview/pr%3Aandreas%2Fcpp%2Fdont-leak-arrow-headers/examples)
- [Recent benchmark results](https://ref.rerun.io/dev/bench/)
- [Wasm size tracking](https://ref.rerun.io/dev/sizes/)
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
😤 annoying Something in the UI / SDK is annoying to use 🌊 C++ API C/C++ API specific codegen/idl
Projects
None yet
Development

Successfully merging a pull request may close this issue.

1 participant