Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Code-gen for C++ #2647

Closed
Tracked by #2516
emilk opened this issue Jul 10, 2023 · 3 comments
Closed
Tracked by #2516

Code-gen for C++ #2647

emilk opened this issue Jul 10, 2023 · 3 comments
Assignees
Labels
🌊 C++ API C/C++ API specific codegen/idl

Comments

@emilk
Copy link
Member

emilk commented Jul 10, 2023

To generate the C++ object hierarchy there are a few things to figure out.

One of them is .hpp vs .cpp files. This is probably quite easy.

Sum types

How do we express sum types in C++?

Consider a simple example (in Rust):

pub enum Rotation3D {
    Quaternion(Quaternion),
    AxisAngle(RotationAxisAngle),
}

Inheritance

Using inheritance for sum-types adds a lot of allocation and indirections, which will be quite costly.

std::variant

std::variant was introduced in C++17, and tries to be like a Rust union. Some downsides:

  • Minimum supported version of C++17
  • It uses the the type-id of the thing it holds, i.e. cannot distinguish between float degrees and float radians unless degrees and radians are their own new-types
    *std::variant is also overall very unergonomic, so we would still want to generate wrappers around it
  • It reportedly compiles quite slowly (so much template magic)

Tagged unions

For Plain Old Data (POD) a tagged union is very efficient:

enum Rotation3DKind {
    Quaternion,
    AxisAngle,
};

union Rotation3DData {
    Quaternion quaternion;
    AxisAngle axis_angle;
};

struct Rotation3D {
    kind: Rotation3DKind,
    data: Rotation3DData,

    // implicit conversions:
    Rotation3D(Quaternion);
    Rotation3D(AxisAngle);
};

However, this becomes quite error prone if the contained data is not plain-old-data, but has allocations.

It is still doable if we make the union and all the fields private. We would need to take care to overload (or make private):

  • The copy constructor
  • The move constructor
  • Copy assignment
  • Move assignment
  • Destructor

See https://en.cppreference.com/w/cpp/language/rule_of_three.

In other words, it would have to be something like:

namespace {  // private
  enum Rotation3DKind {
      Quaternion,
      AxisAngle,
  };
  
  union Rotation3DData {
      Quaternion quaternion;
      AxisAngle axis_angle;
  };
} // namespace

namespace rr {
    class Rotation3D {
      private:
        _kind: Rotation3DKind,
        _data: Rotation3DData,
  
      public:
        // Implicit constructors (maybe should be explicit?):
        Rotation3D(Quaternion&& quaternion) {
            _kind = Rotation3DKind::Quaterion;
            _data.quaternion = quaternion;
        }

        Rotation3D(AxisAngle&& axis_angle);

        Rotation3D(const Rotation3D& axis_angle); // TODO: copy-constructor
        Rotation3D(Rotation3D&& axis_angle) noexecpt; // TODO: move-constructor
        
        ~Rotation3D {
             switch self.kind {
                 Rotation3DKind::Quaterion:
                     self.data.quaterion.~Quaterion();
                     break;

                 Rotation3DKind::AxisAngle:
                     self.data.axis_angle.~AxisAngle();
                     break;
             }
        }


        Rotation3D& operator=(const Rotation3D& axis_angle); // TODO: copy-assignment
        Rotation3D& operator=(Rotation3D&& axis_angle) noexecpt; // TODO: move-assigment
    }
}
@emilk emilk added codegen/idl 🌊 C++ API C/C++ API specific labels Jul 10, 2023
@emilk emilk mentioned this issue Jul 10, 2023
11 tasks
@abey79
Copy link
Member

abey79 commented Jul 10, 2023

FWIW, some notes from the (Python) front:

  • For Python, the core tool I'm using is typing.Union[arm_type1, arm_type2, ...], which obviously is of no use for C++.
  • Using inheritance would require (near) systematic wrapping of the arm's type in some custom type, since you can't have a Vec3D or a float directly inherit from Scale3D. For Python, this was a big no go in terms of the impact on the API.
  • Angle { Degrees(float), Radians(float) } was... interesting. I basically introduce a union tag field kind any time 2 or more arms share the same type, as (dynamic) type info is no longer sufficient to discriminate. Having the possibility to hand-code the constructor is key to still provide a decent API in such cases.

Question: would it be feasible to back-port (and improve for our purpose) std::variant or does it depend on core language features?

@emilk
Copy link
Member Author

emilk commented Jul 10, 2023

I don't think we want to back-port std::variant even if we could. Some more downsides:

  • It uses the the type-id of the thing it holds, i.e. cannot distinguish between float degrees and radians unless degrees and radians are their own new-types.
    *std::variant is also overall very unergonomic, so we would still want to generate wrappers around it.
  • It reportedly compiles quite slowly (so much template magic)

(I've updated the PR description with this)

I think tagged unions is the way to go. It basically mimics how Rust implements its enums. There will be a bunch of code generated, but the generated code should be quite ergonomic to use and compile quickly, and work on very old C++ versions if needed.

@emilk emilk self-assigned this Jul 12, 2023
@emilk emilk mentioned this issue Jul 12, 2023
3 tasks
Wumpf added a commit that referenced this issue Jul 13, 2023
Part of:
* #2647 

### What
Adds minimal code-gen for C++. It generates a `.hpp/.cpp` pair for each
type, plus a module file each for datatypes, components, and archetypes.

### Checklist
* [x] I have read and agree to [Contributor
Guide](https://github.com/rerun-io/rerun/blob/main/CONTRIBUTING.md) and
the [Code of
Conduct](https://github.com/rerun-io/rerun/blob/main/CODE_OF_CONDUCT.md)
* [x] I've included a screenshot or gif (if applicable)
* [x] I have tested [demo.rerun.io](https://demo.rerun.io/pr/2678) (if
applicable)

- [PR Build Summary](https://build.rerun.io/pr/2678)
- [Docs preview](https://rerun.io/preview/pr%3Aemilk%2Fcpp-codegen/docs)
- [Examples
preview](https://rerun.io/preview/pr%3Aemilk%2Fcpp-codegen/examples)

---------

Co-authored-by: Andreas Reich <[email protected]>
@Wumpf Wumpf self-assigned this Jul 18, 2023
Wumpf pushed a commit that referenced this issue Jul 19, 2023
* Part of #2647

### What
This declares all C++ types and their fields. A matching `#include` list
is also included. The resulting C++ builds.

Test it yourself with `cargo codegen &&
./examples/cpp/minimal/build_and_run.sh`

Sum-types are implemented via "tagged unions" approach, basically an
emulation of how Rust does it under the hood. see #2647 for details.

### Done:
* struct declarations
* enum declarations
* enum destructors
* enum move semantics
* enum static constructors
* Implicit constructors for enums
* Implicit constructor for single-fields structs (e.g. `Vec3D`)

### Still left to do:
* Arrow-serialization
* Helper functions for building archetypes (needs design)

### Things I think we should punt on:
* Copy constructors
* Copy assignment operator

### Checklist
* [x] I have read and agree to [Contributor
Guide](https://github.com/rerun-io/rerun/blob/main/CONTRIBUTING.md) and
the [Code of
Conduct](https://github.com/rerun-io/rerun/blob/main/CODE_OF_CONDUCT.md)
* [x] I've included a screenshot or gif (if applicable)
* [x] I have tested [demo.rerun.io](https://demo.rerun.io/pr/2707) (if
applicable)

- [PR Build Summary](https://build.rerun.io/pr/2707)
- [Docs
preview](https://rerun.io/preview/pr%3Aemilk%2Fcodegen-cpp-struct/docs)
- [Examples
preview](https://rerun.io/preview/pr%3Aemilk%2Fcodegen-cpp-struct/examples)
Wumpf added a commit that referenced this issue Jul 20, 2023
* Part of #2647
* Next step after #2707

### What

Implements `to_arrow_data_type` (same exists in Rust) for all C++
components and datatypes.

Started splitting up `cpp.rs` a tiny little bit, not gonna go the single
file way `rust.rs` went there, too hard to handle now that there's more
than one author on it.

Test it yourself with `cargo codegen &&
./examples/cpp/minimal/build_and_run.sh`

Rough next steps:
* implement same for unions
* implement extension type decl (missing in this PR!), needed for
instance to declare component types
* serialize out data
* testing (nothing in this PR is tested other than "it compiles!")
* utilities for better usability

### Checklist
* [x] I have read and agree to [Contributor
Guide](https://github.com/rerun-io/rerun/blob/main/CONTRIBUTING.md) and
the [Code of
Conduct](https://github.com/rerun-io/rerun/blob/main/CODE_OF_CONDUCT.md)
* [x] I've included a screenshot or gif (if applicable)
* [x] I have tested [demo.rerun.io](https://demo.rerun.io/pr/2756) (if
applicable)

- [PR Build Summary](https://build.rerun.io/pr/2756)
- [Docs
preview](https://rerun.io/preview/pr%3Aandreas%2Fcpp-arrow-datatype-definition-for-structs/docs)
- [Examples
preview](https://rerun.io/preview/pr%3Aandreas%2Fcpp-arrow-datatype-definition-for-structs/examples)
Wumpf added a commit that referenced this issue Jul 21, 2023
* Part of #2647
* Next step after #2756

### What

* Other half of #2759
* Equivalent to #2760

Disable whitespaces for reviewing!

It compiles (`cargo codegen && ./examples/cpp/minimal/build_and_run.sh`)

### Checklist
* [x] I have read and agree to [Contributor
Guide](https://github.com/rerun-io/rerun/blob/main/CONTRIBUTING.md) and
the [Code of
Conduct](https://github.com/rerun-io/rerun/blob/main/CODE_OF_CONDUCT.md)
* [x] I've included a screenshot or gif (if applicable)
* [x] I have tested [demo.rerun.io](https://demo.rerun.io/pr/2765) (if
applicable)

- [PR Build Summary](https://build.rerun.io/pr/2765)
- [Docs
preview](https://rerun.io/preview/pr%3Aandreas%2Fdatatype-no-rec-cpp/docs)
- [Examples
preview](https://rerun.io/preview/pr%3Aandreas%2Fdatatype-no-rec-cpp/examples)
Wumpf added a commit that referenced this issue Jul 21, 2023
* Part of #2647 

### What

For unknown reasons cpp formatter invoked via just file versus via
codegen yielded slightly different results. Tried various things but to
no avail. However, specifying some of the options in the clang format
file more explicitely solves the issue.

Plus, changed how header sorting is handled.

### Checklist
* [x] I have read and agree to [Contributor
Guide](https://github.com/rerun-io/rerun/blob/main/CONTRIBUTING.md) and
the [Code of
Conduct](https://github.com/rerun-io/rerun/blob/main/CODE_OF_CONDUCT.md)
* [x] I've included a screenshot or gif (if applicable)
* [x] I have tested [demo.rerun.io](https://demo.rerun.io/pr/2773) (if
applicable)

- [PR Build Summary](https://build.rerun.io/pr/2773)
- [Docs
preview](https://rerun.io/preview/pr%3Aandreas%2Fcpp-codegen%2Ffix-formatter-differences/docs)
- [Examples
preview](https://rerun.io/preview/pr%3Aandreas%2Fcpp-codegen%2Ffix-formatter-differences/examples)
Wumpf added a commit that referenced this issue Jul 21, 2023
* Part of #2647
* Next step after #2765

### What

implements 

Tested with `cargo codegen && ./examples/cpp/minimal/build_and_run.sh`

### Checklist
* [x] I have read and agree to [Contributor
Guide](https://github.com/rerun-io/rerun/blob/main/CONTRIBUTING.md) and
the [Code of
Conduct](https://github.com/rerun-io/rerun/blob/main/CODE_OF_CONDUCT.md)
* [x] I've included a screenshot or gif (if applicable)
* [x] I have tested [demo.rerun.io](https://demo.rerun.io/pr/2765) (if
applicable)

- [PR Build Summary](https://build.rerun.io/pr/2765)
- [Docs
preview](https://rerun.io/preview/pr%3Aandreas%2Fdatatype-no-rec-cpp/docs)
- [Examples
preview](https://rerun.io/preview/pr%3Aandreas%2Fdatatype-no-rec-cpp/examples)
Wumpf added a commit that referenced this issue Jul 26, 2023
…sted rerun types (#2820)

* Part of #2647
* Next step after: #2766

### What

Generates serialization for all non-union components and datatypes
except for datatypes nested in structs & unions.
Uses new serialization methods for 2D & 3D point logging. Ergonomics are
pretty poor at this point, handling this will be a follow-up.

Commit history got a bit messy at some point - **not** recommending
commit by commit reviewing ;)

Next Steps:
* work towards Point2D roundtrip
   * builder methods on archetype
   * make logging on Recording stream nice'ish
   * basic testing
* serialize unions
* serialize datatypes nested in unions, structs and lists
* more testing & roundtripping

### Checklist
* [x] I have read and agree to [Contributor
Guide](https://github.com/rerun-io/rerun/blob/main/CONTRIBUTING.md) and
the [Code of
Conduct](https://github.com/rerun-io/rerun/blob/main/CODE_OF_CONDUCT.md)
* [x] I've included a screenshot or gif (if applicable)
* [x] I have tested [demo.rerun.io](https://demo.rerun.io/pr/2820) (if
applicable)

- [PR Build Summary](https://build.rerun.io/pr/2820)
- [Docs
preview](https://rerun.io/preview/pr%3Aandreas%2Fcpp-codegen%2Farrow-serialize-structs/docs)
- [Examples
preview](https://rerun.io/preview/pr%3Aandreas%2Fcpp-codegen%2Farrow-serialize-structs/examples)
Wumpf added a commit that referenced this issue Jul 31, 2023
* Part of #2647
* Next step after #2820

### What

Introduces APIs for component and archetype logging as well as the
necessary methods in codegen to do so. A very basic example is included
in `main.cpp`.
Largely unrelated to the rest:
* Makes C++ tagged union types copy-able
* `to_arrow_datatype` constructs/allocates the datatype lazily exactly
once (using C++'s crazy local `static` variable guarantees)

Commit by commit review possible.


![image](https://github.com/rerun-io/rerun/assets/1220815/8e7fdf3a-4fe7-45b1-a835-463b13fc34d8)

Next steps:
* Add roundtrip tests
* Add C++ to ci (roundtrip, linting, etc.)
* Add codegen custom code injection
* Improve API usability in varous places, including #2873 
* add other tests
* serialize unions
* serialize datatypes nested in unions, structs and lists
* more testing & roundtripping

### Checklist
* [x] I have read and agree to [Contributor
Guide](https://github.com/rerun-io/rerun/blob/main/CONTRIBUTING.md) and
the [Code of
Conduct](https://github.com/rerun-io/rerun/blob/main/CODE_OF_CONDUCT.md)
* [x] I've included a screenshot or gif (if applicable)
* [x] I have tested [demo.rerun.io](https://demo.rerun.io/pr/2874) (if
applicable)

- [PR Build Summary](https://build.rerun.io/pr/2874)
- [Docs
preview](https://rerun.io/preview/pr%3Aandreas%2Fcpp-codegen%2Farchetype-builder-and-logging/docs)
- [Examples
preview](https://rerun.io/preview/pr%3Aandreas%2Fcpp-codegen%2Farchetype-builder-and-logging/examples)
@Wumpf
Copy link
Member

Wumpf commented Aug 7, 2023

With #2916 all the basics are in place, we "just" need to extend to more complex types.
Further tracking happening in

@Wumpf Wumpf closed this as completed Aug 7, 2023
Wumpf added a commit that referenced this issue Aug 7, 2023
### What

Introduces a simple extension system for C++ codegen: Add an extra cpp
file that will be compiled as part of the SDK. A section between two
markes is copied into the generated hpp as part of generation (typically
this section is removed from compilation via #ifdef)

Adds extensions to:
* color
* vec2/vec3/vec4
* quaternion
* origin3d
* point2d
* point3d
* arrow3d

... and uses them to simplify examples and test code!

All fully supported archetypes now have a simple test that checks that
the base interface works and that we can serialize out to arrow without
issues.

Additionally, there's tests for vecN/quaternion/color to check that
their various constructors work as expected (not being a C++ expert it's
fairly hard to predict)

Extends codegen with single-array-field-constructors.


---------

* part of #2647
* Fixes #2798
* Fixes #2785
* Missing only 2D example: #2789
* Adds to #2791 


### Checklist
* [x] I have read and agree to [Contributor
Guide](https://github.com/rerun-io/rerun/blob/main/CONTRIBUTING.md) and
the [Code of
Conduct](https://github.com/rerun-io/rerun/blob/main/CODE_OF_CONDUCT.md)
* [x] I've included a screenshot or gif (if applicable)
* [x] I have tested [demo.rerun.io](https://demo.rerun.io/pr/2916) (if
applicable)

- [PR Build Summary](https://build.rerun.io/pr/2916)
- [Docs
preview](https://rerun.io/preview/pr%3Aandreas%2Fcpp%2Fcustom-extensions/docs)
- [Examples
preview](https://rerun.io/preview/pr%3Aandreas%2Fcpp%2Fcustom-extensions/examples)
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
🌊 C++ API C/C++ API specific codegen/idl
Projects
None yet
Development

No branches or pull requests

3 participants