Skip to content

Commit

Permalink
Python codegen: big cleaning and paving the way towards transforms (#…
Browse files Browse the repository at this point in the history
…2603)

### What

**Note**: as usual, review commit-by-commit (the bulk of the changes are
in the first commit). This time, reviewers might want to spend some time
on the auto-generated code, though, as this is a major focus of this PR.

This PR has the following aims:
- Enforce and exploit the decision that a component must be a struct of
one, and exactly one, field, which may be a datatype (in which case the
component is known as "delegating", as its implementation delegates to
the datatype) or a native type (e.g. float, int, etc.).
- Significantly polish the Python SDK, with the goal of a cleaner, fully
typed user-facing API, and a streamlined implementation. Part of this is
achieved using `attrs` (see below).
- Improve the `from_similar()` hook and introduce several others. The
proposed mechanisms consist of storing the so-called (hand-coded)
overrides in the `_overrides` sub-packages in each of the `archetypes`,
`components`, and `datatypes` packages. These overrides are discovered
by the code generator by "parsing" (well, "brute-force extracting
symbols" from) `_overrides/__init__.py` [1]. The following hooks are
currently implemented:
- `xxx_native_to_pa_array()` (where `xxx` is the lower-cased name of the
corresponding object): this is the primary way to flexibly convert
native input into the Arrow array form of the component.
- `xxx_yyy_converter` (`yyy` is a field name): optional converter
function a given field of a native objects (e.g. `Color`). This enables
the object to flexibly accept native input for the given field.
- `xxx_as_array`: optional `__array__()` magic function for native
objects (e.g. `Point2D`). This function enable the object to be
interpreted as array data by Numpy, which can considerably simplify the
implementation of, e.g., `xxx_native_to_pa_array` (`Point2D` is a good
example).

This PR also introduces the following changes:
- So far, the archetypes and `log_any()` were pulled into the `rr`
namespace. This PR now pulls all component- and datatype-related symbols
in `rr.cmp` and `rr.dt`, respectively. This is intended as a temporary
solution, to avoid name clashes with the legacy API [2].
- The [`attrs`](https://www.attrs.org/en/stable/) dependency is
introduced as it allows much more flexibility than the standard library
`dataclass`, in particular with the `converter` feature which I use
extensively. Introducing new deps is always a tough call. In the case of
`attrs`, it's most probably safe as this is one of the most common and
well maintained package of the ecosystem (110+M dl/month).
- Native objects automatically get `__str__()`, `__int__()`, or
`__float__()` functions when they contain a single field of the related
type.
- The `Point2D` component now delegate to a new `Point2D` datatype,
defined as a struct of two floats. This is done such as to avoid
breaking compatibility with the current viewer. The long-term goal is to
use the `Vec2D` datatype instead.
- The CI now uses Python 3.11 for linting (and linting only, i.e. not
testing, etc.)

Addressing union types is a non-goal of this PR. This will be part of
the next PR (transforms). As such `QuotedObject::from_union()` is most
likely broken.

TODO/TBD:
- PyCharm's typechecker isn't happy with the Archetype, with is
irritating. This could apparently be addressed by using `xxxArrayLike`
instead of `xxxArray` annotations for archetype's fields. I'm curious
how VSCode fares, though.
- Maybe passing `None` as args to the archetype should set `None`
instead of empty arrays (they aren't logged either way).
- Maybe the notion of delegating vs. non-delegating component should be
promoted to the langauge-agnostic code?


[1] I actually prototyped and tested an approach based on PyO3. It sorta
worked, until I tried to import stuff (Numpy, ...) for the overrides.
The codegen would somehow be given access to a venv with the required
deps, etc. Hell no.
[2] Name clashes also exist internally to the new API, e.g.
`components.Point2D` vs. `datatypes.Point2D`. This will have to be
addressed before we can pull "everything" into the `rr` namespace.

### Checklist
* [x] I have read and agree to [Contributor
Guide](https://github.com/rerun-io/rerun/blob/main/CONTRIBUTING.md) and
the [Code of
Conduct](https://github.com/rerun-io/rerun/blob/main/CODE_OF_CONDUCT.md)
* [x] I've included a screenshot or gif (if applicable)
* [x] I have tested [demo.rerun.io](https://demo.rerun.io/pr/2603) (if
applicable)

- [PR Build Summary](https://build.rerun.io/pr/2603)
- [Docs
preview](https://rerun.io/preview/pr%3Aantoine%2Fhope-python-cleanup-new/docs)
- [Examples
preview](https://rerun.io/preview/pr%3Aantoine%2Fhope-python-cleanup-new/examples)

---------

Co-authored-by: Clement Rey <[email protected]>
  • Loading branch information
abey79 and teh-cmc authored Jul 5, 2023
1 parent c92a554 commit 59b20e9
Show file tree
Hide file tree
Showing 85 changed files with 4,948 additions and 2,925 deletions.
2 changes: 1 addition & 1 deletion .github/workflows/reusable_build_and_test_wheels.yml
Original file line number Diff line number Diff line change
Expand Up @@ -243,7 +243,7 @@ jobs:
# TODO(jleibs): understand why deps can't be installed in the same step as the wheel
shell: bash
run: |
pip install deprecated numpy>=1.23 pillow pyarrow==10.0.1 pytest==7.1.2
pip install attrs>=23.1.0 deprecated numpy>=1.23 pillow pyarrow==10.0.1 pytest==7.1.2
- name: Install built wheel
if: needs.set-config.outputs.RUN_TESTS == 'true'
Expand Down
2 changes: 1 addition & 1 deletion .github/workflows/reusable_checks.yml
Original file line number Diff line number Diff line change
Expand Up @@ -62,7 +62,7 @@ jobs:
- name: Set up Python
uses: actions/setup-python@v4
with:
python-version: ${{ env.PYTHON_VERSION }}
python-version: "3.11"
cache: "pip"
cache-dependency-path: "rerun_py/requirements-lint.txt"

Expand Down
24 changes: 24 additions & 0 deletions crates/re_types/build.rs
Original file line number Diff line number Diff line change
Expand Up @@ -20,6 +20,11 @@ const DOC_EXAMPLES_DIR_PATH: &str = "../../docs/code-examples";
const RUST_OUTPUT_DIR_PATH: &str = ".";
const PYTHON_OUTPUT_DIR_PATH: &str = "../../rerun_py/rerun_sdk/rerun/_rerun2";

// located in PYTHON_OUTPUT_DIR_PATH
const ARCHETYPE_OVERRIDES_SUB_DIR_PATH: &str = "archetypes/_overrides";
const COMPONENT_OVERRIDES_SUB_DIR_PATH: &str = "components/_overrides";
const DATATYPE_OVERRIDES_SUB_DIR_PATH: &str = "datatypes/_overrides";

fn main() {
if cfg!(target_os = "windows") {
// TODO(#2591): Codegen is temporarily disabled on Windows due to hashing issues.
Expand Down Expand Up @@ -47,16 +52,35 @@ fn main() {
let re_types_builder_hash = compute_crate_hash("re_types_builder");
let definitions_hash = compute_dir_hash(DEFINITIONS_DIR_PATH, Some(&["fbs"]));
let doc_examples_hash = compute_dir_hash(DOC_EXAMPLES_DIR_PATH, Some(&["rs", "py"]));
let archetype_overrides_hash = compute_dir_hash(
PathBuf::from(PYTHON_OUTPUT_DIR_PATH).join(ARCHETYPE_OVERRIDES_SUB_DIR_PATH),
Some(&["py"]),
);
let component_overrides_hash = compute_dir_hash(
PathBuf::from(PYTHON_OUTPUT_DIR_PATH).join(COMPONENT_OVERRIDES_SUB_DIR_PATH),
Some(&["py"]),
);
let datatype_overrides_hash = compute_dir_hash(
PathBuf::from(PYTHON_OUTPUT_DIR_PATH).join(DATATYPE_OVERRIDES_SUB_DIR_PATH),
Some(&["py"]),
);

let new_hash = compute_strings_hash(&[
&re_types_builder_hash,
&definitions_hash,
&doc_examples_hash,
&archetype_overrides_hash,
&component_overrides_hash,
&datatype_overrides_hash,
]);

// Leave these be please, very useful when debugging.
eprintln!("re_types_builder_hash: {re_types_builder_hash:?}");
eprintln!("definitions_hash: {definitions_hash:?}");
eprintln!("doc_examples_hash: {doc_examples_hash:?}");
eprintln!("archetype_overrides_hash: {archetype_overrides_hash:?}");
eprintln!("component_overrides_hash: {component_overrides_hash:?}");
eprintln!("datatype_overrides_hash: {datatype_overrides_hash:?}");
eprintln!("new_hash: {new_hash:?}");
eprintln!("cur_hash: {cur_hash:?}");

Expand Down
5 changes: 0 additions & 5 deletions crates/re_types/definitions/python/attributes.fbs
Original file line number Diff line number Diff line change
@@ -1,10 +1,5 @@
namespace python.attributes;

/// Marks a field as transparent, meaning its type will be replaced by the underlying type.
///
/// Only applies to fields whose type is an object with a single-field.
attribute "attr.python.transparent";

/// Defines the type aliases for a component, e.g. the types that make up `ComponentLike`.
///
/// Only applies to structs/unions that are components.
Expand Down
2 changes: 1 addition & 1 deletion crates/re_types/definitions/rerun/components/class_id.fbs
Original file line number Diff line number Diff line change
Expand Up @@ -14,7 +14,7 @@ namespace rerun.components;
struct ClassId (
"attr.arrow.transparent",
"attr.python.aliases": "int",
"attr.python.array_aliases": "npt.NDArray[np.uint8], npt.NDArray[np.uint16], npt.NDArray[np.uint32], npt.NDArray[np.uint64]",
"attr.python.array_aliases": "int, npt.NDArray[np.uint8], npt.NDArray[np.uint16], npt.NDArray[np.uint32], npt.NDArray[np.uint64]",
"attr.rerun.legacy_fqname": "rerun.class_id",
"attr.rust.derive": "Copy, Clone, Debug, Default, PartialEq, Eq, PartialOrd, Ord, Hash",
"attr.rust.tuple_struct",
Expand Down
4 changes: 2 additions & 2 deletions crates/re_types/definitions/rerun/components/color.fbs
Original file line number Diff line number Diff line change
Expand Up @@ -16,8 +16,8 @@ namespace rerun.components;
/// \py If there is an alpha, we assume it is in linear space, and separate (NOT pre-multiplied).
struct Color (
"attr.arrow.transparent",
"attr.python.aliases": "int, npt.NDArray[np.uint8], npt.NDArray[np.uint32], npt.NDArray[np.float32], npt.NDArray[np.float64]",
"attr.python.array_aliases": "Sequence[int], npt.NDArray[np.uint8], npt.NDArray[np.uint32], npt.NDArray[np.float32], npt.NDArray[np.float64]",
"attr.python.aliases": "int, Sequence[int], npt.NDArray[Union[np.uint8, np.float32, np.float64]]",
"attr.python.array_aliases": "Sequence[Sequence[int]], npt.NDArray[Union[np.uint8, np.uint32, np.float32, np.float64]]",
"attr.rerun.legacy_fqname": "rerun.colorrgba",
"attr.rust.derive": "Debug, Default, Clone, Copy, PartialEq, Eq, PartialOrd, Ord, bytemuck::Pod, bytemuck::Zeroable",
"attr.rust.repr": "transparent",
Expand Down
1 change: 1 addition & 0 deletions crates/re_types/definitions/rerun/components/label.fbs
Original file line number Diff line number Diff line change
Expand Up @@ -13,6 +13,7 @@ namespace rerun.components;
table Label (
"attr.arrow.transparent",
"attr.python.aliases": "str",
"attr.python.array_aliases": "Sequence[str]",
"attr.rerun.legacy_fqname": "rerun.label",
"attr.rust.derive": "Debug, Clone, PartialEq, Eq, PartialOrd, Ord",
"attr.rust.repr": "transparent",
Expand Down
24 changes: 3 additions & 21 deletions crates/re_types/definitions/rerun/components/point2d.fbs
Original file line number Diff line number Diff line change
Expand Up @@ -10,32 +10,14 @@ namespace rerun.components;
// ---

/// A point in 2D space.
// TODO(cmc): bring back attr.rust.tuple_struct
struct Point2D (
"attr.arrow.transparent",
"attr.python.aliases": "npt.NDArray[np.float32], Sequence[float], Tuple[float, float]",
"attr.python.array_aliases": "npt.NDArray[np.float32], Sequence[float]",
"attr.rerun.legacy_fqname": "rerun.point2d",
"attr.rust.derive": "Debug, Default, Clone, Copy, PartialEq, PartialOrd",
order: 100
) {
x: float (order: 100);
y: float (order: 200);
xy: rerun.datatypes.Point2D (order: 100);
}

// ---

// TODO(cmc): use the following definition instead once we've finalized the switch to HOPE.

// /// A point in 2D space.
// struct Point2D (
// "attr.arrow.transparent",
// "attr.python.aliases": "npt.NDArray[np.float32], Sequence[float], Tuple[float, float]",
// "attr.python.array_aliases": "npt.NDArray[np.float32], Sequence[float]",
// "attr.rust.tuple_struct",
// "attr.rust.derive": "Debug, Default, Clone, Copy, PartialEq, PartialOrd",
// order: 100
// ) {
// position: rerun.datatypes.Vec2D (
// "attr.python.transparent",
// order: 100
// );
// }
1 change: 1 addition & 0 deletions crates/re_types/definitions/rerun/datatypes.fbs
Original file line number Diff line number Diff line change
@@ -1,3 +1,4 @@
include "./datatypes/point2d.fbs";
include "./datatypes/vec2d.fbs";

namespace rerun.datatypes;
19 changes: 19 additions & 0 deletions crates/re_types/definitions/rerun/datatypes/point2d.fbs
Original file line number Diff line number Diff line change
@@ -0,0 +1,19 @@
include "arrow/attributes.fbs";
include "python/attributes.fbs";
include "fbs/attributes.fbs";
include "rust/attributes.fbs";

namespace rerun.datatypes;

// ---

/// A point in 2D space.
struct Point2D (
"attr.python.aliases": "Sequence[float]",
"attr.python.array_aliases": "npt.NDArray[np.float32], Sequence[npt.NDArray[np.float32]], Sequence[Tuple[float, float]], Sequence[float]",
"attr.rust.derive": "Debug, Default, Clone, Copy, PartialEq, PartialOrd",
order: 100
) {
x: float (order: 100);
y: float (order: 200);
}
3 changes: 3 additions & 0 deletions crates/re_types/definitions/rerun/datatypes/vec2d.fbs
Original file line number Diff line number Diff line change
@@ -1,4 +1,5 @@
include "arrow/attributes.fbs";
include "python/attributes.fbs";
include "fbs/attributes.fbs";
include "rust/attributes.fbs";

Expand All @@ -9,6 +10,8 @@ namespace rerun.datatypes;
/// A vector in 2D space.
struct Vec2D (
"attr.arrow.transparent",
"attr.python.aliases": "Tuple[float, float]",
"attr.python.array_aliases": "npt.NDArray[np.float32], Sequence[Tuple[float, float]], Sequence[float]",
"attr.rust.derive": "Debug, Default, Clone, Copy, PartialEq, PartialOrd",
"attr.rust.tuple_struct",
order: 100
Expand Down
24 changes: 24 additions & 0 deletions crates/re_types/definitions/rerun/testing/archetypes/fuzzy.fbs
Original file line number Diff line number Diff line change
Expand Up @@ -22,6 +22,12 @@ table AffixFuzzer1 (
fuzz1005: rerun.testing.components.AffixFuzzer5 ("attr.rerun.component_required", required, order: 1005);
fuzz1006: rerun.testing.components.AffixFuzzer6 ("attr.rerun.component_required", required, order: 1006);
fuzz1007: rerun.testing.components.AffixFuzzer7 ("attr.rerun.component_required", required, order: 1007);
fuzz1008: rerun.testing.components.AffixFuzzer8 ("attr.rerun.component_required", required, order: 1008);
fuzz1009: rerun.testing.components.AffixFuzzer9 ("attr.rerun.component_required", required, order: 1009);
fuzz1010: rerun.testing.components.AffixFuzzer10 ("attr.rerun.component_required", required, order: 1010);
fuzz1011: rerun.testing.components.AffixFuzzer11 ("attr.rerun.component_required", required, order: 1011);
fuzz1012: rerun.testing.components.AffixFuzzer12 ("attr.rerun.component_required", required, order: 1012);
fuzz1013: rerun.testing.components.AffixFuzzer13 ("attr.rerun.component_required", required, order: 1013);

fuzz1101: [rerun.testing.components.AffixFuzzer1] ("attr.rerun.component_required", required, order: 1101);
fuzz1102: [rerun.testing.components.AffixFuzzer2] ("attr.rerun.component_required", required, order: 1102);
Expand All @@ -30,6 +36,12 @@ table AffixFuzzer1 (
fuzz1105: [rerun.testing.components.AffixFuzzer5] ("attr.rerun.component_required", required, order: 1105);
fuzz1106: [rerun.testing.components.AffixFuzzer6] ("attr.rerun.component_required", required, order: 1106);
fuzz1107: [rerun.testing.components.AffixFuzzer7] ("attr.rerun.component_required", required, order: 1107);
fuzz1108: [rerun.testing.components.AffixFuzzer8] ("attr.rerun.component_required", required, order: 1108);
fuzz1109: [rerun.testing.components.AffixFuzzer9] ("attr.rerun.component_required", required, order: 1109);
fuzz1110: [rerun.testing.components.AffixFuzzer10] ("attr.rerun.component_required", required, order: 1110);
fuzz1111: [rerun.testing.components.AffixFuzzer11] ("attr.rerun.component_required", required, order: 1111);
fuzz1112: [rerun.testing.components.AffixFuzzer12] ("attr.rerun.component_required", required, order: 1112);
fuzz1113: [rerun.testing.components.AffixFuzzer13] ("attr.rerun.component_required", required, order: 1113);

fuzz2001: rerun.testing.components.AffixFuzzer1 ("attr.rerun.component_optional", order: 2001);
fuzz2002: rerun.testing.components.AffixFuzzer2 ("attr.rerun.component_optional", order: 2002);
Expand All @@ -38,6 +50,12 @@ table AffixFuzzer1 (
fuzz2005: rerun.testing.components.AffixFuzzer5 ("attr.rerun.component_optional", order: 2005);
fuzz2006: rerun.testing.components.AffixFuzzer6 ("attr.rerun.component_optional", order: 2006);
fuzz2007: rerun.testing.components.AffixFuzzer7 ("attr.rerun.component_optional", order: 2007);
fuzz2008: rerun.testing.components.AffixFuzzer8 ("attr.rerun.component_optional", order: 2008);
fuzz2009: rerun.testing.components.AffixFuzzer9 ("attr.rerun.component_optional", order: 2009);
fuzz2010: rerun.testing.components.AffixFuzzer10 ("attr.rerun.component_optional", order: 2010);
fuzz2011: rerun.testing.components.AffixFuzzer11 ("attr.rerun.component_optional", order: 2011);
fuzz2012: rerun.testing.components.AffixFuzzer12 ("attr.rerun.component_optional", order: 2012);
fuzz2013: rerun.testing.components.AffixFuzzer13 ("attr.rerun.component_optional", order: 2013);

fuzz2101: [rerun.testing.components.AffixFuzzer1] ("attr.rerun.component_optional", order: 2101);
fuzz2102: [rerun.testing.components.AffixFuzzer2] ("attr.rerun.component_optional", order: 2102);
Expand All @@ -46,5 +64,11 @@ table AffixFuzzer1 (
fuzz2105: [rerun.testing.components.AffixFuzzer5] ("attr.rerun.component_optional", order: 2105);
fuzz2106: [rerun.testing.components.AffixFuzzer6] ("attr.rerun.component_optional", order: 2106);
fuzz2107: [rerun.testing.components.AffixFuzzer7] ("attr.rerun.component_optional", order: 2107);
fuzz2108: [rerun.testing.components.AffixFuzzer8] ("attr.rerun.component_optional", order: 2108);
fuzz2109: [rerun.testing.components.AffixFuzzer9] ("attr.rerun.component_optional", order: 2109);
fuzz2110: [rerun.testing.components.AffixFuzzer10] ("attr.rerun.component_optional", order: 2110);
fuzz2111: [rerun.testing.components.AffixFuzzer11] ("attr.rerun.component_optional", order: 2111);
fuzz2112: [rerun.testing.components.AffixFuzzer12] ("attr.rerun.component_optional", order: 2112);
fuzz2113: [rerun.testing.components.AffixFuzzer13] ("attr.rerun.component_optional", order: 2113);
}

47 changes: 45 additions & 2 deletions crates/re_types/definitions/rerun/testing/components/fuzzy.fbs
Original file line number Diff line number Diff line change
Expand Up @@ -61,12 +61,55 @@ table AffixFuzzer7 (
order: 700
) {
many_optional: [rerun.testing.datatypes.AffixFuzzer1] (order: 100);
}

table AffixFuzzer8 (
"attr.rust.derive": "Debug, Clone, PartialEq",
order: 800
) {
single_float_optional: float (order: 101);
}

table AffixFuzzer9 (
"attr.rust.derive": "Debug, Clone, PartialEq, Eq",
order: 900
) {
single_string_required: string (order: 102, required);
}

table AffixFuzzer10 (
"attr.rust.derive": "Debug, Clone, PartialEq, Eq",
order: 1000
) {
single_string_optional: string (order: 103);
}

table AffixFuzzer11 (
"attr.rust.derive": "Debug, Clone, PartialEq",
order: 1100
) {
many_floats_optional: [float] (order: 104);
}

table AffixFuzzer12 (
"attr.rust.derive": "Debug, Clone, PartialEq, Eq",
order: 1200
) {
many_strings_required: [string] (order: 105, required);
}

table AffixFuzzer13 (
"attr.rust.derive": "Debug, Clone, PartialEq, Eq",
order: 1300
) {
many_strings_optional: [string] (order: 106);
// TODO(cmc): the ugly bug we need to take care of at some point
// many_transparent_optionals: rerun.testing.datatypes.AffixFuzzer2 (order: 107);
}

// TODO(cmc): the ugly bug we need to take care of at some point
// table AffixFuzzer14 (
// "attr.rust.derive": "Debug, Clone, PartialEq",
// order: 1400
// ) {
//
// many_transparent_optionals: rerun.testing.datatypes.AffixFuzzer2 (order: 107);
// }
2 changes: 1 addition & 1 deletion crates/re_types/source_hash.txt
Original file line number Diff line number Diff line change
@@ -1,4 +1,4 @@
# This is a sha256 hash for all direct and indirect dependencies of this crate's build script.
# It can be safely removed at anytime to force the build script to run again.
# Check out build.rs to see how it's computed.
d828fdb05c35a01b92dd30b7ce40987b87c5dfe8d1bb8f729004ae88b62c830c
dec6b10b9e90c9c95b3e1436c1b53987573319efd121cc5e715c31466163b0bc
Loading

1 comment on commit 59b20e9

@github-actions
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Performance Alert ⚠️

Possible performance regression was detected for benchmark 'Rust Benchmark'.
Benchmark result of this commit is worse than the previous benchmark result exceeding threshold 1.25.

Benchmark suite Current: 59b20e9 Previous: c92a554 Ratio
mono_points_arrow_batched/decode_message_bundles 11454228 ns/iter (± 5099078) 7419857 ns/iter (± 6645) 1.54

This comment was automatically generated by workflow using github-action-benchmark.

Please sign in to comment.