SDK batching/revamp 2.2: homegrown arrow size estimation routines #2002

Merged
merged 41 commits into from
May 4, 2023

@teh-cmc teh-cmc commented Apr 28, 2023

This takes care of all size measurement issues (incl. batch support) and makes sure they won't bother us again, at least until the migration to arrow1.

Don't be fooled by the line-changed numbers, it's all either tests or shuffling code around. All the relevant stuff is confined to the new estimated_bytes_size function in size_bytes.rs.

On top of #1983

Checks:

  • api_demo
    • single
    • batched
  • dna
    • single
    • batched
  • minimal
    • single
    • batched
  • objectron
    • single
    • batched
  • raw_mesh
    • single
    • batched

@teh-cmc teh-cmc changed the base branch from main to cmc/sdk_revamp/2_rust_revamp April 28, 2023 15:54
@teh-cmc teh-cmc force-pushed the cmc/sdk_revamp/22_batching_estimated_size_hell branch from b41f4ea to 33a60ac Compare April 28, 2023 15:58
@teh-cmc teh-cmc added 🪳 bug Something isn't working 🏹 arrow concerning arrow labels Apr 28, 2023
@teh-cmc teh-cmc added the do-not-merge Do not merge this PR label Apr 28, 2023
@teh-cmc teh-cmc force-pushed the cmc/sdk_revamp/22_batching_estimated_size_hell branch 2 times, most recently from aece2df to 25ca19c Compare May 2, 2023 07:40
@teh-cmc teh-cmc force-pushed the cmc/sdk_revamp/22_batching_estimated_size_hell branch from c5522fe to f8e74e9 Compare May 3, 2023 09:11
@teh-cmc teh-cmc force-pushed the cmc/sdk_revamp/22_batching_estimated_size_hell branch from d0cd877 to da160c5 Compare May 3, 2023 09:46
@teh-cmc teh-cmc changed the title SDK batching/revamp 2.2: estimated_bytes_size + Union + batching = disaster SDK batching/revamp 2.2: homegrown arrow size estimation routines May 3, 2023
@teh-cmc teh-cmc marked this pull request as ready for review May 3, 2023 17:27
Base automatically changed from cmc/sdk_revamp/2_rust_revamp to main May 4, 2023 09:16
@teh-cmc teh-cmc removed the do-not-merge Do not merge this PR label May 4, 2023
crates/re_arrow_store/benches/data_store.rs (outdated, resolved)
crates/re_log_types/src/data_cell.rs (resolved)
crates/re_log_types/src/data_cell.rs (resolved)
crates/re_log_types/src/data_table_batcher.rs (resolved)
crates/re_query/Cargo.toml (outdated, resolved)
crates/re_query/tests/query_tests.rs (outdated, resolved)

@jleibs jleibs left a comment

As always, Dense Unions are a major pain.
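For context, a rough illustration of why dense unions complicate size estimation: in Arrow's dense union layout, every element carries a type id and an offset into one of the child arrays, and slicing the union only slices those two buffers, not the children. Estimating the size of a slice therefore means working out, per child, which range of values the slice actually references. The sketch below uses hypothetical toy types (`DenseUnion`, `child_ranges`), not the arrow2 API or the code in this PR, and assumes the offsets into each child are increasing, as the Arrow format requires.

```rust
use std::collections::BTreeMap;

/// Hypothetical toy stand-in for a dense union:
/// one type id and one child offset per element.
struct DenseUnion {
    types: Vec<i8>,
    offsets: Vec<i32>,
}

/// For the slice `[start, start + len)`, returns for each child type the
/// inclusive range of child offsets that the slice references.
/// Because offsets into a given child are increasing, the first offset seen
/// is the lower bound and the last one seen is the upper bound.
fn child_ranges(u: &DenseUnion, start: usize, len: usize) -> BTreeMap<i8, (i32, i32)> {
    let mut ranges: BTreeMap<i8, (i32, i32)> = BTreeMap::new();
    for i in start..start + len {
        let (ty, off) = (u.types[i], u.offsets[i]);
        ranges
            .entry(ty)
            .and_modify(|r| r.1 = off)
            .or_insert((off, off));
    }
    ranges
}

fn main() {
    // Elements, in order: A0, B0, A1, B1, B2, A2 (type 0 = A, type 1 = B).
    let u = DenseUnion {
        types: vec![0, 1, 0, 1, 1, 0],
        offsets: vec![0, 0, 1, 1, 2, 2],
    };
    // Slice covering A1, B1, B2: only part of each child is referenced.
    let ranges = child_ranges(&u, 2, 3);
    assert_eq!(ranges.get(&0i8), Some(&(1, 1)));
    assert_eq!(ranges.get(&1i8), Some(&(1, 2)));
}
```

Only the values inside those per-child ranges would count toward the estimated size of the slice, which is why the union case needs a walk over the indices rather than a simple length multiplication.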

crates/re_log_types/src/size_bytes.rs (resolved)
crates/re_log_types/src/size_bytes.rs (resolved)
crates/re_log_types/src/size_bytes.rs (outdated, resolved)
@teh-cmc teh-cmc requested a review from jleibs May 4, 2023 14:17
Comment on lines +366 to +369
let mut idx_end = idx_start;
for idx in indices {
    idx_end = idx;
}

Suggested change
- let mut idx_end = idx_start;
- for idx in indices {
-     idx_end = idx;
- }
+ let idx_end = indices.last().unwrap_or_default();

@teh-cmc teh-cmc May 4, 2023

Yeah... I didn't go there because it very much looks like this is some magic O(1) function, which feels like I'm betraying the reader...
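
To make the cost concern concrete, here is a minimal sketch, assuming `indices` is a plain iterator rather than a slice: for such an iterator `Iterator::last` has to consume every element, so it is O(n) just like the explicit loop. The names `last_via_loop`, `last_via_iter` and `CountingIter` are hypothetical and not from the PR, and `unwrap_or(idx_start)` is used here only to keep the loop's fallback value.

```rust
/// Explicit loop, as in the original code: visits every element.
fn last_via_loop(idx_start: usize, indices: impl Iterator<Item = usize>) -> usize {
    let mut idx_end = idx_start;
    for idx in indices {
        idx_end = idx;
    }
    idx_end
}

/// Same result via `Iterator::last`, falling back to `idx_start` when empty.
fn last_via_iter(idx_start: usize, indices: impl Iterator<Item = usize>) -> usize {
    indices.last().unwrap_or(idx_start)
}

/// Hypothetical iterator that counts how often `next` is called.
struct CountingIter {
    next_value: usize,
    end: usize,
    steps: usize,
}

impl Iterator for CountingIter {
    type Item = usize;

    fn next(&mut self) -> Option<usize> {
        self.steps += 1;
        if self.next_value < self.end {
            let value = self.next_value;
            self.next_value += 1;
            Some(value)
        } else {
            None
        }
    }
}

fn main() {
    assert_eq!(last_via_loop(3, vec![3usize, 4, 7].into_iter()), 7);
    assert_eq!(last_via_iter(3, vec![3usize, 4, 7].into_iter()), 7);
    // Both fall back to `idx_start` on an empty iterator.
    assert_eq!(last_via_loop(3, std::iter::empty()), 3);
    assert_eq!(last_via_iter(3, std::iter::empty()), 3);

    // `last` walks the whole iterator: `next` runs at least once per element.
    let mut it = CountingIter { next_value: 0, end: 5, steps: 0 };
    assert_eq!(it.by_ref().last(), Some(4));
    assert!(it.steps >= 5);
}
```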

@jleibs jleibs left a comment

Looks good!

@teh-cmc teh-cmc merged commit 5a32f6e into main May 4, 2023
@teh-cmc teh-cmc deleted the cmc/sdk_revamp/22_batching_estimated_size_hell branch May 4, 2023 15:52