Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add comments and tests for gc_string_view_batch #1

Merged

Conversation

alamb
Copy link

@alamb alamb commented Jul 24, 2024

Note this targets apache#11587

Rationale

While reviewing apache#11587 I felt something wasn't quite right with the heuristic, but I couldn't articulate it directly.

So I wrote some tests that show the problem

let gc_array = do_gc(array.clone());
compare_string_array_values(&array, &gc_array);
assert_eq!(array.data_buffers().len(), 5);
// TODO this is failing now (it always compacts)
Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This test is failing now because the StringView is compacted even though all data in the buffers is pointed to

What I think this is showing is that string views with many long strings but no actual garbage are always compacted, which doesn't seem like a good heuristic

Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I agree the heuristic is a bit arbitrary and not really backed by real experiments... I'll think a bit more about this and back to it later today

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

FWIW I am going to prototype what the combined filter/coalsece thing we were talking about on apache#11628 might look like

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Prototype is here: apache#11647 (not yet ready for review)

@XiangpengHao XiangpengHao merged commit 2c882d6 into XiangpengHao:string-view-gc Jul 25, 2024
1 check passed
@alamb alamb deleted the alamb/string-view-gc-tests branch July 26, 2024 14:55
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants