
Use Typed Buffers in Arrays (#1811) (#1176) #3743

Merged: 3 commits into apache:master on Feb 23, 2023

Conversation

tustvold (Contributor)

Which issue does this PR close?

Part of #1176
Closes #1811

Rationale for this change

The first part of moving towards a fully typed ArrayData as part of #1176

As an added bonus, the change to precompute the offset for ByteArray improves the performance of the string comparison kernels by up to 25%. The arithmetic kernels show a very slight performance regression of ~1-2%; I'm inclined to think this does not matter, as we are talking microseconds here.
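A rough sketch of what precomputing the offset buys, using hypothetical simplified types (not the actual arrow-rs definitions): the array offset is applied once at construction instead of on every offsets access inside a hot kernel.

// Hypothetical, simplified types -- not the real arrow-rs definitions.
struct BeforeStringArray {
    offsets: Vec<i32>, // the parent buffer, shared with other slices
    offset: usize,     // this array's offset into the parent buffer
}

impl BeforeStringArray {
    fn value_offsets(&self) -> &[i32] {
        // The array offset is re-applied on every call.
        &self.offsets[self.offset..]
    }
}

struct AfterStringArray {
    offsets: Vec<i32>, // already sliced to start at this array's offset
}

impl AfterStringArray {
    fn value_offsets(&self) -> &[i32] {
        // No per-call arithmetic: the slice was computed once at construction.
        &self.offsets
    }
}

fn main() {
    let before = BeforeStringArray { offsets: vec![0, 3, 5, 9], offset: 1 };
    let after = AfterStringArray { offsets: before.value_offsets().to_vec() };
    assert_eq!(after.value_offsets(), &[3, 5, 9]);
}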

What changes are included in this PR?

Are there any user-facing changes?

github-actions bot added the arrow (Changes to the arrow crate) label on Feb 21, 2023
@@ -162,12 +183,15 @@ impl Buffer {
/// Panics iff `(offset + length)` is larger than the existing length.
pub fn slice_with_length(&self, offset: usize, length: usize) -> Self {
assert!(
- offset + length <= self.len(),
+ offset.saturating_add(length) <= self.length,
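A minimal, self-contained illustration (not from the PR) of the arithmetic behind the switch to saturating_add: with overflow checks disabled (the release default), a plain `offset + length` wraps around and could let an out-of-bounds slice pass the check, whereas the saturating sum clamps to `usize::MAX` and the assert fails as intended.

fn main() {
    let buffer_len: usize = 1024;
    let offset = usize::MAX;
    let length = 2usize;

    // Plain addition wraps in release builds: usize::MAX + 2 wraps to 1,
    // so the bounds check would incorrectly pass.
    assert!(offset.wrapping_add(length) <= buffer_len);

    // Saturating addition clamps to usize::MAX, so the check correctly fails.
    assert!(offset.saturating_add(length) > buffer_len);
}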
@@ -603,7 +603,7 @@ mod tests {
let record_batch =
RecordBatch::try_new(Arc::new(schema), vec![Arc::new(a), Arc::new(b)])
.unwrap();
- assert_eq!(record_batch.get_array_memory_size(), 592);
+ assert_eq!(record_batch.get_array_memory_size(), 640);
Contributor (author):

There is a slight regression in the size of the arrays themselves. They are typically passed around as references, so I am inclined to think this doesn't really matter; it is also likely temporary.

unsafe {
    std::slice::from_raw_parts(
        self.buffer.as_ptr() as *const T,
        self.buffer.len() / std::mem::size_of::<T>(),
    )
}
Contributor (author):

I found that caching this value doesn't impact performance. This is likely because the "hot" codepaths are either iterating the slice, amortising this cost over multiple values, or using unchecked addressing, where this length isn't ever consulted.
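A hypothetical sketch (not arrow-rs code) of the two access patterns described above; in neither case does the per-call length computation appear inside the hot loop.

// Pattern 1: iterate the slice. The length is computed once when the slice is
// materialised, and the cost is amortised over every element visited.
fn sum(values: &[i64]) -> i64 {
    values.iter().sum()
}

// Pattern 2: unchecked addressing. The length is never consulted at all.
//
// # Safety
// The caller must guarantee `index < values.len()`.
unsafe fn value_unchecked(values: &[i64], index: usize) -> i64 {
    *values.get_unchecked(index)
}

fn main() {
    let values = vec![1_i64, 2, 3];
    assert_eq!(sum(&values), 6);
    assert_eq!(unsafe { value_unchecked(&values, 1) }, 2);
}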

@@ -637,7 +637,7 @@ pub fn new_null_array(data_type: &DataType, length: usize) -> ArrayRef {
}

// Helper function for printing potentially long arrays.
- pub(crate) fn print_long_array<A, F>(
+ fn print_long_array<A, F>(
Contributor (author):

This is just a drive-by cleanup

///
/// We store a pointer instead of an offset to avoid pointer arithmetic
/// which causes LLVM to fail to vectorise code correctly
ptr: *const u8,
Contributor (author):

This seemingly inconsequential change makes a substantial difference to the benchmarks: LLVM seems not to realise that the offset doesn't change in the body of the loop and so fails to hoist it out, which in turn prevents the loop itself from vectorising correctly.
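A hedged sketch of the contrast being described, using hypothetical types rather than the real arrow-rs code: with an offset field, every access performs pointer arithmetic that the optimiser must prove loop-invariant; with a precomputed pointer, the base address is fixed and there is nothing left to hoist.

// Hypothetical sketch, not the real arrow-rs types.

// Offset-based: each access recomputes `base + offset + i`, and LLVM has to
// prove `self.offset` is loop-invariant before it can hoist the addition.
struct OffsetBased {
    buffer: Vec<u8>,
    offset: usize,
}

impl OffsetBased {
    #[inline]
    fn value(&self, i: usize) -> u8 {
        self.buffer[self.offset + i]
    }
}

// Pointer-based: `ptr` already points at the first element, so the loop body
// contains no per-iteration offset arithmetic to hoist.
struct PointerBased {
    _owner: Vec<u8>, // keeps the allocation alive while `ptr` is in use
    ptr: *const u8,
    len: usize,
}

impl PointerBased {
    #[inline]
    fn value(&self, i: usize) -> u8 {
        assert!(i < self.len);
        unsafe { *self.ptr.add(i) }
    }
}

fn main() {
    let owner = vec![10u8, 20, 30, 40];
    let offset_based = OffsetBased { buffer: owner.clone(), offset: 1 };
    let ptr = owner[1..].as_ptr();
    let pointer_based = PointerBased { ptr, len: owner.len() - 1, _owner: owner };
    assert_eq!(offset_based.value(2), pointer_based.value(2));
}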

tustvold requested a review from viirya on February 21, 2023 at 16:53
viirya (Member) left a comment:

Thank you @tustvold for moving this forward! I plan to review this in the next few days.

Comment on lines -121 to -129
// Soundness
// pointer alignment & location is ensured by RawPtrBox
// buffer bounds/offset is ensured by the ArrayData instance.
unsafe {
    std::slice::from_raw_parts(
        self.value_offsets.as_ptr().add(self.data.offset()),
        self.len() + 1,
    )
}
Member:

Oh that's good. 👍

Comment on lines 227 to 238
let value_offsets = match data.is_empty() && data.buffers()[0].is_empty() {
    true => OffsetBuffer::new_empty(),
    false => {
        let buffer = ScalarBuffer::new(
            data.buffers()[0].clone(),
            data.offset(),
            data.len() + 1,
        );
        // Safety:
        // ArrayData is valid
        unsafe { OffsetBuffer::new_unchecked(buffer) }
    }
};
Member:

This seems to occur more than once. Maybe we can have a function for it?
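For reference, a sketch of such a helper, assembled from the calls already shown in the diff above; the `get_offsets` function the PR adds further down follows this shape, and the `use` paths here are assumptions.

use arrow_buffer::{ArrowNativeType, OffsetBuffer, ScalarBuffer};
use arrow_data::ArrayData;

/// Sketch of a shared helper for the repeated offsets construction.
///
/// # Safety
/// `data` must contain a valid offsets buffer as its first buffer.
unsafe fn get_offsets<O: ArrowNativeType>(data: &ArrayData) -> OffsetBuffer<O> {
    match data.is_empty() && data.buffers()[0].is_empty() {
        true => OffsetBuffer::new_empty(),
        false => {
            let buffer = ScalarBuffer::new(
                data.buffers()[0].clone(),
                data.offset(),
                data.len() + 1,
            );
            // Safety: ArrayData is valid
            OffsetBuffer::new_unchecked(buffer)
        }
    }
}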


/// A non-empty buffer of monotonically increasing, positive integers
#[derive(Debug, Clone)]
pub struct OffsetBuffer<O: ArrowNativeType>(ScalarBuffer<O>);
Member:

Suggested change:
- pub struct OffsetBuffer<O: ArrowNativeType>(ScalarBuffer<O>);
+ pub struct OffsetBuffer<O: OffsetSizeTrait>(ScalarBuffer<O>);

Contributor (author):

OffsetSizeTrait is defined in arrow-array and so can't be used here

- Self { buffer, ptr, len }
+ let byte_offset = offset.checked_mul(size).expect("offset overflow");
+ let byte_len = len.checked_mul(size).expect("length overflow");
+ buffer.slice_with_length(byte_offset, byte_len).into()
Member:

No need to check alignment anymore?

Contributor (author):

The alignment gets checked in the From conversion
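A purely hypothetical illustration (not the actual arrow-rs From impl) of the kind of alignment and length check such a conversion can perform before reinterpreting raw bytes as typed values:

use std::mem::size_of;

// Hypothetical sketch only; the real arrow-rs conversion may differ.
fn check_layout<T>(bytes: &[u8]) {
    // `align_to` splits off a misaligned prefix and any trailing bytes that
    // don't form a whole `T`; both must be empty for a safe reinterpretation.
    let (prefix, _, suffix) = unsafe { bytes.align_to::<T>() };
    assert!(prefix.is_empty(), "buffer is not properly aligned");
    assert!(
        suffix.is_empty(),
        "buffer length is not a multiple of {} bytes",
        size_of::<T>()
    );
}

fn main() {
    // An 8-byte-aligned byte view, guaranteed because it was allocated as u64s.
    let values: Vec<u64> = vec![1, 2, 3];
    let bytes = unsafe {
        std::slice::from_raw_parts(values.as_ptr() as *const u8, values.len() * size_of::<u64>())
    };
    check_layout::<u64>(bytes); // passes: aligned and a whole number of u64s
}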

/// # Safety
///
/// - ArrayData must contain a valid [`OffsetBuffer`] as its first buffer
unsafe fn get_offsets<O: ArrowNativeType>(data: &ArrayData) -> OffsetBuffer<O> {
Contributor (author):

This will eventually get removed as we make ArrayData an enumeration of typed buffers
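A purely hypothetical sketch of the stated direction, i.e. ArrayData becoming an enumeration over typed buffers; none of these names are real arrow-rs definitions.

// Purely hypothetical: illustrates the direction, not a real arrow-rs type.
enum TypedArrayData {
    // Primitive arrays carry a typed values buffer directly.
    Int32 { values: Vec<i32> },
    // Variable-length arrays carry typed offsets plus a byte buffer, so a
    // shim like `get_offsets` is no longer needed.
    Utf8 { offsets: Vec<i32>, values: Vec<u8> },
}

fn main() {
    let array = TypedArrayData::Utf8 {
        offsets: vec![0, 3, 5],
        values: b"foobar"[..5].to_vec(),
    };
    if let TypedArrayData::Utf8 { offsets, .. } = &array {
        assert_eq!(offsets.len(), 3);
    }
}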

tustvold merged commit 47e4b61 into apache:master on Feb 23, 2023

ursabot commented Feb 23, 2023

Benchmark runs are scheduled for baseline = ebe6f53 and contender = 47e4b61. 47e4b61 is a master commit associated with this PR. Results will be available as each benchmark for each run completes.
Conbench compare runs links:
[Skipped ⚠️ Benchmarking of arrow-rs-commits is not supported on ec2-t3-xlarge-us-east-2] ec2-t3-xlarge-us-east-2
[Skipped ⚠️ Benchmarking of arrow-rs-commits is not supported on test-mac-arm] test-mac-arm
[Skipped ⚠️ Benchmarking of arrow-rs-commits is not supported on ursa-i9-9960x] ursa-i9-9960x
[Skipped ⚠️ Benchmarking of arrow-rs-commits is not supported on ursa-thinkcentre-m75q] ursa-thinkcentre-m75q
Buildkite builds:
Supported benchmarks:
ec2-t3-xlarge-us-east-2: Supported benchmark langs: Python, R. Runs only benchmarks with cloud = True
test-mac-arm: Supported benchmark langs: C++, Python, R
ursa-i9-9960x: Supported benchmark langs: Python, R, JavaScript
ursa-thinkcentre-m75q: Supported benchmark langs: C++, Java

Labels: arrow (Changes to the arrow crate)
Projects: None yet
Development: Successfully merging this pull request may close these issues:
- Replace RawPtrBox with Safe Abstraction
3 participants