
@rluvaton
Member

Which issue does this PR close?

N/A

Rationale for this change

I noticed that conversion becomes very slow when converting around 50 columns, so I'm adding a benchmark while I optimize those parts.

What changes are included in this PR?

Added a new benchmark for row_format that converts 50 column arrays.
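
For readers unfamiliar with what is being measured, here is a minimal sketch of a many-column row conversion. It is a toy model using only the standard library, not the actual row_format benchmark and not arrow's `RowConverter`: the `Column` alias and the `encode_value`, `convert_columns`, and `make_columns` helpers are hypothetical names, and the 9-byte encoding is only loosely modelled on an order-preserving row layout.

```rust
use std::time::Instant;

/// Hypothetical stand-in for a nullable Int64 column (not arrow's `ArrayRef`).
type Column = Vec<Option<i64>>;

/// Encode one value as 9 bytes: a validity byte, then the big-endian value
/// with its sign bit flipped so that byte-wise comparison matches numeric order.
fn encode_value(v: Option<i64>, out: &mut Vec<u8>) {
    match v {
        // Nulls become all zeros, so they sort before every non-null value.
        None => out.extend_from_slice(&[0u8; 9]),
        Some(x) => {
            out.push(1);
            out.extend_from_slice(&((x as u64) ^ (1 << 63)).to_be_bytes());
        }
    }
}

/// Convert columns into one byte row per record (column values concatenated).
fn convert_columns(cols: &[Column]) -> Vec<Vec<u8>> {
    let num_rows = cols.first().map_or(0, |c| c.len());
    (0..num_rows)
        .map(|row| {
            let mut bytes = Vec::with_capacity(cols.len() * 9);
            for col in cols {
                encode_value(col[row], &mut bytes);
            }
            bytes
        })
        .collect()
}

/// Build `num_cols` columns of `batch_size` values with some nulls mixed in.
fn make_columns(num_cols: usize, batch_size: usize) -> Vec<Column> {
    (0..num_cols)
        .map(|c| {
            (0..batch_size)
                .map(|r| {
                    if (r + c) % 7 == 0 {
                        None
                    } else {
                        Some((r * 31 + c) as i64 - 1000)
                    }
                })
                .collect()
        })
        .collect()
}

fn main() {
    // 50 columns mirrors the case described above; 4096 is an assumed batch size.
    let cols = make_columns(50, 4096);
    let start = Instant::now();
    let rows = convert_columns(&cols);
    println!(
        "converted {} rows x {} columns in {:?}",
        rows.len(),
        cols.len(),
        start.elapsed()
    );
}
```

A single `Instant` measurement like this is noisy; the real benchmark goes through the crate's benchmark harness (the `do_bench` helper seen in the diff).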

Are these changes tested?

N/A

Are there any user-facing changes?

Nope

@github-actions github-actions bot added the arrow Changes to the arrow crate label Dec 31, 2025
Contributor

@alamb alamb left a comment

Thanks @rluvaton

));
}

for _ in 0..3 {
Contributor

How about also adding a StringView column (in addition to String)?

cols.push(Arc::new(create_f64_array_with_seed(batch_size, nulls, seed)) as ArrayRef);
}

do_bench(c, format!("{batch_size} lot of columns").as_str(), cols);
Contributor

Could you please make the description more precise (specifically, "how many" columns) -- maybe with an assert too, to make sure the benchmark stays in sync

assert_eq!(cols.len(), 50)
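
A minimal sketch of how that suggestion could look, using only the standard library. The `Column` alias, the `EXPECTED_COLUMNS` constant, the group sizes, and the `build_columns` and `bench_name` helpers are hypothetical stand-ins, not the code in this PR: the point is only that the benchmark name is derived from `cols.len()` and the assert fails if someone changes the number of columns without updating the expectation.

```rust
/// Hypothetical stand-in for arrow's `ArrayRef`; only the column count matters here.
type Column = Vec<f64>;

/// Hypothetical expected width of the benchmark input.
const EXPECTED_COLUMNS: usize = 50;

/// Build the columns in several groups, as a benchmark setup typically does.
/// The group sizes (20 + 20 + 10) are made up for illustration.
fn build_columns(batch_size: usize) -> Vec<Column> {
    let mut cols: Vec<Column> = Vec::new();
    for _ in 0..20 {
        cols.push(vec![0.0; batch_size]);
    }
    for _ in 0..20 {
        cols.push(vec![1.0; batch_size]);
    }
    for _ in 0..10 {
        cols.push(vec![2.0; batch_size]);
    }
    cols
}

/// Derive the benchmark name from the actual column count so it cannot drift.
fn bench_name(batch_size: usize, cols: &[Column]) -> String {
    format!("{batch_size} {} columns", cols.len())
}

fn main() {
    let batch_size = 4096;
    let cols = build_columns(batch_size);
    // Fails loudly if a group is added or removed without updating the constant.
    assert_eq!(cols.len(), EXPECTED_COLUMNS);
    println!("{}", bench_name(batch_size, &cols));
}
```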

Contributor

Since I am trying to clear the backlog, I took the liberty of making this change and pushing it to this PR (3c5bf63)

@alamb alamb changed the title from "bench: added to row_format benchmark conversion of 50 non-nested columns" to "bench: added to row_format benchmark conversion of 53 non-nested columns" Jan 9, 2026
@alamb alamb merged commit 5a1e482 into apache:main Jan 10, 2026
24 checks passed
@rluvaton rluvaton deleted the add-large-number-of-columns-bench branch January 10, 2026 15:40
Dandandan pushed a commit to Dandandan/arrow-rs that referenced this pull request Jan 15, 2026
…mns (apache#9081)

---------

Co-authored-by: Andrew Lamb <andrew@nerdnetworks.org>

Labels

arrow (Changes to the arrow crate), performance

2 participants