Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Remove duplicate psp_okey column from arrow updates #1044

Merged
merged 3 commits into from
May 13, 2020
Merged

Conversation

sc1f
Copy link
Contributor

@sc1f sc1f commented May 12, 2020

This PR reproduces and fixes a segmentation fault with arrow updates and string reallocation - previously, the psp_okey column was being duplicated incorrectly on an Arrow update, and this turned out to be the cause of a few different segfaults that became trivially reproducible (and has been reproduced and tested in the Python tests).

Removing the incorrect duplication fixes the tests and seems to (for now) prevented any further segfaults related to this issue.

Copy link
Member

@texodus texodus left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the PR!

The size of this commit belies the monumental effort required to isolate and debug this behavior. Very useful set of tests also, considering how long this bug eluded capture!

@@ -1064,7 +1064,8 @@ namespace binding {

// Always use the `Table` column names and data types on up
if (table_initialized && is_update) {
auto schema = gnode->get_output_schema();
auto gnode_output_schema = gnode->get_output_schema();
auto schema = gnode_output_schema.drop({"psp_okey"});
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It is amazing to me that this coerces to std::set<std::string> ..

Since column_names is simply iterated on 10loc below this, it may be cheaper to simple continue on "psp_okey", since drop() iterates through the set and reconstructs both name and type vectors from scratch.

@texodus texodus merged commit cafaec6 into master May 13, 2020
@texodus texodus deleted the py-segfaults branch May 13, 2020 01:55
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants