@natashasehgal natashasehgal commented Oct 30, 2025

Summary:
Fix a stale pointer issue in MapWriter that causes a segfault in an FB-internal UDF.

When a UDF calls add_item() repeatedly on a Map result writer, the underlying vector may be reallocated to a larger buffer once it is full. MapWriter, however, keeps pointers to the old buffer, so subsequent writes access freed memory and segfault.

The fix refreshes two members after a reallocation:
- valuesVector_ = new vector location
- data_ = new memory pointer

so that the next add_item() writes to the correct location.

      for (const auto& [timestamp, value] : timeValues) {
        int64_t timestampMs = timestamp * 1000; // Convert to milliseconds
        auto [keyWriter, valueWriter] = result.add_item();
        keyWriter = Timestamp::fromMillis(timestampMs);
        valueWriter = value; // <<<< segfault HERE
      }
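
The failure mode and the refresh step can be illustrated with a minimal, self-contained sketch. It is not the Velox implementation: `ToyValuesVector`, `ToyWriter`, `addItem()` and `writesLandInLiveBuffer()` are hypothetical names invented for this example, and a `std::vector` stands in for the real values vector. The only point it shows is that a cached raw pointer must be re-read after the storage grows.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical stand-in for a values vector whose buffer can move on growth.
struct ToyValuesVector {
  std::vector<int64_t> buffer;
  int64_t* mutableRawValues() { return buffer.data(); }
  void resize(std::size_t n) { buffer.resize(n); }
  std::size_t size() const { return buffer.size(); }
};

// Hypothetical writer that caches a raw pointer into the vector,
// analogous to the data_ member mentioned in the summary.
class ToyWriter {
 public:
  explicit ToyWriter(ToyValuesVector* values)
      : values_(values), data_(values->mutableRawValues()) {}

  // Reserves one more slot and returns a reference to it.
  int64_t& addItem() {
    if (length_ == values_->size()) {
      // Growing may move the buffer to a new address.
      values_->resize(values_->size() == 0 ? 1 : values_->size() * 2);
      // Refresh step: without this line data_ would keep pointing at the
      // old buffer, and the write below would touch freed memory.
      data_ = values_->mutableRawValues();
    }
    return data_[length_++];
  }

  std::size_t length() const { return length_; }

 private:
  ToyValuesVector* values_;
  int64_t* data_;
  std::size_t length_ = 0;
};

// Writes n items through the writer, then checks that every write is
// visible in the vector's current buffer.
bool writesLandInLiveBuffer(std::size_t n) {
  ToyValuesVector values;
  ToyWriter writer(&values);
  for (std::size_t i = 0; i < n; ++i) {
    writer.addItem() = static_cast<int64_t>(i) * 1000;
  }
  for (std::size_t i = 0; i < n; ++i) {
    if (values.buffer[i] != static_cast<int64_t>(i) * 1000) {
      return false;
    }
  }
  return writer.length() == n;
}
```

Calling `writesLandInLiveBuffer(100)` forces several reallocations. In this toy version, removing the refresh line makes later writes go through a dangling pointer, which is undefined behavior and would typically crash or corrupt memory.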

Differential Revision: D85692836

netlify bot commented Oct 30, 2025

Deploy Preview for meta-velox canceled.

🔨 Latest commit: f4a1c87
🔍 Latest deploy log: https://app.netlify.com/projects/meta-velox/deploys/690a8d37e29a7b0008c6f456

@meta-cla meta-cla bot added the CLA Signed label Oct 30, 2025

meta-codesync bot commented Oct 30, 2025

@natashasehgal has exported this pull request. If you are a Meta employee, you can view the originating Diff in D85692836.

@natashasehgal natashasehgal changed the title from "fix: Prevent complex type result vector reuse" to "fix: Refresh Map Vector pointers after reallocation" Oct 30, 2025
…r#15338)

Summary:

Fix a stale pointer issue in MapWriter that causes a segfault in the FB_DECOMPRESS_BERINGEI_TIME_BLOCK UDF.

When a UDF calls `add_item()` repeatedly on a Map result writer, the underlying vector may be reallocated to a larger buffer once it is full. MapWriter, however, keeps pointers to the old buffer, so subsequent writes access freed memory and segfault.

The fix refreshes two members after a reallocation:
- valuesVector_ = new vector location
- data_ = new memory pointer

so that the next `add_item()` writes to the correct location.

-- 
Example of the segfault in FB_DECOMPRESS_BERINGEI_TIME_BLOCK:

      for (const auto& [timestamp, value] : timeValues) {
        int64_t timestampMs = timestamp * 1000; // Convert to milliseconds
        auto [keyWriter, valueWriter] = result.add_item();
        keyWriter = Timestamp::fromMillis(timestampMs);
        valueWriter = value; // <<<< segfault HERE
      }

Switching the UDF to emplace() does not help, because emplace() internally calls add_item(); both hit the same issue.
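
Why both entry points share the problem, and why a fix placed in the add-item path covers both, can be sketched with a toy two-column writer. Everything here is hypothetical and invented for illustration (`ToyMapWriter`, `addItem()`, `emplace()`, `keyAt()`, `valueAt()`, `bothPathsWriteCorrectly()`); it does not reproduce Velox's MapWriter and uses plain `int64_t` keys and `double` values.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <tuple>
#include <vector>

// Hypothetical map-like writer: keys and values live in separate growable
// buffers, and raw pointers into both are cached.
class ToyMapWriter {
 public:
  // Returns references to the next key and value slots, growing if needed.
  std::tuple<int64_t&, double&> addItem() {
    if (length_ == keys_.size()) {
      const std::size_t newSize = keys_.empty() ? 1 : keys_.size() * 2;
      keys_.resize(newSize);
      values_.resize(newSize);
      // Refresh both cached pointers, since either buffer may have moved.
      keyData_ = keys_.data();
      valueData_ = values_.data();
    }
    const std::size_t i = length_++;
    return std::tie(keyData_[i], valueData_[i]);
  }

  // Convenience wrapper. It delegates to addItem(), so it inherits whatever
  // pointer handling addItem() performs, correct or not.
  void emplace(int64_t key, double value) {
    auto [k, v] = addItem();
    k = key;
    v = value;
  }

  std::size_t size() const { return length_; }
  int64_t keyAt(std::size_t i) const { return keys_[i]; }
  double valueAt(std::size_t i) const { return values_[i]; }

 private:
  std::vector<int64_t> keys_;
  std::vector<double> values_;
  int64_t* keyData_ = nullptr;
  double* valueData_ = nullptr;
  std::size_t length_ = 0;
};

// Alternates between emplace() and addItem() and checks every stored pair.
bool bothPathsWriteCorrectly(std::size_t n) {
  ToyMapWriter writer;
  for (std::size_t i = 0; i < n; ++i) {
    const auto key = static_cast<int64_t>(i) * 1000;
    const auto value = static_cast<double>(i) + 0.5;
    if (i % 2 == 0) {
      writer.emplace(key, value);
    } else {
      auto [k, v] = writer.addItem();
      k = key;
      v = value;
    }
  }
  for (std::size_t i = 0; i < n; ++i) {
    if (writer.keyAt(i) != static_cast<int64_t>(i) * 1000 ||
        writer.valueAt(i) != static_cast<double>(i) + 0.5) {
      return false;
    }
  }
  return writer.size() == n;
}
```

Because `emplace()` in this sketch is only a thin wrapper, there is a single place where the cached pointers need refreshing. The references returned by `addItem()` are valid only until the next call that may grow the buffers, so the sketch writes through them immediately.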

Differential Revision: D85692836
Labels

CLA Signed, fb-exported, meta-exported
