misc: Make WriterOptions serializable#14868

Closed
yingsu00 wants to merge 1 commit into facebookincubator:main from yingsu00:WriterOptionsSerDe

@yingsu00
Contributor

No description provided.

@meta-cla meta-cla bot added the CLA Signed label Sep 16, 2025
@netlify

netlify bot commented Sep 16, 2025

Deploy Preview for meta-velox canceled.

Name Link
🔨 Latest commit ae636e2
🔍 Latest deploy log https://app.netlify.com/projects/meta-velox/deploys/68c8ef1f3125340008dec24d

Contributor

@mbasmanova mbasmanova left a comment

@yingsu00 Do you want to upgrade this PR from 'draft'?

  • CI is red. Please, take a look.
  • Use "feat: " prefix for PR title.
  • Please, add PR description to clarify the purpose of this change (a stepping stone towards making HiveInsertTableHandle serializable)

folly::dynamic obj = folly::dynamic::object;
obj["name"] = "WriterOptions";

// 1) Schema
Contributor

Ordinal numbers like this are fragile as they do not survive refactoring. Please, drop.

Also, these comments seem redundant as they just repeat the code. Consider dropping altogether.

// 5) adjustTimestampToTimezone
obj["adjustTimestampToTimezone"] = adjustTimestampToTimezone;

// We do *not* serialize pool, spillConfig, nonReclaimableSection,
Contributor

Would you move this comment to the header file? I assume the caller of 'create' would need to be aware of these.

}

if (auto p = obj.get_ptr("serdeParameters")) {
opts->serdeParameters.clear();
Contributor

This is redundant.


if (auto p = obj.get_ptr("serdeParameters")) {
opts->serdeParameters.clear();
for (auto& kv : p->items()) {
Contributor

for (const auto& [key, value] : p->items())
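
A minimal sketch of the suggested loop form. It assumes C++17 and uses `std::map` as a stand-in for `folly::dynamic` so that it compiles without folly; `SerializedParams` and `readSerdeParameters` are hypothetical names, not part of the PR.

```cpp
#include <cassert>
#include <map>
#include <string>

// Hypothetical stand-in for the serialized "serdeParameters" object.
// The PR iterates over folly::dynamic::items() instead.
using SerializedParams = std::map<std::string, std::string>;

// Copies every entry into a fresh map. The structured binding names both
// halves of each pair, so the body never needs kv.first / kv.second.
std::map<std::string, std::string> readSerdeParameters(
    const SerializedParams& serialized) {
  std::map<std::string, std::string> result;
  for (const auto& [key, value] : serialized) {
    result.emplace(key, value);
  }
  return result;
}
```

Because the result starts out empty, no `clear()` call is needed before the loop.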

opts->adjustTimestampToTimezone = p->asBool();
}

// TODO: Finish spillConfig. We currently do not serialize it.
Contributor

TODO: Finish spillConfig.

What does that mean? Would you elaborate? Can you address this TODO now?

*/
#include <gtest/gtest.h>

// #include "velox/common/compression/Compression.h"
Contributor

Please, remove.


auto opts = std::make_shared<WriterOptions>();

// Schema: row<a:bigint>
Contributor

Drop redundant comments.

// Basic shape checks on serialized output
ASSERT_TRUE(serialized.isObject());
// Always present:
ASSERT_TRUE(serialized.count("adjustTimestampToTimezone") == 1);
Contributor

ASSERT_EQ

However, let's not hard-code the serde format in the test. Rather, let's create a round-trip test.
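
A round-trip test along these lines could be sketched as follows. This is an illustration under stated assumptions, not the PR's code: `DemoOptions` is a hypothetical, trimmed-down type, a flat `std::map` stands in for `folly::dynamic`, and the free functions `serialize`, `deserialize`, and `roundTrips` are invented for the example.

```cpp
#include <cassert>
#include <map>
#include <string>

// Hypothetical, trimmed-down options type; not the real WriterOptions.
struct DemoOptions {
  bool adjustTimestampToTimezone{false};
  std::map<std::string, std::string> serdeParameters;

  bool operator==(const DemoOptions& other) const {
    return adjustTimestampToTimezone == other.adjustTimestampToTimezone &&
        serdeParameters == other.serdeParameters;
  }
};

// Stand-in for folly::dynamic: a flat string-to-string map.
using Serialized = std::map<std::string, std::string>;

Serialized serialize(const DemoOptions& options) {
  Serialized out;
  out["adjustTimestampToTimezone"] =
      options.adjustTimestampToTimezone ? "true" : "false";
  for (const auto& [key, value] : options.serdeParameters) {
    out["serde." + key] = value;
  }
  return out;
}

DemoOptions deserialize(const Serialized& in) {
  DemoOptions options;
  const std::string prefix = "serde.";
  for (const auto& [key, value] : in) {
    if (key == "adjustTimestampToTimezone") {
      options.adjustTimestampToTimezone = (value == "true");
    } else if (key.rfind(prefix, 0) == 0) {
      // Key starts with the prefix: strip it and restore the parameter.
      options.serdeParameters[key.substr(prefix.size())] = value;
    }
  }
  return options;
}

// Compares objects only. The serialized keys are never inspected, so the
// format can change without breaking this check.
bool roundTrips(const DemoOptions& original) {
  return deserialize(serialize(original)) == original;
}
```

In a gtest suite the same check would be an `ASSERT_EQ` on the original and the copy, which requires some way to compare the two objects field by field.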

EXPECT_EQ(serialized.count("flushPolicyFactory"), 0);

// Deserialize
auto roundTripped = ISerializable::deserialize<WriterOptions>(serialized);
Contributor

roundTripped -> copy

};

-struct WriterOptions {
+struct WriterOptions : public ISerializable {
Contributor

I see that WriterOptions has a virtual method (processConfigs), which means that it has derived classes, which may have additional state. Serde needs to take this into account.
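
One way to account for derived state is to make serialization virtual and dispatch deserialization on a type tag. The sketch below is an assumption about how that could look, not the design adopted in the PR: `BaseOptions`, `DerivedOptions`, `rowGroupSize`, and the local registry are all hypothetical, and a flat `std::map` stands in for `folly::dynamic`. The `name` key loosely mirrors `obj["name"] = "WriterOptions"` from the diff.

```cpp
#include <cassert>
#include <map>
#include <memory>
#include <stdexcept>
#include <string>

// Stand-in for folly::dynamic: a flat string-to-string map.
using Serialized = std::map<std::string, std::string>;

// Hypothetical base type; not the real WriterOptions.
struct BaseOptions {
  bool adjustTimestampToTimezone{false};

  virtual ~BaseOptions() = default;

  // Each class tags its output with its own name so that the reader can
  // pick the matching factory.
  virtual Serialized serialize() const {
    Serialized out;
    out["name"] = "BaseOptions";
    writeBaseFields(out);
    return out;
  }

  static std::unique_ptr<BaseOptions> create(const Serialized& in) {
    auto options = std::make_unique<BaseOptions>();
    options->readBaseFields(in);
    return options;
  }

 protected:
  void writeBaseFields(Serialized& out) const {
    out["adjustTimestampToTimezone"] =
        adjustTimestampToTimezone ? "true" : "false";
  }

  void readBaseFields(const Serialized& in) {
    adjustTimestampToTimezone = in.at("adjustTimestampToTimezone") == "true";
  }
};

// Hypothetical subclass with one extra field.
struct DerivedOptions : BaseOptions {
  int rowGroupSize{0};

  Serialized serialize() const override {
    Serialized out;
    out["name"] = "DerivedOptions";
    writeBaseFields(out);
    out["rowGroupSize"] = std::to_string(rowGroupSize);
    return out;
  }

  static std::unique_ptr<BaseOptions> create(const Serialized& in) {
    auto options = std::make_unique<DerivedOptions>();
    options->readBaseFields(in);
    options->rowGroupSize = std::stoi(in.at("rowGroupSize"));
    return options;
  }
};

// Looks up the factory by the "name" tag, so the derived type and its
// extra state survive the round trip.
std::unique_ptr<BaseOptions> deserialize(const Serialized& in) {
  using Factory = std::unique_ptr<BaseOptions> (*)(const Serialized&);
  static const std::map<std::string, Factory> registry = {
      {"BaseOptions", &BaseOptions::create},
      {"DerivedOptions", &DerivedOptions::create},
  };
  const auto it = registry.find(in.at("name"));
  if (it == registry.end()) {
    throw std::invalid_argument("Unknown options type: " + in.at("name"));
  }
  return it->second(in);
}
```

Serializing through a `const BaseOptions&` that refers to a `DerivedOptions` still writes `rowGroupSize`, because `serialize()` is virtual.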

@yingsu00 yingsu00 marked this pull request as ready for review September 17, 2025 07:42
@yingsu00
Contributor Author

@mbasmanova Thank you very much for reviewing this PR. I'll address your comments tomorrow.

Collaborator

@PingLiuPing PingLiuPing left a comment

Do we need to serialize the subclasses too, such as velox::parquet::WriterOptions and velox::dwrf::WriterOptions?

@stale

stale bot commented Jan 1, 2026

This pull request has been automatically marked as stale because it has not had recent activity. If you'd still like this PR merged, please comment on the PR, make sure you've addressed reviewer comments, and rebase on the latest main. Thank you for your contributions!

@stale stale bot added the stale label Jan 1, 2026
@stale stale bot closed this Jan 15, 2026