feat: Support iceberg partition transform #15509
PingLiuPing wants to merge 1 commit into facebookincubator:main from
Conversation
@mbasmanova As all the preliminary work has been merged, this PR integrates it together; with this PR we support writing partitioned Iceberg tables.
    auto commitDataJson = folly::toJson(commitData);
    commitTasks.push_back(commitDataJson);
  }
  return commitTasks;
}

void IcebergDataSink::appendData(RowVectorPtr input) {
What's the difference between this method's implementation in Iceberg and Hive?
Thanks for the comment.
There are two differences:
- Hive has a minor optimization when writing a partitioned table: when the input data has only one partition, it directly calls write and returns. Iceberg, however, needs to save the partition values in splitInputRowsAndEnsureWriters for the commit message (passed back to the upstream engine).
- Iceberg does not need to check isBucketed.
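A minimal, standalone sketch of that splitting step follows. The names (splitRows, SplitResult) are hypothetical; the real implementation operates on Velox vectors and HiveWriterIds, and is only summarized here.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <unordered_map>
#include <vector>

// Hypothetical sketch: split input rows by partition id and, unlike Hive,
// also remember each partition's values so they can be passed back to the
// upstream engine in the commit message.
struct SplitResult {
  // Row indices destined for each partition's writer.
  std::unordered_map<int, std::vector<size_t>> rowsPerPartition;
  // Partition values recorded once per partition, for the commit message.
  std::unordered_map<int, std::string> partitionValues;
};

SplitResult splitRows(
    const std::vector<int>& partitionIds,
    const std::vector<std::string>& values) {
  SplitResult result;
  for (size_t row = 0; row < partitionIds.size(); ++row) {
    const int id = partitionIds[row];
    result.rowsPerPartition[id].push_back(row);
    // emplace records the value only the first time a partition is seen;
    // Iceberg does this even when the input contains a single partition.
    result.partitionValues.emplace(id, values[row]);
  }
  return result;
}
```

The single-partition shortcut Hive takes would skip the partitionValues bookkeeping entirely, which is why Iceberg cannot reuse it unchanged.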
  }
}

HiveWriterId IcebergDataSink::getIcebergWriterId(size_t row) const {
What is the difference between Iceberg writer ID and Hive writer ID?
Thanks.
There is no difference between the Iceberg writer ID and the Hive writer ID; for Iceberg we can skip checking isBucketed.
for iceberg we can skip checking isBucketed.
Is this because Iceberg doesn't support bucketing? Do we need a separate getIcebergWriterId method or can this logic be unified?
Yes, Iceberg does not support bucketing; the bucket transform is a different concept.
Do we need a separate getIcebergWriterId method or can this logic be unified?
Yes, we just need to make some changes to HiveDataSink to move the getWriterId implementation from the .cpp to the .h. Otherwise there is a linking error, since getWriterId is declared inline.
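The linking issue mentioned here is general C++ behavior, not Velox-specific: an inline member function must have its definition visible in every translation unit that calls it, otherwise those translation units are left with an unresolved symbol at link time. A minimal illustration with hypothetical names:

```cpp
#include <cassert>

// Header-style sketch: because getWriterId is declared inline, its body must
// live in the header (visible to every including .cpp file). If the body
// were moved to a single .cpp file, other translation units calling
// getWriterId would fail to link with an "undefined reference" error.
class WriterIdSource {
 public:
  inline int getWriterId(int row) const {
    return row % numWriters_;
  }

 private:
  int numWriters_ = 4;
};
```

This is why moving getWriterId's implementation from HiveDataSink.cpp into the header makes it reusable from IcebergDataSink.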
@@ -576,6 +576,7 @@ class HiveDataSink : public DataSink {
      uint32_t bucketCount,
      std::unique_ptr<core::PartitionFunction> bucketFunction,
      const std::vector<column_index_t>& partitionChannels,
      const std::vector<column_index_t>& dataChannels,
Would you add a @param comment for this new argument?
  for (column_index_t i = 0; i < childrenSize; i++) {
    if (std::find(partitionChannels.cbegin(), partitionChannels.cend(), i) ==
        partitionChannels.cend()) {
  for (auto i = 0; i < insertTableHandle->inputColumns().size(); i++) {
nit: add a temp variable for insertTableHandle->inputColumns() for readability
@@ -385,7 +382,15 @@ HiveDataSink::HiveDataSink(
              inputType)
          : nullptr,
      getPartitionChannels(insertTableHandle),
      nullptr) {}
      getNonPartitionChannels(insertTableHandle),
      !getPartitionChannels(insertTableHandle).empty()
It seems sub-optimal to call getPartitionChannels(insertTableHandle) twice and create a vector just to check if it is empty
Thanks, I refactored the code and moved getPartitionChannels to HiveInsertTableHandle
      !getPartitionChannels(insertTableHandle).empty()
          ? std::make_unique<PartitionIdGenerator>(
                inputType,
                getPartitionChannels(insertTableHandle),
this is a third call to getPartitionChannels
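The refactor described above (moving getPartitionChannels into HiveInsertTableHandle) amounts to computing the channel list once and handing out a const reference. A plain-C++ sketch under that assumption, with simplified names (the real class carries many more members):

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical sketch: compute the partition channels once at construction
// and expose a const accessor, instead of rebuilding the vector on every
// call just to check whether it is empty.
class InsertTableHandle {
 public:
  explicit InsertTableHandle(std::vector<uint32_t> partitionChannels)
      : partitionChannels_{std::move(partitionChannels)} {}

  // Returns a reference to the cached vector; no per-call allocation.
  const std::vector<uint32_t>& partitionChannels() const {
    return partitionChannels_;
  }

  bool isPartitioned() const {
    return !partitionChannels_.empty();
  }

 private:
  const std::vector<uint32_t> partitionChannels_;
};
```

Callers can then write handle.isPartitioned() instead of materializing a fresh vector three times in the constructor's initializer list.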
  // message sent to Presto. Indexed by writer index. Each entry contains the
  // partition values (as folly::dynamic) for that writer's partition, which
  // are serialized to JSON as "partitionDataJson" in the commit protocol.
  // These values are distinct from the transformed partition values in
These values are distinct
Are you saying that the data is the same, but it is represented as JSON here and as Velox Vectors in partitionIdGenerator_? Perhaps, clarify.
Yes, the data is the same but in a different format: columnar vs. row, stored in folly::dynamic so it is easy to serialize to JSON. I will refine the comment.
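A standalone illustration of the columnar-to-row conversion being discussed. Plain std::string stands in for folly::dynamic, and toRowWise is a hypothetical name; it only shows the shape of the transformation, not the actual Velox code.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical sketch: the same partition data held column-wise (one vector
// per partition channel, as in partitionIdGenerator_) is converted to one
// row-wise record per writer, ready to serialize into the commit message.
std::vector<std::vector<std::string>> toRowWise(
    const std::vector<std::vector<std::string>>& columns) {
  const size_t numRows = columns.empty() ? 0 : columns[0].size();
  std::vector<std::vector<std::string>> rows(numRows);
  for (size_t r = 0; r < numRows; ++r) {
    for (const auto& column : columns) {
      // Each row record gathers the r-th value from every column.
      rows[r].push_back(column[r]);
    }
  }
  return rows;
}
```

The content is identical either way; only the layout (columnar vectors vs. per-writer row records) differs, which is what the refined comment should say.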
  return options;
}

std::vector<folly::dynamic> IcebergDataSink::makeCommitPartitionValue(
Can we return folly::array instead of std::vector?
Yes, we can just return folly::dynamic.
  std::vector<folly::dynamic> partitionValues(partitionChannels_.size());
  const auto& transformedValues = partitionIdGenerator_->partitionValues();
  for (auto i = 0; i < partitionChannels_.size(); ++i) {
    const auto& block = transformedValues->childAt(i);
block is a Presto term; do not use in Velox
    if (block->isNullAt(writerIndex)) {
      partitionValues[i] = nullptr;
    } else {
      DecodedVector decoded(*block);
No need to decode the whole vector to extract a single value. Just use SimpleVector<T>::valueAt
#ifdef VELOX_ENABLE_PARQUET
  auto parquetOptions =
      std::dynamic_pointer_cast<parquet::WriterOptions>(options);
  VELOX_CHECK_NOT_NULL(parquetOptions);
  parquetOptions->parquetWriteTimestampTimeZone = std::nullopt;
  parquetOptions->parquetWriteTimestampUnit = TimestampPrecision::kMicroseconds;
#endif
Why is this difference? Is this intentional?
Would it be possible to update PR description to list all differences between Hive and Iceberg?
Yes, this is intentional.
See https://iceberg.apache.org/spec/#parquet; there are requirements when writing Parquet files. For timestamps, the precision is microseconds (the Velox default is nanoseconds), and values should not be converted to the UTC timezone.
Thank you for clarifying. Just to make sure, this is an Iceberg-specific requirement that applies to Parquet only. Is this so? Would be nice to add the link above as a comment in the code.
Thanks. Yes, this is an Iceberg-specific requirement.
Sure, will add a comment here.
  updatePartitionRows(index, numRows, row);

  if (commitPartitionValue_[index].empty()) {
Would it make sense to move this logic into ensureWriter? We can then have shared splitInputRowsAndEnsureWriters logic and custom versions of ensureWriter.
Thanks.
Yes, it makes sense. This way we can remove splitInputRowsAndEnsureWriters and appendData from IcebergDataSink.
Force-pushed from b099733 to 84d45ef.
velox/connectors/hive/HiveDataSink.h
Outdated
@@ -343,13 +355,35 @@ class HiveInsertTableHandle : public ConnectorInsertTableHandle {
  const std::shared_ptr<const LocationHandle> locationHandle_;

 private:
  std::vector<column_index_t> computePartitionChannels() const {
Let's move ctor implementation to .cpp and replace computeXxx methods with free functions.
I also moved the constructor from .h to .cpp.
velox/connectors/hive/HiveDataSink.h
Outdated
@@ -671,7 +708,19 @@ class HiveDataSink : public DataSink {

  // Get the HiveWriter corresponding to the row
  // from partitionIds and bucketIds.
  FOLLY_ALWAYS_INLINE HiveWriterId getWriterId(size_t row) const;
  FOLLY_ALWAYS_INLINE HiveWriterId getWriterId(size_t row) const {
Let's remove FOLLY_ALWAYS_INLINE and move the implementation to .cpp file.
  }
  auto fileFormat = dwio::common::toString(insertTableHandle->storageFormat());
  return {
      fmt::format("{}.{}", targetFileName, fileFormat),
nit: can we avoid formatting the same string twice?
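One way to address the nit, sketched with plain std::string concatenation instead of fmt::format; makeFileNames is a hypothetical helper name, not the actual Velox function:

```cpp
#include <cassert>
#include <string>
#include <utility>

// Hypothetical sketch: build "name.format" once and reuse it for both
// returned file names, instead of formatting the same string twice.
std::pair<std::string, std::string> makeFileNames(
    const std::string& targetFileName,
    const std::string& fileFormat) {
  std::string name = targetFileName + "." + fileFormat;
  // Target file name and write file name are identical here, so a single
  // formatted string serves both slots of the returned pair.
  return {name, name};
}
```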
}

folly::dynamic IcebergFileNameGenerator::serialize() const {
  VELOX_UNREACHABLE();
Don't we need to implement this?
I found that the serialization and de-serialization logic are only used in tests. See HiveConnector::registerSerDe().
Hmm... if so, why do we need it at all? Would you look into who introduced this API/logic, and let's see if we can just delete it.
@mbasmanova Thanks.
I found that SerDe is necessary as part of the test infrastructure, but I still cannot find it being used in production code.
I searched for registerSerDe in the Prestissimo code and found it is also only used in test code.
In Velox, SerDe is called from the top down and is used for a few purposes:
- In the fuzzer, for remote query execution: velox/velox/exec/fuzzer/VeloxQueryRunner.cpp, lines 202 to 214 in 7d2979d.
- Query tracing and replay. Tracing: velox/velox/exec/TaskTraceWriter.cpp, lines 36 to 65 in 7d2979d. Replaying: velox/velox/exec/TaskTraceReader.cpp, lines 26 to 37 in 7d2979d. And TraceReplayRunner: velox/velox/tool/trace/TraceReplayRunner.cpp, lines 255 to 312 in 7d2979d.
I think we need to add a SerDe implementation.
Would it make sense to add SerDe in a separate PR? It should be added to a few classes.
Would it make sense to add SerDe in a separate PR?
Sounds good. Let's open an issue.
  VELOX_USER_CHECK_EQ(
      tableStorageFormat,
      dwio::common::FileFormat::PARQUET,
      "Only Parquet file format is supported when writing Iceberg tables. Format: {}",
No need to include format in the message. It is included automatically. Please, trigger this error and check the error message.
Thanks, the format in the message is redundant.
C++ exception with description "Exception: VeloxUserError
Error Source: USER
Error Code: INVALID_ARGUMENT
Reason: (dwrf vs. parquet) Only Parquet file format is supported when writing Iceberg tables. Format: dwrf
@@ -239,6 +344,9 @@ std::vector<std::string> IcebergDataSink::commitMessage() const {
      ("fileFormat", "PARQUET")
      ("content", "DATA");
  // clang-format on
  if (!(commitPartitionValue_.empty() || commitPartitionValue_[i].isNull())) {
    commitData["partitionDataJson"] = toJsonString(commitPartitionValue_[i]);
toJsonString function is used only once and is very short; consider removing it and writing out logic here; this would make it easier to understand the overall format of commit data
  for (const auto& file : files) {
    std::vector<std::string> pathComponents;
    folly::split("/", file, pathComponents);
This logic is repeated. Would it be possible to extract a helper function that takes a path and returns a map of partition keys? Then reuse it in multiple places?
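A possible shape for the suggested helper, using only the standard library; parsePartitionKeys is a hypothetical name, and the real tests use folly::split rather than std::getline:

```cpp
#include <cassert>
#include <map>
#include <sstream>
#include <string>

// Hypothetical helper: parse the "key=value" components of a partition path
// (e.g. "c1=7/c2=abc/data.parquet") into a map of partition keys. Components
// without '=' (such as the file name) are skipped.
std::map<std::string, std::string> parsePartitionKeys(const std::string& path) {
  std::map<std::string, std::string> keys;
  std::istringstream stream(path);
  std::string component;
  while (std::getline(stream, component, '/')) {
    const auto eq = component.find('=');
    if (eq != std::string::npos) {
      keys[component.substr(0, eq)] = component.substr(eq + 1);
    }
  }
  return keys;
}
```

With a helper like this, each test can compare the returned map against expected partition keys instead of re-splitting paths inline.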
    std::vector<std::string> parts;
    folly::split('=', component, parts);
    ASSERT_EQ(parts.size(), 2);
    ASSERT_EQ(parts[0], rowType->nameOf(colIndex));
The verification logic might be simpler if we just hard-code expected path for each test case.
  auto rowType = ROW(
      {"c1", "c2", "c3", "c4", "c5", "c6"},
      {BIGINT(), INTEGER(), SMALLINT(), DECIMAL(18, 5), BOOLEAN(), VARCHAR()});
  for (auto colIndex = 0; colIndex < rowType->size(); colIndex++) {
It looks like this test has 6 test cases; each test case tests a single column. If so, it would be clearer to write it that way by creating a testCase struct that contains a column name, type, expected path, etc., then looping over these.
  const auto commitTasks = dataSink->close();
  auto splits = createSplitsForDirectory(outputDirectory->getPath());

  ASSERT_GT(commitTasks.size(), 0);
Can we assert specific number of tasks? Is this non-deterministic? Why?
Thanks. Yes, it is non-deterministic, since the number of tasks depends on the number of writers, which depends on the number of partitions, which in turn depends on the data; the data is random data generated by a fuzzer.
Let's use a seed to make the fuzzer-generated data deterministic.
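The determinism a fixed seed buys can be illustrated generically. This is not the Velox fuzzer API, just a standard seeded RNG; the same principle lets the test assert an exact number of commit tasks:

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <random>
#include <vector>

// Generic illustration: with a fixed seed, "random" test data is fully
// reproducible, so anything derived from it (partitions, writers, commit
// tasks) becomes deterministic and can be asserted exactly.
std::vector<uint32_t> makeRandomData(uint32_t seed, size_t n) {
  std::mt19937 rng(seed);
  std::vector<uint32_t> data(n);
  for (auto& v : data) {
    v = rng();
  }
  return data;
}
```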
    RowTypePtr rowType) {
  std::vector<RowVectorPtr> batches;
  static const std::vector<Timestamp> timestamps = {
      Timestamp(0, 0), // 1970-01-01 00:00:00
Would you add a comment to explain how you chose these values?
Integrate Iceberg partition transform functionality with IcebergDataSink.
Complete support for writing partitioned Iceberg tables, including identity,
bucket, truncate, and temporal transforms.
Add end-to-end tests covering various partition transform scenarios.
Differences between Hive and Iceberg when writing Parquet data files:
- Hive uses the session timezone from connectorQueryCtx_->sessionTimezone() and respects the adjustTimestampToTimezone setting; Iceberg ignores the session timezone and is timezone-agnostic.
- Hive uses a configurable timestamp precision; Iceberg uses microseconds for Iceberg spec compliance.
- Hive supports bucketing and uses special file naming conventions (in the non-bucketed case, Hive tracks which task/driver wrote which file); Iceberg uses simple UUID-based names.