Closed
20 commits
b108580
Update clp dependency to pull in support for new Timestamp column.
gibber9809 Feb 18, 2026
1671230
Add utility for converting nanosecond epoch timestamps to velox times…
gibber9809 Jan 29, 2026
4473c26
Remove usage of deprecated TimestampPattern::init() method
gibber9809 Jan 29, 2026
46dedae
Handle timestamp literal precision special cases for IR and archive s…
gibber9809 Jan 29, 2026
31a5253
Handle rename of DateString -> DeprecatedDateString.
gibber9809 Jan 29, 2026
d81779b
fix: bug fixes for e2e kql query with milisecond epoch against timest…
20001020ycx Feb 2, 2026
45d166a
Update existing tests to use timestamp literal in pushdown.
gibber9809 Feb 18, 2026
57ee741
Update out of date query.
gibber9809 Feb 18, 2026
fd51806
Add timestamp filtering and marshalling test for v0.5.0 archives.
gibber9809 Feb 18, 2026
aaf5d96
Lint fix
gibber9809 Feb 18, 2026
460a88f
Address comment about test name.
gibber9809 Feb 18, 2026
c78f992
Update log-surgeon dependency to match version currently used in clp
gibber9809 Feb 18, 2026
91ecd2f
Fix incorrect commit hash
gibber9809 Feb 18, 2026
19c7b17
Attempt to use fmt 11.2.0 to satisfy log-surgeon dependency.
gibber9809 Feb 18, 2026
aa083a4
Update incorrect checksum.
gibber9809 Feb 18, 2026
59af129
Attempt to get github to pick up previous commit.
gibber9809 Feb 18, 2026
845a75e
Attempt to force velox to build the correct version of fmt.
gibber9809 Feb 18, 2026
3b7153f
Resolve ystdlib before log surgeon.
gibber9809 Feb 18, 2026
1244c47
Resolve ystdlib before log surgeon.
gibber9809 Feb 18, 2026
646d7b3
Set spdlog version to 1.15.3
gibber9809 Feb 18, 2026
2 changes: 1 addition & 1 deletion CMake/resolve_dependency_modules/clp.cmake
Expand Up @@ -16,7 +16,7 @@ include_guard(GLOBAL)
FetchContent_Declare(
clp
GIT_REPOSITORY https://github.com/y-scope/clp.git
GIT_TAG v0.8.0)
GIT_TAG f82e6114160a6addd4727259906bcf621ac9912c)
🧹 Nitpick | 🔵 Trivial

🧩 Analysis chain

🏁 Script executed:

# Verify the commit hash exists in the clp repository
git ls-remote https://github.com/y-scope/clp.git | grep f82e6114160a6addd4727259906bcf621ac9912c

Repository: y-scope/velox

Length of output: 159


🏁 Script executed:

# Read the clp.cmake file to see full context
cat -n CMake/resolve_dependency_modules/clp.cmake

Repository: y-scope/velox

Length of output: 1729


🏁 Script executed:

# Search for the APIs mentioned in the review comment in the codebase
rg "DeprecatedDateString|SetTimestampLiteralPrecision|TimestampLiteral|SingleFileArchiveDefs|cNewTimestampFormatVersion" --type cpp --type h -l

Repository: y-scope/velox

Length of output: 235


🏁 Script executed:

# Search for the specific APIs mentioned in the review comment
rg "DeprecatedDateString|SetTimestampLiteralPrecision|TimestampLiteral|SingleFileArchiveDefs|cNewTimestampFormatVersion" -A 2 -B 2

Repository: y-scope/velox

Length of output: 5337


🏁 Script executed:

# Also check if there's any documentation or comments about why this specific commit is needed
rg -i "commit|pin|tag|clp" CMake/resolve_dependency_modules/ -A 2 -B 2

Repository: y-scope/velox

Length of output: 17902


Add a comment documenting when this CLP commit pin can be replaced with a tagged release.

The CLP dependency is correctly pinned to commit f82e6114160a6addd4727259906bcf621ac9912c because the required APIs (SetTimestampLiteralPrecision, TimestampLiteral, SingleFileArchiveDefs, cNewTimestampFormatVersion, DeprecatedDateString) are used throughout the codebase but not yet available in a tagged release. To improve maintainability and ensure this pin isn't forgotten, add a comment in the CMake file noting the target CLP release version for this upgrade.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@CMake/resolve_dependency_modules/clp.cmake` at line 19, Add a comment next to
the GIT_TAG pin for commit f82e6114160a6addd4727259906bcf621ac9912c explaining
this temporary pin and when it can be replaced: note that the codebase relies on
CLP APIs SetTimestampLiteralPrecision, TimestampLiteral, SingleFileArchiveDefs,
cNewTimestampFormatVersion, and DeprecatedDateString which are not yet in a
tagged release, and indicate the target CLP release version (or ">= vX.Y.Z"
placeholder) to switch to once those APIs are available; place the comment
immediately above or beside the GIT_TAG line so future maintainers can easily
find and update it.


set(CLP_BUILD_CLP_REGEX_UTILS
OFF
Expand Down
4 changes: 2 additions & 2 deletions CMake/resolve_dependency_modules/fmt.cmake
Expand Up @@ -13,9 +13,9 @@
# limitations under the License.
include_guard(GLOBAL)

set(VELOX_FMT_VERSION 10.1.1)
set(VELOX_FMT_VERSION 11.2.0)
set(VELOX_FMT_BUILD_SHA256_CHECKSUM
78b8c0a72b1c35e4443a7e308df52498252d1cefc2b08c9a97bc9ee6cfe61f8b)
bc23066d87ab3168f27cef3e97d545fa63314f5c79df5ea444d41d56f962c6af)
set(VELOX_FMT_SOURCE_URL
"https://github.com/fmtlib/fmt/archive/${VELOX_FMT_VERSION}.tar.gz")

Expand Down
2 changes: 1 addition & 1 deletion CMake/resolve_dependency_modules/log_surgeon.cmake
Expand Up @@ -16,7 +16,7 @@ include_guard(GLOBAL)
FetchContent_Declare(
log_surgeon
GIT_REPOSITORY https://github.com/y-scope/log-surgeon.git
GIT_TAG 85d4f2c09c0e55f1fb87cdc8b0f4d13fb1a733e1
GIT_TAG 193e1f91eb137bb935a7f44b13cc8dd945a8d742
OVERRIDE_FIND_PACKAGE)

set(log_surgeon_BUILD_TESTING OFF)
Expand Down
4 changes: 2 additions & 2 deletions CMake/resolve_dependency_modules/spdlog.cmake
Expand Up @@ -13,9 +13,9 @@
# limitations under the License.
include_guard(GLOBAL)

set(VELOX_SPDLOG_BUILD_VERSION 1.12.0)
set(VELOX_SPDLOG_BUILD_VERSION 1.15.3)
set(VELOX_SPDLOG_BUILD_SHA256_CHECKSUM
4dccf2d10f410c1e2feaff89966bfc49a1abb29ef6f08246335b110e001e09a9)
15a04e69c222eb6c01094b5c7ff8a249b36bb22788d72519646fb85feb267e67)
set(VELOX_SPDLOG_SOURCE_URL
"https://github.com/gabime/spdlog/archive/refs/tags/v${VELOX_SPDLOG_BUILD_VERSION}.tar.gz"
)
Expand Down
10 changes: 5 additions & 5 deletions CMakeLists.txt
Expand Up @@ -494,8 +494,8 @@ else()
endif()
velox_resolve_dependency(glog)

velox_set_source(fmt)
velox_resolve_dependency(fmt 9.0.0)
set(fmt_SOURCE BUNDLED)
velox_resolve_dependency(fmt 11.2.0)
Comment on lines +497 to +498
⚠️ Potential issue | 🔴 Critical

🧩 Analysis chain

🏁 Script executed:

#!/bin/bash
# Check which fmt APIs are used across the codebase to assess breakage risk from 9→11
echo "=== fmt includes ==="
rg -n '#include.*fmt' --type=cpp --type=h -g '!third_party' -g '!_build' | head -60

echo ""
echo "=== Usage of deprecated fmt APIs (ostream.h, printf.h, format_int, BasicFormatter, BasicStringRef) ==="
rg -n 'fmt/ostream\.h|fmt/printf\.h|fmt::format_int|fmt::BasicFormatter|fmt::BasicStringRef' --type=cpp --type=h -g '!third_party' -g '!_build'

echo ""
echo "=== Check spdlog dependency handling in CMakeLists.txt ==="
rg -B3 -A3 'spdlog' CMakeLists.txt | head -40

Repository: y-scope/velox

Length of output: 4587


🏁 Script executed:

#!/bin/bash
# Check the context around lines 497-498 in CMakeLists.txt
echo "=== Context around fmt dependency (lines 490-510) ==="
sed -n '490,510p' CMakeLists.txt

echo ""
echo "=== Check if fmt_SOURCE is set conditionally elsewhere ==="
rg -n 'fmt_SOURCE' CMakeLists.txt

echo ""
echo "=== Check CLP connector block boundaries ==="
rg -B5 -A5 'VELOX_ENABLE_CLP_CONNECTOR' CMakeLists.txt | head -80

Repository: y-scope/velox

Length of output: 1985


🌐 Web query:

fmt library 11.2.0 breaking changes compatibility fmt 9.0.0

💡 Result:

ABI / binary compatibility (shared library)

  • Not ABI-compatible: anything linked against libfmt.so.9 (fmt 9.x) will not run with fmt 11.2.0 (it ships libfmt.so.11), so you must rebuild dependents or install a distro “compat” package. (packages.fedoraproject.org)

Source compatibility: notable breaking / behavior changes from 9.0.0 → 11.2.0

fmt 9.0.0 (big one if you relied on ostream fallback)

  • When fmt/ostream.h is included, automatic operator<< discovery is disabled (to avoid ODR violations). Old behavior required FMT_DEPRECATED_OSTREAM (deprecated) and you should migrate to fmt::streamed or implement formatter<T> : fmt::ostream_formatter. (github.com)

fmt 10.0.0

fmt 11.0.0

  • Introduced fmt/base.h and changed some internals/perf characteristics; most user code just keeps working, but it’s a new header split (useful if you want minimal includes). (raw.githubusercontent.com)
  • fmt::format_to when writing into a fixed array no longer overflows; output can be truncated (behavior change). (raw.githubusercontent.com)

fmt 11.2.0



Critical: fmt version bump to 11.2.0 will break code using fmt/ostream.h — must update migration pattern before merging.

The codebase includes fmt/ostream.h in velox/common/base/Exceptions.h and velox/common/base/Status.h, which are core files. This change raises the minimum fmt version from 9.0.0 to 11.2.0 (the bundled version moves from 10.1.1 to 11.2.0), and ostream formatting changed across those releases:

  • fmt 9.0.0 disabled automatic operator<< discovery when fmt/ostream.h is included
  • fmt 10.0.0 removed the FMT_DEPRECATED_OSTREAM macro entirely
  • Code relying on the old ostream fallback pattern will not compile or function correctly with fmt 11.2.0

The change is unconditional (not restricted to the CLP connector block), so it affects all build configurations. Before merging, either:

  1. Migrate the ostream-dependent code in Exceptions.h and Status.h to use fmt::streamed() or implement formatter<T> : fmt::ostream_formatter, or
  2. Confirm this version bump is intended and all necessary code migration has been completed.

Additionally, fmt 11.2.0 requires binary rebuild of all dependents (incompatible with libfmt.so.9).

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@CMakeLists.txt` around lines 497 - 498, You bumped fmt to 11.2.0
unconditionally which breaks code that includes fmt/ostream.h (used in
velox/common/base/Exceptions.h and velox/common/base/Status.h); either revert or
gate the version bump and migrate the ostream-dependent formatting: in
Exceptions.h and Status.h replace uses of ostream-based fallbacks with
fmt::streamed(...) where streaming semantics are needed or provide explicit
fmt::formatter<T> specializations (or implement formatter<T> :
fmt::ostream_formatter for types relying on operator<<), and ensure the CMake
change (set(fmt_SOURCE BUNDLED) / velox_resolve_dependency(fmt ...)) is only
applied after all consumers are rebuilt/binary-compatible or the migration is
complete.


if(VELOX_ENABLE_COMPRESSION_LZ4)
find_package(lz4 REQUIRED)
Expand Down Expand Up @@ -541,12 +541,12 @@ if(${VELOX_ENABLE_CLP_CONNECTOR})
velox_resolve_dependency(spdlog)

# Dependencies that depend on other dependencies
velox_set_source(log_surgeon)
velox_resolve_dependency(log_surgeon)

velox_set_source(ystdlib)
velox_resolve_dependency(ystdlib)

velox_set_source(log_surgeon)
velox_resolve_dependency(log_surgeon)

set(clp_SOURCE BUNDLED)
velox_resolve_dependency(clp)
endif()
Expand Down
10 changes: 2 additions & 8 deletions velox/connectors/clp/ClpConnector.cpp
Expand Up @@ -14,8 +14,6 @@
* limitations under the License.
*/

#include "clp_s/TimestampPattern.hpp"

#include "velox/connectors/clp/ClpConnector.h"
#include "velox/connectors/clp/ClpDataSource.h"

Expand Down Expand Up @@ -50,13 +48,9 @@ std::unique_ptr<DataSink> ClpConnector::createDataSink(
}

ClpConnectorFactory::ClpConnectorFactory()
: ConnectorFactory(kClpConnectorName) {
clp_s::TimestampPattern::init();
}
: ConnectorFactory(kClpConnectorName) {}

ClpConnectorFactory::ClpConnectorFactory(const char* connectorName)
: ConnectorFactory(connectorName) {
clp_s::TimestampPattern::init();
}
: ConnectorFactory(connectorName) {}

} // namespace facebook::velox::connector::clp
16 changes: 16 additions & 0 deletions velox/connectors/clp/search_lib/ClpTimestampsUtils.h
Expand Up @@ -16,6 +16,7 @@

#pragma once

#include "clp_s/Defs.hpp"
#include "velox/type/Timestamp.h"

namespace facebook::velox::connector::clp::search_lib {
Expand Down Expand Up @@ -121,4 +122,19 @@ inline auto convertToVeloxTimestamp(int64_t timestamp) -> Timestamp {
return Timestamp(seconds, static_cast<uint64_t>(nanoseconds));
}

/// Converts a nanosecond precision epochtime_t into a Velox timestamp.
///
/// @param timestamp the input timestamp as an integer
/// @return the corresponding Velox timestamp
inline auto convertNanosecondEpochToVeloxTimestamp(clp_s::epochtime_t timestamp)

Is this called anywhere? It looks like the handler I am looking for, but it probably needs to be called from the VectorLoader, if I understand correctly.

-> Timestamp {
int64_t seconds{timestamp / Timestamp::kNanosInSecond};
int64_t nanoseconds{timestamp % Timestamp::kNanosInSecond};
if (nanoseconds < 0) {
seconds -= 1;
nanoseconds += Timestamp::kNanosInSecond;
}
return Timestamp(seconds, static_cast<uint64_t>(nanoseconds));
}

} // namespace facebook::velox::connector::clp::search_lib
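The negative-remainder adjustment in convertNanosecondEpochToVeloxTimestamp can be checked in isolation. The sketch below is a standalone approximation, not the Velox code: `SecondsAndNanos`, `splitNanosecondEpoch`, and `kNanosInSecond` are hypothetical stand-ins for `Timestamp`, the new helper, and `Timestamp::kNanosInSecond`.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical stand-in for Timestamp::kNanosInSecond.
constexpr std::int64_t kNanosInSecond = 1'000'000'000;

// Hypothetical stand-in for the (seconds, nanos) pair a Velox Timestamp holds.
struct SecondsAndNanos {
  std::int64_t seconds;
  std::uint64_t nanos;
};

// Splits a nanosecond epoch so that the nanosecond part always lies in
// [0, kNanosInSecond). C++ integer division truncates toward zero, so a
// negative input needs the same one-second correction the diff applies.
constexpr SecondsAndNanos splitNanosecondEpoch(std::int64_t timestampNs) {
  std::int64_t seconds = timestampNs / kNanosInSecond;
  std::int64_t nanoseconds = timestampNs % kNanosInSecond;
  if (nanoseconds < 0) {
    seconds -= 1;
    nanoseconds += kNanosInSecond;
  }
  return SecondsAndNanos{seconds, static_cast<std::uint64_t>(nanoseconds)};
}

// Small self-check covering a positive value, a value just before the epoch,
// and a value exactly one second before the epoch.
inline void checkSplitExamples() {
  assert(splitNanosecondEpoch(1'500'000'000).seconds == 1);
  assert(splitNanosecondEpoch(1'500'000'000).nanos == 500'000'000u);
  assert(splitNanosecondEpoch(-1).seconds == -1);
  assert(splitNanosecondEpoch(-1).nanos == 999'999'999u);
  assert(splitNanosecondEpoch(-1'000'000'000).seconds == -1);
  assert(splitNanosecondEpoch(-1'000'000'000).nanos == 0u);
}
```

Without the correction, an input of -1 would yield zero seconds and a negative nanosecond remainder, which cannot be represented in an unsigned nanosecond field.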
14 changes: 13 additions & 1 deletion velox/connectors/clp/search_lib/archive/ClpArchiveCursor.cpp
Expand Up @@ -17,9 +17,12 @@
#include <glog/logging.h>

#include "clp_s/ArchiveReader.hpp"
#include "clp_s/SingleFileArchiveDefs.hpp"
#include "clp_s/search/EvaluateTimestampIndex.hpp"
#include "clp_s/search/ast/EmptyExpr.hpp"
#include "clp_s/search/ast/SearchUtils.hpp"
#include "clp_s/search/ast/SetTimestampLiteralPrecision.hpp"
#include "clp_s/search/ast/TimestampLiteral.hpp"
#include "velox/connectors/clp/ClpColumnHandle.h"
#include "velox/connectors/clp/search_lib/archive/ClpArchiveCursor.h"
#include "velox/connectors/clp/search_lib/archive/ClpArchiveJsonStringVectorLoader.h"
Expand Down Expand Up @@ -128,14 +131,23 @@ ErrorCode ClpArchiveCursor::loadSplit() {
archiveReader_->open(
get_path_object_for_raw_path(splitPath_), networkAuthOption);
} catch (std::exception& e) {
VLOG(2) << "Failed to open archive file: " << e.what();
VLOG(2) << "Failed to open archive file: " << splitPath_ << ": "
<< e.what();
return ErrorCode::InternalError;
}

auto timestampDict = archiveReader_->get_timestamp_dictionary();
auto schemaTree = archiveReader_->get_schema_tree();
auto schemaMap = archiveReader_->get_schema_map();

auto const defaultTimestampPrecision{
archiveReader_->get_header().version < clp_s::cNewTimestampFormatVersion

Suggested change
archiveReader_->get_header().version < clp_s::cNewTimestampFormatVersion
archiveReader_->get_header().version < clp_s::cDeprecatedDateStringFormatVersionMarker

? TimestampLiteral::Precision::Milliseconds
: TimestampLiteral::Precision::Nanoseconds};
SetTimestampLiteralPrecision timestampPrecisionPass{
defaultTimestampPrecision};
expr_ = timestampPrecisionPass.run(expr_);

EvaluateTimestampIndex timestampIndex(timestampDict);
if (clp_s::EvaluatedValue::False == timestampIndex.run(expr_)) {
VLOG(2) << "No matching timestamp ranges for query '" << query_ << "'";
Expand Down
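The version check in loadSplit() picks a default precision for timestamp literals from the archive header. A minimal standalone sketch of that selection follows. It assumes a hypothetical version threshold, because the real value of clp_s::cNewTimestampFormatVersion is not shown in this diff. The rescaling helper is likewise only an illustration of why the chosen unit matters, not a description of what SetTimestampLiteralPrecision does internally.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical mirror of TimestampLiteral::Precision, limited to the two
// values that appear in the diff.
enum class Precision { Milliseconds, Nanoseconds };

// Hypothetical placeholder: the real clp_s::cNewTimestampFormatVersion value
// and its encoding are not visible in this PR.
constexpr std::uint32_t kHypotheticalNewTimestampFormatVersion = 100;

// Archives older than the threshold default to millisecond literals; newer
// ones default to nanosecond literals, matching the ternary in loadSplit().
constexpr Precision defaultLiteralPrecision(std::uint32_t archiveVersion) {
  return archiveVersion < kHypotheticalNewTimestampFormatVersion
      ? Precision::Milliseconds
      : Precision::Nanoseconds;
}

// Illustration only: expresses a millisecond literal in the unit implied by
// the chosen precision, as a plain unit conversion.
constexpr std::int64_t scaleMillisLiteral(
    std::int64_t literalMs,
    Precision precision) {
  return precision == Precision::Nanoseconds ? literalMs * 1'000'000
                                             : literalMs;
}

inline void checkPrecisionExamples() {
  assert(defaultLiteralPrecision(99) == Precision::Milliseconds);
  assert(defaultLiteralPrecision(100) == Precision::Nanoseconds);
  assert(
      scaleMillisLiteral(1'746'003'070'000, Precision::Nanoseconds) ==
      1'746'003'070'000'000'000);
}
```

The same millisecond literal differs by a factor of one million between the two units, so comparing it against stored values in the wrong unit would silently match everything or nothing.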
22 changes: 16 additions & 6 deletions velox/connectors/clp/search_lib/archive/ClpArchiveVectorLoader.cpp
Expand Up @@ -72,7 +72,7 @@ void ClpArchiveVectorLoader::populateTimestampData(
case clp_s::NodeType::FormattedFloat:
case clp_s::NodeType::DictionaryFloat:
case clp_s::NodeType::Integer:
case clp_s::NodeType::DateString:
case clp_s::NodeType::DeprecatedDateString:
supportedNodeType = true;
break;
default:
Expand Down Expand Up @@ -115,7 +115,8 @@ void ClpArchiveVectorLoader::populateTimestampData(
convertToVeloxTimestamp(
std::get<int64_t>(reader->extract_value(messageIndex))));
} else {
auto reader = static_cast<clp_s::DateStringColumnReader*>(columnReader_);
auto reader =
static_cast<clp_s::DeprecatedDateStringColumnReader*>(columnReader_);
vector->set(
vectorIndex,
convertToVeloxTimestamp(reader->get_encoded_time(messageIndex)));
Expand Down Expand Up @@ -214,8 +215,9 @@ void ClpArchiveVectorLoader::loadInternal(
populateTimestampData<clp_s::NodeType::Integer>(rows, timestampVector);
} else if (
nullptr !=
dynamic_cast<clp_s::DateStringColumnReader*>(columnReader_)) {
populateTimestampData<clp_s::NodeType::DateString>(
dynamic_cast<clp_s::DeprecatedDateStringColumnReader*>(
columnReader_)) {
populateTimestampData<clp_s::NodeType::DeprecatedDateString>(
rows, timestampVector);
} else if (
nullptr != dynamic_cast<clp_s::FloatColumnReader*>(columnReader_)) {
Expand Down Expand Up @@ -261,13 +263,21 @@ template void
ClpArchiveVectorLoader::populateTimestampData<clp_s::NodeType::Integer>(
RowSet rows,
FlatVector<facebook::velox::Timestamp>* vector);
template void
ClpArchiveVectorLoader::populateTimestampData<clp_s::NodeType::DateString>(
template void ClpArchiveVectorLoader::populateTimestampData<
clp_s::NodeType::DeprecatedDateString>(
RowSet rows,
FlatVector<facebook::velox::Timestamp>* vector);
template void
ClpArchiveVectorLoader::populateTimestampData<clp_s::NodeType::Unknown>(
RowSet rows,
FlatVector<facebook::velox::Timestamp>* vector);
template void

The changes make sense for DeprecatedDateString. But what about deserialization of the Timestamp type? I didn't see it added to the vector loader file.

I assume there must be some handler for Timestamp, right?

ClpArchiveVectorLoader::populateTimestampData<clp_s::NodeType::FormattedFloat>(
RowSet rows,
FlatVector<facebook::velox::Timestamp>* vector);
template void
ClpArchiveVectorLoader::populateTimestampData<clp_s::NodeType::DictionaryFloat>(
RowSet rows,
FlatVector<facebook::velox::Timestamp>* vector);

} // namespace facebook::velox::connector::clp::search_lib
6 changes: 6 additions & 0 deletions velox/connectors/clp/search_lib/ir/ClpIrCursor.cpp
Expand Up @@ -16,6 +16,8 @@

#include "clp_s/ColumnReader.hpp"
#include "clp_s/InputConfig.hpp"
#include "clp_s/search/ast/SetTimestampLiteralPrecision.hpp"
#include "clp_s/search/ast/TimestampLiteral.hpp"

#include "ffi/ir_stream/search/QueryHandler.hpp"
#include "velox/connectors/clp/ClpColumnHandle.h"
Expand Down Expand Up @@ -75,6 +77,10 @@ ErrorCode ClpIrCursor::loadSplit() {
? NetworkAuthOption{.method = AuthMethod::None}
: NetworkAuthOption{.method = AuthMethod::S3PresignedUrlV4};

search::ast::SetTimestampLiteralPrecision timestampPrecisionPass{
search::ast::TimestampLiteral::Precision::Milliseconds};
expr_ = timestampPrecisionPass.run(expr_);

auto projections = splitFieldsToNamesAndTypes();
auto queryHandlerResult{QueryHandlerType::create(
projectionResolutionCallback_,
Expand Down
70 changes: 61 additions & 9 deletions velox/connectors/clp/tests/ClpConnectorTest.cpp
Expand Up @@ -576,8 +576,8 @@ TEST_F(ClpConnectorTest, test4IrTimestampNoPushdown) {
TEST_F(ClpConnectorTest, test4IrTimestampPushdown) {
// Only the second event meet the condition, the first event is a date string
// which is not supported yet so the value will be NULL.
const std::shared_ptr<std::string> kqlQuery =
std::make_shared<std::string>("(timestamp < 1756003005000000)");
const std::shared_ptr<std::string> kqlQuery = std::make_shared<std::string>(
R"(timestamp < timestamp("1756003005000000", "\L"))");
Comment on lines +579 to +580
⚠️ Potential issue | 🟠 Major

🧩 Analysis chain

🌐 Web query:

CLP KQL timestamp format specifier "\L" milliseconds precision

💡 Result:

In CrowdStrike LogScale / Humio “KQL” time formatting, the milliseconds component is the L time modifier:

  • L = milliseconds within the second, always zero‑padded to 3 digits (000 to 999), i.e., millisecond precision only. [1]
  • If you need more digits, the specifier is N = nanoseconds, zero‑padded to 9 digits (000000000 to 999999999). [1]

Example (milliseconds shown as .SSS):

| eval ts_str = formatTime("%H:%M:%S.%L", field=@timestamp)

Note: In LogScale, the time modifiers like L are used inside formatTime() patterns as %...%L (or via format() using %tL), not as a standalone \L. [1]

Sources
[1] LogScale docs: format() / date-time modifiers (L milliseconds, N nanoseconds) (library.humio.com)



🏁 Script executed:

# First, let's locate and examine the test file mentioned
fd ClpConnectorTest.cpp

Repository: y-scope/velox

Length of output: 104


🏁 Script executed:

# Check the exact content at lines 579-580 and 659-660
if fd ClpConnectorTest.cpp; then
  echo "=== Lines 579-580 ===" && \
  sed -n '575,585p' $(fd ClpConnectorTest.cpp)
  echo -e "\n=== Lines 659-660 ===" && \
  sed -n '655,665p' $(fd ClpConnectorTest.cpp)
fi

Repository: y-scope/velox

Length of output: 1262


🌐 Web query:

CLP KQL timestamp() function format specifiers documentation

💡 Result:

In YScope/CLP, the “timestamp formatting” you’re probably looking for is the YScope Log Viewer format-string timestamp formatter, which takes a Day.js format string as its “options”:

  • Syntax: {field:timestamp[:options]} where options is a Day.js format string. Default is YYYY-MM-DDTHH:mm:ssZ. [1]
  • Example: {ts:timestamp:YYYY-MM-DD} produces 2024-11-27 [1]; a longer example appears in the format-string syntax docs. [2]
  • Day.js token reference (format specifiers): Day.js format() documentation lists tokens like YYYY, MM, DD, HH, mm, ss, SSS, Z, etc. [3]

If instead you meant CLP’s KQL-like search (e.g., clp-s), the docs/examples show time range filtering via numeric timestamps and a date("...") helper, not a timestamp() formatting function. [4]

Sources: [1] [2] [3] [4]


🌐 Web query:

"Compressed Log Processing" CLP KQL time format

💡 Result:

In CLP’s KQL (used by clp-s), time filtering is typically done on the ts field in either of these formats:

  • Unix epoch seconds (integer):
    • ts >= 1649923037 AND ts <= 1649923038 [1]
  • date("...") with an ISO-8601-like datetime string:
    • ts >= date("2022-04-14T07:57:17") AND ts <= date("2022-04-14T07:57:18") [1]

Note: CLP docs also state time zone information is not preserved, so avoid relying on timezone offsets in stored/queried timestamps. [1]

Source: CLP clp-s user guide (search examples + limitations). [1]


🏁 Script executed:

# Search for CLP-related documentation or comments in the codebase
rg -i "kql.*timestamp|format.*specifier|\\\L" velox/connectors/clp/ -A 3 -B 3 | head -100

Repository: y-scope/velox

Length of output: 2215


🏁 Script executed:

# Look for any CLP documentation or README
fd -e md -e txt -e rst velox/connectors/clp/ | xargs cat 2>/dev/null | head -200

Repository: y-scope/velox

Length of output: 446


🏁 Script executed:

# Search for references to "\L" format or timestamp parsing logic in CLP connector
rg "\\\\L|format.*timestamp|timestamp.*precision" velox/connectors/clp/ -i | head -50

Repository: y-scope/velox

Length of output: 3517


🏁 Script executed:

# Search for SetTimestampLiteralPrecision and how it handles format specifiers
rg "SetTimestampLiteralPrecision|TimestampLiteral.*Precision" velox/connectors/clp/ -B 5 -A 10

Repository: y-scope/velox

Length of output: 6629


🏁 Script executed:

# Look for handling of "\L" format specifier or how timestamp format is parsed
rg -i "format.*\\\\L|\\\\L.*format|timestamp.*format" velox/connectors/clp/ -B 3 -A 3

Repository: y-scope/velox

Length of output: 2943


🏁 Script executed:

# Search for estimatePrecision usage and implementation
rg "estimatePrecision" velox/connectors/clp/ -B 5 -A 10

Repository: y-scope/velox

Length of output: 4583


🏁 Script executed:

# Check if there's any logic that treats "\L" as a special case
rg "\\\\L|format.*specifier" velox/connectors/clp/search_lib/ -B 2 -A 2

Repository: y-scope/velox

Length of output: 39


🏁 Script executed:

# Look at the test file to see if there are any assertions or comments about expected behaviour
sed -n '575,610p' $(fd ClpConnectorTest.cpp)

Repository: y-scope/velox

Length of output: 1522


🏁 Script executed:

# Check the full test5 content to see the complete query and assertion
sed -n '655,690p' $(fd ClpConnectorTest.cpp)

Repository: y-scope/velox

Length of output: 1707


🏁 Script executed:

# Search for any comments or documentation about format specifiers in test file
rg -i "format|specifier|precision|\\\\L" $(fd ClpConnectorTest.cpp) -B 2 -A 2 | head -80

Repository: y-scope/velox

Length of output: 2202


🏁 Script executed:

# Check ClpTimestampsUtils.h in full to understand the thresholds better
cat velox/connectors/clp/search_lib/ClpTimestampsUtils.h | head -100

Repository: y-scope/velox

Length of output: 3740


🏁 Script executed:

# Search for how SetTimestampLiteralPrecision works - does it force or suggest precision
rg -i "SetTimestampLiteralPrecision" --type cpp -A 5 -B 2 | head -60

Repository: y-scope/velox

Length of output: 3387


🏁 Script executed:

# Check if there's a comments in code about why test4 might work despite the large value
sed -n '568,620p' $(fd ClpConnectorTest.cpp)

Repository: y-scope/velox

Length of output: 2154


🏁 Script executed:

# Look for what kTestTimestampSeconds is set to
rg "kTestTimestampSeconds|kTestTimestampNanoseconds" $(fd ClpConnectorTest.cpp) -B 2 -A 1

Repository: y-scope/velox

Length of output: 1726


Fix test4's timestamp precision to match test5's format and CLP IR millisecond specification.

The test4 timestamp query uses "1756003005000000" (16 digits) while test5 uses "1746003070000" (13 digits). With identical "\L" format specifiers, this inconsistency indicates a semantic error.

Given that CLP IR explicitly treats timestamp literals as millisecond precision (via SetTimestampLiteralPrecision::Milliseconds in ClpIrCursor.cpp), the correct millisecond representation for ~1756003005 seconds is 1756003005000 (13 digits, matching test5's format), not 1756003005000000 (16 digits).

test4 currently passes only because the archive contains sparse data with a single event well below any reasonable threshold. The 16-digit value either gets misinterpreted as nanoseconds by auto-detection (exceeding the 31.5 trillion nanosecond threshold) or happens to work due to the lax filter boundary. This becomes a latent bug in production when timestamp comparisons matter with edge-case or boundary values.

🔍 Suggested fix
-      R"(timestamp < timestamp("1756003005000000", "\L"))");
+      R"(timestamp < timestamp("1756003005000", "\L"))");

Also apply to line 659 if using similar inconsistent 16-digit values (verify test5 uses only 13-digit format).

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@velox/connectors/clp/tests/ClpConnectorTest.cpp` around lines 579 - 580, The
timestamp literal in test4's kqlQuery is using a 16-digit value
("1756003005000000") which contradicts CLP IR's millisecond precision
(SetTimestampLiteralPrecision::Milliseconds in ClpIrCursor.cpp); update the
kqlQuery string in ClpConnectorTest.cpp (symbol: kqlQuery used in test4) to the
13-digit millisecond value "1756003005000" so it matches test5's format, and
scan the file for any other 16-digit timestamp literals (e.g., the similar value
around line 659) and convert them to millisecond precision as well.
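The digit-count argument in this comment is plain unit arithmetic and can be verified directly. The sketch below uses only the values quoted in the comment; `countDigits` and the constants are hypothetical helpers introduced for illustration, and nothing here depends on how CLP parses the "\L" specifier.

```cpp
#include <cassert>
#include <cstdint>

constexpr std::int64_t kMillisPerSecond = 1'000;
constexpr std::int64_t kMicrosPerSecond = 1'000'000;

// Counts the decimal digits of a non-negative integer.
constexpr int countDigits(std::int64_t value) {
  int digits = 1;
  while (value >= 10) {
    value /= 10;
    ++digits;
  }
  return digits;
}

// The epoch second referenced in the review comment.
constexpr std::int64_t kEpochSeconds = 1'756'003'005;

inline void checkDigitCounts() {
  // In milliseconds the value has 13 digits, like the test5 literals.
  assert(kEpochSeconds * kMillisPerSecond == 1'756'003'005'000);
  assert(countDigits(1'756'003'005'000) == 13);
  // The 16-digit literal in test4 equals the same instant in microseconds.
  assert(kEpochSeconds * kMicrosPerSecond == 1'756'003'005'000'000);
  assert(countDigits(1'756'003'005'000'000) == 16);
}
```

If the 16-digit literal is read as milliseconds, it denotes an instant roughly 55,000 years after the epoch, so a `<` filter against it would accept any realistic event. That is consistent with the comment's observation that the test can pass without exercising the boundary.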

auto plan = PlanBuilder(pool_.get())
.startTableScan()
.outputType(ROW({"timestamp"}, {TIMESTAMP()}))
Expand Down Expand Up @@ -655,11 +655,9 @@ TEST_F(ClpConnectorTest, test5FloatTimestampNoPushdown) {

TEST_F(ClpConnectorTest, test5FloatTimestampPushdown) {
// Test filtering rows with a timestamp parsed from a date string and floats
// in various formats. Because KQL doesn’t automatically interpret the unit of
// the timestamp, the returned result differs slightly from the one without
// pushdown.
// in various formats.
const std::shared_ptr<std::string> kqlQuery = std::make_shared<std::string>(
"(timestamp < 1746003005.127 and timestamp >= 1746003005.124)");
R"(timestamp < timestamp("1746003070000", "\L") and timestamp >= timestamp("1746003005124", "\L")");
auto plan =
PlanBuilder(pool_.get())
.startTableScan()
Expand Down Expand Up @@ -688,12 +686,66 @@ TEST_F(ClpConnectorTest, test5FloatTimestampPushdown) {
{Timestamp(1746003005, 124000000),
Timestamp(1746003005, 124100000),
Timestamp(1746003005, 125000000),
Timestamp(1746003005, 126000000)}),
Timestamp(1746003005, 126000000),
Timestamp(1746003005, 127000000),
Timestamp(1746003060, 0),
Timestamp(1746003065, 0)}),
makeFlatVector<double>(
{1.234567891234500E9,
{1.2345678912345E9,
1E16,
1.234567891234567E9,
1.234567891234567E9})});
1.234567891234567E9,
-1.234567891234567E-9,
1234567891.234567,
-1234567891.234567})});
test::assertEqualVectors(expected, output);
}

TEST_F(ClpConnectorTest, test5v050FloatTimestampPushdown) {
// Test filtering rows with a timestamp parsed from a date string and floats
// in various formats.
const std::shared_ptr<std::string> kqlQuery = std::make_shared<std::string>(
R"(timestamp < timestamp("1746003070000", "\L") and timestamp >= timestamp("1746003005124", "\L")");
auto plan =
PlanBuilder(pool_.get())
.startTableScan()
.outputType(ROW({"timestamp", "floatValue"}, {TIMESTAMP(), DOUBLE()}))
.tableHandle(
std::make_shared<ClpTableHandle>(kClpConnectorId, "test_5"))
.assignments(
{{"timestamp",
std::make_shared<ClpColumnHandle>(
"timestamp", "timestamp", TIMESTAMP())},
{"floatValue",
std::make_shared<ClpColumnHandle>(
"floatValue", "floatValue", DOUBLE())}})
.endTableScan()
.orderBy({"\"timestamp\" ASC"}, false)
.planNode();

auto output = getResults(
plan,
{makeClpSplit(
getExampleFilePath("test_5.v0.5.0.clps"),
ClpConnectorSplit::SplitType::kArchive,
kqlQuery)});
auto expected = makeRowVector({// timestamp
makeFlatVector<Timestamp>(
{Timestamp(1746003005, 124000000),
Timestamp(1746003005, 124100000),
Timestamp(1746003005, 125000000),
Timestamp(1746003005, 126000000),
Timestamp(1746003005, 127000000),
Timestamp(1746003060, 0),
Timestamp(1746003065, 0)}),
makeFlatVector<double>(
{1.2345678912345E9,
1E16,
1.234567891234567E9,
1.234567891234567E9,
-1.234567891234567E-9,
1234567891.234567,
-1234567891.234567})});
test::assertEqualVectors(expected, output);
}
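As a sanity check on the expected rows, the half-open range in the pushdown query can be evaluated by hand. This is a standalone sketch under the assumption that both literals are millisecond epochs, as the "\L" specifier and the 13-digit values suggest. `toNanos` and `inQueryRange` are hypothetical helpers, not part of the connector.

```cpp
#include <cassert>
#include <cstdint>

constexpr std::int64_t kNanosPerSecond = 1'000'000'000;
constexpr std::int64_t kNanosPerMilli = 1'000'000;

// Flattens a (seconds, nanos) pair, as used by Timestamp(seconds, nanos) in
// the expected vectors, into a single nanosecond epoch.
constexpr std::int64_t toNanos(std::int64_t seconds, std::int64_t nanos) {
  return seconds * kNanosPerSecond + nanos;
}

// Mirrors the query bounds: timestamp >= 1746003005124 ms and
// timestamp < 1746003070000 ms, compared in nanoseconds.
constexpr bool inQueryRange(std::int64_t timestampNs) {
  return timestampNs >= 1'746'003'005'124 * kNanosPerMilli &&
      timestampNs < 1'746'003'070'000 * kNanosPerMilli;
}

inline void checkRangeExamples() {
  // First and last expected rows fall inside the range.
  assert(inQueryRange(toNanos(1'746'003'005, 124'000'000)));
  assert(inQueryRange(toNanos(1'746'003'065, 0)));
  // One nanosecond below the lower bound is excluded.
  assert(!inQueryRange(toNanos(1'746'003'005, 123'999'999)));
  // The upper bound itself is excluded because the comparison is strict.
  assert(!inQueryRange(toNanos(1'746'003'070, 0)));
}
```

All seven expected timestamps, from Timestamp(1746003005, 124000000) through Timestamp(1746003065, 0), satisfy this predicate.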

Expand Down
Binary file not shown.