Skip to content

feat(clp-s): Add Timestamp schema tree node type to introduce the timestamp column; Add timestamp column reader and writer.#1873

Merged
gibber9809 merged 13 commits intoy-scope:mainfrom
gibber9809:clp-s-add-timestamp-column-reader-writer
Jan 19, 2026
Merged

feat(clp-s): Add Timestamp schema tree node type to introduce the timestamp column; Add timestamp column reader and writer.#1873
gibber9809 merged 13 commits intoy-scope:mainfrom
gibber9809:clp-s-add-timestamp-column-reader-writer

Conversation

@gibber9809
Copy link
Contributor

@gibber9809 gibber9809 commented Jan 15, 2026

Description

This PR introduces the TimestampColumnReader and TimestampColumnWriter class for the new NodeType::Timestamp column without using them. This change shouldn't impact functionality, and is aimed at reducing the review burden for #1788.

Besides adding these two classes, the changes also include:

  • Adding a new options for clp_s::ParsedMessage::variable_t std::pair<epochtime_t, uint64_t> which better reflects actual storage order (the std::pair<uint64_t, epochtime_t> option will be removed in the follow-up PR)
  • Adding the Timestamp to the NodeType enum
  • Adding a stub for TimestampDictionaryReader::append_timestamp_to_buffer -- a stub is needed since this method is used by TimestampColumnReader, but the implementation is omitted since it is related to an effective rewrite of TimestampDictionaryReader and TimestampDictionaryWriter coming in the follow-up PR

Checklist

  • The PR satisfies the contribution guidelines.
  • This is a breaking change and that has been indicated in the PR title, OR this isn't a
    breaking change.
  • Necessary docs have been updated, OR no docs need to be updated.

Validation performed

Summary by CodeRabbit

  • New Features

    • Full timestamp support: read, write, encode and store timestamps end-to-end.
    • Human‑readable timestamp formatting for display and exports.
    • Timestamp dictionary integration for efficient formatting and lookup across the system.
    • Extended message format and schema to surface timestamp values as first‑class data.
  • Other

    • Minor public API surface extended to enable the above features.

✏️ Tip: You can customize this high-level summary in your review settings.

@gibber9809 gibber9809 requested a review from a team as a code owner January 15, 2026 16:34
@coderabbitai
Copy link
Contributor

coderabbitai bot commented Jan 15, 2026

Walkthrough

Adds timestamp column support: new TimestampColumnReader/Writer, timestamp dictionary formatting API, delta-encoded int reader/writer adjustments, extends ParsedMessage variant, and adds NodeType::Timestamp.

Changes

Cohort / File(s) Summary
Timestamp reader
components/core/src/clp_s/ColumnReader.hpp, components/core/src/clp_s/ColumnReader.cpp
Introduces TimestampColumnReader with load(BufferViewReader&, uint64_t), extract_value(uint64_t), extract_string_value_into_buffer(uint64_t, std::string&), and get_encoded_time(uint64_t). Updates DeltaEncodedInt64ColumnReader::get_value_at_idx(size_t) signature to [[nodiscard]] auto ... -> int64_t.
Timestamp writer
components/core/src/clp_s/ColumnWriter.hpp, components/core/src/clp_s/ColumnWriter.cpp
Adds TimestampColumnWriter with add_value(ParsedMessage::variable_t&) and store(ZstdCompressor&). Adds DeltaEncodedInt64ColumnWriter::add_value(int64_t) -> size_t overload and delegates variant overload to it.
Variant/type and schema updates
components/core/src/clp_s/ParsedMessage.hpp, components/core/src/clp_s/SchemaTree.hpp
Extends ParsedMessage::variable_t to include std::pair<epochtime_t, uint64_t>. Adds Timestamp enum value to NodeType.
Timestamp dictionary API
components/core/src/clp_s/TimestampDictionaryReader.hpp
Adds append_timestamp_to_buffer(epochtime_t timestamp, uint64_t format_id, std::string& buffer) const to marshal/format timestamps via dictionary patterns (inline placeholder implementation).

Sequence Diagram(s)

mermaid
sequenceDiagram
participant Writer as TimestampColumnWriter
participant DeltaW as DeltaEncodedInt64ColumnWriter
participant EncBuf as m_timestamp_encodings (vector)
participant Compressor as ZstdCompressor
Writer->>DeltaW: add_value(timestamp)
DeltaW-->>Writer: encoded_delta_size
Writer->>EncBuf: push_back(format_id)
Writer->>Compressor: store(m_timestamps), write(encodings)

mermaid
sequenceDiagram
participant Reader as TimestampColumnReader
participant DeltaR as DeltaEncodedInt64ColumnReader
participant Dict as TimestampDictionaryReader
participant Req as Caller
Req->>Reader: load(reader, num_messages)
Reader->>DeltaR: load timestamps
Req->>Reader: extract_value(cur_message)
Reader->>DeltaR: get_value_at_idx(idx)
DeltaR-->>Reader: epochtime
Reader->>Dict: append_timestamp_to_buffer(epochtime, format_id, buffer)
Reader-->>Req: std::variant or encoded string

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~45 minutes

🚥 Pre-merge checks | ✅ 2 | ❌ 1
❌ Failed checks (1 warning)
Check name Status Explanation Resolution
Docstring Coverage ⚠️ Warning Docstring coverage is 28.57% which is insufficient. The required threshold is 80.00%. Write docstrings for the functions missing them to satisfy the coverage threshold.
✅ Passed checks (2 passed)
Check name Status Explanation
Description Check ✅ Passed Check skipped - CodeRabbit’s high-level summary is enabled.
Title check ✅ Passed The title accurately summarizes the main changes: introducing a Timestamp node type and adding both reader and writer classes for timestamp columns.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.

✨ Finishing touches
  • 📝 Generate docstrings

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

Comment @coderabbitai help to get the list of available commands and usage tips.

Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

🤖 Fix all issues with AI agents
In `@components/core/src/clp_s/TimestampDictionaryReader.hpp`:
- Around line 39-52: The doc comment for append_timestamp_to_buffer contains a
typo: change "@throws OperationFailed is the format indicated by `format_id` can
not be interpreted as a `clp_s::timestamp_parser::TimestampPattern`." to
"@throws OperationFailed if the format indicated by `format_id` can not be
interpreted as a `clp_s::timestamp_parser::TimestampPattern`." — update the
comment above the append_timestamp_to_buffer method to correct "is" to "if".
📜 Review details

Configuration used: Organization UI

Review profile: ASSERTIVE

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 491e321 and 8810fd6.

📒 Files selected for processing (10)
  • components/core/cmake/Options/options.cmake
  • components/core/src/clp_s/CMakeLists.txt
  • components/core/src/clp_s/ColumnReader.cpp
  • components/core/src/clp_s/ColumnReader.hpp
  • components/core/src/clp_s/ColumnWriter.cpp
  • components/core/src/clp_s/ColumnWriter.hpp
  • components/core/src/clp_s/ParsedMessage.hpp
  • components/core/src/clp_s/SchemaTree.hpp
  • components/core/src/clp_s/TimestampDictionaryReader.hpp
  • components/core/src/clp_s/indexer/CMakeLists.txt
🧰 Additional context used
📓 Path-based instructions (1)
**/*.{cpp,hpp,java,js,jsx,tpp,ts,tsx}

⚙️ CodeRabbit configuration file

  • Prefer false == <expression> rather than !<expression>.

Files:

  • components/core/src/clp_s/ParsedMessage.hpp
  • components/core/src/clp_s/TimestampDictionaryReader.hpp
  • components/core/src/clp_s/ColumnReader.cpp
  • components/core/src/clp_s/ColumnWriter.hpp
  • components/core/src/clp_s/SchemaTree.hpp
  • components/core/src/clp_s/ColumnReader.hpp
  • components/core/src/clp_s/ColumnWriter.cpp
🧠 Learnings (19)
📓 Common learnings
Learnt from: Bill-hbrhbr
Repo: y-scope/clp PR: 1122
File: components/core/src/clp/clp/CMakeLists.txt:175-195
Timestamp: 2025-07-23T09:54:45.185Z
Learning: In the CLP project, when reviewing CMakeLists.txt changes that introduce new compression library dependencies (BZip2, LibLZMA, LZ4, ZLIB), the team prefers to address conditional linking improvements in separate PRs rather than expanding the scope of focused migration PRs like the LibArchive task-based installation migration.
📚 Learning: 2025-07-23T09:54:45.185Z
Learnt from: Bill-hbrhbr
Repo: y-scope/clp PR: 1122
File: components/core/src/clp/clp/CMakeLists.txt:175-195
Timestamp: 2025-07-23T09:54:45.185Z
Learning: In the CLP project, when reviewing CMakeLists.txt changes that introduce new compression library dependencies (BZip2, LibLZMA, LZ4, ZLIB), the team prefers to address conditional linking improvements in separate PRs rather than expanding the scope of focused migration PRs like the LibArchive task-based installation migration.

Applied to files:

  • components/core/src/clp_s/indexer/CMakeLists.txt
  • components/core/cmake/Options/options.cmake
  • components/core/src/clp_s/CMakeLists.txt
📚 Learning: 2025-08-09T04:07:27.083Z
Learnt from: Bill-hbrhbr
Repo: y-scope/clp PR: 1156
File: components/core/CMakeLists.txt:772-772
Timestamp: 2025-08-09T04:07:27.083Z
Learning: In the CLP project's CMakeLists.txt, when reviewing changes related to the ${zstd_TARGET} variable usage in test linking, the team is planning a refactoring PR to improve this mechanism. Guards for undefined target variables should be deferred to that separate PR rather than being added in focused dependency migration PRs.

Applied to files:

  • components/core/src/clp_s/indexer/CMakeLists.txt
  • components/core/cmake/Options/options.cmake
  • components/core/src/clp_s/CMakeLists.txt
📚 Learning: 2025-06-02T18:22:24.060Z
Learnt from: gibber9809
Repo: y-scope/clp PR: 955
File: components/core/src/clp_s/search/sql/CMakeLists.txt:8-26
Timestamp: 2025-06-02T18:22:24.060Z
Learning: In the clp project, ANTLR code generation at build time is being removed by another PR. When reviewing CMake files, be aware that some temporary suboptimal configurations may exist to reduce merge conflicts between concurrent PRs, especially around ANTLR_TARGET calls.

Applied to files:

  • components/core/src/clp_s/indexer/CMakeLists.txt
  • components/core/cmake/Options/options.cmake
  • components/core/src/clp_s/CMakeLists.txt
📚 Learning: 2025-06-27T01:49:15.724Z
Learnt from: sitaowang1998
Repo: y-scope/clp PR: 1044
File: .github/workflows/clp-core-build-macos.yaml:0-0
Timestamp: 2025-06-27T01:49:15.724Z
Learning: In the clp project, manually setting LDFLAGS for library paths can cause runtime errors because it interferes with CMake's built-in library discovery and linking mechanisms. CMake can identify and correctly link to libraries on its own without requiring manual LDFLAGS configuration.

Applied to files:

  • components/core/src/clp_s/indexer/CMakeLists.txt
📚 Learning: 2024-11-26T19:12:43.982Z
Learnt from: LinZhihao-723
Repo: y-scope/clp PR: 486
File: components/core/tests/test-error_handling.cpp:37-38
Timestamp: 2024-11-26T19:12:43.982Z
Learning: In the CLP project, C++20 is used, allowing the utilization of C++20 features such as class template argument deduction (CTAD) with `std::array`.

Applied to files:

  • components/core/src/clp_s/indexer/CMakeLists.txt
  • components/core/src/clp_s/CMakeLists.txt
📚 Learning: 2025-09-11T13:15:39.930Z
Learnt from: anlowee
Repo: y-scope/clp PR: 1176
File: components/core/src/clp_s/ColumnWriter.cpp:67-72
Timestamp: 2025-09-11T13:15:39.930Z
Learning: In components/core/src/clp_s/ColumnWriter.cpp, when using sizeof with types like variable_dictionary_id_t, always use the fully qualified namespace (clp::variable_dictionary_id_t) to match the variable declaration and maintain consistency with other column writers in the codebase.

Applied to files:

  • components/core/src/clp_s/ParsedMessage.hpp
  • components/core/src/clp_s/ColumnReader.cpp
  • components/core/src/clp_s/ColumnWriter.hpp
  • components/core/src/clp_s/ColumnReader.hpp
  • components/core/src/clp_s/ColumnWriter.cpp
📚 Learning: 2025-08-15T15:21:51.607Z
Learnt from: haiqi96
Repo: y-scope/clp PR: 1186
File: components/clp-package-utils/clp_package_utils/scripts/archive_manager.py:40-40
Timestamp: 2025-08-15T15:21:51.607Z
Learning: In components/clp-package-utils/clp_package_utils/scripts/archive_manager.py, the _validate_timestamps function correctly uses Optional[int] for end_ts parameter without a default value because it forces callers to explicitly pass the parsed argument value (which can be None), making the API contract clear and avoiding ambiguity about whether None came from defaults or actual parsed values.

Applied to files:

  • components/core/cmake/Options/options.cmake
📚 Learning: 2025-08-15T15:19:14.562Z
Learnt from: haiqi96
Repo: y-scope/clp PR: 1186
File: components/clp-package-utils/clp_package_utils/scripts/archive_manager.py:40-40
Timestamp: 2025-08-15T15:19:14.562Z
Learning: In components/clp-package-utils/clp_package_utils/scripts/archive_manager.py, the _validate_timestamps function correctly uses Optional[int] for end_ts parameter without a default value because it receives parsed argument values that can be None when --end-ts is not provided for the find command.

Applied to files:

  • components/core/cmake/Options/options.cmake
📚 Learning: 2025-08-03T18:56:07.366Z
Learnt from: Bill-hbrhbr
Repo: y-scope/clp PR: 1124
File: components/core/cmake/Modules/FindBZip2.cmake:49-65
Timestamp: 2025-08-03T18:56:07.366Z
Learning: In CLP's custom CMake Find modules, `FindStaticLibraryDependencies` populates the `*_DYNAMIC_LIBS` variable even when building with static libraries. When the main library is static, all dependencies must also use static libraries - there's no fallback to shared libraries for dependencies.

Applied to files:

  • components/core/cmake/Options/options.cmake
📚 Learning: 2025-10-22T21:02:31.113Z
Learnt from: junhaoliao
Repo: y-scope/clp PR: 0
File: :0-0
Timestamp: 2025-10-22T21:02:31.113Z
Learning: Repository y-scope/clp: Maintain deterministic CI/builds for Rust; add a check to verify Cargo.lock is in sync with Cargo.toml without updating dependencies (non-mutating verification in clp-rust-checks workflow).

Applied to files:

  • components/core/cmake/Options/options.cmake
📚 Learning: 2024-10-24T14:46:00.664Z
Learnt from: haiqi96
Repo: y-scope/clp PR: 523
File: components/core/src/clp/BufferedFileReader.hpp:158-158
Timestamp: 2024-10-24T14:46:00.664Z
Learning: In `components/core/src/clp/BufferedFileReader.hpp`, the `BufferedFileReader::try_read` function explicitly calls `BufferReader::try_read`, so documentation should reference `BufferReader::try_read`.

Applied to files:

  • components/core/src/clp_s/TimestampDictionaryReader.hpp
📚 Learning: 2025-08-06T23:57:49.762Z
Learnt from: gibber9809
Repo: y-scope/clp PR: 1163
File: components/core/src/clp_s/ColumnWriter.hpp:120-122
Timestamp: 2025-08-06T23:57:49.762Z
Learning: In the CLP codebase, the `encoded_log_dict_id_t` type alias in `components/core/src/clp_s/ColumnWriter.hpp` is appropriately placed as a class-scoped alias because the ColumnWriter.hpp header is already included in all places where this type is used, eliminating the need to move it to a common header.

Applied to files:

  • components/core/src/clp_s/ColumnReader.cpp
  • components/core/src/clp_s/ColumnWriter.hpp
  • components/core/src/clp_s/ColumnReader.hpp
  • components/core/src/clp_s/ColumnWriter.cpp
📚 Learning: 2024-10-24T14:45:26.265Z
Learnt from: haiqi96
Repo: y-scope/clp PR: 523
File: components/core/src/clp/BufferedFileReader.cpp:96-106
Timestamp: 2024-10-24T14:45:26.265Z
Learning: In `components/core/src/clp/BufferedFileReader.cpp`, refactoring the nested error handling conditions may not apply due to the specific logic in the original code.

Applied to files:

  • components/core/src/clp_s/ColumnReader.cpp
📚 Learning: 2025-12-09T01:57:04.839Z
Learnt from: SharafMohamed
Repo: y-scope/clp PR: 1313
File: components/core/tests/test-SchemaSearcher.cpp:136-137
Timestamp: 2025-12-09T01:57:04.839Z
Learning: In C++ files under the components directory, keep forward declarations even when a definition follows in the same scope (such as in anonymous namespaces) to maintain consistency across multiple methods. This style reduces churn and aligns with the existing convention in this repository for both source and tests.

Applied to files:

  • components/core/src/clp_s/ColumnReader.cpp
  • components/core/src/clp_s/ColumnWriter.cpp
📚 Learning: 2025-08-13T15:25:40.805Z
Learnt from: anlowee
Repo: y-scope/clp PR: 1176
File: components/core/src/clp_s/ColumnWriter.cpp:49-127
Timestamp: 2025-08-13T15:25:40.805Z
Learning: In components/core/src/clp_s/ColumnWriter.cpp, the FormattedFloatColumnWriter::add_value method intentionally allows std::stod to throw exceptions (std::invalid_argument or std::out_of_range) without catching them, as this is designed to terminate decompression when encountering invalid float strings for data integrity purposes.

Applied to files:

  • components/core/src/clp_s/ColumnWriter.hpp
  • components/core/src/clp_s/ColumnWriter.cpp
📚 Learning: 2024-10-13T09:29:56.553Z
Learnt from: LinZhihao-723
Repo: y-scope/clp PR: 557
File: components/core/tests/test-ir_encoding_methods.cpp:1283-1286
Timestamp: 2024-10-13T09:29:56.553Z
Learning: `clp::ffi::SchemaTree::Node::id_t` is of type `uint32_t`, so adding `1` to `INT32_MAX` is safe and does not cause integer overflow.

Applied to files:

  • components/core/src/clp_s/SchemaTree.hpp
📚 Learning: 2025-08-25T07:27:19.449Z
Learnt from: Bill-hbrhbr
Repo: y-scope/clp PR: 1262
File: docs/src/dev-guide/components-core/index.md:50-50
Timestamp: 2025-08-25T07:27:19.449Z
Learning: In the CLP project, nlohmann_json version 3.11.3 is installed through their dependency management process, but they don't enforce version constraints in CMake find_package calls - they control versions through installation rather than CMake version pinning.

Applied to files:

  • components/core/src/clp_s/CMakeLists.txt
📚 Learning: 2025-07-01T14:49:07.333Z
Learnt from: jackluo923
Repo: y-scope/clp PR: 1054
File: components/core/src/glt/SQLitePreparedStatement.hpp:4-4
Timestamp: 2025-07-01T14:49:07.333Z
Learning: In the CLP codebase, standard headers like `<cstdint>` and `<string>` in alphabetical order (as seen in components/core/src/glt/SQLitePreparedStatement.hpp) are correctly ordered and do not need adjustment.

Applied to files:

  • components/core/src/clp_s/CMakeLists.txt
🧬 Code graph analysis (4)
components/core/src/clp_s/TimestampDictionaryReader.hpp (2)
components/core/src/clp_s/TimestampEntry.hpp (2)
  • timestamp (58-58)
  • timestamp (59-59)
components/core/src/clp_s/TimestampPattern.hpp (1)
  • timestamp (143-143)
components/core/src/clp_s/ColumnReader.cpp (1)
components/core/src/clp_s/ColumnReader.hpp (1)
  • idx (122-122)
components/core/src/clp_s/SchemaTree.hpp (1)
components/core/src/clp_s/ColumnReader.hpp (1)
  • Timestamp (378-378)
components/core/src/clp_s/ColumnWriter.cpp (2)
components/core/src/clp_s/ParsedMessage.hpp (6)
  • add_value (45-47)
  • add_value (45-45)
  • value (84-86)
  • value (84-84)
  • value (94-96)
  • value (94-94)
components/core/src/clp_s/ColumnWriter.hpp (23)
  • value (29-29)
  • value (59-59)
  • value (76-76)
  • value (79-79)
  • value (97-97)
  • value (114-114)
  • value (134-134)
  • value (152-152)
  • value (179-179)
  • value (236-236)
  • value (254-254)
  • value (272-272)
  • compressor (35-35)
  • compressor (61-61)
  • compressor (81-81)
  • compressor (99-99)
  • compressor (116-116)
  • compressor (136-136)
  • compressor (154-154)
  • compressor (181-181)
  • compressor (238-238)
  • compressor (256-256)
  • compressor (274-274)
🪛 Cppcheck (2.19.0)
components/core/src/clp_s/ColumnReader.cpp

[style] 33-33: The function 'get_utc_offset' is never used.

(unusedFunction)

components/core/src/clp_s/ColumnWriter.cpp

[style] 37-37: The function 'add_to_logtype_query' is never used.

(unusedFunction)


[style] 43-43: The function 'get_placeholder' is never used.

(unusedFunction)


[style] 35-35: The function 'get_begin_pos' is never used.

(unusedFunction)


[style] 37-37: The function 'get_end_pos' is never used.

(unusedFunction)


[style] 31-31: The function 'get_timestamp' is never used.

(unusedFunction)


[style] 33-33: The function 'get_utc_offset' is never used.

(unusedFunction)


[style] 35-35: The function 'get_message' is never used.

(unusedFunction)


[style] 45-45: The function 'could_be_multi_digit_hex_value' is never used.

(unusedFunction)


[style] 35-35: The function 'get_error_code' is never used.

(unusedFunction)


[style] 37-37: The function 'get_filename' is never used.

(unusedFunction)


[style] 39-39: The function 'get_line_number' is never used.

(unusedFunction)


[style] 43-43: The function 'get_num_operands' is never used.

(unusedFunction)


[style] 41-41: The function 'get_print_stream' is never used.

(unusedFunction)


[style] 190-190: The function 'timestamp_is_in_search_time_range' is never used.

(unusedFunction)


[style] 194-194: The function 'get_ignore_case' is never used.

(unusedFunction)


[style] 196-196: The function 'get_search_string' is never used.

(unusedFunction)


[style] 189-189: The function 'get_encoded_log_dict_id' is never used.

(unusedFunction)


[style] 197-197: The function 'get_encoded_offset' is never used.

(unusedFunction)

⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (5)
  • GitHub Check: ubuntu-jammy-lint
  • GitHub Check: ubuntu-jammy-dynamic-linked-bins
  • GitHub Check: ubuntu-jammy-static-linked-bins
  • GitHub Check: lint-check (ubuntu-24.04)
  • GitHub Check: check-generated
🔇 Additional comments (16)
components/core/src/clp_s/SchemaTree.hpp (1)

47-47: LGTM!

The Timestamp enum value is correctly placed before Unknown, following the documented convention that new node types must be added at the end but before Unknown to preserve encoding stability.

components/core/src/clp_s/indexer/CMakeLists.txt (1)

135-135: LGTM!

The addition of clp_s::timestamp_parser as a private dependency is appropriate for enabling timestamp parsing functionality in the indexer, consistent with the broader timestamp support being introduced across clp_s components.

components/core/src/clp_s/ParsedMessage.hpp (1)

26-28: Temporary duplication noted for follow-up cleanup.

The new std::pair<epochtime_t, uint64_t> variant (line 27) coexists with the existing std::pair<uint64_t, epochtime_t> (line 26). Since these are distinct types (different element ordering), the variant can correctly distinguish between them at compile time. Per the PR objectives, the older pair will be removed in the follow-up PR (#1788).

components/core/cmake/Options/options.cmake (3)

135-136: LGTM!

The addition of CLP_BUILD_CLP_S_TIMESTAMP_PARSER to the binary dependencies validation is correct and consistent with the linkage changes in the corresponding CMakeLists.txt files.


211-212: LGTM!

Correctly adds CLP_BUILD_CLP_S_TIMESTAMP_PARSER as a required dependency for the archive reader, aligning with the timestamp parsing functionality being integrated.


233-234: LGTM!

Correctly adds CLP_BUILD_CLP_S_TIMESTAMP_PARSER as a required dependency for the archive writer, ensuring consistent build configuration with the other timestamp-related changes.

components/core/src/clp_s/CMakeLists.txt (1)

282-283: LGTM!

The addition of clp_s::timestamp_parser and clp_s::timestamp_pattern as PUBLIC dependencies to both clp_s_archive_writer and clp_s_archive_reader is consistent and appropriate for supporting the new TimestampColumnReader and TimestampColumnWriter classes.

Also applies to: 347-348

components/core/src/clp_s/ColumnWriter.hpp (2)

75-77: LGTM!

The new add_value(int64_t) overload with [[nodiscard]] enables direct value addition without variant extraction, which is appropriately used by the composed TimestampColumnWriter.


263-279: LGTM!

The TimestampColumnWriter class follows the established pattern of other column writers. The composition of DeltaEncodedInt64ColumnWriter for delta-encoded timestamp values with a separate vector for encodings is a clean design that promotes reuse.

components/core/src/clp_s/ColumnReader.cpp (2)

33-47: LGTM!

The signature change to trailing-return-type form matches the updated header declaration. The implementation logic remains unchanged and correct.


257-285: LGTM!

The TimestampColumnReader implementations are well-structured:

  • load() correctly delegates timestamp loading to the composed reader and reads encodings separately
  • extract_value() and extract_string_value_into_buffer() properly delegate to append_timestamp_to_buffer
  • get_encoded_time() correctly returns the decoded timestamp value

The pattern is consistent with the existing DateStringColumnReader implementation.

components/core/src/clp_s/ColumnWriter.cpp (2)

31-44: LGTM!

The refactored DeltaEncodedInt64ColumnWriter::add_value:

  • The new int64_t overload correctly implements delta encoding (first value stored directly, subsequent values as deltas)
  • The existing ParsedMessage::variable_t& overload properly delegates to the new overload
  • The condition 0 == m_values.size() follows the project's coding style

188-199: LGTM!

The TimestampColumnWriter implementation:

  • add_value() correctly extracts std::pair<epochtime_t, uint64_t> (the new pair order per PR objectives) and delegates timestamp storage to the composed writer
  • store() properly serializes by delegating to m_timestamps.store() first, then writing the encodings
components/core/src/clp_s/ColumnReader.hpp (3)

122-124: LGTM!

The addition of [[nodiscard]] and trailing-return-type syntax for get_value_at_idx is appropriate. The [[nodiscard]] attribute ensures callers don't accidentally ignore the returned value.


386-390: Verify the docstring claim about epoch nanoseconds.

The docstring states the return value is "epoch nanoseconds", while DateStringColumnReader::get_encoded_time (line 354) documents its return as "epoch time". Both return epochtime_t. Please verify that this documentation accurately reflects the expected time unit for the new Timestamp column type.


364-397: LGTM!

The TimestampColumnReader class is well-designed:

  • Follows the established pattern of other column readers
  • Uses composition with DeltaEncodedInt64ColumnReader for delta-encoded timestamp storage (improving compression)
  • Returns NodeType::Timestamp from get_type() aligning with the new enum value in SchemaTree.hpp
  • The constructor properly initializes all members including the composed reader

✏️ Tip: You can disable this entire section by setting review_details to false in your review settings.

Copy link
Member

@LinZhihao-723 LinZhihao-723 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Overall lgtm. Some style catch.

Comment on lines 32 to 38
if (0 == m_values.size()) {
m_cur = std::get<int64_t>(value);
m_values.push_back(m_cur);
m_cur = value;
m_values.emplace_back(m_cur);
} else {
auto next = std::get<int64_t>(value);
m_values.push_back(next - m_cur);
m_cur = next;
m_values.emplace_back(value - m_cur);
m_cur = value;
}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Seems like we could just use:

    m_values.emplace_back(value - m_cur);
    m_cur = value;

because m_cur is initialized to 0.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good catch!

void TimestampColumnWriter::store(ZstdCompressor& compressor) {
m_timestamps.store(compressor);
size_t const encodings_size{m_timestamp_encodings.size() * sizeof(uint64_t)};
compressor.write(reinterpret_cast<char const*>(m_timestamp_encodings.data()), encodings_size);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

lol, I hope at some day we can drop all reinterpret_cast.
No action needed for this PR since it looks like just repeating existing implementations.

clp::ffi::FourByteEncodedTextAst,
bool,
std::pair<uint64_t, epochtime_t>,
std::pair<epochtime_t, uint64_t>,
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@junhaoliao junhaoliao added this to the Backlog milestone Jan 19, 2026
gibber9809 and others added 3 commits January 19, 2026 12:04
Co-authored-by: Lin Zhihao <59785146+LinZhihao-723@users.noreply.github.com>
Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

🤖 Fix all issues with AI agents
In `@components/core/src/clp_s/TimestampDictionaryReader.hpp`:
- Around line 39-52: The current no-op implementation of
append_timestamp_to_buffer silently returns empty data; replace the stub in
TimestampDictionaryReader::append_timestamp_to_buffer with a fail-fast behavior
that throws an OperationFailed exception (including the format_id and a clear
message that timestamp formatting is unimplemented or the format cannot be
interpreted as clp_s::timestamp_parser::TimestampPattern) so callers do not
receive empty timestamps; ensure the thrown OperationFailed uses the project's
standard constructor/format for exceptions and preserves any available context
(format_id and timestamp) in the error message.

Copy link
Member

@LinZhihao-723 LinZhihao-723 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For the PR title, how about:

feat(clp-s): Add `Timestamp` schema tree node type to introduce the timestamp column; Add timestamp column reader and writer.

@gibber9809 gibber9809 changed the title feat(clp-s): Add TimestampColumnReader and TimestampColumnWriter for new NodeType::Timestamp column. feat(clp-s): Add Timestamp schema tree node type to introduce the timestamp column; Add timestamp column reader and writer. Jan 19, 2026
@gibber9809 gibber9809 merged commit ff6f76a into y-scope:main Jan 19, 2026
28 checks passed
@junhaoliao junhaoliao removed this from the Backlog milestone Jan 20, 2026
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants