ffi: Move `SchemaTreeNode` into `SchemaTree`; Use an `optional` for the parent ID in the schema tree node to track nodes which have no parents (the root). #554

LinZhihao-723 · 2024-10-10T05:03:27Z

Description

In the current implementation, each ffi schema tree node stores an ID to keep track of the parent node ID. However, this scheme doesn't make sense for the root node since the root, by definition, should have no parent. In the current implementation, we manually set the parent ID of the root to itself (which is 0).
This PR changes the underlying parent ID from id_t to std::optional<id_t> so the parent ID can be null. This scheme allows us to implement the root properly by setting its parent ID to null. After this change, there are two standard approaches to identifying whether a node is a root:

If the node has no parent ID (null), it is identified as the root.
If the node's ID matches the root ID defined by SchemaTree, it is identified as the root.

We then made the following changes according to these two root identifications:

If the code refers to a schema tree node instance, i.e., when traversing the schema tree, we use the first approach.
If the code refers to the node by a numerical ID,i.e., constructing a node locator, we use the second approach.

Notice that either of these approaches is valid if and only if the root is unique in a schema tree. It is the schema tree's responsibility to ensure the root's validity and uniqueness. We secured this with the following design:

SchemaTreeNode is instantiated using factory functions. There are two factory functions: one for the root node and one for non-root nodes. The constructed node should be either a root node or a non-root node.
SchemaTree calls the root node factory functions to create the root as its first node (ID 0) on initialization. The root factory function must not be called elsewhere.
SchemaTree inserts new nodes to the tree identified by a valid node locator. Since the node locator specifies a parent ID, conceptually, the node created by such a locator must be a non-root node.

To achieve this design, we move SchemaTreeNode into the schema tree class as SchemaTree::Node, so that it can access SchemaTree::cRootId and SchemaTree::NodeLocator (to implement bullet point 2 and 3 above) without circular dependency. The factory functions and constructors of the node class are made private. SchemaTree is marked as friend to ensure it is the only class that can instantiate tree nodes.

We update the schema tree unit test to ensure that each build meets the assumptions above in case future development on the schema tree breaks them.

Validation performed

Ensure workflows passed
Ensure existing unit tests aren't broken
Ensure newly added unit tests passed

Summary by CodeRabbit

New Features
- Enhanced encapsulation and clarity in schema tree node management.
- Introduced a new Node class to improve the structure of schema tree nodes.
- Added profiling options and configuration settings for improved build management.
- Default build type set to "Release" for better user experience.
Bug Fixes
- Improved error handling for unsupported compressor values and missing libraries.
Tests
- Updated tests to reflect changes in node identifiers and root node management.

coderabbitai · 2024-10-10T05:06:04Z

Walkthrough

The pull request updates the CMake configuration and refines the handling of schema tree nodes within the codebase. Key changes include setting a minimum CMake version, defining a project name, and establishing a default build type. Additionally, the transition from SchemaTreeNode to SchemaTree::Node enhances encapsulation and clarity across various components, including adjustments to method signatures and serialization logic. Tests have also been updated to align with these changes, ensuring consistency in the handling of node identifiers and validation processes.

Changes

File Path	Change Summary
`components/core/CMakeLists.txt`	Updated CMake configuration: minimum version, project name, default build type, and library checks. Added `GENERAL_COMPRESSOR` variable and `unitTest` executable target.
`components/core/src/clp/ffi/KeyValuePairLogEvent.cpp`	Replaced `SchemaTreeNode` with `SchemaTree::Node`; updated traversal, serialization, and method signatures.
`components/core/src/clp/ffi/KeyValuePairLogEvent.hpp`	Changed type alias from `SchemaTreeNode::id_t` to `SchemaTree::Node::id_t`.
`components/core/src/clp/ffi/SchemaTree.cpp`	Updated node identifier types and method signatures to reflect `Node::id_t`. Enhanced safety checks.
`components/core/src/clp/ffi/SchemaTree.hpp`	Introduced new `Node` class; modified constructor and removed `reset()` method.
`components/core/src/clp/ffi/SchemaTreeNode.hpp`	Deleted class `SchemaTreeNode` as part of the transition to `SchemaTree::Node`.
`components/core/src/clp/ffi/ir_stream/Serializer.cpp`	Updated references from `SchemaTreeNode` to `SchemaTree::Node`; adjusted method signatures.
`components/core/src/clp/ffi/ir_stream/Serializer.hpp`	Updated method signatures to replace `SchemaTreeNode` with `SchemaTree::Node`.
`components/core/src/clp/ffi/ir_stream/ir_unit_deserialization_methods.cpp`	Renamed types and updated function signatures to use `SchemaTree::Node`.
`components/core/tests/test-ffi_IrUnitHandlerInterface.cpp`	Updated test to use `SchemaTree::get_root_node_id()` instead of `SchemaTree::cRootId`.
`components/core/tests/test-ffi_KeyValuePairLogEvent.cpp`	Updated references to `SchemaTreeNode::id_t` with `SchemaTree::Node::id_t`.
`components/core/tests/test-ffi_SchemaTree.cpp`	Updated `check_node` and `insert_node` methods to reflect new node ID types.

Possibly related PRs

ffi: Extract IR unit deserialization methods from Deserializer into standalone functions to prepare for a templatized Deserializer. #544: This PR modifies the CMakeLists.txt to include new source files related to IR unit deserialization methods, which aligns with the changes in the main PR that also updates the CMake configuration for similar functionalities.
ffi: Redesign Deserializer API to deserialize key-value pair IR streams one IR unit at a time (fixes #539). #549: This PR introduces significant changes to the Deserializer API, including the addition of new methods for deserializing IR units, which directly relates to the changes made in the main PR regarding the handling of schema tree nodes and key-value pair log events.

Suggested reviewers

davidlion

Thank you for using CodeRabbit. We offer it for free to the OSS community and would appreciate your support in helping us grow. If you find it useful, would you consider giving us a shout-out on your favorite social media?

❤️ Share

🪧 Tips

Chat

There are 3 ways to chat with CodeRabbit:

Review comments: Directly reply to a review comment made by CodeRabbit. Example:
- I pushed a fix in commit <commit_id>, please review it.
- Generate unit testing code for this file.
- Open a follow-up GitHub issue for this discussion.
Files and specific lines of code (under the "Files changed" tab): Tag @coderabbitai in a new review comment at the desired location with your query. Examples:
- @coderabbitai generate unit testing code for this file.
- @coderabbitai modularize this function.
PR comments: Tag @coderabbitai in a new PR comment to ask questions about the PR branch. For the best results, please provide a very specific query, as very limited context is provided in this mode. Examples:
- @coderabbitai gather interesting stats about this repository and render them as a table. Additionally, render a pie chart showing the language distribution in the codebase.
- @coderabbitai read src/utils.ts and generate unit testing code.
- @coderabbitai read the files in the src/scheduler package and generate a class diagram using mermaid and a README in the markdown format.
- @coderabbitai help me debug CodeRabbit configuration file.

Note: Be mindful of the bot's finite context window. It's strongly recommended to break down tasks such as reading entire modules into smaller chunks. For a focused discussion, use review comments to chat about specific files and their changes, instead of using the PR comments.

CodeRabbit Commands (Invoked using PR comments)

@coderabbitai pause to pause the reviews on a PR.
@coderabbitai resume to resume the paused reviews.
@coderabbitai review to trigger an incremental review. This is useful when automatic reviews are disabled for the repository.
@coderabbitai full review to do a full review from scratch and review all the files again.
@coderabbitai summary to regenerate the summary of the PR.
@coderabbitai resolve resolve all the CodeRabbit review comments.
@coderabbitai configuration to show the current CodeRabbit configuration for the repository.
@coderabbitai help to get help.

Other keywords and placeholders

Add @coderabbitai ignore anywhere in the PR description to prevent this PR from being reviewed.
Add @coderabbitai summary to generate the high-level summary at a specific location in the PR description.
Add @coderabbitai anywhere in the PR title to generate the title automatically.

CodeRabbit Configuration File (`.coderabbit.yaml`)

You can programmatically configure CodeRabbit by adding a .coderabbit.yaml file to the root of your repository.
Please see the configuration documentation for more information.
If your editor has YAML language server enabled, you can add the path at the top of this file to enable auto-completion and validation: # yaml-language-server: $schema=https://coderabbit.ai/integrations/schema.v2.json

Documentation and Community

Visit our Documentation for detailed information on how to use CodeRabbit.
Join our Discord Community to get help, request features, and share feedback.
Follow us on X/Twitter for updates and announcements.

coderabbitai

Actionable comments posted: 9

🧹 Outside diff range and nitpick comments (5)

components/core/src/clp/ffi/SchemaTree.hpp (3)
128-128: Consider using get_root_node_id() in the constructor.

The constructor now uses a factory method to create the root node, which is a good practice. However, for consistency and to leverage the newly added get_root_node_id() method, consider updating the constructor as follows:
- SchemaTree() { m_tree_nodes.emplace_back(SchemaTreeNode::create_root(cRootId)); }
+ SchemaTree() { m_tree_nodes.emplace_back(SchemaTreeNode::create_root(get_root_node_id())); }
This change would ensure that the root node ID is always accessed through the centralized method.

144-144: Approve the addition of get_root() method with a minor suggestion.

The get_root() method is a good addition, providing direct access to the root node. However, for consistency, consider using the get_root_node_id() method instead of cRootId:
- [[nodiscard]] auto get_root() const -> SchemaTreeNode const& { return m_tree_nodes[cRootId]; }
+ [[nodiscard]] auto get_root() const -> SchemaTreeNode const& { return m_tree_nodes[get_root_node_id()]; }
This change would ensure consistent access to the root node ID throughout the class.

192-194: Approve moving cRootId to private section with a minor suggestion.

Moving cRootId to the private section improves encapsulation, which is a good practice. To further improve the code, consider updating the constant name to follow the common C++ naming convention for private constants:
- static constexpr SchemaTreeNode::id_t cRootId{0};
+ static constexpr SchemaTreeNode::id_t kRootId{0};
This change would make the code more consistent with typical C++ style guidelines.
components/core/src/clp/ffi/SchemaTree.cpp (1)
Line range hint 17-19: Incorrect return value in try_get_node_id method

The method try_get_node_id returns std::optional<SchemaTreeNode::id_t>, but when the parent ID is out of bounds, it returns false, which is not compatible with the return type.

Modify the return statement to return std::nullopt instead of false.

Apply this diff to correct the issue:
 if (m_tree_nodes.size() <= parent_id) {
-    return false;
+    return std::nullopt;
 }
components/core/tests/test-ffi_SchemaTree.cpp (1)

60-61: Clarify comment regarding node initialisation.

The comment "// The node after initialization must not be the root." could be rephrased for clarity. Consider changing it to "// The node being checked should not be the root node."

📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL

📥 Commits

Files that changed from the base of the PR and between 1c67908 and 8a70a07.

📒 Files selected for processing (8)

components/core/src/clp/ffi/KeyValuePairLogEvent.cpp (5 hunks)
components/core/src/clp/ffi/SchemaTree.cpp (2 hunks)
components/core/src/clp/ffi/SchemaTree.hpp (3 hunks)
components/core/src/clp/ffi/SchemaTreeNode.hpp (4 hunks)
components/core/src/clp/ffi/ir_stream/Serializer.cpp (1 hunks)
components/core/tests/test-ffi_IrUnitHandlerInterface.cpp (1 hunks)
components/core/tests/test-ffi_KeyValuePairLogEvent.cpp (1 hunks)
components/core/tests/test-ffi_SchemaTree.cpp (3 hunks)

🧰 Additional context used

📓 Learnings (1)

components/core/tests/test-ffi_IrUnitHandlerInterface.cpp (2)

Learnt from: LinZhihao-723
PR: y-scope/clp#549
File: components/core/tests/test-ir_encoding_methods.cpp:110-110
Timestamp: 2024-10-01T08:00:48.347Z
Learning: The function `handle_schema_tree_node_insertion` in `IrUnitHandler` must accept `clp::ffi::SchemaTree::NodeLocator` by value to match the `IrUnitHandlerInterface`.

Learnt from: LinZhihao-723
PR: y-scope/clp#549
File: components/core/tests/test-ir_encoding_methods.cpp:110-110
Timestamp: 2024-10-08T15:52:50.753Z
Learning: The function `handle_schema_tree_node_insertion` in `IrUnitHandler` must accept `clp::ffi::SchemaTree::NodeLocator` by value to match the `IrUnitHandlerInterface`.

🔇 Additional comments (11)

components/core/tests/test-ffi_IrUnitHandlerInterface.cpp (1)

110-112: Excellent update to use SchemaTree::get_root_node_id()

This change aligns well with the PR objectives. Using SchemaTree::get_root_node_id() instead of directly accessing SchemaTree::cRootId is a more flexible approach. It encapsulates the logic of retrieving the root node ID, which is particularly beneficial given the new implementation where the root node's parent ID can be null.

This modification enhances the test's robustness by ensuring it uses the correct method to obtain the root node ID, regardless of how it's internally represented in the SchemaTree class.

components/core/src/clp/ffi/SchemaTree.hpp (2)

122-125: Good addition of a static method for root node ID access.

The get_root_node_id() method provides a centralized and encapsulated way to access the root node ID. Using a static method instead of a public constant improves encapsulation, and the constexpr specifier ensures compile-time evaluation when possible, which is good for performance.

Line range hint 1-195: Overall, the changes improve the SchemaTree class implementation.

The modifications to the SchemaTree class align well with the PR objectives, improving encapsulation and providing better access to the root node. The addition of get_root_node_id() and get_root() methods, along with the updated constructor and moving cRootId to the private section, contribute to a more robust implementation.

To ensure consistency across the entire file, it's recommended to:

Use get_root_node_id() instead of direct cRootId access throughout the class.

Update the naming convention for the private constant cRootId to kRootId.

To verify these changes and ensure consistency, you can run the following script:

This script will help ensure that the suggested changes have been applied consistently throughout the file.

components/core/tests/test-ffi_KeyValuePairLogEvent.cpp (1)

269-270: Improved root node ID handling and code readability

These changes enhance the code in two ways:

Replacing SchemaTree::cRootId with SchemaTree::get_root_node_id() aligns with the new schema tree node implementation using an optional ID for the parent node. This improves clarity and maintainability.

Splitting the single line into two lines enhances readability and adheres to common coding standards for line length.

components/core/src/clp/ffi/SchemaTree.cpp (1)

71-74: Proper handling of optional parent IDs

Good use of optional_parent_id.has_value() before accessing optional_parent_id.value(). This ensures that the code safely handles nodes that may not have a parent, preventing potential runtime errors.

components/core/tests/test-ffi_SchemaTree.cpp (2)

54-74: The implementation of check_node function is correct.

The logic within the check_node function appropriately validates the node's existence, ID, parent ID, type, and key name against the provided locator. The checks ensure the node is not the root and all attributes match as expected.

Line range hint 78-100: Test case ffi_schema_tree is well-structured.

The test accurately initialises the schema tree and verifies the properties of the root node, ensuring it aligns with the intended behaviour. The use of REQUIRE statements effectively validates the conditions.

components/core/src/clp/ffi/KeyValuePairLogEvent.cpp (4)

Line range hint 79-84: Improved root node identification using is_root() method

The use of m_schema_tree_node->is_root() enhances clarity and encapsulation compared to direct comparison with constants, making the code more maintainable.

221-223: Proper validation to prevent root node inclusion

Good implementation of node.is_root() to ensure that the root node is not included in node_id_value_pairs, maintaining the integrity of the schema tree.

245-247: Clear explanation of safety assumptions

The added comments effectively clarify that it's safe to call get_parent_id_unsafe() because the node is confirmed not to be the root, improving code readability.

431-431: Consistent use of get_root() method

Retrieving the root schema tree node using m_schema_tree->get_root() improves encapsulation and aligns with the updated root node identification throughout the code.

components/core/src/clp/ffi/ir_stream/Serializer.cpp

components/core/src/clp/ffi/SchemaTree.cpp

components/core/src/clp/ffi/SchemaTreeNode.hpp

components/core/tests/test-ffi_SchemaTree.cpp

components/core/src/clp/ffi/KeyValuePairLogEvent.cpp

coderabbitai

Actionable comments posted: 8

🧹 Outside diff range and nitpick comments (8)

components/core/src/clp/ffi/ir_stream/Serializer.hpp (2)
118-118: LGTM: Consistent update to SchemaTree::Node::Type.

The change from SchemaTreeNode::Type to SchemaTree::Node::Type is consistent with the PR objectives and the previous change. This further reinforces the transition to SchemaTree::Node.

Consider moving the line break to before the serialize_val function name for better readability:
-    [[nodiscard]] auto
-    serialize_val(msgpack::object const& val, SchemaTree::Node::Type schema_tree_node_type) -> bool;
+    [[nodiscard]] auto serialize_val(
+        msgpack::object const& val,
+        SchemaTree::Node::Type schema_tree_node_type
+    ) -> bool;
Incomplete Refactoring: Residual SchemaTreeNode Usages Detected

The verification script identified several instances of SchemaTreeNode still present in the codebase. This indicates that the transition to SchemaTree::Node is not fully complete. Please address the following files to ensure consistency and complete refactoring:

components/core/tests/test-ffi_IrUnitHandlerInterface.cpp

components/core/src/clp/ffi/SchemaTree.hpp

components/core/src/clp/ffi/KeyValuePairLogEvent.hpp

components/core/src/clp/ffi/ir_stream/IrUnitType.hpp

components/core/src/clp/ffi/ir_stream/protocol_constants.hpp

components/core/src/clp/ffi/ir_stream/Serializer.cpp

components/core/src/clp/ffi/ir_stream/Deserializer.hpp

components/core/src/clp/ffi/ir_stream/ir_unit_deserialization_methods.cpp

components/core/src/clp/ffi/ir_stream/Serializer.hpp

components/core/src/clp/ffi/ir_stream/Serializer.cpp

components/core/src/clp/ffi/ir_stream/ir_unit_deserialization_methods.cpp

components/core/tests/test-ffi_SchemaTree.cpp

Addressing these files will help ensure that the refactoring is consistent across the entire codebase.

🔗 Analysis chain

Line range hint 109-118: Summary: Consistent transition to SchemaTree::Node

The changes in this file successfully implement the transition from SchemaTreeNode to SchemaTree::Node as outlined in the PR objectives. This improves encapsulation and clarity in the Serializer class.

To ensure consistency across the entire codebase, let's verify the global transition:

This will help identify any areas that might have been missed during the refactoring process.
🏁 Scripts executed

The following scripts were executed for the analysis:

Script:
#!/bin/bash
# Description: Verify the transition from SchemaTreeNode to SchemaTree::Node across the codebase

echo "Checking for any remaining instances of SchemaTreeNode:"
rg "SchemaTreeNode" --type cpp

echo "Verifying usage of SchemaTree::Node:"
rg "SchemaTree::Node" --type cpp
Length of output: 20514
components/core/tests/test-ffi_SchemaTree.cpp (1)

52-72: Excellent enhancement to node validation process.

The updated check_node function now performs a more comprehensive validation of the node, including checks for existence, ID, root status, parent ID, type, and key name. These changes significantly improve the robustness of the node validation process and align well with the PR objective of better handling the parent-child relationship.

However, there's a minor grammatical error in the function documentation:

On line 29, "matches the ones specified" should be "match the ones specified" to maintain subject-verb agreement.
components/core/src/clp/ffi/SchemaTree.hpp (2)
94-210: LGTM: Addition of Node class with suggestion

The new Node class is a well-structured addition that improves encapsulation and provides a clear interface for node operations. The use of std::optional<id_t> for the parent ID is a good design choice, allowing for a semantically correct representation of the root node.

The class correctly implements the rule of five, and the factory methods ensure proper initialization of root and non-root nodes.

Suggestion for improvement:
Consider adding a static_assert in the create_root() method to ensure that cRootId is always 0. This would help prevent potential future issues if the value of cRootId is ever changed.
[[nodiscard]] static auto create_root() -> Node {
    static_assert(cRootId == 0, "Root ID must be 0");
    return {cRootId, std::nullopt, "", Type::Obj};
}
256-257: LGTM: Addition of get_root() method with suggestion

The new get_root() method is a valuable addition, providing direct access to the root node of the schema tree. The use of [[nodiscard]] and returning a const reference are good practices.

Suggestion for improvement:
Consider adding a debug assertion to ensure that the root node always exists and has the expected properties. This can help catch potential issues during development:
[[nodiscard]] auto get_root() const -> Node const& {
    assert(!m_tree_nodes.empty() && "Tree nodes vector should never be empty");
    assert(m_tree_nodes[cRootId].is_root() && "First node should always be the root");
    return m_tree_nodes[cRootId];
}
components/core/src/clp/ffi/ir_stream/ir_unit_deserialization_methods.cpp (3)
Line range hint 54-76: Modify 'deserialize_schema_tree_node_parent_id' to handle optional parent IDs

The function 'deserialize_schema_tree_node_parent_id' currently accepts 'SchemaTree::Node::id_t& parent_id' as a parameter. To accommodate the possibility of a null parent ID (for the root node), the parameter should be changed to 'std::optionalSchemaTree::Node::id_t& parent_id'.

Apply this diff to update the function signature and handle optional parent IDs:
- auto deserialize_schema_tree_node_parent_id(
-         ReaderInterface& reader,
-         SchemaTree::Node::id_t& parent_id
+) -> IRErrorCode {
+         ReaderInterface& reader,
+         std::optional<SchemaTree::Node::id_t>& parent_id
+) -> IRErrorCode {

  encoded_tag_t tag{};
  if (auto const err{deserialize_tag(reader, tag)}; IRErrorCode::IRErrorCode_Success != err) {
      return err;
  }
  if (cProtocol::Payload::SchemaTreeNodeParentIdNull == tag) {
+     parent_id = std::nullopt;
  } else if (cProtocol::Payload::SchemaTreeNodeParentIdUByte == tag) {
      uint8_t deserialized_id{};
      if (false == deserialize_int(reader, deserialized_id)) {
          return IRErrorCode::IRErrorCode_Incomplete_IR;
      }
      parent_id = static_cast<SchemaTree::Node::id_t>(deserialized_id);
  } else if (cProtocol::Payload::SchemaTreeNodeParentIdUShort == tag) {
      uint16_t deserialized_id{};
      if (false == deserialize_int(reader, deserialized_id)) {
          return IRErrorCode::IRErrorCode_Incomplete_IR;
      }
      parent_id = static_cast<SchemaTree::Node::id_t>(deserialized_id);
  } else {
      return IRErrorCode::IRErrorCode_Corrupted_IR;
  }
  return IRErrorCode_Success;
}
This modification ensures that the root node's parent ID can be represented as 'std::nullopt'.

Line range hint 198-224: Change 'parent_id' to optional in function implementations

Throughout the function 'deserialize_schema_tree_node_parent_id', 'parent_id' is treated as a non-optional type. Given the new implementation where 'parent_id' can be null, the function should accommodate this by using 'std::optionalSchemaTree::Node::id_t'.

Ensure that 'parent_id' is consistently treated as an optional type and handle the null case appropriately.

Line range hint 511-523: Adjust 'deserialize_ir_unit_schema_tree_node_insertion' to handle optional 'parent_id'

In the function 'deserialize_ir_unit_schema_tree_node_insertion', 'parent_id' is declared as 'SchemaTree::Node::id_t'. To support a null parent ID for the root node, it should be an optional type.

Apply this diff to update the variable declaration and handle the null parent ID:
-     SchemaTree::Node::id_t parent_id{};
+     std::optional<SchemaTree::Node::id_t> parent_id{};

      if (auto const err{deserialize_schema_tree_node_parent_id(reader, parent_id)};
          IRErrorCode_Success != err)
      {
          return ir_error_code_to_errc(err);
      }
Additionally, adjust the construction of 'SchemaTree::NodeLocator' to accept an optional 'parent_id':
      return SchemaTree::NodeLocator{parent_id, key_name, type.value()};

📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL

📥 Commits

Files that changed from the base of the PR and between 8a70a07 and eee2b76.

📒 Files selected for processing (12)

components/core/CMakeLists.txt (0 hunks)
components/core/src/clp/ffi/KeyValuePairLogEvent.cpp (13 hunks)
components/core/src/clp/ffi/KeyValuePairLogEvent.hpp (1 hunks)
components/core/src/clp/ffi/SchemaTree.cpp (4 hunks)
components/core/src/clp/ffi/SchemaTree.hpp (6 hunks)
components/core/src/clp/ffi/SchemaTreeNode.hpp (0 hunks)
components/core/src/clp/ffi/ir_stream/Serializer.cpp (12 hunks)
components/core/src/clp/ffi/ir_stream/Serializer.hpp (2 hunks)
components/core/src/clp/ffi/ir_stream/ir_unit_deserialization_methods.cpp (10 hunks)
components/core/tests/test-ffi_IrUnitHandlerInterface.cpp (1 hunks)
components/core/tests/test-ffi_KeyValuePairLogEvent.cpp (5 hunks)
components/core/tests/test-ffi_SchemaTree.cpp (3 hunks)

💤 Files with no reviewable changes (2)

components/core/CMakeLists.txt
components/core/src/clp/ffi/SchemaTreeNode.hpp

🚧 Files skipped from review as they are similar to previous changes (3)

components/core/src/clp/ffi/ir_stream/Serializer.cpp
components/core/tests/test-ffi_IrUnitHandlerInterface.cpp
components/core/tests/test-ffi_KeyValuePairLogEvent.cpp

🧰 Additional context used

📓 Learnings (1)

components/core/src/clp/ffi/KeyValuePairLogEvent.cpp (1)

Learnt from: LinZhihao-723
PR: y-scope/clp#554
File: components/core/src/clp/ffi/KeyValuePairLogEvent.cpp:299-307
Timestamp: 2024-10-10T05:46:35.188Z
Learning: In the C++ function `get_schema_subtree_bitmap` in `KeyValuePairLogEvent.cpp`, when a loop uses `while (true)` with an internal check on `optional.has_value()`, and comments explain that this structure is used to silence `clang-tidy` warnings about unchecked optional access, this code is acceptable and should not be refactored to use `while (optional.has_value())`.

🪛 cppcheck

components/core/tests/test-ffi_SchemaTree.cpp

[performance] 39-39: Function parameter 'locator' should be passed by const reference.

(passedByValue)

🔇 Additional comments (22)

components/core/src/clp/ffi/KeyValuePairLogEvent.hpp (1)
27-27: LGTM! Verify usage across the codebase.

The update from SchemaTreeNode::id_t to SchemaTree::Node::id_t aligns with the PR objectives and reflects the shift in the schema tree node implementation. This change appears to be part of a larger refactoring effort to improve the schema tree structure.

To ensure consistency across the codebase, please run the following script to verify the usage of SchemaTree::Node::id_t:

This will help ensure that the change has been applied consistently throughout the project.

✅ Verification successful

Verification Successful!

The usage of SchemaTree::Node::id_t has been consistently updated across the codebase, and no remaining instances of SchemaTreeNode::id_t were found.
🏁 Scripts executed

The following scripts were executed for the analysis:

Script:
#!/bin/bash
# Description: Verify the usage of SchemaTree::Node::id_t across the codebase

# Test: Search for occurrences of SchemaTree::Node::id_t
echo "Occurrences of SchemaTree::Node::id_t:"
rg --type cpp "SchemaTree::Node::id_t"

# Test: Search for any remaining occurrences of SchemaTreeNode::id_t
echo "Remaining occurrences of SchemaTreeNode::id_t (should be none):"
rg --type cpp "SchemaTreeNode::id_t"
Length of output: 5111
components/core/src/clp/ffi/ir_stream/Serializer.hpp (1)
109-109: LGTM: Signature update aligns with PR objectives.

The change from SchemaTreeNode::id_t to SchemaTree::Node::id_t is consistent with the PR's goal of transitioning to SchemaTree::Node. This improves encapsulation and clarity.

Let's verify the consistency of this change across the codebase:

✅ Verification successful

: The change has been consistently applied across the codebase with no remaining instances of SchemaTreeNode::id_t.

All occurrences of SchemaTree::Node::id_t have been correctly updated, ensuring consistency and improving encapsulation as intended by the PR.
🏁 Scripts executed

The following scripts were executed for the analysis:

Script:
#!/bin/bash
# Description: Check for any remaining instances of SchemaTreeNode::id_t
# and ensure SchemaTree::Node::id_t is used consistently.

echo "Checking for remaining SchemaTreeNode::id_t instances:"
rg "SchemaTreeNode::id_t" --type cpp

echo "Verifying consistent usage of SchemaTree::Node::id_t:"
rg "SchemaTree::Node::id_t" --type cpp
Length of output: 5123
components/core/tests/test-ffi_SchemaTree.cpp (5)

21-21: LGTM: Updated function signatures improve encapsulation.

The change from SchemaTreeNode::id_t to SchemaTree::Node::id_t for the expected_id parameter aligns well with the PR objective. This modification enhances encapsulation by tying the ID type more closely to the SchemaTree class structure.

Also applies to: 34-34, 40-40, 49-49

94-97: Well-implemented root node checks.

The use of SchemaTree::get_root_node_id() instead of a hardcoded value improves maintainability and reduces the risk of errors. The additional checks for the root node's status and lack of parent ID align perfectly with the PR objective of better handling the root node.

100-107: Consistent use of get_root_node_id() in test setup.

The use of SchemaTree::get_root_node_id() in the locators for the root node's ID is consistent with earlier changes. This modification enhances maintainability and ensures consistency throughout the test cases.

112-112: Consistent type usage in test loops.

The change of loop variable types to SchemaTree::Node::id_t maintains consistency with the earlier modifications to node ID types. This ensures type consistency throughout the test case, which is a good practice.

Also applies to: 119-119, 125-125, 131-131

Line range hint 1-134: Overall excellent implementation of schema tree node changes.

The modifications in this file successfully implement the new schema tree node structure, particularly in handling the parent-child relationship and the root node. The tests have been comprehensively updated to validate the new implementation, ensuring robustness and correctness.

Key improvements include:

Enhanced encapsulation with SchemaTree::Node::id_t.

More thorough node validation in the check_node function.

Consistent use of SchemaTree::get_root_node_id() for improved maintainability.

Updated test cases that cover the new functionality.

These changes align well with the PR objectives and significantly improve the overall quality of the schema tree implementation.

components/core/src/clp/ffi/SchemaTree.hpp (5)

5-6: LGTM: Addition of header

The inclusion of the header is appropriate for the use of fixed-width integer types in the new implementation. This change enhances type safety and portability.

219-233: LGTM: Updates to NodeLocator class

The modifications to the NodeLocator class are consistent with the new Node class implementation. The use of Node::id_t and Node::Type improves type consistency across the codebase, which is a good practice for maintaining code clarity and reducing potential type-related errors.

236-237: LGTM: Addition of get_root_node_id() method

The new get_root_node_id() static method is a valuable addition. It provides a convenient and efficient way to access the root node ID. The use of constexpr allows for compile-time evaluation, potentially improving performance. The [[nodiscard]] attribute is appropriately used, encouraging proper handling of the return value.

240-240: LGTM: Updated SchemaTree constructor

The modification to the SchemaTree constructor, using the new Node::create_root() method, is appropriate and consistent with the new Node class implementation. This change ensures that the root node is properly initialized using the designated factory method, maintaining the integrity of the schema tree structure.

304-309: LGTM: Updates to cRootId and m_tree_nodes

The changes to cRootId and m_tree_nodes are consistent with the new Node class implementation. Using Node::id_t for cRootId improves type consistency, while storing Node objects directly in the m_tree_nodes vector simplifies the data structure. These modifications potentially improve performance by reducing indirection and enhance code clarity.
components/core/src/clp/ffi/SchemaTree.cpp (2)
64-67: Proper handling of optional parent_id in revert function

In the revert method, you correctly check if parent_id has a value before using it:
auto const optional_parent_id{node.get_parent_id()};
if (optional_parent_id.has_value()) {
    m_tree_nodes[optional_parent_id.value()].remove_last_appended_child();
}
This ensures safe access and prevents potential null dereferences.

22-26: ⚠️ Potential issue

Handle optional parent_id to prevent null dereference

The locator.get_parent_id() now returns a std::optional<Node::id_t>, but the code assumes it always has a value when casting to size_t:
auto const parent_id{static_cast<size_t>(locator.get_parent_id())};
This can lead to undefined behaviour if get_parent_id() is std::nullopt. Before using parent_id, you should check if it has a value.

Consider modifying the code as follows:
+ if (!locator.get_parent_id().has_value()) {
+     // Handle case where parent_id is null (e.g., root node)
+     return std::nullopt;
+ }
  auto const parent_id{static_cast<size_t>(locator.get_parent_id().value())};
  if (m_tree_nodes.size() <= parent_id) {
-     return false;
+     return std::nullopt;
  }
⛔ Skipped due to learnings
Learnt from: LinZhihao-723
PR: y-scope/clp#554
File: components/core/src/clp/ffi/SchemaTree.cpp:0-0
Timestamp: 2024-10-10T05:51:15.605Z
Learning: In this codebase, `locator.get_parent_id()` returns a non-optional `id_t` and always provides a valid parent ID.
components/core/src/clp/ffi/ir_stream/ir_unit_deserialization_methods.cpp (8)
33-33: Update 'Schema' typedef to align with new 'SchemaTree::Node' type

The 'Schema' typedef has been correctly updated to use 'SchemaTree::Node::id_t', reflecting the changes made to the schema tree node implementation.

41-41: Ensure consistency in return type for 'schema_tree_node_tag_to_type'

The return type has been updated to 'std::optionalSchemaTree::Node::Type', which is consistent with the new 'SchemaTree::Node' structure.

127-129: Update parameter type in 'deserialize_value_and_insert_to_node_id_value_pairs'

The function now correctly uses 'SchemaTree::Node::id_t node_id' as a parameter, aligning with the updated type definitions.

147-149: Reflect type changes in template function 'deserialize_encoded_text_ast_and_insert_to_node_id_value_pairs'

The parameter 'node_id' is updated to 'SchemaTree::Node::id_t', which is consistent with the revised schema node implementation.

177-190: Adjust switch cases in 'schema_tree_node_tag_to_type' for updated node types

The function correctly maps encoded tags to the new 'SchemaTree::Node::Type' enumeration, ensuring accurate deserialization of node types.

309-315: Update 'schema' vector to use new 'SchemaTree::Node::id_t' type

The casting to 'SchemaTree::Node::id_t' is correct, ensuring that the schema IDs are stored using the updated type.

Line range hint 407-421: Consistent use of updated 'node_id' type in template function

The function template correctly reflects the type changes for 'node_id', ensuring consistent deserialization logic.

Line range hint 331-377: Verify proper handling of 'node_id' with updated type in value deserialization

The function 'deserialize_value_and_insert_to_node_id_value_pairs' correctly uses 'SchemaTree::Node::id_t node_id' and should ensure that all usages of 'node_id' are consistent with the new type.

Run this script to confirm that all occurrences of 'node_id' are correctly updated:

✅ Verification successful

Verification Successful: All instances of node_id are using the updated type SchemaTree::Node::id_t.
🏁 Scripts executed

The following scripts were executed for the analysis:

Script:
#!/bin/bash
# Description: Verify that 'node_id' parameters and variables use the updated type throughout the codebase.

# Test: Search for function definitions and usages of 'node_id' with old type.
# Expect: No occurrences found with 'SchemaTreeNode::id_t'.

ast-grep --lang cpp --pattern $'SchemaTreeNode::id_t node_id' 
Length of output: 62

Script:
#!/bin/bash
# Description: Verify that all 'node_id' parameters and variables use the updated type 'SchemaTree::Node::id_t' throughout the codebase.

# Test: Search for function definitions and usages of 'node_id' with the old type 'SchemaTreeNode::id_t'.
# Expect: No occurrences found with 'SchemaTreeNode::id_t'.

ast-grep --lang cpp --pattern $'SchemaTree::Node::id_t node_id'
Length of output: 64

components/core/tests/test-ffi_SchemaTree.cpp

components/core/src/clp/ffi/SchemaTree.cpp

components/core/src/clp/ffi/KeyValuePairLogEvent.cpp

coderabbitai

Actionable comments posted: 2

📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL

📥 Commits

Files that changed from the base of the PR and between eee2b76 and 081405f.

📒 Files selected for processing (2)

components/core/src/clp/ffi/SchemaTree.cpp (4 hunks)
components/core/tests/test-ffi_SchemaTree.cpp (2 hunks)

🧰 Additional context used

📓 Learnings (1)

components/core/src/clp/ffi/SchemaTree.cpp (1)

Learnt from: LinZhihao-723
PR: y-scope/clp#554
File: components/core/src/clp/ffi/SchemaTree.cpp:0-0
Timestamp: 2024-10-10T05:51:15.605Z
Learning: In this codebase, `locator.get_parent_id()` returns a non-optional `id_t` and always provides a valid parent ID.

🔇 Additional comments (10)

components/core/tests/test-ffi_SchemaTree.cpp (7)

20-21: LGTM: Improved function signature

The changes to the insert_node function signature are appropriate:

Using SchemaTree::Node::id_t aligns with the updated schema tree node implementation.

Passing locator by const reference can improve performance by avoiding unnecessary copying.

These modifications enhance consistency and efficiency.

28-29: Grammatical error fixed in function documentation

The documentation for the check_node function has been updated to correct the grammatical error. The phrase now reads "matches the ones" instead of "matched the ones", maintaining consistent tense and improving clarity.

Thank you for addressing this issue from the previous review.

33-34: LGTM: Consistent improvements in function signature

The changes to the check_node function signature mirror those made to insert_node:

Using SchemaTree::Node::id_t maintains consistency with the updated schema tree node implementation.

Passing locator by const reference can enhance performance.

These modifications ensure consistency across the codebase and potentially improve efficiency.

39-40: LGTM: Consistent function signature update

The insert_node function implementation has been updated to match the earlier changes in its signature:

The locator parameter is now a const reference.

The expected ID type is now SchemaTree::Node::id_t.

These changes maintain consistency with the function declaration while preserving the original logic.

48-72: Excellent refactoring of the check_node function

The check_node function has been significantly improved:

It now performs a more comprehensive validation of the node, including checks for:

Node ID existence and matching

Root node status

Parent ID presence for non-root nodes

Consistency between node information and locator data

The implementation correctly handles the new schema tree node structure, where the root node's parent ID is null.

The function returns early if any check fails, improving efficiency.

These changes align well with the PR objectives and provide a more robust validation process for schema tree nodes.

92-97: Improved root node validation in test setup

The test case setup for the root node has been enhanced:

The use of SchemaTree::get_root_node_id() instead of a hardcoded value improves maintainability and reduces the risk of errors if the root ID implementation changes.

The additional check REQUIRE_FALSE(root.get_parent_id().has_value()) ensures that the root node's parent ID is null, aligning with the new schema tree node implementation.

These changes improve the robustness of the test case and ensure it correctly validates the updated schema tree structure.

100-132: Consistent improvements in test case implementation

The test case implementation has been updated to align with the earlier changes:

The locators vector now uses SchemaTree::get_root_node_id() instead of a hardcoded value for the root node ID, improving maintainability and consistency.

Loop counters and comparisons now use SchemaTree::Node::id_t, ensuring type consistency throughout the test case.

These modifications enhance the overall quality of the test case by maintaining consistency with the updated schema tree node implementation and improving code maintainability.
components/core/src/clp/ffi/SchemaTree.cpp (3)
Line range hint 10-17: Method get_node updated correctly to use Node::id_t

The update reflects the change in node representation, and the logic remains correct.

Line range hint 22-36: Properly returning std::nullopt when parent ID is out of bounds

The try_get_node_id function now correctly checks if parent_id is within bounds and returns std::nullopt when it is not, which improves robustness.

22-25: ⚠️ Potential issue

Handle optional parent_id in try_get_node_id method

In the try_get_node_id method, locator.get_parent_id() is used without checking if it has a value:
auto const parent_id{static_cast<size_t>(locator.get_parent_id())};
Since locator.get_parent_id() can be std::nullopt, directly casting it without checking may cause issues. Please ensure you check if locator.get_parent_id() has a value before using it.

Apply this diff to fix the issue:
+ if (!locator.get_parent_id().has_value()) {
+     // Handle root node case appropriately
+     return std::nullopt;
+ }
auto const parent_id{static_cast<size_t>(locator.get_parent_id().value())};
if (m_tree_nodes.size() <= parent_id) {
    return std::nullopt;
}
⛔ Skipped due to learnings
Learnt from: LinZhihao-723
PR: y-scope/clp#554
File: components/core/src/clp/ffi/SchemaTree.cpp:0-0
Timestamp: 2024-10-10T05:51:15.605Z
Learning: In this codebase, `locator.get_parent_id()` returns a non-optional `id_t` and always provides a valid parent ID.

coderabbitai · 2024-10-10T15:26:10Z

components/core/src/clp/ffi/SchemaTree.cpp

+    m_tree_nodes.emplace_back(Node::create(node_id, locator));
    auto& parent_node{m_tree_nodes[locator.get_parent_id()]};
-    if (SchemaTreeNode::Type::Obj != parent_node.get_type()) {
+    if (Node::Type::Obj != parent_node.get_type()) {
        throw OperationFailed(
                ErrorCode_Failure,
                __FILE__,


⚠️ Potential issue

Potential null dereference when accessing parent node in insert_node

In the insert_node method, you access locator.get_parent_id() without checking if it has a value:

auto& parent_node{m_tree_nodes[locator.get_parent_id()]};

Since locator.get_parent_id() can now be std::nullopt for the root node, accessing m_tree_nodes[locator.get_parent_id()] without checking may lead to undefined behaviour.

Consider checking whether locator.get_parent_id() has a value before accessing m_tree_nodes[locator.get_parent_id().value()].

Apply this diff to fix the issue:

+ if (locator.get_parent_id().has_value()) { auto& parent_node{m_tree_nodes[locator.get_parent_id().value()]}; if (Node::Type::Obj != parent_node.get_type()) { throw OperationFailed( ErrorCode_Failure, __FILE__, __LINE__, "Non-object nodes cannot have children." ); } parent_node.append_new_child(node_id); + } else { + // Handle the case where the node is the root node + }

coderabbitai · 2024-10-10T15:26:10Z

components/core/src/clp/ffi/SchemaTree.cpp

+        auto const optional_parent_id{node.get_parent_id()};
+        if (optional_parent_id.has_value()) {
+            m_tree_nodes[optional_parent_id.value()].remove_last_appended_child();
+        }


🛠️ Refactor suggestion

Consistent handling of optional parent IDs

In the revert method, you correctly check if node.get_parent_id() has a value before accessing it:

auto const optional_parent_id{node.get_parent_id()}; if (optional_parent_id.has_value()) { m_tree_nodes[optional_parent_id.value()].remove_last_appended_child(); }

Consider applying the same pattern in other methods, such as insert_node and try_get_node_id, to ensure consistent and safe handling of optional parent IDs throughout the codebase.

components/core/src/clp/ffi/KeyValuePairLogEvent.cpp

components/core/src/clp/ffi/SchemaTree.hpp

components/core/tests/test-ffi_SchemaTree.cpp

kirkrodrigues · 2024-10-11T16:24:55Z

components/core/tests/test-ffi_SchemaTree.cpp

+    if (optional_parent_id.value() != locator.get_parent_id()
+        || node.get_type() != locator.get_type() || node.get_key_name() != locator.get_key_name())


Technically this is implicitly checked by try_get_node.

that is correct, but adding this in the unit test is also checking the functionality of try_get_node works properly right?

Co-authored-by: kirkrodrigues <[email protected]>

coderabbitai

Actionable comments posted: 0

🧹 Outside diff range and nitpick comments (2)

components/core/tests/test-ffi_SchemaTree.cpp (1)
28-29: Improved function documentation with minor grammatical suggestion.

The updated return description for the check_node function provides a more comprehensive explanation of the conditions being checked. This improvement enhances the clarity and usefulness of the documentation.

However, there's a minor grammatical issue that could be addressed:
- * @return Whether an ID could be found for a node matching the locator, the ID matches the expected
- * ID, the corresponding node is not the root, and it matches the locator.
+ * @return Whether an ID could be found for a node matching the locator, the ID matches the expected
+ * ID, the corresponding node is not the root, and the node matches the locator.
This small change improves the readability of the documentation.
components/core/src/clp/ffi/SchemaTree.hpp (1)
94-208: LGTM: Well-designed Node class implementation

The new Node class is well-structured and encapsulates the properties and behaviour of a schema tree node effectively. The use of std::optional<id_t> for the parent ID is a good choice for representing the root node. The factory methods ensure proper initialization, and the const qualifiers and [[nodiscard]] attributes are used appropriately.

However, I have two minor suggestions for improvement:

Consider adding the [[nodiscard]] attribute to the append_new_child method:
-        auto append_new_child(id_t child_id) -> void { m_children_ids.push_back(child_id); }
+        [[nodiscard]] auto append_new_child(id_t child_id) -> void { m_children_ids.push_back(child_id); }
For the get_parent_id_unsafe method, consider throwing a more specific exception type when accessed on a root node, instead of relying on std::optional's behaviour:
         [[nodiscard]] auto get_parent_id_unsafe() const -> id_t {
-            // NOLINTNEXTLINE(bugprone-unchecked-optional-access)
-            return m_parent_id.value();
+            if (!m_parent_id.has_value()) {
+                throw std::logic_error("Attempted to access parent ID of root node");
+            }
+            return *m_parent_id;
         }
These changes would enhance the safety and clarity of the Node class interface.

📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL

📥 Commits

Files that changed from the base of the PR and between 081405f and 4048734.

📒 Files selected for processing (3)

components/core/src/clp/ffi/KeyValuePairLogEvent.cpp (13 hunks)
components/core/src/clp/ffi/SchemaTree.hpp (6 hunks)
components/core/tests/test-ffi_SchemaTree.cpp (2 hunks)

🧰 Additional context used

📓 Learnings (2)

components/core/src/clp/ffi/KeyValuePairLogEvent.cpp (2)

Learnt from: LinZhihao-723
PR: y-scope/clp#554
File: components/core/src/clp/ffi/KeyValuePairLogEvent.cpp:102-102
Timestamp: 2024-10-10T15:21:14.506Z
Learning: In `KeyValuePairLogEvent.cpp`, the `get_next_child_schema_tree_node()` method in `JsonSerializationIterator` is always called after checking `has_next_child_schema_tree_node()`, ensuring proper iterator usage.

Learnt from: LinZhihao-723
PR: y-scope/clp#554
File: components/core/src/clp/ffi/KeyValuePairLogEvent.cpp:109-111
Timestamp: 2024-10-10T15:19:52.408Z
Learning: In `KeyValuePairLogEvent.cpp`, for the class `JsonSerializationIterator`, it's acceptable to use raw pointers for member variables like `m_schema_tree_node`, and there's no need to replace them with references or smart pointers in this use case.

components/core/tests/test-ffi_SchemaTree.cpp (2)

Learnt from: LinZhihao-723
PR: y-scope/clp#549
File: components/core/tests/test-ir_encoding_methods.cpp:110-110
Timestamp: 2024-10-01T08:00:48.347Z
Learning: The function `handle_schema_tree_node_insertion` in `IrUnitHandler` must accept `clp::ffi::SchemaTree::NodeLocator` by value to match the `IrUnitHandlerInterface`.

Learnt from: LinZhihao-723
PR: y-scope/clp#549
File: components/core/tests/test-ir_encoding_methods.cpp:110-110
Timestamp: 2024-10-08T15:52:50.753Z
Learning: The function `handle_schema_tree_node_insertion` in `IrUnitHandler` must accept `clp::ffi::SchemaTree::NodeLocator` by value to match the `IrUnitHandlerInterface`.

🔇 Additional comments (21)

components/core/tests/test-ffi_SchemaTree.cpp (7)

20-21: Excellent changes to the insert_node function signature.

The modifications to the insert_node function signature are well-considered:

Passing locator by const reference (const&) improves performance by avoiding unnecessary copying.

Updating expected_id type to SchemaTree::Node::id_t aligns with the current implementation of the SchemaTree class.

These changes enhance both efficiency and consistency.

33-34: Consistent improvements in the check_node function signature.

The changes to the check_node function signature mirror those made to insert_node:

locator is now passed by const reference, improving performance.

expected_id type is updated to SchemaTree::Node::id_t, maintaining consistency with the SchemaTree class implementation.

These modifications ensure consistency across the codebase and maintain the efficiency improvements.

39-40: Consistent update to insert_node function implementation.

The insert_node function implementation has been correctly updated to match the earlier changes in its signature:

The locator parameter is now a const reference.

The expected_id type is now SchemaTree::Node::id_t.

The function body remains unchanged, which suggests that the core logic is still valid with the new types. This consistency in implementation is commendable.

48-72: Excellent refactoring of the check_node function.

The check_node function has been significantly improved:

More comprehensive validation:

Checks if the node ID exists and matches the expected ID.

Verifies that the node is not the root.

Ensures non-root nodes have a parent ID.

Validates that the node's type and key name match the locator.

Enhanced efficiency:

Early returns when any check fails, avoiding unnecessary processing.

Improved readability:

Clear comments explaining each check.

Logical grouping of related checks.

These changes align perfectly with the updated function documentation and provide a more robust validation process. The refactoring enhances both the functionality and maintainability of the code.

92-97: Improved root node validation in test setup.

The updates to the root node checks in the test setup are well-considered:

Using SchemaTree::get_root_node_id() instead of a hardcoded value improves consistency and makes the test more robust against potential changes in the root node ID implementation.

The new check REQUIRE_FALSE(root.get_parent_id().has_value()) ensures that the root node correctly has no parent ID, aligning with the updated SchemaTree implementation.

These changes enhance the test coverage and make the tests more resilient to future modifications in the SchemaTree class.

100-107: Consistent use of SchemaTree::get_root_node_id() in test locators.

The update to use SchemaTree::get_root_node_id() in the test locators is a positive change:

It maintains consistency with the earlier modifications in the test setup.

This change further reduces hardcoding of the root node ID.

It improves the test's resilience to potential future changes in the root node ID implementation.

This modification ensures that the test cases remain valid even if the underlying implementation of the root node ID changes, contributing to more robust and maintainable tests.

112-112: Improved type consistency in test loops.

The update to use SchemaTree::Node::id_t for id_to_insert and id_to_check in the test loops is a positive change:

It maintains consistency with the earlier modifications in the insert_node and check_node functions.

This change ensures type consistency throughout the test case, reducing the risk of subtle type-related bugs.

It improves code readability by using the specific type defined by the SchemaTree class.

These modifications contribute to a more robust and maintainable test suite by ensuring type consistency across all parts of the test case.

Also applies to: 119-119, 125-125, 131-131

components/core/src/clp/ffi/SchemaTree.hpp (6)

5-7: LGTM: Addition of header

The inclusion of the header is appropriate for the use of fixed-width integer types in the new Node class implementation.

217-231: LGTM: NodeLocator class updates

The changes to the NodeLocator class are consistent with the new Node class implementation. The use of Node::id_t and Node::Type ensures type consistency throughout the SchemaTree implementation.

234-235: LGTM: Addition of get_root_node_id() method

The new static method get_root_node_id() is a good addition, providing a convenient and type-safe way to access the root node ID. The use of constexpr and [[nodiscard]] is appropriate and promotes compile-time optimization and proper usage.

238-238: LGTM: Updated SchemaTree constructor

The modification to the SchemaTree constructor, using the Node::create_root() factory method, ensures proper initialization of the root node and is consistent with the new Node class implementation.

254-255: LGTM: Addition of get_root() method

The new get_root() method is a valuable addition, providing direct and const-correct access to the root node. The use of [[nodiscard]] is appropriate and encourages proper usage of the returned value.

Line range hint 261-307: LGTM: Updated method signatures and member variable

The updates to method signatures using the new Node::id_t type and the change of m_tree_nodes to store Node instances are consistent with the new Node class implementation and improve type safety.

However, I have one concern:

Clarification needed: Removal of reset() method

The reset() method has been removed from the SchemaTree class. Could you please clarify the reasoning behind this removal and confirm that it won't negatively impact existing code that might rely on this method? If the functionality is no longer needed, consider adding a comment explaining why it was removed to maintain clear documentation of the class's evolution.

components/core/src/clp/ffi/KeyValuePairLogEvent.cpp (8)

51-51: Usage of raw pointer for m_schema_tree_node is acceptable in this context

The initialization with SchemaTree::Node const* schema_tree_node is appropriate here, considering the ownership and lifetime are managed elsewhere.

78-78: Correct usage of is_root() method

The check if (m_schema_tree_node->is_root()) enhances readability and accurately handles the root node case.

102-102: Proper implementation of get_next_child_schema_tree_node()

The method correctly retrieves the next child schema tree node, assuming prior verification with has_next_child_schema_tree_node().

109-111: Appropriate member variable declarations for iterator functionality

The member variables m_schema_tree_node, m_child_schema_tree_nodes, and m_child_schema_tree_node_it are appropriately declared to support iteration over child nodes.

192-204: Accurate type matching in node_type_matches_value_type function

The function effectively ensures that the schema node types match the corresponding value types, enhancing type safety during serialization.

243-245: Clear justification for using get_parent_id_unsafe()

The comments explain that since the node is confirmed not to be the root, using get_parent_id_unsafe() is safe and avoids redundant checks.

298-312: Appropriate loop structure to handle optional parent IDs

The use of while (true) with internal checks for optional_parent_id.has_value() addresses clang-tidy warnings and maintains code clarity.

Line range hint 333-361: Correct handling of node types in JSON serialization

The insert_kv_pair_into_json_obj function properly serializes various node types, ensuring that values are accurately represented in the JSON output.

coderabbitai

Actionable comments posted: 0

🧹 Outside diff range and nitpick comments (5)

components/core/tests/test-ffi_SchemaTree.cpp (2)
27-33: LGTM: Improved function naming and documentation.

The changes to the check_non_root_node function (formerly check_node) are appropriate:

The new name better reflects the function's purpose.

The updated documentation clearly describes the function's behaviour.

Passing locator by const reference aligns with the insert_node function.

These modifications improve clarity and consistency.

However, consider a minor improvement to the documentation:
- * @return Whether an ID could be found for a non root node matching the locator, the ID matches the
+ * @return Whether an ID could be found for a non-root node matching the locator, the ID matches the
This change adds a hyphen to "non-root" for consistency with common usage.

45-71: LGTM: Comprehensive node validation implemented.

The new implementation of check_non_root_node significantly improves the validation process:

It checks for the existence and correctness of the node ID.

It verifies that the node is not a root node.

It ensures that non-root nodes have a parent ID.

It compares the node's properties with those specified in the locator.

These changes align well with the PR objectives and enhance the robustness of the schema tree validation.

Consider a minor improvement for better error handling:
 auto check_non_root_node(
         SchemaTree const& schema_tree,
         SchemaTree::NodeLocator const& locator,
         SchemaTree::Node::id_t expected_id
 ) -> bool {
+    try {
         auto const node_id{schema_tree.try_get_node_id(locator)};
         if (false == node_id.has_value() || node_id.value() != expected_id) {
             // The node's ID doesn't match.
             return false;
         }
         auto const& node{schema_tree.get_node(expected_id)};
         if (node.is_root()) {
             // Any nodes added after the tree was constructed must not be the root.
             return false;
         }
         auto const optional_parent_id{node.get_parent_id()};
         if (false == optional_parent_id.has_value()) {
             // Non-root nodes must have a parent ID.
             return false;
         }
         if (optional_parent_id.value() != locator.get_parent_id()
             || node.get_type() != locator.get_type() || node.get_key_name() != locator.get_key_name())
         {
             // The node information doesn't match the locator.
             return false;
         }
         return true;
+    } catch (const std::exception& e) {
+        // Log the error or handle it as appropriate for your application
+        return false;
+    }
 }
This change adds exception handling to catch any unexpected errors that might occur during the validation process, improving the function's robustness.
components/core/src/clp/ffi/SchemaTree.hpp (2)
98-208: Excellent addition of the Node class

The introduction of the Node class significantly improves the encapsulation and representation of schema tree nodes. The use of std::optional<id_t> for the parent ID elegantly solves the issue of representing the root node. The factory methods ensure proper initialization, and the class provides a clear interface for node operations.

Consider adding a static_assert to ensure that the id_t type is large enough to represent all possible node IDs:
static_assert(std::numeric_limits<id_t>::max() >= std::numeric_limits<size_t>::max(),
              "id_t must be large enough to represent all possible node IDs");
This assertion would help prevent potential overflow issues if the schema tree grows very large.

Line range hint 254-288: Well-executed updates to SchemaTree method signatures

The changes to the method signatures in the SchemaTree class are consistent with the new Node class implementation. The use of Node::id_t and Node const& ensures type consistency throughout the class. The addition of the get_root() method provides a convenient way to access the root node directly.

Consider adding a [[nodiscard]] attribute to the get_root() method to encourage proper usage of the returned value:
[[nodiscard]] auto get_root() const -> Node const& { return m_tree_nodes[cRootId]; }
This addition would align with the usage of [[nodiscard]] on other getter methods in the class.
components/core/tests/test-ffi_KeyValuePairLogEvent.cpp (1)

Line range hint 173-195: LGTM: Updated type checks for consistency

The changes to use SchemaTree::Node::Type instead of SchemaTreeNode::Type are consistent with the updated schema tree node implementation. The logic remains intact, and only the type references have been updated.

For consistency, consider using the SchemaTree::Node::Type::Null enum value instead of std::nullopt on line 197. This would make the code more uniform with the other type checks.

📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL

📥 Commits

Files that changed from the base of the PR and between 4048734 and 7cad744.

📒 Files selected for processing (5)

components/core/src/clp/ffi/SchemaTree.hpp (6 hunks)
components/core/src/clp/ffi/ir_stream/Serializer.cpp (11 hunks)
components/core/tests/test-ffi_IrUnitHandlerInterface.cpp (1 hunks)
components/core/tests/test-ffi_KeyValuePairLogEvent.cpp (5 hunks)
components/core/tests/test-ffi_SchemaTree.cpp (2 hunks)

🚧 Files skipped from review as they are similar to previous changes (1)

components/core/tests/test-ffi_IrUnitHandlerInterface.cpp

🧰 Additional context used

📓 Learnings (1)

components/core/tests/test-ffi_SchemaTree.cpp (2)

Learnt from: LinZhihao-723
PR: y-scope/clp#549
File: components/core/tests/test-ir_encoding_methods.cpp:110-110
Timestamp: 2024-10-01T08:00:48.347Z
Learning: The function `handle_schema_tree_node_insertion` in `IrUnitHandler` must accept `clp::ffi::SchemaTree::NodeLocator` by value to match the `IrUnitHandlerInterface`.

Learnt from: LinZhihao-723
PR: y-scope/clp#549
File: components/core/tests/test-ir_encoding_methods.cpp:110-110
Timestamp: 2024-10-08T15:52:50.753Z
Learning: The function `handle_schema_tree_node_insertion` in `IrUnitHandler` must accept `clp::ffi::SchemaTree::NodeLocator` by value to match the `IrUnitHandlerInterface`.

🔇 Additional comments (24)

components/core/tests/test-ffi_SchemaTree.cpp (5)

19-20: LGTM: Improved function signature.

The changes to the insert_node function signature are appropriate:

Passing locator by const reference can improve performance by avoiding unnecessary copying.

Updating expected_id type to SchemaTree::Node::id_t aligns with the transition from SchemaTreeNode to SchemaTree::Node.

These modifications enhance consistency and potentially improve performance.

92-96: LGTM: Improved root node validation.

The changes to the root node check are appropriate:

Using SchemaTree::cRootId instead of a hardcoded value improves code maintainability.

Verifying that the root node is indeed a root and has no parent ID aligns with the PR objectives.

These modifications enhance the robustness of the root node validation.

99-106: LGTM: Updated locators initialization.

The changes to the locators vector initialization are appropriate:

Using SchemaTree::Node::Type instead of SchemaTreeNode::Type aligns with the transition to SchemaTree::Node.

Using SchemaTree::cRootId for the root ID improves code maintainability.

These modifications enhance consistency and maintainability of the test cases.

109-109: LGTM: Consistent type usage for snapshot index.

The change to cast the snapshot index to SchemaTree::Node::id_t is appropriate. It ensures consistency with the use of SchemaTree::Node::id_t throughout the file, aligning with the transition to SchemaTree::Node.

111-119: LGTM: Updated node insertion and checking loops.

The changes to the node insertion and checking loops are appropriate:

Using SchemaTree::Node::id_t for loop variables ensures type consistency.

Using check_non_root_node instead of check_node aligns with the function renaming.

These modifications maintain the test coverage while improving consistency with the updated types and function names.

Also applies to: 124-131

components/core/src/clp/ffi/SchemaTree.hpp (4)

217-231: Appropriate updates to the NodeLocator class

The changes to the NodeLocator class are consistent with the introduction of the Node class. The use of Node::id_t and Node::Type ensures type consistency throughout the SchemaTree implementation. These updates maintain the existing functionality while seamlessly integrating with the new Node type.

235-238: Consistent updates to SchemaTree constants and constructor

The changes to the cRootId constant and the SchemaTree constructor are well-aligned with the new Node class implementation. The use of Node::id_t for cRootId ensures type consistency, and the modification of the constructor to use Node::create_root() guarantees proper initialization of the root node. These changes contribute to a more robust and type-safe implementation of the schema tree.

304-304: Appropriate update to SchemaTree member variable

The modification of the m_tree_nodes member variable to store Node objects instead of SchemaTreeNode objects is a logical and necessary change. This update ensures consistency with the newly introduced Node class and maintains the integrity of the SchemaTree implementation. This change contributes to a more cohesive and type-safe design of the schema tree structure.

Line range hint 1-304: Comprehensive improvement to the SchemaTree implementation

The changes introduced in this file represent a significant enhancement to the SchemaTree class and its associated components. The introduction of the Node class, along with the updates to NodeLocator and SchemaTree, results in a more robust, flexible, and type-safe implementation.

Key improvements include:

Better encapsulation of node properties and behavior through the Node class.

Elegant solution for representing the root node's parent ID using std::optional<id_t>.

Improved type consistency throughout the implementation.

Enhanced factory methods for creating root and non-root nodes.

Maintained backwards compatibility while introducing these improvements.

These changes collectively address the original issue of representing the root node's parent ID while also providing a cleaner and more maintainable codebase. The new implementation sets a solid foundation for future enhancements and makes the code easier to reason about and extend.
components/core/tests/test-ffi_KeyValuePairLogEvent.cpp (6)
80-80: LGTM: Updated parameter type for better encapsulation

The change from SchemaTreeNode::id_t to SchemaTree::Node::id_t for the node_id parameter type improves encapsulation and aligns with the PR's objective of modifying the schema tree node implementation.

267-277: LGTM: Updated NodeLocator definitions

The changes to use SchemaTree::Node::Type in the NodeLocator definitions are consistent with the updated schema tree node implementation. The test case structure and logic remain intact, ensuring that the existing functionality is still properly tested.

440-440: LGTM: Updated type cast for consistency

The change to cast schema_tree->get_size() to SchemaTree::Node::id_t is consistent with the updated schema tree node implementation. This ensures that the out-of-bound node ID test uses the correct type for node IDs.

Line range hint 1-452: Overall assessment: Changes align with PR objectives

The modifications in this file successfully update the schema tree node implementation as intended. The test cases have been appropriately adjusted to reflect these changes while maintaining their original functionality. The transition from SchemaTreeNode to SchemaTree::Node has been consistently applied throughout the file.

A few minor suggestions and clarifications have been requested:

Consider using SchemaTree::Node::Type::Null for consistency in type checks.

Clarification needed on the new implementation of cRootId in the SchemaTree class.

Verification of any remaining uses of SchemaTreeNode in the file.

These changes contribute to improved encapsulation and consistency in the codebase.

Line range hint 1-1: LGTM: Removed unused using declaration

The removal of using clp::ffi::SchemaTreeNode; is consistent with the changes to use SchemaTree::Node instead of SchemaTreeNode throughout the file. This cleanup helps maintain code clarity and removes unused declarations.

To ensure completeness, let's verify that there are no remaining uses of SchemaTreeNode in this file.
#!/bin/bash
# Search for any remaining uses of SchemaTreeNode in the file
rg 'SchemaTreeNode' components/core/tests/test-ffi_KeyValuePairLogEvent.cpp
267-267: LGTM: Updated root ID reference, but clarification needed

The change from cRootId to SchemaTree::cRootId is consistent with the updated schema tree implementation. However, it's not clear from this change alone whether cRootId is now a function or a static member of the SchemaTree class.

Could you please clarify the new implementation of cRootId in the SchemaTree class? Is it a function, a static member, or something else? This information would help in understanding the design choice and ensuring proper usage throughout the codebase.
components/core/src/clp/ffi/ir_stream/Serializer.cpp (9)
46-49: Constructor Parameter Type Updated Correctly

The parameter type in the MsgpackMapIterator constructor has been appropriately updated to SchemaTree::Node::id_t, aligning with the new schema tree node implementation.

55-57: Return Type of get_schema_tree_node_id Updated

The return type of get_schema_tree_node_id() is correctly changed to SchemaTree::Node::id_t, ensuring consistency with the updated schema node types.

73-75: Member Variable Type Updated

The member variable m_schema_tree_node_id now uses SchemaTree::Node::id_t, which reflects the updated node type structure.

85-87: Function Return Type Updated to Optional

The function get_schema_tree_node_type_from_msgpack_val now returns optional<SchemaTree::Node::Type>, which is appropriate given that not all MessagePack values may correspond to a schema tree node type.

149-173: Switch Statement Updated with New Node Types

The switch statement in get_schema_tree_node_type_from_msgpack_val correctly maps MessagePack types to the updated SchemaTree::Node::Type enumeration.

Line range hint 422-426: Serialization Function Parameter Updated

The serialize_key function parameter is correctly updated to SchemaTree::Node::id_t. This ensures consistency with the updated schema node ID types.

Line range hint 438-481: Updated serialize_val Function with New Node Types

The serialize_val function now uses SchemaTree::Node::Type and the switch cases have been updated accordingly. The serialization logic for different node types appears to be accurate and aligns with the new schema definitions.

518-529: Explicit Template Instantiations Updated

The explicit template instantiations for serialize_key and serialize_val have been updated to use SchemaTree::Node::id_t and SchemaTree::Node::Type. This change is consistent with the updated node structure.

383-400: ⚠️ Potential issue

Ensure Proper Handling of Optional parent_id

The code assumes that parent_id is always present, but with the change to std::optional<id_t>, there is a possibility that parent_id is empty (particularly for the root node). The current implementation does not check if parent_id has a value before using it, which may lead to undefined behaviour.

Please modify the code to handle the optional nature of parent_id. Here is a suggested fix:
auto const parent_id_optional{locator.get_parent_id()};
if (parent_id_optional.has_value()) {
    auto const parent_id = parent_id_optional.value();
    if (parent_id <= UINT8_MAX) {
        m_schema_tree_node_buf.push_back(cProtocol::Payload::SchemaTreeNodeParentIdUByte);
        m_schema_tree_node_buf.push_back(bit_cast<int8_t>(static_cast<uint8_t>(parent_id)));
    } else if (parent_id <= UINT16_MAX) {
        m_schema_tree_node_buf.push_back(cProtocol::Payload::SchemaTreeNodeParentIdUShort);
        serialize_int(static_cast<uint16_t>(parent_id), m_schema_tree_node_buf);
    } else {
        // Out of range
        return false;
    }
} else {
    // Handle the case where there is no parent ID (e.g., root node)
    m_schema_tree_node_buf.push_back(cProtocol::Payload::SchemaTreeNodeNoParentId);
}
This adjustment checks for the presence of parent_id and handles the root node appropriately.
⛔ Skipped due to learnings
Learnt from: LinZhihao-723
PR: y-scope/clp#554
File: components/core/src/clp/ffi/SchemaTree.cpp:0-0
Timestamp: 2024-10-10T05:51:15.605Z
Learning: In this codebase, `locator.get_parent_id()` returns a non-optional `id_t` and always provides a valid parent ID.

kirkrodrigues

For the PR title, how about:

ffi: Move `SchemaTreeNode` into `SchemaTree`; Use an `optional` for the parent ID in the schema tree node to track nodes which have no parents (the root).

…he parent ID in the schema tree node to track nodes which have no parents (the root). (y-scope#554) Co-authored-by: kirkrodrigues <[email protected]>

LinZhihao-723 added 2 commits October 10, 2024 00:17

Add support for optional schema tree node parent id

287ddeb

Remove reset to ensure RAII

8a70a07

LinZhihao-723 requested a review from kirkrodrigues October 10, 2024 05:03

coderabbitai bot reviewed Oct 10, 2024

View reviewed changes

LinZhihao-723 added 3 commits October 10, 2024 01:26

Move SchemaTreeNode into SchemaTree

292ac9e

Mark schema tree as friend

e1b94c7

Fix doc string and comment issues

eee2b76

coderabbitai bot reviewed Oct 10, 2024

View reviewed changes

LinZhihao-723 added 2 commits October 10, 2024 11:16

Fix const &

a82ed4f

Fix schema tree issue

081405f

coderabbitai bot reviewed Oct 10, 2024

View reviewed changes

kirkrodrigues requested changes Oct 11, 2024

View reviewed changes

Apply suggestions from code review

4048734

Co-authored-by: kirkrodrigues <[email protected]>

coderabbitai bot reviewed Oct 11, 2024

View reviewed changes

LinZhihao-723 added 2 commits October 11, 2024 13:39

Rename helper function

4021865

Change the static get id method to the raw const id

7cad744

LinZhihao-723 requested a review from kirkrodrigues October 11, 2024 18:03

coderabbitai bot reviewed Oct 11, 2024

View reviewed changes

Merge branch 'oss_main' into optional-parent-id

07be6c3

kirkrodrigues approved these changes Oct 11, 2024

View reviewed changes

LinZhihao-723 changed the title ~~ffi: Change the schema tree node implementation to use an optional ID to track the parent node.~~ ffi: Move SchemaTreeNode into SchemaTree; Use an optional for the parent ID in the schema tree node to track nodes which have no parents (the root). Oct 12, 2024

LinZhihao-723 merged commit de2cf07 into y-scope:main Oct 12, 2024
18 checks passed

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

ffi: Move `SchemaTreeNode` into `SchemaTree`; Use an `optional` for the parent ID in the schema tree node to track nodes which have no parents (the root). #554

ffi: Move `SchemaTreeNode` into `SchemaTree`; Use an `optional` for the parent ID in the schema tree node to track nodes which have no parents (the root). #554

LinZhihao-723 commented Oct 10, 2024 •

edited by coderabbitai bot

Loading

coderabbitai bot commented Oct 10, 2024 •

edited

Loading

Chat

CodeRabbit Commands (Invoked using PR comments)

Other keywords and placeholders

CodeRabbit Configuration File (`.coderabbit.yaml`)

Documentation and Community

coderabbitai bot left a comment

coderabbitai bot left a comment

coderabbitai bot left a comment

coderabbitai bot Oct 10, 2024

coderabbitai bot Oct 10, 2024

kirkrodrigues Oct 11, 2024

LinZhihao-723 Oct 11, 2024

coderabbitai bot left a comment

coderabbitai bot left a comment

kirkrodrigues left a comment

		if (optional_parent_id.value() != locator.get_parent_id()
		\|\| node.get_type() != locator.get_type() \|\| node.get_key_name() != locator.get_key_name())

ffi: Move SchemaTreeNode into SchemaTree; Use an optional for the parent ID in the schema tree node to track nodes which have no parents (the root). #554

ffi: Move SchemaTreeNode into SchemaTree; Use an optional for the parent ID in the schema tree node to track nodes which have no parents (the root). #554

Conversation

LinZhihao-723 commented Oct 10, 2024 • edited by coderabbitai bot Loading

Description

Validation performed

Summary by CodeRabbit

Summary by CodeRabbit

coderabbitai bot commented Oct 10, 2024 • edited Loading

Walkthrough

Changes

Possibly related PRs

Suggested reviewers

Chat

CodeRabbit Commands (Invoked using PR comments)

Other keywords and placeholders

CodeRabbit Configuration File (.coderabbit.yaml)

Documentation and Community

coderabbitai bot left a comment

Choose a reason for hiding this comment

coderabbitai bot left a comment

Choose a reason for hiding this comment

coderabbitai bot left a comment

Choose a reason for hiding this comment

coderabbitai bot Oct 10, 2024

Choose a reason for hiding this comment

coderabbitai bot Oct 10, 2024

Choose a reason for hiding this comment

kirkrodrigues Oct 11, 2024

Choose a reason for hiding this comment

LinZhihao-723 Oct 11, 2024

Choose a reason for hiding this comment

coderabbitai bot left a comment

Choose a reason for hiding this comment

coderabbitai bot left a comment

Choose a reason for hiding this comment

kirkrodrigues left a comment

Choose a reason for hiding this comment

ffi: Move `SchemaTreeNode` into `SchemaTree`; Use an `optional` for the parent ID in the schema tree node to track nodes which have no parents (the root). #554

ffi: Move `SchemaTreeNode` into `SchemaTree`; Use an `optional` for the parent ID in the schema tree node to track nodes which have no parents (the root). #554

LinZhihao-723 commented Oct 10, 2024 •

edited by coderabbitai bot

Loading

coderabbitai bot commented Oct 10, 2024 •

edited

Loading

CodeRabbit Configuration File (`.coderabbit.yaml`)