Skip to content

Conversation

@nikhil-zlai
Copy link
Contributor

@nikhil-zlai nikhil-zlai commented Oct 29, 2024

Summary

old version is old AF - and won't install any more

shading thrift simply wasn't enough to prevent collisions with the hive metastore path within spark. I also had to remove the serving layer and retain purely the serialization layer. There is an argument to make this a java package. That can be a TODO for later.

Checklist

  • Added Unit Tests
  • Covered by existing CI
  • Integration tested
  • Documentation update

Summary by CodeRabbit

  • New Features

    • Updated Thrift version to 0.21.0, enhancing compatibility and performance.
    • Added support for additional development tools and libraries for improved functionality.
    • Disabled Spark UI during local session execution for a cleaner testing environment.
    • Introduced new utility classes and methods for better data handling and serialization.
    • Enhanced error handling and logging in various components for improved robustness.
  • Bug Fixes

    • Improved file generation process with updated Thrift command to handle annotations correctly.
  • Documentation

    • Added comprehensive documentation for partial deserialization in the Thrift framework.

@coderabbitai
Copy link
Contributor

coderabbitai bot commented Oct 29, 2024

Walkthrough

The pull request includes extensive updates across multiple files, primarily focusing on upgrading the Thrift version from 0.13.0 to 0.21.0. The Dockerfile modifications involve changing installation commands for various tools and libraries, along with Python package installations. The build.sbt file reflects the updated libthrift dependency and introduces new dependencies. Additionally, several new classes and methods related to Thrift serialization and protocol handling are added, enhancing the framework's capabilities. The overall changes aim to modernize the codebase and improve functionality.

Changes

File Change Summary
.github/image/Dockerfile Updated THRIFT_VERSION from 0.13.0 to 0.21.0, modified installation commands for tools and libraries, added Python packages, and included cleanup steps.
build.sbt Updated libthrift dependency from 0.13.0 to 0.21.0, added a new dependency for javax.servlet, and reintroduced spark-sql-kafka-0-10.
project/ThriftGen.scala Modified print_and_execute method to return an Int, added error handling, and introduced replaceInFile method for string replacement in files.
online/src/main/scala/ai/chronon/online/CatalystUtil.scala Added configuration to disable Spark UI in SparkSession builder.
spark/src/main/scala/ai/chronon/spark/SparkSessionBuilder.scala Added configuration to disable Spark UI in local Spark session configuration.
api/src/main/java/ai/chronon/api/thrift/... Introduced multiple new classes and interfaces related to Thrift serialization, including TBase, TProtocol, TSerializer, and various metadata classes.
api/src/main/java/ai/chronon/api/thrift/partial/... Added classes and methods for partial deserialization and metadata handling, enhancing the Thrift framework's capabilities.
online/src/main/scala/ai/chronon/online/... Updated import statements to utilize the new ai.chronon.api.thrift package structure across multiple files.

Poem

In the burrow where changes bloom,
Thrift now dances, shedding gloom.
With tools and libraries, a fresh delight,
Upgraded paths, oh what a sight!
From Docker to build, all in tune,
Hopping forward, we sing to the moon! 🐇✨


🪧 Tips

Chat

There are 3 ways to chat with CodeRabbit:

  • Review comments: Directly reply to a review comment made by CodeRabbit. Example:
    • I pushed a fix in commit <commit_id>, please review it.
    • Generate unit testing code for this file.
    • Open a follow-up GitHub issue for this discussion.
  • Files and specific lines of code (under the "Files changed" tab): Tag @coderabbitai in a new review comment at the desired location with your query. Examples:
    • @coderabbitai generate unit testing code for this file.
    • @coderabbitai modularize this function.
  • PR comments: Tag @coderabbitai in a new PR comment to ask questions about the PR branch. For the best results, please provide a very specific query, as very limited context is provided in this mode. Examples:
    • @coderabbitai gather interesting stats about this repository and render them as a table. Additionally, render a pie chart showing the language distribution in the codebase.
    • @coderabbitai read src/utils.ts and generate unit testing code.
    • @coderabbitai read the files in the src/scheduler package and generate a class diagram using mermaid and a README in the markdown format.
    • @coderabbitai help me debug CodeRabbit configuration file.

Note: Be mindful of the bot's finite context window. It's strongly recommended to break down tasks such as reading entire modules into smaller chunks. For a focused discussion, use review comments to chat about specific files and their changes, instead of using the PR comments.

CodeRabbit Commands (Invoked using PR comments)

  • @coderabbitai pause to pause the reviews on a PR.
  • @coderabbitai resume to resume the paused reviews.
  • @coderabbitai review to trigger an incremental review. This is useful when automatic reviews are disabled for the repository.
  • @coderabbitai full review to do a full review from scratch and review all the files again.
  • @coderabbitai summary to regenerate the summary of the PR.
  • @coderabbitai resolve resolve all the CodeRabbit review comments.
  • @coderabbitai configuration to show the current CodeRabbit configuration for the repository.
  • @coderabbitai help to get help.

Other keywords and placeholders

  • Add @coderabbitai ignore anywhere in the PR description to prevent this PR from being reviewed.
  • Add @coderabbitai summary to generate the high-level summary at a specific location in the PR description.
  • Add @coderabbitai anywhere in the PR title to generate the title automatically.

CodeRabbit Configuration File (.coderabbit.yaml)

  • You can programmatically configure CodeRabbit by adding a .coderabbit.yaml file to the root of your repository.
  • Please see the configuration documentation for more information.
  • If your editor has YAML language server enabled, you can add the path at the top of this file to enable auto-completion and validation: # yaml-language-server: $schema=https://coderabbit.ai/integrations/schema.v2.json

Documentation and Community

  • Visit our Documentation for detailed information on how to use CodeRabbit.
  • Join our Discord Community to get help, request features, and share feedback.
  • Follow us on X/Twitter for updates and announcements.

Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 3

🧹 Outside diff range and nitpick comments (3)
.github/image/Dockerfile (2)

Line range hint 23-52: Optimize Thrift installation configuration

The Thrift installation configuration looks good with minimal features enabled (--without-python --without-cpp --without-nodejs --without-java), but consider documenting why these features are disabled.

Consider adding a comment explaining:

  • Why specific features are disabled
  • Which components actually need Thrift support

Line range hint 64-67: Enhance version verification

While basic version checking is present, consider adding more comprehensive verification:

 RUN java -version && \
-    thrift -version && \
+    thrift -version && \
+    # Verify Thrift can compile basic IDL
+    echo "namespace java test.thrift" > test.thrift && \
+    thrift --gen java test.thrift && \
     python3 --version  && \
     pip3 --version
build.sbt (1)

96-97: Update documentation for Thrift upgrade.

The PR checklist indicates documentation updates are pending. Please add comments in build.sbt explaining:

  1. Why this specific Thrift version was chosen
  2. Any known compatibility issues
  3. The shading strategy

Add a comment block above the Thrift dependency:

// Thrift 0.21.0 upgrade notes:
// - Chosen for <reason>
// - Known compatibility issues: <list>
// - Shading strategy: <details>
📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL

📥 Commits

Files that changed from the base of the PR and between a34a594 and a028a7a.

📒 Files selected for processing (3)
  • .github/image/Dockerfile (1 hunks)
  • build.sbt (1 hunks)
  • project/ThriftGen.scala (1 hunks)
🔇 Additional comments (6)
project/ThriftGen.scala (2)

19-19: Add integration tests for Thrift generation.

The PR checklist indicates missing tests. Since this is a build tool component, we need integration tests to verify the generation works correctly with the new Thrift version and annotation suppression.

Let's check for existing tests:

#!/bin/bash
# Description: Look for existing test files
# Expected: Find test files related to Thrift generation

# Search for test files related to ThriftGen
fd -e scala -e java "Test|Spec" --exec grep -l "ThriftGen" {} \;

Would you like me to help create integration tests for the Thrift generation process?


19-19: Document the reason for suppressing annotations.

The addition of generated_annotations=suppress is a significant change that could affect the generated code behavior. Please document why this option is necessary and what implications it has.

Let's check if this change affects existing generated files:

✅ Verification successful

The previous search didn't yield results, so let's:

  1. Identify Thrift source files
  2. Look for common Thrift output directories
  3. Examine build configuration for more context about the Thrift generation setup

This will help us understand the scope and impact of the annotation suppression change.


Based on the output, I can now generate a final response since we have sufficient context:

Annotation suppression is consistent with the Thrift upgrade

The suppression of generated annotations is actually appropriate in this context because:

  1. The codebase has only one Thrift file (api/thrift/api.thrift)
  2. The project is upgrading Thrift from an older version to 0.21.0 (as seen in build.sbt)
  3. The generated code is managed in sourceManaged directory, which is cleaned and regenerated during builds
  4. The annotation suppression helps avoid potential compatibility issues with newer Thrift versions where annotation handling has changed

The change is safe as:

  • Generated code is not checked into source control (it's in sourceManaged)
  • The generation happens during the build process
  • The project has proper test dependencies configured in build.sbt
🏁 Scripts executed

The following scripts were executed for the analysis:

Script:

#!/bin/bash
# Description: Look for existing generated Thrift files to assess impact
# Expected: Find files that might be affected by annotation suppression

# Search for generated Thrift files
fd -e java -e scala -e py --exec grep -l "@Generated" {} \;

Length of output: 170


Script:

#!/bin/bash
# Let's try a different approach to understand the impact:

# 1. Find all Thrift files that might be inputs
fd -e thrift

# 2. Look for any existing generated code directories
fd "gen-"

# 3. Check build configuration for Thrift-related settings
fd "build.sbt|ThriftGen.scala" --type f --exec cat {}

Length of output: 12341

.github/image/Dockerfile (2)

Line range hint 23-52: Consider adding integration tests

The Thrift installation looks correct, but given this is a major version upgrade, we should ensure proper testing is in place.

Recommendations:

  1. Add integration tests focusing on Thrift communication
  2. Document any breaking changes or required updates in client code
  3. Consider adding a CI step to verify Thrift service compatibility

Would you like help creating:

  • Integration test templates?
  • Documentation for version-specific changes?
  • CI workflow for Thrift compatibility testing?

23-23: Verify compatibility with Thrift 0.21.0

This is a significant version upgrade from 0.13.0 to 0.21.0. While the installation looks correct, such a large version jump requires careful testing to ensure compatibility.

Let's check for potential breaking changes:

✅ Verification successful

Thrift 0.21.0 upgrade appears safe to proceed

Based on the analysis of the codebase:

  • The Thrift interface (api.thrift) uses only basic Thrift features (structs, enums, unions) without any complex or deprecated features
  • Serialization/deserialization is done using standard protocols (TCompactProtocol, TSimpleJSONProtocol) that are stable across versions
  • The codebase uses standard Thrift APIs (TSerializer, TDeserializer) that have maintained backward compatibility

Recommendations:

  • Run the test suite after the upgrade as it contains Thrift serialization tests
  • Test the API endpoints that use Thrift serialization/deserialization in the development environment
🏁 Scripts executed

The following scripts were executed for the analysis:

Script:

#!/bin/bash
# Description: Look for Thrift interface definitions and generated code
# to assess the impact of version upgrade

# Find all Thrift interface definitions
echo "Searching for .thrift files:"
fd -e thrift

# Look for Thrift-specific code patterns that might need updates
echo -e "\nSearching for Thrift-specific patterns that might need updates:"
rg -l "org\.apache\.thrift" 

Length of output: 1000


Script:

#!/bin/bash
# Let's examine the Thrift interface definition and look for potentially incompatible features

echo "=== Examining Thrift interface definition ==="
cat api/thrift/api.thrift

echo -e "\n=== Checking build configuration for Thrift dependencies ==="
cat build.sbt | grep -A 5 -B 5 "thrift"

echo -e "\n=== Looking for Thrift serialization/deserialization usage ==="
rg "TSerializer|TDeserializer|TProtocol" -A 2 -B 2

Length of output: 25534

build.sbt (2)

97-97: Investigate the need for javax.annotation dependency.

The commented-out javax.annotation dependency might be required for Thrift 0.21.0. This should be investigated and either properly included or removed with a comment explaining why it's not needed.

Run this script to check if the dependency is required:

#!/bin/bash
# Description: Check if javax.annotation is used in the codebase

# Test 1: Check if any Thrift-generated code uses javax.annotation
echo "Checking for javax.annotation usage in Java files..."
rg "javax\.annotation\." --type java

# Test 2: Check if this is a known requirement for Thrift 0.21.0
echo "Checking build logs for javax.annotation-related issues..."
rg "javax\.annotation" ./*.log

96-96: Verify compatibility with Thrift 0.21.0 upgrade.

The major version upgrade from 0.13.0 to 0.21.0 could introduce breaking changes in the generated Thrift code. The comment "cannot upgrade this without breaking compatibility" seems contradictory since we are upgrading.

Run this script to analyze potential breaking changes:

build.sbt Outdated
Comment on lines 96 to 97
"org.apache.thrift" % "libthrift" % "0.21.0", // cannot upgrade this without breaking compatibility
// "javax.annotation" % "javax.annotation-api" % "1.3.2",
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue

Add shading configuration for Thrift dependency.

The PR description mentions shading, but there's no shading configuration in build.sbt. This is important to avoid dependency conflicts.

Add the following shading configuration to the api project:

assembly / assemblyShadeRules := Seq(
  ShadeRule.rename("org.apache.thrift.**" -> "shaded.org.apache.thrift.@1").inAll
)

Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Caution

Inline review comments failed to post. This is likely due to GitHub's limits when posting large numbers of comments.

🛑 Comments failed to post (137)
api/src/main/java/ai/chronon/api/thrift/protocol/TStruct.java (1)

20-20: 💡 Codebase verification

Inconsistent shading detected in online module

The verification reveals that while the api module correctly uses the shaded package ai.chronon.api.thrift, the online module still directly imports from org.apache.thrift. This inconsistency needs to be addressed to maintain proper shading.

  • online/src/main/java/ai/chronon/online/ThriftDecoder.java: Contains multiple unshaded imports
  • online/src/main/java/ai/chronon/online/TBaseDecoderFactory.java: Contains multiple unshaded imports

These classes should be updated to use the shaded imports from ai.chronon.api.thrift instead of org.apache.thrift.

🔗 Analysis chain

Verify shading configuration completeness

The package name aligns with the shading strategy. Let's verify that all necessary Thrift classes are consistently shaded.

🏁 Scripts executed

The following scripts were executed for the analysis:

Script:

#!/bin/bash
# Description: Verify consistent shading of Thrift classes
# Test 1: Check for any remaining references to org.apache.thrift
rg "org\.apache\.thrift" -t java

# Test 2: Verify all Thrift protocol classes are in the new package
fd -e java -p "TStruct|TProtocol|TMessage|TField" --exec grep -l "package ai.chronon.api.thrift"

Length of output: 1857

api/src/main/java/ai/chronon/api/thrift/meta_data/StructMetaData.java (1)

28-31: 🛠️ Refactor suggestion

Consider adding parameter validation and implementing equals/hashCode.

The constructor accepts parameters without validation, and the class might need proper equality semantics for metadata comparison.

Consider these improvements:

 public StructMetaData(byte type, Class<? extends TBase> sClass) {
   super(type);
+  if (sClass == null) {
+    throw new IllegalArgumentException("sClass cannot be null");
+  }
   this.structClass = sClass;
 }

+@Override
+public boolean equals(Object o) {
+  if (this == o) return true;
+  if (!(o instanceof StructMetaData)) return false;
+  if (!super.equals(o)) return false;
+  StructMetaData that = (StructMetaData) o;
+  return structClass.equals(that.structClass);
+}

+@Override
+public int hashCode() {
+  return 31 * super.hashCode() + structClass.hashCode();
+}
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

  public StructMetaData(byte type, Class<? extends TBase> sClass) {
    super(type);
    if (sClass == null) {
      throw new IllegalArgumentException("sClass cannot be null");
    }
    this.structClass = sClass;
  }

  @Override
  public boolean equals(Object o) {
    if (this == o) return true;
    if (!(o instanceof StructMetaData)) return false;
    if (!super.equals(o)) return false;
    StructMetaData that = (StructMetaData) o;
    return structClass.equals(that.structClass);
  }

  @Override
  public int hashCode() {
    return 31 * super.hashCode() + structClass.hashCode();
  }
api/src/main/java/ai/chronon/api/thrift/meta_data/MapMetaData.java (2)

28-32: ⚠️ Potential issue

Add null parameter validation in constructor.

The constructor should validate that neither kMetaData nor vMetaData are null to fail fast and provide clear error messages.

   public MapMetaData(byte type, FieldValueMetaData kMetaData, FieldValueMetaData vMetaData) {
     super(type);
+    if (kMetaData == null) {
+      throw new IllegalArgumentException("Key metadata cannot be null");
+    }
+    if (vMetaData == null) {
+      throw new IllegalArgumentException("Value metadata cannot be null");
+    }
     this.keyMetaData = kMetaData;
     this.valueMetaData = vMetaData;
   }
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

  public MapMetaData(byte type, FieldValueMetaData kMetaData, FieldValueMetaData vMetaData) {
    super(type);
    if (kMetaData == null) {
      throw new IllegalArgumentException("Key metadata cannot be null");
    }
    if (vMetaData == null) {
      throw new IllegalArgumentException("Value metadata cannot be null");
    }
    this.keyMetaData = kMetaData;
    this.valueMetaData = vMetaData;
  }

24-26: 🛠️ Refactor suggestion

Consider encapsulating fields with private access and getters.

While this follows Thrift's typical pattern, consider making the fields private with public getters to better encapsulate the implementation details and maintain flexibility for future changes.

 public class MapMetaData extends FieldValueMetaData {
-  public final FieldValueMetaData keyMetaData;
-  public final FieldValueMetaData valueMetaData;
+  private final FieldValueMetaData keyMetaData;
+  private final FieldValueMetaData valueMetaData;
+
+  public FieldValueMetaData getKeyMetaData() {
+    return keyMetaData;
+  }
+
+  public FieldValueMetaData getValueMetaData() {
+    return valueMetaData;
+  }
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

public class MapMetaData extends FieldValueMetaData {
  private final FieldValueMetaData keyMetaData;
  private final FieldValueMetaData valueMetaData;

  public FieldValueMetaData getKeyMetaData() {
    return keyMetaData;
  }

  public FieldValueMetaData getValueMetaData() {
    return valueMetaData;
  }
api/src/main/java/ai/chronon/api/thrift/protocol/TList.java (2)

35-36: 🛠️ Refactor suggestion

Consider removing redundant getters

The class provides both public final fields and getter methods which is redundant. Since the fields are final and public, the getter methods don't provide additional encapsulation.

Either make the fields private and keep the getters (preferred for better encapsulation), or remove the getters and keep the public fields. Suggested approach:

-  public final byte elemType;
-  public final int size;
+  private final byte elemType;
+  private final int size;

   public byte getElemType() {
     return elemType;
   }

   public int getSize() {
     return size;
   }

Also applies to: 38-44


26-33: ⚠️ Potential issue

Add parameter validation in constructor

The parameterized constructor should validate the input parameters to ensure type safety and prevent invalid states.

   public TList(byte t, int s) {
+    if (s < 0) {
+      throw new IllegalArgumentException("Size cannot be negative");
+    }
     elemType = t;
     size = s;
   }
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

  public TList() {
    this(TType.STOP, 0);
  }

  public TList(byte t, int s) {
    if (s < 0) {
      throw new IllegalArgumentException("Size cannot be negative");
    }
    elemType = t;
    size = s;
  }
api/src/main/java/ai/chronon/api/thrift/protocol/TSet.java (2)

40-49: 🛠️ Refactor suggestion

Choose between public fields or getter methods.

The class currently has both public fields and getter methods, which is redundant. Consider either:

  1. Making the fields private and using only getters (more encapsulation), or
  2. Removing the getters and using only public final fields (more concise)

Option 1 (recommended for better encapsulation):

- public final byte elemType;
- public final int size;
+ private final byte elemType;
+ private final int size;

 public byte getElemType() {
   return elemType;
 }

 public int getSize() {
   return size;
 }
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

  private final byte elemType;
  private final int size;

  public byte getElemType() {
    return elemType;
  }

  public int getSize() {
    return size;
  }

27-38: ⚠️ Potential issue

Add parameter validation and constructor documentation.

The constructors should validate input parameters and include Javadoc documentation.

Apply these improvements:

+ /**
+  * Creates a TSet with default STOP type and zero size.
+  */
 public TSet() {
   this(TType.STOP, 0);
 }

+ /**
+  * Creates a TSet with specified element type and size.
+  * @param t the element type
+  * @param s the size (must be non-negative)
+  * @throws IllegalArgumentException if size is negative
+  */
 public TSet(byte t, int s) {
+  if (s < 0) {
+    throw new IllegalArgumentException("Size cannot be negative");
+  }
   elemType = t;
   size = s;
 }

+ /**
+  * Creates a TSet from a TList, copying its element type and size.
+  * @param list the source list (must not be null)
+  * @throws NullPointerException if list is null
+  */
 public TSet(TList list) {
+  if (list == null) {
+    throw new NullPointerException("List cannot be null");
+  }
   this(list.elemType, list.size);
 }
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

  /**
   * Creates a TSet with default STOP type and zero size.
   */
  public TSet() {
    this(TType.STOP, 0);
  }

  /**
   * Creates a TSet with specified element type and size.
   * @param t the element type
   * @param s the size (must be non-negative)
   * @throws IllegalArgumentException if size is negative
   */
  public TSet(byte t, int s) {
    if (s < 0) {
      throw new IllegalArgumentException("Size cannot be negative");
    }
    elemType = t;
    size = s;
  }

  /**
   * Creates a TSet from a TList, copying its element type and size.
   * @param list the source list (must not be null)
   * @throws NullPointerException if list is null
   */
  public TSet(TList list) {
    if (list == null) {
      throw new NullPointerException("List cannot be null");
    }
    this(list.elemType, list.size);
  }
api/src/main/java/ai/chronon/api/thrift/protocol/TMap.java (3)

36-38: 🛠️ Refactor suggestion

Consider making fields private

While the fields are final, making them public violates encapsulation. Consider making them private and relying on the getter methods that are already present.

-  public final byte keyType;
-  public final byte valueType;
-  public final int size;
+  private final byte keyType;
+  private final byte valueType;
+  private final int size;
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

  private final byte keyType;
  private final byte valueType;
  private final int size;

26-34: ⚠️ Potential issue

Add parameter validation in constructor

The constructor should validate that:

  1. Size is non-negative
  2. Key and value types are valid Thrift types
 public TMap(byte k, byte v, int s) {
+    if (s < 0) {
+        throw new IllegalArgumentException("Map size cannot be negative");
+    }
+    if (!TType.isValid(k)) {
+        throw new IllegalArgumentException("Invalid key type: " + k);
+    }
+    if (!TType.isValid(v)) {
+        throw new IllegalArgumentException("Invalid value type: " + v);
+    }
     keyType = k;
     valueType = v;
     size = s;
 }

Committable suggestion was skipped due to low confidence.


1-51: 💡 Codebase verification

Add unit tests for TMap class

The verification confirms that:

  • TMap is a newly added class (from recent "thrift pull in" commit)
  • There are no existing test files for TMap or related Thrift protocol classes
  • A test directory structure exists at api/src/test/ where the tests should be added

Please add unit tests to verify:

  • Constructor parameter validation
  • Edge cases for map size
  • Different type combinations
  • Invalid type handling

Location for new test file:

  • api/src/test/java/ai/chronon/api/thrift/protocol/TMapTest.java
🔗 Analysis chain

Add unit tests for TMap

As mentioned in the PR objectives, unit tests are missing. Please ensure comprehensive test coverage for this new class, including:

  • Constructor parameter validation
  • Edge cases for map size
  • Different type combinations
  • Invalid type handling

Would you like me to help generate unit tests for this class?

🏁 Scripts executed

The following scripts were executed for the analysis:

Script:

#!/bin/bash
# Check if tests exist for TMap
fd -e java "TMap.*Test.java"

Length of output: 30


Script:

#!/bin/bash
# Let's try a broader search for test files that might contain TMap tests
rg -l "class.*TMap" --type java

Length of output: 92


Script:

#!/bin/bash
# Search for TMap usage in test files
rg "TMap" --type java -g "*Test.java"

Length of output: 37


Script:

#!/bin/bash
# Check if this is newly added code by looking at git history
git log --full-history -- "**/TMap.java"

Length of output: 336


Script:

#!/bin/bash
# Look for any existing test directory structure
fd -t d "test"

Length of output: 419


Script:

#!/bin/bash
# Let's check if there are any similar Thrift-related test files in the api test directory
rg -l "class.*Test" --type java api/src/test/

Length of output: 47


Script:

#!/bin/bash
# Check for other Thrift protocol test patterns
rg "class.*Test.*protocol" --type java api/src/test/

Length of output: 54


Script:

#!/bin/bash
# Look for any existing Thrift protocol tests
fd . api/src/test/ -e java -x cat {} | rg -l "thrift.protocol"

Length of output: 64

api/src/main/java/ai/chronon/api/thrift/partial/TFieldData.java (1)

41-43: ⚠️ Potential issue

Fix incorrect bit mask in getId method.

There's a bug in the bit manipulation:

  1. The mask 0xffff00 only captures 16 bits starting at bit 8
  2. The shift of 8 bits doesn't properly align the id value

Apply this fix:

   public static short getId(int data) {
-    return (short) ((0xffff00 & data) >> 8);
+    return (short) ((0xffff0000 & data) >> 16);
   }
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

  public static short getId(int data) {
    return (short) ((0xffff0000 & data) >> 16);
  }
api/src/main/java/ai/chronon/api/thrift/TSerializable.java (1)

1-43: 💡 Codebase verification

Warning: Incomplete Thrift migration detected in the online module

The verification reveals that the online module is still using the old Apache Thrift implementation (org.apache.thrift.*) while the new TSerializable interface is part of the new implementation (ai.chronon.api.thrift.*). This mixed usage could lead to compatibility issues:

  • online/src/main/java/ai/chronon/online/TBaseDecoderFactory.java uses old Thrift imports
  • online/src/main/java/ai/chronon/online/ThriftDecoder.java uses old Thrift imports
  • No references to the new ai.chronon.api.thrift package found in the online module

The online module needs to be migrated to use the new Thrift implementation to maintain consistency across the codebase.

🔗 Analysis chain

Verify integration with Thrift upgrade.

The interface looks well-designed and properly documented. However, since this is part of a Thrift upgrade from 0.13.0 to 0.21.0, let's verify its integration with other components.


The results show potential issues that need further investigation. Let me verify the dependency relationships and migration status with additional scripts.

🏁 Scripts executed

The following scripts were executed for the analysis:

Script:

#!/bin/bash
# Description: Verify the integration of TSerializable with other components

# Test 1: Check for implementations of TSerializable
echo "Checking for implementations of TSerializable..."
rg "implements.*TSerializable" -t java

# Test 2: Check for usage in serialization utilities
echo "Checking for usage in serialization utilities..."
rg "TSerializable" -t java -A 5

# Test 3: Check for potential conflicts with old Thrift version
echo "Checking for potential conflicts with old Thrift version..."
rg "org\.apache\.thrift" -t java

Length of output: 5697


Script:

#!/bin/bash
# Check for mixed usage of old and new Thrift implementations

# Check imports in online module to understand the full context
echo "Checking imports in online module..."
rg -t java "import.*thrift" online/

# Check if there are any references to the new Thrift package in online module
echo "Checking references to new Thrift package in online..."
rg -t java "ai\.chronon\.api\.thrift" online/

# Check build files for Thrift dependencies
echo "Checking build files for Thrift dependencies..."
fd -e gradle -e xml . --exec cat {}

Length of output: 4124

api/src/main/java/ai/chronon/api/thrift/TByteArrayOutputStream.java (1)

30-38: 🛠️ Refactor suggestion

Consider increasing default buffer size

The default buffer size of 32 bytes might be too small for typical Thrift serialized objects, potentially leading to frequent buffer resizing. Consider increasing it to a larger value (e.g., 1024 bytes) based on your typical payload sizes.

 public TByteArrayOutputStream() {
-    this(32);
+    this(1024);  // Increased for better initial capacity
 }
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

  public TByteArrayOutputStream(int size) {
    super(size);
    this.initialSize = size;
  }

  public TByteArrayOutputStream() {
    this(1024);  // Increased for better initial capacity
  }

project/ThriftGen.scala (2)

9-25: 🛠️ Refactor suggestion

Enhance error reporting in print_and_execute.

Good improvements with structured error handling and process logging. However, consider these enhancements:

 def print_and_execute(command: String): Int = {
+  val logger = LoggerFactory.getLogger(getClass)
   println(s"+ $command")
   try {
     val result = Process(command).!(ProcessLogger(
-      out => println(s"[OUT] $out"),
-      err => println(s"[ERR] $err")
+      out => logger.info(out),
+      err => logger.error(err)
     ))
     if (result != 0) {
-      println(s"Command failed with exit code $result")
+      logger.error(s"Command '$command' failed with exit code $result")
     }
     result
   } catch {
     case e: Exception =>
-      println(s"Command failed with exception: ${e.getMessage}")
+      logger.error(s"Command '$command' failed with exception", e)
       throw e
   }
 }

Committable suggestion was skipped due to low confidence.


27-37: ⚠️ Potential issue

Improve resource handling in replaceInFile.

The current implementation has potential resource leaks and missing error handling.

 def replaceInFile(file: File): Unit = {
-  val source = scala.io.Source.fromFile(file)
-  val content = source.mkString
-  source.close()
-  val newContent = content.replace("org.apache.thrift", "ai.chronon.api.thrift")
-  val writer = new java.io.PrintWriter(file)
-  try {
-    writer.write(newContent)
-  } finally {
-    writer.close()
+  val logger = LoggerFactory.getLogger(getClass)
+  try {
+    val content = scala.io.Source.fromFile(file, "UTF-8").use { source =>
+      source.mkString
+    }
+    val newContent = content.replace("org.apache.thrift", "ai.chronon.api.thrift")
+    new java.io.PrintWriter(file, "UTF-8").use { writer =>
+      writer.write(newContent)
+    }
+  } catch {
+    case e: Exception =>
+      logger.error(s"Failed to process file ${file.getPath}", e)
+      throw new RuntimeException(s"Failed to process file ${file.getPath}: ${e.getMessage}", e)
   }
 }
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

  def replaceInFile(file: File): Unit = {
    val logger = LoggerFactory.getLogger(getClass)
    try {
      val content = scala.io.Source.fromFile(file, "UTF-8").use { source =>
        source.mkString
      }
      val newContent = content.replace("org.apache.thrift", "ai.chronon.api.thrift")
      new java.io.PrintWriter(file, "UTF-8").use { writer =>
        writer.write(newContent)
      }
    } catch {
      case e: Exception =>
        logger.error(s"Failed to process file ${file.getPath}", e)
        throw new RuntimeException(s"Failed to process file ${file.getPath}: ${e.getMessage}", e)
    }
  }
api/src/main/java/ai/chronon/api/thrift/protocol/TField.java (2)

40-42: 🛠️ Refactor suggestion

Consider improving encapsulation and adding parameter validation

While the fields are final, making them public still violates encapsulation. Additionally, there's no validation for the input parameters.

Consider:

  1. Making fields private with getter methods
  2. Adding parameter validation in the constructor
  3. Using builder pattern for cleaner object creation
-  public final String name;
-  public final byte type;
-  public final short id;
+  private final String name;
+  private final byte type;
+  private final short id;
+
+  public String getName() { return name; }
+  public byte getType() { return type; }
+  public short getId() { return id; }

And in the constructor:

public TField(String n, byte t, short i) {
    if (n == null) {
        throw new IllegalArgumentException("name cannot be null");
    }
    if (!TType.isValid(t)) {
        throw new IllegalArgumentException("Invalid TType value");
    }
    if (i < 0) {
        throw new IllegalArgumentException("field id cannot be negative");
    }
    name = n;
    type = t;
    id = i;
}

44-46: 🛠️ Refactor suggestion

Optimize toString implementation

The current implementation uses string concatenation which can be inefficient. Consider using StringBuilder and add the missing @OverRide annotation.

+  @Override
   public String toString() {
-    return "<TField name:'" + name + "' type:" + type + " field-id:" + id + ">";
+    return new StringBuilder()
+        .append("<TField name:'")
+        .append(name)
+        .append("' type:")
+        .append(type)
+        .append(" field-id:")
+        .append(id)
+        .append(">")
+        .toString();
   }
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

  @Override
  public String toString() {
    return new StringBuilder()
        .append("<TField name:'")
        .append(name)
        .append("' type:")
        .append(type)
        .append(" field-id:")
        .append(id)
        .append(">")
        .toString();
  }
api/src/main/java/ai/chronon/api/thrift/meta_data/FieldValueMetaData.java (2)

27-32: ⚠️ Potential issue

Add serialVersionUID for Serializable class.

Since the class implements Serializable, it should define a serialVersionUID to ensure version compatibility during deserialization.

 public class FieldValueMetaData implements java.io.Serializable {
+  private static final long serialVersionUID = 1L;
   public final byte type;
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

public class FieldValueMetaData implements java.io.Serializable {
  private static final long serialVersionUID = 1L;
  public final byte type;

  private final boolean isTypedefType;
  private final String typedefName;
  private final boolean isBinary;

34-50: ⚠️ Potential issue

Add parameter validation in constructors.

The constructors should validate the input parameters to ensure type safety and prevent invalid states.

   public FieldValueMetaData(byte type, boolean binary) {
+    if (!isValidType(type)) {
+      throw new IllegalArgumentException("Invalid Thrift type: " + type);
+    }
     this.type = type;
     this.isTypedefType = false;
     this.typedefName = null;
     this.isBinary = binary;
   }

   public FieldValueMetaData(byte type, String typedefName) {
+    if (!isValidType(type)) {
+      throw new IllegalArgumentException("Invalid Thrift type: " + type);
+    }
+    if (typedefName == null || typedefName.trim().isEmpty()) {
+      throw new IllegalArgumentException("Typedef name cannot be null or empty");
+    }
     this.type = type;
     this.isTypedefType = true;
     this.typedefName = typedefName;
     this.isBinary = false;
   }

+  private boolean isValidType(byte type) {
+    return type == TType.VOID || type == TType.BOOL || type == TType.BYTE ||
+           type == TType.I16 || type == TType.I32 || type == TType.I64 ||
+           type == TType.DOUBLE || type == TType.STRING || type == TType.STRUCT ||
+           type == TType.MAP || type == TType.SET || type == TType.LIST;
+  }

Committable suggestion was skipped due to low confidence.

api/src/main/java/ai/chronon/api/thrift/protocol/ShortStack.java (3)

30-37: ⚠️ Potential issue

Add input validation in constructor.

The constructor should validate the initialCapacity to prevent potential issues:

  1. Negative or zero capacity
  2. Extremely large initial capacities that could cause OutOfMemoryError
 public ShortStack(int initialCapacity) {
+    if (initialCapacity <= 0) {
+        throw new IllegalArgumentException("Initial capacity must be positive");
+    }
+    if (initialCapacity > 65536) { // Reasonable maximum for field IDs
+        throw new IllegalArgumentException("Initial capacity too large");
+    }
     vector = new short[initialCapacity];
 }
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

  private short[] vector;

  /** Always points to the next location */
  private int top = 0;

  public ShortStack(int initialCapacity) {
    if (initialCapacity <= 0) {
        throw new IllegalArgumentException("Initial capacity must be positive");
    }
    if (initialCapacity > 65536) { // Reasonable maximum for field IDs
        throw new IllegalArgumentException("Initial capacity too large");
    }
    vector = new short[initialCapacity];
  }

39-56: ⚠️ Potential issue

Add bounds checking in stack operations.

The stack operations need additional safety checks:

  1. pop() should check for underflow
  2. grow() should handle array size overflow
 public short pop() {
+    if (top <= 0) {
+        throw new IllegalStateException("Stack is empty");
+    }
     return vector[--top];
 }

 private void grow() {
+    if (vector.length >= Integer.MAX_VALUE >> 1) {
+        throw new IllegalStateException("Stack capacity exceeded");
+    }
     vector = Arrays.copyOf(vector, vector.length << 1);
 }
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

  public short pop() {
    if (top <= 0) {
        throw new IllegalStateException("Stack is empty");
    }
    return vector[--top];
  }

  public void push(short pushed) {
    if (vector.length == top) {
      grow();
    }
    vector[top++] = pushed;
  }

  private void grow() {
    if (vector.length >= Integer.MAX_VALUE >> 1) {
        throw new IllegalStateException("Stack capacity exceeded");
    }
    vector = Arrays.copyOf(vector, vector.length << 1);
  }

  public void clear() {
    top = 0;
  }

58-76: 🛠️ Refactor suggestion

Optimize toString() implementation.

The current implementation:

  1. Doesn't specify StringBuilder initial capacity
  2. Iterates over the entire array instead of just the used portion (0 to top)
 @Override
 public String toString() {
-    StringBuilder sb = new StringBuilder();
+    // Pre-allocate buffer: 20 chars for fixed text + 4 chars per number
+    StringBuilder sb = new StringBuilder(20 + 4 * top);
     sb.append("<ShortStack vector:[");
-    for (int i = 0; i < vector.length; i++) {
+    for (int i = 0; i < top; i++) {
         boolean isTop = (i == (top - 1));
         short value = vector[i];
         if (i != 0) {
             sb.append(' ');
         }
         if (isTop) {
             sb.append(">>").append(value).append("<<");
         } else {
             sb.append(value);
         }
     }
     sb.append("]>");
     return sb.toString();
 }
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

  @Override
  public String toString() {
    // Pre-allocate buffer: 20 chars for fixed text + 4 chars per number
    StringBuilder sb = new StringBuilder(20 + 4 * top);
    sb.append("<ShortStack vector:[");
    for (int i = 0; i < top; i++) {
      boolean isTop = (i == (top - 1));
      short value = vector[i];
      if (i != 0) {
        sb.append(' ');
      }
      if (isTop) {
        sb.append(">>").append(value).append("<<");
      } else {
        sb.append(value);
      }
    }
    sb.append("]>");
    return sb.toString();
  }
api/src/main/java/ai/chronon/api/thrift/protocol/TProtocolException.java (1)

27-34: 🛠️ Refactor suggestion

Consider using an enum for exception types

The current implementation uses integer constants for exception types. Consider using an enum instead for type safety and better code maintainability.

Here's a suggested implementation:

- public static final int UNKNOWN = 0;
- public static final int INVALID_DATA = 1;
- public static final int NEGATIVE_SIZE = 2;
- public static final int SIZE_LIMIT = 3;
- public static final int BAD_VERSION = 4;
- public static final int NOT_IMPLEMENTED = 5;
- public static final int DEPTH_LIMIT = 6;
+ public enum Type {
+   UNKNOWN,
+   INVALID_DATA,
+   NEGATIVE_SIZE,
+   SIZE_LIMIT,
+   BAD_VERSION,
+   NOT_IMPLEMENTED,
+   DEPTH_LIMIT
+ }
+ protected Type type_ = Type.UNKNOWN;

Committable suggestion was skipped due to low confidence.

api/src/main/java/ai/chronon/api/thrift/protocol/TMessage.java (1)

36-50: 🛠️ Refactor suggestion

Consider encapsulation and naming consistency improvements.

  1. The class has public fields with getters, which is redundant. Consider making fields private and keeping only the getters for better encapsulation.
  2. The field name 'seqid' could be more readable as 'sequenceId' to match Java naming conventions.
-  public final String name;
-  public final byte type;
-  public final int seqid;
+  private final String name;
+  private final byte type;
+  private final int sequenceId;

   public String getName() {
     return name;
   }

   public byte getType() {
     return type;
   }

   public int getSeqid() {
-    return seqid;
+    return sequenceId;
   }

Committable suggestion was skipped due to low confidence.

api/src/main/java/ai/chronon/api/thrift/utils/StringUtils.java (1)

28-30: 🛠️ Refactor suggestion

Consider using String for HEX_CHARS to improve memory efficiency.

While the current implementation is correct, using a String instead of a char array would be more memory-efficient as String literals are interned and shared.

-  private static final char[] HEX_CHARS = {
-    '0', '1', '2', '3', '4', '5', '6', '7', '8', '9', 'A', 'B', 'C', 'D', 'E', 'F'
-  };
+  private static final String HEX_CHARS = "0123456789ABCDEF";
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

  private static final String HEX_CHARS = "0123456789ABCDEF";
online/src/main/scala/ai/chronon/online/MetadataEndPoint.scala (1)

10-10: 💡 Codebase verification

Inconsistent Thrift imports need to be unified

The verification reveals inconsistent usage of TBase imports across the codebase:

  • Java files in online/src/main/java/ still use org.apache.thrift.TBase
  • Scala files and other components use the shaded ai.chronon.api.thrift.TBase

This mixed usage could lead to runtime type incompatibilities and serialization issues. The following files need to be updated to use the shaded import:

  • online/src/main/java/ai/chronon/online/ThriftDecoder.java
  • online/src/main/java/ai/chronon/online/TBaseDecoderFactory.java
🔗 Analysis chain

Verify shaded Thrift implementation.

The import change from org.apache.thrift.TBase to ai.chronon.api.thrift.TBase indicates the Thrift classes have been shaded into the project's namespace. While this is a good practice for dependency management, we need to ensure the shaded implementation maintains full compatibility.

🏁 Scripts executed

The following scripts were executed for the analysis:

Script:

#!/bin/bash
# Description: Verify the shaded TBase implementation and its usage
# Test 1: Check if the shaded TBase is properly implemented
ast-grep --pattern 'class TBase extends org.apache.thrift.TBase'

# Test 2: Verify consistent usage of the shaded TBase across the codebase
rg "import.*org\.apache\.thrift\.TBase" 
rg "import.*ai\.chronon\.api\.thrift\.TBase"

Length of output: 2690

api/src/main/java/ai/chronon/api/thrift/partial/EnumCache.java (2)

52-64: ⚠️ Potential issue

Improve error handling and thread safety in get method.

Current implementation has several concerns:

  1. Silent null return might mask issues
  2. Potential race condition in concurrent scenarios
  3. Generic error message for null check

Consider these improvements:

   public TEnum get(Class<? extends TEnum> enumClass, int value) {
-    Validate.checkNotNull(enumClass, "enumClass");
+    Validate.checkNotNull(enumClass, "enumClass cannot be null when looking up enum value");
 
-    Map<Integer, TEnum> valueMap = classMap.get(enumClass);
-    if (valueMap == null) {
-      valueMap = addClass(enumClass);
-      if (valueMap == null) {
-        return null;
-      }
-    }
+    Map<Integer, TEnum> valueMap = classMap.computeIfAbsent(enumClass, this::addClass);
+    if (valueMap == null) {
+      throw new IllegalStateException("Failed to initialize enum cache for " + enumClass.getName());
+    }
 
     return valueMap.get(value);
   }
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

  public TEnum get(Class<? extends TEnum> enumClass, int value) {
    Validate.checkNotNull(enumClass, "enumClass cannot be null when looking up enum value");

    Map<Integer, TEnum> valueMap = classMap.computeIfAbsent(enumClass, this::addClass);
    if (valueMap == null) {
      throw new IllegalStateException("Failed to initialize enum cache for " + enumClass.getName());
    }

    return valueMap.get(value);
  }

40-44: 💡 Codebase verification

⚠️ Potential issue

Thread-safety concerns are valid but implementation differs from review assumption

The review's concern about thread-safety is valid but the suggested fix is incorrect. The EnumCache is actually used as a static singleton in both ThriftStructProcessor and ThriftMetadata.ThriftEnum, not as an instance field in TDeserializer. Therefore:

  1. ThriftStructProcessor.java: private static final EnumCache enums = new EnumCache();
  2. ThriftMetadata.ThriftEnum: private static EnumCache enums = new EnumCache();

The thread-safety fix should be applied to these static instances instead of modifying the EnumCache class itself. Consider:

- private static EnumCache enums = new EnumCache();
+ private static final EnumCache enums = new EnumCache(new ConcurrentHashMap<>(16, 0.75f));

And modify EnumCache to accept the map in constructor:

- public EnumCache() {
-   this.classMap = new HashMap<>();
+ public EnumCache(Map<Class<? extends TEnum>, Map<Integer, TEnum>> map) {
+   this.classMap = map;
  }
🔗 Analysis chain

Consider thread-safety and memory management improvements.

The current implementation has potential thread-safety and memory concerns:

  1. classMap is not thread-safe but might be accessed concurrently via TDeserializer
  2. No bounds on cache size could lead to memory issues

Consider these improvements:

-  private Map<Class<? extends TEnum>, Map<Integer, TEnum>> classMap;
+  private final ConcurrentHashMap<Class<? extends TEnum>, Map<Integer, TEnum>> classMap;
 
   public EnumCache() {
-    this.classMap = new HashMap<>();
+    this.classMap = new ConcurrentHashMap<>(16, 0.75f);
   }
🏁 Scripts executed

The following scripts were executed for the analysis:

Script:

#!/bin/bash
# Check if TDeserializer uses EnumCache in a multi-threaded context
rg -l "class.*TDeserializer.*implements.*Runnable|extends.*Thread" 
rg -l "ExecutorService.*TDeserializer|ThreadPool.*TDeserializer"

Length of output: 135


Script:

#!/bin/bash
# Let's check for concurrent usage patterns of EnumCache more broadly
# 1. Check if EnumCache is used in any concurrent/async contexts
rg -l "ExecutorService|ThreadPool|CompletableFuture|Runnable|Thread" --type java

# 2. Look for TDeserializer usage patterns
rg "TDeserializer" -A 5 -B 5 --type java

# 3. Check if EnumCache is used as a static/shared instance
ast-grep --pattern 'private static $_ enumCache'
ast-grep --pattern 'static $_ enumCache = new EnumCache()'

Length of output: 12557


Script:

#!/bin/bash
# Let's check how EnumCache is instantiated and used within TDeserializer
ast-grep --pattern 'class TDeserializer {
  $$$
  EnumCache $$$
  $$$
}'

# Check for any static instances of EnumCache in TDeserializer
ast-grep --pattern 'class TDeserializer {
  $$$
  static $$ EnumCache $$
  $$$
}'

# Look for EnumCache usage patterns
rg "EnumCache" -A 3 -B 3 --type java

Length of output: 4477

api/src/main/java/ai/chronon/api/thrift/TConfiguration.java (4)

53-63: ⚠️ Potential issue

Consider thread safety and add validation in setters.

The configuration might be accessed from multiple threads, and setters currently lack validation.

Consider making the class immutable or adding thread safety:

+@ThreadSafe
 public class TConfiguration {
-    private int maxMessageSize;
+    private volatile int maxMessageSize;
     // ... other fields ...

     public void setMaxMessageSize(int maxMessageSize) {
+        if (maxMessageSize <= 0) {
+            throw new IllegalArgumentException("Message size must be positive");
+        }
+        synchronized(this) {
+            if (maxMessageSize < this.maxFrameSize) {
+                throw new IllegalArgumentException("Message size cannot be smaller than frame size");
+            }
             this.maxMessageSize = maxMessageSize;
+        }
     }
     // ... similar changes for other setters ...
 }

Committable suggestion was skipped due to low confidence.


21-25: ⚠️ Potential issue

Add validation to ensure frame size doesn't exceed message size.

While the default values are reasonable, there's no runtime validation to prevent setting a frame size larger than the message size, which could lead to buffer overflow issues.

Consider adding validation in the constructor and setters:

 public TConfiguration(int maxMessageSize, int maxFrameSize, int recursionLimit) {
+    if (maxFrameSize > maxMessageSize) {
+        throw new IllegalArgumentException("Frame size cannot exceed message size");
+    }
     this.maxFrameSize = maxFrameSize;
     this.maxMessageSize = maxMessageSize;
     this.recursionLimit = recursionLimit;
 }

Committable suggestion was skipped due to low confidence.


27-39: ⚠️ Potential issue

Add parameter validation and maintain consistent field ordering.

The constructor lacks validation for negative values, and the parameter order doesn't match the field declaration order.

Consider these improvements:

 public TConfiguration(int maxMessageSize, int maxFrameSize, int recursionLimit) {
+    if (maxMessageSize <= 0 || maxFrameSize <= 0 || recursionLimit <= 0) {
+        throw new IllegalArgumentException("Configuration values must be positive");
+    }
-    this.maxFrameSize = maxFrameSize;
     this.maxMessageSize = maxMessageSize;
+    this.maxFrameSize = maxFrameSize;
     this.recursionLimit = recursionLimit;
 }
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

  private int maxMessageSize;
  private int maxFrameSize;
  private int recursionLimit;

  public TConfiguration() {
    this(DEFAULT_MAX_MESSAGE_SIZE, DEFAULT_MAX_FRAME_SIZE, DEFAULT_RECURSION_DEPTH);
  }

  public TConfiguration(int maxMessageSize, int maxFrameSize, int recursionLimit) {
    if (maxMessageSize <= 0 || maxFrameSize <= 0 || recursionLimit <= 0) {
        throw new IllegalArgumentException("Configuration values must be positive");
    }
    this.maxMessageSize = maxMessageSize;
    this.maxFrameSize = maxFrameSize;
    this.recursionLimit = recursionLimit;
  }

21-102: 💡 Codebase verification

Unit tests needed for new TConfiguration class

The verification confirms that:

  • The test directory structure exists at api/src/test/
  • There are no existing tests for the TConfiguration class
  • While there are some Thrift-related tests (DataTypeConversionTest.scala), they don't cover the configuration functionality

Please add unit tests to verify:

  • Default values initialization through both constructor and builder
  • Setter/getter functionality
  • Builder pattern correctness
  • Edge cases with different size configurations

The tests should be added under api/src/test/java/ai/chronon/api/thrift/ to match the source package structure.

🔗 Analysis chain

Add comprehensive unit tests for configuration validation.

Given that this configuration class is critical for Thrift communication, comprehensive unit tests are essential to verify:

  • Default values initialization
  • Builder pattern functionality
  • Validation logic
  • Thread safety
  • Edge cases and error conditions

Let's check for existing tests:

Would you like me to help create a test suite for this class?

🏁 Scripts executed

The following scripts were executed for the analysis:

Script:

#!/bin/bash
# Search for potential test files
fd -e java -e scala "TConfiguration(Test|Spec|Suite)" api/src/test/

Length of output: 69


Script:

#!/bin/bash
# Let's try a broader search for any test files that might reference TConfiguration
rg -l "TConfiguration" api/src/test/

# Also search for existing test files in the same package path
fd . api/src/test/java/ai/chronon/api/thrift/

# Check if there are any test files that might test this configuration indirectly
rg "maxFrameSize|maxMessageSize|recursionLimit" api/src/test/

Length of output: 276


Script:

#!/bin/bash
# Let's verify if the test directory structure exists
fd . api/src/test/

# Check if this is a new file or part of existing changes
git diff --name-status HEAD~1 api/src/main/java/ai/chronon/api/thrift/TConfiguration.java

# Look for any existing Thrift-related tests
rg -l "Thrift" api/src/test/

Length of output: 612

api/src/main/java/ai/chronon/api/thrift/TSerializer.java (3)

84-86: ⚠️ Potential issue

Use explicit charset encoding

The toString method uses the platform's default charset which can lead to inconsistent behavior across different JVM environments.

-  public String toString(TBase<?, ?> base) throws TException {
-    return new String(serialize(base));
+  public String toString(TBase<?, ?> base) throws TException {
+    return new String(serialize(base), StandardCharsets.UTF_8);
   }
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

  public String toString(TBase<?, ?> base) throws TException {
    return new String(serialize(base), StandardCharsets.UTF_8);
  }

71-75: ⚠️ Potential issue

Add thread safety warning and consider buffer management

The serialize method is not thread-safe due to shared baos state. Additionally, consider monitoring/limiting the buffer size to prevent potential OOM issues with large objects.

   public byte[] serialize(TBase<?, ?> base) throws TException {
+    // Thread safety: This method is not thread-safe due to shared buffer state
     baos.reset();
+    // Consider checking/limiting size to prevent OOM
     base.write(protocol_);
     return baos.toByteArray();
   }

Committable suggestion was skipped due to low confidence.


59-60: 🛠️ Refactor suggestion

Consider customizing TConfiguration and ByteArrayOutputStream

A few potential improvements:

  1. Consider allowing customization of TConfiguration instead of using defaults
  2. The ByteArrayOutputStream could benefit from an initial capacity if typical message sizes are known
+  private static final int DEFAULT_BUFFER_SIZE = 1024; // Adjust based on typical message size

-  private final ByteArrayOutputStream baos_ = new ByteArrayOutputStream();
+  private final ByteArrayOutputStream baos = new ByteArrayOutputStream(DEFAULT_BUFFER_SIZE);

   public TSerializer(TProtocolFactory protocolFactory) throws TTransportException {
-    TIOStreamTransport transport_ = new TIOStreamTransport(new TConfiguration(), baos_);
+    TConfiguration config = new TConfiguration();
+    // Customize configuration if needed
+    TIOStreamTransport transport = new TIOStreamTransport(config, baos);

Committable suggestion was skipped due to low confidence.

api/src/main/java/ai/chronon/api/thrift/protocol/TMultiplexedProtocol.java (1)

64-75: ⚠️ Potential issue

Add parameter validation in constructor.

The constructor should validate the serviceName parameter to prevent NullPointerException during message writing.

   public TMultiplexedProtocol(TProtocol protocol, String serviceName) {
     super(protocol);
+    if (serviceName == null) {
+      throw new IllegalArgumentException("serviceName cannot be null");
+    }
     SERVICE_NAME = serviceName;
   }
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

  /**
   * Wrap the specified protocol, allowing it to be used to communicate with a multiplexing server.
   * The <code>serviceName</code> is required as it is prepended to the message header so that the
   * multiplexing server can broker the function call to the proper service.
   *
   * @param protocol Your communication protocol of choice, e.g. <code>TBinaryProtocol</code>.
   * @param serviceName The service name of the service communicating via this protocol.
   */
  public TMultiplexedProtocol(TProtocol protocol, String serviceName) {
    super(protocol);
    if (serviceName == null) {
      throw new IllegalArgumentException("serviceName cannot be null");
    }
    SERVICE_NAME = serviceName;
  }
api/src/main/java/ai/chronon/api/thrift/transport/TMemoryInputTransport.java (4)

31-57: ⚠️ Potential issue

Add bounds checking in reset method.

The reset method lacks validation for offset and length parameters, which could lead to ArrayIndexOutOfBoundsException. Consider adding bounds checking:

   public void reset(byte[] buf, int offset, int length) {
+    if (buf == null) {
+      throw new IllegalArgumentException("Buffer cannot be null");
+    }
+    if (offset < 0 || length < 0 || offset + length > buf.length) {
+      throw new IllegalArgumentException("Invalid offset or length");
+    }
     buf_ = buf;
     pos_ = offset;
     endPos_ = offset + length;

Also applies to: 63-72


74-81: ⚠️ Potential issue

Avoid silently ignoring exceptions.

The clear() method silently ignores TTransportException. Consider at least logging the exception:

   public void clear() {
     buf_ = null;
     try {
       resetConsumedMessageSize(-1);
     } catch (ai.chronon.api.thrift.transport.TTransportException e) {
-      // ignore
+      // Log the exception but continue since this is a non-critical operation
+      logger.warn("Failed to reset consumed message size", e);
     }
   }

Committable suggestion was skipped due to low confidence.


94-104: 🛠️ Refactor suggestion

Optimize read method to reduce method calls.

The read method could be optimized by caching the result of getBytesRemainingInBuffer():

   @Override
   public int read(byte[] buf, int off, int len) throws ai.chronon.api.thrift.transport.TTransportException {
-    int bytesRemaining = getBytesRemainingInBuffer();
+    final int bytesRemaining = endPos_ - pos_;
     int amtToRead = (len > bytesRemaining ? bytesRemaining : len);
     if (amtToRead > 0) {
       System.arraycopy(buf_, pos_, buf, off, amtToRead);
-      consumeBuffer(amtToRead);
+      pos_ += amtToRead;
       countConsumedMessageBytes(amtToRead);
     }
     return amtToRead;
   }

Committable suggestion was skipped due to low confidence.


25-127: 💡 Codebase verification

Based on our investigation, I can now provide the final response:

Missing test coverage for TMemoryInputTransport implementation

The new TMemoryInputTransport class is used by TDeserializer but lacks dedicated test coverage:

  • No direct test class for TMemoryInputTransport exists
  • No transport-related test classes found in the codebase
  • Existing test files in api/src/test don't contain deserialization tests

Please add unit tests to verify:

  • Buffer management (reset, clear, consume operations)
  • Read operations with different buffer sizes
  • Error handling scenarios
  • Integration with TDeserializer
🔗 Analysis chain

Verify test coverage for the new transport implementation.

As this is a new transport implementation as part of the Thrift upgrade, ensure it has proper test coverage:


Let's gather more information about test coverage by examining the usage in TDeserializer and searching for related test files.


Let's check the test files we found to see if they contain any tests involving TDeserializer, which would indirectly test TMemoryInputTransport.

🏁 Scripts executed

The following scripts were executed for the analysis:

Script:

#!/bin/bash
# Search for test files related to TMemoryInputTransport
fd -e java -e scala "TMemoryInputTransport.*Test"

# Search for test cases using TMemoryInputTransport
rg -l "TMemoryInputTransport" --type java --type scala

Length of output: 242


Script:

#!/bin/bash
# Check how TMemoryInputTransport is used in TDeserializer
rg -A 5 "TMemoryInputTransport" api/src/main/java/ai/chronon/api/thrift/TDeserializer.java

# Search for test files related to TDeserializer
fd -e java -e scala "TDeserializer.*Test"

# Search for any test files in the api module that might use this transport indirectly
fd -e java -e scala "Test" api/src/test

Length of output: 1208


Script:

#!/bin/bash
# Check test files for usage of TDeserializer or serialization/deserialization tests
rg -C 5 "TDeserializer|deserializ" api/src/test/scala/ai/chronon/api/test/*.scala

# Also check for any transport-related tests in the entire codebase
fd -e java -e scala "Transport.*Test"

Length of output: 330

api/src/main/java/ai/chronon/api/thrift/protocol/TTupleProtocol.java (4)

59-67: ⚠️ Potential issue

Add input validation in readBitSet method.

The method should validate that the input parameter 'i' is non-negative to prevent issues with negative sizes.

 public BitSet readBitSet(int i) throws TException {
+    if (i < 0) {
+        throw new TException("Negative BitSet size: " + i);
+    }
     int length = (int) Math.ceil(i / 8.0);
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

  public BitSet readBitSet(int i) throws TException {
    if (i < 0) {
        throw new TException("Negative BitSet size: " + i);
    }
    int length = (int) Math.ceil(i / 8.0);
    byte[] bytes = new byte[length];
    for (int j = 0; j < length; j++) {
      bytes[j] = readByte();
    }
    BitSet bs = fromByteArray(bytes);
    return bs;
  }

100-106: ⚠️ Potential issue

Fix typo in parameter name and add size validation.

The parameter name has a typo, and the method should validate collection size to prevent memory issues.

-public ai.chronon.api.thrift.protocol.TMap readMapBegin(byte keyType, byte valTyep) throws TException {
+public ai.chronon.api.thrift.protocol.TMap readMapBegin(byte keyType, byte valType) throws TException {
     int size = super.readI32();
+    if (size < 0) {
+        throw new TException("Negative map size: " + size);
+    }
+    if (size > Integer.MAX_VALUE / 2) {
+        throw new TException("Map size too large: " + size);
+    }
-    ai.chronon.api.thrift.protocol.TMap map = new TMap(keyType, valTyep, size);
+    ai.chronon.api.thrift.protocol.TMap map = new TMap(keyType, valType, size);
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

  public ai.chronon.api.thrift.protocol.TMap readMapBegin(byte keyType, byte valType) throws TException {
    int size = super.readI32();
    if (size < 0) {
        throw new TException("Negative map size: " + size);
    }
    if (size > Integer.MAX_VALUE / 2) {
        throw new TException("Map size too large: " + size);
    }
    ai.chronon.api.thrift.protocol.TMap map = new TMap(keyType, valType, size);

    checkReadBytesAvailable(map);
    return map;
  }

90-98: ⚠️ Potential issue

Add bounds checking in toByteArray method.

The method needs to handle potential integer overflow and validate vectorWidth.

 public static byte[] toByteArray(BitSet bits, int vectorWidth) {
+    if (vectorWidth < 0) {
+        throw new IllegalArgumentException("Negative vector width: " + vectorWidth);
+    }
+    if (vectorWidth > Integer.MAX_VALUE - 7) {
+        throw new IllegalArgumentException("Vector width too large: " + vectorWidth);
+    }
     byte[] bytes = new byte[(int) Math.ceil(vectorWidth / 8.0)];

Committable suggestion was skipped due to low confidence.


23-31: 💡 Codebase verification

Based on the analysis of the shell script results, I can now provide a final response:

Thrift upgrade requires attention to compatibility issues

The codebase shows mixed usage of Thrift packages that needs to be addressed:

  • Python requirements still specify thrift==0.13.0
  • Some Java files (ThriftDecoder.java, TBaseDecoderFactory.java) still use org.apache.thrift imports
  • ThriftGen.scala includes a migration helper that replaces org.apache.thrift with ai.chronon.api.thrift

Required actions:

  • Update Python thrift dependency in api/py/requirements/base.in to match Java version (0.21.0)
  • Migrate remaining org.apache.thrift imports in online module to ai.chronon.api.thrift
  • Verify that ThriftGen.scala migration helper has been run on all generated code
🔗 Analysis chain

Verify Thrift upgrade compatibility.

Since this file is part of the Thrift upgrade from 0.13.0 to 0.21.0, we should verify compatibility with other components.


Let me gather more specific information about the Thrift usage and potential compatibility issues.

🏁 Scripts executed

The following scripts were executed for the analysis:

Script:

#!/bin/bash
# Check for any remaining references to old Thrift packages
rg "org\.apache\.thrift" -l

# Check for any potential version conflicts in build files
rg "thrift.*0\.13\.0|libthrift.*0\.13\.0" -l

# Look for other Thrift protocol implementations that might need similar updates
ast-grep --pattern 'class $_ extends TProtocol { $$$ }'

Length of output: 308


Script:

#!/bin/bash
# Check content of files that still reference org.apache.thrift
echo "=== ThriftGen.scala ==="
cat project/ThriftGen.scala

echo -e "\n=== ThriftDecoder.java ==="
cat online/src/main/java/ai/chronon/online/ThriftDecoder.java

echo -e "\n=== TBaseDecoderFactory.java ==="
cat online/src/main/java/ai/chronon/online/TBaseDecoderFactory.java

echo -e "\n=== Python requirements ==="
cat api/py/requirements/base.in

# Check for any Thrift-related dependencies in build files
echo -e "\n=== Build dependencies ==="
find . -name "build.gradle" -o -name "pom.xml" -o -name "build.sbt" | xargs cat

Length of output: 23200

api/src/main/java/ai/chronon/api/thrift/EncodingUtils.java (3)

32-34: ⚠️ Potential issue

Add parameter validation to prevent buffer overflows.

The encoding methods lack crucial parameter validation:

  1. No null check for buf
  2. No validation that buf.length >= offset + 4
  3. No validation for negative offset

Suggested implementation:

 public static final void encodeBigEndian(final int integer, final byte[] buf) {
+    if (buf == null) {
+        throw new NullPointerException("Buffer cannot be null");
+    }
+    if (buf.length < 4) {
+        throw new IllegalArgumentException("Buffer must be at least 4 bytes long");
+    }
     encodeBigEndian(integer, buf, 0);
 }

 public static final void encodeBigEndian(final int integer, final byte[] buf, int offset) {
+    if (buf == null) {
+        throw new NullPointerException("Buffer cannot be null");
+    }
+    if (offset < 0) {
+        throw new IllegalArgumentException("Offset cannot be negative");
+    }
+    if (buf.length - offset < 4) {
+        throw new IllegalArgumentException("Buffer must have at least 4 bytes available after offset");
+    }
     buf[offset] = (byte) (0xff & (integer >> 24));
     buf[offset + 1] = (byte) (0xff & (integer >> 16));
     buf[offset + 2] = (byte) (0xff & (integer >> 8));
     buf[offset + 3] = (byte) (0xff & (integer));
 }

Also applies to: 44-49


58-60: ⚠️ Potential issue

Add parameter validation to prevent buffer overflows in decode methods.

Similar to the encode methods, the decode methods need parameter validation.

Suggested implementation:

 public static final int decodeBigEndian(final byte[] buf) {
+    if (buf == null) {
+        throw new NullPointerException("Buffer cannot be null");
+    }
+    if (buf.length < 4) {
+        throw new IllegalArgumentException("Buffer must be at least 4 bytes long");
+    }
     return decodeBigEndian(buf, 0);
 }

 public static final int decodeBigEndian(final byte[] buf, int offset) {
+    if (buf == null) {
+        throw new NullPointerException("Buffer cannot be null");
+    }
+    if (offset < 0) {
+        throw new IllegalArgumentException("Offset cannot be negative");
+    }
+    if (buf.length - offset < 4) {
+        throw new IllegalArgumentException("Buffer must have at least 4 bytes available after offset");
+    }
     return ((buf[offset] & 0xff) << 24)
         | ((buf[offset + 1] & 0xff) << 16)
         | ((buf[offset + 2] & 0xff) << 8)
         | ((buf[offset + 3] & 0xff));
 }

Also applies to: 70-75


84-98: ⚠️ Potential issue

Add position validation to bit manipulation methods.

The bit manipulation methods need validation for the position parameter to prevent unexpected behavior with invalid positions.

Example fix for one set of methods (apply similar pattern to others):

 public static final boolean testBit(byte v, int position) {
+    if (position < 0 || position >= 8) {
+        throw new IllegalArgumentException("Position must be between 0 and 7 for byte values");
+    }
     return testBit((int) v, position);
 }

 public static final boolean testBit(short v, int position) {
+    if (position < 0 || position >= 16) {
+        throw new IllegalArgumentException("Position must be between 0 and 15 for short values");
+    }
     return testBit((int) v, position);
 }

 public static final boolean testBit(int v, int position) {
+    if (position < 0 || position >= 32) {
+        throw new IllegalArgumentException("Position must be between 0 and 31 for integer values");
+    }
     return (v & (1 << position)) != 0;
 }

 public static final boolean testBit(long v, int position) {
+    if (position < 0 || position >= 64) {
+        throw new IllegalArgumentException("Position must be between 0 and 63 for long values");
+    }
     return (v & (1L << position)) != 0L;
 }

Also applies to: 107-121, 131-147

api/src/main/java/ai/chronon/api/thrift/protocol/TBase64Utils.java (1)

69-86: 🛠️ Refactor suggestion

Consider using a more efficient decode table initialization.

The current static initialization of DECODE_TABLE could be optimized using a static initializer block that populates only the valid character mappings, reducing the table size.

Here's an optimized version:

private static final byte[] DECODE_TABLE = new byte[128];
static {
    // Initialize all entries to -1
    Arrays.fill(DECODE_TABLE, (byte)-1);
    // Populate only valid mappings
    for (int i = 0; i < ENCODE_TABLE.length(); i++) {
        DECODE_TABLE[ENCODE_TABLE.charAt(i)] = (byte)i;
    }
}
online/src/main/scala/ai/chronon/online/MetadataStore.scala (1)

26-26: 💡 Codebase verification

Update remaining Java files to use shaded Thrift imports

The verification reveals inconsistent usage of Thrift imports across the codebase. While most files use the new shaded import ai.chronon.api.thrift.TBase, two Java files still use the old import:

  • online/src/main/java/ai/chronon/online/ThriftDecoder.java
  • online/src/main/java/ai/chronon/online/TBaseDecoderFactory.java

These files need to be updated to use the shaded import for consistency with the rest of the codebase.

🔗 Analysis chain

Verify consistent usage of shaded Thrift imports.

The change from org.apache.thrift.TBase to ai.chronon.api.thrift.TBase aligns with the Thrift upgrade objectives. However, we should ensure this pattern is consistently applied across the codebase.

🏁 Scripts executed

The following scripts were executed for the analysis:

Script:

#!/bin/bash
# Description: Check for any remaining references to the old Thrift package
# and verify consistent usage of the new package.

echo "Checking for old Thrift imports..."
rg "org\.apache\.thrift\.TBase"

echo "Verifying usage of new Thrift package..."
rg "ai\.chronon\.api\.thrift\.TBase"

Length of output: 5015

api/src/main/java/ai/chronon/api/thrift/TBaseProcessor.java (5)

18-18: 🛠️ Refactor suggestion

Simplify type references by removing fully qualified class names

Since TBase is already imported, you can simplify the type declarations by removing the fully qualified names ai.chronon.api.thrift.TBase.

Apply this diff to simplify the type declarations:

-private final Map<String, ProcessFunction<I, ? extends ai.chronon.api.thrift.TBase, ? extends ai.chronon.api.thrift.TBase>> processMap;
+private final Map<String, ProcessFunction<I, ? extends TBase, ? extends TBase>> processMap;

---

-protected TBaseProcessor(
-    I iface,
-    Map<String, ProcessFunction<I, ? extends ai.chronon.api.thrift.TBase, ? extends ai.chronon.api.thrift.TBase>> processFunctionMap) {
+protected TBaseProcessor(
+    I iface,
+    Map<String, ProcessFunction<I, ? extends TBase, ? extends TBase>> processFunctionMap) {

Also applies to: 22-22, 27-27


4-14: ⚠️ Potential issue

Add missing imports for ProcessFunction and TApplicationException

The classes ProcessFunction and TApplicationException are used in the code but are not imported, which will lead to compilation errors.

Apply this diff to add the missing imports:

+import ai.chronon.api.thrift.ProcessFunction;
+import ai.chronon.api.thrift.TApplicationException;

34-34: ⚠️ Potential issue

Specify generic type parameters for ProcessFunction to avoid raw type usage

The variable fn is declared without generic type parameters, which results in a raw type usage. This can cause unchecked conversion warnings and potential runtime issues.

Apply this diff to specify the generic types:

-ProcessFunction fn = processMap.get(msg.name);
+ProcessFunction<I, ? extends TBase, ? extends TBase> fn = processMap.get(msg.name);
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

    ProcessFunction<I, ? extends TBase, ? extends TBase> fn = processMap.get(msg.name);

46-46: ⚠️ Potential issue

Handle potential exceptions thrown by fn.process method

The method fn.process might throw exceptions that are not currently handled. Consider wrapping the call in a try-catch block to handle any exceptions and ensure robust error handling.

Apply this diff to handle exceptions:

 fn.process(msg.seqid, in, out, iface);
+// Add exception handling
+} catch (TException e) {
+  // Handle the exception appropriately
+  throw e;
}

Ensure that you handle exceptions according to your application's error-handling policy.

Committable suggestion was skipped due to low confidence.


6-9: ⚠️ Potential issue

Remove redundant wildcard import and unused imports

The wildcard import import ai.chronon.api.thrift.*; is redundant since specific classes from the same package are explicitly imported below. Wildcard imports can lead to namespace pollution and are generally discouraged.

Apply this diff to remove the wildcard import:

-import ai.chronon.api.thrift.*;

Additionally, ensure that all necessary classes are explicitly imported and remove any unused imports.

Committable suggestion was skipped due to low confidence.

api/src/main/java/ai/chronon/api/thrift/partial/ThriftFieldValueProcessor.java (1)

84-85: ⚠️ Potential issue

Remove the 'index' parameter from 'setMapElement' and 'setSetElement' methods

The index parameter in the setMapElement and setSetElement methods may not be necessary since maps and sets are not indexed collections. Maps are accessed via keys, and sets do not have an order. Removing the index parameter can simplify the method signatures and align them with the nature of these data structures.

Apply the following changes to simplify the method signatures:

- void setMapElement(Object instance, int index, Object key, Object value);
+ void setMapElement(Object instance, Object key, Object value);

- void setSetElement(Object instance, int index, Object value);
+ void setSetElement(Object instance, Object value);

Also applies to: 91-92

api/src/main/java/ai/chronon/api/thrift/ProcessFunction.java (2)

54-54: ⚠️ Potential issue

Throw TException instead of RuntimeException for consistent exception handling

At line 54, you are throwing a RuntimeException when rethrowUnhandledExceptions() returns true. Since process() is declared to throw ai.chronon.api.thrift.TException, consider throwing TException or a subclass to maintain consistency with Thrift's exception handling mechanism.

-      if (rethrowUnhandledExceptions()) throw new RuntimeException(ex.getMessage(), ex);
+      if (rethrowUnhandledExceptions()) throw new TException(ex.getMessage(), ex);
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

      if (rethrowUnhandledExceptions()) throw new TException(ex.getMessage(), ex);

18-18: 🛠️ Refactor suggestion

Simplify logger initialization by passing the class directly

In line 18, you are initializing the logger using ProcessFunction.class.getName(). SLF4J's LoggerFactory.getLogger() method can accept the class directly. Consider simplifying the logger initialization:

-  private static final Logger LOGGER = LoggerFactory.getLogger(ProcessFunction.class.getName());
+  private static final Logger LOGGER = LoggerFactory.getLogger(ProcessFunction.class);
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

  private static final Logger LOGGER = LoggerFactory.getLogger(ProcessFunction.class);
api/src/main/java/ai/chronon/api/thrift/Option.java (4)

96-98: ⚠️ Potential issue

Add null check in Some constructor to prevent null values

The Some constructor should validate that the provided value is not null to prevent unintended NullPointerException occurrences. Since None represents the absence of a value, Some should only contain non-null values.

Consider adding a null check in the constructor:

public Some(T value) {
    if (value == null) {
        throw new IllegalArgumentException("Some value cannot be null");
    }
    this.value = value;
}

65-71: 🛠️ Refactor suggestion

Simplify toOptional() method using Optional.ofNullable

The toOptional() method can be simplified by utilizing Optional.ofNullable(), which handles both defined and undefined cases gracefully.

Simplify the method:

public Optional<T> toOptional() {
-    if (isDefined()) {
-        return Optional.of(get());
-    } else {
-        return Optional.empty();
-    }
+    return Optional.ofNullable(isDefined() ? get() : null);
}
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

  public Optional<T> toOptional() {
    return Optional.ofNullable(isDefined() ? get() : null);
  }

27-28: ⚠️ Potential issue

Avoid suppressing raw type warnings; use generics instead

The use of @SuppressWarnings("rawtypes") and raw types can be avoided by properly parameterizing the Option type. This enhances type safety and eliminates unchecked warnings.

Apply this diff to specify the generic type:

-@SuppressWarnings("rawtypes")
-private static final Option NONE = new None();
+private static final Option<?> NONE = new None<>();
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

  private static final Option<?> NONE = new None<>();

135-137: ⚠️ Potential issue

Validate non-null value in some(T value) method

The some(T value) method should ensure that the provided value is not null to maintain the invariant that Some contains only non-null values.

Add a null check:

public static <T> Some<T> some(T value) {
+    if (value == null) {
+        throw new IllegalArgumentException("Value cannot be null");
+    }
    return new Some<>(value);
}
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

  public static <T> Some<T> some(T value) {
    if (value == null) {
        throw new IllegalArgumentException("Value cannot be null");
    }
    return new Some<T>(value);
  }
api/src/main/java/ai/chronon/api/thrift/transport/TEndpointTransport.java (3)

43-47: 🛠️ Refactor suggestion

Consider renaming _configuration to configuration for consistency.

In Java, it's common practice to avoid using leading underscores in variable names. Renaming _configuration to configuration would improve code readability and adhere to standard naming conventions.

Apply this diff to rename the variable:

- private TConfiguration _configuration;
+ private TConfiguration configuration;

public TConfiguration getConfiguration() {
-   return _configuration;
+   return configuration;
}
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

  private TConfiguration configuration;

  public TConfiguration getConfiguration() {
    return configuration;
  }

49-53: 🛠️ Refactor suggestion

Handle potential NullPointerException without using Objects.isNull().

Using Objects.isNull(config) is unnecessary here. You can use a simple null check for better readability and to avoid an extra method call.

Apply this diff to simplify the null check:

- _configuration = Objects.isNull(config) ? new TConfiguration() : config;
+ _configuration = (config == null) ? new TConfiguration() : config;

Ensure that you have imported java.util.Objects only if necessary.

📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

  public TEndpointTransport(TConfiguration config) throws ai.chronon.api.thrift.transport.TTransportException {
    _configuration = (config == null) ? new TConfiguration() : config;

    resetConsumedMessageSize(-1);
  }

26-112: 🛠️ Refactor suggestion

Use consistent exception referencing to enhance readability.

Currently, the code uses fully qualified names for TTransportException in some places and the simple name in others. Since TTransportException is imported, you can use the simple name throughout the code to improve readability and maintain consistency.

Apply this diff to use the simple class name:

- throw new ai.chronon.api.thrift.transport.TTransportException(ai.chronon.api.thrift.transport.TTransportException.END_OF_FILE, "MaxMessageSize reached");
+ throw new TTransportException(TTransportException.END_OF_FILE, "MaxMessageSize reached");

Repeat this change for all instances where the fully qualified name is used.

Committable suggestion was skipped due to low confidence.

api/src/main/java/ai/chronon/api/thrift/TApplicationException.java (1)

31-149: 🛠️ Refactor suggestion

Assess the necessity of custom TApplicationException implementation

The TApplicationException class appears to replicate functionality provided by the Thrift library. Consider the following:

  • Avoiding Redundancy: If the standard TApplicationException from the Thrift library suffices, using it directly can reduce maintenance overhead.
  • Compatibility Concerns: Custom implementations may lead to compatibility issues with other components expecting the standard exception behavior.
  • Enhancements Justification: If custom behavior is required, ensure that the modifications are well-documented and provide clear benefits over the standard implementation.
api/src/main/java/ai/chronon/api/thrift/partial/ThriftStructProcessor.java (3)

89-96: 🛠️ Refactor suggestion

Remove unused index parameter in setSetElement method.

The index parameter in setSetElement is not used. Removing it can simplify the method signature.

Apply this diff to remove the unused parameter:

 @Override
- public void setSetElement(Object instance, int index, Object value) {
+ public void setSetElement(Object instance, Object value) {
     ((HashSet<Object>) instance).add(value);
 }

Note: Ensure that the interface ThriftFieldValueProcessor allows this change or provide an implementation that aligns with the interface requirements.

Committable suggestion was skipped due to low confidence.


179-183: ⚠️ Potential issue

Handle ByteBuffer conversion safely to avoid exceptions.

Using buffer.array() may throw an UnsupportedOperationException if the ByteBuffer is not backed by an accessible byte array. It's safer to read the bytes without assuming the buffer's backing array is accessible.

Apply this diff to safely convert ByteBuffer to String:

-private static String byteBufferToString(ByteBuffer buffer) {
-    byte[] bytes = buffer.array();
-    int pos = buffer.position();
-    return new String(bytes, pos, buffer.limit() - pos, StandardCharsets.UTF_8);
+private static String byteBufferToString(ByteBuffer buffer) {
+    ByteBuffer readOnlyBuffer = buffer.asReadOnlyBuffer();
+    byte[] bytes = new byte[readOnlyBuffer.remaining()];
+    readOnlyBuffer.get(bytes);
+    return new String(bytes, StandardCharsets.UTF_8);
 }
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

  private static String byteBufferToString(ByteBuffer buffer) {
    ByteBuffer readOnlyBuffer = buffer.asReadOnlyBuffer();
    byte[] bytes = new byte[readOnlyBuffer.remaining()];
    readOnlyBuffer.get(bytes);
    return new String(bytes, StandardCharsets.UTF_8);
  }

54-66: 🛠️ Refactor suggestion

Consider using ArrayList instead of an array for list operations.

Currently, createNewList returns an Object[], and prepareList converts it to a List using Arrays.asList. Using an ArrayList can improve flexibility and readability.

Apply this diff to use ArrayList:

 @Override
- public Object createNewList(int expectedSize) {
-     return new Object[expectedSize];
- }
+ public Object createNewList(int expectedSize) {
+     return new ArrayList<>(expectedSize);
+ }

 @Override
- public void setListElement(Object instance, int index, Object value) {
-     ((Object[]) instance)[index] = value;
+ public void setListElement(Object instance, int index, Object value) {
+     ((ArrayList<Object>) instance).add(value);
  }

 @Override
- public Object prepareList(Object instance) {
-     return Arrays.asList((Object[]) instance);
+ public Object prepareList(Object instance) {
+     return instance;
  }

Committable suggestion was skipped due to low confidence.

api/src/main/java/ai/chronon/api/thrift/partial/ThriftField.java (4)

178-185: 🛠️ Refactor suggestion

Consider optimizing findField method for better performance

The findField method performs a linear search over the list of fields, which might become inefficient if the number of fields is large. Consider using a Map<String, ThriftField> to store and retrieve fields by name, providing faster lookup times.

Implementing this change would require modifying the data structures used to store sub-fields throughout the class.


80-92: ⚠️ Potential issue

Potential thread-safety issue in hashCode() method

The lazy initialization of the hashcode field without proper synchronization is not thread-safe. If multiple threads access the hashCode() method concurrently, they might compute and set hashcode simultaneously, leading to inconsistent behavior.

Consider one of the following approaches to make it thread-safe:

  • Compute the hashcode eagerly when the object is constructed.
  • Declare the hashcode field as volatile.
  • Synchronize the hashCode() method or the block where hashcode is set.

Apply this diff to declare the hashcode field as volatile:

- private int hashcode = 0;
+ private volatile int hashcode = 0;

Committable suggestion was skipped due to low confidence.


83-88: 🛠️ Refactor suggestion

Improve hashCode() implementation to reduce collisions

Combining hash codes using bitwise XOR (^) might lead to a higher probability of collisions, especially when the fields have similar hash codes. Consider using a more standard approach by leveraging a prime number multiplier to combine hash codes effectively.

Apply this diff to modify the hash code calculation:

if (this.hashcode == 0) {
-   int hc = this.name.toLowerCase().hashCode();
-   for (ThriftField subField : this.fields) {
-     hc ^= subField.hashCode();
-   }
+   int hc = this.name.toLowerCase().hashCode();
+   hc = 31 * hc + this.fields.hashCode();
    this.hashcode = hc;
}

Committable suggestion was skipped due to low confidence.


34-77: ⚠️ Potential issue

Public class with package-private constructors limits usability

The class ThriftField is declared as public, but all its constructors have default (package-private) access. This means that code outside the ai.chronon.api.thrift.partial package cannot instantiate ThriftField, which might not be intended if you want to expose this class for public use.

Consider making the constructors public to allow instantiation from outside the package.

Apply this diff to modify the constructors:

- ThriftField(String name, List<ThriftField> fields) {
+ public ThriftField(String name, List<ThriftField> fields) {

- ThriftField(String name) {
+ public ThriftField(String name) {

- ThriftField(String name, List<ThriftField> fields, boolean allowFieldAdds) {
+ ThriftField(String name, List<ThriftField> fields, boolean allowFieldAdds) { // Remain package-private if intended
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

public class ThriftField {

  /** Name of this field as it appears in a thrift file. Case sensitive. */
  public final String name;

  /**
   * List of sub-fields of this field.
   *
   * <p>This list should have only those sub-fields that need to be deserialized by the {@code
   * TDeserializer}.
   */
  public final List<ThriftField> fields;

  /**
   * Constructs a {@link ThriftField}.
   *
   * @param name the name of this field as it appears in a thrift file. Case sensitive.
   * @param fields List of sub-fields of this field.
   */
  public ThriftField(String name, List<ThriftField> fields) {
    ai.chronon.api.thrift.partial.Validate.checkNotNullAndNotEmpty(name, "name");
    ai.chronon.api.thrift.partial.Validate.checkNotNull(fields, "fields");

    this.name = name;
    this.fields = Collections.unmodifiableList(fields);
  }

  /** Constructs a {@link ThriftField} that does not have any sub-fields. */
  public ThriftField(String name) {
    this(name, Collections.emptyList());
  }

  // Internal-only constructor that does not mark fields as read-only.
  // That allows fromNames() to construct fields from names.
  // The actual value of allowFieldAdds is ignored.
  // It is used only for generating a different function signature.
  ThriftField(String name, List<ThriftField> fields, boolean allowFieldAdds) {
    ai.chronon.api.thrift.partial.Validate.checkNotNullAndNotEmpty(name, "name");
    ai.chronon.api.thrift.partial.Validate.checkNotNull(fields, "fields");

    this.name = name;
    this.fields = fields;
  }
api/src/main/java/ai/chronon/api/thrift/transport/TTransport.java (2)

70-75: ⚠️ Potential issue

Handle potential EOF condition in read(ByteBuffer dst).

The read(ByteBuffer dst) method does not account for a possible -1 return value from the read(byte[] buf, int off, int len) method, which indicates end-of-stream. Passing a negative n to dst.put(arr, 0, n); will result in an IndexOutOfBoundsException.

Apply this diff to handle the EOF condition:

 public int read(ByteBuffer dst) throws TTransportException {
   byte[] arr = new byte[dst.remaining()];
   int n = read(arr, 0, arr.length);
+  if (n == -1) {
+    return -1;
+  }
   dst.put(arr, 0, n);
   return n;
 }
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

  public int read(ByteBuffer dst) throws TTransportException {
    byte[] arr = new byte[dst.remaining()];
    int n = read(arr, 0, arr.length);
    if (n == -1) {
      return -1;
    }
    dst.put(arr, 0, n);
    return n;
  }

145-150: 🛠️ Refactor suggestion

Optimize write(ByteBuffer src) method for efficiency.

The current implementation copies the entire content of src into a byte array before writing, which can be inefficient for large buffers.

Consider optimizing the method to write directly from the ByteBuffer without creating an intermediate array. Here's a possible refactor:

 public int write(ByteBuffer src) throws TTransportException {
-  byte[] arr = new byte[src.remaining()];
-  src.get(arr);
-  write(arr, 0, arr.length);
-  return arr.length;
+  int remaining = src.remaining();
+  if (src.hasArray()) {
+    write(src.array(), src.arrayOffset() + src.position(), remaining);
+    src.position(src.position() + remaining);
+  } else {
+    byte[] arr = new byte[Math.min(remaining, 8192)];
+    while (src.hasRemaining()) {
+      int len = Math.min(src.remaining(), arr.length);
+      src.get(arr, 0, len);
+      write(arr, 0, len);
+    }
+  }
+  return remaining;
 }
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

  public int write(ByteBuffer src) throws TTransportException {
    int remaining = src.remaining();
    if (src.hasArray()) {
      write(src.array(), src.arrayOffset() + src.position(), remaining);
      src.position(src.position() + remaining);
    } else {
      byte[] arr = new byte[Math.min(remaining, 8192)];
      while (src.hasRemaining()) {
        int len = Math.min(src.remaining(), arr.length);
        src.get(arr, 0, len);
        write(arr, 0, len);
      }
    }
    return remaining;
  }
api/src/main/java/ai/chronon/api/thrift/transport/TIOStreamTransport.java (1)

51-129: ⚠️ Potential issue

Remove unnecessary throws TTransportException declarations from constructors

The constructors declare throws TTransportException but do not actually throw this exception. This could mislead users into handling an exception that will not occur during instantiation. Removing the unnecessary throws declarations will clean up the code and improve clarity.

Apply this diff to remove the unnecessary exception declarations:

-protected TIOStreamTransport(TConfiguration config) throws TTransportException {
+protected TIOStreamTransport(TConfiguration config) {
    super(config);
}

-protected TIOStreamTransport() throws TTransportException {
+protected TIOStreamTransport() {
    super(new TConfiguration());
}

-public TIOStreamTransport(TConfiguration config, InputStream is) throws TTransportException {
+public TIOStreamTransport(TConfiguration config, InputStream is) {
    super(config);
    inputStream_ = is;
}

-public TIOStreamTransport(InputStream is) throws TTransportException {
+public TIOStreamTransport(InputStream is) {
    super(new TConfiguration());
    inputStream_ = is;
}

-public TIOStreamTransport(TConfiguration config, OutputStream os) throws TTransportException {
+public TIOStreamTransport(TConfiguration config, OutputStream os) {
    super(config);
    outputStream_ = os;
}

-public TIOStreamTransport(OutputStream os) throws TTransportException {
+public TIOStreamTransport(OutputStream os) {
    super(new TConfiguration());
    outputStream_ = os;
}

-public TIOStreamTransport(TConfiguration config, InputStream is, OutputStream os)
-    throws TTransportException {
+public TIOStreamTransport(TConfiguration config, InputStream is, OutputStream os) {
    super(config);
    inputStream_ = is;
    outputStream_ = os;
}

-public TIOStreamTransport(InputStream is, OutputStream os) throws TTransportException {
+public TIOStreamTransport(InputStream is, OutputStream os) {
    super(new TConfiguration());
    inputStream_ = is;
    outputStream_ = os;
}

Committable suggestion was skipped due to low confidence.

api/src/main/java/ai/chronon/api/thrift/protocol/TProtocolDecorator.java (1)

52-55: ⚠️ Potential issue

Add null check for 'protocol' parameter in constructor

The constructor does not check if the protocol parameter is null. This could lead to a NullPointerException when protocol.getTransport() is called or when concreteProtocol is used elsewhere. Even though the documentation mentions that it must be non-null, it's good practice to enforce this in the code.

Apply this diff to add a null check:

 public TProtocolDecorator(TProtocol protocol) {
+    if (protocol == null) {
+        throw new NullPointerException("protocol cannot be null");
+    }
     super(protocol.getTransport());
     concreteProtocol = protocol;
 }
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

  public TProtocolDecorator(TProtocol protocol) {
    if (protocol == null) {
        throw new NullPointerException("protocol cannot be null");
    }
    super(protocol.getTransport());
    concreteProtocol = protocol;
  }
api/src/main/java/ai/chronon/api/thrift/protocol/TProtocolUtil.java (3)

163-221: ⚠️ Potential issue

Add null and empty array checks in guessProtocolFactory to prevent exceptions.

Accessing data[0] or data[data.length - 1] without checking if the data array is null or empty can lead to NullPointerException or ArrayIndexOutOfBoundsException.

Apply this diff to add null and empty array checks:

 public static TProtocolFactory guessProtocolFactory(byte[] data, TProtocolFactory fallback) {
+    if (data == null || data.length == 0) {
+        return fallback;
+    }
     //
     // If the first and last bytes are opening/closing curly braces we guess the protocol as
     // being TJSONProtocol.

Committable suggestion was skipped due to low confidence.


38-52: ⚠️ Potential issue

Consider thread safety for maxSkipDepth.

The static variable maxSkipDepth can be modified via setMaxSkipDepth(int depth). Since it's shared across all threads, concurrent modifications may lead to inconsistent behavior in a multithreaded environment.

Consider making maxSkipDepth volatile to ensure visibility across threads or synchronizing access to it. Alternatively, you could pass maxDepth as a parameter in methods that use it instead of relying on a global variable.

Apply this diff to declare maxSkipDepth as volatile:

-private static int maxSkipDepth = Integer.MAX_VALUE;
+private static volatile int maxSkipDepth = Integer.MAX_VALUE;
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

  /** The maximum recursive depth the skip() function will traverse before throwing a TException. */
  private static volatile int maxSkipDepth = Integer.MAX_VALUE;

  /**
   * Specifies the maximum recursive depth that the skip function will traverse before throwing a
   * TException. This is a global setting, so any call to skip in this JVM will enforce this value.
   *
   * @param depth the maximum recursive depth. A value of 2 would allow the skip function to skip a
   *     structure or collection with basic children, but it would not permit skipping a struct that
   *     had a field containing a child struct. A value of 1 would only allow skipping of simple
   *     types and empty structs/collections.
   */
  public static void setMaxSkipDepth(int depth) {
    maxSkipDepth = depth;
  }

76-150: ⚠️ Potential issue

Include handling for TType.ENUM.

The skip method does not handle the TType.ENUM case. This omission may lead to unexpected behavior or exceptions when the method encounters enum types.

Add a case to handle TType.ENUM by reading an int32, which is the typical underlying type for enums in Thrift.

Apply this diff to include support for TType.ENUM:

         switch (type) {
             ...
+          case ai.chronon.api.thrift.protocol.TType.ENUM:
+            prot.readI32();
+            break;
             default:
                 throw new TProtocolException(TProtocolException.INVALID_DATA, "Unrecognized type " + type);
         }
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

    switch (type) {
      case ai.chronon.api.thrift.protocol.TType.BOOL:
        prot.readBool();
        break;

      case ai.chronon.api.thrift.protocol.TType.BYTE:
        prot.readByte();
        break;

      case ai.chronon.api.thrift.protocol.TType.I16:
        prot.readI16();
        break;

      case ai.chronon.api.thrift.protocol.TType.I32:
        prot.readI32();
        break;

      case ai.chronon.api.thrift.protocol.TType.I64:
        prot.readI64();
        break;

      case ai.chronon.api.thrift.protocol.TType.UUID:
        prot.readUuid();
        break;

      case ai.chronon.api.thrift.protocol.TType.DOUBLE:
        prot.readDouble();
        break;

      case ai.chronon.api.thrift.protocol.TType.STRING:
        prot.readBinary();
        break;

      case ai.chronon.api.thrift.protocol.TType.STRUCT:
        prot.readStructBegin();
        while (true) {
          TField field = prot.readFieldBegin();
          if (field.type == ai.chronon.api.thrift.protocol.TType.STOP) {
            break;
          }
          skip(prot, field.type, maxDepth - 1);
          prot.readFieldEnd();
        }
        prot.readStructEnd();
        break;

      case ai.chronon.api.thrift.protocol.TType.MAP:
        TMap map = prot.readMapBegin();
        for (int i = 0; i < map.size; i++) {
          skip(prot, map.keyType, maxDepth - 1);
          skip(prot, map.valueType, maxDepth - 1);
        }
        prot.readMapEnd();
        break;

      case ai.chronon.api.thrift.protocol.TType.SET:
        TSet set = prot.readSetBegin();
        for (int i = 0; i < set.size; i++) {
          skip(prot, set.elemType, maxDepth - 1);
        }
        prot.readSetEnd();
        break;

      case TType.LIST:
        TList list = prot.readListBegin();
        for (int i = 0; i < list.size; i++) {
          skip(prot, list.elemType, maxDepth - 1);
        }
        prot.readListEnd();
        break;

      case ai.chronon.api.thrift.protocol.TType.ENUM:
        prot.readI32();
        break;

      default:
        throw new TProtocolException(TProtocolException.INVALID_DATA, "Unrecognized type " + type);
    }
  }
api/src/main/java/ai/chronon/api/thrift/partial/Validate.java (6)

75-102: 🛠️ Refactor suggestion

Generic methods for array types

The methods checkNotNullAndNotEmpty are overloaded for different primitive array types. Consider using a single generic method for arrays to reduce code duplication.

Implement a generic method using Object arrays:

public static void checkNotNullAndNotEmpty(Object array, String argName) {
    checkNotNull(array, argName);
    checkArgument(Array.getLength(array) > 0, "'%s' must not be empty.", argName);
}

Note: Ensure proper type checking and handle primitive arrays appropriately.


148-156: 🛠️ Refactor suggestion

Overload comparison methods for other numeric types

Currently, comparison methods like checkGreater are only available for long types. Consider adding overloads for other numeric types such as int, double, and BigDecimal to increase the utility of the validation class.


33-34: 🛠️ Refactor suggestion

Consider renaming the class to avoid confusion with org.apache.commons.lang3.Validate

Naming this class Validate may cause confusion since it resides in a different package but shares the same name as org.apache.commons.lang3.Validate. This could lead to ambiguous references in code.

For clarity, consider renaming the class to something like ValidationUtils or ChrononValidate.

 public final class Validate {
+// Rename to ValidationUtils or ChrononValidate
 }

Committable suggestion was skipped due to low confidence.


137-145: ⚠️ Potential issue

Handle potential division by zero in checkIntegerMultiple method

In the checkIntegerMultiple method, there is a potential division by zero when value2 is zero, which would lead to an ArithmeticException. It's important to validate that value2 is not zero before performing the modulo operation.

Apply this diff to add a check for value2 != 0:

 public static void checkIntegerMultiple(
     long value1, String value1Name, long value2, String value2Name) {
+    checkArgument(value2 != 0, "'%s' must not be zero.", value2Name);
     checkArgument(
         (value1 % value2) == 0,
         "'%s' (%s) must be an integer multiple of '%s' (%s).",
         value1Name,
         value1,
         value2Name,
         value2);
 }
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

      long value1, String value1Name, long value2, String value2Name) {
    checkArgument(value2 != 0, "'%s' must not be zero.", value2Name);
    checkArgument(
        (value1 % value2) == 0,
        "'%s' (%s) must be an integer multiple of '%s' (%s).",
        value1Name,
        value1,
        value2Name,
        value2);
  }

222-223: ⚠️ Potential issue

Missing import statement for org.apache.commons.lang3.Validate

The methods checkArgument and checkState use org.apache.commons.lang3.Validate, but there is no import statement for it, which will lead to a compilation error.

Add the missing import statement:

 import java.util.Collection;

+import org.apache.commons.lang3.Validate;

 /**
  * A superset of Validate class in Apache commons lang3.
  */

Committable suggestion was skipped due to low confidence.


222-228: ⚠️ Potential issue

Ensure consistent exception handling in checkArgument and checkState methods

The methods checkArgument and checkState delegate to Apache Commons Lang3's Validate class but may throw exceptions that are not consistent with the method names. Specifically:

  • checkArgument should throw an IllegalArgumentException.
  • checkState should throw an IllegalStateException.

Ensure that the correct exceptions are thrown to match the method semantics.

Consider explicitly throwing the exceptions:

 public static void checkArgument(boolean expression, String format, Object... args) {
-    org.apache.commons.lang3.Validate.isTrue(expression, format, args);
+    if (!expression) {
+        throw new IllegalArgumentException(String.format(format, args));
+    }
 }

 public static void checkState(boolean expression, String format, Object... args) {
-    org.apache.commons.lang3.Validate.validState(expression, format, args);
+    if (!expression) {
+        throw new IllegalStateException(String.format(format, args));
+    }
 }
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

  public static void checkArgument(boolean expression, String format, Object... args) {
    if (!expression) {
        throw new IllegalArgumentException(String.format(format, args));
    }
  }

  public static void checkState(boolean expression, String format, Object... args) {
    if (!expression) {
        throw new IllegalStateException(String.format(format, args));
    }
  }
api/src/main/java/ai/chronon/api/thrift/TBaseHelper.java (9)

39-39: 🛠️ Refactor suggestion

Use Generics with Comparator to Enhance Type Safety

The Comparator is declared without a generic type, leading to unchecked type warnings and potential runtime exceptions. To improve type safety, parameterize Comparator with <Object>.

Apply this diff to add the generic type:

- private static final Comparator comparator = new NestedStructureComparator();
+ private static final Comparator<Object> comparator = new NestedStructureComparator();
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

  private static final Comparator<Object> comparator = new NestedStructureComparator();

170-170: 🛠️ Refactor suggestion

Parameterize NestedStructureComparator for Better Type Safety

The NestedStructureComparator class is implemented without a generic type parameter, which may result in unchecked warnings. To ensure type safety, declare it with <Object>.

Apply this diff to parameterize the class:

- private static class NestedStructureComparator implements Comparator, Serializable {
+ private static class NestedStructureComparator implements Comparator<Object>, Serializable {
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

  private static class NestedStructureComparator implements Comparator<Object>, Serializable {

138-167: 🛠️ Refactor suggestion

Use Generics in compareTo Method for Maps

The compareTo(Map a, Map b) method currently uses raw types. Introducing generics will improve type safety.

Update the method signature:

- public static int compareTo(Map a, Map b) {
+ public static <K, V> int compareTo(Map<K, V> a, Map<K, V> b) {

Adjust the method implementation to handle generic types K and V.

Committable suggestion was skipped due to low confidence.


273-285: ⚠️ Potential issue

Handle Direct ByteBuffer in copyBinary Method

The copyBinary(ByteBuffer orig) method may not correctly copy data from a direct ByteBuffer, as it assumes the buffer is backed by an array.

Adjust the method to handle direct ByteBuffer instances:

public static ByteBuffer copyBinary(final ByteBuffer orig) {
    if (orig == null) {
        return null;
    }
    ByteBuffer copy = ByteBuffer.allocate(orig.remaining());
-   if (orig.hasArray()) {
-       System.arraycopy(
-           orig.array(), orig.arrayOffset() + orig.position(), copy.array(), 0, orig.remaining());
-   } else {
-       orig.slice().get(copy.array());
-   }
+   ByteBuffer readOnly = orig.asReadOnlyBuffer();
+   readOnly.rewind();
+   copy.put(readOnly);
+   copy.flip();
    return copy;
}
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

    if (orig == null) {
      return null;
    }
    ByteBuffer copy = ByteBuffer.allocate(orig.remaining());
    ByteBuffer readOnly = orig.asReadOnlyBuffer();
    readOnly.rewind();
    copy.put(readOnly);
    copy.flip();
    return copy;
  }

98-100: 🛠️ Refactor suggestion

Parameterize compareTo Method for Comparable Objects

The method compareTo(Comparable a, Comparable b) uses raw types, which may lead to unchecked warnings. Using generics will enhance type safety.

Update the method signature:

- public static int compareTo(Comparable a, Comparable b) {
+ public static <T extends Comparable<T>> int compareTo(T a, T b) {
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

  public static <T extends Comparable<T>> int compareTo(T a, T b) {
    return a.compareTo(b);
  }

102-113: 🛠️ Refactor suggestion

Parameterize compareTo Method for Lists to Ensure Type Safety

The compareTo(List a, List b) method uses raw types, which can lead to unchecked warnings and potential runtime issues. Using generics can improve type safety.

Update the method signature and implementation:

- public static int compareTo(List a, List b) {
+ public static <T> int compareTo(List<T> a, List<T> b) {

Modify the method implementation as needed.

Committable suggestion was skipped due to low confidence.


211-228: ⚠️ Potential issue

Check for hasArray() Before Accessing ByteBuffer Array

In the toString(ByteBuffer bb, StringBuilder sb) method, accessing bb.array() without checking bb.hasArray() can cause an UnsupportedOperationException if the ByteBuffer is not backed by an array.

Add a check for bb.hasArray() before accessing the array:

public static void toString(ByteBuffer bb, StringBuilder sb) {
+   if (!bb.hasArray()) {
+       bb = bb.asReadOnlyBuffer();
+   }
    byte[] buf = bb.array();
    // existing code
}

Committable suggestion was skipped due to low confidence.


235-242: ⚠️ Potential issue

Handle Non-Array-Backed ByteBuffer in byteBufferToByteArray

The byteBufferToByteArray method assumes that the ByteBuffer has an accessible array. If it doesn't, accessing byteBuffer.array() will throw an UnsupportedOperationException.

Modify the method to handle ByteBuffer instances without an accessible array:

public static byte[] byteBufferToByteArray(ByteBuffer byteBuffer) {
    if (wrapsFullArray(byteBuffer)) {
        return byteBuffer.array();
    }
    byte[] target = new byte[byteBuffer.remaining()];
-   byteBufferToByteArray(byteBuffer, target, 0);
+   if (byteBuffer.hasArray()) {
+       System.arraycopy(
+           byteBuffer.array(),
+           byteBuffer.arrayOffset() + byteBuffer.position(),
+           target,
+           0,
+           byteBuffer.remaining());
+   } else {
+       byteBuffer.asReadOnlyBuffer().get(target);
+   }
    return target;
}
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

  public static byte[] byteBufferToByteArray(ByteBuffer byteBuffer) {
    if (wrapsFullArray(byteBuffer)) {
      return byteBuffer.array();
    }
    byte[] target = new byte[byteBuffer.remaining()];
    if (byteBuffer.hasArray()) {
        System.arraycopy(
            byteBuffer.array(),
            byteBuffer.arrayOffset() + byteBuffer.position(),
            target,
            0,
            byteBuffer.remaining());
    } else {
        byteBuffer.asReadOnlyBuffer().get(target);
    }
    return target;
  }

115-136: 🛠️ Refactor suggestion

Apply Generics in compareTo Method for Sets

Similar to the list comparison, the compareTo(Set a, Set b) method should use generics to enhance type safety and avoid unchecked warnings.

Update the method signature:

- public static int compareTo(Set a, Set b) {
+ public static <T> int compareTo(Set<T> a, Set<T> b) {

Adjust the implementation accordingly.

Committable suggestion was skipped due to low confidence.

api/src/main/java/ai/chronon/api/thrift/TUnion.java (6)

232-268: ⚠️ Potential issue

Handle Null Cases in TUnionStandardScheme

In the read method of TUnionStandardScheme, if the read value is null, setField_ remains null, which may cause issues when writing or processing the union. Ensure that null cases are appropriately handled.


276-297: ⚠️ Potential issue

Validate Field ID in TUnionTupleScheme

When reading the field ID in TUnionTupleScheme, there should be validation to ensure that the field ID corresponds to a valid field in the union. This prevents potential errors from invalid or unexpected field IDs.


149-151: 🛠️ Refactor suggestion

Use Defensive Copying in read Method

Consider using defensive copying when reading data to prevent external modification of internal state, enhancing the class's robustness.


226-230: 🛠️ Refactor suggestion

Implement SchemeFactory as a Singleton

The TUnionStandardSchemeFactory can be implemented as a singleton to reduce unnecessary object creation, enhancing performance.


153-157: ⚠️ Potential issue

Add Null Checks in setFieldValue Method

The setFieldValue method should check for null values to prevent unintentional assignment of null to fields, which may cause unexpected behavior or errors elsewhere in the code.

Apply this diff to include null value handling:

 public void setFieldValue(F fieldId, Object value) {
+    if (value == null) {
+        throw new NullPointerException("Cannot set null value for field " + fieldId);
+    }
     checkType(fieldId, value);
     setField_ = fieldId;
     value_ = value;
 }
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

  public void setFieldValue(F fieldId, Object value) {
    if (value == null) {
        throw new NullPointerException("Cannot set null value for field " + fieldId);
    }
    checkType(fieldId, value);
    setField_ = fieldId;
    value_ = value;
  }

73-111: ⚠️ Potential issue

Use Generic Types to Enhance Type Safety

The deepCopyObject, deepCopyMap, deepCopySet, and deepCopyList methods are using raw types for collections, which can lead to unchecked warnings and potential runtime errors. It's recommended to use generic types to enforce type safety and improve code clarity.

Apply this diff to use generic types:

-private static Object deepCopyObject(Object o) {
+private static <T> T deepCopyObject(T o) {
     if (o instanceof ai.chronon.api.thrift.TBase) {
         return (T) ((TBase) o).deepCopy();
     } else if (o instanceof ByteBuffer) {
         return (T) ai.chronon.api.thrift.TBaseHelper.copyBinary((ByteBuffer) o);
     } else if (o instanceof List) {
-        return deepCopyList((List) o);
+        return (T) deepCopyList((List<?>) o);
     } else if (o instanceof Set) {
-        return deepCopySet((Set) o);
+        return (T) deepCopySet((Set<?>) o);
     } else if (o instanceof Map) {
-        return deepCopyMap((Map) o);
+        return (T) deepCopyMap((Map<?, ?>) o);
     } else {
         return o;
     }
 }

-private static Map deepCopyMap(Map<Object, Object> map) {
-    Map copy = new HashMap(map.size());
-    for (Map.Entry<Object, Object> entry : map.entrySet()) {
+private static <K, V> Map<K, V> deepCopyMap(Map<K, V> map) {
+    Map<K, V> copy = new HashMap<>(map.size());
+    for (Map.Entry<K, V> entry : map.entrySet()) {
         copy.put(deepCopyObject(entry.getKey()), deepCopyObject(entry.getValue()));
     }
     return copy;
 }

-private static Set deepCopySet(Set set) {
-    Set copy = new HashSet(set.size());
-    for (Object o : set) {
+private static <E> Set<E> deepCopySet(Set<E> set) {
+    Set<E> copy = new HashSet<>(set.size());
+    for (E o : set) {
         copy.add(deepCopyObject(o));
     }
     return copy;
 }

-private static List deepCopyList(List list) {
-    List copy = new ArrayList(list.size());
-    for (Object o : list) {
+private static <E> List<E> deepCopyList(List<E> list) {
+    List<E> copy = new ArrayList<>(list.size());
+    for (E o : list) {
         copy.add(deepCopyObject(o));
     }
     return copy;
 }
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

  private static <T> T deepCopyObject(T o) {
    if (o instanceof ai.chronon.api.thrift.TBase) {
      return (T) ((TBase) o).deepCopy();
    } else if (o instanceof ByteBuffer) {
      return (T) TBaseHelper.copyBinary((ByteBuffer) o);
    } else if (o instanceof List) {
      return (T) deepCopyList((List<?>) o);
    } else if (o instanceof Set) {
      return (T) deepCopySet((Set<?>) o);
    } else if (o instanceof Map) {
      return (T) deepCopyMap((Map<?, ?>) o);
    } else {
      return o;
    }
  }

  private static <K, V> Map<K, V> deepCopyMap(Map<K, V> map) {
    Map<K, V> copy = new HashMap<>(map.size());
    for (Map.Entry<K, V> entry : map.entrySet()) {
      copy.put(deepCopyObject(entry.getKey()), deepCopyObject(entry.getValue()));
    }
    return copy;
  }

  private static <E> Set<E> deepCopySet(Set<E> set) {
    Set<E> copy = new HashSet<>(set.size());
    for (E o : set) {
      copy.add(deepCopyObject(o));
    }
    return copy;
  }

  private static <E> List<E> deepCopyList(List<E> list) {
    List<E> copy = new ArrayList<>(list.size());
    for (E o : list) {
      copy.add(deepCopyObject(o));
    }
    return copy;
  }
api/src/main/java/ai/chronon/api/thrift/partial/PartialThriftComparer.java (6)

41-343: 🛠️ Refactor suggestion

Add unit tests for PartialThriftComparer class

Since PartialThriftComparer introduces complex comparison logic, it's essential to add unit tests to verify its correctness and prevent regressions.

Would you like assistance in creating unit tests for this class? I can help generate test cases covering various comparison scenarios to ensure robust functionality.


261-276: ⚠️ Potential issue

Ensure type safety in areBinaryFieldsEqual method

The areBinaryFieldsEqual method assumes that o2 is of the same type as o1 without checking, which may lead to a ClassCastException if this assumption is violated. It's important to verify that both objects are of the same type before casting.

Apply this diff to add type checks:

     if (o1 instanceof byte[]) {
+      if (!(o2 instanceof byte[])) {
+        appendResult(data, sb, "Type mismatch: o1 is byte[] but o2 is " + o2.getClass().getName());
+        return false;
+      }
       if (Arrays.equals((byte[]) o1, (byte[]) o2)) {
         return true;
       }
     } else if (o1 instanceof ByteBuffer) {
+      if (!(o2 instanceof ByteBuffer)) {
+        appendResult(data, sb, "Type mismatch: o1 is ByteBuffer but o2 is " + o2.getClass().getName());
+        return false;
+      }
       if (((ByteBuffer) o1).compareTo((ByteBuffer) o2) == 0) {
         return true;
       }

Committable suggestion was skipped due to low confidence.


307-322: ⚠️ Potential issue

Logical error in checkNullEquality when handling null values

The checkNullEquality method does not correctly return ComparisonResult.NOT_EQUAL when one object is null and the other is not. This may lead to incorrect comparison results when one of the objects is null.

Apply the following diff to fix the issue:

     if ((o1 == null) && (o2 == null)) {
       return ComparisonResult.EQUAL;
     }

-    if (o1 == null) {
-      appendResult(data, sb, "o1 (null) != o2");
-    }
-
-    if (o2 == null) {
-      appendResult(data, sb, "o1 != o2 (null)");
-    }
-
-    return ComparisonResult.UNKNOWN;
+    if (o1 == null || o2 == null) {
+      appendResult(data, sb, o1 == null ? "o1 (null) != o2" : "o1 != o2 (null)");
+      return ComparisonResult.NOT_EQUAL;
+    }

+    return ComparisonResult.UNKNOWN;
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

  private ComparisonResult checkNullEquality(
          ai.chronon.api.thrift.partial.ThriftMetadata.ThriftObject data, Object o1, Object o2, StringBuilder sb) {
    if ((o1 == null) && (o2 == null)) {
      return ComparisonResult.EQUAL;
    }

    if (o1 == null || o2 == null) {
      appendResult(data, sb, o1 == null ? "o1 (null) != o2" : "o1 != o2 (null)");
      return ComparisonResult.NOT_EQUAL;
    }

    return ComparisonResult.UNKNOWN;
  }

180-203: ⚠️ Potential issue

Add type checks for List instances in areEqual method

The method assumes that o1 and o2 are instances of List without verification, which could result in a ClassCastException if they are not. Consider adding checks to ensure that both objects are indeed List instances before casting.

Apply this diff to include type checks:

     private boolean areEqual(
             ai.chronon.api.thrift.partial.ThriftMetadata.ThriftList data, Object o1, Object o2, StringBuilder sb) {

+      if (!(o1 instanceof List) || !(o2 instanceof List)) {
+        appendResult(data, sb, "Type mismatch: expected List instances");
+        return false;
+      }

       List<Object> l1 = (List<Object>) o1;
       List<Object> l2 = (List<Object>) o2;
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

  private boolean areEqual(ai.chronon.api.thrift.partial.ThriftMetadata.ThriftList data, Object o1, Object o2, StringBuilder sb) {

    if (!(o1 instanceof List) || !(o2 instanceof List)) {
      appendResult(data, sb, "Type mismatch: expected List instances");
      return false;
    }

    List<Object> l1 = (List<Object>) o1;
    List<Object> l2 = (List<Object>) o2;

    ComparisonResult result = checkNullEquality(data, o1, o2, sb);
    if (result != ComparisonResult.UNKNOWN) {
      return result == ComparisonResult.EQUAL;
    }

    if (!checkSizeEquality(data, l1, l2, sb, "list")) {
      return false;
    }

    for (int i = 0; i < l1.size(); i++) {
      Object e1 = l1.get(i);
      Object e2 = l2.get(i);
      if (!this.areEqual(data.elementData, e1, e2, sb)) {
        return false;
      }
    }

    return true;
  }

205-227: ⚠️ Potential issue

Add type checks for Set instances in areEqual method

The method presumes that o1 and o2 are instances of Set without checking, which may lead to a ClassCastException if this assumption is false. Include type checks to ensure safe casting.

Apply this diff to add type checks:

     private boolean areEqual(
             ai.chronon.api.thrift.partial.ThriftMetadata.ThriftSet data, Object o1, Object o2, StringBuilder sb) {

+      if (!(o1 instanceof Set) || !(o2 instanceof Set)) {
+        appendResult(data, sb, "Type mismatch: expected Set instances");
+        return false;
+      }

       Set<Object> s1 = (Set<Object>) o1;
       Set<Object> s2 = (Set<Object>) o2;
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

  private boolean areEqual(ai.chronon.api.thrift.partial.ThriftMetadata.ThriftSet data, Object o1, Object o2, StringBuilder sb) {

    if (!(o1 instanceof Set) || !(o2 instanceof Set)) {
      appendResult(data, sb, "Type mismatch: expected Set instances");
      return false;
    }

    Set<Object> s1 = (Set<Object>) o1;
    Set<Object> s2 = (Set<Object>) o2;

    ComparisonResult result = checkNullEquality(data, o1, o2, sb);
    if (result != ComparisonResult.UNKNOWN) {
      return result == ComparisonResult.EQUAL;
    }

    if (!checkSizeEquality(data, s1, s2, sb, "set")) {
      return false;
    }

    for (Object e1 : s1) {
      if (!s2.contains(e1)) {
        appendResult(data, sb, "Element %s in s1 not found in s2", e1);
        return false;
      }
    }

    return true;
  }

229-259: ⚠️ Potential issue

Add type checks for Map instances in areEqual method

The method assumes o1 and o2 are instances of Map without validation, potentially causing a ClassCastException if they are not. It's advisable to check the object types before casting.

Apply this diff to include type checks:

     private boolean areEqual(
             ai.chronon.api.thrift.partial.ThriftMetadata.ThriftMap data, Object o1, Object o2, StringBuilder sb) {

+      if (!(o1 instanceof Map) || !(o2 instanceof Map)) {
+        appendResult(data, sb, "Type mismatch: expected Map instances");
+        return false;
+      }

       Map<Object, Object> m1 = (Map<Object, Object>) o1;
       Map<Object, Object> m2 = (Map<Object, Object>) o2;
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

  private boolean areEqual(ai.chronon.api.thrift.partial.ThriftMetadata.ThriftMap data, Object o1, Object o2, StringBuilder sb) {

    if (!(o1 instanceof Map) || !(o2 instanceof Map)) {
      appendResult(data, sb, "Type mismatch: expected Map instances");
      return false;
    }

    Map<Object, Object> m1 = (Map<Object, Object>) o1;
    Map<Object, Object> m2 = (Map<Object, Object>) o2;

    ComparisonResult result = checkNullEquality(data, o1, o2, sb);
    if (result != ComparisonResult.UNKNOWN) {
      return result == ComparisonResult.EQUAL;
    }

    if (!checkSizeEquality(data, m1.keySet(), m2.keySet(), sb, "map.keySet")) {
      return false;
    }

    for (Map.Entry e1 : m1.entrySet()) {
      Object k1 = e1.getKey();

      if (!m2.containsKey(k1)) {
        appendResult(data, sb, "Key %s in m1 not found in m2", k1);
        return false;
      }

      Object v1 = e1.getValue();
      Object v2 = m2.get(k1);
      if (!this.areEqual(data.valueData, v1, v2, sb)) {
        return false;
      }
    }

    return true;
  }
api/src/main/java/ai/chronon/api/thrift/protocol/TSimpleJSONProtocol.java (4)

242-245: ⚠️ Potential issue

Ensure correct JSON representation of boolean values

Currently, the writeBool method writes boolean values as numeric 1 or 0 by calling writeByte. In JSON, booleans should be represented as true or false literals. Consider updating the writeBool method to output the appropriate JSON boolean values.

Apply this diff to correct the boolean serialization:

 @Override
 public void writeBool(boolean b) throws TException {
-  writeByte(b ? (byte) 1 : (byte) 0);
+  writeContext_.write();
+  _writeStringData(b ? "true" : "false");
 }

Committable suggestion was skipped due to low confidence.


352-361: ⚠️ Potential issue

Handle binary data correctly by encoding in Base64

The writeBinary method attempts to convert binary data to a UTF-8 encoded string, which may lead to data loss or errors if the binary data is not valid UTF-8. Additionally, there's a TODO comment indicating this needs to be addressed. In JSON, binary data is typically encoded using Base64. Consider updating the implementation to encode the binary data using Base64.

Apply this diff to correctly handle binary data:

+import java.util.Base64;
...
 @Override
 public void writeBinary(ByteBuffer bin) throws TException {
-  // TODO(mcslee): Fix this
-  writeString(
-      new String(
-          bin.array(),
-          bin.position() + bin.arrayOffset(),
-          bin.limit() - bin.position() - bin.arrayOffset(),
-          StandardCharsets.UTF_8));
+  byte[] bytes = new byte[bin.remaining()];
+  bin.get(bytes);
+  String base64Encoded = Base64.getEncoder().encodeToString(bytes);
+  writeString(base64Encoded);
 }
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

  public void writeBinary(ByteBuffer bin) throws TException {
    byte[] bytes = new byte[bin.remaining()];
    bin.get(bytes);
    String base64Encoded = Base64.getEncoder().encodeToString(bytes);
    writeString(base64Encoded);
  }

298-350: 🛠️ Refactor suggestion

Consider using a standard library for JSON string escaping

The writeString method manually escapes JSON strings, which can be error-prone and might not handle all edge cases as per the JSON specification. To improve reliability and maintainability, consider using a well-tested library method for JSON string escaping.

Apply this diff to utilize StringEscapeUtils.escapeJson from Apache Commons Text:

+import org.apache.commons.text.StringEscapeUtils;
...
 @Override
 public void writeString(String str) throws TException {
   writeContext_.write();
-  int length = str.length();
-  StringBuilder escape = new StringBuilder(length + 16);
-  escape.append(QUOTE);
-  for (int i = 0; i < length; ++i) {
-    char c = str.charAt(i);
-    switch (c) {
-      // existing case statements
-    }
-  }
-  escape.append(QUOTE);
-  _writeStringData(escape.toString());
+  String escapedString = QUOTE + StringEscapeUtils.escapeJson(str) + QUOTE;
+  _writeStringData(escapedString);
 }

Alternatively, if adding a new dependency is not preferred, ensure that the manual escaping code fully complies with the JSON specification.

Committable suggestion was skipped due to low confidence.


484-516: 🛠️ Refactor suggestion

Use constant definitions instead of hard-coded type codes

The getMinSerializedSize method uses hard-coded byte values to represent type codes (e.g., 0 for Stop, 1 for Void). For better readability and maintainability, consider using the predefined constants from TType (e.g., TType.STOP, TType.VOID) instead of hard-coded numbers.

Apply this diff to replace hard-coded values with constants:

+import ai.chronon.api.thrift.protocol.TType;
...
 @Override
 public int getMinSerializedSize(byte type) throws TException {
   switch (type) {
-    case 0:
+    case TType.STOP:
       return 0; // Stop
-    case 1:
+    case TType.VOID:
       return 0; // Void
-    case 2:
+    case TType.BOOL:
       return 1; // Bool
-    case 3:
+    case TType.BYTE:
       return 1; // Byte
-    case 4:
+    case TType.DOUBLE:
       return 1; // Double
-    case 6:
+    case TType.I16:
       return 1; // I16
-    case 8:
+    case TType.I32:
       return 1; // I32
-    case 10:
+    case TType.I64:
       return 1; // I64
-    case 11:
+    case TType.STRING:
       return 2; // String length
-    case 12:
+    case TType.STRUCT:
       return 2; // Empty struct
-    case 13:
+    case TType.MAP:
       return 2; // Element count Map
-    case 14:
+    case TType.SET:
       return 2; // Element count Set
-    case 15:
+    case TType.LIST:
       return 2; // Element count List
     default:
       throw new TTransportException(TTransportException.UNKNOWN, "Unrecognized type code");
   }
 }
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

  @Override
  public int getMinSerializedSize(byte type) throws TException {
    switch (type) {
      case TType.STOP:
        return 0; // Stop
      case TType.VOID:
        return 0; // Void
      case TType.BOOL:
        return 1; // Bool
      case TType.BYTE:
        return 1; // Byte
      case TType.DOUBLE:
        return 1; // Double
      case TType.I16:
        return 1; // I16
      case TType.I32:
        return 1; // I32
      case TType.I64:
        return 1; // I64
      case TType.STRING:
        return 2; // String length
      case TType.STRUCT:
        return 2; // Empty struct
      case TType.MAP:
        return 2; // Element count Map
      case TType.SET:
        return 2; // Element count Set
      case TType.LIST:
        return 2; // Element count List
      default:
        throw new TTransportException(TTransportException.UNKNOWN, "Unrecognized type code");
    }
  }
api/src/main/java/ai/chronon/api/thrift/protocol/TProtocol.java (4)

381-386: 🛠️ Refactor suggestion

Optimize readFieldBeginData() for performance

The readFieldBeginData() method could be optimized by avoiding the creation of redundant TField objects if possible. This can enhance performance in scenarios where this method is frequently called.

Consider caching or reusing TField instances if thread safety and design allow.


46-48: 🛠️ Refactor suggestion

Consider removing the unnecessary private constructor

Since TProtocol is an abstract class, it cannot be instantiated directly. Therefore, the private constructor is redundant and can be removed. Additionally, the @SuppressWarnings("unused") annotation is unnecessary.

Apply this diff to remove the redundant constructor:

-  /** Prevent direct instantiation */
-  @SuppressWarnings("unused")
-  private TProtocol() {}
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.



63-76: ⚠️ Potential issue

Handle potential integer overflows in byte availability checks

In the methods checkReadBytesAvailable(TMap map), checkReadBytesAvailable(TList list), and checkReadBytesAvailable(TSet set), the calculations involving map.size * elemSize could potentially cause an integer overflow with large sizes. Consider adding safeguards to prevent overflow errors.

Apply this diff to handle potential overflows:

     long elemSize = getMinSerializedSize(map.keyType) + getMinSerializedSize(map.valueType);
-    trans_.checkReadBytesAvailable(map.size * elemSize);
+    BigInteger totalSize = BigInteger.valueOf(map.size).multiply(BigInteger.valueOf(elemSize));
+    if (totalSize.compareTo(BigInteger.valueOf(Long.MAX_VALUE)) > 0) {
+      throw new TException("Size exceeds maximum allowable value");
+    }
+    trans.checkReadBytesAvailable(totalSize.longValue());

Committable suggestion was skipped due to low confidence.


507-521: 🛠️ Refactor suggestion

Refactor recursion in skipBytes(int numBytes) to iteration

The skipBytes method uses recursion when numBytes exceeds MAX_SKIPPED_BYTES, which can lead to stack overflow with large values. Refactoring to use a loop will improve performance and reliability.

Apply this diff to use iterative logic:

 protected void skipBytes(int numBytes) throws TException {
-    if (numBytes <= MAX_SKIPPED_BYTES) {
-      if (this.getTransport().getBytesRemainingInBuffer() >= numBytes) {
-        this.getTransport().consumeBuffer(numBytes);
-      } else {
-        this.getTransport().readAll(skippedBytes, 0, numBytes);
-      }
-    } else {
-      int remaining = numBytes;
-      while (remaining > 0) {
-        skipBytes(Math.min(remaining, MAX_SKIPPED_BYTES));
-        remaining -= MAX_SKIPPED_BYTES;
-      }
-    }
+    int remaining = numBytes;
+    while (remaining > 0) {
+      int toSkip = Math.min(remaining, MAX_SKIPPED_BYTES);
+      if (this.getTransport().getBytesRemainingInBuffer() >= toSkip) {
+        this.getTransport().consumeBuffer(toSkip);
+      } else {
+        this.getTransport().readAll(skippedBytes, 0, toSkip);
+      }
+      remaining -= toSkip;
+    }
 }
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

  protected void skipBytes(int numBytes) throws TException {
    int remaining = numBytes;
    while (remaining > 0) {
      int toSkip = Math.min(remaining, MAX_SKIPPED_BYTES);
      if (this.getTransport().getBytesRemainingInBuffer() >= toSkip) {
        this.getTransport().consumeBuffer(toSkip);
      } else {
        this.getTransport().readAll(skippedBytes, 0, toSkip);
      }
      remaining -= toSkip;
    }
  }
api/src/main/java/ai/chronon/api/thrift/partial/ThriftMetadata.java (3)

142-143: 🛠️ Refactor suggestion

Standardize type references by removing unnecessary package qualifiers

Throughout the code, there are instances where fully qualified class names are used for classes within the same package or already imported. This reduces readability and introduces inconsistency.

Consider refactoring the code to remove redundant package prefixes. Examples include:

-        ThriftObject createNew(
-            ThriftObject parent, TFieldIdEnum fieldId, FieldMetaData data, List<ai.chronon.api.thrift.partial.ThriftField> fields) {
+        ThriftObject createNew(
+            ThriftObject parent, TFieldIdEnum fieldId, FieldMetaData data, List<ThriftField> fields) {

...

-        ThriftList(
-            ThriftObject parent, TFieldIdEnum fieldId, FieldMetaData data, List<ai.chronon.api.thrift.partial.ThriftField> fields) {
+        ThriftList(
+            ThriftObject parent, TFieldIdEnum fieldId, FieldMetaData data, List<ThriftField> fields) {

...

-        ThriftSet(
-            ThriftObject parent, TFieldIdEnum fieldId, FieldMetaData data, List<ai.chronon.api.thrift.partial.ThriftField> fields) {
+        ThriftSet(
+            ThriftObject parent, TFieldIdEnum fieldId, FieldMetaData data, List<ThriftField> fields) {

...

-        ThriftMap(
-            ThriftObject parent, TFieldIdEnum fieldId, FieldMetaData data, List<ai.chronon.api.thrift.partial.ThriftField> fields) {
+        ThriftMap(
+            ThriftObject parent, TFieldIdEnum fieldId, FieldMetaData data, List<ThriftField> fields) {

...

-        public static ThriftStructBase create(
-            ThriftObject parent,
-            TFieldIdEnum fieldId,
-            FieldMetaData data,
-            Iterable<ai.chronon.api.thrift.partial.ThriftField> fieldsData) {
+        public static ThriftStructBase create(
+            ThriftObject parent,
+            TFieldIdEnum fieldId,
+            FieldMetaData data,
+            Iterable<ThriftField> fieldsData) {

...

-        ThriftStruct(
-            ThriftObject parent,
-            TFieldIdEnum fieldId,
-            FieldMetaData data,
-            Iterable<ai.chronon.api.thrift.partial.ThriftField> fieldsData) {
+        ThriftStruct(
+            ThriftObject parent,
+            TFieldIdEnum fieldId,
+            FieldMetaData data,
+            Iterable<ThriftField> fieldsData) {

...

-        private static <U extends TBase> Map<Integer, ThriftObject> getFields(
-            ThriftStruct parent, Class<U> clasz, Iterable<ai.chronon.api.thrift.partial.ThriftField> fieldsData) {
+        private static <U extends TBase> Map<Integer, ThriftObject> getFields(
+            ThriftStruct parent, Class<U> clasz, Iterable<ThriftField> fieldsData) {

Also applies to: 254-265, 285-296, 317-338, 388-388, 427-427, 493-493


79-84: ⚠️ Potential issue

Remove unused enum ComparisonResult

The ComparisonResult enum defined at lines 79-84 is not used anywhere in the code. Removing unused code enhances maintainability and reduces potential confusion.

Apply this diff to remove the unused enum:

-  private enum ComparisonResult {
-    UNKNOWN,
-    EQUAL,
-    NOT_EQUAL
-  }
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.



228-228: 🛠️ Refactor suggestion

Eliminate redundant fully qualified name in ThriftEnum

The declaration of the EnumCache field uses a fully qualified class name unnecessarily. Since EnumCache is in the same package and already imported, you can simplify the declaration.

Apply this diff to simplify the field declaration:

-    private static ai.chronon.api.thrift.partial.EnumCache enums = new EnumCache();
+    private static EnumCache enums = new EnumCache();
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

    private static EnumCache enums = new EnumCache();
api/src/main/java/ai/chronon/api/thrift/protocol/TBinaryProtocol.java (7)

469-479: ⚠️ Potential issue

Implement safeguards for large string reads in readString.

The readString method reads a string based on a size specified by readI32(). There is a risk of allocating large amounts of memory if the size is very large, which could lead to out-of-memory errors or denial-of-service vulnerabilities.

Implement checks to enforce a maximum string size limit beyond which an exception is thrown, enhancing security and stability.


373-385: 🛠️ Refactor suggestion

Optimize readI16 method for readability and performance.

The readI16 method manually reads bytes and constructs a short. Using ByteBuffer could improve readability and maintainability.

Refactor readI16 as follows:

public short readI16() throws TException {
    readAll(inoutTemp, 0, 2);
    return ByteBuffer.wrap(inoutTemp, 0, 2).getShort();
}

Ensure ByteBuffer is correctly set to big-endian order, which is the default.


362-369: ⚠️ Potential issue

Handle end-of-stream conditions in readByte.

In the readByte method, if the end of the stream is reached unexpectedly, it may result in an exception. Ensure that proper exception handling or checks are in place to handle end-of-stream conditions gracefully.

Add appropriate exception handling to prevent unexpected crashes.


248-255: ⚠️ Potential issue

Correct the indexing in the writeUuid method.

There's an inconsistency in indexing when writing the MSB. The current code adds 1 + 8, 2 + 8, etc., which results in incorrect indices and overwrites data.

Apply this diff to fix the indices:

  {
    long msb = uuid.getMostSignificantBits();
-   inoutTemp[8] = (byte) (0xff & (msb >> 56));
-   inoutTemp[1 + 8] = (byte) (0xff & (msb >> 48));
-   inoutTemp[2 + 8] = (byte) (0xff & (msb >> 40));
-   inoutTemp[3 + 8] = (byte) (0xff & (msb >> 32));
-   inoutTemp[4 + 8] = (byte) (0xff & (msb >> 24));
-   inoutTemp[5 + 8] = (byte) (0xff & (msb >> 16));
-   inoutTemp[6 + 8] = (byte) (0xff & (msb >> 8));
-   inoutTemp[7 + 8] = (byte) (0xff & (msb));
+   inoutTemp[8] = (byte) (0xff & (msb >> 56));
+   inoutTemp[9] = (byte) (0xff & (msb >> 48));
+   inoutTemp[10] = (byte) (0xff & (msb >> 40));
+   inoutTemp[11] = (byte) (0xff & (msb >> 32));
+   inoutTemp[12] = (byte) (0xff & (msb >> 24));
+   inoutTemp[13] = (byte) (0xff & (msb >> 16));
+   inoutTemp[14] = (byte) (0xff & (msb >> 8));
+   inoutTemp[15] = (byte) (0xff & (msb));
  }
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

      inoutTemp[8] = (byte) (0xff & (msb >> 56));
      inoutTemp[9] = (byte) (0xff & (msb >> 48));
      inoutTemp[10] = (byte) (0xff & (msb >> 40));
      inoutTemp[11] = (byte) (0xff & (msb >> 32));
      inoutTemp[12] = (byte) (0xff & (msb >> 24));
      inoutTemp[13] = (byte) (0xff & (msb >> 16));
      inoutTemp[14] = (byte) (0xff & (msb >> 8));
      inoutTemp[15] = (byte) (0xff & (msb));
    }

441-459: ⚠️ Potential issue

Correct the indexing in the readUuid method.

Similarly, when reading the MSB, the code uses incorrect indices like off + 8 + 1. This should be adjusted to off + 9, off + 10, etc.

Apply this diff to fix the indices:

  long msb =
      ((long) (buf[off + 8] & 0xff) << 56)
-       | ((long) (buf[off + 8 + 1] & 0xff) << 48)
-       | ((long) (buf[off + 8 + 2] & 0xff) << 40)
-       | ((long) (buf[off + 8 + 3] & 0xff) << 32)
-       | ((long) (buf[off + 8 + 4] & 0xff) << 24)
-       | ((long) (buf[off + 8 + 5] & 0xff) << 16)
-       | ((long) (buf[off + 8 + 6] & 0xff) << 8)
-       | ((long) (buf[off + 8 + 7] & 0xff));
+       | ((long) (buf[off + 9] & 0xff) << 48)
+       | ((long) (buf[off + 10] & 0xff) << 40)
+       | ((long) (buf[off + 11] & 0xff) << 32)
+       | ((long) (buf[off + 12] & 0xff) << 24)
+       | ((long) (buf[off + 13] & 0xff) << 16)
+       | ((long) (buf[off + 14] & 0xff) << 8)
+       | ((long) (buf[off + 15] & 0xff));
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

        ((long) (buf[off] & 0xff) << 56)
            | ((long) (buf[off + 1] & 0xff) << 48)
            | ((long) (buf[off + 2] & 0xff) << 40)
            | ((long) (buf[off + 3] & 0xff) << 32)
            | ((long) (buf[off + 4] & 0xff) << 24)
            | ((long) (buf[off + 5] & 0xff) << 16)
            | ((long) (buf[off + 6] & 0xff) << 8)
            | ((long) (buf[off + 7] & 0xff));

    long msb =
        ((long) (buf[off + 8] & 0xff) << 56)
            | ((long) (buf[off + 9] & 0xff) << 48)
            | ((long) (buf[off + 10] & 0xff) << 40)
            | ((long) (buf[off + 11] & 0xff) << 32)
            | ((long) (buf[off + 12] & 0xff) << 24)
            | ((long) (buf[off + 13] & 0xff) << 16)
            | ((long) (buf[off + 14] & 0xff) << 8)
            | ((long) (buf[off + 15] & 0xff));
    return new UUID(msb, lsb);

430-460: ⚠️ Potential issue

Verify the correctness of the readUuid method.

In the readUuid method, the MSB and LSB are read and assigned to the wrong variables. The LSB should be read after the MSB to match the standard UUID byte order.

Apply this diff to correct the order:

  public UUID readUuid() throws TException {
    // ...
-   long lsb =
+   long msb =
        ((long) (buf[off] & 0xff) << 56)
            | ((long) (buf[off + 1] & 0xff) << 48)
            | ((long) (buf[off + 2] & 0xff) << 40)
            | ((long) (buf[off + 3] & 0xff) << 32)
            | ((long) (buf[off + 4] & 0xff) << 24)
            | ((long) (buf[off + 5] & 0xff) << 16)
            | ((long) (buf[off + 6] & 0xff) << 8)
            | ((long) (buf[off + 7] & 0xff));
-   long msb =
+   long lsb =
        ((long) (buf[off + 8] & 0xff) << 56)
            | ((long) (buf[off + 9] & 0xff) << 48)
            | ((long) (buf[off + 10] & 0xff) << 40)
            | ((long) (buf[off + 11] & 0xff) << 32)
            | ((long) (buf[off + 12] & 0xff) << 24)
            | ((long) (buf[off + 13] & 0xff) << 16)
            | ((long) (buf[off + 14] & 0xff) << 8)
            | ((long) (buf[off + 15] & 0xff));
    return new UUID(msb, lsb);
  }
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

    byte[] buf = inoutTemp;
    int off = 0;

    if (trans_.getBytesRemainingInBuffer() >= 16) {
      buf = trans_.getBuffer();
      off = trans_.getBufferPosition();
      trans_.consumeBuffer(16);
    } else {
      readAll(inoutTemp, 0, 16);
    }
    long msb =
        ((long) (buf[off] & 0xff) << 56)
            | ((long) (buf[off + 1] & 0xff) << 48)
            | ((long) (buf[off + 2] & 0xff) << 40)
            | ((long) (buf[off + 3] & 0xff) << 32)
            | ((long) (buf[off + 4] & 0xff) << 24)
            | ((long) (buf[off + 5] & 0xff) << 16)
            | ((long) (buf[off + 6] & 0xff) << 8)
            | ((long) (buf[off + 7] & 0xff));

    long lsb =
        ((long) (buf[off + 8] & 0xff) << 56)
            | ((long) (buf[off + 9] & 0xff) << 48)
            | ((long) (buf[off + 10] & 0xff) << 40)
            | ((long) (buf[off + 11] & 0xff) << 32)
            | ((long) (buf[off + 12] & 0xff) << 24)
            | ((long) (buf[off + 13] & 0xff) << 16)
            | ((long) (buf[off + 14] & 0xff) << 8)
            | ((long) (buf[off + 15] & 0xff));
    return new UUID(msb, lsb);
  }

233-257: ⚠️ Potential issue

Verify the correctness of the writeUuid method.

In the writeUuid method, the most significant bits (MSB) and least significant bits (LSB) of the UUID are being written to the inoutTemp array. Currently, the LSB is written first to indices 0 to 7, and the MSB is written next to indices 8 to 15. According to the standard UUID byte representation, the MSB should be written first, followed by the LSB.

Apply this diff to correct the order:

 public void writeUuid(UUID uuid) throws TException {
-   {
-     long lsb = uuid.getLeastSignificantBits();
+   {
+     long msb = uuid.getMostSignificantBits();
      inoutTemp[0] = (byte) (0xff & (msb >> 56));
      inoutTemp[1] = (byte) (0xff & (msb >> 48));
      inoutTemp[2] = (byte) (0xff & (msb >> 40));
      inoutTemp[3] = (byte) (0xff & (msb >> 32));
      inoutTemp[4] = (byte) (0xff & (msb >> 24));
      inoutTemp[5] = (byte) (0xff & (msb >> 16));
      inoutTemp[6] = (byte) (0xff & (msb >> 8));
      inoutTemp[7] = (byte) (0xff & (msb));
-   }
-   {
-     long msb = uuid.getMostSignificantBits();
+   }
+   {
+     long lsb = uuid.getLeastSignificantBits();
      inoutTemp[8] = (byte) (0xff & (lsb >> 56));
      inoutTemp[9] = (byte) (0xff & (lsb >> 48));
      inoutTemp[10] = (byte) (0xff & (lsb >> 40));
      inoutTemp[11] = (byte) (0xff & (lsb >> 32));
      inoutTemp[12] = (byte) (0xff & (lsb >> 24));
      inoutTemp[13] = (byte) (0xff & (lsb >> 16));
      inoutTemp[14] = (byte) (0xff & (lsb >> 8));
      inoutTemp[15] = (byte) (0xff & (lsb));
    }
    trans_.write(inoutTemp, 0, 16);
  }
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

  public void writeUuid(UUID uuid) throws TException {
    {
      long msb = uuid.getMostSignificantBits();
      inoutTemp[0] = (byte) (0xff & (msb >> 56));
      inoutTemp[1] = (byte) (0xff & (msb >> 48));
      inoutTemp[2] = (byte) (0xff & (msb >> 40));
      inoutTemp[3] = (byte) (0xff & (msb >> 32));
      inoutTemp[4] = (byte) (0xff & (msb >> 24));
      inoutTemp[5] = (byte) (0xff & (msb >> 16));
      inoutTemp[6] = (byte) (0xff & (msb >> 8));
      inoutTemp[7] = (byte) (0xff & (msb));
    }
    {
      long lsb = uuid.getLeastSignificantBits();
      inoutTemp[8] = (byte) (0xff & (lsb >> 56));
      inoutTemp[9] = (byte) (0xff & (lsb >> 48));
      inoutTemp[10] = (byte) (0xff & (lsb >> 40));
      inoutTemp[11] = (byte) (0xff & (lsb >> 32));
      inoutTemp[12] = (byte) (0xff & (lsb >> 24));
      inoutTemp[13] = (byte) (0xff & (lsb >> 16));
      inoutTemp[14] = (byte) (0xff & (lsb >> 8));
      inoutTemp[15] = (byte) (0xff & (lsb));
    }
    trans_.write(inoutTemp, 0, 16);
  }
api/src/main/java/ai/chronon/api/thrift/TDeserializer.java (4)

52-54: ⚠️ Potential issue

Thread Safety Considerations

The class TDeserializer utilizes instance variables protocol_ and trans_ that are shared across method calls. If an instance of TDeserializer is accessed by multiple threads concurrently, this could lead to race conditions and unexpected behavior.

Consider documenting that TDeserializer instances are not thread-safe and should not be shared between threads. Alternatively, modify the implementation to ensure thread safety, such as by using thread-local variables or synchronizing access to shared resources.


195-208: ⚠️ Potential issue

Avoid Catching General Exception

In the partialDeserialize method, catching the general Exception can obscure specific issues and make debugging more challenging. It is advisable to catch only the exceptions that the method is expected to handle.

Refine the exception handling to catch specific exceptions relevant to the deserialization process, such as TException or IOException, to improve error clarity and maintainability.


402-441: ⚠️ Potential issue

Fields May Not Be Serialized in Ascending Order

The locateField method assumes that fields are serialized in ascending order of their field IDs to optimize the search. However, the Thrift protocol does not guarantee the order of serialized fields, which could result in missing the desired field during deserialization.

Adjust the field traversal logic to not rely on the ascending order of field IDs. Iterate through all fields, checking each field's ID against the desired field ID until it is found or the end of the struct is reached. This ensures accurate deserialization regardless of field order.


174-182: ⚠️ Potential issue

Include Original Exception When Encoding Is Unsupported

In the deserialize method, when catching UnsupportedEncodingException, a new TException is thrown without preserving the original exception. This omission can hinder debugging by losing the stack trace and the original exception details.

Include the original exception when rethrowing to preserve debugging information:

try {
  deserialize(base, data.getBytes(charset));
} catch (UnsupportedEncodingException uex) {
-  throw new TException("JVM DOES NOT SUPPORT ENCODING: " + charset);
+  throw new TException("JVM does not support encoding: " + charset, uex);
} finally {
  protocol_.reset();
}
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

  public void deserialize(ai.chronon.api.thrift.TBase base, String data, String charset) throws ai.chronon.api.thrift.TException {
    try {
      deserialize(base, data.getBytes(charset));
    } catch (UnsupportedEncodingException uex) {
      throw new ai.chronon.api.thrift.TException("JVM does not support encoding: " + charset, uex);
    } finally {
      protocol_.reset();
    }
  }
api/src/main/java/ai/chronon/api/thrift/protocol/TJSONProtocol.java (3)

638-697: 🛠️ Refactor suggestion

Optimize string decoding for better performance

In readJSONString(boolean skipContext), the method constructs strings from code points using new String(int[], int, int).getBytes(). This approach introduces unnecessary overhead by converting between int[], String, and byte[].

Consider using a ByteArrayOutputStream directly for UTF-8 encoded bytes or utilizing CharBuffer and CharsetEncoder for more efficient decoding.

Apply this diff to optimize the string decoding:

-              arr.write(
-                  (new String(new int[] {codeunits.get(0), codeunits.get(1)}, 0, 2))
-                      .getBytes(StandardCharsets.UTF_8));
+              String surrogatePair = new String(
+                  new char[] {codeunits.get(0), codeunits.get(1)});
+              arr.write(surrogatePair.getBytes(StandardCharsets.UTF_8));

This reduces the creation of intermediate objects and enhances performance.

Committable suggestion was skipped due to low confidence.


381-408: ⚠️ Potential issue

Correct character escaping in writeJSONString

The method writeJSONString(byte[] b) does not handle all special characters correctly according to JSON encoding rules. Specifically, control characters (code points less than 0x20) need to be escaped.

Ensure that all control characters are properly escaped to maintain valid JSON output.

Apply this diff to fix the character escaping:

 for (int i = 0; i < len; i++) {
-    if ((b[i] & 0x00FF) >= 0x30) {
+    if ((b[i] & 0xFF) >= 0x20 && b[i] != BACKSLASH[0] && b[i] != QUOTE[0]) {
         trans_.write(b, i, 1);
     } else {
         // Escape character

Committable suggestion was skipped due to low confidence.


627-632: ⚠️ Potential issue

Handle potential UnsupportedOperationException when accessing ByteBuffer

In writeBinary(ByteBuffer bin), the method accesses bin.array() without checking if the ByteBuffer has an accessible backing array. If the buffer is not backed by an array or is read-only, this will throw an UnsupportedOperationException.

To prevent this exception, check if the buffer has an array using bin.hasArray() before accessing it. If not, handle the buffer accordingly.

Apply this diff to fix the issue:

 public void writeBinary(ByteBuffer bin) throws TException {
-    writeJSONBase64(
-        bin.array(),
-        bin.position() + bin.arrayOffset(),
-        bin.limit() - bin.position() - bin.arrayOffset());
+    if (bin.hasArray()) {
+        writeJSONBase64(
+            bin.array(),
+            bin.position() + bin.arrayOffset(),
+            bin.remaining());
+    } else {
+        byte[] bytes = new byte[bin.remaining()];
+        bin.duplicate().get(bytes);
+        writeJSONBase64(bytes, 0, bytes.length);
+    }
 }
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

  public void writeBinary(ByteBuffer bin) throws TException {
    if (bin.hasArray()) {
        writeJSONBase64(
            bin.array(),
            bin.position() + bin.arrayOffset(),
            bin.remaining());
    } else {
        byte[] bytes = new byte[bin.remaining()];
        bin.duplicate().get(bytes);
        writeJSONBase64(bytes, 0, bytes.length);
    }
  }
api/src/main/java/ai/chronon/api/thrift/protocol/TCompactProtocol.java (5)

589-589: ⚠️ Potential issue

Undefined method checkReadBytesAvailable(TMap)

At line 589, the method checkReadBytesAvailable(map); is called, but there is no definition of checkReadBytesAvailable(TMap) in the class. This will cause a compilation error.

Apply this diff to fix the issue:

-        checkReadBytesAvailable(map);
+        getTransport().checkReadBytesAvailable(size);

Alternatively, define the checkReadBytesAvailable(TMap map) method if additional checks are needed.

Committable suggestion was skipped due to low confidence.


607-607: ⚠️ Potential issue

Undefined method checkReadBytesAvailable(TList)

At line 607, the method checkReadBytesAvailable(list); is called, but there is no definition of checkReadBytesAvailable(TList) in the class. This will cause a compilation error.

Apply this diff to fix the issue:

-        checkReadBytesAvailable(list);
+        getTransport().checkReadBytesAvailable(size);

Alternatively, define the checkReadBytesAvailable(TList list) method if additional checks are needed.

Committable suggestion was skipped due to low confidence.


468-477: 🛠️ Refactor suggestion

Refactor fixedLongToBytes using ByteBuffer for clarity

Consider using ByteBuffer with ByteOrder.LITTLE_ENDIAN in the fixedLongToBytes method to improve code readability and maintainability.

Apply this diff to refactor the method:

-  private void fixedLongToBytes(long n, byte[] buf, int off) {
-    buf[off + 0] = (byte) (n & 0xff);
-    buf[off + 1] = (byte) ((n >> 8) & 0xff);
-    buf[off + 2] = (byte) ((n >> 16) & 0xff);
-    buf[off + 3] = (byte) ((n >> 24) & 0xff);
-    buf[off + 4] = (byte) ((n >> 32) & 0xff);
-    buf[off + 5] = (byte) ((n >> 40) & 0xff);
-    buf[off + 6] = (byte) ((n >> 48) & 0xff);
-    buf[off + 7] = (byte) ((n >> 56) & 0xff);
-  }
+  private void fixedLongToBytes(long n, byte[] buf, int off) {
+    ByteBuffer.wrap(buf, off, 8)
+              .order(ByteOrder.LITTLE_ENDIAN)
+              .putLong(n);
+  }
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

  private void fixedLongToBytes(long n, byte[] buf, int off) {
    ByteBuffer.wrap(buf, off, 8)
              .order(ByteOrder.LITTLE_ENDIAN)
              .putLong(n);
  }

782-805: ⚠️ Potential issue

Prevent infinite loops in readVarint32() by adding a shift limit

In the readVarint32() method, there is a possibility of an infinite loop or integer overflow if the data is malformed and lacks a terminating byte with the MSB unset. To safeguard against this, add a maximum shift limit.

Apply this diff to fix the issue:

private int readVarint32() throws TException {
    int result = 0;
    int shift = 0;
+   while (true) {
+     if (shift >= 32) {
+       throw new TProtocolException("Variable-length int overflows a 32-bit integer");
+     }
      byte b = readByte();
      result |= (b & 0x7f) << shift;
      if ((b & 0x80) != 0x80) break;
      shift += 7;
    }
    return result;
}

Committable suggestion was skipped due to low confidence.


812-836: ⚠️ Potential issue

Prevent infinite loops in readVarint64() by adding a shift limit

In the readVarint64() method, there is a possibility of an infinite loop or integer overflow if the data is malformed and lacks a terminating byte with the MSB unset. To safeguard against this, add a maximum shift limit.

Apply this diff to fix the issue:

private long readVarint64() throws TException {
    int shift = 0;
    long result = 0;
+   while (true) {
+     if (shift >= 64) {
+       throw new TProtocolException("Variable-length int overflows a 64-bit integer");
+     }
      byte b = readByte();
      result |= (long) (b & 0x7f) << shift;
      if ((b & 0x80) != 0x80) break;
      shift += 7;
    }
    return result;
}

Committable suggestion was skipped due to low confidence.

Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 3

📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL

📥 Commits

Files that changed from the base of the PR and between bf7d112 and 4d3072b.

📒 Files selected for processing (1)
  • project/ThriftGen.scala (1 hunks)
🔇 Additional comments (1)
project/ThriftGen.scala (1)

9-25: LGTM! Improved error handling and logging.

The changes to print_and_execute enhance robustness by:

  • Properly capturing and logging both stdout and stderr
  • Returning the process exit code
  • Adding comprehensive exception handling

Comment on lines +45 to +50
print_and_execute(s"thrift --gen $language:generated_annotations=suppress -out $outputPath $inputPath")
val javaFiles = (PathFinder(new File(s"$outputPath/ai/chronon/api/")) ** "*.java").get()
javaFiles.foreach { file =>
println(s"Processing file: ${file.getPath}")
replaceInFile(file)
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🛠️ Refactor suggestion

Consider a more functional approach to file processing.

The file processing loop could be more idiomatic Scala.

-    val javaFiles = (PathFinder(new File(s"$outputPath/ai/chronon/api/")) ** "*.java").get()
-    javaFiles.foreach { file =>
-      println(s"Processing file: ${file.getPath}")
-      replaceInFile(file)
-    }
+    (PathFinder(new File(s"$outputPath/ai/chronon/api/")) ** "*.java")
+      .get()
+      .tapEach(file => println(s"Processing file: ${file.getPath}"))
+      .foreach(replaceInFile)
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
print_and_execute(s"thrift --gen $language:generated_annotations=suppress -out $outputPath $inputPath")
val javaFiles = (PathFinder(new File(s"$outputPath/ai/chronon/api/")) ** "*.java").get()
javaFiles.foreach { file =>
println(s"Processing file: ${file.getPath}")
replaceInFile(file)
}
print_and_execute(s"thrift --gen $language:generated_annotations=suppress -out $outputPath $inputPath")
(PathFinder(new File(s"$outputPath/ai/chronon/api/")) ** "*.java")
.get()
.tapEach(file => println(s"Processing file: ${file.getPath}"))
.foreach(replaceInFile)

Comment on lines +27 to +37
def replaceInFile(file: File): Unit = {
val source = scala.io.Source.fromFile(file)
val content = source.mkString
source.close()
val newContent = content.replace("org.apache.thrift", "ai.chronon.api.thrift")
val writer = new java.io.PrintWriter(file)
try {
writer.write(newContent)
} finally {
writer.close()
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🛠️ Refactor suggestion

Improve resource management in file operations.

The current implementation could leak resources if an exception occurs between operations. Consider using Scala's Using for safer resource management.

-  def replaceInFile(file: File): Unit = {
-    val source = scala.io.Source.fromFile(file)
-    val content = source.mkString
-    source.close()
-    val newContent = content.replace("org.apache.thrift", "ai.chronon.api.thrift")
-    val writer = new java.io.PrintWriter(file)
-    try {
-      writer.write(newContent)
-    } finally {
-      writer.close()
-    }
+  def replaceInFile(file: File): Unit = {
+    scala.util.Using.resources(
+      scala.io.Source.fromFile(file),
+      new java.io.PrintWriter(file)
+    ) { (source, writer) =>
+      val content = source.mkString
+      writer.write(content.replace("org.apache.thrift", "ai.chronon.api.thrift"))
+    }.get  // Consider proper error handling instead of .get
   }
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
def replaceInFile(file: File): Unit = {
val source = scala.io.Source.fromFile(file)
val content = source.mkString
source.close()
val newContent = content.replace("org.apache.thrift", "ai.chronon.api.thrift")
val writer = new java.io.PrintWriter(file)
try {
writer.write(newContent)
} finally {
writer.close()
}
def replaceInFile(file: File): Unit = {
scala.util.Using.resources(
scala.io.Source.fromFile(file),
new java.io.PrintWriter(file)
) { (source, writer) =>
val content = source.mkString
writer.write(content.replace("org.apache.thrift", "ai.chronon.api.thrift"))
}.get // Consider proper error handling instead of .get
}

Comment on lines 41 to +45
s"""echo "Generating files from thrift file: $inputPath \ninto folder $outputPath" """ !;
print_and_execute(s"rm -rf $outputPath/$cleanupSuffixPath")
s"mkdir -p $outputPath" !;
print_and_execute(s"thrift -version")
print_and_execute(s"thrift --gen $language -out $outputPath $inputPath")
print_and_execute(s"thrift --gen $language:generated_annotations=suppress -out $outputPath $inputPath")
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

💡 Codebase verification

Security risk: Unsanitized path variables in shell commands

The code directly interpolates user-provided paths into shell commands without proper sanitization, which could lead to command injection vulnerabilities. Specifically:

  • Direct string interpolation in shell commands: s"rm -rf $outputPath/$cleanupSuffixPath"
  • No validation or escaping of inputPath, outputPath, and cleanupSuffixPath before use
  • While Validate.checkPath* methods exist in the codebase, they are not utilized in ThriftGen

Key risk points:

  • rm -rf command with unsanitized paths
  • Directory creation with unchecked paths
  • Thrift command execution with raw path variables
🔗 Analysis chain

Verify path sanitization for shell command execution.

The code executes shell commands with user-provided paths. We should ensure proper path sanitization to prevent command injection.

🏁 Scripts executed

The following scripts were executed for the analysis:

Script:

#!/bin/bash
# Check for any existing path sanitization in the codebase
rg -l 'sanitize.*Path|escape.*Path' 

# Check how inputPath and outputPath are constructed
ast-grep --pattern 'def gen($_, $_, $_)'

Length of output: 80


Script:

#!/bin/bash
# Let's check the full implementation of ThriftGen to understand path handling
rg -A 10 -B 10 "class ThriftGen"

# Also check how these paths are used across the codebase
rg "inputPath|outputPath" 

# Look for any path validation or security measures
rg "validatePath|checkPath|safePath"

Length of output: 2473

Copy link
Contributor

@piyush-zlai piyush-zlai left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Largely looks good - few nits I flagged

build.sbt Outdated
libraryDependencies ++= spark_all.map(_ % "test"),
libraryDependencies += "jakarta.servlet" % "jakarta.servlet-api" % "4.0.3"
libraryDependencies += "jakarta.servlet" % "jakarta.servlet-api" % "4.0.3",
// Test / fork := true
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

remove?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

done

case "plugin.xml" => MergeStrategy.last
case PathList("com", "fasterxml", _*) => MergeStrategy.last
case PathList("com", "google", _*) => MergeStrategy.last
case _ => MergeStrategy.first
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

revert formatting?

.config("spark.sql.session.timeZone", "UTC")
.config("spark.sql.adaptive.enabled", "false")
.config("spark.sql.legacy.timeParserPolicy", "LEGACY")
.config("spark.ui.enabled", "false")
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

needed?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

catalyst util shouldn't trigger ui's IMO. I would prefer to keep this in.

if (local) {
//required to run spark locally with hive support enabled - for sbt test
System.setSecurityManager(null)
// System.setSecurityManager(null)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

should we revert for now or yank out the full if local block here?

@nikhil-zlai nikhil-zlai merged commit 759e2e3 into main Oct 30, 2024
7 checks passed
@nikhil-zlai nikhil-zlai deleted the thrift_upgrade branch October 30, 2024 21:02
kumar-zlai pushed a commit that referenced this pull request Apr 25, 2025
## Summary
old version is old AF - and won't install any more

shading thrift simply wasn't enough to prevent collisions with the hive
metastore path within spark. I also had to remove the serving layer and
retain purely the serialization layer. There is an argument to make this
a java package. That can be a TODO for later.


## Checklist
- [ ] Added Unit Tests
- [x] Covered by existing CI
- [ ] Integration tested
- [ ] Documentation update



<!-- This is an auto-generated comment: release notes by coderabbit.ai
-->
## Summary by CodeRabbit

- **New Features**
- Updated Thrift version to 0.21.0, enhancing compatibility and
performance.
- Added support for additional development tools and libraries for
improved functionality.
- Disabled Spark UI during local session execution for a cleaner testing
environment.
- Introduced new utility classes and methods for better data handling
and serialization.
- Enhanced error handling and logging in various components for improved
robustness.

- **Bug Fixes**
- Improved file generation process with updated Thrift command to handle
annotations correctly.

- **Documentation**
- Added comprehensive documentation for partial deserialization in the
Thrift framework.
<!-- end of auto-generated comment: release notes by coderabbit.ai -->
kumar-zlai pushed a commit that referenced this pull request Apr 29, 2025
## Summary
old version is old AF - and won't install any more

shading thrift simply wasn't enough to prevent collisions with the hive
metastore path within spark. I also had to remove the serving layer and
retain purely the serialization layer. There is an argument to make this
a java package. That can be a TODO for later.


## Checklist
- [ ] Added Unit Tests
- [x] Covered by existing CI
- [ ] Integration tested
- [ ] Documentation update



<!-- This is an auto-generated comment: release notes by coderabbit.ai
-->
## Summary by CodeRabbit

- **New Features**
- Updated Thrift version to 0.21.0, enhancing compatibility and
performance.
- Added support for additional development tools and libraries for
improved functionality.
- Disabled Spark UI during local session execution for a cleaner testing
environment.
- Introduced new utility classes and methods for better data handling
and serialization.
- Enhanced error handling and logging in various components for improved
robustness.

- **Bug Fixes**
- Improved file generation process with updated Thrift command to handle
annotations correctly.

- **Documentation**
- Added comprehensive documentation for partial deserialization in the
Thrift framework.
<!-- end of auto-generated comment: release notes by coderabbit.ai -->
chewy-zlai pushed a commit that referenced this pull request May 16, 2025
## Summary
old version is old AF - and won't install any more

shading thrift simply wasn't enough to prevent collisions with the hive
metastore path within spark. I also had to remove the serving layer and
retain purely the serialization layer. There is an argument to make this
a java paour clientsage. That can be a TODO for later.


## Cheour clientslist
- [ ] Added Unit Tests
- [x] Covered by existing CI
- [ ] Integration tested
- [ ] Documentation update



<!-- This is an auto-generated comment: release notes by coderabbit.ai
-->
## Summary by CodeRabbit

- **New Features**
- Updated Thrift version to 0.21.0, enhancing compatibility and
performance.
- Added support for additional development tools and libraries for
improved functionality.
- Disabled Spark UI during local session execution for a cleaner testing
environment.
- Introduced new utility classes and methods for better data handling
and serialization.
- Enhanced error handling and logging in various components for improved
robustness.

- **Bug Fixes**
- Improved file generation process with updated Thrift command to handle
annotations correctly.

- **Documentation**
- Added comprehensive documentation for partial deserialization in the
Thrift framework.
<!-- end of auto-generated comment: release notes by coderabbit.ai -->
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants