-
-
Notifications
You must be signed in to change notification settings - Fork 88
feat: BOLT protocol first version #3250
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
Summary of ChangesHello @lvca, I'm Gemini Code Assist1! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed! This pull request delivers a significant new feature by integrating the Neo4j BOLT protocol into ArcadeDB. This integration allows ArcadeDB to communicate directly with Neo4j drivers, opening up new avenues for applications built on the Neo4j ecosystem to leverage ArcadeDB's capabilities. The implementation covers essential aspects of the protocol, from connection establishment and authentication to query execution and transaction handling, ensuring a robust and compatible experience for users. Highlights
🧠 New Feature in Public Preview: You can now enable Memory to help Gemini Code Assist learn from your team's feedback. This makes future code reviews more consistent and personalized to your project's style. Click here to enable Memory in your admin console. Using Gemini Code AssistThe full guide for Gemini Code Assist can be found on our documentation page, here are some quick tips. Invoking Gemini You can request assistance from Gemini at any point by creating a comment using either
Customization To customize Gemini Code Assist for GitHub experience, repository maintainers can create a configuration file and/or provide a custom code review style guide (such as PEP-8 for Python) by creating and adding files to a Limitations & Feedback Gemini Code Assist may make mistakes. Please leave feedback on any instances where its feedback is incorrect or counter productive. You can react with 👍 and 👎 on @gemini-code-assist comments. If you're interested in giving your feedback about your experience with Gemini Code Assist for Github and other Google products, sign up here. You can also get AI-powered code generation, chat, as well as code reviews directly in the IDE at no cost with the Gemini Code Assist IDE Extension. Footnotes
|
🧪 CI InsightsHere's what we observed from your CI run for 8279746. 🟢 All jobs passed!But CI Insights is watching 👀 |
Code Review: BOLT Protocol ImplementationThis is a substantial and well-executed feature (4,530 lines) implementing BOLT protocol support for Neo4j driver compatibility. ✅ Strengths
🔍 Critical Issues1. Resource Leak - Database Not ClosedLocation: BoltNetworkExecutor.java:724, cleanup() method (lines 811-829) The database instance from Fix: Add database.close() in cleanup() method 2. ResultSet Not ClosedLocation: BoltNetworkExecutor.java lines 383, 508, 546, 648 The currentResultSet is never explicitly closed, holding database resources. Fix: Add try/close blocks when nulling currentResultSet 📋 Medium Priority Issues3. Potential Integer OverflowLocation: BoltStructureMapper.java:311 RID-to-ID conversion could overflow for bucket IDs > 2^16. Consider adding validation. 4. Thread Safety DocumentationInstance variables (currentResultSet, currentFields, firstResult) lack thread safety documentation. 🎯 SummaryHigh-quality production code with proper architecture. Approve after fixing resource leaks (issues 1-2). Recommended actions:
Great work! This will be a valuable addition to ArcadeDB. 🚀 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Code Review
This pull request introduces the first version of the BOLT protocol support for ArcadeDB, enabling compatibility with Neo4j drivers. The implementation includes chunked message framing, PackStream serialization/deserialization, and a state machine for handling various BOLT messages like HELLO, RUN, PULL, COMMIT, and ROLLBACK. It also provides mapping of ArcadeDB graph elements to BOLT structures. Comprehensive unit and integration tests are included, covering core functionalities, data types, and transaction management. New global configuration properties have been added for BOLT port, host, and debug settings. Overall, this is a substantial and well-tested addition to ArcadeDB's protocol support.
| // Build success response with query metadata | ||
| final Map<String, Object> metadata = new LinkedHashMap<>(); | ||
| metadata.put("fields", currentFields); | ||
| metadata.put("t_first", 0L); // Time to first record (placeholder) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The t_first metadata field is currently a placeholder (0L). For accurate performance monitoring and client-side analysis, this value should reflect the actual time to the first record. Consider implementing proper timing for this metric.
| metadata.put("t_first", 0L); // Time to first record (placeholder) | |
| metadata.put("t_first", 0L); // TODO: Implement actual time to first record calculation |
| metadata.put("type", "r"); // Read-only query type | ||
| metadata.put("t_last", 0L); // Time to last record |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The type metadata field is hardcoded to "r" (read-only), and t_last is a placeholder (0L). The query type should be dynamically determined based on the executed query (e.g., read, write, schema, etc.), and t_last should be calculated for accurate performance metrics.
| metadata.put("type", "r"); // Read-only query type | |
| metadata.put("t_last", 0L); // Time to last record | |
| metadata.put("type", "r"); // TODO: Determine query type dynamically | |
| metadata.put("t_last", 0L); // TODO: Implement actual time to last record calculation |
| if (databaseName == null || databaseName.isEmpty()) { | ||
| // Try to get default database or first available | ||
| final Collection<String> databases = server.getDatabaseNames(); | ||
| if (databases.isEmpty()) { | ||
| sendFailure(BoltException.DATABASE_ERROR, "No database available"); | ||
| state = State.FAILED; | ||
| return false; | ||
| } | ||
| databaseName = databases.iterator().next(); | ||
| } |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If databaseName is null or empty, the ensureDatabase method attempts to pick the "first available" database. This behavior can be unpredictable in environments with multiple databases. It would be more robust to either enforce an explicit database name or define a clear, configurable default database for BOLT connections.
| final String address = host + ":" + port; | ||
|
|
||
| final Map<String, Object> rt = new LinkedHashMap<>(); | ||
| rt.put("ttl", 300L); // 5 minute TTL |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The ttl for the routing table is hardcoded to 300L. Consider making this value configurable through GlobalConfiguration to allow administrators to adjust it based on their cluster topology and update frequency requirements.
| rt.put("ttl", 300L); // 5 minute TTL | |
| rt.put("ttl", 300L); // TODO: Make configurable via GlobalConfiguration |
| final Map<String, Object> extra = fields.isEmpty() ? Map.of() : (Map<String, Object>) fields.get(0); | ||
| return new HelloMessage(extra); | ||
| } | ||
|
|
||
| @SuppressWarnings("unchecked") | ||
| private static RunMessage parseRun(final List<Object> fields) { | ||
| final String query = (String) fields.get(0); | ||
| final Map<String, Object> parameters = fields.size() > 1 && fields.get(1) != null ? (Map<String, Object>) fields.get(1) : Map.of(); | ||
| final Map<String, Object> extra = fields.size() > 2 && fields.get(2) != null ? (Map<String, Object>) fields.get(2) : Map.of(); | ||
| return new RunMessage(query, parameters, extra); | ||
| } | ||
|
|
||
| @SuppressWarnings("unchecked") | ||
| private static BeginMessage parseBegin(final List<Object> fields) { | ||
| final Map<String, Object> extra = fields.isEmpty() || fields.get(0) == null ? Map.of() : (Map<String, Object>) fields.get(0); | ||
| return new BeginMessage(extra); | ||
| } | ||
|
|
||
| @SuppressWarnings("unchecked") | ||
| private static DiscardMessage parseDiscard(final List<Object> fields) { | ||
| final Map<String, Object> extra = fields.isEmpty() || fields.get(0) == null ? Map.of() : (Map<String, Object>) fields.get(0); | ||
| return new DiscardMessage(extra); | ||
| } | ||
|
|
||
| @SuppressWarnings("unchecked") | ||
| private static PullMessage parsePull(final List<Object> fields) { | ||
| final Map<String, Object> extra = fields.isEmpty() || fields.get(0) == null ? Map.of() : (Map<String, Object>) fields.get(0); | ||
| return new PullMessage(extra); | ||
| } | ||
|
|
||
| @SuppressWarnings("unchecked") | ||
| private static LogonMessage parseLogon(final List<Object> fields) { | ||
| final Map<String, Object> auth = fields.isEmpty() || fields.get(0) == null ? Map.of() : (Map<String, Object>) fields.get(0); | ||
| return new LogonMessage(auth); | ||
| } | ||
|
|
||
| @SuppressWarnings("unchecked") | ||
| private static RouteMessage parseRoute(final List<Object> fields) { | ||
| final Map<String, Object> routing = fields.isEmpty() || fields.get(0) == null ? Map.of() : (Map<String, Object>) fields.get(0); | ||
| final List<String> bookmarks = fields.size() > 1 && fields.get(1) != null ? (List<String>) fields.get(1) : List.of(); | ||
| final String database = fields.size() > 2 ? (String) fields.get(2) : null; | ||
| return new RouteMessage(routing, bookmarks, database); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The parsing logic for message fields (e.g., fields.isEmpty() ? Map.of() : (Map<String, Object>) fields.get(0)) is quite verbose and repeated. While functional, it could be made more concise and potentially safer against IndexOutOfBoundsException by using helper methods that safely retrieve and cast fields with default values.
| public static final String AUTHENTICATION_ERROR = "Neo.ClientError.Security.Unauthorized"; | ||
| public static final String SYNTAX_ERROR = "Neo.ClientError.Statement.SyntaxError"; | ||
| public static final String SEMANTIC_ERROR = "Neo.ClientError.Statement.SemanticError"; | ||
| public static final String DATABASE_ERROR = "Neo.DatabaseError.General.UnknownError"; | ||
| public static final String TRANSACTION_ERROR = "Neo.ClientError.Transaction.TransactionNotFound"; | ||
| public static final String FORBIDDEN_ERROR = "Neo.ClientError.Security.Forbidden"; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thank you for addressing the suggestion to centralize the error codes! I see that you've created BoltErrorCodes.java and updated BoltException.java to reference these new constants, which is great progress.
Regarding your request to implement this, as an AI, I'm unable to directly modify the codebase. However, I can provide a code suggestion for the current file (bolt/src/main/java/com/arcadedb/bolt/message/FailureMessage.java) to complete the refactoring.
The constants defined here (lines 71-76) are now redundant, as the canonical definitions are in BoltErrorCodes.java and other classes (like BoltNetworkExecutor) are correctly referencing BoltException (which now points to BoltErrorCodes). Removing these unused definitions will clean up the FailureMessage class.
Here's a suggestion to remove these duplicated constant definitions:
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Warning
Gemini encountered an error creating the review-comment-reply. You can try again by commenting /gemini review-comment-reply.
|
@claude please implement ALL your recommendations |
|
Claude finished @lvca's task in 4m 47s —— View job Implementation Complete ✅I've successfully implemented all code review recommendations for the BOLT protocol implementation. Critical Issues (Claude Review) - All Fixed ✅
Medium Priority Issues (Gemini Review) - All Implemented ✅
Summary of ChangesFiles Modified:
Files Created:
Next StepsThe changes have been committed and pushed. CI will automatically run the full test suite to verify all changes work correctly. The implementation addresses all critical resource leaks and improves code maintainability through better organization and documentation. Commit: 8cf94e2 |
Critical fixes: - Close database instance in cleanup() to prevent resource leak - Close ResultSet in multiple handlers (RESET, PULL, DISCARD, ROLLBACK) - Add bucket ID validation in ridToId() to prevent integer overflow - Add thread safety documentation for instance variables Improvements: - Extract authentication logic into reusable helper method - Add configurable routing table TTL (BOLT_ROUTING_TTL) - Centralize error codes in BoltErrorCodes class - Add TODOs for timing metrics and query type determination - Add TODO for database selection behavior clarification Co-authored-by: Luca Garulli <[email protected]>
Coverage summary from CodacySee diff coverage on Codacy
Coverage variation details
Coverage variation is the difference between the coverage for the head and common ancestor commits of the pull request branch: Diff coverage details
Diff coverage is the percentage of lines that are covered by tests out of the coverable lines that the pull request added or modified: See your quality gate settings Change summary preferences |
Pull Request Review: BOLT Protocol ImplementationThis is an impressive and substantial implementation of the Neo4j BOLT protocol for ArcadeDB. The code demonstrates solid engineering with proper protocol implementation, state management, and integration with ArcadeDB's native OpenCypher engine. ✅ Strengths
|
Code Review: BOLT Protocol ImplementationThank you for this comprehensive implementation of the BOLT protocol! This is a substantial and well-structured addition to ArcadeDB. Below is a detailed review covering security, architecture, performance, and code quality. ✅ StrengthsArchitecture & Module Structure
Security
Resource Management
Test Coverage
|
|
@claude please implement your suggestions:
|
|
Claude finished @lvca's task in 5m 35s —— View job Implementation Complete ✅Successfully implemented all code review recommendations for the BOLT protocol implementation. Tasks Completed
Implementation DetailsTask 1: Query Analyzer Integration ✅Problem: String-based keyword matching for write detection was fragile and could produce false positives. Solution: Replaced with Location: Task 2: Default Database Configuration ✅Problem: Database selection defaulted to first available database without configuration option. Solution:
Locations:
Task 3: Performance Metrics ✅Problem: TODOs for Solution:
Locations:
Task 4: Exception Handling ✅Problem: Solution: Added null checks with descriptive fallback messages across all exception handlers. Locations: Applied in:
Task 5: Connection Limiting ✅Problem: No connection limit enforcement, potential for resource exhaustion. Solution:
Locations:
Task 7: New Integration Tests ✅Added 9 comprehensive integration tests to
Note: The existing test VerificationAll changes have been committed and pushed to the
Commit: d039e85 |
- Replace string-based write query detection with QueryEngine.analyze() - Add BOLT_DEFAULT_DATABASE and BOLT_MAX_CONNECTIONS configurations - Implement accurate timing metrics (t_first, t_last) using nanoTime - Determine query type dynamically (read vs write) - Improve exception handling with null checks and fallback messages - Add connection limiting with concurrent tracking - Add 9 new integration tests covering: * Database selection and switching * Concurrent sessions (5 threads) * Invalid database handling * Connection pooling stress test * Transaction isolation * Large parameter maps (50 params) * NULL parameter handling * Multiple error recovery Co-authored-by: Luca Garulli <[email protected]>
PR #3250 Review: BOLT Protocol ImplementationExecutive SummaryThis PR adds comprehensive Neo4j BOLT protocol (v4.4, v4.0, v3.0) support to ArcadeDB. The implementation is well-architected with 47 files adding ~6,500 lines of code, including a complete protocol state machine, PackStream serialization, and multi-label vertex support for OpenCypher. Overall Assessment: ✅ Approve with Minor Concerns The code quality is strong and follows project conventions. Test coverage is excellent with real Neo4j driver-based integration tests. A few areas need attention before or shortly after merge. Code Quality & Best Practices✅ Strengths
Potential Bugs & Issues🔴 Critical1. Multi-Label Pattern Matching - Needs Verification The multi-label implementation uses composite types (e.g.,
Location: Recommendation: Add explicit test case for multi-label pattern matching and document the matching semantics. 2. Temporal Type Handling - Compatibility Risk Date/time types are converted to ISO strings instead of BOLT temporal types. This means Neo4j client libraries may expect native temporal types (DateTimeValue, LocalDateValue, etc.), which could lead to:
Location: Recommendation: Implement BOLT v4 temporal type serialization per specification. 🟡 Medium Priority3. Connection Lifecycle Management No connection timeout or idle connection cleanup in
Recommendation: Add connection timeout settings and periodic cleanup task. 4. RID-to-Long Conversion Limitations The ID generation in
Recommendation: Document this limitation in release notes and consider validation at bucket creation time. 5. Write Query Detection Fallback Query analysis failure assumes write operation (conservative but potentially inefficient). Read queries that fail analysis will use write path. Recommendation: Add logging/metrics for analysis failures to identify problematic queries. 🟢 Low Priority6. Cluster Support Not Implemented ROUTE message handler returns same server for all roles. Single-server deployments work fine; cluster deployments need future enhancement. Recommendation: Document as single-server limitation in release notes. 7. Bookmark Implementation Simple timestamp-based bookmarks won't work correctly in distributed scenarios or with clock skew. Recommendation: Future enhancement for HA deployments. Performance Considerations✅ Good Practices
|
Code Review - PR #3250: BOLT Protocol SupportThis PR implements Neo4j BOLT protocol support for ArcadeDB. The implementation is well-structured with excellent test coverage, but there are several critical issues that must be addressed before merging. 🚨 CRITICAL ISSUES (Must Fix)1. Database Resource Leak - Will Break Multi-User ScenariosLocation: try {
if (database != null) {
database.close(); // ⚠️ CLOSES SHARED DATABASE INSTANCE!
database = null;
}
} catch (final Exception e) {
// Ignore
}Problem: The Impact:
Evidence: The PostgreSQL wire protocol implementation ( Fix: Remove the try {
if (database != null) {
// Server manages database lifecycle - just null out our reference
database = null;
}
} catch (final Exception e) {
// Ignore
}2. Silent ResultSet Resource LeakLocations: Lines 354-360, 524-528, 565-570, 672-678 Multiple places close ResultSet with empty catch blocks: try {
currentResultSet.close();
} catch (final Exception e) {
// Ignore // ⚠️ Silently ignores closure failures
}Problem:
Fix: Add logging for closure failures: try {
currentResultSet.close();
} catch (final Exception e) {
LogManager.instance().log(this, Level.WARNING,
"Failed to close ResultSet", e);
}Additional Concern (Lines 496-508): The while (currentResultSet.hasNext() && (n < 0 || count < n)) {
final Result record = currentResultSet.next(); // Could throw
// ... processing
}If iteration throws an exception, the ResultSet might not be properly closed. Consider wrapping iteration in try-catch with cleanup in finally block. 🔴 HIGH PRIORITY ISSUES (Should Fix)3. RID Position Overflow RiskLocation: public static long ridToId(final RID rid) {
// ... bucket validation ...
return ((long) bucketId << 48) | (rid.getPosition() & 0xFFFFFFFFFFFFL);
}Problem: No validation that position fits in 48 bits. If position exceeds Fix: Add position validation: final long position = rid.getPosition();
if (position < 0 || position > 0xFFFFFFFFFFFFL) {
throw new IllegalArgumentException(
"Position out of range for BOLT ID conversion: " + position);
}
return ((long) bucketId << 48) | position;4. Authentication Error Handling GapLocation: private boolean authenticateUser(final String principal, final String credentials) throws IOException {
if (principal == null || credentials == null) {
return false; // ⚠️ No failure sent to client!
}
// ...
}Problem: When credentials are null, method returns false without sending a failure message to the client. Client may hang waiting for response. Fix: if (principal == null || credentials == null) {
sendFailure(BoltException.AUTHENTICATION_ERROR,
"Missing credentials");
state = State.FAILED;
return false;
}5. Precision Loss in Number ConversionLocation: if (number instanceof BigInteger bigInt) {
try {
return bigInt.longValueExact();
} catch (final ArithmeticException e) {
return bigInt.doubleValue(); // ⚠️ Loss of precision
}
}
if (number instanceof BigDecimal bigDec) {
return bigDec.doubleValue(); // ⚠️ Loss of precision
}Problem:
Impact: Clients expecting exact numeric values may receive approximations. Recommendations:
|
Coverage summary from CodacySee diff coverage on Codacy
Coverage variation details
Coverage variation is the difference between the coverage for the head and common ancestor commits of the pull request branch: Diff coverage details
Diff coverage is the percentage of lines that are covered by tests out of the coverable lines that the pull request added or modified: See your quality gate settings Change summary preferences |
|
@claude please implement ALL your suggestions to best address the issues you found |
|
Claude finished @lvca's task in 4m 10s —— View job Implementation Complete ✅Successfully implemented all code review recommendations to address critical, high-priority, and medium-priority issues in the BOLT protocol implementation. Summary of ChangesAll identified issues have been fixed:
Files Modified
Commit DetailsCommit: 8279746 Next StepsThe changes have been pushed and CI will automatically run the full test suite including:
The most critical fix is the removal of View job | View commit | |
- CRITICAL: Remove database.close() call in cleanup() - database is managed by server - CRITICAL: Add logging for all ResultSet closure failures - HIGH: Add position validation in ridToId() to prevent overflow - HIGH: Fix authentication error handling to send failure message to client - HIGH: Document precision loss for BigInteger/BigDecimal conversions - MEDIUM: Add database health check in handleReset() to force re-auth if needed - MEDIUM: Add rollback attempt in transaction exception handler - MEDIUM: Sanitize authentication error messages to prevent information disclosure - LOW: Add query analysis failure logging with improved verbosity control - LOW: Add test cases for database persistence, connection limits, and path structures Co-authored-by: Luca Garulli <[email protected]>
Key Features Implemented
Usage
To enable the BOLT plugin, configure:
Then connect with any Neo4j driver: