Skip to content

Conversation

@santigimeno
Copy link
Member

@santigimeno santigimeno commented Apr 8, 2025

This commit introduces performance optimizations for tracing span attribute handling by adding two new fast API calls:

  1. FastPushSpanDataString - Optimized version of the existing string push method
  2. FastPushSpanDataString3 - New method to efficiently handle three string values in a single call

Key changes include:

  • Avoided JavaScript string concatenation which produces non-flattened V8 strings that cannot trigger fast API paths
  • Implemented direct passing of individual string components to C++ where concatenation can happen after the fast API boundary
  • Refactored HTTP URL construction in both client and server to leverage this optimization
  • Improved span context handling by storing trace ID, span ID, and parent span ID as separate attributes rather than a single concatenated string

These optimizations significantly improve performance in high load scenarios:

  • Up to 3.24% higher throughput in maximum load tests
  • Up to 57.75% lower latency at P99.9 under constant high rate (50k req/sec)
  • Substantial improvements across all high percentile latencies (P95-P99.999)
  • Most dramatic gains observed in tail latencies under sustained heavy load
=== Benchmark Summary ===

Benchmark Type: wrk2 (constant rate)
Old Version: ./nsolid_baseline
New Version: ./nsolid_fast_push_string
Iterations: 50
Connections: 10
Duration: 60s
Rate: 40k req/sec

+------------------+-------------+-------------+------------+-----+
| Metric           | Old Version | New Version | Difference | Sig |
+------------------+-------------+-------------+------------+-----+
| Avg Latency (ms) | 2.58        | 2.34        | N/A        |     |
| P50 Latency (ms) | 1.09        | 1.10        | +0.80%     | *** |
| P90 Latency (ms) | 2.22        | 2.20        | -0.63%     | *** |
| P99 Latency (ms) | 3.09        | 2.99        | -3.31%     | *** |
| Avg Req/Sec      | 39,992.50   | 39,992.53   | +0.00%     |     |
+------------------+-------------+-------------+------------+-----+

=== Benchmark Summary ===

Benchmark Type: wrk2 (constant rate)
Old Version: ./nsolid_baseline
New Version: ./nsolid_fast_push_string
Iterations: 50
Connections: 10
Duration: 60s
Rate: 50k req/sec

+------------------+-------------+-------------+------------+-----+
| Metric           | Old Version | New Version | Difference | Sig |
+------------------+-------------+-------------+------------+-----+
| Avg Latency (ms) | 9.92        | 5.20        | N/A        |     |
| P50 Latency (ms) | 0.95        | 0.94        | -0.54%     |     |
| P90 Latency (ms) | 2.54        | 2.35        | -7.47%     | *** |
| P99 Latency (ms) | 12.00       | 5.58        | -53.49%    | *** |
| Avg Req/Sec      | 49,990.78   | 49,990.72   | -0.00%     |     |
+------------------+-------------+-------------+------------+-----+

=== Benchmark Summary ===

Benchmark Type: wrk (maximum throughput)
Old Version: ./nsolid_baseline
New Version: ./nsolid_fast_push_string
Iterations: 50
Connections: 10
Duration: 60s
Threads: 2

+------------------+-------------+-------------+------------+-----+
| Metric           | Old Version | New Version | Difference | Sig |
+------------------+-------------+-------------+------------+-----+
| Avg Latency (ms) | 0.65        | 0.64        | N/A        |     |
| P50 Latency (ms) | 0.17        | 0.16        | -3.48%     | *** |
| P90 Latency (ms) | 0.33        | 0.32        | -2.30%     | *** |
| P99 Latency (ms) | 0.36        | 0.35        | -3.33%     | *** |
| Avg Req/Sec      | 52,398.20   | 54,094.55   | +3.24%     | *** |
+------------------+-------------+-------------+------------+-----+

Summary by CodeRabbit

  • Refactor

    • Improved how HTTP request URLs and tracing span data are recorded, now storing URL components separately for enhanced clarity in tracing.
    • Updated internal handling of trace and span IDs to use distinct fields rather than combined strings.
    • Streamlined and optimized span data processing for better performance and maintainability.
  • New Features

    • Added support for handling and recording multiple string properties for spans, including trace ID, span ID, parent span ID, and status messages.

@santigimeno santigimeno self-assigned this Apr 8, 2025
@santigimeno
Copy link
Member Author

@coderabbitai review

@coderabbitai
Copy link

coderabbitai bot commented Apr 16, 2025

✅ Actions performed

Review triggered.

Note: CodeRabbit is an incremental review system and does not re-review already reviewed commits. This command is applicable only when automatic reviews are paused.

@coderabbitai
Copy link

coderabbitai bot commented Apr 16, 2025

Walkthrough

This update refactors the way tracing span data is handled and recorded within the codebase. Instead of concatenating multiple pieces of information (such as trace IDs or HTTP request URLs) into single strings, the new approach separates these values and processes them as distinct components. This change is reflected across the JavaScript and C++ layers, introducing new methods and type definitions for handling multiple string arguments, updating property types, and modifying the way span IDs are managed and recorded. The control flow and error handling remain unchanged, with the primary focus on improved data handling and clarity.

Changes

File(s) Change Summary
lib/_http_client.js, lib/_http_server.js Modified how HTTP request URLs are recorded in tracing spans: switched from concatenating protocol, host, and path into one string to passing them as separate arguments to a new method. Adjusted protocol formatting and removed construction of full URL strings.
lib/internal/otel/trace.js Refactored span ID handling: replaced single combined ID string with separate trace ID, span ID, and parent span ID fields. Updated Span constructor and methods to handle these separately. Introduced a new method for pushing three string values at once.
src/node_external_reference.h Added new function pointer types and external reference registrations for handling single and triple string arguments in span data operations.
src/nsolid/nsolid_api.cc Refactored and expanded span data string handling: separated slow and fast API paths, introduced support for three-string arguments, unified implementation logic, and registered new fast API calls.
src/nsolid/nsolid_bindings.h Added declarations for new slow and fast span data string methods (single and triple string variants), helper implementations, and static members for fast API entry points. Included necessary header for fast API calls.
src/nsolid/nsolid_trace.cc Updated span property assignment logic: replaced parsing of combined ID strings with direct assignment of individual trace ID, span ID, and parent span ID properties.
src/nsolid/nsolid_trace.h Modified macro for span property types: removed combined ID entry, added new entries for parent span ID, span ID, status message, and trace ID.

Sequence Diagram(s)

sequenceDiagram
    participant ClientRequest
    participant Span
    participant BindingData

    ClientRequest->>Span: _pushSpanDataString3(protocol, host, path)
    Span->>BindingData: pushSpanDataString3(trace_id, type, protocol, host, path)
    BindingData->>BindingData: PushSpanDataStringImpl3(...)
    Note right of BindingData: Concatenate and store span data
Loading
sequenceDiagram
    participant Tracer
    participant Span

    Tracer->>Span: new Span(internalId, parentSpanId, ...)
    Span->>Span: _pushSpanDataString(type, value)
    Span->>Span: _pushSpanDataString3(type, val1, val2, val3)
Loading

Poem

In fields of code where spans now run,
Each trace and ID shines bright as the sun.
No more strings tangled, all split apart,
URLs and IDs—each with their part.
With hops and leaps, the data flows free,
A rabbit’s delight in clear tracing, you see!
🐇✨

Tip

⚡💬 Agentic Chat (Pro Plan, General Availability)
  • We're introducing multi-step agentic chat in review comments and issue comments, within and outside of PR's. This feature enhances review and issue discussions with the CodeRabbit agentic chat by enabling advanced interactions, including the ability to create pull requests directly from comments and add commits to existing pull requests.
✨ Finishing Touches
  • 📝 Generate Docstrings

🪧 Tips

Chat

There are 3 ways to chat with CodeRabbit:

  • Review comments: Directly reply to a review comment made by CodeRabbit. Example:
    • I pushed a fix in commit <commit_id>, please review it.
    • Generate unit testing code for this file.
    • Open a follow-up GitHub issue for this discussion.
  • Files and specific lines of code (under the "Files changed" tab): Tag @coderabbitai in a new review comment at the desired location with your query. Examples:
    • @coderabbitai generate unit testing code for this file.
    • @coderabbitai modularize this function.
  • PR comments: Tag @coderabbitai in a new PR comment to ask questions about the PR branch. For the best results, please provide a very specific query, as very limited context is provided in this mode. Examples:
    • @coderabbitai gather interesting stats about this repository and render them as a table. Additionally, render a pie chart showing the language distribution in the codebase.
    • @coderabbitai read src/utils.ts and generate unit testing code.
    • @coderabbitai read the files in the src/scheduler package and generate a class diagram using mermaid and a README in the markdown format.
    • @coderabbitai help me debug CodeRabbit configuration file.

Note: Be mindful of the bot's finite context window. It's strongly recommended to break down tasks such as reading entire modules into smaller chunks. For a focused discussion, use review comments to chat about specific files and their changes, instead of using the PR comments.

CodeRabbit Commands (Invoked using PR comments)

  • @coderabbitai pause to pause the reviews on a PR.
  • @coderabbitai resume to resume the paused reviews.
  • @coderabbitai review to trigger an incremental review. This is useful when automatic reviews are disabled for the repository.
  • @coderabbitai full review to do a full review from scratch and review all the files again.
  • @coderabbitai summary to regenerate the summary of the PR.
  • @coderabbitai generate docstrings to generate docstrings for this PR.
  • @coderabbitai resolve resolve all the CodeRabbit review comments.
  • @coderabbitai configuration to show the current CodeRabbit configuration for the repository.
  • @coderabbitai help to get help.

Other keywords and placeholders

  • Add @coderabbitai ignore anywhere in the PR description to prevent this PR from being reviewed.
  • Add @coderabbitai summary to generate the high-level summary at a specific location in the PR description.
  • Add @coderabbitai anywhere in the PR title to generate the title automatically.

CodeRabbit Configuration File (.coderabbit.yaml)

  • You can programmatically configure CodeRabbit by adding a .coderabbit.yaml file to the root of your repository.
  • Please see the configuration documentation for more information.
  • If your editor has YAML language server enabled, you can add the path at the top of this file to enable auto-completion and validation: # yaml-language-server: $schema=https://coderabbit.ai/integrations/schema.v2.json

Documentation and Community

  • Visit our Documentation for detailed information on how to use CodeRabbit.
  • Join our Discord Community to get help, request features, and share feedback.
  • Follow us on X/Twitter for updates and announcements.

Copy link

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

🧹 Nitpick comments (2)
lib/internal/otel/trace.js (1)

222-225: Add _pushSpanDataString3 for handling three string arguments.
This enables more efficient pushing of multiple adjacent string values. As an optional improvement, consider basic argument validation or type checks to prevent future misuse, especially if external code references this method.

src/nsolid/nsolid_api.cc (1)

2361-2410: Implement triple-string push methods with an assertion to restrict usage.
The assertion on Span::kSpanHttpReqUrl ensures this method is domain-specific. This is fine if it’s strictly used for HTTP request URLs. If you expand triple-string usage beyond URLs, consider removing or generalizing that assertion. Also review whether "//" is always safe as a separator.

📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 973e241 and 8af94e2.

📒 Files selected for processing (8)
  • lib/_http_client.js (1 hunks)
  • lib/_http_server.js (2 hunks)
  • lib/internal/otel/trace.js (4 hunks)
  • src/node_external_reference.h (2 hunks)
  • src/nsolid/nsolid_api.cc (5 hunks)
  • src/nsolid/nsolid_bindings.h (3 hunks)
  • src/nsolid/nsolid_trace.cc (1 hunks)
  • src/nsolid/nsolid_trace.h (1 hunks)
🧰 Additional context used
🧬 Code Graph Analysis (2)
src/nsolid/nsolid_api.cc (3)
src/nsolid/nsolid_bindings.h (21)
  • args (25-26)
  • args (30-31)
  • args (35-36)
  • args (40-41)
  • args (51-52)
  • args (62-63)
  • args (72-73)
  • data (28-28)
  • data (33-33)
  • data (38-38)
  • data (46-49)
  • data (57-60)
  • data (68-71)
  • data (80-85)
  • receiver (27-27)
  • receiver (32-32)
  • receiver (37-37)
  • receiver (42-45)
  • receiver (53-56)
  • receiver (64-67)
  • receiver (74-79)
src/nsolid/nsolid_api.h (19)
  • isolate (250-250)
  • isolate (255-255)
  • isolate (308-308)
  • isolate (309-309)
  • isolate (315-318)
  • isolate (319-322)
  • isolate (556-558)
  • isolate (559-560)
  • isolate (617-621)
  • isolate (892-894)
  • isolate (892-892)
  • data (75-75)
  • data (76-76)
  • data (500-503)
  • data (505-508)
  • data (510-513)
  • data (514-517)
  • data (524-527)
  • context (251-251)
src/nsolid/nsolid_trace.h (7)
  • type (135-135)
  • val (149-149)
  • val (225-227)
  • val (225-225)
  • createSpanProp (127-127)
  • createSpanProp (217-220)
  • createSpanProp (217-218)
src/node_external_reference.h (1)
src/nsolid/nsolid_bindings.h (7)
  • receiver (27-27)
  • receiver (32-32)
  • receiver (37-37)
  • receiver (42-45)
  • receiver (53-56)
  • receiver (64-67)
  • receiver (74-79)
🔇 Additional comments (18)
src/nsolid/nsolid_trace.h (1)

33-36: Improved tracing granularity with separate span ID properties.

The changes to NSOLID_SPAN_PROP_TYPES_STRINGS macro replace the single kSpanOtelIds with separate properties for trace ID, span ID, parent span ID, and status message. This allows for more efficient handling of each component individually.

lib/_http_client.js (1)

486-486: Performance optimization: Using fast API calls for URL parts.

The implementation now uses _pushSpanDataString3 to pass individual URL components directly to C++ instead of concatenating them in JavaScript. This prevents the creation of non-flattened V8 strings, enabling the fast API path for better performance.

lib/_http_server.js (2)

1114-1114: Standardized protocol format with trailing colon.

The protocol string now includes a trailing colon ('https:' instead of 'https'), which is the standard format for URL protocols and ensures consistency across the codebase.


1128-1128: Performance optimization: Using fast API calls for URL components.

Similar to the client-side changes, this implementation now uses _pushSpanDataString3 to pass individual URL components directly to C++ instead of concatenating strings in JavaScript, enabling the fast API path for better performance.

src/nsolid/nsolid_trace.cc (1)

52-66: Refactored span ID handling for cleaner and more efficient implementation.

The implementation now handles trace ID, span ID, and parent span ID as separate properties rather than parsing them from a combined string. This simplifies the code by:

  1. Eliminating string parsing logic
  2. Providing direct assignment to respective storage fields
  3. Creating a more maintainable structure with explicit breaks after each case

These changes align with the JavaScript-side optimizations for better tracing performance.

src/node_external_reference.h (2)

91-102: Introduce new function pointer types for single/triple string pushes.
These additions cleanly match the fast API calls described in your native binding code. They appear correctly typed, taking FastOneByteString parameters and a V8 receiver.


133-134: Register new pointer types in ALLOWED_EXTERNAL_REFERENCE_TYPES.
Including CFunctionPushSpanDataString and CFunctionPushSpanDataString3 in the macro ensures they are properly recognized and managed as external references. This step looks correct and complete.

lib/internal/otel/trace.js (3)

58-62: Push separate ID properties instead of a combined string.
Splitting out traceId, spanId, and optionally parentSpanId clarifies the property values in each push call. This is easier to maintain and more explicit than a single combined ID string.


216-220: Invoke pushSpanDataString with minimal overhead.
Calling binding.pushSpanDataString directly looks straightforward. Since these lines only run when the span is sampled, there's no extra overhead in unsampled scenarios.


280-280: Pass parentContext?.spanId as an explicit parameter.
Referencing parentContext?.spanId directly here further clarifies parent-child relationships for child spans. This is a clear improvement from parsing or concatenating the parent ID.

src/nsolid/nsolid_bindings.h (3)

7-7: Include v8-fast-api-calls.h for fast string handling.
This new include is necessary for FastOneByteString, enabling more efficient string processing in the binding implementation.


62-86: Define new slow/fast push APIs for single and triple string arguments.
These function prototypes cleanly separate slow and fast variants for pushing string data into spans. Be mindful that PushSpanDataStringImpl3 might only apply to specific span property types, so consider clarifying or documenting its usage constraints more explicitly.


100-101: Add fast_push_span_data_string_ and fast_push_span_data_string3_.
Declaring these v8::CFunction fields here keeps the APIs consistent with the rest of the binding code. Make sure both are registered in the same scope and used only in contexts expecting fast calls.

src/nsolid/nsolid_api.cc (5)

42-42: Use FastOneByteString for improved string performance.
This using statement is essential to integrate V8’s fast API calls. Make sure fast strings remain valid for the duration of calls and that the isolate isn't accessed concurrently from multiple threads, preserving V8 usage constraints.


2322-2359: Implement Slow/Fast PushSpanDataString with consistent validation.
Your checks (DCHECK_EQ(args.Length(), 3) and type checks) prevent misuse in internal code. However, no fallback or user error handling is present if arguments are invalid. That’s generally acceptable for internal-only APIs, but keep it in mind for potential future expansions.


2963-2966: Register fast push methods for single/triple string data.
These lines correctly bind the static CFunctions to the fast-call entry points. This helps unify slow vs. fast code paths when bridging from JS to C++.


3006-3015: Add new fast binding methods in JS object exports.
Calling SetFastMethod for “pushSpanDataString” and “pushSpanDataString3” ensures both are properly exposed to userland code. The naming convention is consistent, matching the slow methods.


3148-3155: Register the new string push references in RegisterExternalReferences.
Ensuring both slow and fast push references are recognized closes the loop for tracing external references. This is essential for external reference registration in Node.

@santigimeno santigimeno force-pushed the santi/fast_push_string branch from 8af94e2 to f996ee1 Compare April 16, 2025 10:03
Copy link

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 0

♻️ Duplicate comments (2)
src/nsolid/nsolid_api.cc (2)

2373-2381: Duplicate concern about validating 'type'.
This fast-path function uses the same signature and logic as the slow variant. The same comment regarding potential invalid type applies here.


2418-2431: Duplicate concern about validating triple-string 'type'.
This fast method repeats the same approach for the triple-string scenario. The same recommendation applies regarding checking or handling an invalid type at runtime.

🧹 Nitpick comments (3)
src/nsolid/nsolid_api.cc (3)

2357-2371: Consider validating 'type' before casting.
The method relies on debug-only checks (DCHECK) and ultimately casts type to Span::PropType, but there's no runtime validation. If invalid values are passed in release builds, it could lead to undefined behavior.

You could add a runtime check or gracefully handle out-of-range values, for example:

+  if (type > static_cast<uint32_t>(Span::kUnknownLast)) {
+    // Return or throw an error for invalid property type
+    return;
+  }

2396-2415: Tune the error handling for triple string arguments if 'type' is unexpected.
While PushSpanDataStringImpl3 eventually includes an assertion for kSpanHttpReqUrl, consider adding a runtime check here if you want to catch mismatches and produce an explicit error message rather than only relying on the debug assertion deeper in the call stack.


2432-2445: Assertion is good, but consider a runtime check in release builds.
ASSERT_EQ(prop_type, Span::kSpanHttpReqUrl) confirms expectations in debug mode. A release-mode safeguard would ensure stability in production.

📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 8af94e2 and f996ee1.

📒 Files selected for processing (8)
  • lib/_http_client.js (1 hunks)
  • lib/_http_server.js (2 hunks)
  • lib/internal/otel/trace.js (4 hunks)
  • src/node_external_reference.h (2 hunks)
  • src/nsolid/nsolid_api.cc (5 hunks)
  • src/nsolid/nsolid_bindings.h (3 hunks)
  • src/nsolid/nsolid_trace.cc (1 hunks)
  • src/nsolid/nsolid_trace.h (1 hunks)
🚧 Files skipped from review as they are similar to previous changes (7)
  • lib/_http_client.js
  • src/nsolid/nsolid_trace.cc
  • lib/_http_server.js
  • src/nsolid/nsolid_trace.h
  • lib/internal/otel/trace.js
  • src/node_external_reference.h
  • src/nsolid/nsolid_bindings.h
⏰ Context from checks skipped due to timeout of 90000ms (3)
  • GitHub Check: lint-js-and-md
  • GitHub Check: coverage-windows
  • GitHub Check: build-docs
🔇 Additional comments (4)
src/nsolid/nsolid_api.cc (4)

2383-2394: Looks good.
This implementation correctly constructs and pushes the string-based property. No further issues found.


2998-3001: Definition of fast function objects is consistent.
No issues found.


3041-3051: Binding setup looks correct.
Registering both slow and fast methods under the correct names is consistent with the existing pattern.


3183-3190: Registration of slow/fast string-based span data methods is consistent.
No concerns here. This aligns perfectly with how other fast API calls are registered.

@santigimeno santigimeno force-pushed the santi/fast_push_string branch from f996ee1 to 4f9d18b Compare April 28, 2025 09:38
@santigimeno
Copy link
Member Author

@coderabbitai review

@coderabbitai
Copy link

coderabbitai bot commented Apr 29, 2025

✅ Actions performed

Review triggered.

Note: CodeRabbit is an incremental review system and does not re-review already reviewed commits. This command is applicable only when automatic reviews are paused.

@santigimeno santigimeno force-pushed the santi/fast_push_string branch 2 times, most recently from 275a699 to 2bbf7f2 Compare May 12, 2025 14:14
@santigimeno santigimeno force-pushed the santi/fast_push_string branch from 2bbf7f2 to c6fb59e Compare June 5, 2025 14:01
This commit introduces performance optimizations for tracing span
attribute handling by adding two new fast API calls:

1. FastPushSpanDataString - Optimized version of the existing string push method
2. FastPushSpanDataString3 - New method to efficiently handle three string
   values in a single call

Key changes include:
- Avoided JavaScript string concatenation which produces non-flattened V8
  strings that cannot trigger fast API paths
- Implemented direct passing of individual string components to C++ where
  concatenation can happen after the fast API boundary
- Refactored HTTP URL construction in both client and server to leverage
  this optimization
- Improved span context handling by storing trace ID, span ID, and parent
  span ID as separate attributes rather than a single concatenated string

These optimizations significantly improve performance in high load scenarios:
- Up to 3.24% higher throughput in maximum load tests
- Up to 57.75% lower latency at P99.9 under constant high rate (50k req/sec)
- Substantial improvements across all high percentile latencies (P95-P99.999)
- Most dramatic gains observed in tail latencies under sustained heavy load

=== Benchmark Summary ===

Benchmark Type: wrk2 (constant rate)
Old Version: ./nsolid_baseline
New Version: ./nsolid_fast_push_string
Iterations: 50
Connections: 10
Duration: 60s
Rate: 40k req/sec

+------------------+-------------+-------------+------------+-----+
| Metric           | Old Version | New Version | Difference | Sig |
+------------------+-------------+-------------+------------+-----+
| Avg Latency (ms) | 2.58        | 2.34        | N/A        |     |
| P50 Latency (ms) | 1.09        | 1.10        | +0.80%     | *** |
| P90 Latency (ms) | 2.22        | 2.20        | -0.63%     | *** |
| P99 Latency (ms) | 3.09        | 2.99        | -3.31%     | *** |
| Avg Req/Sec      | 39,992.50   | 39,992.53   | +0.00%     |     |
+------------------+-------------+-------------+------------+-----+

=== Benchmark Summary ===

Benchmark Type: wrk2 (constant rate)
Old Version: ./nsolid_baseline
New Version: ./nsolid_fast_push_string
Iterations: 50
Connections: 10
Duration: 60s
Rate: 50k req/sec

+------------------+-------------+-------------+------------+-----+
| Metric           | Old Version | New Version | Difference | Sig |
+------------------+-------------+-------------+------------+-----+
| Avg Latency (ms) | 9.92        | 5.20        | N/A        |     |
| P50 Latency (ms) | 0.95        | 0.94        | -0.54%     |     |
| P90 Latency (ms) | 2.54        | 2.35        | -7.47%     | *** |
| P99 Latency (ms) | 12.00       | 5.58        | -53.49%    | *** |
| Avg Req/Sec      | 49,990.78   | 49,990.72   | -0.00%     |     |
+------------------+-------------+-------------+------------+-----+

=== Benchmark Summary ===

Benchmark Type: wrk (maximum throughput)
Old Version: ./nsolid_baseline
New Version: ./nsolid_fast_push_string
Iterations: 50
Connections: 10
Duration: 60s
Threads: 2

+------------------+-------------+-------------+------------+-----+
| Metric           | Old Version | New Version | Difference | Sig |
+------------------+-------------+-------------+------------+-----+
| Avg Latency (ms) | 0.65        | 0.64        | N/A        |     |
| P50 Latency (ms) | 0.17        | 0.16        | -3.48%     | *** |
| P90 Latency (ms) | 0.33        | 0.32        | -2.30%     | *** |
| P99 Latency (ms) | 0.36        | 0.35        | -3.33%     | *** |
| Avg Req/Sec      | 52,398.20   | 54,094.55   | +3.24%     | *** |
+------------------+-------------+-------------+------------+-----+
@santigimeno santigimeno force-pushed the santi/fast_push_string branch from c6fb59e to ee72779 Compare July 18, 2025 15:51
santigimeno added a commit that referenced this pull request Jul 18, 2025
This commit introduces performance optimizations for tracing span
attribute handling by adding two new fast API calls:

1. FastPushSpanDataString - Optimized version of the existing string push method
2. FastPushSpanDataString3 - New method to efficiently handle three string
   values in a single call

Key changes include:
- Avoided JavaScript string concatenation which produces non-flattened V8
  strings that cannot trigger fast API paths
- Implemented direct passing of individual string components to C++ where
  concatenation can happen after the fast API boundary
- Refactored HTTP URL construction in both client and server to leverage
  this optimization
- Improved span context handling by storing trace ID, span ID, and parent
  span ID as separate attributes rather than a single concatenated string

These optimizations significantly improve performance in high load scenarios:
- Up to 3.24% higher throughput in maximum load tests
- Up to 57.75% lower latency at P99.9 under constant high rate (50k req/sec)
- Substantial improvements across all high percentile latencies (P95-P99.999)
- Most dramatic gains observed in tail latencies under sustained heavy load

=== Benchmark Summary ===

Benchmark Type: wrk2 (constant rate)
Old Version: ./nsolid_baseline
New Version: ./nsolid_fast_push_string
Iterations: 50
Connections: 10
Duration: 60s
Rate: 40k req/sec

+------------------+-------------+-------------+------------+-----+
| Metric           | Old Version | New Version | Difference | Sig |
+------------------+-------------+-------------+------------+-----+
| Avg Latency (ms) | 2.58        | 2.34        | N/A        |     |
| P50 Latency (ms) | 1.09        | 1.10        | +0.80%     | *** |
| P90 Latency (ms) | 2.22        | 2.20        | -0.63%     | *** |
| P99 Latency (ms) | 3.09        | 2.99        | -3.31%     | *** |
| Avg Req/Sec      | 39,992.50   | 39,992.53   | +0.00%     |     |
+------------------+-------------+-------------+------------+-----+

=== Benchmark Summary ===

Benchmark Type: wrk2 (constant rate)
Old Version: ./nsolid_baseline
New Version: ./nsolid_fast_push_string
Iterations: 50
Connections: 10
Duration: 60s
Rate: 50k req/sec

+------------------+-------------+-------------+------------+-----+
| Metric           | Old Version | New Version | Difference | Sig |
+------------------+-------------+-------------+------------+-----+
| Avg Latency (ms) | 9.92        | 5.20        | N/A        |     |
| P50 Latency (ms) | 0.95        | 0.94        | -0.54%     |     |
| P90 Latency (ms) | 2.54        | 2.35        | -7.47%     | *** |
| P99 Latency (ms) | 12.00       | 5.58        | -53.49%    | *** |
| Avg Req/Sec      | 49,990.78   | 49,990.72   | -0.00%     |     |
+------------------+-------------+-------------+------------+-----+

=== Benchmark Summary ===

Benchmark Type: wrk (maximum throughput)
Old Version: ./nsolid_baseline
New Version: ./nsolid_fast_push_string
Iterations: 50
Connections: 10
Duration: 60s
Threads: 2

+------------------+-------------+-------------+------------+-----+
| Metric           | Old Version | New Version | Difference | Sig |
+------------------+-------------+-------------+------------+-----+
| Avg Latency (ms) | 0.65        | 0.64        | N/A        |     |
| P50 Latency (ms) | 0.17        | 0.16        | -3.48%     | *** |
| P90 Latency (ms) | 0.33        | 0.32        | -2.30%     | *** |
| P99 Latency (ms) | 0.36        | 0.35        | -3.33%     | *** |
| Avg Req/Sec      | 52,398.20   | 54,094.55   | +3.24%     | *** |
+------------------+-------------+-------------+------------+-----+

PR-URL: #289
Reviewed-By: Juan José Arboleda <[email protected]>
@santigimeno
Copy link
Member Author

Landed in 6ea71c8

@santigimeno santigimeno deleted the santi/fast_push_string branch July 18, 2025 16:15
santigimeno added a commit that referenced this pull request Jul 23, 2025
This commit introduces performance optimizations for tracing span
attribute handling by adding two new fast API calls:

1. FastPushSpanDataString - Optimized version of the existing string push method
2. FastPushSpanDataString3 - New method to efficiently handle three string
   values in a single call

Key changes include:
- Avoided JavaScript string concatenation which produces non-flattened V8
  strings that cannot trigger fast API paths
- Implemented direct passing of individual string components to C++ where
  concatenation can happen after the fast API boundary
- Refactored HTTP URL construction in both client and server to leverage
  this optimization
- Improved span context handling by storing trace ID, span ID, and parent
  span ID as separate attributes rather than a single concatenated string

These optimizations significantly improve performance in high load scenarios:
- Up to 3.24% higher throughput in maximum load tests
- Up to 57.75% lower latency at P99.9 under constant high rate (50k req/sec)
- Substantial improvements across all high percentile latencies (P95-P99.999)
- Most dramatic gains observed in tail latencies under sustained heavy load

=== Benchmark Summary ===

Benchmark Type: wrk2 (constant rate)
Old Version: ./nsolid_baseline
New Version: ./nsolid_fast_push_string
Iterations: 50
Connections: 10
Duration: 60s
Rate: 40k req/sec

+------------------+-------------+-------------+------------+-----+
| Metric           | Old Version | New Version | Difference | Sig |
+------------------+-------------+-------------+------------+-----+
| Avg Latency (ms) | 2.58        | 2.34        | N/A        |     |
| P50 Latency (ms) | 1.09        | 1.10        | +0.80%     | *** |
| P90 Latency (ms) | 2.22        | 2.20        | -0.63%     | *** |
| P99 Latency (ms) | 3.09        | 2.99        | -3.31%     | *** |
| Avg Req/Sec      | 39,992.50   | 39,992.53   | +0.00%     |     |
+------------------+-------------+-------------+------------+-----+

=== Benchmark Summary ===

Benchmark Type: wrk2 (constant rate)
Old Version: ./nsolid_baseline
New Version: ./nsolid_fast_push_string
Iterations: 50
Connections: 10
Duration: 60s
Rate: 50k req/sec

+------------------+-------------+-------------+------------+-----+
| Metric           | Old Version | New Version | Difference | Sig |
+------------------+-------------+-------------+------------+-----+
| Avg Latency (ms) | 9.92        | 5.20        | N/A        |     |
| P50 Latency (ms) | 0.95        | 0.94        | -0.54%     |     |
| P90 Latency (ms) | 2.54        | 2.35        | -7.47%     | *** |
| P99 Latency (ms) | 12.00       | 5.58        | -53.49%    | *** |
| Avg Req/Sec      | 49,990.78   | 49,990.72   | -0.00%     |     |
+------------------+-------------+-------------+------------+-----+

=== Benchmark Summary ===

Benchmark Type: wrk (maximum throughput)
Old Version: ./nsolid_baseline
New Version: ./nsolid_fast_push_string
Iterations: 50
Connections: 10
Duration: 60s
Threads: 2

+------------------+-------------+-------------+------------+-----+
| Metric           | Old Version | New Version | Difference | Sig |
+------------------+-------------+-------------+------------+-----+
| Avg Latency (ms) | 0.65        | 0.64        | N/A        |     |
| P50 Latency (ms) | 0.17        | 0.16        | -3.48%     | *** |
| P90 Latency (ms) | 0.33        | 0.32        | -2.30%     | *** |
| P99 Latency (ms) | 0.36        | 0.35        | -3.33%     | *** |
| Avg Req/Sec      | 52,398.20   | 54,094.55   | +3.24%     | *** |
+------------------+-------------+-------------+------------+-----+

PR-URL: #289
Reviewed-By: Juan José Arboleda <[email protected]>
santigimeno added a commit that referenced this pull request Aug 25, 2025
This commit introduces performance optimizations for tracing span
attribute handling by adding two new fast API calls:

1. FastPushSpanDataString - Optimized version of the existing string push method
2. FastPushSpanDataString3 - New method to efficiently handle three string
   values in a single call

Key changes include:
- Avoided JavaScript string concatenation which produces non-flattened V8
  strings that cannot trigger fast API paths
- Implemented direct passing of individual string components to C++ where
  concatenation can happen after the fast API boundary
- Refactored HTTP URL construction in both client and server to leverage
  this optimization
- Improved span context handling by storing trace ID, span ID, and parent
  span ID as separate attributes rather than a single concatenated string

These optimizations significantly improve performance in high load scenarios:
- Up to 3.24% higher throughput in maximum load tests
- Up to 57.75% lower latency at P99.9 under constant high rate (50k req/sec)
- Substantial improvements across all high percentile latencies (P95-P99.999)
- Most dramatic gains observed in tail latencies under sustained heavy load

=== Benchmark Summary ===

Benchmark Type: wrk2 (constant rate)
Old Version: ./nsolid_baseline
New Version: ./nsolid_fast_push_string
Iterations: 50
Connections: 10
Duration: 60s
Rate: 40k req/sec

+------------------+-------------+-------------+------------+-----+
| Metric           | Old Version | New Version | Difference | Sig |
+------------------+-------------+-------------+------------+-----+
| Avg Latency (ms) | 2.58        | 2.34        | N/A        |     |
| P50 Latency (ms) | 1.09        | 1.10        | +0.80%     | *** |
| P90 Latency (ms) | 2.22        | 2.20        | -0.63%     | *** |
| P99 Latency (ms) | 3.09        | 2.99        | -3.31%     | *** |
| Avg Req/Sec      | 39,992.50   | 39,992.53   | +0.00%     |     |
+------------------+-------------+-------------+------------+-----+

=== Benchmark Summary ===

Benchmark Type: wrk2 (constant rate)
Old Version: ./nsolid_baseline
New Version: ./nsolid_fast_push_string
Iterations: 50
Connections: 10
Duration: 60s
Rate: 50k req/sec

+------------------+-------------+-------------+------------+-----+
| Metric           | Old Version | New Version | Difference | Sig |
+------------------+-------------+-------------+------------+-----+
| Avg Latency (ms) | 9.92        | 5.20        | N/A        |     |
| P50 Latency (ms) | 0.95        | 0.94        | -0.54%     |     |
| P90 Latency (ms) | 2.54        | 2.35        | -7.47%     | *** |
| P99 Latency (ms) | 12.00       | 5.58        | -53.49%    | *** |
| Avg Req/Sec      | 49,990.78   | 49,990.72   | -0.00%     |     |
+------------------+-------------+-------------+------------+-----+

=== Benchmark Summary ===

Benchmark Type: wrk (maximum throughput)
Old Version: ./nsolid_baseline
New Version: ./nsolid_fast_push_string
Iterations: 50
Connections: 10
Duration: 60s
Threads: 2

+------------------+-------------+-------------+------------+-----+
| Metric           | Old Version | New Version | Difference | Sig |
+------------------+-------------+-------------+------------+-----+
| Avg Latency (ms) | 0.65        | 0.64        | N/A        |     |
| P50 Latency (ms) | 0.17        | 0.16        | -3.48%     | *** |
| P90 Latency (ms) | 0.33        | 0.32        | -2.30%     | *** |
| P99 Latency (ms) | 0.36        | 0.35        | -3.33%     | *** |
| Avg Req/Sec      | 52,398.20   | 54,094.55   | +3.24%     | *** |
+------------------+-------------+-------------+------------+-----+

PR-URL: #289
Reviewed-By: Juan José Arboleda <[email protected]>
santigimeno added a commit that referenced this pull request Aug 26, 2025
This commit introduces performance optimizations for tracing span
attribute handling by adding two new fast API calls:

1. FastPushSpanDataString - Optimized version of the existing string push method
2. FastPushSpanDataString3 - New method to efficiently handle three string
   values in a single call

Key changes include:
- Avoided JavaScript string concatenation which produces non-flattened V8
  strings that cannot trigger fast API paths
- Implemented direct passing of individual string components to C++ where
  concatenation can happen after the fast API boundary
- Refactored HTTP URL construction in both client and server to leverage
  this optimization
- Improved span context handling by storing trace ID, span ID, and parent
  span ID as separate attributes rather than a single concatenated string

These optimizations significantly improve performance in high load scenarios:
- Up to 3.24% higher throughput in maximum load tests
- Up to 57.75% lower latency at P99.9 under constant high rate (50k req/sec)
- Substantial improvements across all high percentile latencies (P95-P99.999)
- Most dramatic gains observed in tail latencies under sustained heavy load

=== Benchmark Summary ===

Benchmark Type: wrk2 (constant rate)
Old Version: ./nsolid_baseline
New Version: ./nsolid_fast_push_string
Iterations: 50
Connections: 10
Duration: 60s
Rate: 40k req/sec

+------------------+-------------+-------------+------------+-----+
| Metric           | Old Version | New Version | Difference | Sig |
+------------------+-------------+-------------+------------+-----+
| Avg Latency (ms) | 2.58        | 2.34        | N/A        |     |
| P50 Latency (ms) | 1.09        | 1.10        | +0.80%     | *** |
| P90 Latency (ms) | 2.22        | 2.20        | -0.63%     | *** |
| P99 Latency (ms) | 3.09        | 2.99        | -3.31%     | *** |
| Avg Req/Sec      | 39,992.50   | 39,992.53   | +0.00%     |     |
+------------------+-------------+-------------+------------+-----+

=== Benchmark Summary ===

Benchmark Type: wrk2 (constant rate)
Old Version: ./nsolid_baseline
New Version: ./nsolid_fast_push_string
Iterations: 50
Connections: 10
Duration: 60s
Rate: 50k req/sec

+------------------+-------------+-------------+------------+-----+
| Metric           | Old Version | New Version | Difference | Sig |
+------------------+-------------+-------------+------------+-----+
| Avg Latency (ms) | 9.92        | 5.20        | N/A        |     |
| P50 Latency (ms) | 0.95        | 0.94        | -0.54%     |     |
| P90 Latency (ms) | 2.54        | 2.35        | -7.47%     | *** |
| P99 Latency (ms) | 12.00       | 5.58        | -53.49%    | *** |
| Avg Req/Sec      | 49,990.78   | 49,990.72   | -0.00%     |     |
+------------------+-------------+-------------+------------+-----+

=== Benchmark Summary ===

Benchmark Type: wrk (maximum throughput)
Old Version: ./nsolid_baseline
New Version: ./nsolid_fast_push_string
Iterations: 50
Connections: 10
Duration: 60s
Threads: 2

+------------------+-------------+-------------+------------+-----+
| Metric           | Old Version | New Version | Difference | Sig |
+------------------+-------------+-------------+------------+-----+
| Avg Latency (ms) | 0.65        | 0.64        | N/A        |     |
| P50 Latency (ms) | 0.17        | 0.16        | -3.48%     | *** |
| P90 Latency (ms) | 0.33        | 0.32        | -2.30%     | *** |
| P99 Latency (ms) | 0.36        | 0.35        | -3.33%     | *** |
| Avg Req/Sec      | 52,398.20   | 54,094.55   | +3.24%     | *** |
+------------------+-------------+-------------+------------+-----+

PR-URL: #289
Reviewed-By: Juan José Arboleda <[email protected]>
PR-URL: #359
Reviewed-By: Rafael Gonzaga <[email protected]>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants