misrasaurabh1
Contributor

📄 37% (0.37x) speedup for extract_sentrytrace_data in sentry_sdk/tracing_utils.py

This one looked really important as well. I found a bunch more optimizations, but don't want to spam you guys. Would you like to meet over a 30 min call to discuss how we can better coordinate?

⏱️ Runtime : 3.30 milliseconds → 2.41 milliseconds (best of 124 runs)

📝 Explanation and details

The optimization adds length checks before expensive string formatting operations. Specifically:

Key Changes:

  • Added len(trace_id) != 32 check before "{:032x}".format(int(trace_id, 16))
  • Added len(parent_span_id) != 16 check before "{:016x}".format(int(parent_span_id, 16))

Why It's Faster:
The original code always performed string-to-int conversion and formatting, even when the trace_id/span_id were already properly formatted. The optimization skips these expensive operations when the strings are already the correct length (32 hex chars for trace_id, 16 for span_id).
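A minimal sketch of the change described above, assuming the function matches the header against the `SENTRY_TRACE_REGEX` reproduced in the generated tests and returns a dict of the three parsed fields. The name `parse_sentry_trace`, the `guard_lengths` switch, and the returned key names are illustrative stand-ins, not the SDK's actual source, and the handling of the `00-…-00` wrapper exercised by some tests is left out.

```python
import re

# Same pattern as the SENTRY_TRACE_REGEX reproduced in the generated tests.
SENTRY_TRACE_REGEX = re.compile(
    "^[ \t]*"  # whitespace
    "([0-9a-f]{32})?"  # trace_id
    "-?([0-9a-f]{16})?"  # span_id
    "-?([01])?"  # sampled
    "[ \t]*$"  # whitespace
)


def parse_sentry_trace(header, guard_lengths=True):
    """Hypothetical stand-in for extract_sentrytrace_data."""
    if not header:
        return None
    match = SENTRY_TRACE_REGEX.match(header)
    if not match:
        return None

    trace_id, parent_span_id, sampled_str = match.groups()

    # guard_lengths=False: always round-trip through int() (original behaviour).
    # guard_lengths=True: only round-trip when the length is off (this PR).
    if trace_id and (not guard_lengths or len(trace_id) != 32):
        trace_id = "{:032x}".format(int(trace_id, 16))
    if parent_span_id and (not guard_lengths or len(parent_span_id) != 16):
        parent_span_id = "{:016x}".format(int(parent_span_id, 16))

    parent_sampled = None if sampled_str is None else sampled_str != "0"
    return {
        "trace_id": trace_id,
        "parent_span_id": parent_span_id,
        "parent_sampled": parent_sampled,
    }
```

Calling the sketch with `guard_lengths=True` and `guard_lengths=False` on the same header makes it easy to confirm that both paths return the same dict.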

The int(trace_id, 16) and "{:032x}".format() operations are computationally expensive, involving:

  • Hexadecimal string parsing
  • Integer conversion
  • String formatting with zero-padding
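The cost of the three steps listed above can be measured in isolation with a rough micro-benchmark like the one below. It is only a sketch: `round_trip` and `guarded` are hypothetical helper names, the sample trace ID is taken from the generated tests, and absolute timings will vary by machine and Python version.

```python
import timeit

# Already 32 lowercase hex characters, as a regex-matched trace_id would be.
trace_id = "4bf92f3577b34da6a3ce929d0e0e4736"


def round_trip(value):
    # What the original code does unconditionally:
    # parse the hex string to an int, then format it back with zero-padding.
    return "{:032x}".format(int(value, 16))


def guarded(value):
    # What this PR does: skip the round trip when the length is already 32.
    if len(value) != 32:
        value = "{:032x}".format(int(value, 16))
    return value


if __name__ == "__main__":
    for fn in (round_trip, guarded):
        seconds = timeit.timeit(lambda: fn(trace_id), number=200_000)
        print(f"{fn.__name__}: {seconds:.3f}s for 200,000 calls")
```

Both helpers return the input unchanged for a well-formed ID, so any difference in the printed timings comes from the skipped parse-and-format work.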

Performance Impact:
Test results show the optimization is most effective when trace IDs and span IDs are already properly formatted (which is common in production). Cases like test_valid_full_header show 51.6% speedup, and test_missing_trace_id shows 65.9% speedup. Malformed inputs, which are rejected before the formatting step, change far less: most land within a few percent either way, ranging from about 7% slower to about 13% faster in the tests below.

This is particularly valuable for high-throughput tracing scenarios where most headers contain well-formatted trace data.

Correctness verification report:

| Test | Status |
|------|--------|
| ⚙️ Existing Unit Tests | 21 Passed |
| 🌀 Generated Regression Tests | 1960 Passed |
| ⏪ Replay Tests | 🔘 None Found |
| 🔎 Concolic Coverage Tests | 🔘 None Found |
| 📊 Tests Coverage | 100.0% |

⚙️ Existing Unit Tests and Runtime

| Test File::Test Function | Original ⏱️ | Optimized ⏱️ | Speedup |
|--------------------------|-------------|--------------|---------|
| tracing/test_http_headers.py::test_sentrytrace_extraction | 13.8μs | 9.03μs | 52.8%✅ |

🌀 Generated Regression Tests and Runtime
import re
from typing import Dict, Optional, Union

# imports
import pytest  # used for our unit tests
from sentry_sdk.tracing_utils import extract_sentrytrace_data

SENTRY_TRACE_REGEX = re.compile(
    "^[ \t]*"  # whitespace
    "([0-9a-f]{32})?"  # trace_id
    "-?([0-9a-f]{16})?"  # span_id
    "-?([01])?"  # sampled
    "[ \t]*$"  # whitespace
)

# unit tests

# ----------------------
# Basic Test Cases
# ----------------------

def test_valid_full_header():
    # Basic valid header: all fields present
    header = "0123456789abcdef0123456789abcdef-0123456789abcdef-1"
    codeflash_output = extract_sentrytrace_data(header); result = codeflash_output # 7.43μs -> 4.90μs (51.6% faster)

def test_valid_full_header_sampled_false():
    # Basic valid header: sampled = 0
    header = "abcdefabcdefabcdefabcdefabcdefabcd-abcdefabcdefabcd-0"
    codeflash_output = extract_sentrytrace_data(header); result = codeflash_output # 3.88μs -> 3.71μs (4.77% faster)

def test_valid_header_missing_sampled():
    # Valid header: missing sampled field
    header = "0123456789abcdef0123456789abcdef-0123456789abcdef"
    codeflash_output = extract_sentrytrace_data(header); result = codeflash_output # 6.80μs -> 4.12μs (65.2% faster)

def test_valid_header_only_trace_id():
    # Valid header: only trace_id present
    header = "0123456789abcdef0123456789abcdef"
    codeflash_output = extract_sentrytrace_data(header); result = codeflash_output # 5.17μs -> 3.62μs (42.7% faster)

def test_valid_header_with_whitespace():
    # Valid header with leading/trailing whitespace
    header = "  0123456789abcdef0123456789abcdef-0123456789abcdef-1  "
    codeflash_output = extract_sentrytrace_data(header); result = codeflash_output # 6.74μs -> 4.19μs (60.8% faster)

# ----------------------
# Edge Test Cases
# ----------------------

def test_empty_header():
    # Edge: empty string
    header = ""
    codeflash_output = extract_sentrytrace_data(header); result = codeflash_output # 459ns -> 492ns (6.71% slower)

def test_none_header():
    # Edge: None as input
    header = None
    codeflash_output = extract_sentrytrace_data(header); result = codeflash_output # 473ns -> 444ns (6.53% faster)

def test_invalid_trace_id_length():
    # Edge: trace_id too short
    header = "01234567-0123456789abcdef-1"
    codeflash_output = extract_sentrytrace_data(header); result = codeflash_output # 3.00μs -> 2.83μs (6.00% faster)

def test_invalid_span_id_length():
    # Edge: span_id too short
    header = "0123456789abcdef0123456789abcdef-01234567-1"
    codeflash_output = extract_sentrytrace_data(header); result = codeflash_output # 4.21μs -> 4.14μs (1.74% faster)

def test_invalid_sampled_value():
    # Edge: sampled value not 0 or 1
    header = "0123456789abcdef0123456789abcdef-0123456789abcdef-2"
    codeflash_output = extract_sentrytrace_data(header); result = codeflash_output # 4.56μs -> 4.50μs (1.36% faster)

def test_invalid_characters_in_trace_id():
    # Edge: invalid characters in trace_id
    header = "zzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzz-0123456789abcdef-1"
    codeflash_output = extract_sentrytrace_data(header); result = codeflash_output # 2.50μs -> 2.37μs (5.36% faster)

def test_invalid_characters_in_span_id():
    # Edge: invalid characters in span_id
    header = "0123456789abcdef0123456789abcdef-zzzzzzzzzzzzzzzz-1"
    codeflash_output = extract_sentrytrace_data(header); result = codeflash_output # 4.03μs -> 3.82μs (5.69% faster)

def test_missing_trace_id():
    # Edge: missing trace_id, only span_id and sampled
    header = "-0123456789abcdef-1"
    codeflash_output = extract_sentrytrace_data(header); result = codeflash_output # 6.44μs -> 3.88μs (65.9% faster)

def test_missing_span_id():
    # Edge: missing span_id, only trace_id and sampled
    header = "0123456789abcdef0123456789abcdef--1"
    codeflash_output = extract_sentrytrace_data(header); result = codeflash_output # 5.88μs -> 4.00μs (47.0% faster)

def test_header_with_extra_fields():
    # Edge: extra fields after sampled
    header = "0123456789abcdef0123456789abcdef-0123456789abcdef-1-extra"
    codeflash_output = extract_sentrytrace_data(header); result = codeflash_output # 4.96μs -> 4.80μs (3.16% faster)

def test_header_with_internal_spaces():
    # Edge: spaces inside the trace_id/span_id
    header = "0123456789abcd ef0123456789abcdef-0123456789abcdef-1"
    codeflash_output = extract_sentrytrace_data(header); result = codeflash_output # 2.74μs -> 2.54μs (7.95% faster)

def test_header_with_leading_and_trailing_dash():
    # Edge: leading and trailing dash
    header = "-0123456789abcdef0123456789abcdef-0123456789abcdef-1-"
    codeflash_output = extract_sentrytrace_data(header); result = codeflash_output # 3.44μs -> 3.26μs (5.51% faster)

def test_header_with_only_sampled():
    # Edge: only sampled value
    header = "-1"
    codeflash_output = extract_sentrytrace_data(header); result = codeflash_output # 3.50μs -> 3.44μs (1.98% faster)

def test_header_with_only_span_id():
    # Edge: only span_id, no trace_id
    header = "-0123456789abcdef"
    codeflash_output = extract_sentrytrace_data(header); result = codeflash_output # 5.94μs -> 3.64μs (63.3% faster)

def test_header_with_leading_and_trailing_tabs():
    # Edge: tabs instead of spaces
    header = "\t0123456789abcdef0123456789abcdef-0123456789abcdef-1\t"
    codeflash_output = extract_sentrytrace_data(header); result = codeflash_output # 7.15μs -> 4.24μs (68.6% faster)

def test_header_with_tracing_format():
    # Edge: header with "00-" prefix and "-00" suffix
    header = "00-0123456789abcdef0123456789abcdef-0123456789abcdef-1-00"
    codeflash_output = extract_sentrytrace_data(header); result = codeflash_output # 7.12μs -> 4.80μs (48.4% faster)

def test_header_with_tracing_format_and_whitespace():
    # Edge: header with "00-" prefix, "-00" suffix, and whitespace
    header = " 00-0123456789abcdef0123456789abcdef-0123456789abcdef-1-00 "
    codeflash_output = extract_sentrytrace_data(header); result = codeflash_output # 3.12μs -> 3.04μs (2.97% faster)

def test_header_with_uppercase_hex():
    # Edge: uppercase hex letters in trace_id and span_id
    header = "ABCDEFABCDEFABCDEFABCDEFABCDEFAB-ABCDEFABCDEFABCD-1"
    codeflash_output = extract_sentrytrace_data(header); result = codeflash_output # 2.43μs -> 2.27μs (7.33% faster)

def test_header_with_mixedcase_hex():
    # Edge: mixed-case hex letters in trace_id and span_id
    header = "aBcDeFaBcDeFaBcDeFaBcDeFaBcDeFaB-aBcDeFaBcDeFaBcD-1"
    codeflash_output = extract_sentrytrace_data(header); result = codeflash_output # 2.41μs -> 2.34μs (3.21% faster)

def test_header_with_leading_and_trailing_spaces_and_tabs():
    # Edge: both spaces and tabs
    header = " \t0123456789abcdef0123456789abcdef-0123456789abcdef-1\t "
    codeflash_output = extract_sentrytrace_data(header); result = codeflash_output # 7.61μs -> 4.67μs (62.9% faster)

# ----------------------
# Large Scale Test Cases
# ----------------------

def test_many_valid_headers():
    # Large scale: test many valid headers
    for i in range(100):
        trace_id = f"{i:032x}"
        span_id = f"{i:016x}"
        sampled = str(i % 2)
        header = f"{trace_id}-{span_id}-{sampled}"
        codeflash_output = extract_sentrytrace_data(header); result = codeflash_output # 224μs -> 132μs (69.3% faster)

def test_many_invalid_headers():
    # Large scale: test many invalid headers
    for i in range(100):
        # Too short trace_id
        header = f"{i:08x}-0123456789abcdef-1"
        codeflash_output = extract_sentrytrace_data(header); result = codeflash_output # 81.9μs -> 75.1μs (9.08% faster)

        # Too short span_id
        header = f"0123456789abcdef0123456789abcdef-{i:08x}-1"
        codeflash_output = extract_sentrytrace_data(header); result = codeflash_output # 146μs -> 138μs (6.19% faster)

        # Invalid sampled value
        header = f"0123456789abcdef0123456789abcdef-0123456789abcdef-{i+2}"
        codeflash_output = extract_sentrytrace_data(header); result = codeflash_output

def test_large_header_with_valid_data():
    # Large scale: test a header with max valid values
    trace_id = "f" * 32
    span_id = "e" * 16
    sampled = "1"
    header = f"{trace_id}-{span_id}-{sampled}"
    codeflash_output = extract_sentrytrace_data(header); result = codeflash_output # 7.04μs -> 4.33μs (62.6% faster)

def test_large_header_with_tracing_format():
    # Large scale: header with tracing format and large values
    trace_id = "a" * 32
    span_id = "b" * 16
    sampled = "0"
    header = f"00-{trace_id}-{span_id}-{sampled}-00"
    codeflash_output = extract_sentrytrace_data(header); result = codeflash_output # 6.91μs -> 4.47μs (54.6% faster)

def test_bulk_headers_performance():
    # Large scale: test performance with 500 valid and 500 invalid headers
    valid_headers = [
        f"{i:032x}-{i:016x}-{i%2}" for i in range(500)
    ]
    invalid_headers = [
        f"{i:08x}-badspanid-{i%2}" for i in range(500)
    ]
    # Check all valid headers parse correctly
    for i, header in enumerate(valid_headers):
        codeflash_output = extract_sentrytrace_data(header); result = codeflash_output # 1.08ms -> 651μs (65.3% faster)
    # Check all invalid headers return None
    for header in invalid_headers:
        codeflash_output = extract_sentrytrace_data(header); result = codeflash_output # 398μs -> 368μs (8.23% faster)
# codeflash_output is used to check that the output of the original code is the same as that of the optimized code.
#------------------------------------------------
import re

# imports
import pytest  # used for our unit tests
from sentry_sdk.tracing_utils import extract_sentrytrace_data

SENTRY_TRACE_REGEX = re.compile(
    "^[ \t]*"  # whitespace
    "([0-9a-f]{32})?"  # trace_id
    "-?([0-9a-f]{16})?"  # span_id
    "-?([01])?"  # sampled
    "[ \t]*$"  # whitespace
)

# unit tests

# 1. Basic Test Cases

def test_basic_valid_full_header():
    # Test a header with all fields present and valid
    header = "4bf92f3577b34da6a3ce929d0e0e4736-00f067aa0ba902b7-1"
    codeflash_output = extract_sentrytrace_data(header); result = codeflash_output # 6.56μs -> 4.35μs (50.8% faster)

def test_basic_valid_unsampled():
    # Test a header with sampled bit set to 0
    header = "4bf92f3577b34da6a3ce929d0e0e4736-00f067aa0ba902b7-0"
    codeflash_output = extract_sentrytrace_data(header); result = codeflash_output # 6.58μs -> 4.20μs (56.8% faster)

def test_basic_valid_no_sampled():
    # Test a header with no sampled bit
    header = "4bf92f3577b34da6a3ce929d0e0e4736-00f067aa0ba902b7"
    codeflash_output = extract_sentrytrace_data(header); result = codeflash_output # 5.89μs -> 3.96μs (48.7% faster)

def test_basic_valid_only_trace_id():
    # Test a header with only trace_id
    header = "4bf92f3577b34da6a3ce929d0e0e4736"
    codeflash_output = extract_sentrytrace_data(header); result = codeflash_output # 5.14μs -> 3.48μs (47.8% faster)

def test_basic_valid_trace_and_sampled():
    # Test a header with trace_id and sampled bit
    header = "4bf92f3577b34da6a3ce929d0e0e4736--1"
    codeflash_output = extract_sentrytrace_data(header); result = codeflash_output # 5.47μs -> 3.81μs (43.7% faster)

def test_basic_valid_span_and_sampled():
    # Test a header with span_id and sampled bit, but no trace_id
    header = "-00f067aa0ba902b7-1"
    codeflash_output = extract_sentrytrace_data(header); result = codeflash_output # 5.28μs -> 3.75μs (41.0% faster)

def test_basic_valid_only_span_id():
    # Test a header with only span_id
    header = "-00f067aa0ba902b7"
    codeflash_output = extract_sentrytrace_data(header); result = codeflash_output # 5.24μs -> 3.53μs (48.6% faster)

def test_basic_valid_only_sampled():
    # Test a header with only sampled bit
    header = "--1"
    codeflash_output = extract_sentrytrace_data(header); result = codeflash_output # 3.15μs -> 3.19μs (1.25% slower)

def test_basic_valid_only_sampled_zero():
    # Test a header with only sampled bit set to zero
    header = "--0"
    codeflash_output = extract_sentrytrace_data(header); result = codeflash_output # 3.17μs -> 2.94μs (7.93% faster)

def test_basic_valid_whitespace():
    # Test a header with leading and trailing whitespace
    header = "   4bf92f3577b34da6a3ce929d0e0e4736-00f067aa0ba902b7-1   "
    codeflash_output = extract_sentrytrace_data(header); result = codeflash_output # 6.64μs -> 4.36μs (52.4% faster)

# 2. Edge Test Cases

def test_edge_empty_string():
    # Test empty string input
    header = ""
    codeflash_output = extract_sentrytrace_data(header); result = codeflash_output # 410ns -> 443ns (7.45% slower)

def test_edge_none_input():
    # Test None input
    header = None
    codeflash_output = extract_sentrytrace_data(header); result = codeflash_output # 466ns -> 449ns (3.79% faster)

def test_edge_invalid_characters():
    # Test header with invalid characters
    header = "zzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzz-00f067aa0ba902b7-1"
    codeflash_output = extract_sentrytrace_data(header); result = codeflash_output # 2.65μs -> 2.59μs (2.32% faster)

def test_edge_too_short_trace_id():
    # Test header with too short trace_id
    header = "4bf92f3577b34da6a3ce929d0e0e47-00f067aa0ba902b7-1"
    codeflash_output = extract_sentrytrace_data(header); result = codeflash_output # 3.28μs -> 3.18μs (3.08% faster)

def test_edge_too_long_trace_id():
    # Test header with too long trace_id
    header = "4bf92f3577b34da6a3ce929d0e0e47361234-00f067aa0ba902b7-1"
    codeflash_output = extract_sentrytrace_data(header); result = codeflash_output # 3.80μs -> 3.36μs (12.9% faster)

def test_edge_too_short_span_id():
    # Test header with too short span_id
    header = "4bf92f3577b34da6a3ce929d0e0e4736-00f067aa0ba902b-1"
    codeflash_output = extract_sentrytrace_data(header); result = codeflash_output # 4.25μs -> 3.81μs (11.4% faster)

def test_edge_too_long_span_id():
    # Test header with too long span_id
    header = "4bf92f3577b34da6a3ce929d0e0e4736-00f067aa0ba902b71234-1"
    codeflash_output = extract_sentrytrace_data(header); result = codeflash_output # 4.75μs -> 4.50μs (5.40% faster)

def test_edge_invalid_sampled_bit():
    # Test header with invalid sampled bit
    header = "4bf92f3577b34da6a3ce929d0e0e4736-00f067aa0ba902b7-2"
    codeflash_output = extract_sentrytrace_data(header); result = codeflash_output # 4.57μs -> 4.29μs (6.53% faster)

def test_edge_extra_fields():
    # Test header with extra fields
    header = "4bf92f3577b34da6a3ce929d0e0e4736-00f067aa0ba902b7-1-extra"
    codeflash_output = extract_sentrytrace_data(header); result = codeflash_output # 4.68μs -> 4.17μs (12.1% faster)

def test_edge_only_dashes():
    # Test header with only dashes
    header = "---"
    codeflash_output = extract_sentrytrace_data(header); result = codeflash_output # 2.81μs -> 2.59μs (8.34% faster)

def test_edge_dash_at_start_and_end():
    # Test header with dash at start and end, but valid in the middle
    header = "-4bf92f3577b34da6a3ce929d0e0e4736-"
    codeflash_output = extract_sentrytrace_data(header); result = codeflash_output # 3.24μs -> 3.17μs (2.05% faster)

def test_edge_trace_id_uppercase():
    # Test header with uppercase hex letters (rejected: the regex only accepts lowercase a-f)
    header = "4BF92F3577B34DA6A3CE929D0E0E4736-00F067AA0BA902B7-1"
    codeflash_output = extract_sentrytrace_data(header); result = codeflash_output # 2.32μs -> 2.19μs (6.12% faster)

def test_edge_trace_id_leading_zeros():
    # Test header with trace_id having leading zeros
    header = "00000000000000000000000000000001-0000000000000001-1"
    codeflash_output = extract_sentrytrace_data(header); result = codeflash_output # 7.51μs -> 4.51μs (66.5% faster)

def test_edge_trace_id_all_zeros():
    # Test header with trace_id all zeros
    header = "00000000000000000000000000000000-0000000000000000-0"
    codeflash_output = extract_sentrytrace_data(header); result = codeflash_output # 6.10μs -> 4.02μs (51.7% faster)

def test_edge_trace_id_and_span_id_only():
    # Test header with trace_id and span_id, but sampled missing
    header = "4bf92f3577b34da6a3ce929d0e0e4736-00f067aa0ba902b7"
    codeflash_output = extract_sentrytrace_data(header); result = codeflash_output # 6.37μs -> 3.96μs (60.9% faster)

def test_edge_w3c_format():
    # Test header in W3C format with 00-...-00
    header = "00-4bf92f3577b34da6a3ce929d0e0e4736-00f067aa0ba902b7-00"
    codeflash_output = extract_sentrytrace_data(header); result = codeflash_output # 6.86μs -> 4.41μs (55.6% faster)

def test_edge_w3c_format_sampled():
    # Test header in W3C format with sampled bit set to 1
    header = "00-4bf92f3577b34da6a3ce929d0e0e4736-00f067aa0ba902b7-01"
    codeflash_output = extract_sentrytrace_data(header); result = codeflash_output # 2.83μs -> 2.79μs (1.65% faster)

def test_edge_leading_and_trailing_whitespace():
    # Test header with tabs and spaces
    header = "\t 4bf92f3577b34da6a3ce929d0e0e4736-00f067aa0ba902b7-1 \t"
    codeflash_output = extract_sentrytrace_data(header); result = codeflash_output # 7.16μs -> 4.45μs (60.8% faster)

def test_edge_invalid_dash_positions():
    # Test header with dashes in wrong positions
    header = "4bf92f3577b34da6a3ce929d0e0e4736--00f067aa0ba902b7-1"
    codeflash_output = extract_sentrytrace_data(header); result = codeflash_output # 4.27μs -> 4.16μs (2.50% faster)

def test_edge_numeric_input():
    # Test header whose IDs use only decimal digits (still valid hex)
    header = "12345678901234567890123456789012-1234567890123456-1"
    codeflash_output = extract_sentrytrace_data(header); result = codeflash_output # 6.87μs -> 4.18μs (64.3% faster)

def test_edge_non_hex_span_id():
    # Test header with non-hex span_id
    header = "4bf92f3577b34da6a3ce929d0e0e4736-zzzzzzzzzzzzzzzz-1"
    codeflash_output = extract_sentrytrace_data(header); result = codeflash_output # 4.03μs -> 3.84μs (4.84% faster)

def test_edge_non_hex_trace_id():
    # Test header with non-hex trace_id
    header = "zzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzz-00f067aa0ba902b7-1"
    codeflash_output = extract_sentrytrace_data(header); result = codeflash_output # 2.39μs -> 2.21μs (8.06% faster)

def test_edge_none_fields():
    # Test header with only dashes and no fields
    header = "--"
    codeflash_output = extract_sentrytrace_data(header); result = codeflash_output # 3.01μs -> 3.09μs (2.56% slower)

# 3. Large Scale Test Cases

def test_large_scale_valid_headers():
    # Test a large number of valid headers
    for i in range(100):
        trace_id = f"{i:032x}"
        span_id = f"{i:016x}"
        sampled = str(i % 2)
        header = f"{trace_id}-{span_id}-{sampled}"
        codeflash_output = extract_sentrytrace_data(header); result = codeflash_output # 221μs -> 132μs (67.1% faster)

def test_large_scale_invalid_headers():
    # Test a large number of invalid headers (wrong length)
    for i in range(100):
        trace_id = f"{i:030x}"  # 30 chars instead of 32
        span_id = f"{i:014x}"   # 14 chars instead of 16
        sampled = "2"           # invalid sampled
        header = f"{trace_id}-{span_id}-{sampled}"
        codeflash_output = extract_sentrytrace_data(header); result = codeflash_output # 109μs -> 98.9μs (10.6% faster)

def test_large_scale_mixed_headers():
    # Test a mix of valid and invalid headers
    for i in range(100):
        if i % 2 == 0:
            trace_id = f"{i:032x}"
            span_id = f"{i:016x}"
            sampled = str(i % 2)
            header = f"{trace_id}-{span_id}-{sampled}"
            codeflash_output = extract_sentrytrace_data(header); result = codeflash_output
        else:
            trace_id = f"{i:030x}"  # invalid length
            span_id = f"{i:014x}"   # invalid length
            sampled = "2"           # invalid sampled
            header = f"{trace_id}-{span_id}-{sampled}"
            codeflash_output = extract_sentrytrace_data(header); result = codeflash_output

def test_large_scale_whitespace_headers():
    # Test headers with lots of whitespace
    for i in range(100):
        trace_id = f"{i:032x}"
        span_id = f"{i:016x}"
        sampled = str(i % 2)
        header = f"   {trace_id}-{span_id}-{sampled}   "
        codeflash_output = extract_sentrytrace_data(header); result = codeflash_output # 223μs -> 136μs (64.3% faster)

def test_large_scale_w3c_format_headers():
    # Test W3C format headers with 00-...-00
    for i in range(100):
        trace_id = f"{i:032x}"
        span_id = f"{i:016x}"
        sampled = str(i % 2)
        header = f"00-{trace_id}-{span_id}-0{sampled}-00"
        codeflash_output = extract_sentrytrace_data(header); result = codeflash_output # 195μs -> 181μs (7.62% faster)
# codeflash_output is used to check that the output of the original code is the same as that of the optimized code.

To edit these changes, run `git checkout codeflash/optimize-extract_sentrytrace_data-mg9m9ul7` and push.

@misrasaurabh1 misrasaurabh1 requested a review from a team as a code owner October 16, 2025 06:23
Comment on lines +370 to 371
if trace_id and len(trace_id) != 32:
    trace_id = "{:032x}".format(int(trace_id, 16))
Contributor

If the string formatting is really redundant after the regex match, the string formatting logic can be removed.

There's no point checking len(trace_id) != 32, because to reach this point the regex has matched, so the string must be 32 characters long.
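The reviewer's point can be checked with a small property test. This is a sketch under stated assumptions: the regex is copied from the generated tests above, and `check_round_trip_is_identity` is a hypothetical helper, not part of the SDK.

```python
import random
import re

# Same pattern as the SENTRY_TRACE_REGEX reproduced in the generated tests.
SENTRY_TRACE_REGEX = re.compile(
    "^[ \t]*"  # whitespace
    "([0-9a-f]{32})?"  # trace_id
    "-?([0-9a-f]{16})?"  # span_id
    "-?([01])?"  # sampled
    "[ \t]*$"  # whitespace
)


def check_round_trip_is_identity(samples=1000, seed=0):
    """Return True if every matched ID has the fixed length and
    survives the int()/format() round trip unchanged."""
    rng = random.Random(seed)
    for _ in range(samples):
        trace_id = "{:032x}".format(rng.getrandbits(128))
        span_id = "{:016x}".format(rng.getrandbits(64))
        match = SENTRY_TRACE_REGEX.match(trace_id + "-" + span_id + "-1")
        matched_trace, matched_span, _sampled = match.groups()
        # The {32} and {16} quantifiers fix the captured lengths,
        # so a length check after a successful match can never fire.
        if len(matched_trace) != 32 or len(matched_span) != 16:
            return False
        # Lowercase, fixed-width hex is unchanged by the round trip.
        if "{:032x}".format(int(matched_trace, 16)) != matched_trace:
            return False
        if "{:016x}".format(int(matched_span, 16)) != matched_span:
            return False
    return True
```

If the helper returns `True`, both the length check and the formatting it guards are no-ops for any header the regex accepts, which supports dropping the formatting outright rather than guarding it.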

Member

agreed, can just remove it
