⚡️ Speed up function `extract_sentrytrace_data` by 37% #4944

misrasaurabh1 · 2025-10-16T06:23:30Z

📄 37% (0.37x) speedup for `extract_sentrytrace_data` in `sentry_sdk/tracing_utils.py`

This one looked really important as well. I found a bunch more optimizations, but I don't want to spam you. Would you like to meet for a 30-minute call to discuss how we can better coordinate?

⏱️ Runtime: 3.30 milliseconds → 2.41 milliseconds (best of 124 runs)
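For reference, the headline figure appears to follow from the two runtimes as `original / optimized - 1`. The sketch below only reproduces that arithmetic from the numbers quoted above; the variable names are illustrative.

```python
# Reported timings from the PR description (best of 124 runs), in milliseconds.
original_ms = 3.30
optimized_ms = 2.41

# Speedup expressed as the relative gain: old / new - 1.
speedup = original_ms / optimized_ms - 1
speedup_pct = round(speedup * 100)

print(f"{speedup:.2f}x, about {speedup_pct}%")  # 0.37x, about 37%
```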

📝 Explanation and details

The optimization adds length checks before expensive string formatting operations. Specifically:

**Key Changes:**

- Added a `len(trace_id) != 32` check before `"{:032x}".format(int(trace_id, 16))`
- Added a `len(parent_span_id) != 16` check before `"{:016x}".format(int(parent_span_id, 16))`
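As a rough illustration of the change, the two variants can be written as standalone helpers. The function names below are hypothetical and exist only for this sketch; in the SDK the logic sits inline in `extract_sentrytrace_data`.

```python
def normalize_trace_id_original(trace_id):
    # Behaviour before the PR, as described above: always re-parse and re-format.
    if trace_id:
        trace_id = "{:032x}".format(int(trace_id, 16))
    return trace_id


def normalize_trace_id_guarded(trace_id):
    # Behaviour proposed in the PR: re-format only when the length is not already 32.
    if trace_id and len(trace_id) != 32:
        trace_id = "{:032x}".format(int(trace_id, 16))
    return trace_id


canonical = "0123456789abcdef0123456789abcdef"
print(normalize_trace_id_original(canonical) == normalize_trace_id_guarded(canonical))
print(normalize_trace_id_guarded("abc"))  # short input is still zero-padded
```

The two helpers are not equivalent for every string: a 32-character uppercase ID would be lowercased by the first and left untouched by the second. The regex shown in the generated tests only admits lowercase hex, so that difference should not be reachable after a successful match.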

**Why It's Faster:**
The original code always performed the string-to-int conversion and re-formatting, even when `trace_id` and `parent_span_id` were already in canonical form. The optimization skips both operations when a string already has the expected length (32 hex characters for `trace_id`, 16 for `parent_span_id`).

The `int(trace_id, 16)` and `"{:032x}".format()` operations are computationally expensive, involving:

- Hexadecimal string parsing
- Integer conversion
- String formatting with zero-padding
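The steps listed above can be sketched in isolation. This is an illustrative helper, not code from the SDK; `roundtrip_span_id` and the sample IDs are assumptions made for the example.

```python
short_id = "f067aa0ba902b7"        # 14 hex characters, not zero-padded
canonical_id = "00f067aa0ba902b7"  # 16 hex characters, already canonical


def roundtrip_span_id(span_id):
    # Parse the hexadecimal digits into a Python integer.
    parsed = int(span_id, 16)
    # Format the integer back as lowercase hex, zero-padded to 16 characters.
    return "{:016x}".format(parsed)


print(roundtrip_span_id(short_id))      # padded to 16 characters
print(roundtrip_span_id(canonical_id))  # same string as the input
```

For an ID that is already 16 lowercase hex characters, the round trip returns an equal string, which is why skipping it does not change the result in that case.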

**Performance Impact:**
Test results show the optimization is most effective when trace IDs and span IDs are already properly formatted (which is common in production). For example, `test_valid_full_header` shows a 51.6% speedup and `test_missing_trace_id` a 65.9% speedup. Headers that fail the regex, or that carry no trace or span ID, see only small changes: most are roughly 1-13% faster, and a few (such as `test_empty_header`, 6.71% slower) are marginally slower.

This is particularly valuable for high-throughput tracing scenarios where most headers contain well-formatted trace data.
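A minimal micro-benchmark sketch of the same idea, using only the standard library. The helper names are hypothetical, and absolute timings will vary by machine and Python version, so they are not expected to match the figures reported in this PR.

```python
import timeit

trace_id = "0123456789abcdef0123456789abcdef"


def always_format():
    # Unconditional parse-and-format round trip.
    return "{:032x}".format(int(trace_id, 16))


def check_length_first():
    # Skip the round trip when the string already has 32 characters.
    if len(trace_id) != 32:
        return "{:032x}".format(int(trace_id, 16))
    return trace_id


t_format = timeit.timeit(always_format, number=100_000)
t_guard = timeit.timeit(check_length_first, number=100_000)
print(f"always format: {t_format:.4f}s, length check first: {t_guard:.4f}s")
```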

✅ Correctness verification report:

| Test | Status |
| --- | --- |
| ⚙️ Existing Unit Tests | ✅ 21 Passed |
| 🌀 Generated Regression Tests | ✅ 1960 Passed |
| ⏪ Replay Tests | 🔘 None Found |
| 🔎 Concolic Coverage Tests | 🔘 None Found |
| 📊 Tests Coverage | 100.0% |

⚙️ Existing Unit Tests and Runtime

| Test File::Test Function | Original ⏱️ | Optimized ⏱️ | Speedup |
| --- | --- | --- | --- |
| `tracing/test_http_headers.py::test_sentrytrace_extraction` | 13.8μs | 9.03μs | 52.8%✅ |

🌀 Generated Regression Tests and Runtime

import re
from typing import Dict, Optional, Union

# imports
import pytest  # used for our unit tests
from sentry_sdk.tracing_utils import extract_sentrytrace_data

SENTRY_TRACE_REGEX = re.compile(
    "^[ \t]*"  # whitespace
    "([0-9a-f]{32})?"  # trace_id
    "-?([0-9a-f]{16})?"  # span_id
    "-?([01])?"  # sampled
    "[ \t]*$"  # whitespace
)

# unit tests

# ----------------------
# Basic Test Cases
# ----------------------

def test_valid_full_header():
    # Basic valid header: all fields present
    header = "0123456789abcdef0123456789abcdef-0123456789abcdef-1"
    codeflash_output = extract_sentrytrace_data(header); result = codeflash_output # 7.43μs -> 4.90μs (51.6% faster)

def test_valid_full_header_sampled_false():
    # Basic valid header: sampled = 0
    header = "abcdefabcdefabcdefabcdefabcdefabcd-abcdefabcdefabcd-0"
    codeflash_output = extract_sentrytrace_data(header); result = codeflash_output # 3.88μs -> 3.71μs (4.77% faster)

def test_valid_header_missing_sampled():
    # Valid header: missing sampled field
    header = "0123456789abcdef0123456789abcdef-0123456789abcdef"
    codeflash_output = extract_sentrytrace_data(header); result = codeflash_output # 6.80μs -> 4.12μs (65.2% faster)

def test_valid_header_only_trace_id():
    # Valid header: only trace_id present
    header = "0123456789abcdef0123456789abcdef"
    codeflash_output = extract_sentrytrace_data(header); result = codeflash_output # 5.17μs -> 3.62μs (42.7% faster)

def test_valid_header_with_whitespace():
    # Valid header with leading/trailing whitespace
    header = "  0123456789abcdef0123456789abcdef-0123456789abcdef-1  "
    codeflash_output = extract_sentrytrace_data(header); result = codeflash_output # 6.74μs -> 4.19μs (60.8% faster)

# ----------------------
# Edge Test Cases
# ----------------------

def test_empty_header():
    # Edge: empty string
    header = ""
    codeflash_output = extract_sentrytrace_data(header); result = codeflash_output # 459ns -> 492ns (6.71% slower)

def test_none_header():
    # Edge: None as input
    header = None
    codeflash_output = extract_sentrytrace_data(header); result = codeflash_output # 473ns -> 444ns (6.53% faster)

def test_invalid_trace_id_length():
    # Edge: trace_id too short
    header = "01234567-0123456789abcdef-1"
    codeflash_output = extract_sentrytrace_data(header); result = codeflash_output # 3.00μs -> 2.83μs (6.00% faster)

def test_invalid_span_id_length():
    # Edge: span_id too short
    header = "0123456789abcdef0123456789abcdef-01234567-1"
    codeflash_output = extract_sentrytrace_data(header); result = codeflash_output # 4.21μs -> 4.14μs (1.74% faster)

def test_invalid_sampled_value():
    # Edge: sampled value not 0 or 1
    header = "0123456789abcdef0123456789abcdef-0123456789abcdef-2"
    codeflash_output = extract_sentrytrace_data(header); result = codeflash_output # 4.56μs -> 4.50μs (1.36% faster)

def test_invalid_characters_in_trace_id():
    # Edge: invalid characters in trace_id
    header = "zzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzz-0123456789abcdef-1"
    codeflash_output = extract_sentrytrace_data(header); result = codeflash_output # 2.50μs -> 2.37μs (5.36% faster)

def test_invalid_characters_in_span_id():
    # Edge: invalid characters in span_id
    header = "0123456789abcdef0123456789abcdef-zzzzzzzzzzzzzzzz-1"
    codeflash_output = extract_sentrytrace_data(header); result = codeflash_output # 4.03μs -> 3.82μs (5.69% faster)

def test_missing_trace_id():
    # Edge: missing trace_id, only span_id and sampled
    header = "-0123456789abcdef-1"
    codeflash_output = extract_sentrytrace_data(header); result = codeflash_output # 6.44μs -> 3.88μs (65.9% faster)

def test_missing_span_id():
    # Edge: missing span_id, only trace_id and sampled
    header = "0123456789abcdef0123456789abcdef--1"
    codeflash_output = extract_sentrytrace_data(header); result = codeflash_output # 5.88μs -> 4.00μs (47.0% faster)

def test_header_with_extra_fields():
    # Edge: extra fields after sampled
    header = "0123456789abcdef0123456789abcdef-0123456789abcdef-1-extra"
    codeflash_output = extract_sentrytrace_data(header); result = codeflash_output # 4.96μs -> 4.80μs (3.16% faster)

def test_header_with_internal_spaces():
    # Edge: spaces inside the trace_id/span_id
    header = "0123456789abcd ef0123456789abcdef-0123456789abcdef-1"
    codeflash_output = extract_sentrytrace_data(header); result = codeflash_output # 2.74μs -> 2.54μs (7.95% faster)

def test_header_with_leading_and_trailing_dash():
    # Edge: leading and trailing dash
    header = "-0123456789abcdef0123456789abcdef-0123456789abcdef-1-"
    codeflash_output = extract_sentrytrace_data(header); result = codeflash_output # 3.44μs -> 3.26μs (5.51% faster)

def test_header_with_only_sampled():
    # Edge: only sampled value
    header = "-1"
    codeflash_output = extract_sentrytrace_data(header); result = codeflash_output # 3.50μs -> 3.44μs (1.98% faster)

def test_header_with_only_span_id():
    # Edge: only span_id, no trace_id
    header = "-0123456789abcdef"
    codeflash_output = extract_sentrytrace_data(header); result = codeflash_output # 5.94μs -> 3.64μs (63.3% faster)

def test_header_with_leading_and_trailing_tabs():
    # Edge: tabs instead of spaces
    header = "\t0123456789abcdef0123456789abcdef-0123456789abcdef-1\t"
    codeflash_output = extract_sentrytrace_data(header); result = codeflash_output # 7.15μs -> 4.24μs (68.6% faster)

def test_header_with_tracing_format():
    # Edge: header with "00-" prefix and "-00" suffix
    header = "00-0123456789abcdef0123456789abcdef-0123456789abcdef-1-00"
    codeflash_output = extract_sentrytrace_data(header); result = codeflash_output # 7.12μs -> 4.80μs (48.4% faster)

def test_header_with_tracing_format_and_whitespace():
    # Edge: header with "00-" prefix, "-00" suffix, and whitespace
    header = " 00-0123456789abcdef0123456789abcdef-0123456789abcdef-1-00 "
    codeflash_output = extract_sentrytrace_data(header); result = codeflash_output # 3.12μs -> 3.04μs (2.97% faster)

def test_header_with_uppercase_hex():
    # Edge: uppercase hex letters in trace_id and span_id
    header = "ABCDEFABCDEFABCDEFABCDEFABCDEFAB-ABCDEFABCDEFABCD-1"
    codeflash_output = extract_sentrytrace_data(header); result = codeflash_output # 2.43μs -> 2.27μs (7.33% faster)

def test_header_with_mixedcase_hex():
    # Edge: mixed-case hex letters in trace_id and span_id
    header = "aBcDeFaBcDeFaBcDeFaBcDeFaBcDeFaB-aBcDeFaBcDeFaBcD-1"
    codeflash_output = extract_sentrytrace_data(header); result = codeflash_output # 2.41μs -> 2.34μs (3.21% faster)

def test_header_with_leading_and_trailing_spaces_and_tabs():
    # Edge: both spaces and tabs
    header = " \t0123456789abcdef0123456789abcdef-0123456789abcdef-1\t "
    codeflash_output = extract_sentrytrace_data(header); result = codeflash_output # 7.61μs -> 4.67μs (62.9% faster)

# ----------------------
# Large Scale Test Cases
# ----------------------

def test_many_valid_headers():
    # Large scale: test many valid headers
    for i in range(100):
        trace_id = f"{i:032x}"
        span_id = f"{i:016x}"
        sampled = str(i % 2)
        header = f"{trace_id}-{span_id}-{sampled}"
        codeflash_output = extract_sentrytrace_data(header); result = codeflash_output # 224μs -> 132μs (69.3% faster)

def test_many_invalid_headers():
    # Large scale: test many invalid headers
    for i in range(100):
        # Too short trace_id
        header = f"{i:08x}-0123456789abcdef-1"
        codeflash_output = extract_sentrytrace_data(header); result = codeflash_output # 81.9μs -> 75.1μs (9.08% faster)

        # Too short span_id
        header = f"0123456789abcdef0123456789abcdef-{i:08x}-1"
        codeflash_output = extract_sentrytrace_data(header); result = codeflash_output # 146μs -> 138μs (6.19% faster)

        # Invalid sampled value
        header = f"0123456789abcdef0123456789abcdef-0123456789abcdef-{i+2}"
        codeflash_output = extract_sentrytrace_data(header); result = codeflash_output

def test_large_header_with_valid_data():
    # Large scale: test a header with max valid values
    trace_id = "f" * 32
    span_id = "e" * 16
    sampled = "1"
    header = f"{trace_id}-{span_id}-{sampled}"
    codeflash_output = extract_sentrytrace_data(header); result = codeflash_output # 7.04μs -> 4.33μs (62.6% faster)

def test_large_header_with_tracing_format():
    # Large scale: header with tracing format and large values
    trace_id = "a" * 32
    span_id = "b" * 16
    sampled = "0"
    header = f"00-{trace_id}-{span_id}-{sampled}-00"
    codeflash_output = extract_sentrytrace_data(header); result = codeflash_output # 6.91μs -> 4.47μs (54.6% faster)

def test_bulk_headers_performance():
    # Large scale: test performance with 500 valid and 500 invalid headers
    valid_headers = [
        f"{i:032x}-{i:016x}-{i%2}" for i in range(500)
    ]
    invalid_headers = [
        f"{i:08x}-badspanid-{i%2}" for i in range(500)
    ]
    # Check all valid headers parse correctly
    for i, header in enumerate(valid_headers):
        codeflash_output = extract_sentrytrace_data(header); result = codeflash_output # 1.08ms -> 651μs (65.3% faster)
    # Check all invalid headers return None
    for header in invalid_headers:
        codeflash_output = extract_sentrytrace_data(header); result = codeflash_output # 398μs -> 368μs (8.23% faster)
# codeflash_output is used to check that the output of the original code is the same as that of the optimized code.
#------------------------------------------------
import re

# imports
import pytest  # used for our unit tests
from sentry_sdk.tracing_utils import extract_sentrytrace_data

SENTRY_TRACE_REGEX = re.compile(
    "^[ \t]*"  # whitespace
    "([0-9a-f]{32})?"  # trace_id
    "-?([0-9a-f]{16})?"  # span_id
    "-?([01])?"  # sampled
    "[ \t]*$"  # whitespace
)

# unit tests

# 1. Basic Test Cases

def test_basic_valid_full_header():
    # Test a header with all fields present and valid
    header = "4bf92f3577b34da6a3ce929d0e0e4736-00f067aa0ba902b7-1"
    codeflash_output = extract_sentrytrace_data(header); result = codeflash_output # 6.56μs -> 4.35μs (50.8% faster)

def test_basic_valid_unsampled():
    # Test a header with sampled bit set to 0
    header = "4bf92f3577b34da6a3ce929d0e0e4736-00f067aa0ba902b7-0"
    codeflash_output = extract_sentrytrace_data(header); result = codeflash_output # 6.58μs -> 4.20μs (56.8% faster)

def test_basic_valid_no_sampled():
    # Test a header with no sampled bit
    header = "4bf92f3577b34da6a3ce929d0e0e4736-00f067aa0ba902b7"
    codeflash_output = extract_sentrytrace_data(header); result = codeflash_output # 5.89μs -> 3.96μs (48.7% faster)

def test_basic_valid_only_trace_id():
    # Test a header with only trace_id
    header = "4bf92f3577b34da6a3ce929d0e0e4736"
    codeflash_output = extract_sentrytrace_data(header); result = codeflash_output # 5.14μs -> 3.48μs (47.8% faster)

def test_basic_valid_trace_and_sampled():
    # Test a header with trace_id and sampled bit
    header = "4bf92f3577b34da6a3ce929d0e0e4736--1"
    codeflash_output = extract_sentrytrace_data(header); result = codeflash_output # 5.47μs -> 3.81μs (43.7% faster)

def test_basic_valid_span_and_sampled():
    # Test a header with span_id and sampled bit, but no trace_id
    header = "-00f067aa0ba902b7-1"
    codeflash_output = extract_sentrytrace_data(header); result = codeflash_output # 5.28μs -> 3.75μs (41.0% faster)

def test_basic_valid_only_span_id():
    # Test a header with only span_id
    header = "-00f067aa0ba902b7"
    codeflash_output = extract_sentrytrace_data(header); result = codeflash_output # 5.24μs -> 3.53μs (48.6% faster)

def test_basic_valid_only_sampled():
    # Test a header with only sampled bit
    header = "--1"
    codeflash_output = extract_sentrytrace_data(header); result = codeflash_output # 3.15μs -> 3.19μs (1.25% slower)

def test_basic_valid_only_sampled_zero():
    # Test a header with only sampled bit set to zero
    header = "--0"
    codeflash_output = extract_sentrytrace_data(header); result = codeflash_output # 3.17μs -> 2.94μs (7.93% faster)

def test_basic_valid_whitespace():
    # Test a header with leading and trailing whitespace
    header = "   4bf92f3577b34da6a3ce929d0e0e4736-00f067aa0ba902b7-1   "
    codeflash_output = extract_sentrytrace_data(header); result = codeflash_output # 6.64μs -> 4.36μs (52.4% faster)

# 2. Edge Test Cases

def test_edge_empty_string():
    # Test empty string input
    header = ""
    codeflash_output = extract_sentrytrace_data(header); result = codeflash_output # 410ns -> 443ns (7.45% slower)

def test_edge_none_input():
    # Test None input
    header = None
    codeflash_output = extract_sentrytrace_data(header); result = codeflash_output # 466ns -> 449ns (3.79% faster)

def test_edge_invalid_characters():
    # Test header with invalid characters
    header = "zzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzz-00f067aa0ba902b7-1"
    codeflash_output = extract_sentrytrace_data(header); result = codeflash_output # 2.65μs -> 2.59μs (2.32% faster)

def test_edge_too_short_trace_id():
    # Test header with too short trace_id
    header = "4bf92f3577b34da6a3ce929d0e0e47-00f067aa0ba902b7-1"
    codeflash_output = extract_sentrytrace_data(header); result = codeflash_output # 3.28μs -> 3.18μs (3.08% faster)

def test_edge_too_long_trace_id():
    # Test header with too long trace_id
    header = "4bf92f3577b34da6a3ce929d0e0e47361234-00f067aa0ba902b7-1"
    codeflash_output = extract_sentrytrace_data(header); result = codeflash_output # 3.80μs -> 3.36μs (12.9% faster)

def test_edge_too_short_span_id():
    # Test header with too short span_id
    header = "4bf92f3577b34da6a3ce929d0e0e4736-00f067aa0ba902b-1"
    codeflash_output = extract_sentrytrace_data(header); result = codeflash_output # 4.25μs -> 3.81μs (11.4% faster)

def test_edge_too_long_span_id():
    # Test header with too long span_id
    header = "4bf92f3577b34da6a3ce929d0e0e4736-00f067aa0ba902b71234-1"
    codeflash_output = extract_sentrytrace_data(header); result = codeflash_output # 4.75μs -> 4.50μs (5.40% faster)

def test_edge_invalid_sampled_bit():
    # Test header with invalid sampled bit
    header = "4bf92f3577b34da6a3ce929d0e0e4736-00f067aa0ba902b7-2"
    codeflash_output = extract_sentrytrace_data(header); result = codeflash_output # 4.57μs -> 4.29μs (6.53% faster)

def test_edge_extra_fields():
    # Test header with extra fields
    header = "4bf92f3577b34da6a3ce929d0e0e4736-00f067aa0ba902b7-1-extra"
    codeflash_output = extract_sentrytrace_data(header); result = codeflash_output # 4.68μs -> 4.17μs (12.1% faster)

def test_edge_only_dashes():
    # Test header with only dashes
    header = "---"
    codeflash_output = extract_sentrytrace_data(header); result = codeflash_output # 2.81μs -> 2.59μs (8.34% faster)

def test_edge_dash_at_start_and_end():
    # Test header with dash at start and end, but valid in the middle
    header = "-4bf92f3577b34da6a3ce929d0e0e4736-"
    codeflash_output = extract_sentrytrace_data(header); result = codeflash_output # 3.24μs -> 3.17μs (2.05% faster)

def test_edge_trace_id_uppercase():
    # Test header with uppercase hex letters (the regex is lowercase-only, so no match is expected)
    header = "4BF92F3577B34DA6A3CE929D0E0E4736-00F067AA0BA902B7-1"
    codeflash_output = extract_sentrytrace_data(header); result = codeflash_output # 2.32μs -> 2.19μs (6.12% faster)

def test_edge_trace_id_leading_zeros():
    # Test header with trace_id having leading zeros
    header = "00000000000000000000000000000001-0000000000000001-1"
    codeflash_output = extract_sentrytrace_data(header); result = codeflash_output # 7.51μs -> 4.51μs (66.5% faster)

def test_edge_trace_id_all_zeros():
    # Test header with trace_id all zeros
    header = "00000000000000000000000000000000-0000000000000000-0"
    codeflash_output = extract_sentrytrace_data(header); result = codeflash_output # 6.10μs -> 4.02μs (51.7% faster)

def test_edge_trace_id_and_span_id_only():
    # Test header with trace_id and span_id, but sampled missing
    header = "4bf92f3577b34da6a3ce929d0e0e4736-00f067aa0ba902b7"
    codeflash_output = extract_sentrytrace_data(header); result = codeflash_output # 6.37μs -> 3.96μs (60.9% faster)

def test_edge_w3c_format():
    # Test header in W3C format with 00-...-00
    header = "00-4bf92f3577b34da6a3ce929d0e0e4736-00f067aa0ba902b7-00"
    codeflash_output = extract_sentrytrace_data(header); result = codeflash_output # 6.86μs -> 4.41μs (55.6% faster)

def test_edge_w3c_format_sampled():
    # Test header in W3C format with sampled bit set to 1
    header = "00-4bf92f3577b34da6a3ce929d0e0e4736-00f067aa0ba902b7-01"
    codeflash_output = extract_sentrytrace_data(header); result = codeflash_output # 2.83μs -> 2.79μs (1.65% faster)

def test_edge_leading_and_trailing_whitespace():
    # Test header with tabs and spaces
    header = "\t 4bf92f3577b34da6a3ce929d0e0e4736-00f067aa0ba902b7-1 \t"
    codeflash_output = extract_sentrytrace_data(header); result = codeflash_output # 7.16μs -> 4.45μs (60.8% faster)

def test_edge_invalid_dash_positions():
    # Test header with dashes in wrong positions
    header = "4bf92f3577b34da6a3ce929d0e0e4736--00f067aa0ba902b7-1"
    codeflash_output = extract_sentrytrace_data(header); result = codeflash_output # 4.27μs -> 4.16μs (2.50% faster)

def test_edge_numeric_input():
    # Test header with numeric input instead of hex
    header = "12345678901234567890123456789012-1234567890123456-1"
    codeflash_output = extract_sentrytrace_data(header); result = codeflash_output # 6.87μs -> 4.18μs (64.3% faster)

def test_edge_non_hex_span_id():
    # Test header with non-hex span_id
    header = "4bf92f3577b34da6a3ce929d0e0e4736-zzzzzzzzzzzzzzzz-1"
    codeflash_output = extract_sentrytrace_data(header); result = codeflash_output # 4.03μs -> 3.84μs (4.84% faster)

def test_edge_non_hex_trace_id():
    # Test header with non-hex trace_id
    header = "zzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzz-00f067aa0ba902b7-1"
    codeflash_output = extract_sentrytrace_data(header); result = codeflash_output # 2.39μs -> 2.21μs (8.06% faster)

def test_edge_none_fields():
    # Test header with only dashes and no fields
    header = "--"
    codeflash_output = extract_sentrytrace_data(header); result = codeflash_output # 3.01μs -> 3.09μs (2.56% slower)

# 3. Large Scale Test Cases

def test_large_scale_valid_headers():
    # Test a large number of valid headers
    for i in range(100):
        trace_id = f"{i:032x}"
        span_id = f"{i:016x}"
        sampled = str(i % 2)
        header = f"{trace_id}-{span_id}-{sampled}"
        codeflash_output = extract_sentrytrace_data(header); result = codeflash_output # 221μs -> 132μs (67.1% faster)

def test_large_scale_invalid_headers():
    # Test a large number of invalid headers (wrong length)
    for i in range(100):
        trace_id = f"{i:030x}"  # 30 chars instead of 32
        span_id = f"{i:014x}"   # 14 chars instead of 16
        sampled = "2"           # invalid sampled
        header = f"{trace_id}-{span_id}-{sampled}"
        codeflash_output = extract_sentrytrace_data(header); result = codeflash_output # 109μs -> 98.9μs (10.6% faster)

def test_large_scale_mixed_headers():
    # Test a mix of valid and invalid headers
    for i in range(100):
        if i % 2 == 0:
            trace_id = f"{i:032x}"
            span_id = f"{i:016x}"
            sampled = str(i % 2)
            header = f"{trace_id}-{span_id}-{sampled}"
            codeflash_output = extract_sentrytrace_data(header); result = codeflash_output
        else:
            trace_id = f"{i:030x}"  # invalid length
            span_id = f"{i:014x}"   # invalid length
            sampled = "2"           # invalid sampled
            header = f"{trace_id}-{span_id}-{sampled}"
            codeflash_output = extract_sentrytrace_data(header); result = codeflash_output

def test_large_scale_whitespace_headers():
    # Test headers with lots of whitespace
    for i in range(100):
        trace_id = f"{i:032x}"
        span_id = f"{i:016x}"
        sampled = str(i % 2)
        header = f"   {trace_id}-{span_id}-{sampled}   "
        codeflash_output = extract_sentrytrace_data(header); result = codeflash_output # 223μs -> 136μs (64.3% faster)

def test_large_scale_w3c_format_headers():
    # Test W3C format headers with 00-...-00
    for i in range(100):
        trace_id = f"{i:032x}"
        span_id = f"{i:016x}"
        sampled = str(i % 2)
        header = f"00-{trace_id}-{span_id}-0{sampled}-00"
        codeflash_output = extract_sentrytrace_data(header); result = codeflash_output # 195μs -> 181μs (7.62% faster)
# codeflash_output is used to check that the output of the original code is the same as that of the optimized code.

To edit these changes, run `git checkout codeflash/optimize-extract_sentrytrace_data-mg9m9ul7` and push.

alexander-alderman-webb · 2025-10-16T13:56:27Z

sentry_sdk/tracing_utils.py

+    if trace_id and len(trace_id) != 32:
        trace_id = "{:032x}".format(int(trace_id, 16))


If the string formatting is really redundant after the regex match, the string formatting logic can be removed.

There's no point checking len(trace_id) != 32, because to reach this point the regex has matched, so the string must be 32 characters long.

agreed, can just remove it
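The reviewers' point can be checked with a short sketch. It assumes the regex copied into the generated tests above mirrors the one used by `extract_sentrytrace_data`; the helper names and the sample headers are illustrative only.

```python
import re

# Regex as it appears in the generated tests of this PR (assumed to match the SDK's).
SENTRY_TRACE_REGEX = re.compile(
    "^[ \t]*"  # whitespace
    "([0-9a-f]{32})?"  # trace_id
    "-?([0-9a-f]{16})?"  # span_id
    "-?([01])?"  # sampled
    "[ \t]*$"  # whitespace
)

headers = [
    "4bf92f3577b34da6a3ce929d0e0e4736-00f067aa0ba902b7-1",
    "4bf92f3577b34da6a3ce929d0e0e4736",
    "-00f067aa0ba902b7-1",
    "--1",
    "01234567-0123456789abcdef-1",  # short trace_id: the regex does not match
]


def length_guard_fires(header):
    # True if a matched ID has a length other than 32 (trace) or 16 (span).
    match = SENTRY_TRACE_REGEX.match(header)
    if not match:
        return False
    trace_id, span_id, _sampled = match.groups()
    return bool(
        (trace_id and len(trace_id) != 32) or (span_id and len(span_id) != 16)
    )


def roundtrip_changes_ids(header):
    # True if re-formatting a matched ID would produce a different string.
    match = SENTRY_TRACE_REGEX.match(header)
    if not match:
        return False
    trace_id, span_id, _sampled = match.groups()
    changed_trace = bool(trace_id) and "{:032x}".format(int(trace_id, 16)) != trace_id
    changed_span = bool(span_id) and "{:016x}".format(int(span_id, 16)) != span_id
    return changed_trace or changed_span


print(any(length_guard_fires(h) for h in headers))     # False
print(any(roundtrip_changes_ids(h) for h in headers))  # False
```

Under that assumption, a matched group is always exactly 32 or 16 lowercase hex characters, so neither the length check nor the re-formatting has any effect, which supports removing the formatting logic rather than guarding it.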

misrasaurabh1 requested a review from a team as a code owner October 16, 2025 06:23

Merge branch 'master' into codeflash/optimize-extract_sentrytrace_dat…

f681575

…a-mg9m9ul7

misrasaurabh1 commented Oct 16, 2025

Apply suggestion from @misrasaurabh1

c1c6bc5

alexander-alderman-webb reviewed Oct 16, 2025

sl0thentr0py Oct 16, 2025

3 participants
