⚡️ Speed up function get_string_prefix by 8% in src/black/strings.py #63

Open
wants to merge 1 commit into
base: main

Conversation

codeflash-ai[bot]

@codeflash-ai codeflash-ai bot commented Dec 16, 2024

📄 get_string_prefix in src/black/strings.py

✨ Performance Summary:

  • Speed Increase: 📈 8% (0.08x faster)
  • Runtime Reduction: ⏱️ From 89.9 microseconds down to 83.2 microseconds (best of 306 runs)
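The two figures above can be cross-checked with a little arithmetic. The sketch below assumes the reported percentage is the time saved divided by the optimized runtime; that definition is an assumption, not something stated in this PR, so the alternative (time saved divided by the original runtime) is computed alongside it.

```python
# Timings reported in the performance summary, in microseconds.
original_us = 89.9
optimized_us = 83.2

# Assumed definition: time saved relative to the optimized runtime.
speedup = (original_us - optimized_us) / optimized_us

# Alternative definition: time saved relative to the original runtime.
reduction = (original_us - optimized_us) / original_us

print(f"speedup: {speedup:.1%}, runtime reduction: {reduction:.1%}")
```

Under the first definition the result rounds to the reported 8%; under the second it is closer to 7.5%.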

📝 Explanation and details

Here's an optimized version of the provided Python code. The goal is to remove unnecessary operations in the `assert_is_leaf_string` and `get_string_prefix` functions.

Explanation of Optimizations

  1. Reduced Redundant Check: The original code calls `string.find('"')` and `string.find("'")` and then processes the resulting indexes several times. I replaced this with a single pass through the string that finds the first quote and validates the prefix in one go.
  2. Unified Error Checking: Instead of separate assertions, the code now performs one combined upfront check of the string length and of the starting and ending quote characters.
  3. Optimized Prefix Extraction: Instead of appending characters to a prefix string one at a time, I used a loop that collects the characters found in `STRING_PREFIX_CHARS` into a list and joins them at the end. Joining a list is faster than repeated string concatenation for larger inputs.

These changes make the code faster (about 8% in the benchmark above) while maintaining the original logic and behavior.
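The diff itself is not shown on this page, so the following is only a minimal sketch of the two ideas described above: a single-pass validation and a list-and-join prefix extraction. The function names `check_leaf_string` and `prefix_by_join` are hypothetical and are not taken from `src/black/strings.py`; only `STRING_PREFIX_CHARS` matches the constant shown in the generated tests.

```python
from typing import Final

STRING_PREFIX_CHARS: Final = "furbFURB"  # same value the generated tests define
QUOTE_CHARS: Final = "'\""


def check_leaf_string(string: str) -> int:
    """Hypothetical single-pass validator; returns the index of the first quote."""
    for idx, char in enumerate(string):
        if char in QUOTE_CHARS:
            break
        # Every character before the first quote must be a prefix character.
        assert char in STRING_PREFIX_CHARS, f"{char!r} is not a valid prefix character."
    else:
        # Reached only when the loop never hit a quote (including the empty string).
        raise AssertionError(f"{string!r} is missing a starting quote character.")
    # The opening quote may not be the last character, and the string must end in a quote.
    assert idx < len(string) - 1, f"{string!r} is missing an ending quote character."
    assert string[-1] in QUOTE_CHARS, f"{string!r} is missing an ending quote character."
    return idx


def prefix_by_join(string: str) -> str:
    """Hypothetical prefix extraction that collects characters and joins once."""
    check_leaf_string(string)
    chars = []
    for char in string:
        if char not in STRING_PREFIX_CHARS:
            break
        chars.append(char)
    return "".join(chars)


print(repr(prefix_by_join("rf'hello'")))
print(repr(prefix_by_join('"world"')))
```

Because real string prefixes are only a few characters long, the benefit of joining a list over concatenating is likely to be small in practice; the sketch is meant to show the shape of the change rather than to reproduce the measured speedup.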


Correctness verification

The new optimized code was tested for correctness. The results are listed below:

| Test | Status | Details |
|------|--------|---------|
| ⚙️ Existing Unit Tests | 🔘 None Found | |
| 🌀 Generated Regression Tests | 55 Passed | See below |
| ⏪ Replay Tests | 🔘 None Found | |
| 🔎 Concolic Coverage Tests | 1 Passed | |
| 📊 Coverage | 100.0% | |

🌀 Generated Regression Tests Details

from typing import Final

# imports
import pytest  # used for our unit tests
from black.strings import get_string_prefix

STRING_PREFIX_CHARS: Final = "furbFURB"  # All possible string prefix characters.

# unit tests

# Basic Test Cases
def test_valid_simple_prefixes():
    codeflash_output = get_string_prefix("f'hello'")
    codeflash_output = get_string_prefix("r'world'")
    codeflash_output = get_string_prefix("rf'hello'")
    codeflash_output = get_string_prefix("fr'world'")

def test_valid_mixed_case_prefixes():
    codeflash_output = get_string_prefix("Fr'hello'")
    codeflash_output = get_string_prefix("rF'world'")

def test_no_prefix():
    codeflash_output = get_string_prefix("'hello'")
    codeflash_output = get_string_prefix('"world"')

# Edge Test Cases
def test_invalid_prefixes():
    with pytest.raises(AssertionError):
        get_string_prefix("x'hello'")
    with pytest.raises(AssertionError):
        get_string_prefix("1'world'")

def test_missing_quotes():
    with pytest.raises(AssertionError):
        get_string_prefix("fhello'")
    with pytest.raises(AssertionError):
        get_string_prefix("f'hello")
    with pytest.raises(AssertionError):
        get_string_prefix("hello'")
    with pytest.raises(AssertionError):
        get_string_prefix("'hello")

def test_empty_string():
    with pytest.raises(AssertionError):
        get_string_prefix("")

def test_single_character_strings():
    with pytest.raises(AssertionError):
        get_string_prefix("'")
    with pytest.raises(AssertionError):
        get_string_prefix('"')

def test_strings_with_escaped_quotes():
    with pytest.raises(AssertionError):
        get_string_prefix(r"\"hello\"")

# Large Scale Test Cases

def test_large_scale_no_prefix():
    codeflash_output = get_string_prefix("'" + "a" * 10000 + "'end'")
    codeflash_output = get_string_prefix('"' + "b" * 10000 + '"end"')

# Additional Edge Cases
def test_only_prefix_characters():
    codeflash_output = get_string_prefix("fffff'hello'")
    codeflash_output = get_string_prefix("rfrfrf'world'")

def test_mixed_valid_and_invalid_prefix_characters():
    with pytest.raises(AssertionError):
        get_string_prefix("frx'hello'")
    with pytest.raises(AssertionError):
        get_string_prefix("xrf'world'")
# codeflash_output is used to check that the output of the original code is the same as that of the optimized code.
from typing import Final

# imports
import pytest  # used for our unit tests
from black.strings import get_string_prefix

STRING_PREFIX_CHARS: Final = "furbFURB"  # All possible string prefix characters.

# unit tests

# Test valid inputs with no prefix
def test_valid_no_prefix():
    codeflash_output = get_string_prefix('"hello"')  # No prefix, double quotes
    codeflash_output = get_string_prefix("'world'")  # No prefix, single quotes

# Test valid inputs with single character prefix
def test_valid_single_char_prefix():
    codeflash_output = get_string_prefix('r"hello"')  # Single character prefix 'r'
    codeflash_output = get_string_prefix("f'world'")  # Single character prefix 'f'

# Test valid inputs with multiple character prefix
def test_valid_multiple_char_prefix():
    codeflash_output = get_string_prefix('rf"hello"')  # Multiple character prefix 'rf'
    codeflash_output = get_string_prefix("fr'world'")  # Multiple character prefix 'fr'

# Test valid inputs with mixed case prefix
def test_valid_mixed_case_prefix():
    codeflash_output = get_string_prefix('Rf"hello"')  # Mixed case prefix 'Rf'
    codeflash_output = get_string_prefix("Fr'world'")  # Mixed case prefix 'Fr'

# Test invalid inputs with missing starting quote
def test_invalid_missing_starting_quote():
    with pytest.raises(AssertionError):
        get_string_prefix("hello'")  # Missing starting quote
    with pytest.raises(AssertionError):
        get_string_prefix('rhello"')  # Missing starting quote

# Test invalid inputs with missing ending quote
def test_invalid_missing_ending_quote():
    with pytest.raises(AssertionError):
        get_string_prefix("'hello")  # Missing ending quote
    with pytest.raises(AssertionError):
        get_string_prefix('r"hello')  # Missing ending quote

# Test invalid inputs with invalid prefix characters
def test_invalid_prefix_characters():
    with pytest.raises(AssertionError):
        get_string_prefix("x'hello'")  # Invalid prefix 'x'
    with pytest.raises(AssertionError):
        get_string_prefix('rx"world"')  # Invalid prefix 'rx'

# Test invalid inputs with no quotes at all
def test_no_quotes_at_all():
    with pytest.raises(AssertionError):
        get_string_prefix("hello")  # No quotes at all
    with pytest.raises(AssertionError):
        get_string_prefix("rworld")  # No quotes at all

# Test edge case with only quotes (empty string literals)
def test_edge_only_quotes():
    codeflash_output = get_string_prefix('""')  # Only double quotes
    codeflash_output = get_string_prefix("''")  # Only single quotes

# Test edge case with only prefix characters
def test_edge_only_prefix_characters():
    with pytest.raises(AssertionError):
        get_string_prefix("f")  # Only prefix character 'f'
    with pytest.raises(AssertionError):
        get_string_prefix("r")  # Only prefix character 'r'

# Test edge case with prefix but no content
def test_edge_prefix_no_content():
    codeflash_output = get_string_prefix('f""')  # Prefix 'f' with no content
    codeflash_output = get_string_prefix("r''")  # Prefix 'r' with no content

# Test large scale inputs with valid prefix
def test_large_scale_valid_prefix():
    codeflash_output = get_string_prefix('f"{}"'.format("a" * 10000))  # Long string with prefix 'f'
    codeflash_output = get_string_prefix("r'{}'".format("b" * 10000))  # Long string with prefix 'r'

# Test large scale inputs with no prefix
def test_large_scale_no_prefix():
    codeflash_output = get_string_prefix('"{}"'.format("a" * 10000))  # Long string with no prefix
    codeflash_output = get_string_prefix("'{}'".format("b" * 10000))  # Long string with no prefix

# Test mixed valid and invalid prefix characters
def test_mixed_valid_invalid_prefix():
    with pytest.raises(AssertionError):
        get_string_prefix("fx'hello'")  # Mixed valid and invalid prefix 'fx'
    with pytest.raises(AssertionError):
        get_string_prefix('rx"world"')  # Mixed valid and invalid prefix 'rx'

# Test strings with embedded quotes
def test_embedded_quotes():
    codeflash_output = get_string_prefix('f"he said, \'hello\'"')  # Embedded single quotes inside double quotes
    codeflash_output = get_string_prefix('r\'she replied, "world"\'')  # Embedded double quotes inside single quotes
# codeflash_output is used to check that the output of the original code is the same as that of the optimized code.
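The closing comment says `codeflash_output` is used to compare the original and optimized code. How Codeflash does this internally is not shown here, so the sketch below only illustrates the general idea: run both implementations on the same inputs and compare either the return value or the type of the raised exception. All names (`capture`, `find_mismatches`, `original_prefix`, `candidate_prefix`) are hypothetical stand-ins, not Codeflash or Black APIs.

```python
from typing import Callable, Iterable, List, Tuple

Outcome = Tuple[str, object]


def capture(func: Callable[[str], str], arg: str) -> Outcome:
    """Record either the return value or the name of the raised exception type."""
    try:
        return ("returned", func(arg))
    except Exception as exc:  # AssertionError is a subclass of Exception
        return ("raised", type(exc).__name__)


def find_mismatches(
    original: Callable[[str], str],
    candidate: Callable[[str], str],
    inputs: Iterable[str],
) -> List[str]:
    """Return the inputs on which the two implementations behave differently."""
    return [arg for arg in inputs if capture(original, arg) != capture(candidate, arg)]


# Hypothetical stand-ins for the original and optimized functions.
def original_prefix(string: str) -> str:
    assert string and string[-1] in "'\"", "missing ending quote"
    prefix = ""
    for char in string:
        if char not in "furbFURB":
            break
        prefix += char  # repeated concatenation
    return prefix


def candidate_prefix(string: str) -> str:
    assert string and string[-1] in "'\"", "missing ending quote"
    chars = []
    for char in string:
        if char not in "furbFURB":
            break
        chars.append(char)
    return "".join(chars)  # single join at the end


mismatches = find_mismatches(original_prefix, candidate_prefix, ["f'hello'", "'hello", ""])
print(mismatches)
```

An empty result means the two stand-ins agreed on every input, including the ones that raise.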

@codeflash-ai codeflash-ai bot added the ⚡️ codeflash Optimization PR opened by Codeflash AI label Dec 16, 2024
@codeflash-ai codeflash-ai bot requested a review from misrasaurabh1 December 16, 2024 06:02