Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix #58 #64

Merged
merged 5 commits into from
Aug 31, 2023
Merged

Fix #58 #64

merged 5 commits into from
Aug 31, 2023

Conversation

younesaassila
Copy link
Contributor

@younesaassila younesaassila commented Aug 31, 2023

Fix #58 thanks to ChatGPT

Prompt

You are a Python expert. Your task is to detect the catastrophic backtracking error in a regex for a C formatter program.
If you cannot find the catastrophic backtracking error, it's okay to say so.
Here are our regex helpers:

# regex for a type
REGEX_TYPE = r"(?!return|goto)([a-z]+\s+)*[a-zA-Z_]\w*"
# regex for a c variable/function name
REGEX_NAME = r"\**[a-zA-Z_*()]\w*"
# regex for a name in a declaration context (with array and function ptr)
REGEX_DECL_NAME = r"\(?{name}(\[.*\])*(\)\(.*\))?".format(name=REGEX_NAME)

And here is the regex with the catastrophic backtracking error:

r"^(static\s+)?{type}\s+{name}\((.|\s)*?\);"

The regex you've provided indeed contains a potential catastrophic backtracking issue. The problematic part is the (.\|\s)*? section within the regex. This part can lead to catastrophic backtracking when trying to match certain inputs, especially if the input contains a long sequence of characters followed by a semicolon.

Here's why it's problematic:

  • (.\|\s)*? attempts to match any character or whitespace zero or more times in a non-greedy manner. The problem is that when the regex engine encounters a character that doesn't match, it will backtrack and try different ways of matching characters and whitespace until it either succeeds or exhausts all possibilities. This can lead to an exponential number of possible paths that the engine explores, causing significant slowdown or even freezing on certain inputs.

To avoid this issue, you can use a more specific pattern to match the content within the parentheses in a safer way. Here's a modified version of the regex that should help avoid catastrophic backtracking:

r"^(static\s+)?{type}\s+{name}\([^)]*?\);"

In this modified regex, [^)]*? is used to match any sequence of characters except a closing parenthesis. This pattern ensures that the regex engine won't get stuck in catastrophic backtracking scenarios.

@younesaassila
Copy link
Contributor Author

younesaassila commented Aug 31, 2023

Do note that this only fixes the hanging. c_formatter_42 will still insert void in the comments when it sees something like bzero() (in the issue author input)

@cacharle
Copy link
Collaborator

Can you add a test with pytest-timeout?

@pytest.mark.timeout(60)
def test_foo():
    pass

@younesaassila
Copy link
Contributor Author

younesaassila commented Aug 31, 2023

@cacharle For the whole run_all procedure or just for line_break?

@cacharle
Copy link
Collaborator

Just for the example we know was infinite before your fix for now (so only line_break)

But I see you can set a global timeout in the setup.cfg file like so:

[pytest]
timeout = 20

https://pypi.org/project/pytest-timeout/

@younesaassila
Copy link
Contributor Author

younesaassila commented Aug 31, 2023

@cacharle Been trying to make it work for the last 30 minutes but it seems that pytest-timeout suffers from an issue with (at least) Python 3.10: pytest-dev/pytest-timeout#113
The timeout function just doesn't work as it is supposed to

@cacharle
Copy link
Collaborator

cacharle commented Aug 31, 2023

Works fine for me with python3.10

c_formatter_42 ❯ pytest tests/test_run.py -k foo
========================================================== test session starts ==========================================================
platform darwin -- Python 3.10.8, pytest-7.4.0, pluggy-1.3.0
rootdir: ~/git/c_formatter_42
configfile: setup.cfg
plugins: timeout-2.1.0, cov-4.1.0, clarity-1.0.1
collected 5 items / 4 deselected / 1 selected

tests/test_run.py F                                                                                                               [100%]

=============================================================== FAILURES ================================================================
_______________________________________________________________ test_foo ________________________________________________________________

    @pytest.mark.timeout(2)
    def test_foo():
>       time.sleep(10000)
E       Failed: Timeout >2.0s

tests/test_run.py:23: Failed

---------- coverage: platform darwin, python 3.10.8-final-0 ----------
Name                                                  Stmts   Miss  Cover   Missing
-----------------------------------------------------------------------------------
c_formatter_42/__init__.py                                0      0   100%
c_formatter_42/__main__.py                               26     26     0%   15-63
c_formatter_42/data/__init__.py                           0      0   100%
c_formatter_42/formatters/__init__.py                     0      0   100%
c_formatter_42/formatters/align.py                       51     41    20%   32-96, 102, 107-109
c_formatter_42/formatters/clang_format.py                28     18    36%   31-45, 65-75
c_formatter_42/formatters/helper.py                      14      7    50%   27-34
c_formatter_42/formatters/hoist.py                       20     16    20%   45-80
c_formatter_42/formatters/line_breaker.py                49     42    14%   7-9, 13-36, 59-72, 76-77, 82-90
c_formatter_42/formatters/misc.py                         9      4    56%   17, 26, 34, 42
c_formatter_42/formatters/preprocessor_directive.py      23     21     9%   17-45
c_formatter_42/formatters/return_type_single_tab.py      12      9    25%   20-35
c_formatter_42/run.py                                    19     11    42%   29-39
-----------------------------------------------------------------------------------
TOTAL                                                   251    195    22%

======================================================== short test summary info ========================================================
FAILED tests/test_run.py::test_foo - Failed: Timeout >2.0s
==================================================== 1 failed, 4 deselected in 2.25s ====================================================
$ python --version
Python 3.10.8

$ pip list | grep time
pytest-timeout        2.1.0

with the following code:

import time


@pytest.mark.timeout(2)
def test_foo():
    time.sleep(10000)

@younesaassila
Copy link
Contributor Author

@cacharle Alright well perhaps it's a Windows problem then. I'll push the test

@cacharle
Copy link
Collaborator

WSL 👀👀👀👀👀

@younesaassila
Copy link
Contributor Author

WSL 👀👀👀👀👀

Yeah I use WSL for 42 projects, I'm just too lazy to setup all my Python env on there lol

@cacharle cacharle merged commit a073173 into dawnbeen:master Aug 31, 2023
10 checks passed
@younesaassila younesaassila deleted the fix/58 branch August 31, 2023 12:56
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Infinite loop if function call in a comment
2 participants