Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Regression in route match regex in 1.18.0 #550

Closed
moretension opened this issue Jul 20, 2021 · 7 comments
Closed

Regression in route match regex in 1.18.0 #550

moretension opened this issue Jul 20, 2021 · 7 comments
Assignees
Labels
bug Something isn't working

Comments

@moretension
Copy link

The 1.18.0 release resolves the path regex greedy match problem in #520, but introduces a regression in the new regex pattern. To resolve the greedy match issue, #520 matches on the \w character set and on the non-word boundary (\b) to ensure trailing path components following the match aren't captured. Unfortunately, this change also prevents match of path values containing non-word characters. Email addresses, URL-encoded values, and RFC 4122 UUIDs no long match route rules, because @, %, and - are not included in the \w character match set.

Expected Behavior

APIGatewayResolver route rules should match request path parameters containing @, -, and % in addition to word characters. This was the matching behavior in the 1.17.x release.

Current Behavior

No routes are matched if a path parameter contains a non-word character.

Possible Solution

The _NAMED_GROUP_BOUNDARY_PATTERN needs to be modified to include common non-word ASCII characters used in path components. Here's one possible pattern that seems to work while avoiding greedy capture:

r"(?P\1[-%@.:\\w]+)"

As shown in the REPL:

>>> import re
>>> from uuid import uuid4
>>> m = re.match("/api/v1/(?P<test>[-%@\\w]+)/items", f"/api/v1/{uuid4()}/items")
>>> m.groupdict()
{'test': '1f21c0c1-cd19-4775-8033-9e6244d54d79'}
>>> m = re.match("/api/v1/(?P<test>[-%@.:\\w]+)/items", f"/api/v1/[email protected]/items")
>>> m.groupdict()
{'test': '[email protected]'}

Steps to Reproduce (for bugs)

The following should call the get_user_items() function and print a Response JSON object containing the items object in the body. However, because the path parameter contains a non-word character, no route matches request path, and a 404 Response JSON object is printed instead.

from aws_lambda_powertools.event_handler.api_gateway import ApiGatewayResolver

api = ApiGatewayResolver()

@api.get("/api/v1/<user>/items")
def get_user_items(user):
    return {"items": ["1", "2", "3"]}

def handler(event, context):
    return api.resolve(event, context)

if __name__ == "__main__":
    event = {
        "httpMethod": "GET",
        "path": "/api/v1/[email protected]/items"
    }
    print(handler(event, {}))

This works (albeit with the greedy match problem) in 1.17.x.

Environment

  • Powertools version used: 1.18.0
  • Packaging format (Layers, PyPi): PyPI
  • AWS Lambda function runtime: 3.8
@moretension moretension added bug Something isn't working triage Pending triage from maintainers labels Jul 20, 2021
@boring-cyborg
Copy link

boring-cyborg bot commented Jul 20, 2021

Thanks for opening your first issue here! We'll come back to you as soon as we can.

@moretension
Copy link
Author

I'm hoping someone will be able to take a look at this soon. We need the greedy pattern fix, but can't use 1.18.0 because of the regression preventing use of path parameters containing non-word characters.

@heitorlessa
Copy link
Contributor

Looking into this now, missed it somehow

@heitorlessa heitorlessa removed the triage Pending triage from maintainers label Jul 23, 2021
@heitorlessa heitorlessa self-assigned this Jul 23, 2021
@heitorlessa
Copy link
Contributor

PR created - re-reading the RFC for other chars we might be missing and will improve tests with it to be triple sure on it.

Thanks again for providing a quick fix @moretension!

@heitorlessa
Copy link
Contributor

Found other edge cases too, waiting for checks to complete then I'll release a patch version

@heitorlessa
Copy link
Contributor

This is now available in PyPi and on Serverless Application Repository - 1.18.1 --- Could you double check this now works as expected for you @moretension?

Thank you

@heitorlessa heitorlessa added the need-customer-feedback Requires more customers feedback before making or revisiting a decision label Jul 23, 2021
@moretension
Copy link
Author

Fix confirmed in our unit tests, thanks! And thanks for going the extra mile to make sure you covered all the characters properly.

@heitorlessa heitorlessa removed the need-customer-feedback Requires more customers feedback before making or revisiting a decision label Jul 24, 2021
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug Something isn't working
Projects
None yet
Development

No branches or pull requests

2 participants