Fix raw string regex patterns #743

Open
wants to merge 1 commit into
base: main
from

@ewlsh ewlsh commented Mar 6, 2025

Fix raw string regex patterns and update converting.py encoding

This PR addresses #693, which generates a fair amount of logging spam when I use Pony.

Context

Python 3.12 changed how invalid escape sequences in string literals are reported, which affects regular expressions written as ordinary strings. Paraphrasing the Python 3.12 release notes:

A backslash-character pair that is not a valid escape sequence now generates a SyntaxWarning instead of a DeprecationWarning, and a future Python version will eventually raise a SyntaxError. Use raw strings (r"...") for regular expressions containing backslashes.
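
To see the warning without touching Pony itself, the following sketch compiles two one-line snippets and records which warnings the compiler raises. The helper name `escape_warnings` and the sample pattern are illustrative assumptions, not code from this PR.

```python
import warnings


def escape_warnings(source: str) -> list[str]:
    """Compile `source` and return the category names of any warnings raised."""
    with warnings.catch_warnings(record=True) as caught:
        warnings.simplefilter("always")
        compile(source, "<example>", "exec")
    return [w.category.__name__ for w in caught]


# The snippets are kept as source text so this file itself stays warning-free.
plain = 'pattern = "\\d+"'   # source text: pattern = "\d+"
raw = 'pattern = r"\\d+"'    # source text: pattern = r"\d+"

print(escape_warnings(plain))  # SyntaxWarning on 3.12+, DeprecationWarning on older versions
print(escape_warnings(raw))    # no warnings expected for the raw string
```

The category depends on the interpreter version, so the comments above describe what to expect rather than a fixed output.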

Changes

Added 'r' prefix to regex patterns in:

  • pony/orm/dbproviders/oracle.py: json_item_re
  • pony/orm/dbapiprovider.py: version_re
  • pony/thirdparty/decorator.py: DEF
  • pony/converting.py: date_re_list, time_re, datetime_re_list

Note: email_re and rfc2822_email_re in converting.py already had the 'r' prefix.
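
For reviewers who want reassurance that the prefix does not change behaviour: Python currently keeps an unrecognized escape such as `\d` or `\.` in the string as written, so the raw and non-raw literals should evaluate to the same pattern. A minimal sketch, using a hypothetical pattern rather than one copied from Pony:

```python
import ast
import warnings


def literal_value(source: str) -> str:
    """Evaluate a string literal given as source text, silencing escape warnings."""
    with warnings.catch_warnings():
        warnings.simplefilter("ignore")
        return ast.literal_eval(source)


# Hypothetical version-matching pattern, not taken from the Pony source.
before = literal_value('"(\\d+)\\.(\\d+)"')   # source text: "(\d+)\.(\d+)"
after = literal_value('r"(\\d+)\\.(\\d+)"')   # source text: r"(\d+)\.(\d+)"

same = before == after
print(same)
```

This equivalence holds only for escapes Python does not recognize. A sequence like `\b` or `\n` means something different in a non-raw string, so each changed pattern is still worth a quick look.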

Additionally, converting.py was converted from cp1251 to UTF-8 encoding using:

iconv -f cp1251 -t utf-8 pony/converting.py > pony/converting.py.utf8
mv pony/converting.py.utf8 pony/converting.py
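
For anyone without iconv available, the same conversion can be sketched in Python. The function name `reencode` is an assumption for illustration; the agent used the shell commands above, not this code.

```python
from pathlib import Path


def reencode(path: Path, source: str = "cp1251", target: str = "utf-8") -> None:
    """Rewrite `path` in place, decoding its bytes as `source` and encoding as `target`."""
    text = path.read_bytes().decode(source)
    path.write_bytes(text.encode(target))


# Example call (path as used in this PR):
# reencode(Path("pony/converting.py"))
```

Working on bytes avoids any newline translation. If a file carries an explicit coding declaration in its first lines, that declaration would need updating separately.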

This encoding change aligns with a similar change made in #669.

Why

Using raw strings for regex patterns is a Python best practice because:

  1. Avoids double escaping of backslashes
  2. Makes patterns more readable and less error-prone
  3. Prevents potential issues with string escape sequences interfering with regex syntax
  4. Eliminates SyntaxWarnings in Python 3.12+ for invalid escape sequences
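
Point 3 is the one that can silently change matching. A small illustration with a made-up pattern (not from Pony): `\b` is a valid string escape for the backspace character, so no warning is raised, yet the regex never sees the word boundary it was meant to contain.

```python
import re

plain = "\bcat\b"    # here \b is a backspace character (0x08), 5 characters total
raw = r"\bcat\b"     # here \b reaches the regex engine as a word boundary

text = "a cat sat"
plain_match = re.search(plain, text)   # looks for backspace + "cat" + backspace
raw_match = re.search(raw, text)       # looks for the whole word "cat"

print(plain_match)
print(raw_match.group() if raw_match else None)
```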

This PR was generated with https://www.all-hands.dev/ (an open source agent) but I have reviewed the code changes prior to submitting, noted potential areas for review, and adjusted this description to include additional context.

- Add 'r' prefix to regex patterns in:
  - pony/orm/dbproviders/oracle.py
  - pony/orm/dbapiprovider.py
  - pony/thirdparty/decorator.py
  - pony/converting.py

Fixes ponyorm#693
@ewlsh ewlsh force-pushed the fix-raw-string-regex branch from df6e4d0 to d0c6581 Compare March 6, 2025 20:47
except:
if value == '': return None
raise ValidationError(err_msg or 'Incorrect data')
from __future__ import absolute_import, print_function

@ewlsh ewlsh Mar 6, 2025

Hiding whitespace while reviewing this is very helpful, because GitHub treats file encoding changes as whitespace.


month_lists = [
"jan feb mar apr may jun jul aug sep oct nov dec".split(),
u"янв фев мар апр май июн июл авг сен окт ноя дек".split(), # Russian

This should be reviewed by someone knowledgeable in the Russian equivalents, since it is AI-generated.

cc @kozlovsky

@ewlsh ewlsh Mar 6, 2025

I've run iconv locally with the same commands the agent used and verified that this is the output. I am not familiar with Russian, though, so an encoding issue would be non-obvious to me.
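
A rough mechanical check that does not require reading Russian, offered as a sketch rather than a substitute for native review: each abbreviation should be three lowercase Cyrillic letters and should survive a cp1251 round trip. The helper name `is_plausible_month` is an assumption, and the check cannot confirm that the abbreviations are the right ones.

```python
months_ru = u"янв фев мар апр май июн июл авг сен окт ноя дек".split()


def is_plausible_month(word: str) -> bool:
    """True if `word` is exactly three lowercase letters in the basic Cyrillic range."""
    return len(word) == 3 and all(u"а" <= ch <= u"я" for ch in word)


checks = {
    "count": len(months_ru) == 12,
    "plausible": all(is_plausible_month(m) for m in months_ru),
    "cp1251_roundtrip": all(m.encode("cp1251").decode("cp1251") == m for m in months_ru),
}
print(checks)

# A wrongly decoded abbreviation (UTF-8 bytes read as cp1251) fails the check,
# because each letter turns into two unrelated characters.
garbled = u"янв".encode("utf-8").decode("cp1251")
print(is_plausible_month(garbled))
```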
