Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

gh-121445: C implementation for fnmatch #121446

Open
wants to merge 60 commits into
base: main
Choose a base branch
from

Conversation

picnixz
Copy link
Contributor

@picnixz picnixz commented Jul 6, 2024

  • Update configure script.
  • C implementation
    • use of lru_cache
    • use of os.path.normcase
    • fnmatch.fnmatch
    • fnmatch.fnmathcase
    • fnmatch.translate
  • Make it work on Windows CI/CD.
  • Add documentation
    • C documentation
    • Python documentation
    • NEWS entry
  • Increase test coverage

Modules/Setup Outdated Show resolved Hide resolved
@picnixz
Copy link
Contributor Author

picnixz commented Jul 6, 2024

I forgot that MSVC didn't have an implementation of fnmatch... I'll add a guard for Windows so that it uses the pure Python implementation.

Modules/_fnmatchmodule.c Outdated Show resolved Hide resolved
@picnixz

This comment was marked as outdated.

@picnixz

This comment was marked as outdated.

@picnixz
Copy link
Contributor Author

picnixz commented Jul 11, 2024

So, everything should be ready. Documentation and NEWS entry need to be written and I also need to know how I can make it work on Windows.

@zooba I know you are well-versed on Windows platforms so could you help me there please if you have time? How can I specify the files to be compiled correctly...?

@zooba
Copy link
Member

zooba commented Jul 11, 2024

Easiest way is to look at any other similar module, find all the references to it, and set yours up the same way. Looks like you'll have a bit of rearranging to do here (e.g. header goes somewhere under Include).

Filename matching should also be a subset of regex, so it may be easier and more efficient to write our own comparison function. That would also then be fully portable, as well as being easier to handle compatibility.

@picnixz
Copy link
Contributor Author

picnixz commented Jul 12, 2024

Easiest way is to look at any other similar module, find all the references to it, and set yours up the same way. Looks like you'll have a bit of rearranging to do here (e.g. header goes somewhere under Include).

Thank you for that (actually, that's how I managed to do it for the linux build...). I actually never used MSVC so I don't know how to compile nor what I should actually specify but I'll do it the old way (trial and error).

Filename matching should also be a subset of regex, so it may be easier and more efficient to write our own comparison function. That would also then be fully portable, as well as being easier to handle compatibility.

For now, I've just replicated the Python implementation which translates UNIX shell patterns to RE patterns. Strictly speaking, I don't think the translation is needed when comparing filenames (since we would anyway use a subset of regex indeed), but I didn't want to introduce a new algorithm (which I more or less gave up to describe after 10 minutes...)

@picnixz
Copy link
Contributor Author

picnixz commented Jul 12, 2024

Mmmh... so the improvements are.... not improvements =/ Well I'll just forget about using interned strings actually.

EDIT 2: mmh, actually it didn't change at all. So what happened between yesterday and today for 'fnmatch.translate' to be a bit slower... (maybe the merge?)

@picnixz picnixz changed the title gh-121445: C implementation for fnmatch [WIP] gh-121445: C implementation for fnmatch Jul 12, 2024
@picnixz
Copy link
Contributor Author

picnixz commented Jul 12, 2024

I think I'll leave the implementation like that for now and open it for review. At least, we have 1.7x performance gain for fnmatch.translate (but let's say we only have 1.5x to be very conservative).

@picnixz picnixz marked this pull request as ready for review July 12, 2024 12:58
@picnixz picnixz requested review from a team, erlend-aasland and corona10 as code owners July 12, 2024 12:58
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

2 participants