Version 2.0 Goals #70

cdgriffith · 2024-05-12T21:52:24Z

Now that puremagic is picking up some outside traction and is used in places like MongoDB, I want to lay out clear plans for the future.

Stay backwards compatible. Anything changed or added has to be behind a feature flag.
Faster: see "Speed Improvements" #71. (Is a JSON file the best way to store the data? Switch to a tree lookup instead of loop iteration?)
Higher accuracy. Some ideas are in "Some common filetypes are not detected" #12.
Even better test coverage: all platforms, all current Python versions, both success and failure cases. (Started in "GitHub Actions: Test on Python 3.13 beta" #67.)
Documentation improvements.
Better sub-variation names: see "Variant field in magic.json?" #69.
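
To make the "tree lookup instead of loop iteration" question concrete, here is a minimal sketch comparing a linear scan with a byte trie. It is an illustration only, not puremagic's code: the SIGNATURES table, match_by_loop, build_trie and match_by_trie are hypothetical names, and the table only covers a few well-known signatures at offset 0.

```python
from __future__ import annotations

# Hypothetical signature table: (magic bytes at offset 0, extension).
SIGNATURES = [
    (b"\x89PNG\r\n\x1a\n", ".png"),
    (b"\xff\xd8\xff", ".jpg"),
    (b"GIF87a", ".gif"),
    (b"GIF89a", ".gif"),
    (b"PK\x03\x04", ".zip"),
    (b"%PDF-", ".pdf"),
]


def match_by_loop(header: bytes) -> list[str]:
    """Linear scan: every signature is compared against the header."""
    return [ext for magic, ext in SIGNATURES if header.startswith(magic)]


def build_trie(signatures) -> dict:
    """Build a nested dict keyed by byte value; the None key holds matches."""
    root: dict = {}
    for magic, ext in signatures:
        node = root
        for byte in magic:
            node = node.setdefault(byte, {})
        node.setdefault(None, []).append(ext)
    return root


def match_by_trie(trie: dict, header: bytes) -> list[str]:
    """Walk the trie one header byte at a time and stop at the first miss."""
    matches: list[str] = []
    node = trie
    for byte in header:
        node = node.get(byte)
        if node is None:
            break
        matches.extend(node.get(None, []))
    return matches


TRIE = build_trie(SIGNATURES)
```

With the trie, the work per file is bounded by the length of the longest signature rather than by the number of signatures. Whether that is a real win depends on how many entries share offsets, which this sketch does not measure.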

Please keep comments on this page limited to overall goals. Any specific conversation about a goal should be its own issue, which will be linked here.


NebularNerd · 2024-05-12T22:37:53Z

Could #69 be a new feature for 2.0? Compatibility-wise, the new field should not break anything (that I'm aware of).

CatKasha · 2024-05-13T21:48:34Z

Hi, I found your project via the "Explore repositories" feed on the github.com homepage.
I have a somewhat similar project: https://github.com/CatKasha/yet-another-filetype-checker
I don't know if it will be helpful (my project is very simple), but I hope it gives you some ideas for improvements.

chapmanjacobd · 2024-05-26T04:53:56Z

I just found this: https://mark0.net/soft-trid-e.html

Not sure how well known it is, but it contains "over 17k file types". The file signatures do not have an explicit data license attached, but at the very least they might be useful to compare against.

maybe related:

NebularNerd · 2024-05-26T08:44:26Z

TrID is one of the oldest filetype sites/software out there. That site has looked near enough the same for decades.

Their database is pretty solid and very extensive. But they cannot generate a confidence score or process more complicated searches. For example, .SBK Creative Soundfont is only handled as an extension, whereas we can look at the file in two places to generate a match.
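
As a rough illustration of "looking at the file in two places", the sketch below checks byte strings at more than one offset and derives a naive confidence from how many bytes matched. Everything here is an assumption made for the example: Rule, RULES and identify are hypothetical names, the scoring formula is invented, and RIFF/WAVE/AVI are used in place of .SBK because their layout is well known.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Rule:
    name: str
    checks: tuple  # each check is (offset, expected bytes)


# Hypothetical rules: RIFF files carry "RIFF" at offset 0 and a form type at offset 8.
RULES = [
    Rule("RIFF container (unknown type)", ((0, b"RIFF"),)),
    Rule("WAVE audio", ((0, b"RIFF"), (8, b"WAVE"))),
    Rule("AVI video", ((0, b"RIFF"), (8, b"AVI "))),
]


def identify(data: bytes) -> list:
    """Return (name, confidence) pairs for every rule whose checks all pass."""
    results = []
    for rule in RULES:
        if all(data[off:off + len(exp)] == exp for off, exp in rule.checks):
            matched = sum(len(exp) for _, exp in rule.checks)
            # Invented scoring: more matched bytes means higher confidence.
            confidence = min(1.0, matched / 10)
            results.append((rule.name, confidence))
    return sorted(results, key=lambda item: item[1], reverse=True)
```

A file that passes both checks ranks above one that only passes the generic container check, which is the behaviour an extension-only entry cannot provide.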

cdgriffith · 2024-09-28T23:32:46Z

NebularNerd · 2024-09-29T08:04:47Z

Just had a quick skim through the code and this is awesome stuff. The zip method is way better coded than I can manage, but I can see it works as I sort of thought it would in my head. If I want to help fill in some of the .zip, what's the best way? I'm guessing I need to fork the dev branch?

Looking at the two examples, I can see roughly how to improve some of the more complex formats I've mentioned in my PRs. For example, we could heavily reduce the size of the .json by moving all the .mp3-related entries I added into a dedicated scanner. The scanner itself would likely be smaller than those .json entries, as we would not need to repeat everything so heavily.
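
For illustration, a dedicated scanner along those lines might look like the sketch below. This is a guess at the shape of the idea, not the scanner interface in the dev branch: scan_mp3 is a hypothetical function, and it only checks two well-known markers (an ID3v2 tag header, or an MPEG audio frame sync with the Layer III bits set).

```python
from typing import Optional


def scan_mp3(header: bytes) -> Optional[str]:
    """Return a short description if the header looks like MP3, else None."""
    # An ID3v2 tag starts with "ID3" followed by major and minor version bytes.
    if len(header) >= 10 and header[:3] == b"ID3":
        major = header[3]
        return f"ID3v2.{major} tag (commonly MP3)"
    # An MPEG audio frame starts with 11 set sync bits; layer bits 01 mean Layer III.
    if len(header) >= 2 and header[0] == 0xFF and (header[1] & 0xE0) == 0xE0:
        if (header[1] >> 1) & 0x03 == 0x01:
            return "MPEG audio Layer III frame"
    return None
```

One function with a couple of bit masks covers the frame-header variations that would otherwise each need their own signature entry.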

cdgriffith added this to the Version 2.0 milestone May 12, 2024

cdgriffith pinned this issue May 12, 2024

cdgriffith mentioned this issue May 12, 2024

2024 05 04 Experimental regex support (No rush to merge, proof of concept/feasibility discussion) #65

Closed

NebularNerd mentioned this issue May 20, 2024

2024-05-11 imghdr parity updates #75

Merged

NebularNerd mentioned this issue Jun 24, 2024

For Python 3.13: A drop-in replacement for sndhdr.what() and sndhdr.whathdr() #85

Open
