list_all_resources does not handle subdirectories well #750
Comments
On the other hand, it will be very difficult to discern the two cases, especially during Perhaps we should entirely rule out Tesseract-style subdirectory resources by definition (and prevent such installations during |
But even then, the current implementation of I think we should not dictate the file status a priori, only its existence. Then, for directories, one could list their contents. |
Would we really need full recursion to support tesseract-like subdirectories? From what I have seen, there isn't any deeper nesting than 1, e.g. |
One way to reliably solve this and not output folders for processors that only accept files and vice versa would be to add a |
What should we show when calling |
Yes, indeed, good idea. But then, why not just use
Well, I don't know what exactly the use-case is for |
|
I am not so sure myself anymore. I was vaguely thinking about verifying that a certain resource/model a processor uses is indeed at the exact version/checksum one expects. Or about sharing the binary data of a resource without using the file system. But these are at best potential use cases. Should we drop the feature? Or replace it with an arguably more sensible |
Sorry, I should have been more specific: I do mean |
I do think these use cases are valid (if mere potential at this point). But mind that due to #752 they don't work so far.
No, why? The latter already outputs the full path of the resource. |
Oh, right, yes, that is possible and makes sense. I'll adapt OCR-D/spec#189. This does make it slightly more difficult to implement and could possibly introduce ambiguity (a processor might define one file parameter as But since gitter is down and the
Mostly because it would at least fulfill its stated purpose, giving you the full path to a resource by name, and would not require different behaviors for files vs. directories. |
Oh, I haven't thought about that yet. Perhaps resource manager could just guess the type from the path name (perhaps restricted by the ocrd-tool parameter mimetypes) or by peeking into the fd? |
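To make the "guess the type from the path name" idea concrete, here is a minimal sketch. It is not how the resource manager works today; `guess_resource_type` and its `allowed_mimetypes` argument are hypothetical names, and the only real API used is Python's standard `mimetypes` module.

```python
import mimetypes
from pathlib import Path


def guess_resource_type(path, allowed_mimetypes=None):
    """Hypothetical helper: classify a resource by its path alone.

    Returns "directory" for existing directories, otherwise a MIME type
    guessed from the file name. If allowed_mimetypes is given (e.g. taken
    from a parameter's declared content types, which is an assumption here),
    anything outside that set yields None.
    """
    p = Path(path)
    if p.is_dir():
        return "directory"
    # guess_type only inspects the name; it never opens the file
    mimetype, _encoding = mimetypes.guess_type(p.name)
    if allowed_mimetypes is not None and mimetype not in allowed_mimetypes:
        return None
    # unknown suffixes fall back to a generic binary type
    return mimetype or "application/octet-stream"
```

Guessing by name is cheap but blind to unknown suffixes, which is where peeking into the file itself would have to take over.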
But |
No, that remains useful as-is and needs no change. |
But then I could just match the suffixes of |
@bertsky can be closed? |
Currently for ocrd-tesserocr-recognize the subdirectories ( |
With |
I do see them though:
Notice the lines:
Yes. (Perhaps we even want to offer them as additional parameter |
Are you sure you have reinstalled or installed with
(and we really need to sort this output) |
Yes, I am sure. Clone of core is at 836eb05, I used
I can still see the directory entries.
speaking of which: in #792 I have added sorting via d108dd0, but that only gives sorted output of |
In ocrd_utils.list_all_resources (used by a processor's --list-resources and resmgr's --list-installed), entries that are themselves directories do not get traversed recursively.

That's okay if the processor takes directory resources (e.g. ocrd-calamari-recognize's h5/json checkpoint dirs), but not if its resources are files in subdirectories (e.g. ocrd-tesserocr-recognize's script/*.traineddata or configs/*).

So IMO a better behaviour would be to list all such directories recursively.
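As a rough illustration of the proposed behaviour (not the actual ocrd_utils code), the sketch below walks a resource directory and returns the files found in its subdirectories as relative paths. The function name `list_resources_recursively` and the `max_depth` parameter are assumptions made for this example; `max_depth=1` would cover the one-level nesting seen with Tesseract-style layouts.

```python
import os
from pathlib import Path


def list_resources_recursively(base_dir, max_depth=None):
    """Hypothetical helper: list all files below base_dir as relative paths.

    max_depth limits how many directory levels are descended
    (None = unlimited, 0 = only files directly in base_dir).
    """
    base = Path(base_dir)
    found = []
    for root, dirs, files in os.walk(base):
        dirs.sort()  # make the traversal order deterministic
        depth = len(Path(root).relative_to(base).parts)
        if max_depth is not None and depth >= max_depth:
            dirs[:] = []  # prune in place: do not descend any further
        for name in sorted(files):
            found.append(Path(root, name).relative_to(base).as_posix())
    return found
```

This variant lists files only, so it would not suit processors whose resources are whole directories (such as checkpoint dirs); distinguishing the two cases still needs information from the processor itself.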