Recognize file type base on mime type #396

Open
nkh opened this issue Jul 9, 2023 · 8 comments
Labels
bug Something isn't working

Comments

@nkh

nkh commented Jul 9, 2023

I have a bunch of bash files that are not counted because they don't have a shebang.

Using the file type or the MIME type would make them part of the scc count.

@boyter
Owner

boyter commented Jul 9, 2023

Could you provide an example of a file (two would be better) that shows how you would expect this to work? I want to see the file itself to determine how this should work.

@nkh
Author

nkh commented Jul 9, 2023

Here's a file full of bash code; let me know if you need more.
https://github.com/nkh/ftl/blob/main/config/ftl/etc/core/ftl

As it is, it's not recognized.

If the extension is changed to .sh, it is recognized as a shell script

If "#!/bin/bash" is at the beginning of the file, it is recognized as bash code.

And now that I have run the test again, I realize that I was wrong.

file ftl -> ftl: Unicode text, UTF-8 text
mimetype ftl -> ftl: text/plain

but it's right with a shebang:
file ftl_shebang -> ftl_shebang: Bourne-Again shell script, Unicode text, UTF-8 text executable
mimetype ftl_shebang -> ftl_shebang: application/x-shellscript

I must have mixed files earlier, sorry.

But let's not lose a good opportunity. I know what those files are, so I can cheat: keep a list of files and create temporary copies, with an extension or a shebang, to give to scc. Or scc could accept a list of files with their types and, in the best of worlds, also generate a list of the files it checked and what types it thought they were.

I'd understand if you feel that the input file with file types (and the list of files/types) is not something you want to implement. I can write a workaround.
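
A minimal sketch of the workaround described above, assuming the file types are already known and kept in a mapping. The function name `stage_with_extensions` and the numbered naming scheme are hypothetical, chosen only for illustration. It copies each listed file into a staging directory with the chosen extension appended, so that an extension-based counter can be pointed at the copies.

```python
import shutil
import tempfile
from pathlib import Path


def stage_with_extensions(file_types, staging_dir=None):
    """Copy each file into a staging directory with an extension appended.

    file_types maps a source path to an extension such as "sh".
    Returns a dict mapping each staged path back to its original path.
    """
    # Assumption: a fresh temporary directory is acceptable when none is given.
    staging = Path(staging_dir or tempfile.mkdtemp(prefix="scc-staging-"))
    staged = {}
    for index, (source, extension) in enumerate(sorted(file_types.items())):
        source = Path(source)
        # Prefix with an index so files that share a name in different
        # directories do not overwrite each other in the staging directory.
        target = staging / f"{index:04d}_{source.name}.{extension}"
        shutil.copyfile(source, target)
        staged[target] = source
    return staged
```

The counter would then be run on the staging directory instead of the original tree, and the returned mapping can be used to translate results back to the original paths.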

@boyter
Owner

boyter commented Jul 10, 2023

Ah ok.

So the way scc works internally is to check the extension. If it maps to a single known file type, scc treats the file as that type. Where there are multiple candidates, it inspects the first few thousand bytes, counting keywords to identify the most likely type, and then counts on that basis.

Where the filename itself matches, such as makefile, the above applies.

Where nothing matches, the file is then checked for the presence of a #! line (shebang).
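
The order of checks described above can be sketched roughly as follows. This is an illustration only, not scc's actual code (scc is written in Go). The lookup tables, language names, and function names are hypothetical and far smaller than anything a real tool would use.

```python
from pathlib import Path

# Hypothetical, tiny lookup tables for illustration only.
EXTENSIONS = {"sh": ["Shell"], "py": ["Python"], "m": ["Objective C", "MATLAB"]}
FILENAMES = {"makefile": ["Makefile"]}
SHEBANGS = {"bash": "BASH", "sh": "Shell", "python": "Python", "python3": "Python"}
KEYWORDS = {"Objective C": ["@interface", "#import"], "MATLAB": ["function", "endfunction"]}


def guess_by_keywords(candidates, head):
    """Pick the candidate whose keywords occur most often in the file head."""
    return max(candidates, key=lambda lang: sum(head.count(k) for k in KEYWORDS.get(lang, [])))


def shebang_language(head):
    """Return the language named by a leading #! line, or None."""
    first = head.splitlines()[0] if head else ""
    if not first.startswith("#!"):
        return None
    parts = first[2:].split()
    if not parts:
        return None
    interpreter = Path(parts[0]).name
    # Handle the "#!/usr/bin/env bash" form by looking at the next word.
    if interpreter == "env" and len(parts) > 1:
        interpreter = parts[1]
    return SHEBANGS.get(interpreter)


def detect(name, head):
    """Apply the checks in order: extension, then filename, then shebang."""
    path = Path(name)
    extension = path.suffix.lstrip(".").lower()
    candidates = EXTENSIONS.get(extension) or FILENAMES.get(path.name.lower())
    if candidates:
        if len(candidates) == 1:
            return candidates[0]
        # Ambiguous extension: count keywords in the start of the file.
        return guess_by_keywords(candidates, head)
    return shebang_language(head)
```

Under this sketch, a file named `ftl` with no extension and no shebang falls through every check and is left unrecognized, which matches the behaviour reported earlier in the thread.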

So what I get from the above is that you want to do a remap? This may already exist.

Have a look at the following options, which might do what you are expecting.

--remap-all
--remap-unknown

I suspect either of those should work if you are prepared to add a small comment on the top of your files. Although I understand this might not be ideal.
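
As a rough sketch of how such a remap could be driven from a script, the helper below builds the command line. The `marker:language` argument form is an assumption based on the example in scc's help text and should be checked against `scc --help`. The marker `ft=sh` and the language name `Shell` are hypothetical placeholders; the language name would need to match one that scc reports with `scc --languages`.

```python
def scc_remap_args(marker, language, only_unknown=True, path="."):
    """Build an argument list for running scc with a remap rule.

    marker is a string that appears in the files (for example in a comment
    at the top); language is the type those files should be counted as.
    """
    # Assumption: ":" separates marker from language and "," separates rules,
    # so neither character may appear inside the marker or the language name.
    for value in (marker, language):
        if ":" in value or "," in value:
            raise ValueError(f"unsupported character in {value!r}")
    flag = "--remap-unknown" if only_unknown else "--remap-all"
    return ["scc", flag, f"{marker}:{language}", path]
```

The resulting list can be passed to `subprocess.run`. Using `--remap-unknown` limits the inspection to files that were not otherwise recognized, while `--remap-all` inspects every file.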

I don't know if any other option is a good idea in this case, at least without it being an opt-in to ensure that performance is not tanked.

@nkh
Author

nkh commented Jul 10, 2023

Thanks for pointing out the remapping. In this specific case I could add a vim tag to the bash files. I also ran a test with --remap-all that worked well (I need to check the results a bit more).

I have symlinks in the directory structure. I think they completely broke scc, as it never finished running.

@boyter
Owner

boyter commented Jul 10, 2023

The symlinks issue is one I want to know more about. I thought I had taken care of this. By default scc should detect and ignore symlinks, and you have to explicitly enable them using --include-symlinks.
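
For illustration, the intended behaviour (ignore symlinks by default, follow them only on request, and never loop) can be sketched as below. This is not scc's file walker; the function name `walk_files` and the `include_symlinks` parameter are hypothetical and only mirror the flag mentioned above.

```python
import os


def walk_files(root, include_symlinks=False):
    """Yield file paths under root, ignoring symlinks unless asked to follow them."""
    seen = set()  # real paths of directories already visited, to stop symlink loops
    stack = [root]
    while stack:
        current = stack.pop()
        real = os.path.realpath(current)
        if real in seen:
            continue
        seen.add(real)
        with os.scandir(current) as entries:
            for entry in sorted(entries, key=lambda e: e.name):
                if entry.is_symlink() and not include_symlinks:
                    continue
                if entry.is_dir(follow_symlinks=include_symlinks):
                    stack.append(entry.path)
                elif entry.is_file(follow_symlinks=include_symlinks):
                    yield entry.path
```

Tracking the resolved path of every visited directory means a link that points back at an ancestor is skipped rather than walked forever, and a directory reachable through both its real path and a link is only visited once.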

Is it possible to get a case that replicates it? I'd be curious to know if either of these projects is affected too, since they have the new file walking logic I want to move scc to:

https://github.com/boyter/cs
https://github.com/boyter/dcd

@nkh
Author

nkh commented Jul 11, 2023 via email

@boyter
Owner

boyter commented Jul 11, 2023

Ideally I'd like a test case to replicate the issue, but I might be able to create one based on what you have mentioned above.

What OS are you on?

@nkh
Author

nkh commented Jul 12, 2023

linux

Here's what my directory structure looks like:

config/
└── ftl
    ├── bindings
    ├── commands -> etc/commands/ * link
    ├── etags -> etc/etags/ *link
    ├── etc
    │   ├── bin
    │   │   └── third_party
    │   ├── bindings
    │   │   └── lib
    │   ├── commands
    │   │   └── ftlrc_dir
    │   ├── core
    │   │   └── lib
    │   │       ├── lock_preview
    │   │       └── merge
    │   ├── etags
    │   ├── filters
    │   ├── generators
    │   └── viewers
    ├── filters -> etc/filters/ *link
    ├── generators -> etc/generators/ *link
    ├── man
    └── viewers -> etc/viewers/ *link
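
A possible starting point for a test case is a small script that recreates this layout with the same relative symlinks. It assumes every leaf entry in the listing is a directory, and that the links are relative as shown. The names `DIRS`, `LINKS`, and `build_layout` are hypothetical. Whether this layout alone is enough to trigger the hang is not confirmed.

```python
import os
import tempfile

# Real directories from the listing above (assumed to all be directories).
DIRS = [
    "bindings",
    "etc/bin/third_party",
    "etc/bindings/lib",
    "etc/commands/ftlrc_dir",
    "etc/core/lib/lock_preview",
    "etc/core/lib/merge",
    "etc/etags",
    "etc/filters",
    "etc/generators",
    "etc/viewers",
    "man",
]

# Symlinks at the top of config/ftl, each pointing into etc/.
LINKS = {
    "commands": "etc/commands",
    "etags": "etc/etags",
    "filters": "etc/filters",
    "generators": "etc/generators",
    "viewers": "etc/viewers",
}


def build_layout(base=None):
    """Create config/ftl under base with its directories and relative symlinks."""
    base = base or tempfile.mkdtemp(prefix="scc-symlink-")
    root = os.path.join(base, "config", "ftl")
    for directory in DIRS:
        os.makedirs(os.path.join(root, directory), exist_ok=True)
    for name, target in LINKS.items():
        os.symlink(target, os.path.join(root, name), target_is_directory=True)
    return root
```

Running the counter against the returned path, with and without symlink following enabled, would show whether the links are involved.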

boyter added the bug label on Jul 12, 2023