Feature: search within JPG metadata #221

Open
Pfeil opened this issue May 5, 2024 · 2 comments

Comments


Pfeil commented May 5, 2024

Is your feature request related to a problem? Please describe.

I use the Aves Android app to manage and sometimes tag my files. To find images by tag with rga, I currently have to pass the --text parameter, because rga does not seem to look into the metadata. This works, since the XML seems to be stored in the file as plain text, but it produces ugly output (binary data showing up as weird characters in the terminal).
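To illustrate why the --text workaround finds the tags at all, here is a minimal sketch that pulls the XMP packet out of raw file bytes by looking for its start and end markers. The sample bytes are synthetic (not a real JPEG), the function name is made up for this example, and the sketch assumes the packet is stored uncompressed in a single piece, which may not hold for every file.

```python
from typing import Optional

# XMP packets are wrapped in an <x:xmpmeta> element.
XMP_START = b"<x:xmpmeta"
XMP_END = b"</x:xmpmeta>"


def find_xmp_packet(data: bytes) -> Optional[str]:
    """Return the first XMP packet found in data, or None if there is none."""
    start = data.find(XMP_START)
    if start == -1:
        return None
    end = data.find(XMP_END, start)
    if end == -1:
        return None
    raw = data[start:end + len(XMP_END)]
    return raw.decode("utf-8", errors="replace")


# Synthetic stand-in for a JPEG: some binary bytes around a tiny XMP packet.
sample = (
    b"\xff\xd8\xff\xe1\x00\x10"
    + b'<x:xmpmeta xmlns:x="adobe:ns:meta/"><tag>Notiz</tag></x:xmpmeta>'
    + b"\xff\xd9"
)
packet = find_xmp_packet(sample)
print(packet)
```

Searching only the extracted packet avoids the binary noise, which is roughly what a dedicated metadata adapter would achieve in a more robust way.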

Describe the solution you'd like

Metadata (timestamps and others) should be searched by default, as JPG is a common file format.

Describe alternatives you've considered

  • cleaner binary output
  • a --metadata parameter

Additional context

This is not specific to Aves. For example, digiKam also stores XMP within image files. This includes information about recognized faces in images and possibly nondestructive-editing information.

I think there is a lot of potential in searching metadata. Here is a screenshot of Aves showing the different metadata layers of an image that I took with my phone and then added a single tag to (so, very minimal). I can only view one layer at a time, but you get the idea. The opened layer shows how Aves uses XMP / Dublin Core to store tags within a file.

[Screenshot: Aves showing image metadata as layers]
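As a rough illustration of tags stored via XMP / Dublin Core, the sketch below reads the dc:subject entries from an XMP packet with Python's standard XML parser. The packet is hand-written for this example and is not actual Aves output; real packets usually carry many more properties, and the function name is hypothetical.

```python
import xml.etree.ElementTree as ET
from typing import List

# Standard namespace URIs for RDF and Dublin Core.
NS = {
    "rdf": "http://www.w3.org/1999/02/22-rdf-syntax-ns#",
    "dc": "http://purl.org/dc/elements/1.1/",
}

# Hand-written sample packet (an assumption, not copied from a real file).
SAMPLE_XMP = """<x:xmpmeta xmlns:x="adobe:ns:meta/">
  <rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#">
    <rdf:Description rdf:about=""
        xmlns:dc="http://purl.org/dc/elements/1.1/">
      <dc:subject>
        <rdf:Bag>
          <rdf:li>Notiz</rdf:li>
        </rdf:Bag>
      </dc:subject>
    </rdf:Description>
  </rdf:RDF>
</x:xmpmeta>"""


def dc_subjects(xmp_xml: str) -> List[str]:
    """Collect the text of every rdf:li inside a dc:subject bag."""
    root = ET.fromstring(xmp_xml)
    items = root.findall(".//dc:subject/rdf:Bag/rdf:li", NS)
    return [li.text for li in items if li.text]


subjects = dc_subjects(SAMPLE_XMP)
print(subjects)
```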

Owner

phiresky commented May 5, 2024

I did actually have plans to add an EXIF adapter originally, but I guess I never did? Probably because usually there's not really a lot of interesting searchable data in there.

In any case, this should be easy to do with a Custom Adapter calling exiftool.

exiftool has lots of output options, so you might have to look at the docs. Try starting with exiftool -g -u - as the command (the trailing - makes exiftool read the file from stdin) and jpg,jpeg,png as the extensions.
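Before writing an adapter config, it can help to check what the suggested command prints when a file is piped to it on stdin. The following is only a sketch of that check: it assumes exiftool is installed and on the PATH, and the function and constant names are made up for this example.

```python
import subprocess
import sys
from typing import Sequence

# Flags suggested above: -g groups the output, -u includes unknown tags,
# and the trailing "-" tells exiftool to read the file from stdin.
DEFAULT_ARGS = ("-g", "-u", "-")


def extract_metadata(
    data: bytes,
    binary: str = "exiftool",
    args: Sequence[str] = DEFAULT_ARGS,
) -> str:
    """Pipe data to the binary on stdin and return its stdout as text."""
    result = subprocess.run(
        [binary, *args],
        input=data,
        stdout=subprocess.PIPE,
        stderr=subprocess.PIPE,
        check=True,  # raise CalledProcessError if the tool fails
    )
    return result.stdout.decode("utf-8", errors="replace")


def main(argv: Sequence[str]) -> None:
    """Print the extracted metadata for each file path given."""
    for path in argv[1:]:
        with open(path, "rb") as fh:
            sys.stdout.write(extract_metadata(fh.read()))


if __name__ == "__main__":
    main(sys.argv)
```

Saved under a name of your choice and run with an image path as its argument, this prints the text that rga would end up searching.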

If you do find a good config json, do post it in the Wiki: https://github.com/phiresky/ripgrep-all/discussions/categories/show-your-adapter

I'll probably add the top voted adapters from the wiki to core at some point.

Author

Pfeil commented May 5, 2024

Thank you for the hint. I looked a bit into exiftool and was impressed by how many file formats it supports according to the manpage. So for now, this prototype works for me:

{
  "name": "exiftool",
  "version": 1,
  "description": "Uses exiftool to extract all plain text metadata from supported files.",

  "extensions": ["jpg", "jpeg"],
  "mimetypes": ["image/jpeg"],

  "binary": "exiftool",
  "args": ["-g", "-u", "-"],
  "disabled_by_default": false,
  "match_only_by_mime": false
}
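For anyone wondering where such an entry goes: as far as I understand, rga reads custom adapters from a "custom_adapters" array in its config file (on Linux typically ~/.config/ripgrep-all/config.jsonc, but treat both the key placement and the path as assumptions and check the rga documentation). A small sketch that wraps the prototype into that shape and prints it, without touching any file:

```python
import json
from typing import Any, Dict, Iterable

# The prototype adapter from above, as a Python dict.
exiftool_adapter: Dict[str, Any] = {
    "name": "exiftool",
    "version": 1,
    "description": "Uses exiftool to extract all plain text metadata from supported files.",
    "extensions": ["jpg", "jpeg"],
    "mimetypes": ["image/jpeg"],
    "binary": "exiftool",
    "args": ["-g", "-u", "-"],
    "disabled_by_default": False,
    "match_only_by_mime": False,
}


def build_config(adapters: Iterable[Dict[str, Any]]) -> Dict[str, Any]:
    """Wrap adapter entries in the assumed top-level config structure."""
    return {"custom_adapters": list(adapters)}


# Printed only; paste the result into the config file by hand.
config_text = json.dumps(build_config([exiftool_adapter]), indent=2)
print(config_text)
```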

Example Output:

$ rga "Notiz" .
./IMG_20240113_183022.jpg
Subject                         : Notiz

As for the wiki version, which kinds of files would you apply exiftool to (e.g. to avoid double extraction)? The list in the manpage is pretty long. I wonder what it gets out of pptx files, for example; I do not have any at hand right now. Probably the office username and similar.

 File Types
 ------------+-------------+-------------+-------------+------------
 360   r/w   | DOCX  r     | ITC   r     | O     r     | RSRC  r
 3FR   r     | DPX   r     | J2C   r     | ODP   r     | RTF   r
 3G2   r/w   | DR4   r/w/c | JNG   r/w   | ODS   r     | RW2   r/w
 3GP   r/w   | DSS   r     | JP2   r/w   | ODT   r     | RWL   r/w
 7Z    r     | DV    r     | JPEG  r/w   | OFR   r     | RWZ   r
 A     r     | DVB   r/w   | JSON  r     | OGG   r     | RM    r
 AA    r     | DVR-MS r    | JXL   r     | OGV   r     | SEQ   r
 AAC   r     | DYLIB r     | K25   r     | ONP   r     | SKETCH r
 AAE   r     | EIP   r     | KDC   r     | OPUS  r     | SO    r
 AAX   r/w   | EPS   r/w   | KEY   r     | ORF   r/w   | SR2   r/w
 ACR   r     | EPUB  r     | LA    r     | ORI   r/w   | SRF   r
 AFM   r     | ERF   r/w   | LFP   r     | OTF   r     | SRW   r/w
 AI    r/w   | EXE   r     | LIF   r     | PAC   r     | SVG   r
 AIFF  r     | EXIF  r/w/c | LNK   r     | PAGES r     | SWF   r
 APE   r     | EXR   r     | LRV   r/w   | PBM   r/w   | THM   r/w
 ARQ   r/w   | EXV   r/w/c | M2TS  r     | PCD   r     | TIFF  r/w
 ARW   r/w   | F4A/V r/w   | M4A/V r/w   | PCX   r     | TORRENT r
 ASF   r     | FFF   r/w   | MACOS r     | PDB   r     | TTC   r
 AVI   r     | FITS  r     | MAX   r     | PDF   r/w   | TTF   r
 AVIF  r/w   | FLA   r     | MEF   r/w   | PEF   r/w   | TXT   r
 AZW   r     | FLAC  r     | MIE   r/w/c | PFA   r     | VCF   r
 BMP   r     | FLIF  r/w   | MIFF  r     | PFB   r     | VNT   r
 BPG   r     | FLV   r     | MKA   r     | PFM   r     | VRD   r/w/c
 BTF   r     | FPF   r     | MKS   r     | PGF   r     | VSD   r
 C2PA  r     | FPX   r     | MKV   r     | PGM   r/w   | WAV   r
 CHM   r     | GIF   r/w   | MNG   r/w   | PLIST r     | WDP   r/w
 COS   r     | GLV   r/w   | MOBI  r     | PICT  r     | WEBP  r/w
 CR2   r/w   | GPR   r/w   | MODD  r     | PMP   r     | WEBM  r
 CR3   r/w   | GZ    r     | MOI   r     | PNG   r/w   | WMA   r
 CRM   r/w   | HDP   r/w   | MOS   r/w   | PPM   r/w   | WMV   r
 CRW   r/w   | HDR   r     | MOV   r/w   | PPT   r     | WPG   r
 CS1   r/w   | HEIC  r/w   | MP3   r     | PPTX  r     | WTV   r
 CSV   r     | HEIF  r/w   | MP4   r/w   | PS    r/w   | WV    r
 CUR   r     | HTML  r     | MPC   r     | PSB   r/w   | X3F   r/w
 CZI   r     | ICC   r/w/c | MPG   r     | PSD   r/w   | XCF   r
 DCM   r     | ICO   r     | MPO   r/w   | PSP   r     | XISF  r
 DCP   r/w   | ICS   r     | MQV   r/w   | QTIF  r/w   | XLS   r
 DCR   r     | IDML  r     | MRC   r     | R3D   r     | XLSX  r
 DFONT r     | IIQ   r/w   | MRW   r/w   | RA    r     | XMP   r/w/c
 DIVX  r     | IND   r/w   | MXF   r     | RAF   r/w   | ZIP   r
 DJVU  r     | INSP  r/w   | NEF   r/w   | RAM   r     |
 DLL   r     | INSV  r     | NKSC  r/w   | RAR   r     |
 DNG   r/w   | INX   r     | NRW   r/w   | RAW   r/w   |
 DOC   r     | ISO   r     | NUMBERS r   | RIFF  r     |  

And it supports among others:

 Meta Information
----------------------+----------------------+---------------------
EXIF           r/w/c  |  CIFF           r/w  |  Ricoh RMETA    r
GPS            r/w/c  |  AFCP           r/w  |  Picture Info   r
IPTC           r/w/c  |  Kodak Meta     r/w  |  Adobe APP14    r
XMP            r/w/c  |  FotoStation    r/w  |  MPF            r
MakerNotes     r/w/c  |  PhotoMechanic  r/w  |  Stim           r
Photoshop IRB  r/w/c  |  JPEG 2000      r    |  DPX            r
ICC Profile    r/w/c  |  DICOM          r    |  APE            r
MIE            r/w/c  |  Flash          r    |  Vorbis         r
JFIF           r/w/c  |  FlashPix       r    |  SPIFF          r
Ducky APP12    r/w/c  |  QuickTime      r    |  DjVu           r
PDF            r/w/c  |  Matroska       r    |  M2TS           r
PNG            r/w/c  |  MXF            r    |  PE/COFF        r
Canon VRD      r/w/c  |  PrintIM        r    |  AVCHD          r
Nikon Capture  r/w/c  |  FLAC           r    |  ZIP            r
GeoTIFF        r/w/c  |  ID3            r    |  (and more)
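One possible way to approach the double-extraction question is to start from the types exiftool can read and drop everything another adapter already handles. The sketch below does that with plain sets. Both lists are illustrative assumptions: the first is a small subset of the table above, the second is a guess at what other adapters cover (the output of rga --rga-list-adapters should show the real picture), and exiftool type names do not always match file extensions one to one.

```python
from typing import Iterable, List


def pick_extensions(
    candidates: Iterable[str],
    already_handled: Iterable[str],
) -> List[str]:
    """Return the lowercase candidates that no other adapter claims, sorted."""
    handled = {ext.lower() for ext in already_handled}
    return sorted({ext.lower() for ext in candidates} - handled)


# Small subset of the exiftool file type table (illustrative).
exiftool_readable = [
    "JPEG", "PNG", "TIFF", "HEIC", "WEBP", "PDF", "DOCX", "ZIP", "MP4",
]

# Assumed to be covered by other adapters; verify against your own setup.
handled_elsewhere = ["pdf", "docx", "zip", "mp4"]

extensions = pick_extensions(exiftool_readable, handled_elsewhere)
print(extensions)
```

Whether skipping those types is actually desirable is a separate question, since exiftool may surface metadata that a content-focused adapter ignores.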

PS: For me, the issue is solved with this. I'll tinker around with exiftool's options and post to the wiki later on. So it is fine with me if you would like to close the issue.
