Album Name regex ? #106

Open
Bierchermuesli opened this issue Jan 16, 2025 · 13 comments

Comments

@Bierchermuesli

Hey, very nice script, kudos to you!

I have a YYYY\YYYYMMDD Some Album Name '95 folder structure.
Not sure if I overlooked the documentation but I actually only want Some Album Name '95 as album name.

I think I could implement a simple regex flag for search replace in create_album_name function - or what do you suggest?

Thanks

@Salvoxia
Owner

So you have one folder per day and would like to only create an album based on the title following the date?

I agree a regex_replace function might be the most straightforward solution. I think it makes sense to apply that at the end of create_album_name after the album chunks have been joined on the separator.
Did you already put some thought into how to pass the regular expression and the replacement to the script? Two arguments? One argument with a delimiter?
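
A minimal sketch of that idea, assuming `create_album_name` receives the path chunks and a separator. The signature and the `regex`/`replacement` parameters are hypothetical and only illustrate where the substitution would happen; they are not the script's actual API:

```python
import re


def create_album_name(chunks, separator, regex=None, replacement=""):
    """Join path chunks, then optionally post-process the joined name.

    Hypothetical signature, for illustration only.
    """
    album_name = separator.join(chunks)
    if regex is not None:
        # Apply the substitution last, on the fully joined name.
        album_name = re.sub(regex, replacement, album_name)
    return album_name


name = create_album_name(
    ["1995", "19950704 Some Album Name '95"], " ", r"^\d{4} \d{8} "
)
print(name)
```

Leaving `replacement` empty simply deletes whatever the pattern matches.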

@Bierchermuesli
Author

Bierchermuesli commented Jan 17, 2025

OK, glad my need is not so terribly wrong :) Yes, my classic folder structure carries the date as a prefix (as most other tools cannot sort by their (EXIF) content).
Another cheap workaround would be to just create a .albumprops file for each folder. 🤔 But maybe there are other search/replace use cases.

I made a first attempt in #110 with two arguments to see if it is handy to use. One argument in sed style would also be nice, but it is not as easy to implement in Python.
In my case I just omit the second one (replace with nothing):

    environment:
...
      TZ: Europe/Berlin
      ALBUM_LEVELS: 2
      READ_ALBUM_PROPERTIES: 1
      ALBUM_SEPARATOR: '-'
      UNATTENDED: 1 
      ALBUM_NAME_REGEX: '[\d\-\s]+\s'

I would expect escaping issues and questions from other users (?)
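
To see what that pattern does once it reaches Python, here is a small sketch. The sample folder names are made up, and the joined name assumes two album levels joined with `-` as in the environment block above. The anchored variant is a hypothetical alternative, not part of the PR:

```python
import re

pattern = r"[\d\-\s]+\s"    # value of ALBUM_NAME_REGEX from the example
anchored = r"^[\d\-\s]+\s"  # hypothetical variant that only matches at the start

joined = "1995-19950704 Some Album Name '95"
stripped_prefix = re.sub(pattern, "", joined)

# re.sub replaces every match, so an unanchored pattern can also
# remove digits and spaces in the middle of a title.
tricky = "1995-19950704 Party 2 people"
unanchored_result = re.sub(pattern, "", tricky)
anchored_result = re.sub(anchored, "", tricky)

print(stripped_prefix)
print(unanchored_result)
print(anchored_result)
```

The first call leaves `Some Album Name '95`. For the second title, the unanchored pattern also removes ` 2 ` from the middle, while the anchored one only removes the date prefix.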

Let me know what you think and feel free to tweak as it matches your code style.

Out of curiosity, how does immich/your script know that an album already exists?

@Salvoxia
Owner

Nice, looks pretty much as I expected it to.
Did you test it with your structure? I also was a bit worried about escaping, especially when passing through the docker script, but if your example above is working, that's great!
I also kinda like ALBUM_NAME_REPLACE being optional and simply replacing with nothing when not supplied.
I reviewed the PR with a couple of suggestions.

Out of curiosity, how does immich/your script know that an album already exists?

After building the list of albums from the assets, the script fetches the complete list of existing albums from the API here:

albums = fetch_albums()
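
With that list in hand, the check can be a simple lookup by name. The sketch below uses made-up data and assumes each album returned by the API is a dict with `albumName` and `id` keys. `find_missing_albums` is a hypothetical helper, not a function from the script:

```python
def find_missing_albums(wanted_names, existing_albums):
    """Return the album names that do not exist on the server yet."""
    # Index the fetched albums by name for quick membership checks.
    existing_by_name = {album["albumName"]: album["id"] for album in existing_albums}
    return [name for name in wanted_names if name not in existing_by_name]


existing = [{"id": "a1", "albumName": "Stevns Klint Sheltern"}]
missing = find_missing_albums(["Das Hasen Fest", "Stevns Klint Sheltern"], existing)
print(missing)
```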

@Bierchermuesli
Author

yes, it worked in my situation. I will do some testing with/without/replace and do the documentation part.

Yes. ENV, args and so on smell of escaping issues sooner or later :) - maybe config file/json/yaml would be better? (as a long term idea..)

@Salvoxia
Owner

maybe config file/json/yaml would be better? (as a long term idea..)

Yes, I've thought of that a couple of times in the past. It would make it much easier for people trying to serve multiple users (and thus need multiple API keys) with Docker. Instead of spinning up multiple containers, a single one with a config file for different API keys and different settings would suffice.
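
Purely as an illustration of that long-term idea, a per-user config could look something like the sketch below. The file layout and every key name are invented for this example; nothing like it exists in the script today:

```python
import json

# Hypothetical config layout: one entry per Immich user / API key.
CONFIG_TEXT = r"""
{
  "users": [
    {
      "api_key": "<key-user-1>",
      "root_path": "/external_libs/user1",
      "album_levels": 2,
      "album_name_post_regex": [["[\\d]+_|\\d+\\s\\w{3}"], ["_", " "]]
    },
    {
      "api_key": "<key-user-2>",
      "root_path": "/external_libs/user2",
      "album_levels": 1
    }
  ]
}
"""

config = json.loads(CONFIG_TEXT)
for user in config["users"]:
    # Each entry carries its own settings; missing keys fall back to defaults.
    rules = user.get("album_name_post_regex", [])
    print(user["root_path"], user["album_levels"], rules)
```

Note that backslashes still have to be doubled in JSON, so a config file reduces the shell and Docker escaping layers but does not remove escaping entirely.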

@ratti

ratti commented Jan 18, 2025

Hello, I came here to check for nearly exactly that feature, except I would need multiple RegEx chained.

My folder structure and naming convention is nearly the same as yours, except I am using underscores in names instead of spaces, to make shell scripting easier.

So, my structure is:

2024/2024_12_24__Our_nice_Xmas_day

…and I would, in a first step, replace ^.*?__ by “nothing“ to get rid of 2024/2024_12_24__ , so I have left Our_nice_Xmas_day , then /_/ /g to replace underscores with spaces, getting Our nice Xmas day

That would be great!
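
In Python terms, the two chained steps described above would look roughly like this. The list of (pattern, replacement) pairs is just one way to represent the chain, not something the script provides yet:

```python
import re

# (pattern, replacement) pairs, applied in order
steps = [
    (r"^.*?__", ""),  # drop everything up to and including the first "__"
    (r"_", " "),      # then turn the remaining underscores into spaces
]

album_name = "2024/2024_12_24__Our_nice_Xmas_day"
for pattern, replacement in steps:
    album_name = re.sub(pattern, replacement, album_name)

print(album_name)
```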

@Bierchermuesli
Author

@ratti I think this can be done with one regex as intended. Please try some test cases in your Python console, e.g.

re.sub(r'\d{4}/\d{4}_\d{2}_\d{2}__|_', ' ', filename).strip()

@ratti

ratti commented Jan 19, 2025

I don't speak Python, but I'd guess that puts a space at the beginning of every album name.

However, that's just one specific case. I just think having an option for chaining RegEx's will increase the power of that feature a lot with minor additional effort.

Chaining matching expression with | enforces having the same replacement string for every match, since AFAIK there is no RegEx-Syntax for replacing “list of matches“ with “list of replacements“, only “list“ with “one string“

EDIT: Ah, I didn't notice the "strip" at the end, however, if I get that feature right, I could only hand over a RegEx, not running fully chained python methods, right? So this wouldn't help?
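
For reference, this is what the single-regex suggestion produces with and without the trailing `strip()`, using the folder name from the earlier comment:

```python
import re

filename = "2024/2024_12_24__Our_nice_Xmas_day"

# Both alternatives share one replacement string, so the date prefix
# also becomes a space instead of being removed.
raw = re.sub(r"\d{4}/\d{4}_\d{2}_\d{2}__|_", " ", filename)
cleaned = raw.strip()

print(repr(raw))
print(repr(cleaned))
```

Without `strip()` the name keeps a leading space. That only goes away if the script itself trims the result, because a regex passed as an option cannot carry a method call.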

@Salvoxia
Owner

Salvoxia commented Jan 19, 2025

I do understand where @ratti is coming from. His example could work with a single regexp if the script implicitly applied strip() (which would make sense for it to do), but if the naming scheme were a little bit different, it might no longer be possible with a single regex.
I also agree that the additional effort of chaining regexps is little when looking at the python script alone. I was thinking of passing --album-name-postproc-regex and --album-name-postproc-replace multiple times, chaining would be done in order of passing (and --album-name-postproc-replace would be mandatory every time to be able to match pairs by index).
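
A rough sketch of that pairing-by-index idea with `argparse`, assuming the two option names mentioned above. `parse_rules` is a hypothetical helper and the sample patterns are taken from the earlier comments:

```python
import argparse

parser = argparse.ArgumentParser()
# action="append" collects every occurrence of an option, in the order given.
parser.add_argument("--album-name-postproc-regex", action="append", default=[])
parser.add_argument("--album-name-postproc-replace", action="append", default=[])


def parse_rules(argv):
    """Pair the n-th regex with the n-th replacement."""
    args = parser.parse_args(argv)
    regexes = args.album_name_postproc_regex
    replacements = args.album_name_postproc_replace
    if len(regexes) != len(replacements):
        parser.error("each regex needs exactly one replacement")
    return list(zip(regexes, replacements))


rules = parse_rules([
    "--album-name-postproc-regex", r"^.*?__", "--album-name-postproc-replace", "",
    "--album-name-postproc-regex", "_", "--album-name-postproc-replace", " ",
])
print(rules)
```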
However, for Docker that's not so easy, since all regex and replacements would have to be passed in a single environment variable, which means some kind of delimiter for regular expressions would be necessary. Might be a little bit tricky to find something suitable (maybe "<regex1>":"<regex2>" could work?).
What do you think, @Bierchermuesli?

@Bierchermuesli
Author

Bierchermuesli commented Jan 21, 2025

Sorry, I was away for work.

well.. then... something like:

python3 immich_auto_album.py /mnt/library http://localhost:2283/api <key>  -l DEBUG -a 2 -s '' --album-name-post-regex '[\d]+_|\d+\s\w{3}'  --album-name-post-regex '_' ' '
time=2025-01-21T22:19:37.578+01:00 level=DEBUG msg=Identified root_path for asset /mnt/library/2024/01 Dez Das Hasen_Fest/PXL_20241201_093840023.jpg = /mnt/library/
time=2025-01-21T22:19:37.578+01:00 level=DEBUG msg=path chunks = ['2024', '01 Dez Das Hasen_Fest']
time=2025-01-21T22:19:37.578+01:00 level=DEBUG msg=album_name_chunks = ['2024', '01 Dez Das Hasen_Fest']
time=2025-01-21T22:19:37.578+01:00 level=DEBUG msg=Album Name 202401 Dez Das Hasen_Fest
time=2025-01-21T22:19:37.578+01:00 level=DEBUG msg=Album Post Regex s/[\d]+_|\d+\s\w{3}//g -->  Das Hasen_Fest
time=2025-01-21T22:19:37.578+01:00 level=DEBUG msg=Album Post Regex s/_/ /g -->  Das Hasen Fest

....
time=2025-01-21T22:19:37.596+01:00 level=DEBUG msg=Identified root_path for asset /mnt/library/2024/20240907_Stevns_Klint_Sheltern/PXL_20240907_110835302.jpg = /mnt/library/
time=2025-01-21T22:19:37.596+01:00 level=DEBUG msg=path chunks = ['2024', '20240907_Stevns_Klint_Sheltern']
time=2025-01-21T22:19:37.596+01:00 level=DEBUG msg=album_name_chunks = ['2024', '20240907_Stevns_Klint_Sheltern']
time=2025-01-21T22:19:37.596+01:00 level=DEBUG msg=Album Name 202420240907_Stevns_Klint_Sheltern
time=2025-01-21T22:19:37.596+01:00 level=DEBUG msg=Album Post Regex s/[\d]+_|\d+\s\w{3}//g --> Stevns_Klint_Sheltern
time=2025-01-21T22:19:37.596+01:00 level=DEBUG msg=Album Post Regex s/_/ /g --> Stevns Klint Sheltern
time=2025-01-21T22:19:37.596+01:00 level=INFO msg=2 albums identified
time=2025-01-21T22:19:37.596+01:00 level=INFO msg=Album list: ['Das Hasen Fest', 'Stevns Klint Sheltern']

You can repeat it as often as you want. I got rid of the replace flag; the replacement is now the second value, which is optional.

just don't ask me how this can be integrated into ENV :-)
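
For readers who want to try the behaviour shown in the log, here is a sketch of how a repeatable option with an optional second value could be parsed and applied. It is not the code from the PR. The `apply_post_regex` helper is hypothetical, and the final `strip()` is an assumption based on the album list in the log having no leading spaces:

```python
import argparse
import re

parser = argparse.ArgumentParser()
# nargs="+" with action="append" yields one list per occurrence:
# [pattern] or [pattern, replacement]
parser.add_argument("--album-name-post-regex", nargs="+", action="append", default=[])


def apply_post_regex(album_name, rules):
    """Apply each rule in order; a missing replacement means 'delete the match'."""
    for rule in rules:
        pattern = rule[0]
        replacement = rule[1] if len(rule) > 1 else ""
        album_name = re.sub(pattern, replacement, album_name)
    return album_name.strip()  # assumed trimming step


args = parser.parse_args([
    "--album-name-post-regex", r"[\d]+_|\d+\s\w{3}",
    "--album-name-post-regex", "_", " ",
])
names = [
    apply_post_regex(name, args.album_name_post_regex)
    for name in ("202401 Dez Das Hasen_Fest", "202420240907_Stevns_Klint_Sheltern")
]
print(names)
```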

@Salvoxia
Owner

That's pretty cool! Thanks for implementing!
I played around with it for a bit hoping to get some kind of idea how to integrate it with docker properly. Surely we're pushing the limits of what should be done with environment variables here, but I think I've found a solution (that I actually like):
Instead of passing all regex's and replacements in a single environment variable, one indexed variable per pattern is used.
For the example above, the direct Docker call would look like this:

docker run \
  -e ROOT_PATH="/mnt/library" \
  -e API_URL="http://localhost:2283/api" \
  -e API_KEY="<key>" \
  -e LOG_LEVEL="DEBUG" \
  -e ALBUM_LEVELS="2" \
  -e ALBUM_SEPARATOR="" \
  -e ALBUM_NAME_POST_REGEX1="[\d]+_|\d+\s\w{3}" \
  -e ALBUM_NAME_POST_REGEX2="'_' ' '"

The YAML example for docker-compose looks like this:

---
services:
  immich-folder-album-creator:
    container_name: immich_folder_album_creator
    image: salvoxia/immich-folder-album-creator:latest
    restart: unless-stopped
    environment:
      API_URL: http://immich_server:2283/api
      API_KEY: <key>
      ROOT_PATH: /external_libs/photos
      ALBUM_LEVELS: 2
      ALBUM_SEPARATOR: ""
      # backslashes must be escaped in YAML
      ALBUM_NAME_POST_REGEX1: "[\\d]+_|\\d+\\s\\w{3}"
      ALBUM_NAME_POST_REGEX2: "'_' ' '"
      LOG_LEVEL: DEBUG
      CRON_EXPRESSION: "0 * * * *"
      TZ: Europe/Berlin

The shell script that translates env variables to CLI arguments is set up to dynamically parse environment variables ALBUM_NAME_POST_REGEX1...ALBUM_NAME_POST_REGEX10, that should be enough for everyone.

If you guys @ratti @Bierchermuesli think that's a feasible solution, I'll create a PR against the branch @Bierchermuesli's PR is based on, with the changes to immich_auto_album.sh.

@Bierchermuesli
Author

Sounds good. Does bash loop over the ALBUM_NAME_POST_REGEX* ENVs?

@Salvoxia
Owner

That's what the script effectively does, but it's not that straightforward. It lists all ENV variables, loops over them and performs regex matching with grep to find the ones it wants.
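
The same idea, sketched in Python rather than shell to make the logic easier to follow. This is not the contents of immich_auto_album.sh. `post_regex_args` is a hypothetical helper, and the rule that single-quoted parts mean "pattern, then replacement" is an assumption taken from the examples above:

```python
import re

ENV_NAME = re.compile(r"^ALBUM_NAME_POST_REGEX(\d+)$")
QUOTED = re.compile(r"'([^']*)'")


def post_regex_args(environ):
    """Translate ALBUM_NAME_POST_REGEX<n> variables into CLI arguments, ordered by n."""
    indexed = []
    for name, value in environ.items():
        match = ENV_NAME.match(name)
        if match:
            indexed.append((int(match.group(1)), value))

    argv = []
    for _, value in sorted(indexed):
        # Quoted parts are read as pattern and replacement; an unquoted value
        # is taken as a pattern on its own. Patterns containing a single quote
        # are not handled by this simplification.
        parts = QUOTED.findall(value) or [value]
        argv += ["--album-name-post-regex", *parts]
    return argv


example_env = {
    "ALBUM_LEVELS": "2",
    "ALBUM_NAME_POST_REGEX2": "'_' ' '",
    "ALBUM_NAME_POST_REGEX1": r"[\d]+_|\d+\s\w{3}",
}
cli_args = post_regex_args(example_env)
print(cli_args)
```

Sorting by the numeric suffix keeps the chain order stable regardless of the order in which the environment lists the variables.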
