Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Srt files from Bazarr #70

Closed
bonswouar opened this issue Sep 22, 2024 · 21 comments
Closed

Srt files from Bazarr #70

bonswouar opened this issue Sep 22, 2024 · 21 comments

Comments

@bonswouar
Copy link

bonswouar commented Sep 22, 2024

Not sure if this is a bug / misconfiguration on my end, or a feature request, but I would have expected the Leaving Soon folders to contain all extra files from the original ones: mainly subtitles, but could also be nfo, theme.mp3, trickplay...

For files generated by Jellyfin I guess it's not a huge issue, as configuring the Leaving Soon directory to generate nfo/theme/trickplay should be good enough (although not ideal to generate them twice, specially if you have changed things - like metadata language - manually)
But for subtitles it's kind of a big deal, in my case they're created by an external tool (bazarr) that only manage subtitles for Radarr's directories.

@Schaka
Copy link
Owner

Schaka commented Sep 22, 2024

Essentially duplicate of #48

It's intended. Janitorr doesn't know about files that aren't managed by the arrs and cleanup is a lot harder (and worse, will depend on Jellyfin setup) if I copy over all files. Metadata is up to you. You should be using TRaSH Guides naming so that Jellyfin/Emby will automatically pick everything up.

Ideally, you should NOT be putting your metadata in the same folder as your media.
However, if you insist, you can configure your leaving soon libraries to do in Jellyfin/Emby. Janitorr only creates them if they don't exist and will not alter them after other than to add/remove paths if necessary.

I understand the subtitle problem and I'm willing to give it some thought, but you should really be grabbing media that has enough of them already embedded.

The more files I have to carry over that aren't managed by the arrs, the more orphaned data I have to clean up somehow.
Since most of Janitorr's component are optional and it's supposed to work without file access, it's a lot more complicated than it would seem at first.

@bonswouar
Copy link
Author

bonswouar commented Sep 22, 2024

Thanks for your answer!

You should be using TRaSH Guides naming so that Jellyfin/Emby will automatically pick everything up.

I already have no issue regarding identification, it's more about customization (ie changing metadata's language for some specific movies)

Ideally, you should NOT be putting your metadata in the same folder as your media.

I've basically configured that with the hope to make media info more "portable" (but I have no actual use case except for Janitorr I think).
But as I was saying that's not a big deal, I understand that would mean a pretty tight integration with Jellyfin.

I understand the subtitle problem and I'm willing to give it some thought, but you should really be grabbing media that has enough of them already embedded.

Glad to see that part of the request did make sense! :)

And well, you often don't have so much choice regarding embedded subtitles, plus I even sometimes have to translate some (for movies that don't have any official release in that language), or generate some from whisper, etc. And Bazarr is pretty useful for that!
Also one feature that is working so well I totally forgot: stripping HI parts from subtitles, as HI subtitles are more frequent than regular ones (specially for embedded ones!), it helps a lot to have good English srt

The more files I have to carry over that aren't managed by the arrs, the more orphaned data I have to clean up somehow. Since most of Janitorr's component are optional and it's supposed to work without file access, it's a lot more complicated than it would seem at first.

Oh well I didn't really know it's supposed to work without file access, you're right it sounds quite more complicated than it would seem then 😅
The problem is that without it managed by Janitorr I see no solution except doing my own dirty script somehow

Although that just reminded me, when you say

Janitorr doesn't know about files that aren't managed by the arrs

I guess they aren't really "managed" by it, but those extra files are still shown in radarr : I've just checked at a random movie, and paths for my srt, nfo, bif, and json files are listed. Couldn't this be managed without orphaned data issues then?

(Sorry I should have checked the code before maybe :))

@Schaka
Copy link
Owner

Schaka commented Sep 23, 2024

I guess they aren't really "managed" by it, but those extra files are still shown in radarr : I've just checked at a random movie, and paths for my srt, nfo, bif, and json files are listed. Couldn't this be managed without orphaned data issues then?

I need to look into what the arrs supply in terms of files. Now that you mention it, I remember seeing it in Radarr too.
I primarily use the history API since that's how you get the original filename (for seeding checks) and I obviously need the grab dates for what Janitorr does.

I may be able to enrich it with extra data through a user option to move all files over.
If I can get deletion to be managed by the arrs so I don't have to start manging those files in Janitorr, I'll implement it.

I don't want to start persisting data for Janitorr as of right now. So keeping track of what files are in the leaving-soon folders is not desirable. Recursively checking all folders and assuming they all belong to Janitorr also doesn't necessarily work because you can turn options like from-scratch on and off.

I'll keep it open and see if deletion via arrs is possible. I'm sure it is for Radarr, but deleting seasons is going to be different.
In general, anything that has to do with Radarr has always been piss easy to handle. It's always TV shows that require extra work.

@bonswouar
Copy link
Author

bonswouar commented Sep 23, 2024

I see!
Well if it's "piss easy" to implement with Radarr, but not with Sonarr, I wouldn't mind if this option was just for Radarr extra files first. Not great for consistency between arrs, but better than nothing I'd say! (and I can still do my dirty script for tv shows extra files)

Also if you think it makes more sense to focus on subtitles only and not all other extra files, another way could maybe to use Bazarr API directly? So it could work for Movies & TV shows directly. I've just had a look, you can easily get srt path from the API using Sonarr's SerieID and EpisodeID for example

@bonswouar bonswouar changed the title Leaving soon doesn't contain extra files Srt files from Bazarr Sep 30, 2024
@bonswouar
Copy link
Author

bonswouar commented Sep 30, 2024

So I had a better look at Bazarr API, here are the endpoints I've found with their structured responses:

  • You can use /api/movies?start=0&length=-1&radarrid%5B%5D={RadarrMovieID} to get the path of subtitles for a movie
{
  "data": [
    {
      [...]
      "subtitles": [
        {
          [...]
          "path": "/data/media/Movies/Movie/MovieName.en.srt",
        },
        [...]
      ],
    [...]
    }
  ]
}
  • And /api/episodes?seriesid%5B%5D={SonarrSeriesID}&episodeid%5B%5D={SonarrEpisodeId} for episodes or full shows (only one param is required)
{
  "data": [
    {
      "episode": 14,
      [...]
      "subtitles": [
        {
          "path": "/data/media/Series/SerieName/Season/EpisodeName.en.srt",
          [...]
        },
      [...]
      ]
    },
  [...]
  ]
}

Do you think it could work?

Thanks!

@Schaka
Copy link
Owner

Schaka commented Oct 2, 2024

This is going to happen. I just don't know when. Adding another optional service to be triggered by the service responsible for Jellyfin is going to require some refactoring unless I'm turning the existing code even dirtier than it already is.

@bonswouar
Copy link
Author

bonswouar commented Oct 3, 2024

@Schaka I can look into it if you want (I just have to see how to build locally first), but if a big refactoring is needed that might be better that you do it?
I can still propose a draft if that can help though

@Schaka
Copy link
Owner

Schaka commented Oct 4, 2024

I started development here: #80

@Schaka
Copy link
Owner

Schaka commented Oct 5, 2024

You can try the images in the feature branch. They're tagged 80-merge, native-80-merge, native-amd64-80-merge, etc.

@bonswouar
Copy link
Author

Cool!

I think there's an issue with episodeIds:

feign.FeignException$NotFound: [404 NOT FOUND] during [GET] to [http://bazarr_host:6767/api/episodes?series=325&episodeid] [BazarrClient#getTvSubtitles(int,List)]: ["Series or Episode ID not provided"

@Schaka
Copy link
Owner

Schaka commented Oct 6, 2024

That was stupid. I changed the call later on and didn't test it again:
I've pushed some changes now

@bonswouar
Copy link
Author

Almost there :)

feign.FeignException$NotFound: [404 NOT FOUND] during [GET] to [http://bazarr_host:6767/api/episodes?series=325] [BazarrClient#getTvSubtitles(int)]: ["Series or Episode ID not provided"

I believe the url param shouldn't be series but seriesid

@Schaka
Copy link
Owner

Schaka commented Oct 6, 2024

Changes are coming. I must've been very tired. I just went by their swagger generated docs.

@Schaka
Copy link
Owner

Schaka commented Oct 12, 2024

Please try again. I've been running it on my server for a bit but can only do limited testing. It took me a while to debug and make the (minimal) changes I had to because I've been strapped for time.

@bonswouar
Copy link
Author

bonswouar commented Oct 13, 2024

Thanks for the update!
I see no error now, although still no srt file from bazarr in leaving-soon :/

@Schaka
Copy link
Owner

Schaka commented Oct 13, 2024

Trace log should show, if files are meant to be copied over:

// Response from bazarr
log.trace("Adding extra files to $type for *arr id ${item.id}: $extraFiles")

Debug logging should show actual files attempted to be copied over:

log.debug("Copying extra files from {} to {}", filePath, target)

Are you seeing any of that?

@Schaka
Copy link
Owner

Schaka commented Oct 13, 2024

I believe I've found all issues now. I was testing components in too much of an isolated state.
I need to figure out how to run a complete arr stack in testcontainers when I can find the time.

Please let me know if it works for you.

@bonswouar
Copy link
Author

bonswouar commented Oct 14, 2024

I've checked at the logs, not sure it's normal:

Adding extra files to MOVIES for *arr id 11367 (season null): [/data/media/Movies/ ..., , /data/media/Movies/ ...

with a huge list of movies included in this array

It seems some subtitles are copied, but not the right ones/to the right folders. I see for example 3 srt than aren't supposed to be in leaving-soon are copied in some movie folders.
The logs corresponding seem to be :

Copying extra files from /data/media/Movies/Movie1/Movie1.en.srt to /data/media/leaving-soon/movies/media/Movie2/Movie2.mkv

with Movie1 not supposed to be used

Also, probably not related but, there is an error "Malformed input or input contains unmappable characters" because one of the path contains ³ (I think)

@Schaka
Copy link
Owner

Schaka commented Oct 14, 2024

That was a stupid bug.
If you pass radarrid as a parameter instead of radarrid[], it will just respond with a list of all movies.

The second bug I've been trying to chase down. It's related to encoding. I believe it's a bug in GraalVM and I'm looking into (temporary) workarounds.

@bonswouar
Copy link
Author

@Schaka Looks good for subtitles, thanks!

Still the issue with "Malformed input or input contains unmappable characters" though, this time with a file containing É and ô

@Schaka
Copy link
Owner

Schaka commented Oct 17, 2024

Yeah that's a bug in Oracle's GraalVM, or more specifically how it's using in the Paketo's native image builder. Since they officially advertise those buildpacks, I hope they're going to fix it soon . I've reported the issue to both.
This has always been the case, I just made the exception more visibile when using debug logging.

Closing this ticket. I've already created follow ups for the accented/umlaut character issue.

@Schaka Schaka closed this as completed Oct 17, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants