Distribute chunks over slaves #53
Conversation
prt.py
Outdated
```python
# If no load is returned, then it is likely that the host
# is offline or unreachable
log.debug("Couldn't get load for host '%s'" % hostname)
del servers[hostname]
```
I'm not sure that you're able to remove elements from a dictionary whilst iterating over it.
This will work in Python 2.7, but not Python 3.x. In Python 3.x, dict.items() returns a view over the dictionary, so deleting entries mid-iteration raises a RuntimeError; in Python 2.7 it returns a list, which is why the loop happens to work there.
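For reference, a minimal self-contained sketch of the usual fix (toy data and print stand in for the real prt.py context): wrapping the items in list() makes the loop iterate over a snapshot, so deletion is safe on both 2.7 and 3.x.

```python
# Iterate over a snapshot of the items so entries can be
# deleted from the dict inside the loop (works on 2.7 and 3.x).
servers = {"nas": {"load": 0.4}, "pi": None}  # toy data for illustration

for hostname, info in list(servers.items()):
    if info is None:
        # Host is offline or unreachable; drop it from the pool.
        print("Couldn't get load for host '%s'" % hostname)
        del servers[hostname]

print(servers)  # {'nas': {'load': 0.4}}
```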
OK, thanks. I'm not a Python developer at all; I'll fix it ASAP.
prt.py
Outdated
```python
ss += segment_time * SEGMENTS_PER_NODE
q.put((proc, hostname))

if init is True:
```
Nit: `if init:` rather than `if init is True:`.
prt.py
Outdated
log.debug("Available servers : %s\n" % servers) | ||
|
||
return servers | ||
|
PEP 8: two blank lines between functions, please.
So first up, I'd like to say kudos for original thinking around your approach; it's good to see people showing initiative, too many are content to reuse what they find via Google! Genuine praise... Now comes the other side, I'm afraid. Personally I think we shouldn't be bringing this sort of function into PRT without, at the very least, total head-to-foot scrutiny under a microscope, and that's if at all: at its core this is a very complex type of function, and it's a minefield of problems lurking under the surface.

I'll admit that I haven't dived into the code yet, as I've only just started catching up on all my git notifications and this stood out as something I had to flag when I read the brief about it. That said, I'm coming at this from the perspective of someone who has been a technical lead on the design, build, and support of enterprise cluster deployments. They are some of the trickiest miracles to perform! Even very expensive clustering software from the industry leaders has issues, and that's on enterprise-class hardware, while minimizing risk with things like direct access to data via SAN arrays rather than network shares, maximizing uptime on your nodes with resilient systems, etc.

So introducing a function that takes input from an external source (PMS) that we have no control over and that keeps changing with each version, then trying to distribute and load-balance individual jobs across multiple nodes, which could be a random mixture of OSes and hardware, while having to read/write over NFS, and expecting it to reliably produce a successful output every time (or even almost every time), really is a large piece of muddy land filled with death, with a pot of gold on the horizon...
@liviynz Thanks for your feedback. I completely understand your point of view, and I agree. But we aren't discussing an enterprise software project; this is more of a tweak for those who want to optimize their media server at a small cost (e.g. like me, using some cheap hardware to transcode high-quality movies). The installation method, the lack of automated testing, and the manual configuration show that it's clearly a POC that the creator is sharing with us (thanks to him).

IMO Plex should work on this subject (as we prove here that it's not such a big challenge), but as they don't, we can continue to improve this project and maybe provide some ideas for a future version through a working POC. I have many ideas to improve this project, and I submitted only the "essential" part here (the distributed transcoding) because it has no real BC breaks: it just improves the background algorithm and, ultimately, the user experience. My work on this PR is not finished, as I still have to make it more fault tolerant (by retrying the transcoding if segments fail), but everything I saw using it on my configuration was quite positive (same quality as the default Plex transcoder, but really faster). FYI, the previously mentioned improvement ideas:

It's up to the maintainers to choose whether or not they want to merge it; it doesn't change anything for me, as I'm using my fork. Thanks
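As an aside, here is a minimal sketch of the segment-retry idea mentioned above. Every name in it (run_segment_job, the round-robin host pick, the simulated failure rate) is illustrative, not PRT's actual code:

```python
import logging
import random

log = logging.getLogger("prt")

def run_segment_job(host, job):
    """Stand-in for dispatching one segment set to a slave; True on success."""
    return random.random() > 0.3  # simulate occasional slave failures

def transcode_with_retry(job, hosts, max_retries=3):
    """Retry a failed segment set on the next host, up to max_retries times."""
    for attempt in range(1, max_retries + 1):
        host = hosts[(attempt - 1) % len(hosts)]  # naive round-robin pick
        if run_segment_job(host, job):
            return True
        log.warning("Segment set %s failed on '%s' (attempt %d/%d)",
                    job, host, attempt, max_retries)
    return False
```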
First off, thank you @JJK801 for your work on this! I haven't had time to look through the code yet, but I agree with @liviynz that if we incorporate this feature, we will have to do so carefully (perhaps as an optional feature). As to your other suggested improvements, I'm intrigued by the idea of automatic slave discovery. We have some other features on the todo list which we could use some help with as well :)
This is exactly what I've wanted from PRT. Definitely want to keep an eye on this PR.
As I see it, the two main challenges we have with PRT as it stands are:

Hopefully that gives a bit of better background for things. I'm totally up for reviewing the code and talking it through with the boss man, but for the reasons I mentioned before I'm very cautious. I know we're not dealing with enterprise, but my point was that I see quite a few issues arise even when using the best hardware and running clustering software that costs hundreds of thousands of dollars; scaling that down to open-source code that could be running on virtually any Linux distro on any hardware leaves a lot of grey area for problems, purely from the sheer variation involved.

I need to catch up on my coding anyway, so I'm definitely keen to catch up internally to talk about this and some other items... My biggest issue is that I'm not native with Python; if this were built around a language I am native with, I'd be hammering out quality new functions quite quickly. E.g. auto-detection of slaves: I have code sitting here right now that will do just that! I wrote it for another project and it works perfectly, it even allows for quick communication via SSH, etc., but it's not in Python unfortunately. Anyway, better get back to my paid job I suppose, haha
@JJK801 Did the video, audio, or both get transcoded (did PMS say it was direct play, direct stream, or transcode)? Is your transcode directory on the Banana Pi, the Raspberry Pi, or somewhere else? Where is the media coming from? Do you know what the bitrate of that 1080p video was? I understand that this will increase the rate at which you can pump out transcoded material, but I'm curious whether it could also help with the IO issues the Raspberry Pi faces when using NFS to read from the library and then again to write to the transcode directory, particularly with high-bitrate media (~10,000 kbps).
Hi,

This PR aims to distribute transcoding over multiple slaves at a time.

Some input arguments are tweaked (-ss, -segment_start_number) and another added (-t) in order to delegate a set of segments (defined by SEGMENTS_PER_NODE, default 5) to each host, iterating over the timeline as the slaves finish their sets (which allows hosts with varying CPU capacities on the same network).
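For illustration, a sketch (not the PR's actual code) of how one node's slice of the timeline maps onto the flags above; build_node_args, base_args, and node_index are illustrative names:

```python
# Derive one node's segment window from its index on the timeline.
SEGMENTS_PER_NODE = 5

def build_node_args(base_args, segment_time, node_index):
    """Return ffmpeg-style arguments covering one node's set of segments."""
    start_segment = node_index * SEGMENTS_PER_NODE
    return base_args + [
        "-ss", str(start_segment * segment_time),      # seek to the slice start
        "-segment_start_number", str(start_segment),   # keep numbering global
        "-t", str(SEGMENTS_PER_NODE * segment_time),   # transcode only this slice
    ]

# Example: node 2 with 4-second segments handles segments 10-14,
# i.e. -ss 40 -segment_start_number 10 -t 20.
print(build_node_args(["ffmpeg", "-i", "movie.mkv"], 4, 2))
```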
Tested on my configuration, and it works like a charm:

- Transcoding is very fast relative to the CPUs used (it flawlessly plays 1080p movies over the network).
- Because of the NFS shares, it needs a fast local network between the server and the slaves for high-resolution movies.
- Also be careful to put the most powerful CPUs first, to reduce the initial load time.
TODO: