Code restructuring/improvements #202
Comments
Thank you, @tazend, for raising this issue. This is something I have been trying to implement for a few years now and just couldn't get over the finish line. I run into some wrapping issue, get stuck, and time just flies :( My last attempt at this was with the 19.05 release: https://github.com/PySlurm/pyslurm/blob/19.05.0-v2/README.rst. Have a look at the README and tell me if it is aligned with what you are thinking. My idea was to break out the .pxd and .pyx files into smaller, more manageable chunks; see https://github.com/PySlurm/pyslurm/tree/19.05.0-v2/pyslurm for the layout. I definitely agree overall that the project can benefit from this restructuring. This may break how we use autopxd today, however. Thoughts? |
Hi @giovtorres , very interesting and thanks for the links! Now, my idea is the following: The slurm.pxd file should stay as it is right now, having all the slurm definitions in one place. Reasons for that:
Additionally, splitting the slurm definitions up into separate .pxd files could probably become very tedious and wouldn't really be beneficial, I guess. The user doesn't interact with it directly anyway; it's just for internal usage, and I'd keep that as it is right now - plain and simple. EDIT: Thinking about this again, maybe it might be good to sort of split the However, what is more important is to have the actual logic from For example, I'm currently working on the logic for partitions here (work in progress and may not compile right now since I quickly pushed it) The logic for partitions can happily reside in its own module, where I'd write the functions that do the actual stuff in a way where you could basically just put the Using classes definitely makes sense for most things. You can see my basic idea of the partition class structure in the file I sent above though, not much more would be needed in that sense from my point of view. I'm not a big fan of defining an Trying to keep it as simple and straightforward as it gets here is the key, I think. However, everything is also possible to do without classes, by simply having straightforward module-level functions, with the information stored and operated on in plain dictionaries, e.g.:
The usage would be almost identical to using a class, calling the functions like Either way would work and whatever is the simplest should be used. In the end it doesn't really matter which of those two approaches is used, as long as the functions that implement all the necessary logic, e.g. in the case of |
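To make the module-level idea above concrete, here is a minimal sketch of what a dictionary-based partition module could look like. The file name `partition.py`, the `update()` and `delete()` functions, the attribute `max_nodes`, and the in-memory `_PARTITIONS` store are all assumptions for illustration; a real module would call into the Slurm C API instead.

```python
# partition.py (hypothetical): module-level functions operating on plain dicts.
# The in-memory _PARTITIONS store is an assumption standing in for the calls
# a real implementation would make into the Slurm C API.

_PARTITIONS = {}


def create(name, **attrs):
    """Add a partition and return a copy of its dictionary."""
    if name in _PARTITIONS:
        raise ValueError(f"partition {name!r} already exists")
    _PARTITIONS[name] = {"name": name, **attrs}
    return dict(_PARTITIONS[name])


def get(name=None):
    """Return one partition as a dict, or all partitions keyed by name."""
    if name is None:
        return {key: dict(value) for key, value in _PARTITIONS.items()}
    return dict(_PARTITIONS[name])


def update(name, **attrs):
    """Change attributes of an existing partition and return the new state."""
    _PARTITIONS[name].update(attrs)
    return dict(_PARTITIONS[name])


def delete(name):
    """Remove a partition; raises KeyError if it does not exist."""
    del _PARTITIONS[name]
```

With the file imported as `partition`, callers would write `partition.create("debug", max_nodes=4)` or `partition.get()`, which reads almost the same as calling methods on a class.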
Great points all around!
/cc @gingergeeks @bikerdanny Any feedback or thoughts on the above? |
You are right that overusing the dict type for returns is not the best, and I agree with your points, since so many things like Of course, not everything must be turned into a class, only where it makes sense, e.g. the slurm (main) config information can simply be stored and handed out in a dictionary, since there isn't much you can do with the data anyway other than simply retrieving it. I would definitely help getting these things going! |
I am very new to pyslurm but am quite familiar with the Python API from HTCondor. The individual classes are based on which of the daemons one is interacting with: schedulers, negotiator, collector, master, etc. I am also a fan of the ClassAd syntax for describing jobs and information from daemons. But Slurm is not HTCondor. I just figured one could learn from the designs others have created. It would also make going from one batch system to another a little easier if the Python API syntax was at least similar at some level. I love that you folks are trying to make interacting with Slurm far more friendly, and I would be interested in helping out, in part because I really hate manually writing job submission scripts. |
Thanks for leaning in here, @mtwest2718! We definitely welcome contributions from the HPC community. I left the HPC world a few years ago, and I try my best to keep the project updated with just the Slurm container that I put together for testing, as I no longer have access to a real cluster. Thanks for sharing those links. I'll have to peek at the design of ClassAd in HTCondor. I agree with you that we should copy good practices and patterns from other projects that have already solved these design problems. Perhaps we could put together an RFC or some sort of decision record for the more modular, object-based redesign. I think wrapping the job submission code will be the trickiest, but also the most beneficial to users. |
Hi, I put some more work into the I tried different approaches, also played a lot with Sticking with simple (public) In the end, the base layout for every class can be almost identical, as everything can be described with 4 base functions: What I definitely like is this compiler directive that can be put on top of a file (docs here):
This basically means that cython will automatically handle conversion of
There is then no need to do explicit Going forward, maybe there could be some sort of Milestone created for the refactoring, with a different issue per class that should be (re)implemented? It would be good to have that overview of what must be done, and it would help with tracking progress. Just a few thoughts. Also, I agree that having a sort of baseline defined for the redesign could be good. |
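For readers without the elided snippet: the directive meant here is presumably Cython's `c_string_type` / `c_string_encoding` pair, though that is an assumption since the original code block is missing. The sketch below is plain Python and only illustrates the kind of manual conversion helpers such a directive is meant to make unnecessary; the helper names `to_c_string` and `from_c_string` are hypothetical. Which directions are converted automatically depends on the directive values, so check the Cython docs before relying on it.

```python
# Assumed directive, placed at the very top of a .pyx file (to plain Python
# this line is just a comment):
# cython: c_string_type=unicode, c_string_encoding=utf8


def to_c_string(value, encoding="utf-8"):
    """Manually encode a Python str to bytes, as a char* argument needs."""
    if value is None:
        return None
    return value.encode(encoding)


def from_c_string(value, encoding="utf-8"):
    """Manually decode bytes coming back from C into a Python str."""
    if value is None:
        return None
    return value.decode(encoding)
```

Without automatic conversion, every wrapper function ends up calling helpers like these on each string it passes to or receives from C, which is the boilerplate the comment above wants to drop.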
I'm glad you brought up the automatic encoding and decoding. We couldn't take advantage of this in the past because it complicated simultaneous support of Python2 and Python3. Now that we've dropped support for Python2, we can certainly implement this, and also look for other opportunities for enhancements and taking advantage of some of Cython's features that the project has been missing out on. The consensus is that we will ship the C code with the project, as it will allow more flexibility in making it available for more users. I created a Project in GitHub: https://github.com/PySlurm/pyslurm/projects/1 We can create issues related to the refactor and keep track of them. I'm not sure if community members have permissions to add to the Project. Let me know if you can't and if you would like access. Thanks! |
We can also use the Discussions feature for discussing the redesign before creating issues per class. |
@giovtorres Sounds good! Yeah it doesn't seem like I have permissions to add things there. Would be nice if you can grant me access if possible. |
Try now. |
Unfortunately still read-only for me there. |
I am not sure how system(atic) y'all want to be with this refactor, particularly regarding the UX.
|
It took me a little bit of searching but here is a translation layer between different LRMS. Just more libraries doing similar tasks. ALSO, some of the folks at the Open Science Grid are interested in this endeavor to make their lives easier, since many machines on the US grid do run Slurm. |
Try now? 🙏🏼 |
Unfortunately still can't add there. |
I can now add issues to this project, but there is still no option to add cards. But that is OK. Does anyone mind if I write up a few enhancement issues focusing on broad goals? |
Hi,
right now I believe the codebase is a bit unpleasant to maintain and add new stuff to, since everything is put into one big file that consists of thousands of lines. From my perspective, in order to have a cleaner interface, it would probably be best to put some effort into splitting up that big file into separate modules where possible, following separation of concerns (e.g. slurmdbd stuff is mostly completely unrelated to other things like partition info).
Also the way classes are used right now is probably not the best way. The classes don't really have a purpose and merely act as a namespace for related functions, returning the information in dictionaries instead of utilizing instance attributes.
For example, all the functions from the partition class could be put into a separate partition.pyx module, and the user can then do calls like partition.get(), partition.create() and so on. No classes are needed in that case, especially when everything is done via dictionaries.
However, it could also be beneficial to turn the current "classes" into actual classes, by defining instance attributes and moving away from the approach of handing out dictionaries, letting the instances themselves hold all the relevant information, e.g. for Partition:
Then one can actually do stuff like:
It would probably be best to improve the classes like this, and the code would be much more maintainable when also split into different files. Also, there is plenty of code that could use an update; not all things are safe and guaranteed to work when pushing a new major version. Besides that, doing from .pyslurm import * is also pretty evil in such a big file 😂
I would definitely, if desired and whenever I have the time, put some effort into the things described above (and am already doing so for the Partition case). It would for sure take a while to do everything, but it could be upgraded gradually. Now, of course, doing things like this might be API breaking, but the current pyslurm.pyx file doesn't have to be touched when simply branching out into separate modules.
These are just some thoughts I had on the current code.
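As an illustration of the "actual classes" idea above, here is a minimal sketch of a partition object that keeps its data in instance attributes instead of handing out dictionaries. The attribute names `max_nodes` and `state` and the helpers `from_dict()` and `to_dict()` are assumptions made up for this example, not the existing pyslurm API.

```python
class Partition:
    """Hypothetical partition object that stores its data as attributes."""

    def __init__(self, name, max_nodes=None, state="UP"):
        self.name = name
        self.max_nodes = max_nodes
        self.state = state

    @classmethod
    def from_dict(cls, info):
        """Build an instance from a dict shaped like to_dict() output."""
        return cls(**info)

    def to_dict(self):
        """Return the attributes as a plain dict, e.g. for serialization."""
        return {
            "name": self.name,
            "max_nodes": self.max_nodes,
            "state": self.state,
        }

    def __repr__(self):
        return f"Partition(name={self.name!r}, state={self.state!r})"
```

A caller could then write `part = Partition("debug", max_nodes=4)` followed by `part.state = "DOWN"`, working with attributes directly while still being able to get a dictionary back through `to_dict()` where one is needed.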