periodically hangs entire parasol hub when listing jobs #28

Open
joelarmstrong opened this issue Mar 16, 2015 · 2 comments
@joelarmstrong
Collaborator

This has been a problem for a while, but I'm just putting an issue up so I remember to fix this somehow.

When parasol has more than a million or so jobs queued, like now, the periodic "parasol -extended list jobs" command that jobTree runs hangs the entire parasol hub process for a couple of minutes while it gets a listing of every job. This sucks: the hub can't issue new jobs while it's busy sending the list of queued jobs to jobTree, so the cluster nodes start to go idle waiting for work. This gets even worse when there are a few jobTrees running; the cluster sometimes sits completely idle for several minutes.

We (read: I) should try to find some way around listing every job, maybe by looking to see if there's a way we can get the same information, but limited to just the jobTree batch rather than all batches. If there isn't a way currently, maybe modify parasol to include that functionality.
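
Until a per-batch listing exists, one stopgap is to make each jobTree process ask the hub less often. The following is only a rough sketch of that idea, not what jobTree currently does: `ThrottledJobLister` and its `list_jobs` callable are hypothetical names, and `list_jobs` stands in for whatever function actually runs the listing command. It does not reduce the cost of a single listing, and separate jobTree processes would still poll independently.

```python
import time
from typing import Callable, List, Optional


class ThrottledJobLister:
    """Reuse the last job listing until min_interval seconds have passed.

    Hypothetical helper: list_jobs is any callable that performs the
    expensive listing and returns one entry per job.
    """

    def __init__(self, list_jobs: Callable[[], List[str]], min_interval: float,
                 clock: Callable[[], float] = time.monotonic):
        self._list_jobs = list_jobs
        self._min_interval = min_interval
        self._clock = clock
        self._last_time: Optional[float] = None  # time of the last real listing
        self._cached: List[str] = []

    def get(self) -> List[str]:
        now = self._clock()
        # Only hit the hub if we have never listed, or the cache is stale.
        if self._last_time is None or now - self._last_time >= self._min_interval:
            self._cached = self._list_jobs()
            self._last_time = now
        # Return a copy so callers cannot mutate the cached listing.
        return list(self._cached)
```

The trade-off is staleness: jobTree would see finished or failed jobs up to `min_interval` seconds late, so the interval would have to be chosen against how quickly it needs to react.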

@joelarmstrong joelarmstrong self-assigned this Mar 16, 2015
@diekhans
Collaborator

This is another reason to have each jobTree job do multiple cactus alignments. If parasol can't handle this, no other scheduler can.


@benedictpaten
Owner

The problem is that parasol does not provide a means to list only the jobs of a given user. Adding Galt.

