Display job reexecutions and job id #225

dgmora · 2024-12-07T11:32:23Z

Background / problem

Some things about job instances, executions and reexecutions are hard to distinguish at first glance. I think one factor is that I'm not used to the terminology or internals, since I've mostly used Sidekiq.

An example:

Here you can't see which instance each of these jobs belongs to. Three may be related to a759850e-3b87-44be-ba63-3d98ab92771b and one to c24aa402-b669-437c-80ea-e3b603e87dc2. Knowing the instance helps you understand that the same job (instance) has been reexecuted multiple times because of an exception.

Sometimes when you click one of these "finished jobs", you land on a job with the finished label, while other times you land on a scheduled job (because it's still being reexecuted due to an exception). It's also a bit surprising to me that these are called "jobs" when we see multiple rows for the same job instance.

Then, in scheduled jobs, you can't see whether a job is part of a reexecution or not.

Changes

Added the first part of the job id to the list. This makes reexecutions a bit easier to understand.

Change:

Added a reexecution label to the jobs in the finished/scheduled lists:

Change:

and in the scheduled view:

Change:
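As a rough illustration of the two display changes above, here is a minimal plain-Ruby sketch. It is not the code from this PR: `JobRow`, `short_job_id` and `reexecution_label` are hypothetical names, and treating `executions > 0` as "this row is a reexecution" is an assumption about how the label is decided.

```ruby
# Hypothetical stand-in for a job row in the list; not a Mission Control class.
JobRow = Struct.new(:active_job_id, :executions, keyword_init: true)

# First segment of the job id (UUID), which is what the list would show.
def short_job_id(active_job_id)
  active_job_id.to_s.split("-").first.to_s
end

# Only return a label when the row is a reexecution.
# Assumption: executions > 0 means the job already ran at least once.
def reexecution_label(job)
  count = job.executions.to_i
  count > 0 ? "reexecution #{count}" : nil
end

rows = [
  JobRow.new(active_job_id: "a759850e-3b87-44be-ba63-3d98ab92771b", executions: 0),
  JobRow.new(active_job_id: "a759850e-3b87-44be-ba63-3d98ab92771b", executions: 1),
  JobRow.new(active_job_id: "c24aa402-b669-437c-80ea-e3b603e87dc2", executions: 0)
]

rows.each do |job|
  puts [short_job_id(job.active_job_id), reexecution_label(job)].compact.join(" ")
end
```

With the short id next to each row, rows that share an id can be read as executions of the same job instance.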

Questions/Asks

I'm not sure if it fits the terminology, but wouldn't "Finished job executions" fit better than "Finished jobs"? There are multiple rows for the same job instance.

I don't mind changes to this PR, so if it can serve as a starting point, feel free to modify it (styles or anything) or create a new one 🙏.

I think it'd also be nice to show in the Finished jobs list whether a job is 'done done' or still being reexecuted / failed / blocked. But with just the information the job has, I didn't see a way.


rosa · 2024-12-07T17:52:00Z

Hey @dgmora, thank you! I'll look into this next week. To answer this question:

There are multiple rows for the same job instance.

The truth is that, because of the way this is implemented, these are different jobs; they are related to each other, but they're different jobs. The way Active Job handles automated retries is to re-enqueue a job with the same arguments, updating the internal metadata on executions and exceptions. Then, the queue adapter proceeds as it wishes with that. In the case of Resque (and I think Sidekiq is the same), finished jobs aren't kept; jobs are discarded when they're run, so it seems as if there was only one job. In reality there are multiple jobs, and the ones that ran are simply no longer in the system. Mission Control supports Resque too, and in that case, this won't be as relevant. Solid Queue can also behave that way if you set preserve_finished_jobs to false.

Now, Solid Queue could perhaps do something different here: instead of creating a completely new job when Active Job calls enqueue, it could update the existing record for the same Active Job ID. I went with creating a new job because I thought it was safer and more consistent with Active Job's calls to enqueue, and also useful for seeing the retries that actually ran.

Now I understand what you meant by scheduled: in this case, you have a delay for the retries, so you're seeing both jobs that finished and were retried, and jobs that are scheduled because they were enqueued with the delay given by the retry_on options.
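To make the behaviour described above concrete, here is a toy plain-Ruby model. It is only a sketch, not Solid Queue's or Active Job's code: `ToyQueue`, `Row` and their methods are hypothetical, and the `preserve_finished_jobs:` keyword merely mimics the setting of the same name. Each enqueue inserts a new row, a retry reuses the Active Job ID with an incremented executions counter, and the retry delay makes the new row a scheduled one.

```ruby
require "securerandom"

# Toy row: each enqueue creates one of these (hypothetical, for illustration).
Row = Struct.new(:id, :active_job_id, :executions, :scheduled_at, :finished_at,
                 keyword_init: true)

class ToyQueue
  attr_reader :rows

  def initialize(preserve_finished_jobs: true)
    @preserve_finished_jobs = preserve_finished_jobs
    @rows = []
    @next_id = 0
  end

  # Always inserts a new row, even when the Active Job ID is already known.
  def enqueue(active_job_id:, executions: 0, wait: 0, now: Time.now)
    @next_id += 1
    row = Row.new(id: @next_id, active_job_id: active_job_id,
                  executions: executions, scheduled_at: now + wait)
    @rows << row
    row
  end

  # Runs the block for a row. On failure, the row still counts as finished
  # and a new row is enqueued with the same Active Job ID and a delay.
  # Returns the retry row, or nil when the block succeeded.
  def run(row, retry_wait:, now: Time.now)
    failed = false
    begin
      yield
    rescue StandardError
      failed = true
    end
    finish(row, now)
    return nil unless failed

    enqueue(active_job_id: row.active_job_id, executions: row.executions + 1,
            wait: retry_wait, now: now)
  end

  private

  # Keep the finished row, or discard it as Resque-style adapters would.
  def finish(row, now)
    if @preserve_finished_jobs
      row.finished_at = now
    else
      @rows.delete(row)
    end
  end
end

queue = ToyQueue.new
start = Time.at(0)
first = queue.enqueue(active_job_id: SecureRandom.uuid, now: start)
queue.run(first, retry_wait: 30, now: start) { raise "boom" }

queue.rows.each do |row|
  state = row.finished_at ? "finished" : "scheduled"
  puts "#{row.active_job_id[0, 8]} executions=#{row.executions} #{state}"
end
```

In this model, keeping finished rows leaves two rows with the same Active Job ID (one finished, one scheduled), while `ToyQueue.new(preserve_finished_jobs: false)` leaves only the scheduled retry, which is why it can look as if there were only one job.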

I'll look into this more closely next week, together with #204, as that's related 😊

dgmora added 3 commits December 7, 2024 11:00

Add reexecution number to job list

e9fe231

Only display if it's a reexecution and before delayed

Add job reexecution to finished and delayed job page

55ce711

Add short job id to the job list

cc9d3d6

dgmora mentioned this pull request Dec 7, 2024

Behavior of retried jobs #181

Open
