Stagger more reporting jobs #11116

Merged
h-m-m merged 1 commit into main from hmm/more-job-staggering
Aug 23, 2024
h-m-m commented Aug 20, 2024

🎫 Ticket

See #11115 and #11030

🛠 Summary of changes

PR #11115 is likely to get at the cause of our intermittent reporting problems. That said, this PR, along with #11030, gives us some extra insurance that the reports will run regardless of other efforts. Since this is a fairly small and safe change, I'm suggesting that we do this while also focusing on the more important line of improvement, which is work like #11115.
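
For readers unfamiliar with the idea, staggering means giving each scheduled job its own start offset so that they do not all begin at once. The following is a minimal sketch of that idea only, not the change made in this PR: the job names, the start time, and the 10-minute gap are all hypothetical.

```python
from datetime import datetime, timedelta


def stagger_start_times(job_names, first_start, gap):
    """Return {job_name: start_time}, spacing each job `gap` after the previous one."""
    return {name: first_start + i * gap for i, name in enumerate(job_names)}


# Hypothetical job names and a hypothetical 10-minute gap, for illustration only.
schedule = stagger_start_times(
    ["report_a", "report_b", "report_c"],
    first_start=datetime(2024, 8, 20, 0, 0),
    gap=timedelta(minutes=10),
)

for name, start in schedule.items():
    print(name, start.strftime("%H:%M"))
# report_a 00:00
# report_b 00:10
# report_c 00:20
```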

@mitchellhenke

I think it might be preferable to pursue approaches like #11115 and better understand why we're hitting limits. There are a limited number of jobs that can run in parallel, so the delay may not necessarily achieve what we're after, depending on how the queuing and run length work out at a given time.
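
The run-length concern above can be illustrated with a small sketch. It assumes hypothetical start times and run lengths in minutes, and it deliberately ignores queuing behind a worker limit, so it only shows that staggered starts reduce overlap when jobs finish before the next one begins.

```python
def peak_concurrency(start_minutes, run_minutes):
    """Return the largest number of jobs running at the same moment."""
    events = []
    for start, length in zip(start_minutes, run_minutes):
        events.append((start, 1))            # job begins
        events.append((start + length, -1))  # job ends
    # At equal times, process ends (-1) before starts (+1) so that
    # back-to-back jobs are not counted as overlapping.
    events.sort(key=lambda e: (e[0], e[1]))
    running = peak = 0
    for _, delta in events:
        running += delta
        peak = max(peak, running)
    return peak


# Hypothetical numbers: three jobs, each running for 5 minutes.
print(peak_concurrency([0, 1, 2], [5, 5, 5]))     # 3: a 1-minute stagger still overlaps
print(peak_concurrency([0, 10, 20], [5, 5, 5]))   # 1: a 10-minute stagger separates them
print(peak_concurrency([0, 10, 20], [25, 5, 5]))  # 2: one long job overlaps the others anyway
```

In this simplified model, the benefit of a delay depends entirely on how long each job actually runs, which is consistent with the caveat raised in the comment.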

@h-m-m h-m-m Aug 20, 2024

One minute is not enough of a delay, based on https://gsa-tts.slack.com/archives/CMW9H0RFX/p1724077904549099?thread_ts=1722271455.385999&cid=CMW9H0RFX

Two bits is 25, so delaying for a bit must be ~12 minutes, eh 🤨?

h-m-m commented Aug 20, 2024

> I think it might be preferable to pursue approaches like #11115 and better understand why we're hitting limits.

On the one hand, I absolutely agree. I wrote "the more important line of improvement that is work like #11115" because I believe you're right.

On the other hand, customers are complaining that they aren't getting their reports, and I'd like to not have another week where that happens if there's a chance we can avoid it easily. I think we can do both. Let me know if you think I'm missing something that means we can't.

@h-m-m h-m-m force-pushed the hmm/more-job-staggering branch from 8e338bb to 8b73e17 Compare August 20, 2024 14:21
changelog: Internal, Reporting, further stagger the delay of reporting jobs so we don't overwhelm other systems
@h-m-m h-m-m force-pushed the hmm/more-job-staggering branch from 8b73e17 to f710b16 Compare August 20, 2024 14:48

@mitchellhenke mitchellhenke left a comment

> On the one hand, I absolutely agree. I wrote "the more important line of improvement that is work like #11115" because I believe you're right
>
> On the other hand, customers are complaining that they aren't getting their reports and I'd like to not have another week where that happens if there's a chance we can avoid it easily. I think we can do both. Let me know if you think I'm missing something that means we can't

No, I don't think anything is missing, and I think that a shorter-term fix makes sense. I do think we should look into the rate and volume of queries and address any issues there as a follow-up, though.

@h-m-m h-m-m marked this pull request as ready for review August 23, 2024 13:58
@h-m-m h-m-m merged commit a844517 into main Aug 23, 2024
@h-m-m h-m-m deleted the hmm/more-job-staggering branch August 23, 2024 13:58
2 participants