Background Jobs: Rewrite RecurringHostedServiceBase with SemaphoreSlim and add signalling support #22331
Conversation
…and triggering immediate executions
…nner Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
|
I've reviewed and will merge #22330 @ronaldbarendse, but this one has a few conflicts and comments to consider. Please can you look at resolving/addressing those, and then you can re-target this to |
# Conflicts: # src/Umbraco.Infrastructure/BackgroundJobs/IRecurringBackgroundJob.cs # src/Umbraco.Infrastructure/BackgroundJobs/RecurringBackgroundJobHostedService.cs # src/Umbraco.Infrastructure/BackgroundJobs/RecurringBackgroundJobHostedServiceRunner.cs # src/Umbraco.Infrastructure/Extensions/ServiceCollectionExtensions.cs # src/Umbraco.Infrastructure/HostedServices/RecurringHostedServiceBase.cs # tests/Umbraco.Tests.UnitTests/Umbraco.Infrastructure/HostedServices/RecurringHostedServiceBaseTests.cs
…roundJobCanceledNotification
AndyButland
left a comment
Thanks @ronaldbarendse - it looks like a solid and robust rewrite to me. Can you just fill me in a bit on the background of what triggered you to make these updates? Is it just something you saw that could be improved? Or are you running into real-world issues that this update should solve, or have features in mind that depend on this being updated in core?
Then there are various internal jobs that aren't updated to use the new API - TempFileCleanupJob, ReportSiteJob, TouchServerJob, and InstructionProcessJob still implement only RunJobAsync() (without the CancellationToken). I think if we are going to change this, we should make sure internal code is now calling the non-obsolete overloads.
…ring the initial delay
|
@AndyButland All CMS jobs are now using the new I've also added support for setting either/both This makes all the following use-cases possible:
|
|
Great @ronaldbarendse, thanks. I'll take a final look over this tomorrow (travelling today) and look to get this in. As it stands it will be 17.6 and 18.1, but if you think there is good reason to accelerate that, please let me know. We should think about documentation for this too I would suggest. I can have a stab at that unless you are minded to, but given there's quite a bit of flexibility here now for custom jobs, it's worth having it documented, so I've added the "needs docs" label. |
Having this available in 18.0 would allow the products to start taking advantage of this straight away. Otherwise, we'd push these improvements to v19, since we don't want to force users to upgrade the CMS before the product (especially for Deploy: we want to keep the minimum CMS version at x.0.0). Also note that these changes are all additive and backwards compatible!
This does indeed require updating the docs page, please do have a stab at this (there's more than enough information in this PR to have a decent restructure kicked-off by Claude) 👍🏻 |
… loop, restoring previous timer behaviour.
RecurringHostedServiceBase with SemaphoreSlim and add signalling support
|
I've taken a further pass through this and pushed a few changes on top of your work @ronaldbarendse. Please see the summary below, plus details on a runtime issue I uncovered while testing and the fix for it. If you spot any concerns, please shout. Or if it looks OK to you, of course, please confirm.
Changes
Three minor bits of clean-up/nit-picking, and one bug fix.
1.
|
| Job | Purpose |
|---|---|
| `HeartbeatJob` | Plain `RecurringBackgroundJobBase`, 15-s period, 5-s delay; confirms the basic semaphore loop |
| `TriggerableHeartbeatJob` | `ITriggerableRecurringBackgroundJob`, 2-min period; long enough that manual triggers via `IRecurringBackgroundJobTrigger<TJob>` are clearly visible |
| `ManualOnlyJob` | `Period` and `Delay` both `Timeout.InfiniteTimeSpan`; runs only when triggered |
| `CancellationAwareJob` | 30-s loop awaiting `Task.Delay(ct)` inside `RunJobAsync(CancellationToken)` |
| `IgnoredDelayDemoJob` | `ServerRoles = [Subscriber]`, `Period = TimeSpan.Zero`, `IgnoredDelay = 20s`; always ignored on a single-server dev setup, where the back-off prevents the tight loop |
Plus a BackgroundJobsTestController exposing endpoints to fire each TriggerExecution overload (None, Reset, Replace, custom delay) against TriggerableHeartbeatJob, and to fire ManualOnlyJob.
Verified:
- All five jobs register correctly at boot.
- `ManualOnlyJob` logs `delay -00:00:00.0010000 every -00:00:00.0010000`: this is `Timeout.InfiniteTimeSpan` rendered via the existing framework logging (cosmetic).
- `IgnoredDelayDemoJob` is correctly ignored on every iteration with the `RecurringBackgroundJobIgnoredNotification` published at a clean ~20-second cadence; the `IgnoredDelay` back-off is working.
- `HeartbeatJob` and `CancellationAwareJob` fire on their declared schedules once the server role is resolved.
- `CancellationAwareJob` cancels cleanly on app shutdown: `OperationCanceledException` propagates through `RunJobAsync(ct)`, the `RecurringBackgroundJobCanceledNotification` is published, and the loop exits.
- `TriggerableHeartbeatJob` reacts to all four trigger overloads as documented:
  - `TriggerExecution()` (`None`): immediate run, original schedule resumes
  - `TriggerExecution(NextExecutionStrategy.Reset)`: immediate run, next tick a full period later
  - `TriggerExecution(NextExecutionStrategy.Replace)`: immediate run, the originally-scheduled tick is skipped
  - `TriggerExecution(TimeSpan)`: immediate run, next tick after the custom delay
- `ManualOnlyJob` only fires when its trigger endpoint is hit.
- `IRecurringBackgroundJobTrigger<TJob>.TriggerExecution(...)` returns `false` before `StartAsync` and `true` afterwards, as expected from the runner lookup.
- `DistributedJobService` scope exception described above no longer reproduces after the `SuppressFlow` fix.
…RecurringBackgroundJob
… for the remaining application lifecycle
…Slim` and add signalling support (#22331)
* Compute next delay to compensate for time drift
* Use SemaphoreSlim to properly handle exceptions, cancellation tokens and triggering immediate executions
* Add RecurringBackgroundJobBase to contain default values and hide obsoleted method
* Add NextExecutionStrategy parameter to adjust the schedule after triggered executions
* Add TriggerExecution methods to RecurringBackgroundJobHostedServiceRunner
  Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
* Handle cancellation (application shutdown) and publish RecurringBackgroundJobCanceledNotification
* Match hosted services by Type instead of type name string
* Extract shared helper for TriggerExecution tests
* Clear trigger state when initial delay is interrupted
* Clear _nextExecutionSkipOnOvershoot unconditionally
* Combine ComputeNextDelay tests
* Consolidate trigger state into an immutable record for thread safety
* Use ConcurrentDictionary for thread-safe hosted service lookup
* Remove hosted services from dictionary on stop
* Fix API compatibility errors
* Removed unneeded using.
* Register RecurringBackgroundJobHostedServiceRunner as resolvable singleton
* Remove failed hosted service from dictionary when StartAsync throws
* Use semaphore signaling instead of Task.Delay in trigger tests
  Use semaphore signaling instead of Task.Delay in trigger tests 2
* Inject TimeProvider into RecurringHostedServiceBase for deterministic testing
  Fix timeprovider
* Use DelayCalculator.GetDelay instead of RecurringHostedServiceBase.GetDelay
* Fix Exception_In_PerformExecuteAsync_Does_Not_Kill_Loop test
* Avoid disposing period-change CTS while wait loop may still reference it
* Configure IEventMessagesFactory mock to return real EventMessages
* Clarify TriggerExecution(TimeSpan) docs and add ChangePeriod test
* Validate period is positive and use GetOrAdd to avoid creating unused hosted services
* Set up Period and Delay on mock job to satisfy constructor validation
* Ensure PeriodChanged event is unsubscribed again
* Fix trigger state race, simplify ReleaseSignal, and add canceled notification test
  Fix trigger state
* Use Interlocked for _period reads/writes and implement thread-safe dispose pattern
* Remove hosted service from dictionary before stopping to prevent triggering during shutdown
* Replace Task.Yield with semaphore timeouts in negative assertions
* Tidy RecurringBackgroundJobBase docs and runner error handling
* Wait IgnoredDelay after ignored execution to prevent tight looping when Period is short or zero
* Add IRecurringBackgroundJobTrigger<TJob> for opt-in job triggering
* Register IRecurringBackgroundJobTrigger as open generic and drop AddTriggerableRecurringBackgroundJob
* Fix and add parameter validation
* Allow Timeout.InfiniteTimeSpan as Period for manual-trigger-only recurring jobs
* Migrate built-in jobs to RecurringBackgroundJobBase and require ITriggerableRecurringBackgroundJob in runner trigger overloads
* Support infinite Delay and honor TriggerExecution(TimeSpan) issued during the initial delay
* Handle edge case of backoff via InfiniteTimeSpan.
* Refactored large method.
* Added clarifying documentation.
* Suppress ExecutionContext flow when starting the recurring background loop, restoring previous timer behaviour.
* Relocate Suppress ExecutionContext flow to avoid package validation error.
* Align IRecurringBackgroundJobTrigger generic type constraint with AddRecurringBackgroundJob
* Rename ApplyTriggerState to ComputeNextDelayFromTriggerState
* Allow Timeout.InfiniteTimeSpan as IgnoredDelay to fully disable a job for the remaining application lifecycle
* Fix generic type constraint
---------
Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Co-authored-by: Andy Butland <abutland73@gmail.com>
|
Cherry picked to |
Note
This PR is based on #22330 (period drift fix). Review that PR first — this one contains only the additional commits.
Summary
Replace `Timer` with `SemaphoreSlim`-based loop on `BackgroundService`
The previous implementation used `System.Threading.Timer`, which executes callbacks on the ThreadPool. This had several issues:
- `PerformExecuteAsync(object? state)` had no way to observe host shutdown. The new `PerformExecuteAsync(CancellationToken)` overload enables cooperative cancellation.
The rewrite inherits from `BackgroundService` and uses a `SemaphoreSlim(0, 1)` as the wait primitive inside `ExecuteAsync`. `SemaphoreSlim.WaitAsync(TimeSpan, CancellationToken)` was chosen over `PeriodicTimer` because it solves multiple problems simultaneously:
- `CancellationToken`
Add `TriggerExecution()` for on-demand signalling
The `SemaphoreSlim` wait can be interrupted by releasing the semaphore, causing the loop to execute immediately. Four `protected internal` overloads on `RecurringHostedServiceBase` control what happens after the triggered execution:
- `TriggerExecution()`
- `TriggerExecution(NextExecutionStrategy.Reset)`
- `TriggerExecution(NextExecutionStrategy.Replace)`
- `TriggerExecution(TimeSpan nextDelay)`
Triggering an `IRecurringBackgroundJob` (opt-in)
The four `TriggerExecution(...)` instance methods above are `protected internal`, so subclasses of `RecurringHostedServiceBase` (custom hosted services that don't go through `IRecurringBackgroundJob`) can trigger themselves directly. For `IRecurringBackgroundJob` implementations, triggering is now an explicit opt-in to keep the public API surface small and intentional:
1. Mark the job with `ITriggerableRecurringBackgroundJob` (an empty marker interface extending `IRecurringBackgroundJob`):
2. Register the job the usual way:
3. Inject `IRecurringBackgroundJobTrigger<MyJob>` where you want to trigger it:
`IRecurringBackgroundJobTrigger<>` is registered once as an open generic by `AddBackgroundJobs`, so the typed trigger is resolvable for any job that opts in via the `ITriggerableRecurringBackgroundJob` marker; no per-job DI registration is required. The generic constraint on the trigger interface enforces opt-in at compile time: requesting `IRecurringBackgroundJobTrigger<NotMarked>` is a compile error.
The typed trigger exposes the same overloads as the base class (`TriggerExecution()`, `TriggerExecution(NextExecutionStrategy)`, `TriggerExecution(TimeSpan)`) and returns `false` if no hosted service is currently running for the job (e.g. before `StartAsync`, or if the job is not registered). Internally it delegates to `RecurringBackgroundJobHostedServiceRunner`, whose `TriggerExecution<TJob>` overloads are `internal` and not part of the public API.
Add `RecurringBackgroundJobBase` abstract class
Default values for `Delay`, `ServerRoles`, and the `PeriodChanged` event previously lived as default interface implementations on `IRecurringBackgroundJob`. A base class is the more natural place for these defaults. `RecurringBackgroundJobBase` also hides the now-obsoleted parameterless `RunJobAsync()` by routing it to `RunJobAsync(CancellationToken)`. Implementors only need to provide `Period` and `RunJobAsync(CancellationToken)`.
The `DefaultDelay` and `DefaultServerRoles` constants are declared as `protected internal static readonly` on the base class: `protected` for subclasses, `internal` so the default interface implementations (kept for backward compatibility with direct `IRecurringBackgroundJob` implementors) can still reference them.
Add `IgnoredDelay` to prevent tight looping when execution is skipped
The hosted service short-circuits `PerformExecuteAsync` and publishes a `RecurringBackgroundJobIgnoredNotification` when the runtime is not ready, the current server role is not allowed, or this is not the MainDom. Previously this returned immediately, so a job with a very short or zero `Period` (e.g. one that throttles itself via a semaphore inside `RunJobAsync`) would spin at 100% CPU and flood logs and notifications whenever the CMS chose to ignore it (see Umbraco.Engage.Issues#65 and #22859).
A new `IRecurringBackgroundJob.IgnoredDelay` property (default 1 minute, overridable per job) is now awaited after each ignored execution before the loop continues. The wait uses the injected `TimeProvider` for deterministic testing, is cancellable via the `stoppingToken`, and is skipped entirely when `IgnoredDelay <= TimeSpan.Zero`.
A regular execution path (job actually runs) is unaffected; back-off is only applied when the CMS prevents the job from running.
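As a rough illustration of the back-off rules described above (wait the configured delay, skip it when the delay is zero or negative, abort on shutdown), here is a minimal sketch that uses only the .NET 8 base class library. `BackOffAfterIgnoredAsync` is a hypothetical helper name, not code from this PR. The sketch also does not model the `Timeout.InfiniteTimeSpan` special case for `IgnoredDelay` that a later commit mentions.

```csharp
using System;
using System.Threading;
using System.Threading.Tasks;

// Hypothetical helper (not the PR's code): applies the back-off rules described above.
// Returns true when a wait took place, false when the back-off was skipped.
static async Task<bool> BackOffAfterIgnoredAsync(
    TimeSpan ignoredDelay, TimeProvider timeProvider, CancellationToken stoppingToken)
{
    // Zero or negative delay: skip the back-off entirely.
    if (ignoredDelay <= TimeSpan.Zero)
    {
        return false;
    }

    // Task.Delay with a TimeProvider lets tests substitute a controllable clock.
    await Task.Delay(ignoredDelay, timeProvider, stoppingToken);
    return true;
}

using var cts = new CancellationTokenSource();

bool waitedOnZero = await BackOffAfterIgnoredAsync(TimeSpan.Zero, TimeProvider.System, cts.Token);
bool waitedOnShort = await BackOffAfterIgnoredAsync(TimeSpan.FromMilliseconds(20), TimeProvider.System, cts.Token);

// Simulate host shutdown: a cancelled token aborts the wait.
cts.Cancel();
bool cancelled = false;
try
{
    await BackOffAfterIgnoredAsync(TimeSpan.FromMinutes(1), TimeProvider.System, cts.Token);
}
catch (OperationCanceledException)
{
    cancelled = true;
}

Console.WriteLine($"zero: {waitedOnZero}, short: {waitedOnShort}, cancelled: {cancelled}");
```

Passing the `TimeProvider` explicitly is what would let a test advance time manually instead of sleeping.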
Add `CancellationToken` to `RunJobAsync`
The new `RunJobAsync(CancellationToken)` overload on `IRecurringBackgroundJob` enables cooperative cancellation during host shutdown. The parameterless `RunJobAsync()` is obsoleted with a default interface implementation that delegates to the new overload (scheduled for removal in Umbraco 19).
Test plan
- `TriggerExecution` strategies (`None`, `Reset`, `Replace`, custom delay)
- `None` strategy overshoot-skip behavior
- `ChangePeriod` taking effect on next cycle
- `true`/`false` based on job registration, verifies immediate execution
- `IRecurringBackgroundJobTrigger<TJob>` typed trigger tests (delegation to runner)
- `IgnoredDelay` back-off tests: waits the configured delay, honors per-job override, skipped when zero, cancellable on shutdown
- `RecurringBackgroundJobHostedServiceTests` updated to use new signatures
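For readers unfamiliar with the wait primitive discussed in the summary, the following is a simplified sketch of a `SemaphoreSlim(0, 1)` loop that runs either when the period elapses or when it is signalled early, and that stops on cancellation. It is not the implementation in this PR. `Trigger`, `LoopAsync`, and the 30-second period are illustrative assumptions, and schedule adjustment via `NextExecutionStrategy` is left out.

```csharp
using System;
using System.Threading;
using System.Threading.Tasks;

// Initial count 0 makes the loop wait; max count 1 collapses repeated triggers into one.
var signal = new SemaphoreSlim(0, 1);
var period = TimeSpan.FromSeconds(30); // illustrative period
var firstRun = new TaskCompletionSource(TaskCreationOptions.RunContinuationsAsynchronously);
int runs = 0;
bool lastRunWasTriggered = false;

// Hypothetical stand-in for TriggerExecution(): wakes the loop before the period elapses.
void Trigger()
{
    try { signal.Release(); }
    catch (SemaphoreFullException) { /* a trigger is already pending */ }
}

async Task LoopAsync(CancellationToken stoppingToken)
{
    while (true)
    {
        // true: released by Trigger(); false: the period elapsed.
        // Throws OperationCanceledException when stoppingToken is cancelled.
        lastRunWasTriggered = await signal.WaitAsync(period, stoppingToken);
        try
        {
            runs++; // placeholder for the job's work
            firstRun.TrySetResult();
        }
        catch (Exception ex) when (ex is not OperationCanceledException)
        {
            // A failing run is logged and the loop keeps going.
            Console.WriteLine($"Run failed, loop continues: {ex.Message}");
        }
    }
}

using var cts = new CancellationTokenSource();
Task loop = LoopAsync(cts.Token);

Trigger();                                               // on-demand execution
await firstRun.Task.WaitAsync(TimeSpan.FromSeconds(5));  // wait until the triggered run happened

cts.Cancel();                                            // simulate host shutdown
try { await loop; } catch (OperationCanceledException) { }

Console.WriteLine($"runs: {runs}, triggered: {lastRunWasTriggered}");
```

A single `WaitAsync(TimeSpan, CancellationToken)` call covers the timeout, the early signal, and shutdown, which seems to be the main appeal over a timer callback.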