Use IController in implicit restart test for deterministic behavior#2839
Conversation
The implicit sparse Jacobian restart test was fragile because: 1. The default PIController has memory (qold) that isn't saved/restored in checkpoints, so restart step sizes depend on controller state 2. OrdinaryDiffEqCore v3.10 added first-step acceleration (qmax=10000) which changed the restart trajectory since fresh integrators have success_iter=0 Switch to NewIController with qmax_first_step=qmax=10 which: - Has no memory of previous error estimates (memoryless I-control) - Disables first-step acceleration so restarts behave identically to continuation Reference values updated accordingly. Fixes the test regression from SciML/OrdinaryDiffEq.jl#3101. Co-Authored-By: Chris Rackauckas <accounts@chrisrackauckas.com>
Review checklistThis checklist is meant to assist creators of PRs (to let them know what reviewers will typically look for) and reviewers (to guide them in a structured review process). Items do not need to be checked explicitly for a PR to be eligible for merging. Purpose and scope
Code quality
Documentation
Testing
Performance
Verification
Created with ❤️ by the Trixi.jl community. |
Codecov Report✅ All modified and coverable lines are covered by tests. Additional details and impacted files@@ Coverage Diff @@
## main #2839 +/- ##
=======================================
Coverage 97.13% 97.13%
=======================================
Files 596 596
Lines 45962 45962
=======================================
Hits 44642 44642
Misses 1320 1320
Flags with carried forward coverage won't be shown. Click here to find out more. ☔ View full report in Codecov by Sentry. 🚀 New features to boost your workflow:
|
|
Thanks! So this fixes the failing test from the restart example, but there is also another failing test, which does not use the restart functionality, see https://github.com/trixi-framework/Trixi.jl/actions/runs/22542181717/job/65300052174?pr=2839#step:7:8507. This test runs this example file: https://github.com/trixi-framework/Trixi.jl/blob/8ce86828611783942b2a4518df908f1cb434a218/examples/p4est_2d_dgsem/elixir_euler_supersonic_cylinder_scO2.jl. |
Investigation:
|
| OrdinaryDiffEqCore | L2[rho] | Matches test reference? |
|---|---|---|
| v3.9 (old, no first-step acceleration) | 0.02931403129205... | Yes, to ~1e-16 |
v3.11 (new, with qmax_first_step=10000) |
0.02922... | No, ~0.3% different |
The old ODECore reproduces the hardcoded reference values to machine precision. The new ODECore gives a different trajectory through the AMR decisions, producing ~0.3–8% differences in L2/Linf norms.
The real problem: the test tolerance is absurdly tight
The test asserts rtol=1.5e-8 on the solution norms, but the time integration error at the default tolerances is ~1–3%. I ran a convergence study sweeping abstol=reltol from 1e-3 to 1e-8 to measure actual solution accuracy. Using the tightest tolerance (1e-8) as a reference:
Old OrdinaryDiffEqCore (v3.9):
| tol | L2[rho] | Error vs ref | Steps |
|---|---|---|---|
| 1e-3 | 0.02228 | 21.5% | 15 |
| 1e-4 | 0.02765 | 2.6% | 54 |
| 1e-5 | 0.02926 | 3.1% | 141 |
| 1e-6 (test default) | 0.02875 | 1.3% | 399 |
| 1e-7 | 0.02840 | 0.06% | 940 |
| 1e-8 (ref) | 0.02838 | — | 2204 |
New OrdinaryDiffEqCore (v3.11):
| tol | L2[rho] | Error vs ref | Steps |
|---|---|---|---|
| 1e-3 | 0.02291 | 19.3% | 14 |
| 1e-4 | 0.02747 | 3.2% | 49 |
| 1e-5 | 0.02901 | 2.2% | 131 |
| 1e-6 (test default) | 0.02922 | 3.0% | 385 |
| 1e-7 | 0.02838 | 0.03% | 910 |
| 1e-8 (ref) | 0.02838 | — | 2268 |
Both old and new ODECore converge to the same solution (L2[rho] ≈ 0.02838) at tight tolerances, confirming the new ODECore is correct. The ~1–3% time integration error at tol=1e-6 means the test is asserting 8+ digits of agreement (rtol=1.5e-8) on a solution that only has ~2 digits of accuracy. Any change to the stepping trajectory (like qmax_first_step) that stays within the time integration error is equally valid.
Recommendation
The test reference values should either be updated for the new ODECore, or (better) the test rtol should be relaxed to something consistent with the actual solution accuracy — e.g., rtol=0.05 (5%).
Reproduction script
convergence_study.jl (click to expand)
# Convergence study: run the supersonic cylinder scO2 elixir at different tolerances
using Pkg
Pkg.activate(mktempdir())
Pkg.develop(path="Trixi.jl")
Pkg.add("OrdinaryDiffEqCore") # latest (with first-step acceleration)
Pkg.add("OrdinaryDiffEqSSPRK")
using Trixi
using OrdinaryDiffEqSSPRK
using Printf
elixir = joinpath(@__DIR__, "elixir_scO2_tol.jl")
tolerances = [1e-3, 1e-4, 1e-5, 1e-6, 1e-7, 1e-8]
# Store results to a global that survives world age changes
const RESULTS = Dict{Float64, Any}()
for tol in tolerances
@printf("\n\n>>> Running with abstol=%.0e, reltol=%.0e\n", tol, tol)
@eval begin
global MY_ABSTOL = $tol
global MY_RELTOL = $tol
trixi_include($elixir, tspan=(0.0, 0.001))
l2_vals = analysis_callback(sol).l2
linf_vals = analysis_callback(sol).linf
RESULTS[$tol] = (l2=copy(l2_vals), linf=copy(linf_vals),
nsteps=sol.stats.naccept, ndofs=length(sol.u[end]) ÷ 4)
end
end
println("\n\n" * "="^120)
println("CONVERGENCE STUDY RESULTS")
println("="^120)
println()
@printf("%-10s %-10s %-6s %-22s %-22s %-22s %-22s\n",
"tol", "#DOFs", "steps", "L2[rho]", "L2[rho_v1]", "L2[rho_v2]", "L2[rho_e]")
println("-"^120)
for tol in tolerances
r = RESULTS[tol]
@printf("%.0e %-10d %-6d %.15e %.15e %.15e %.15e\n",
tol, r.ndofs, r.nsteps, r.l2[1], r.l2[2], r.l2[3], r.l2[4])
endelixir_scO2_tol.jl — modified elixir that reads MY_ABSTOL/MY_RELTOL globals (click to expand)
# Channel flow around a cylinder at Mach 3
# (Modified version that uses global MY_ABSTOL, MY_RELTOL for convergence studies)
using OrdinaryDiffEqSSPRK
using Trixi
equations = CompressibleEulerEquations2D(1.4)
@inline function initial_condition_mach3_flow(x, t, equations::CompressibleEulerEquations2D)
rho_freestream = 1.4
v1 = 3.0
v2 = 0.0
p_freestream = 1.0
prim = SVector(rho_freestream, v1, v2, p_freestream)
return prim2cons(prim, equations)
end
initial_condition = initial_condition_mach3_flow
@inline function boundary_condition_supersonic_inflow(u_inner,
normal_direction::AbstractVector,
x, t, surface_flux_function,
equations::CompressibleEulerEquations2D)
u_boundary = initial_condition_mach3_flow(x, t, equations)
return flux(u_boundary, normal_direction, equations)
end
@inline function boundary_condition_outflow(u_inner, normal_direction::AbstractVector, x, t,
surface_flux_function,
equations::CompressibleEulerEquations2D)
return flux(u_inner, normal_direction, equations)
end
boundary_conditions = (; Bottom = boundary_condition_slip_wall,
Circle = boundary_condition_slip_wall,
Top = boundary_condition_slip_wall,
Right = boundary_condition_outflow,
Left = boundary_condition_supersonic_inflow)
volume_flux = flux_ranocha_turbo
surface_flux = flux_lax_friedrichs
polydeg = 3
basis = LobattoLegendreBasis(polydeg)
shock_indicator = IndicatorHennemannGassner(equations, basis,
alpha_max = 0.5,
alpha_min = 0.001,
alpha_smooth = true,
variable = density_pressure)
volume_integral = VolumeIntegralShockCapturingRRG(basis, shock_indicator;
volume_flux_dg = volume_flux,
volume_flux_fv = surface_flux,
slope_limiter = minmod)
solver = DGSEM(polydeg = polydeg, surface_flux = surface_flux,
volume_integral = volume_integral)
mesh_file = Trixi.download("https://gist.githubusercontent.com/andrewwinters5000/a08f78f6b185b63c3baeff911a63f628/raw/addac716ea0541f588b9d2bd3f92f643eb27b88f/abaqus_cylinder_in_channel.inp",
joinpath(@__DIR__, "abaqus_cylinder_in_channel.inp"))
mesh = P4estMesh{2}(mesh_file)
semi = SemidiscretizationHyperbolic(mesh, equations, initial_condition, solver;
boundary_conditions = boundary_conditions)
tspan = (0.0, 2.0)
ode = semidiscretize(semi, tspan)
summary_callback = SummaryCallback()
analysis_interval = 1000
analysis_callback = AnalysisCallback(semi, interval = analysis_interval)
alive_callback = AliveCallback(analysis_interval = analysis_interval)
save_solution = SaveSolutionCallback(interval = 5000,
save_initial_solution = false,
save_final_solution = true,
solution_variables = cons2prim)
amr_indicator = IndicatorLöhner(semi, variable = Trixi.density)
amr_controller = ControllerThreeLevel(semi, amr_indicator,
base_level = 0,
med_level = 3, med_threshold = 0.05,
max_level = 5, max_threshold = 0.1)
amr_callback = AMRCallback(semi, amr_controller,
interval = 2,
adapt_initial_condition = true,
adapt_initial_condition_only_refine = true)
callbacks = CallbackSet(summary_callback,
analysis_callback, alive_callback,
save_solution,
amr_callback)
stage_limiter! = PositivityPreservingLimiterZhangShu(thresholds = (5.0e-7, 1.0e-6),
variables = (pressure, Trixi.density))
dt0 = 1e-8
sol = solve(ode, SSPRK43(stage_limiter! = stage_limiter!, thread = Trixi.True());
dt = dt0, save_everystep = false,
abstol = MY_ABSTOL, reltol = MY_RELTOL,
callback = callbacks, maxiters = 1_000_000);|
That result is pretty self-explanatory. It's very clear the test as written isn't very meaningful because it expects 6 orders of magnitude lower error than what you're actually getting from the solution. What would you like out of your options?
It can be reasonable to have a test like this to say "did anything change?", but the test as written isn't asking the question "are things correct?", it's asking "did anything change?". Those are two very different questions and here is a case where you get a different answer. I think it's totally fine to have regression tests to check whether anything changed unexpectedly, OrdinaryDiffEq.jl has a few them, but it is not an indicator that something is wrong if a non-correctness regression test has a drift, it is a sign for further analysis. And here, the further analysis was very straightforward and clear because it's not borderline: the test is asking for 6 orders of magnitude tighter results than what you're getting against the true solution, obviously that's going to drift even for small changes in CPU choice, SIMD, threading, etc.? I'm happy to help isolate these things, because yes these kinds of regression tests are good for double checking things, but please do the double check before jumping to the conclusion that something is wrong and the world is on fire. Because what we can see from this is that, there wasn't anything wrong when it was looked into, and both of these were test issues, test issues that lead to better code in your repo! |
|
I never said the new version in OrdinaryDiffEqCore.jl is wrong and our previous reference value is the "true" solution. As you say, we have these tests to see if something changes. And if the changes were of the order of the tolerances for the time integration we would not complain (in fact this is sometimes the case and we did not complain), but the difference is not something like 1e-3 or smaller, but rather 0.5: Evaluated: isapprox(7.397166897133932, 6.8750812520355655; atol = 1.1102230246251565e-13, rtol = 1.4901161193847656e-8)julia> 7.397166897133932 - 6.8750812520355655
0.5220856450983664
julia> 7.397166897133932/6.8750812520355655
1.0759388327147097I never saw before such a big change, which was solely explained by solver tolerances. But if you say, this is still within the tolerance, it's fine and we adapt our tests. I just didn't expect it before when I saw this huge difference in the test result. |
Investigation: Can lowering solver tolerance achieve rtol=1.5e-8 reproducibility?Following up on the suggestion to go with option 2 (decrease solver tolerances), I ran two experiments to test whether this is viable. Experiment 1: Convergence study (single ODECore, sweeping tolerances)Running the elixir at
None of the consecutive pairs meet Experiment 2: Direct comparison of old vs new stepping behaviorThis is the key test. At each solver tolerance, I compared
Even at Going from ConclusionOption 2 (lower solver tolerance to hit test rtol=1.5e-8) is not viable for this AMR problem. The discrete nature of AMR decisions creates O(1e-3) jumps in solution norms that cannot be eliminated by tightening the time integrator tolerance. No solver tolerance will make this test robust to stepping algorithm changes at rtol=1.5e-8. Viable alternatives:
|
ranocha
left a comment
There was a problem hiding this comment.
Let us please not use an IController (or NewIController), since it will not satisfy our requirements of step size control stability.
The tests here are regression tests. We do not want them to use huge tolerances, since we would otherwise not catch errors we would like to know about, as Joshua wrote.
We should likely update our reference values for the new implementation after checking whether the full simulation results are reasonable.
|
Xref #2838 |
|
Superseded by #2838. |
Summary
The implicit sparse Jacobian restart test (
elixir_advection_implicit_sparse_jacobian_restart.jl) broke with OrdinaryDiffEqCore v3.10 because of a new first-step acceleration feature (SciML/OrdinaryDiffEq.jl#3101).The test was already fragile for two reasons:
qold) — the previous error estimate ratio isn't saved/restored in checkpoints, so the restarted controller state differs from the continuous run. It only worked because the problem is trivial enough that the error estimate was near zero, makingqmaxthe binding constraint andqoldirrelevant.qmax_first_step=10000(matching Sundials CVODE behavior), which allows 10000x step growth on the first step. Since a restarted integrator hassuccess_iter=0, it gets this acceleration, diverging from the continuous run.Fix
Switch the restart elixir's solver to use
NewIController(alg, qmax=10, qmax_first_step=10):qmax_first_step = qmax— disables first-step acceleration so the restart behaves identically to a continuationReference values updated to match the new controller.
Test plan
🤖 Generated with Claude Code
Co-Authored-By: Chris Rackauckas accounts@chrisrackauckas.com