
Explicit 2D Diffusion Correlation Operator#966

Merged
travissluka merged 30 commits into develop from feature/diffusion
Nov 10, 2023

Conversation

@travissluka (Contributor) commented Oct 13, 2023

Description

This PR creates a SaberCentralBlock for a soca-specific 2D explicit diffusion operator (following Weaver 2001, and pulling inspiration from the ROMS code). It is meant to handle short correlation lengths; for longer lengths, it's better to use BUMP.

Much like BUMP, using this block is a two-step process:

  • calibration: for the given grid and length scales, the diffusion constants and normalization weights are calculated and saved to a file (see the ctest configuration test/testinput/parameters_diffusion.yml). Two ways of doing the normalization are possible:
    • brute force calculates the normalization exactly, but as you might guess from the name, you don't want to do this for a grid bigger than the 5 deg ctest grid. It's just there to check the code.
    • randomization generates random vectors and applies the TL of the diffusion operator to estimate the normalization values (~10,000 iterations is good for a real case).
  • application: when using the block afterward (e.g. see the dirac yaml test/testinput/dirac_diffusion.yml), the previously saved parameter file is read in.
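For intuition, the randomization estimate can be sketched in a few lines of NumPy. This is an illustrative stand-in (a 1D periodic explicit diffusion playing the role of the square-root operator), not the soca implementation: diag(C) is estimated by averaging the squares of the operator applied to N(0,1) vectors, and the normalization weights are its inverse square root.

```python
import numpy as np

def diffusion_half(x, kappa=0.1, nsteps=50):
    # stand-in for the square root of the correlation operator:
    # explicit diffusion steps on a periodic 1D grid (symmetric stencil,
    # so the operator is self-adjoint)
    for _ in range(nsteps):
        x = x + kappa * (np.roll(x, 1) - 2.0 * x + np.roll(x, -1))
    return x

def randomization_normalization(n, niter=10_000, seed=0):
    # estimate diag(C) = diag(L L^T) by averaging (L xi)**2 over xi ~ N(0,1);
    # the normalization weights are then 1/sqrt(diag(C))
    rng = np.random.default_rng(seed)
    acc = np.zeros(n)
    for _ in range(niter):
        acc += diffusion_half(rng.standard_normal(n)) ** 2
    return 1.0 / np.sqrt(acc / niter)

w = randomization_normalization(64, niter=3000)
```

Applying the normalized operator to a unit spike then gives a value near 1.0 at the spike, up to the ~1/sqrt(niter) sampling error, which is why on the order of 10,000 iterations are needed for a tight normalization.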

Other changes that were made to get this to work:

  • the logic used in the setcorscale application (how the Rossby-radius-based lengths are specified) was modified to mirror exactly what is used in soca_covariance_mod. This required a small change in setcorscale.yaml (min -> min value).
  • gridgen will now also save dx and dy so that they can be used in this filter.
    Existing soca_gridspec.nc files will need to be updated after this PR.

Testing

The following tests were done on my own machine.

ctest

Two ctests are added: one for calibration, one for a Dirac test.

  • if you change the normalization method to brute force, you'll see in the Dirac test that everything is perfectly normalized. The value at the Dirac location stays at 1.0 🎉
  • if you enable the saber adjoint test and comment out line 1573 of src/soca/Fields/soca_fields_mod.F90, you'll see that the self-adjoint test does pass 🎉. The adjoint test can't be left on because it fails if the halos are modified (soca's toFieldset() does a halo exchange, and other tests break if I remove that halo exchange). To be fixed another day.
    image
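The self-adjoint check mentioned above is just a dot-product test: for a self-adjoint operator D, ⟨Dx, y⟩ must equal ⟨x, Dy⟩ for random x and y. A minimal sketch using a 2D periodic explicit-diffusion stand-in (not the soca operator, which additionally has to contend with halo exchanges):

```python
import numpy as np

def diffusion2d(x, kappa=0.1, nsteps=20):
    # explicit 2D diffusion with a symmetric 5-point stencil on a
    # doubly periodic grid; the symmetric stencil makes it self-adjoint
    for _ in range(nsteps):
        x = x + kappa * (np.roll(x, 1, 0) + np.roll(x, -1, 0) +
                         np.roll(x, 1, 1) + np.roll(x, -1, 1) - 4.0 * x)
    return x

rng = np.random.default_rng(1)
x = rng.standard_normal((32, 32))
y = rng.standard_normal((32, 32))
lhs = np.sum(diffusion2d(x) * y)   # <Dx, y>
rhs = np.sum(x * diffusion2d(y))   # <x, Dy>
```

The two inner products agree to round-off; in the real operator the test fails when halo points are modified between applications, which is the halo-exchange caveat above.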

realistic scales on a 1 deg grid

  • length scale = 1.5 x Rossby radius, minimum size of 2 grid boxes
  • the resulting max value with 20,000 iterations of randomization is 1.01 (good enough!)
    image

absurdly large scales on a 1 deg grid

  • length scale = 10 x rossby radius
  • as expected this is slow (diffusion requires 1388 iterations during multiply()!), so don't do this in the real world. But it works, and the normalization is still good.
    image
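The blow-up in iteration count is what you'd expect from the stability limit of an explicit scheme: the pseudo-timestep is capped, so the number of steps grows roughly like (L/dx)². A back-of-the-envelope sketch (the formula follows a standard Weaver-and-Courtier-style analysis, not the soca code's actual bookkeeping):

```python
import math

def min_iterations(L, dx, ndim=2):
    # total pseudo-time T satisfies L = sqrt(2*kappa*T), and explicit
    # stability requires kappa*dt <= dx**2 / (2*ndim), so the step count
    # is M = T/dt >= ndim * (L/dx)**2
    return math.ceil(ndim * (L / dx) ** 2)
```

So a length scale of many grid boxes quickly needs thousands of steps, consistent with the slow multiply() reported above.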

extra sanity check

create a Dirac with a length scale of 4 grid boxes; 4 grid boxes away the value is ~0.6, as expected at 1 sigma of a Gaussian
image
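That sanity check matches the Gaussian shape directly: a normalized Gaussian correlation evaluated one length scale from its center is exp(-1/2) ≈ 0.61. A one-liner to confirm the arithmetic:

```python
import math

def gaussian_corr(r, sigma):
    # normalized Gaussian correlation as a function of separation r
    return math.exp(-0.5 * (r / sigma) ** 2)

# 4 grid boxes away, with a 4-grid-box length scale
print(round(gaussian_corr(4.0, 4.0), 4))  # 0.6065, i.e. the ~0.6 above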

1/4 deg grid regional 3dvar

  • @travissluka: Test with regional domain 3dvar
  • providing comparison with BUMP

Followup steps

Likely future follow-ups to this PR:

  • the same scales are currently applied to all variables. Generalize so that different variables can use different scales
  • Add the vertical diffusion to create a 3D (2D+1D) diffusion.
    This will have the normalization done separately so that 2D normalization can be computed offline (it's slower), and the vertical normalization can be computed each cycle to respond to varying mixed layer depths and such
  • use as a smoother within parts of the soca code (e.g. smoothing background error, T/S/SSH balance jacobians, etc)
  • generalize into non-soca-specific code and move into the saber repository. This will be easier once the atlas mesh connectivity is added to the model interfaces
  • deprecate and remove HorizFiltSOCA

tagging other people who were interested: @hga007 @shlyaeva @ncrossette

@hga007 commented Oct 13, 2023

@travissluka, Awesome, great job. This algorithm is not trivial. Congratulations! The good thing is that this pseudo-diffusion operator is self-adjoint and passes the self_adjoint tests. I am glad that ROMS algorithms were useful. Implementing the methodologies we see in papers into discrete models is usually complicated and takes time. Yes, the randomization approach is the way to go. I recall it took me about two days to compute those coefficients in a large grid using the exact method.

By the way, I have checked and used the coding in SOCA that you and others have developed many times as a prototype in the ROMS-JEDI interface.

end subroutine

! ------------------------------------------------------------------------------
! Calibration for the explicit diffuion operator.

ppppffffff .... difffuzzion not diffuion 🤦‍♂️

@guillaumevernieres (Contributor) left a comment

🎉
Thanks for doing this @travissluka ! Looks good as always. If there are bugs we'll find them by putting that chunk of code through its paces.
Preemptively approving before the CI finishes.

PS: Did you check it within one of the variational applications too, @travissluka?

@kbhargava (Contributor) commented

Looks good. Ctests pass on Orion. I compared the Diracs of the ctest with the normalization method set to brute force vs. randomization with 100, 1,000, and 10,000 iterations; plot below. Testing with the regional domain now. Will update when done.
soca-regional-2023-10-18

@travissluka (Contributor, Author) commented

Eh, we celebrated too early. With HAT10 there are weird artifacts in the normalization depending on how many PEs are used (6 PEs on left, 40 PEs on right). Clearly something is going wrong with the halo at 40 PEs.
image

@kbhargava (Contributor) commented

Eh, we celebrated too early. With HAT10 there are weird artifacts in the normalization depending on how many PEs are used (6 PEs on left, 40 PEs on right). Clearly something is going wrong with the halo at 40 PEs. image

uhh Ohh!

@travissluka (Contributor, Author) commented

On the plus side (ignoring that halo bug), here is an increment with BUMP (left) and EXPLICIT_DIFFUSION (right) with the following scales:

  • rossby mult: 1.0
  • min grid mult: 1.0
  • min value: 150e3

image
They look very similar (except in one spot in the northern extension of the Gulf Stream, right where the weird blob is in the normalization weight).

Runtimes for 84 iterations in 3dvar:

  • BUMP: 117s
  • EXPLICIT_DIFFUSION: 79s (faster 🎉)

Note that the min value is there to keep BUMP from being overly upset. With diffusion we can get rid of that and allow for smaller scales (it runs faster because it converges faster):

image

@hga007 commented Oct 19, 2023

@travissluka, looking for parallel bugs is time-consuming; good luck. Hopefully it will be easy to find them by putting in some print statements on the i- and j-ranges. Otherwise, it is looking good. Knowing that the explicit diffusion operator is faster than BUMP is useful.


In ROMS, we can have either 2 or 3 halo points. The diagrams for NLM/TLM and ADM for 2 halo points are:

@travissluka (Contributor, Author) commented

Phew, it turned out NOT to be a halo issue. The problem was a weird artifact of the same random numbers being generated on each individual PE when the randomization iterations were performed. It's now fixed to generate different random numbers across the PEs.

normalization constants look good now (right image) and the adjoint tests pass with the regional domain
image
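The fix described here, distinct random streams per PE, is commonly done by folding the task rank into the seed. A sketch of the idea in NumPy (the rank_rng helper and base seed are illustrative, not soca's actual implementation):

```python
import numpy as np

def rank_rng(base_seed, rank):
    # derive an independent, reproducible stream per PE by keying the
    # seed sequence on (base_seed, rank); with a single shared seed,
    # every task would draw the identical "random" vectors
    return np.random.default_rng(np.random.SeedSequence([base_seed, rank]))

# four hypothetical PEs, each getting its own stream
draws = [rank_rng(12345, r).standard_normal(4) for r in range(4)]
```

Each rank reproduces its own stream deterministically while differing from every other rank's, which is what the randomization normalization needs.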

I think this is ready to merge once the static gridspec file is updated for Skylab 🎉

@hga007 commented Oct 20, 2023

@travissluka, Great. Oh yeah, I remember generating random numbers in parallel is very tricky. I struggled a lot with getting an identical solution with different parallel partitions. We have a very convoluted way of creating random numbers for our algorithms.

@travissluka travissluka merged commit 7220445 into develop Nov 10, 2023
@travissluka travissluka deleted the feature/diffusion branch November 10, 2023 17:55

Development

Successfully merging this pull request may close these issues.

Create a SaberCentralBlock for 2D Diffusion correlation operator

4 participants