Add support for traditional threading#2533

Closed
DusanJovic-NOAA wants to merge 4 commits into ufs-community:develop from DusanJovic-NOAA:rt_tt

Conversation

@DusanJovic-NOAA
Collaborator

Commit Queue Requirements:

  • Fill out all sections of this template.
  • All sub component pull requests have been reviewed by their code managers.
  • Run the full Intel+GNU RT suite (compared to current baselines) on either Hera/Derecho/Hercules
  • [] Commit 'test_changes.list' from previous step

Description:

This PR adds support for traditional (non-ESMF-managed) threading. All '2threads' tests have been converted to traditional threading. ESMF-managed threading remains the default.

To use traditional threading, set ESMF_THREADING=false in the test's configuration file and set THRD to the number of threads, for example:

$ cat tests/cpld_2threads_p8
. . . 
ESMF_THREADING=false
THRD=$THRD_cpl_thrd

Many tests had two versions of the ufs.configure template (one ending in _esmf.IN and one without the _esmf suffix) whose only difference was the value of globalResourceControl: (true for ESMF-managed threading, false for traditional threading). These are now unified into a single ufs.configure template.
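As a rough illustration of the unification described above, the sketch below maps ESMF_THREADING to the value that a single template would need for globalResourceControl:. The helper name `resource_control_value` is hypothetical and the actual substitution mechanism in the regression test scripts is not shown in this description, so treat this as an assumption rather than the PR's implementation.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not from the PR): map ESMF_THREADING to the value
# that a unified ufs.configure template would use for globalResourceControl.
resource_control_value() {
  local esmf_threading="${1:-true}"   # ESMF-managed threading is the default
  if [[ "${esmf_threading}" == "true" ]]; then
    echo "true"    # ESMF-managed threading
  else
    echo "false"   # traditional threading
  fi
}

ESMF_THREADING=false
echo "globalResourceControl: $(resource_control_value "${ESMF_THREADING}")"
```

With ESMF_THREADING=false, as in the cpld_2threads_p8 example, the sketch prints the value for traditional threading; leaving the argument unset falls back to the ESMF-managed default.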

Commit Message:

* UFSWM - This PR adds support for traditional (non esmf managed) threading. ESMF managed threading is still default.

Priority:

  • Normal

Git Tracking

UFSWM:

Sub component Pull Requests:

  • None

UFSWM Blocking Dependencies:

  • None

Changes

Regression Test Changes (Please commit test_changes.list):

  • No Baseline Changes.

Input data Changes:

  • None.

Library Changes/Upgrades:

  • No Updates

Testing Log:

  • RDHPCS
    • Hera
    • Orion
    • Hercules
    • Jet
    • Gaea
    • Derecho
  • WCOSS2
    • Dogwood/Cactus
    • Acorn
  • CI
  • opnReqTest (complete task if unnecessary)

Comment thread tests/parm/ufs.configure.s2s.IN Outdated
Use this variable to turn on writing histaux files every MED_history_n.
By default it is .false.
Currently only cpld_control_nowave_noaero_p8 test has it turned on.
@NickSzapiro-NOAA
Collaborator

Does traditional threading require one value of #threads per node?

If yes, I'm not sure these lines in rt.sh and run_test.sh work when components under-fill nodes

TPN=$(( TPN / THRD ))
NODES=$(( TASKS / TPN ))
if (( NODES * TPN < TASKS )); then
  NODES=$(( NODES + 1 ))
fi
PPN=$(( TASKS / NODES ))
if (( TASKS - ( PPN * NODES ) > 0 )); then
  PPN=$((PPN + 1))
fi

and maybe compute_petbounds_and_task functions could export NODES too

@DusanJovic-NOAA
Collaborator Author

> Does traditional threading require one value of #threads per node?
>
> If yes, I'm not sure these lines in rt.sh and run_test.sh work when components under-fill nodes
>
> TPN=$(( TPN / THRD ))
> NODES=$(( TASKS / TPN ))
> if (( NODES * TPN < TASKS )); then
>   NODES=$(( NODES + 1 ))
> fi
> PPN=$(( TASKS / NODES ))
> if (( TASKS - ( PPN * NODES ) > 0 )); then
>   PPN=$((PPN + 1))
> fi
>
> and maybe compute_petbounds_and_task functions could export NODES too

I am not sure what you mean by '#threads per node'. Are you referring to the TPN variable? TPN is the number of (MPI) tasks per node. It is initially set equal to the number of physical cores per node on a given platform (80 on Hercules, 40 on Hera, 128 on WCOSS2, etc.) and is then divided by the number of threads (THRD) so that multiple threads do not run on the same physical core. Once we have tasks per node (TPN) and the total number of tasks (TASKS), we compute the number of nodes (NODES). That is done in this block:

TPN=$(( TPN / THRD ))
NODES=$(( TASKS / TPN ))
if (( NODES * TPN < TASKS )); then
  NODES=$(( NODES + 1 ))
fi

the following block:

PPN=$(( TASKS / NODES ))
if (( TASKS - ( PPN * NODES ) > 0 )); then
  PPN=$((PPN + 1))
fi

was added for 'Derecho support', although I never fully understood why it was needed or what is special about Derecho that requires it.
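To make the arithmetic in the two blocks above concrete, here is a small sketch that wraps the same integer math in a function so it can be tried with different inputs. The function name `compute_layout` and the task count of 150 are illustrative assumptions; only the 40 cores per node for Hera comes from the comment above.

```shell
#!/usr/bin/env bash
# Sketch of the node-count arithmetic quoted above, wrapped in a
# hypothetical function (compute_layout is not a name from rt.sh).
# Arguments: cores per node, threads per task (THRD), total MPI tasks (TASKS).
compute_layout() {
  local cores_per_node=$1 thrd=$2 tasks=$3
  local tpn nodes ppn

  tpn=$(( cores_per_node / thrd ))        # tasks per node, one thread per core
  nodes=$(( tasks / tpn ))
  if (( nodes * tpn < tasks )); then      # round up when a partial node is needed
    nodes=$(( nodes + 1 ))
  fi

  ppn=$(( tasks / nodes ))                # tasks per node when spread over all nodes
  if (( tasks - ppn * nodes > 0 )); then  # round up when tasks do not divide evenly
    ppn=$(( ppn + 1 ))
  fi

  echo "TPN=${tpn} NODES=${nodes} PPN=${ppn}"
}

# Hera-like node (40 cores), 2 threads per task, 150 tasks (made-up count)
compute_layout 40 2 150   # prints: TPN=20 NODES=8 PPN=19
```

In this example TPN is 20 and 8 nodes are needed, but PPN comes out as 19 rather than 20: the second block spreads the 150 tasks roughly evenly across the 8 nodes instead of filling 7 nodes completely and leaving 10 tasks on the last one.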

NickSzapiro-NOAA added a commit to NickSzapiro-NOAA/ufs-weather-model that referenced this pull request Dec 19, 2024
jkbk2004 pushed a commit that referenced this pull request Dec 23, 2024
…eading (#2533) (#2538)

* UFSWM - This PR adds support for traditional (non esmf managed) threading. ESMF managed threading is still default.
* UFSWM - Sync with ESCOMP/CDEPS (2024-12-16)
  * CDEPS - Sync with ESCOMP/CDEPS (2024-12-16)
---------

Co-authored-by: Dusan Jovic <dusan.jovic@noaa.gov>
@jkbk2004
Collaborator

merged with #2538

@jkbk2004 jkbk2004 closed this Dec 23, 2024
WalterKolczynski-NOAA added a commit to NOAA-EMC/global-workflow that referenced this pull request Jan 17, 2025
This PR adds the following:
* converting from inp -> nml (@sbanihash)
* turning on PIO for waves for restarts (@sbanihash)
* enabling cycling for WW3, which required some updates to wave prep jobs and changes to which restarts are saved, etc.
* changed the way CMEPS, MOM6, CICE and WW3 write restarts to be in sync with FV3 for IAU, which required moving the ufs-weather-model forward one hash to use the new flexible restart feature (UFS PR ufs-community/ufs-weather-model#2419)
* adds uglo_15km, the targeted new wave grid
* update to use the new esmf_threading ufs.configure files, which changes the toggle between ESMF threading and traditional threading (UFS PR ufs-community/ufs-weather-model#2538)

Notes on ufs-weather-model updates: 
| Commit date | Commit hash / PR | Notes for g-w changes | Baseline changes |
| :------------- | :------------- | :------------- | :------------- |
| Dec 11, 2024 | ufs-community/ufs-weather-model@409bc85 ufs-community/ufs-weather-model#2419 | Enables flexible restart writes - changes included in g-w PR | none |
| Dec 16, 2024 | ufs-community/ufs-weather-model@6ec6b45 ufs-community/ufs-weather-model#2528 ufs-community/ufs-weather-model#2469 | n/a | HAFs test changes, no global changes |
| Dec 18, 2024 | ufs-community/ufs-weather-model@e119370 ufs-community/ufs-weather-model#2448 | Adds Gaea C6 support (changes in other g-w PRs, not here) | none |
| Dec 23, 2024 | ufs-community/ufs-weather-model@2950089 ufs-community/ufs-weather-model#2533 ufs-community/ufs-weather-model#2538 | changes for ESMF vs traditional threading | none |
| Dec 30, 2024 | ufs-community/ufs-weather-model@241dd8e ufs-community/ufs-weather-model#2485 | n/a | changes in conus13km, no global changes |
| Jan 3, 2025 | ufs-community/ufs-weather-model@76471dc ufs-community/ufs-weather-model#2530 | n/a | changes in regional tests, no global changes |

Note this PR requires the following: 
* update to fix files to add uglo_15km 
* staging ICs for high resolution test case for uglo_15km 

Co-author: @sbanihash 

Related Issues: 
- Fixes #1457 
- Fixes #3154
- Fixes #1795 
- related to #1776 

---------

Co-authored-by: Rahul Mahajan <aerorahul@users.noreply.github.com>
Co-authored-by: Saeideh Banihashemi <saeideh.banihashemi@noaa.gov>
Co-authored-by: David Huber <69919478+DavidHuber-NOAA@users.noreply.github.com>
Co-authored-by: Walter Kolczynski - NOAA <Walter.Kolczynski@noaa.gov>
@DusanJovic-NOAA DusanJovic-NOAA deleted the rt_tt branch March 20, 2025 14:14
EricSinsky-NOAA pushed a commit to EricSinsky-NOAA/global-workflow that referenced this pull request May 16, 2025
EricSinsky-NOAA pushed a commit to EricSinsky-NOAA/global-workflow that referenced this pull request May 19, 2025
Development

Successfully merging this pull request may close these issues.

Set traditional threading as default

3 participants