test mod to run/README.namelist#1

Closed
davegill wants to merge 2 commits into CODEOWNERS from CHANGE_RUN_DIR

davegill (Owner) commented Oct 8, 2018

Does Kelly need to approve this?

davegill requested a review from kkeene44 as a code owner October 8, 2018 18:20

kkeene44 (Collaborator) left a comment

I Approve.

davegill closed this Nov 7, 2018
davegill deleted the CHANGE_RUN_DIR branch November 7, 2018 00:00
davegill pushed a commit that referenced this pull request Jan 12, 2019
TYPE: bug fix

KEYWORDS: obs nudging, max number of tasks

SOURCE: internal

DESCRIPTION OF CHANGES:
Problem:
The maximum number of processors, 1024, is hard-coded in module_dm.F for observation nudging.
If a user requests more MPI tasks than this maximum, the run fails with a segmentation fault.

Solution:
In the routine where two arrays were dimensioned by the maximum number of MPI tasks, those
arrays are now declared ALLOCATABLE and are allocated based on the total number of MPI ranks.
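
The difference between the two sizing strategies can be illustrated with a minimal sketch. This is not the code in module_dm.F: it is written in Python rather than Fortran, and the function names and the `counts` input are hypothetical. Only the limit of 1024 and the array name `idisplacement` come from this commit message. It assumes the array holds per-rank offsets into a gathered buffer.

```python
MAX_TASKS = 1024  # the hard-coded limit described in this commit message


def displacements_fixed(counts):
    """Old pattern (illustrative): buffer sized by a compile-time maximum."""
    idisplacement = [0] * MAX_TASKS
    offset = 0
    for rank, count in enumerate(counts):
        # Raises IndexError once rank >= MAX_TASKS. Fortran built without
        # bounds checking would instead write past the end of the array.
        idisplacement[rank] = offset
        offset += count
    return idisplacement[:len(counts)]


def displacements_allocated(counts):
    """New pattern (illustrative): buffer sized by the actual rank count."""
    ntasks = len(counts)
    idisplacement = [0] * ntasks  # analogous to ALLOCATE(idisplacement(ntasks))
    offset = 0
    for rank in range(ntasks):
        idisplacement[rank] = offset
        offset += counts[rank]
    return idisplacement


if __name__ == "__main__":
    counts = [2] * 1025  # one more rank than the old limit
    try:
        displacements_fixed(counts)
    except IndexError:
        print("fixed-size buffer overflows at rank index 1024")
    print(displacements_allocated(counts)[-1])
```

Python reports the overrun as an exception, which is closer to the bounds-checked build than to an optimized build, where the same overrun typically surfaces as a segmentation fault.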

LIST OF MODIFIED FILES:
M external/RSL_LITE/module_dm.F

TESTS CONDUCTED:

1. Applied the new code to a user's case, which shows the code works as expected.
2. No bit-wise diffs with a smaller test case, before vs. after the mods: I built the code with the ./configure -d option and ran a small test case with 1 processor and with 36 processors. OBS nudging was turned on. Both runs cover a 3-hour period. Results are identical.
3. Test case with > 1024 MPI tasks: A large case (derived from a user's case) was also tested. In this case, the code was built with the ./configure -D option. Without the change, the case crashed immediately with this error message:
OBS NUDGING is requested on a total of  2 domain(s).
++++++CALL ERROB AT KTAU =     0 AND INEST =  1:  NSTA =     0 ++++++
At line 5741 of file module_dm.f90
Fortran runtime error: Index '1025' of dimension 1 of array 'idisplacement' above upper bound of 1024
Error termination. Backtrace:
#0  0x782093 in __module_dm_MOD_get_full_obs_vector
	at /glade/scratch/chenming/WRFHELP/WRFV3.9.1.1_intel_dmpar_large-file/frame/module_dm.f90:5741
#1  0xffffffffffffffff in ???
With the code change, the case can run successfully for 6 hours.

RELEASE NOTE: After removing a hard-coded limit for an assumed maximum number of MPI tasks, the observation nudging code for WRF now supports more than 1024 MPI tasks. Runs of the obs nudging code with 1024 or fewer MPI tasks were unaffected by the original code. However, if users tried to run obs nudging with > 1024 MPI tasks, the code likely died from a segmentation fault while trying to access an array index beyond the declared bounds.
davegill added a commit that referenced this pull request Feb 15, 2019
davegill added a commit that referenced this pull request Feb 15, 2019
TYPE: text only

KEYWORDS: version_decl, v4.1-alpha

SOURCE: internal

DESCRIPTION OF CHANGES: 
Update the version character string inside the WRF system from 4.0.3 to 4.1-alpha.

LIST OF MODIFIED FILES: 
M inc/version_decl

TESTS CONDUCTED: 
 - [x] Code runs and v4.1-alpha is the version printed from the WRF system programs.
```
> ncdump -h wrfinput_d01 | grep TITLE
		:TITLE = " OUTPUT FROM REAL_EM V4.1-alpha PREPROCESSOR" ;
> ncdump -h wrfinput_initialized_d01  | grep TITLE
		:TITLE = " OUTPUT FROM WRF V4.1-alpha MODEL" ;
> ncdump -h met_em.d01.2019-02-15_12:00:00.nc  | grep TITLE
		:TITLE = "OUTPUT FROM METGRID V4.1" ;
> ncdump -h wrfout_d01_2019-02-16_12:00:00  | grep TITLE
		:TITLE = " OUTPUT FROM WRF V4.1-alpha MODEL" ;
```
davegill added a commit that referenced this pull request May 30, 2019
… data (wrf-model#875)

TYPE: bug fix

KEYWORDS: LBC, valid time

SOURCE: identified by Michael Duda (NCAR/MMM), fixed internally

DESCRIPTION OF CHANGES:
Problem:
1. If a user tried to start a simulation _after_ the last LBC valid period, the
WRF model would get into a nearly infinite loop and print out repeated statements:
```
 THIS TIME 2000-01-24_18:00:00, NEXT TIME 2000-01-25_00:00:00
d01 2000-01-25_06:00:00  Input data is acceptable to use: wrfbdy_d01
           2  input_wrf: wrf_get_next_time current_date: 2000-01-24_18:00:00 Status =           -4
d01 2000-01-25_06:00:00  ---- ERROR: Ran out of valid boundary conditions in file wrfbdy_d01
```
2. If a user tries to extend the model simulation beyond the valid times of the LBC, the code
behavior is undefined (nearly infinite loops on some machines, or runtime errors with a backtrace
on other machines).

Solution:
In another routine, the lateral boundary condition file is read to get to the
correct time. Once inside share/input_wrf.F, we should already be at the
correct time, so there is no need to try to get to the next time. In this
particular case, the attempt to get to the next time fails, but the code tries
again (and again and again). Removing that attempt fixes both problems
identified above.
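
The fail-once behavior described above can be sketched as follows. This is an illustration only, not the logic in share/input_wrf.F: it is written in Python, and `find_lbc_period`, `OutOfBoundaryData`, and the half-open treatment of each period are assumptions made for this example. Only the error text and the dates are taken from the logs in this commit message.

```python
from datetime import datetime, timedelta


class OutOfBoundaryData(RuntimeError):
    """Hypothetical stand-in for the model's fatal-error path."""


def find_lbc_period(current, period_starts, interval):
    """Return the start of the LBC period that contains `current`.

    Each period is scanned exactly once. If none covers `current`, the
    function raises instead of asking again for a "next time" that does
    not exist, which is the retry pattern that produced repeated messages.
    """
    for this_time in period_starts:
        next_time = this_time + interval
        if this_time <= current < next_time:
            return this_time
    raise OutOfBoundaryData(
        "Ran out of valid boundary conditions in file wrfbdy_d01"
    )


if __name__ == "__main__":
    six_hours = timedelta(hours=6)
    # Two periods, the last one ending at 2000-01-25_00, as in the logs.
    starts = [datetime(2000, 1, 24, 12), datetime(2000, 1, 24, 18)]
    print(find_lbc_period(datetime(2000, 1, 24, 20), starts, six_hours))
    try:
        find_lbc_period(datetime(2000, 1, 25, 6), starts, six_hours)
    except OutOfBoundaryData as err:
        print(err)
```

Raising on the first miss gives the same outcome whether the model starts after the last valid period or runs past it, which is why a single change can address both problems.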

ISSUE:
Fixes wrf-model#769 "WRF doesn't halt when beginning LBC time is not in wrfbdy_d01 file"

LIST OF MODIFIED FILES:
M share/input_wrf.F

TESTS CONDUCTED:
1. Without fix, start the model after the last valid time of the LBC file => lots of repeated messages
```
 THIS TIME 2000-01-24_18:00:00, NEXT TIME 2000-01-25_00:00:00
d01 2000-01-25_06:00:00  Input data is acceptable to use: wrfbdy_d01
           2  input_wrf: wrf_get_next_time current_date: 2000-01-24_18:00:00 Status =           -4
d01 2000-01-25_06:00:00  ---- ERROR: Ran out of valid boundary conditions in file wrfbdy_d01
```
2. With this fix, when LBC stops at 2000 01 25 00, and WRF starts at 2000 01 25 06
```
d01 2000-01-25_06:00:00  Input data is acceptable to use: wrfbdy_d01
 THIS TIME 2000-01-24_12:00:00, NEXT TIME 2000-01-24_18:00:00
d01 2000-01-25_06:00:00  Input data is acceptable to use: wrfbdy_d01
 THIS TIME 2000-01-24_18:00:00, NEXT TIME 2000-01-25_00:00:00
d01 2000-01-25_06:00:00  Input data is acceptable to use: wrfbdy_d01
           2  input_wrf: wrf_get_next_time current_date: 2000-01-24_18:00:00 Status =           -4
-------------- FATAL CALLED ---------------
FATAL CALLED FROM FILE:  <stdin>  LINE:    1134
 ---- ERROR: Ran out of valid boundary conditions in file wrfbdy_d01
-------------------------------------------
```
3. Without this fix, if we try to extend the model simulation beyond the valid lateral boundary times
```
Timing for main: time 2000-01-24_23:54:00 on domain   1:    0.53782 elapsed seconds
Timing for main: time 2000-01-24_23:57:00 on domain   1:    0.51111 elapsed seconds
Timing for main: time 2000-01-25_00:00:00 on domain   1:    0.54507 elapsed seconds
Timing for Writing wrfout_d01_2000-01-25_00:00:00 for domain        1:    0.03793 elapsed seconds
d01 2000-01-25_00:00:00  Input data is acceptable to use: wrfbdy_d01
           2  input_wrf: wrf_get_next_time current_date: 2000-01-25_00:00:00 Status =           -4
d01 2000-01-25_00:00:00  ---- ERROR: Ran out of valid boundary conditions in file wrfbdy_d01
At line 777 of file module_date_time.f90
Fortran runtime error: Bad value during integer read

Error termination. Backtrace:
#0  0x10e67c36c
#1  0x10e67d075
#2  0x10e67d7e9
```
4. With this fix, if we try to extend the model simulation beyond the valid lateral boundary times
```
Timing for main: time 2000-01-24_23:54:00 on domain   1:    0.60755 elapsed seconds
Timing for main: time 2000-01-24_23:57:00 on domain   1:    0.57641 elapsed seconds
Timing for main: time 2000-01-25_00:00:00 on domain   1:    0.60817 elapsed seconds
Timing for Writing wrfout_d01_2000-01-25_00:00:00 for domain        1:    0.04499 elapsed seconds
d01 2000-01-25_00:00:00  Input data is acceptable to use: wrfbdy_d01
           2  input_wrf: wrf_get_next_time current_date: 2000-01-25_00:00:00 Status =           -4
-------------- FATAL CALLED ---------------
FATAL CALLED FROM FILE:  <stdin>  LINE:    1134
 ---- ERROR: Ran out of valid boundary conditions in file wrfbdy_d01
-------------------------------------------
```

MMM Classroom regtest; em_real, nmm, em_chem; GNU only
davegill added a commit that referenced this pull request Feb 20, 2020