Parallel netcdf-4 IO option #1552
NetCDF version 4.7.4 is the earliest that permits using the parallel and compressed options together.
modified: configure
modified: configure
modified: Makefile
modified: external/io_netcdfpar/Makefile
modified: external/io_netcdfpar/wrf_io.F90
…d/WRFV4 into release-v4.3-parallelnc4
new file: external/io_netcdfpar/diffwrf.F90
@MicroTed Your mods do not break the standard IO output options. I can get data output with io_form = 2. However, I am having trouble getting this code to work with the new capability. Here is the specific error message:
Any ideas or suggestions?
The netcdf 4.8.0 release has a possibly useful new test:
P.S. I compiled 4.8.0 (on mac) and that also worked with the PR code, so it doesn't seem to be a problem with 4.8.0 itself.
modified: share/mediation_integrate.F
@MicroTed @weiwangncar @kkeene44 Wei and Kelly,
@davegill Have you retested the code with this latest change?
Wei,
TYPE: bug fix
KEYWORDS: netcdfpar, Error
SOURCE: internal
DESCRIPTION OF CHANGES:
Problem:
With PR wrf-model#1552 "Parallel netcdf-4 IO option" (SHA1 3cd4713), when the code was built without the new parallel NetCDF4 compression, the build log had an `Error`. The problem was related to constructing the object files in the io_netcdfpar directory. When the option is not selected at compile time, we do not care about errors in a directory that will never be used.
Solution:
If the NETCDFPAR option is not selected at compile time, skip going into the io_netcdfpar directory altogether.
LIST OF MODIFIED FILES:
m Makefile
m arch/Config.pl
m arch/configure.defaults
TESTS CONDUCTED:
1. Without the NETCDFPAR parameter set, the build for the io_netcdfpar directory is bypassed:
```
cd ../io_netcdfpar ; \
echo SKIPPING make -i -r NETCDFPARPATH="/glade/u/apps/ch/opt/netcdf/4.7.3/gnu/9.1.0" \
cd ../io_netcdfpar ; \
echo SKIPPING make -i -r NETCDFPARPATH="/glade/u/apps/ch/opt/netcdf/4.7.3/gnu/9.1.0" \
```
2. When the NETCDFPAR env variable is set, the build includes the io_netcdfpar directory:
```
cd ../io_netcdfpar ; \
make -i -r NETCDFPARPATH="/glade/u/apps/ch/opt/netcdf/4.8.0/gnu/9.1.0" \
cd ../io_netcdfpar ; \
make -i -r NETCDFPARPATH="/glade/u/apps/ch/opt/netcdf/4.8.0/gnu/9.1.0" \
```
3. Jenkins tests all PASS.
RELEASE NOTE: Include a stand-alone message suitable for the inclusion in the minor and annual releases. A publication citation is appropriate.
TYPE: bug fix
KEYWORDS: netcdfpar, Error
SOURCE: internal
DESCRIPTION OF CHANGES:
IMPORTANT: Without these mods, every commit since the parallel netcdf4 IO mods will fail the DA build test in the regression test. For example, at least these commits:
```
fed10f4 Adding the WRF-Solar EPS model (#1547)
0bda5e0 Fix 4dvar build failure after commit 8b5bfe5 (#1652)
8b5bfe5 Thompson AA enhancements: BC aerosol, biomass burning emissions, and … (#1616)
9dc68ca After testing with UFS/GFS/FV-3, some tuning knob changes to Thompson-MP and icloud3 (cloud fraction) scheme (#1626)
96fd889 Update HONO, TERP, and CO2 emissions (#1644)
64fb190 SFCLAY=1, add shallow water roughness calculation (#1543)
609c2fc New module firebrand_spotting for WRF-Fire (#1540)
75bfe6d MYNN PBL clouds in photolysis option 4 (TUV) (#1622)
f8c4b13 Fix runtime error when using sf_surface_mosaic = 1 with use_wudapt_lcz = 0 (#1638)
b511c70 Run-time option for climate GHG for radiation (#1625)
8194c66 Bug fix for configuration option INTEL:HSW/BDW (#1645)
16c9287 bug fixes for radar_rf_opt=2 (#1642)
a82ce24 Sync with NoahMP Github version with all NoahMP updates since v4.3 (#1641)
7b642cc Bug fix for TAMDAR T VarBC (#1632)
92fd706 fix WRFDA build for Parallel netcdf-4 IO (#1634)
```
Problem:
With PR #1552 "Parallel netcdf-4 IO option" (SHA1 3cd4713), when the code was built without the new parallel NetCDF4 compression, the build log had an `Error`.
```
> grep Error compile.log
Fatal Error: Cannot open module file ‘wrf_data_ncpar.mod’ for reading at (1): No such file or directory
make[2]: [diffwrf] Error 1 (ignored)
make[2]: [diffwrf] Error 1 (ignored)
wrf_io.f:117: Error: Can't open included file 'mpif.h'
make[2]: [wrf_io.o] Error 1 (ignored)
Fatal Error: Cannot open module file ‘wrf_data_ncpar.mod’ for reading at (1): No such file or directory
make[2]: [field_routines.o] Error 1 (ignored)
make[2]: [libwrfio_nfpar.a] Error 127 (ignored)
make[2]: [libwrfio_nfpar.a] Error 1 (ignored)
```
The problem was related to constructing the object files in the io_netcdfpar directory. When the option is not selected at compile time, we do not care about errors in a directory that will never be used.
Solution:
If the NETCDFPAR option is not selected at compile time, skip going into the io_netcdfpar directory altogether.
LIST OF MODIFIED FILES:
m Makefile
m arch/Config.pl
m arch/configure.defaults
m configure
TESTS CONDUCTED:
1. Without the NETCDFPAR parameter set, the build for the io_netcdfpar directory is bypassed:
```
cd ../io_netcdfpar ; \
echo SKIPPING make -i -r NETCDFPARPATH="/glade/u/apps/ch/opt/netcdf/4.7.3/gnu/9.1.0" \
cd ../io_netcdfpar ; \
echo SKIPPING make -i -r NETCDFPARPATH="/glade/u/apps/ch/opt/netcdf/4.7.3/gnu/9.1.0" \
```
2. When the NETCDFPAR env variable is set, the build includes the io_netcdfpar directory:
```
cd ../io_netcdfpar ; \
make -i -r NETCDFPARPATH="/glade/u/apps/ch/opt/netcdf/4.8.0/gnu/9.1.0" \
cd ../io_netcdfpar ; \
make -i -r NETCDFPARPATH="/glade/u/apps/ch/opt/netcdf/4.8.0/gnu/9.1.0" \
```
3. Jenkins tests all PASS.
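The skip behaviour described in this fix can be pictured with a small shell sketch. This is not the actual change made to the WRF Makefile or `arch/Config.pl`; the function name `netcdfpar_build_line` is hypothetical and exists only for illustration. It assumes the build simply prefixes the `make` command with `echo SKIPPING` when NETCDFPAR is empty, which is what the test logs above suggest.

```shell
#!/bin/sh
# Hypothetical helper (not part of WRF): print the command line that the
# build would run for the io_netcdfpar directory.
#   $1 = value of NETCDFPAR (empty when the option is not selected)
#   $2 = path passed as NETCDFPARPATH
netcdfpar_build_line() {
    netcdfpar_value="$1"
    netcdfpar_path="$2"
    if [ -n "$netcdfpar_value" ]; then
        # Option selected: run make as usual.
        skip_prefix=""
    else
        # Option not selected: turn the make command into a harmless echo.
        skip_prefix="echo SKIPPING "
    fi
    printf '%smake -i -r NETCDFPARPATH="%s"\n' "$skip_prefix" "$netcdfpar_path"
}
```

For example, `netcdfpar_build_line "" /some/netcdf` prints a line starting with `echo SKIPPING make`, while `netcdfpar_build_line /some/netcdf /some/netcdf` prints the plain `make` line; `/some/netcdf` is a placeholder path.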
TYPE: new feature
KEYWORDS: parallel netcdf4
SOURCE: Ted Mansell (NOAA/National Severe Storms Lab)
DESCRIPTION OF CHANGES:
This PR adds an I/O option to use the new netcdf capability in 4.7.4 (and later versions) to write both in parallel and with variable compression for the same file. Versions of pnetcdf permitted parallel I/O, but did not permit standard HDF5 compression. A new I/O library directory, `external/io_netcdfpar`, has been added that is modeled on the similar `external/io_netcdf` directory. Users should see the file doc/README.netcdf4par for details on configuring and running.
In a nutshell, steps to use this feature:
1. This capability to use HDF5 compression and to have parallel I/O (similar to pnetcdf) requires NetCDF v4.7.4 or later.
2. The NetCDF library must be built with MPI.
3. Once the NetCDF library / module is chosen and the NETCDF env variable is correctly pointing to the right location, then prior to the `configure` step, the user sets a new environment variable. For example, in csh or bash `setenv NETCDFPAR $NETCDF` or `export NETCDFPAR=$NETCDF`, respectively.
4. Set the io_form to 13 to use the parallel compressed netcdf option (usually just for the model output or restarts).
5. This I/O option requires activating the NOCOLONS switch (which is why only model output or restarts are recommended, since metgrid files have colons embedded in the file names).
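Steps 1 and 3 above can be combined into a pre-`configure` check. The following is a minimal sketch, not part of the PR: the function `version_ge` is a hypothetical helper, and it assumes plain numeric `major.minor.patch` version strings (suffixes such as `-rc1` are not handled). The commented usage at the end assumes `nc-config --version` prints a line of the form `netCDF 4.8.1`.

```shell
#!/bin/sh
# Hypothetical helper (not part of WRF): succeed when version $1 >= version $2.
# Assumes both arguments look like major.minor.patch with numeric fields.
version_ge() {
    for field in 1 2 3; do
        have=$(printf '%s' "$1" | cut -d. -f"$field")
        need=$(printf '%s' "$2" | cut -d. -f"$field")
        # Treat a missing trailing field as 0.
        have=${have:-0}
        need=${need:-0}
        if [ "$have" -gt "$need" ]; then return 0; fi
        if [ "$have" -lt "$need" ]; then return 1; fi
    done
    # All three fields are equal.
    return 0
}

# Possible use before running configure (left commented out because it
# needs a NetCDF installation on the PATH):
#   installed=$(nc-config --version | awk '{print $2}')
#   if version_ge "$installed" 4.7.4; then
#       export NETCDFPAR=$NETCDF
#   fi
```

The comparison is done field by field as integers, so `4.10.0` is correctly treated as newer than `4.7.4`, which a plain string comparison would get wrong.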
LIST OF MODIFIED FILES:
M Makefile
M Registry/Registry.EM_COMMON
M arch/Config.pl
M arch/md_calls.inc
M arch/postamble
M arch/preamble
M configure
A doc/README.netcdf4par
M external/Makefile
A external/io_netcdfpar/Makefile
A external/io_netcdfpar/diffwrf.F90
A external/io_netcdfpar/ext_ncdpar_get_dom_ti.code
A external/io_netcdfpar/ext_ncdpar_get_var_td.code
A external/io_netcdfpar/ext_ncdpar_get_var_ti.code
A external/io_netcdfpar/ext_ncdpar_put_dom_ti.code
A external/io_netcdfpar/ext_ncdpar_put_var_td.code
A external/io_netcdfpar/ext_ncdpar_put_var_ti.code
A external/io_netcdfpar/field_routines.F90
A external/io_netcdfpar/module_wrfsi_static.F90
A external/io_netcdfpar/transpose.code
A external/io_netcdfpar/wrf_io.F90
M frame/md_calls.m4
M frame/module_io.F
M share/mediation_integrate.F
M share/module_io_domain.F
M share/output_wrf.F
M share/wrf_ext_write_field.F
TESTS CONDUCTED:
1. Tested on lustre file system (cray) with domain sizes up to about 700x700. Will leave the existing chunking as the initial implementation.
2. Parallel I/O with compression successfully runs on the NCAR cheyenne system, a GPFS file system. The following were set up for modules during the build and within the job script:
```
module purge
module load intel
module load ncarcompilers
module load mpt
module load netcdf-mpi
module load ncarenv
```
3. The tests on cheyenne worked with both GNU/10.1.0 and Intel/19.1.1.
4. Jenkins testing is OK.
5. Additional information is now output at the end of the `configure` step regarding I/O options:
```
NetCDF version: 4.8.1
Enabled NetCDF-4/HDF-5: yes
NetCDF built with PnetCDF: no
Enabled NetCDF parallel: yes
Using parallel NetCDF via NETCDFPAR option
```
RELEASE NOTE: Added the ability to write compressed NetCDF4 files in parallel via NetCDF 4.7.4 (and later). The performance is slower than pnetcdf, but can be notably faster than regular NetCDF4 on parallel file systems. As expected, the compression provides files significantly smaller than pnetcdf generates.