Skip to content

Careful cygwin support#1003

Merged
davegill merged 13 commits intowrf-model:developfrom
DWesl:careful_cygwin_support
Feb 7, 2020
Merged

Careful cygwin support#1003
davegill merged 13 commits intowrf-model:developfrom
DWesl:careful_cygwin_support

Conversation

@DWesl
Copy link
Contributor

@DWesl DWesl commented Oct 27, 2019

TYPE: enhancement

KEYWORDS: compilation, cygwin, new platform, build system

SOURCE: Daniel Wesloh (The Pennsylvania State University Department of Meteorology and Atmospheric Science)

DESCRIPTION OF CHANGES:

The WRF compilation script did not support compiling on Cygwin.
#911 added support for that compilation, and generated feedback to
ensure those changes did not break compilation on other platforms.
This PR incorporates those suggestions and the changes required by them.

I moved the LIB_BUNDLED and LIB_EXTERNAL default definitions from the configure.wrf postamble to the preamble so I could override them in configure.defaults on non-Windows platforms. This required copying the processing of the postamble in arch/Config.pl to where it processed the preamble so that the variables used in the definitions of LIB_BUNDLED and LIB_EXTERNAL would still be substituded.

One additional change was necessary: the system version of libjasper on cygwin is recent enough that struct jas_image_t has no inmem_ member, so that the reference to inmem_ must be commented out for Grib2 support to work. I did not include that change here as that is related to the version of the Jasper library on all platforms, not just cygwin.

Dependency versions for current test:

$ cygcheck -c cygwin gcc-core gcc-fortran libnetcdf-devel libnetcdf-fortran-devel openmpi libjasper-devel perl tcsh libhdf5-devel libopenmpi-devel openmpi $(cygcheck -cd | awk $'/libnetcdf[0-9]/ {print $1;}\n/libhdf5_[0-9]/ {print $1;}\n/libopenmpiusef08_[0-9]/ {print $1;}\n/libgomp[0-9]/ {print $1;}\n/libgfortran[0-9]/ {print $1;}\n/libjasper[0-9]/ {print $1;}\n/libnetcdf-fortran_[0-9]/ {print $1;}')
Cygwin Package Information
Package                 Version        Status
cygwin                  3.1.2-1        OK
gcc-core                9.2.0-1        OK
gcc-fortran             9.2.0-1        OK
libgfortran3            6.4.0-5        OK
libgfortran4            7.4.0-1        OK
libgfortran5            9.2.0-1        OK
libgomp1                9.2.0-1        OK
libhdf5-devel           1.10.6-1       OK
libhdf5_10              1.8.20-1       OK
libhdf5_101             1.10.2-1       OK
libhdf5_103             1.10.6-1       OK
libjasper-devel         2.0.14-1       OK
libjasper1              1.900.22-1     OK
libjasper4              2.0.14-1       OK
libnetcdf-devel         4.7.3-1        OK
libnetcdf-fortran-devel 4.5.2-1.1      OK
libnetcdf-fortran_6     4.4.4-3        OK
libnetcdf-fortran_7     4.5.2-1.1      OK
libnetcdf11             4.4.1.1-1      OK
libnetcdf13             4.6.1-2        OK
libnetcdf15             4.7.3-1        OK
libopenmpi-devel        3.1.5-1        OK
libopenmpiusef08_11     1.10.7-1       OK
libopenmpiusef08_40     3.1.5-1        OK
openmpi                 3.1.5-1        OK
perl                    5.26.3-2       OK
tcsh                    6.22.01-1      OK

ISSUE:
Replaces #911

LIST OF MODIFIED FILES:
M README
M arch/Config.pl
M arch/configure.defaults
M arch/postamble
M arch/preamble
M configure
M doc/README.cygwin.md
M external/io_netcdf/makefile
M tools/Makefile

TESTS CONDUCTED:

I used the script at
https://github.com/davegill/wrf-coop/blob/master/build.csh
to build and run WRF-ARW, WRF-NMM, and WRF-Chem on Windows Subsystem for Linux (debian) and on Cygwin.
The build script does not consistently work, but running the indicated commands manually usually works fine.

A regression test is successful, but an intermediate branch was required to circumvent
a broken develop branch (from which this is based).

WRF NUMBER OF TILES =   1
 No land surface physics option is used: sf_surface_physics =            0
Timing for main: time 0001-01-01_00:00:04 on domain   2:    1.12292 elapsed seconds
Timing for main: time 0001-01-01_00:00:08 on domain   2:    0.91971 elapsed seconds
Timing for main: time 0001-01-01_00:00:12 on domain   2:    0.91965 elapsed seconds
Timing for main: time 0001-01-01_00:00:12 on domain   1:    6.95076 elapsed seconds
Timing for main: time 0001-01-01_00:00:16 on domain   2:    0.92157 elapsed seconds
Timing for main: time 0001-01-01_00:00:20 on domain   2:    0.92541 elapsed seconds
Timing for main: time 0001-01-01_00:00:24 on domain   2:    0.92848 elapsed seconds
Timing for main: time 0001-01-01_00:00:24 on domain   1:    3.92829 elapsed seconds
Timing for main: time 0001-01-01_00:00:28 on domain   2:    0.93069 elapsed seconds
Timing for main: time 0001-01-01_00:00:32 on domain   2:    0.93314 elapsed seconds
Timing for main: time 0001-01-01_00:00:36 on domain   2:    0.93447 elapsed seconds
Timing for main: time 0001-01-01_00:00:36 on domain   1:    3.95087 elapsed seconds
Timing for main: time 0001-01-01_00:00:40 on domain   2:    0.94396 elapsed seconds
Timing for main: time 0001-01-01_00:00:44 on domain   2:    0.93751 elapsed seconds
Timing for main: time 0001-01-01_00:00:48 on domain   2:    0.93741 elapsed seconds
Timing for main: time 0001-01-01_00:00:48 on domain   1:    3.97565 elapsed seconds
Timing for main: time 0001-01-01_00:00:52 on domain   2:    0.93471 elapsed seconds
Timing for main: time 0001-01-01_00:00:56 on domain   2:    0.93481 elapsed seconds
Timing for main: time 0001-01-01_00:01:00 on domain   2:    0.93528 elapsed seconds
Timing for Writing wrfout_d02_0001-01-01_00:01:00 for domain        2:    0.03285 elapsed seconds
Timing for main: time 0001-01-01_00:01:00 on domain   1:    3.99505 elapsed seconds
Timing for Writing wrfout_d01_0001-01-01_00:01:00 for domain        1:    0.03183 elapsed seconds
d01 0001-01-01_00:01:00 wrf: SUCCESS COMPLETE WRF

RELEASE NOTE: Introduce a Windows-based WRF build and run via the free Unix-like CYGWIN environment.

DWesl added 6 commits October 1, 2019 08:11
Changed treatment of preamble parsing to make it more similar to
postamble parsing.  It would probably be a good idea to unify the
treatment of the three sections.  The other option is to change the
condition on the #NOWIN section in the postamble, but I'm not sure
that works as intended.

This way the default is specified before overrides, which is nice.
Mostly names of link libraries, one note about name for whereis/which.
`/usr/bin/peflags.exe --stack-reserve num` works similarly to
`ulimit -s num`, except per-executable rather than per-session.
Combine two lines of perl to one, and allow the
LIB_EXTERNAL=/LIB_BUNDLED= line to end with a backslash and still be
processed sensibly.
These should have been in earlier commits, but I didn't notice them
then.
@DWesl DWesl requested a review from a team as a code owner October 27, 2019 00:03
@DWesl
Copy link
Contributor Author

DWesl commented Nov 14, 2019

How would I exercise the HDF5-fortran dependency? I'd prefer not to build that unless I have some idea how to check that linking to it works, and the tests I was given on the last PR passed fine without it.

@davegill
Copy link
Contributor

davegill commented Dec 5, 2019

@DWesl

I am trying to understand "comments" in these two statements:

was used to build and run WRF em_real, em_chem, and nmm_real after each of the first five comments on WSL-Debian.

The same script was adapted to compile and run WRF em_real on Cygwin, requiring the changes in the sixth comment.

Can you give me an example of what this means?

@davegill
Copy link
Contributor

davegill commented Dec 5, 2019

@DWesl

How would I exercise the HDF5-fortran dependency? I'd prefer not to build that unless I have some idea how to check that linking to it works, and the tests I was given on the last PR passed fine without it.

The only place that we really use HDF5 is when we build the WRF model with netcdf4, and then we can use the HDF5 compression.

Since this PR is specifically addressing a Windows implementation, we could probably make the assumption that MASSIVE data sets are not really the focus, and therefore the compression from netcdf4 (via HDF5) would not be all that important.

Does that seem reasonable?

@DWesl
Copy link
Contributor Author

DWesl commented Dec 5, 2019

That seems reasonable.

Out of curiosity, does using HDF5 compression from netCDF4-fortran require the HDF5-Fortran libraries? I would think that netCDF4-Fortran calls netCDF4-C, which in turn calls HDF5-C, requiring -lnetcdff -lnetcdf -lhdf5 -lz, but not -lhdf5-fortran. I could be mistaken. I will try to experiment when I find time.

@DWesl

I am trying to understand "comments" in these two statements:

was used to build and run WRF em_real, em_chem, and nmm_real after each of the first five comments on WSL-Debian.

I couldn't figure out how to install Docker, so I installed Windows Subsystem for Linux, which allows running Linux binaries from within Windows, and installed a Debian distribution. I then found the scripts the Docker images were running and called those manually.

The same script was adapted to compile and run WRF em_real on Cygwin, requiring the changes in the sixth comment.

I called the same scripts from Cygwin, found that these scripts hard-code build options in the thirties, and changed these to reflect Cygwin having only one compiler. I found that the changes I had made to support Linux builds broke the Cygwin build, and added the sixth commit to get the Cygwin build working again.

Can you give me an example of what this means?

The Docker images refer to those mentioned in this comment:
#911 (comment)

@DWesl
Copy link
Contributor Author

DWesl commented Dec 14, 2019

I can successfully compile and run a Fortran program creating a compressed variable in a NetCDF file without -lhdf_fortran. Trying to inspect the file in ncdump results in an HDF5 error, though h5dump has no problems.

@davegill
Copy link
Contributor

@DWesl
Just for completeness, please verify that the following are true statement after this PR:

  1. The WRF model is now able to be built on a Windows machine using the cygwin shell emulator.
  2. The ARW, NMM, and Chemistry components were successfully built and run.

@DWesl
Copy link
Contributor Author

DWesl commented Jan 21, 2020 via email

@davegill
Copy link
Contributor

@DWesl
Provide a full name and affiliation in the commit message so that we may give you some "props" when we do the release.

@DWesl
Copy link
Contributor Author

DWesl commented Jan 21, 2020

I'm not entirely sure what you mean by "commit message". The only way I know to modify comment messages is to use git format-patch, do the edits, then use git am, which is somewhat messy.

In case you meant the SOURCE line in the first comment/PR template, I included name and department there.

Was it one of these you meant, or something else entirely?

@davegill
Copy link
Contributor

@DWesl
Daniel ... riiiiight
Thanks for pointing out the obvious

@DWesl
Copy link
Contributor Author

DWesl commented Jan 24, 2020

So is the edit on the first comment enough, or should I add that information to the individual commit messages as well?

@davegill
Copy link
Contributor

davegill commented Jan 24, 2020

@DWesl
Daniel,
We want you to edit the web-based PR commit message. If you go to [https://github.com//pull/1003], you will see this.

Screen Shot 2020-01-24 at 9 26 53 AM

Click on those three dots on the right, and select "Edit". Change this message (and save it). When we do a squash and merge, the other git-based commit messages will be lost.

@DWesl
Copy link
Contributor Author

DWesl commented Jan 24, 2020

What information do you need beyond name and department?

@davegill
Copy link
Contributor

@DWesl

What information do you need beyond name and department?

Nothing - that was my oops.

However, please modify / expand the PR message:

  1. I would remove the part about the specific github site for building WRF. This allows you to get rid of the confusing "first five comments" and "sixth comment" stuff.
  2. Explicitly state, as in there now, what you built and ran specifically: ARW, NMM, Chem.
  3. State the serial, openmp, MPI options your tested
  4. What compiler and did you use on windows (did you have to build those yourself)
  5. Would you explain the large replicated part in Config.pl - is it mandatory, cleaner, etc?
  6. Adding a README.cygwin in the doc directory for a cookbook set of instructions would be a great addition to this PR

@davegill
Copy link
Contributor

davegill commented Feb 1, 2020

@DWesl
Daniel,
I had the local sys admins install cygwin on a windows machine for me (with GNU 7.4 and a sufficient software stack to try to compile WRF - such as netcdf and the csh shell). I was unable to get the WRF code to successfully build. Many of the radiation schemes ended up with internal compiler errors.

Did you encounter such a problem? Any solution?

@DWesl
Copy link
Contributor Author

DWesl commented Feb 3, 2020

Odd. I noticed no internal compiler errors when I tried it a year ago. I will attempt to rebuild WRF and check what components are necessary, paying special attention to the radiation schemes. These would be phys/module_ra_*.F?

@davegill
Copy link
Contributor

davegill commented Feb 3, 2020

@DWesl
Daniel,
What version of the GNU gcc and gfortran did you choose? I am on 7.4. Maybe that is important.

@DWesl
Copy link
Contributor Author

DWesl commented Feb 3, 2020

I probably used whatever was current when I submitted. I am checking with gcc/gfortran 9.2.0, NetCDF-C 4.7.3, NetCDF-Fortran 4.5.2, openmpi 3.1.5, and HDF5 1.10.6.

@DWesl
Copy link
Contributor Author

DWesl commented Feb 3, 2020

I see no internal compiler errors in the mpif90 module_ra_*.f90 ... steps. The only problems I see at present are related to a gfortran version mismatch with OpenMPI, since that was built before the most recent update.

@DWesl
Copy link
Contributor Author

DWesl commented Feb 3, 2020

Radiation modules seem to be done compiling. The only errors I saw were derived from module_ra_cam_support.f90 having use mpi and gfortran complaining because mpi.mod was build with a different version of gfortran.

arch/Config.pl Outdated

$_ =~ s/CONFIGURE_DEP_LIB_PATH/$sw_dep_lib_path/g ;

$_ =~ s/CONFIGURE_PERL_PATH/$sw_perl_path/g ;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@DWesl
Daniel,
Why do you replicate these 100+ lines of perl? This code already exist from about line 440 through 560.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Processing preamble. I didn't know what LIB_BUNDLED and LIB_EXTERNAL assumed to be defined, so I just moved everything there.
I'll try paring this down sometime tomorrow. It will likely be next week before I can test dmpar configurations, but I might be able to get to serial and smpar builds before then. I don't remember what I tested the first time through.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Cleaned to just the variables referenced in LIB_EXTERNAL and LIB_BUNDLED

@DWesl
Copy link
Contributor Author

DWesl commented Feb 4, 2020 via email

@davegill
Copy link
Contributor

davegill commented Feb 5, 2020

@DWesl
Daniel,
I was able to get most of the code to compile once I switched to GNU 9.20 (from 7.4.0).

@DWesl
Copy link
Contributor Author

DWesl commented Feb 5, 2020

I just remembered there was a problem with equivalence statements on Cygwin GCC 7.4, which I filed with GCC here:
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=89079
They then pointed me at
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=47030
which caused this bug while trying to fix something else.

What isn't compiling still? So far I haven't run into anything big.

@davegill
Copy link
Contributor

davegill commented Feb 5, 2020

@DWesl

What isn't compiling still?

Daniel,
Now the issues are expected, as my netcdf was built with a different compiler. The gnu 9.2.0 fixed my internal compiler troubles.

Copy link
Contributor

@davegill davegill left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Approved

@DWesl DWesl requested review from a team as code owners February 5, 2020 23:47
@davegill
Copy link
Contributor

davegill commented Feb 6, 2020

@DWesl
Daniel,
I would like to merge this on Friday. Please let me know when you are ready. We can add some small tweaks, etc in the couple weeks afterwards.

@DWesl
Copy link
Contributor Author

DWesl commented Feb 6, 2020

Okay. I will pare down the changes to arch/Config.pl tomorrow. The rest should work, and can be left to a later PR if it doesn't.

I moved the #NOWIN sections from the postamble to the preamble, and
copied the whole postamble processing section to match it.  This
removes the sections not needed to process the #NOWIN section of the
preamble.

The #NOWWIN sections set LIB_EXTERNAL and LIB_BUNDLED.
@davegill davegill merged commit 4365315 into wrf-model:develop Feb 7, 2020
@DWesl DWesl deleted the careful_cygwin_support branch February 8, 2020 21:29
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants