UFS-dev PR#126 #121

Merged
grantfirl merged 7 commits into NCAR:main from grantfirl:ufs-dev-PR126
Mar 8, 2024

Conversation

@grantfirl
Collaborator

@grantfirl grantfirl commented Feb 6, 2024

Identical to ufs-community#1993 (no BL change)

Also contains:
ufs-community#1967 (BL changes and input changes)
ufs-community#1990 (BL changes)

@grantfirl
Collaborator Author

Expected RT failures due to ufs-community#1967:

cpld_control_gfsv17
cpld_control_gfsv17_iau
cpld_restart_gfsv17
cpld_mpi_gfsv17
cpld_debug_gfsv17
cpld_control_pdlib_p8
cpld_restart_pdlib_p8
cpld_mpi_pdlib_p8
cpld_debug_pdlib_p8

@grantfirl
Collaborator Author

Expected RT failures for ufs-community#1990:

control_wrtGauss_netcdf_parallel
control_wrtGauss_netcdf_parallel_debug

@grantfirl grantfirl marked this pull request as ready for review February 23, 2024 19:45
@grantfirl grantfirl requested a review from mkavulich February 23, 2024 19:45
@grantfirl
Collaborator Author

@mkavulich Ready to test

@mkavulich mkavulich added and then removed the hera-RT label (Run regression test on Hera machine. TESTING ONLY, NOT FOR GENERAL PRS YET) Feb 23, 2024
@mkavulich
Collaborator

Machine: hera
Job: RT
[RT] Repo location: /scratch1/BMC/gmtb/CCPP_regression_testing/NCAR_ufs-weather-model/beta/run//1714160685/20240223210515/ufs-weather-model
Regression test successful on hera!

@mkavulich
Collaborator

Well, that's not correct. @grantfirl, from the logs it looks like the baseline directory is wrong. One of the weird hard-coded things I haven't been able to resolve yet is that the full path to the baseline is stored in the repository, so you'll need to move /scratch1/BMC/gmtb/CCPP_regression_testing/NCAR_ufs-weather-model/beta/baselines/main-20240222 to /scratch1/BMC/gmtb/CCPP_regression_testing/NCAR_ufs-weather-model/baselines/main-20240222. Since it was created without group permissions, I can't move it myself.
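
The move requested above could be sketched roughly as follows. This is only an illustration, not the procedure that was actually used: the helper name `relocate_baseline` is hypothetical, and adding group permissions after the move is an assumption prompted by the remark about group permissions. The demo runs in a temporary directory that mirrors the layout, not against the real `/scratch1` paths.

```python
import os
import shutil
import stat
import tempfile
from pathlib import Path


def relocate_baseline(src: Path, dst: Path) -> Path:
    """Move a baseline directory and add group permission bits.

    Hypothetical helper: only the source/destination layout comes from
    the discussion; the permission change is an assumption.
    """
    if dst.exists():
        raise FileExistsError(f"destination already exists: {dst}")
    dst.parent.mkdir(parents=True, exist_ok=True)
    shutil.move(str(src), str(dst))
    # Add group rwx on directories and group rw on files so that other
    # members of the group can manage the tree later.
    for root, _dirs, files in os.walk(dst):
        root_path = Path(root)
        root_path.chmod(root_path.stat().st_mode | stat.S_IRWXG)
        for name in files:
            f = root_path / name
            f.chmod(f.stat().st_mode | stat.S_IRGRP | stat.S_IWGRP)
    return dst


# Demo in a scratch area that mirrors the layout from the comment.
with tempfile.TemporaryDirectory() as tmp:
    top = Path(tmp) / "NCAR_ufs-weather-model"
    old = top / "beta" / "baselines" / "main-20240222"
    old.mkdir(parents=True)
    (old / "placeholder.txt").write_text("baseline data\n")

    new = relocate_baseline(old, top / "baselines" / "main-20240222")
    moved_ok = new.is_dir() and not old.exists()
    group_writable = bool(new.stat().st_mode & stat.S_IWGRP)
    print(moved_ok, group_writable)
```

Refusing to overwrite an existing destination is a deliberate choice in this sketch, so that a baseline already present at the new location is never silently replaced.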

@grantfirl
Collaborator Author

grantfirl commented Feb 23, 2024

@mkavulich OK, I've moved that directory. Do we need to also link it back to beta/baselines?

@mkavulich
Collaborator

Nope, the new location should be seen now. Starting the new test.

@mkavulich mkavulich added and then removed the hera-RT label (Run regression test on Hera machine. TESTING ONLY, NOT FOR GENERAL PRS YET) Feb 23, 2024
@mkavulich
Collaborator

Automated RT Failure Notification
Machine: hera
Job: RT
[RT] Repo location: /scratch1/BMC/gmtb/CCPP_regression_testing/NCAR_ufs-weather-model/beta/run//1714160685/20240223223511/ufs-weather-model
[RT] Error: Test 034 control_wrtGauss_netcdf_parallel_intel FAIL Tries: 2
[RT] Error: Test 085 control_wrtGauss_netcdf_parallel_debug_intel FAIL Tries: 2
[RT] Log file shows failures.
[RT] Please obtain logs from /scratch1/BMC/gmtb/CCPP_regression_testing/NCAR_ufs-weather-model/beta/run//1714160685/20240223223511/ufs-weather-model

@grantfirl grantfirl added the hera-RT label (Run regression test on Hera machine. TESTING ONLY, NOT FOR GENERAL PRS YET) Feb 26, 2024
@grantfirl
Collaborator Author

@mkavulich Not all tests were failing as expected. I think that I forgot to stage new input data that should cause the cpld* tests to fail. I restarted the tests.
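
A check like the one described here (which expected failures did not actually fail) could be sketched as below. It is a minimal illustration under stated assumptions: the log lines are copied from the failure notification earlier in this thread, the expected names are the lists posted for ufs-community#1967 and ufs-community#1990, and treating the logged test name as the expected name plus an `_intel` suffix is an assumption. The function name `failed_tests` is hypothetical.

```python
import re

# Expected failures as listed earlier in this thread (no compiler suffix).
EXPECTED = {
    "cpld_control_gfsv17",
    "cpld_control_gfsv17_iau",
    "cpld_restart_gfsv17",
    "cpld_mpi_gfsv17",
    "cpld_debug_gfsv17",
    "cpld_control_pdlib_p8",
    "cpld_restart_pdlib_p8",
    "cpld_mpi_pdlib_p8",
    "cpld_debug_pdlib_p8",
    "control_wrtGauss_netcdf_parallel",
    "control_wrtGauss_netcdf_parallel_debug",
}

# Failure lines copied from the automated notification above.
NOTIFICATION = """\
[RT] Error: Test 034 control_wrtGauss_netcdf_parallel_intel FAIL Tries: 2
[RT] Error: Test 085 control_wrtGauss_netcdf_parallel_debug_intel FAIL Tries: 2
"""

FAIL_RE = re.compile(r"^\[RT\] Error: Test (\d+) (\S+) FAIL", re.MULTILINE)


def failed_tests(text: str, compiler_suffix: str = "_intel") -> set[str]:
    """Return failed test names with the (assumed) compiler suffix removed."""
    names = set()
    for _number, name in FAIL_RE.findall(text):
        if name.endswith(compiler_suffix):
            name = name[: -len(compiler_suffix)]
        names.add(name)
    return names


failed = failed_tests(NOTIFICATION)
missing = sorted(EXPECTED - failed)     # expected to fail, but did not
unexpected = sorted(failed - EXPECTED)  # failed without being expected
print("expected but not failing:", missing)
print("failing unexpectedly:", unexpected)
```

With the notification above, only the two `control_wrtGauss_netcdf_parallel*` tests are reported as failing, so every `cpld_*` entry lands in the "expected but not failing" list.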

@mkavulich mkavulich removed the hera-RT label (Run regression test on Hera machine. TESTING ONLY, NOT FOR GENERAL PRS YET) Feb 26, 2024
@mkavulich
Collaborator

I was literally just logging in to check on that. Good to know there's an explanation; I'll look for the next round of results and if they look as expected I can start the baseline.

@mkavulich
Collaborator

Automated RT Failure Notification
Machine: hera
Job: RT
[RT] Repo location: /scratch1/BMC/gmtb/CCPP_regression_testing/NCAR_ufs-weather-model/beta/run//1714160685/20240226214017/ufs-weather-model
[RT] Error: Test 034 control_wrtGauss_netcdf_parallel_intel FAIL Tries: 2
[RT] Error: Test 085 control_wrtGauss_netcdf_parallel_debug_intel FAIL Tries: 2
[RT] Log file shows failures.
[RT] Please obtain logs from /scratch1/BMC/gmtb/CCPP_regression_testing/NCAR_ufs-weather-model/beta/run//1714160685/20240226214017/ufs-weather-model

@mkavulich
Collaborator

@grantfirl Those other tests seem to still be passing. I am actually a bit confused by the PR message in ufs-community#1967:

> This PR does not change any baselines, but new baselines are required for the platforms for where the following intel tests are enabled:

This seems to imply that they don't think there will be baseline changes? But then why would new baselines be needed? It's unclear to me.

@grantfirl
Collaborator Author

> @grantfirl Those other tests seem to still be passing. I am actually a bit confused by the PR message in ufs-community#1967:
>
> > This PR does not change any baselines, but new baselines are required for the platforms for where the following intel tests are enabled:
>
> This seems to imply that they don't think there will be baseline changes? But then why would new baselines be needed? It's unclear to me.

OK, yeah, it's a bit hard to tell from the PR discussions whether we should expect failures on Hera or not. It looks like maybe the failures only show up on other machines? I think that we should just go ahead and start new baselines.

@grantfirl grantfirl added the hera-BL label (Create new baselines on Hera machine. TESTING ONLY, NOT FOR GENERAL PRS YET) Feb 28, 2024
@mkavulich mkavulich removed the hera-BL label (Create new baselines on Hera machine. TESTING ONLY, NOT FOR GENERAL PRS YET) Feb 28, 2024
@grantfirl
Collaborator Author

grantfirl commented Feb 29, 2024

@mkavulich Throughput for us is REALLY slow. I'm guessing it's because we've pushed a lot of RTs through, although we haven't exhausted our resources yet.

Report User Report for: Grant.Firl
Report Run: Thu 29 Feb 2024 04:51:34 PM UTC
Report Period Beginning: Thu 01 Feb 2024 12:00:00 AM UTC
Report Period Ending: Fri 01 Mar 2024 12:00:00 AM UTC
Percentage of Period Elapsed: 99.0%
Percentage of Period Remaining: 1.0%

Project  NormShares  FairShare  Rank   Allocation  Cr-HrsUsed  Windfall  TotalUsed  %Used   Jobs
gmtb     0.003499    0.335813   81/90  158,394     148,607     0         148,607    93.82%  7,661
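
As a quick sanity check on the report, the percentages can be recomputed from the raw figures. This is only a sketch: the variable names are mine, and it assumes that `%Used` is `TotalUsed` divided by `Allocation` and that the elapsed percentage is measured between the stated period boundaries.

```python
from datetime import datetime, timezone

# Figures copied from the gmtb row of the report.
allocation = 158_394
total_used = 148_607

pct_used = 100 * total_used / allocation
remaining = allocation - total_used

# Period boundaries and report time, as given in the report header (UTC).
period_start = datetime(2024, 2, 1, tzinfo=timezone.utc)
period_end = datetime(2024, 3, 1, tzinfo=timezone.utc)
report_run = datetime(2024, 2, 29, 16, 51, 34, tzinfo=timezone.utc)

pct_elapsed = 100 * (report_run - period_start) / (period_end - period_start)

print(f"used:      {pct_used:.2f}% of allocation")
print(f"remaining: {remaining:,} core-hours")
print(f"elapsed:   {pct_elapsed:.1f}% of period")
```

Both recomputed values round to the numbers in the report (93.82% used, 99.0% elapsed), which fits the remark that the allocation was heavily used but not yet exhausted.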

@mkavulich
Collaborator

Machine: hera
Job: BL
[BL] Repo location: /scratch1/BMC/gmtb/CCPP_regression_testing/NCAR_ufs-weather-model/beta/run//1714160685/20240228213517/ufs-weather-model
Baseline creation successful on hera
[RT] Repo location: /scratch1/BMC/gmtb/CCPP_regression_testing/NCAR_ufs-weather-model/beta/run//1714160685/20240229095554/ufs-weather-model
Regression test successful on hera!
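
For reference, the gap between the baseline run and the follow-up regression test can be read off the two repo locations above. The sketch assumes that the second-to-last path component is a `YYYYMMDDHHMMSS` timestamp of when each job was set up; that interpretation and the function name `run_timestamp` are assumptions, not something stated in the thread.

```python
from datetime import datetime
from pathlib import PurePosixPath

BL_REPO = ("/scratch1/BMC/gmtb/CCPP_regression_testing/NCAR_ufs-weather-model/"
           "beta/run//1714160685/20240228213517/ufs-weather-model")
RT_REPO = ("/scratch1/BMC/gmtb/CCPP_regression_testing/NCAR_ufs-weather-model/"
           "beta/run//1714160685/20240229095554/ufs-weather-model")


def run_timestamp(repo_location: str) -> datetime:
    """Parse the directory just above ufs-weather-model as a timestamp.

    Assumes that directory is named YYYYMMDDHHMMSS.
    """
    stamp = PurePosixPath(repo_location).parent.name
    return datetime.strptime(stamp, "%Y%m%d%H%M%S")


bl_start = run_timestamp(BL_REPO)
rt_start = run_timestamp(RT_REPO)
gap = rt_start - bl_start
print("baseline run:", bl_start)
print("regression test run:", rt_start)
print("gap between the two:", gap)
```

Under that assumption, the gap is the time between the two job directories being created, not a measurement of how long either job ran.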

@mkavulich
Collaborator

Looks good once submodule is updated

@grantfirl grantfirl merged commit 448ee8e into NCAR:main Mar 8, 2024
5 participants