Merged
35 commits
7712481
Update data_locations.yaml w/ RAP bufr section. Add RAP file location…
Jan 9, 2023
be74621
Merge branch 'ufs-community:develop' into feature/add_rrfs_obs
ulmononian Jan 9, 2023
863c913
Add bufr as file_type option in retrieve_data.py.
Jan 9, 2023
9e0553c
Merge branch 'feature/add_rrfs_obs' of https://github.com/ulmononian/…
Jan 9, 2023
970a896
Add all RAP bufr file types to data_locations.yaml until wildcard/sin…
Jan 10, 2023
ac8f034
Update aws and local (Jet) RAP obs. file names in data_locations.yaml.
Jan 11, 2023
f35ac65
Add Observations section to jet.yaml (RAP obs. to start). Local (on-d…
Jan 11, 2023
b8bac25
Add FV3GFS prepbufr file information to jet machine file and data_loc…
Jan 11, 2023
e204055
Add new date/time string for GFS bufr/prepbufr file naming convention.
Jan 11, 2023
7acfc89
Updates to data_location.yaml to include obs. types and machine.
Jan 11, 2023
1dd9809
Remove RAP/FV3 GFS bufr file info from Jet machine file. Clean up dat…
Jan 12, 2023
4bad463
Merge branch 'ufs-community:develop' into feature/add_rrfs_obs
ulmononian Jan 12, 2023
9d59a76
Migrate machine-specific RAP/GFS file info to machine YAMLs.
Jan 12, 2023
be8424e
Merge branch 'feature/add_rrfs_obs' of https://github.com/ulmononian/…
Jan 12, 2023
6457a4d
Remove generic RAP JJOB/exregional script for now. Not part of this PR.
Jan 12, 2023
6bd5876
Adjust where obs are located in data_locations.yaml and introduce new…
ulmononian Jan 17, 2023
9b7bc02
Update retrieve_data.py
ulmononian Jan 18, 2023
9328425
Address comments to add_rrfs_obs PR: re-instate required args, rename…
Jan 25, 2023
fc5a2f7
Merge branch 'ufs-community:develop' into feature/add_rrfs_obs
ulmononian Jan 25, 2023
ce61a90
Some minor fixes missed on previous commit to data_locations.yaml.
Jan 25, 2023
b6ac165
Merge branch 'feature/add_rrfs_obs' of https://github.com/ulmononian/…
Jan 25, 2023
988ba43
Update data_locations.yml
ulmononian Jan 25, 2023
340b33e
Remove yyjjjhh entry from retrieval datetime formats.
Jan 25, 2023
73c9697
Merge branch 'feature/add_rrfs_obs' of https://github.com/ulmononian…
Jan 25, 2023
f4a202c
Update data_locations.yml
ulmononian Jan 25, 2023
052d139
Fixes for machine and data_locations YAMLs. Revert file template line…
Jan 31, 2023
7d9e3a5
Add leading forward slash to GSI fix path in noaacloud.yaml.
Jan 31, 2023
620920b
More reversions and revisions to retrieve_data.py
Feb 6, 2023
5bd6810
Merge branch 'ufs-community:develop' into feature/add_rrfs_obs
ulmononian Feb 10, 2023
89dc7f4
Update test_retrieve_data.py
ulmononian Feb 10, 2023
a9d4de5
Update test_retrieve_data.py (again)
ulmononian Feb 10, 2023
578777d
Update test_retrieve_data.py
ulmononian Feb 10, 2023
4fbcbba
Update test_retrieve_data.py
ulmononian Feb 10, 2023
818d62e
Change --anl_or_fcst instances to --file_set in exregional_get_extrn_…
Feb 10, 2023
57a135c
Update exregional_get_extrn_mdl_files.sh
ulmononian Feb 10, 2023
117 changes: 117 additions & 0 deletions parm/data_locations.yml
@@ -271,3 +271,120 @@ NAM:
fcst:
- nam.t{hh}z.awphys{fcst_hr:02d}.tm00.grib2

##########################
##########################
### Observation Data ###
##########################
##########################

GFS_obs:
hpss:
protocol: htar
archive_format: zip
archive_path:
- /BMC/fdr/Permanent/{yyyy}/{mm}/{dd}/data/grids/gfs/prepbufr
archive_file_names:
prepbufr:
obs:
- "{yyyymmdd}0000.zip"
tcvitals:
obs:
- "{yyyymmdd}0000.zip"
file_names:
prepbufr:
obs:
- "{yy}{jjj}{hh}00.gfs.t{hh}z.prepbufr.nr"
tcvitals:
obs:
- "{yy}{jjj}{hh}00.gfs.t{hh}z.syndata.tcvitals.tm00"

RAP_obs:
hpss:
protocol: htar
archive_format: zip
archive_path:
- /BMC/fdr/Permanent/{yyyy}/{mm}/{dd}/data/grids/rap/obs
archive_internal_dir:
- ./
archive_file_names:
- "{yyyymmddhh}00.zip"
file_names:
obs:
- "{yyyymmddhh}.rap.t{hh}z.prepbufr.tm00"
- "{yyyymmddhh}.rap.t{hh}z.1bamua.tm00.bufr_d"
- "{yyyymmddhh}.rap.t{hh}z.1bhrs4.tm00.bufr_d"
- "{yyyymmddhh}.rap.t{hh}z.1bmhs.tm00.bufr_d"
- "{yyyymmddhh}.rap.t{hh}z.amsr2.tm00.bufr_d"
- "{yyyymmddhh}.rap.t{hh}z.ascatt.tm00.bufr_d"
- "{yyyymmddhh}.rap.t{hh}z.ascatw.tm00.bufr_d"
- "{yyyymmddhh}.rap.t{hh}z.atms.tm00.bufr_d"
- "{yyyymmddhh}.rap.t{hh}z.atmsdb.tm00.bufr_d"
- "{yyyymmddhh}.rap.t{hh}z.crisf4.tm00.bufr_d"
- "{yyyymmddhh}.rap.t{hh}z.crsfdb.tm00.bufr_d"
- "{yyyymmddhh}.rap.t{hh}z.esamua.tm00.bufr_d"
- "{yyyymmddhh}.rap.t{hh}z.esatms.tm00.bufr_d"
- "{yyyymmddhh}.rap.t{hh}z.eshrs3.tm00.bufr_d"
- "{yyyymmddhh}.rap.t{hh}z.esiasi.tm00.bufr_d"
- "{yyyymmddhh}.rap.t{hh}z.esmhs.tm00.bufr_d"
- "{yyyymmddhh}.rap.t{hh}z.gpsipw.tm00.bufr_d"
- "{yyyymmddhh}.rap.t{hh}z.gpsipw.tm00.bufr_d.nr"
- "{yyyymmddhh}.rap.t{hh}z.gsrasr.tm00.bufr_d"
- "{yyyymmddhh}.rap.t{hh}z.gsrcsr.tm00.bufr_d"
- "{yyyymmddhh}.rap.t{hh}z.iasidb.tm00.bufr_d"
- "{yyyymmddhh}.rap.t{hh}z.lghtng.tm00.bufr_d"
- "{yyyymmddhh}.rap.t{hh}z.lghtng.tm00.bufr_d.nr"
- "{yyyymmddhh}.rap.t{hh}z.lgycld.tm00.bufr_d"
- "{yyyymmddhh}.rap.t{hh}z.mtiasi.tm00.bufr_d"
- "{yyyymmddhh}.rap.t{hh}z.nexrad.tm00.bufr_d"
- "{yyyymmddhh}.rap.t{hh}z.rassda.tm00.bufr_d"
- "{yyyymmddhh}.rap.t{hh}z.satwnd.tm00.bufr_d"
- "{yyyymmddhh}.rap.t{hh}z.sevasr.tm00.bufr_d"
- "{yyyymmddhh}.rap.t{hh}z.ssmisu.tm00.bufr_d"
aws:
protocol: download
url: https://noaa-rap-pds.s3.amazonaws.com/rap.{yyyymmdd}
file_names:
obs:
- rap.t{hh}z.prepbufr.tm00.nr
- rap.t{hh}z.1bamua.tm00.bufr_d
- rap.t{hh}z.1bhrs4.tm00.bufr_d
- rap.t{hh}z.1bmhs.tm00.bufr_d
- rap.t{hh}z.amsr2.tm00.bufr_d
- rap.t{hh}z.ascatt.tm00.bufr_d
- rap.t{hh}z.ascatw.tm00.bufr_d
- rap.t{hh}z.atms.tm00.bufr_d
- rap.t{hh}z.atmsdb.tm00.bufr_d
- rap.t{hh}z.crisf4.tm00.bufr_d
- rap.t{hh}z.crsfdb.tm00.bufr_d
- rap.t{hh}z.esamua.tm00.bufr_d
- rap.t{hh}z.esatms.tm00.bufr_d
- rap.t{hh}z.eshrs3.tm00.bufr_d
- rap.t{hh}z.esiasi.tm00.bufr_d
- rap.t{hh}z.esmhs.tm00.bufr_d
- rap.t{hh}z.gpsipw.tm00.bufr_d
- rap.t{hh}z.gpsipw.tm00.bufr_d.nr
- rap.t{hh}z.gsrasr.tm00.bufr_d
- rap.t{hh}z.gsrcsr.tm00.bufr_d
- rap.t{hh}z.iasidb.tm00.bufr_d
- rap.t{hh}z.lghtng.tm00.bufr_d
- rap.t{hh}z.lghtng.tm00.bufr_d.nr
- rap.t{hh}z.lgycld.tm00.bufr_d
- rap.t{hh}z.mtiasi.tm00.bufr_d
- rap.t{hh}z.nexrad.tm00.bufr_d
- rap.t{hh}z.rassda.tm00.bufr_d
- rap.t{hh}z.satwnd.tm00.bufr_d
- rap.t{hh}z.sevasr.tm00.bufr_d
- rap.t{hh}z.ssmisu.tm00.bufr_d

###########################
###########################
####### Fix Files #########
###########################
###########################

GSI-FIX:
remote:
protocol: download
url: https://epic-sandbox-srw.s3.amazonaws.com
file_names:
- gsi-fix.22.07.27.tar.gz
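
For reviewers unfamiliar with the placeholder syntax, here is a minimal sketch of how entries such as `{yyyymmddhh}.rap.t{hh}z.prepbufr.tm00` could be expanded for a given cycle. It assumes the placeholders are filled with Python's `str.format`, as the `fill_template` hunk in `ush/retrieve_data.py` suggests. The helper name `fill_obs_template` and the exact placeholder set are hypothetical and are not taken from the PR.

```python
from datetime import datetime


def fill_obs_template(template: str, cycle_date: datetime) -> str:
    """Expand date placeholders in a file-name template (illustrative only)."""
    # Assumption: each placeholder maps to a strftime field of the cycle date.
    values = {
        "yyyy": cycle_date.strftime("%Y"),
        "yy": cycle_date.strftime("%y"),
        "mm": cycle_date.strftime("%m"),
        "dd": cycle_date.strftime("%d"),
        "hh": cycle_date.strftime("%H"),
        "jjj": cycle_date.strftime("%j"),  # zero-padded day of year
        "yyyymmdd": cycle_date.strftime("%Y%m%d"),
        "yyyymmddhh": cycle_date.strftime("%Y%m%d%H"),
    }
    # str.format ignores keyword arguments the template does not use.
    return template.format(**values)


cycle = datetime(2023, 1, 9, 6)
rap_name = fill_obs_template("{yyyymmddhh}.rap.t{hh}z.prepbufr.tm00", cycle)
gfs_name = fill_obs_template("{yy}{jjj}{hh}00.gfs.t{hh}z.prepbufr.nr", cycle)
print(rap_name)
print(gfs_name)
```

The second template shows why the GFS obs entries need the two-digit year and day-of-year fields in addition to the usual `yyyymmddhh` string.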
8 changes: 4 additions & 4 deletions scripts/exregional_get_extrn_mdl_files.sh
@@ -57,9 +57,9 @@ or lateral boundary conditions for the FV3.
set -x
if [ "${ICS_OR_LBCS}" = "ICS" ]; then
if [ ${TIME_OFFSET_HRS} -eq 0 ] ; then
-anl_or_fcst="anl"
+file_set="anl"
else
-anl_or_fcst="fcst"
+file_set="fcst"
fi
fcst_hrs=${TIME_OFFSET_HRS}
file_names=${EXTRN_MDL_FILES_ICS[@]}
@@ -69,7 +69,7 @@ if [ "${ICS_OR_LBCS}" = "ICS" ]; then
input_file_path=${EXTRN_MDL_SOURCE_BASEDIR_ICS:-$EXTRN_MDL_SYSBASEDIR_ICS}

elif [ "${ICS_OR_LBCS}" = "LBCS" ]; then
-anl_or_fcst="fcst"
+file_set="fcst"
first_time=$((TIME_OFFSET_HRS + LBC_SPEC_INTVL_HRS))
last_time=$((TIME_OFFSET_HRS + FCST_LEN_HRS))
fcst_hrs="${first_time} ${last_time} ${LBC_SPEC_INTVL_HRS}"
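
The three values packed into `fcst_hrs` follow the start/stop/increment convention described in the `retrieve_data.py` help text. A rough Python sketch of that expansion is given here. `expand_fcst_hrs` is a hypothetical name, the numeric inputs are illustrative, and treating the stop hour as inclusive is an assumption rather than something this diff confirms.

```python
def expand_fcst_hrs(args):
    """Expand [start, stop[, step]] into an explicit list of hours.

    Lists of any other length are returned unchanged.
    """
    values = list(args)
    if len(values) in (2, 3):
        start, stop = values[0], values[1]
        step = values[2] if len(values) == 3 else 1
        # Assumption: the stop hour is part of the sequence.
        return list(range(start, stop + 1, step))
    return values


# Illustrative values mirroring the LBCS branch of the shell script.
time_offset_hrs, lbc_spec_intvl_hrs, fcst_len_hrs = 0, 6, 24
first_time = time_offset_hrs + lbc_spec_intvl_hrs
last_time = time_offset_hrs + fcst_len_hrs
lbc_hours = expand_fcst_hrs([first_time, last_time, lbc_spec_intvl_hrs])
print(lbc_hours)
```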
@@ -151,7 +151,7 @@ fi
cmd="
python3 -u ${USHdir}/retrieve_data.py \
--debug \
---anl_or_fcst ${anl_or_fcst} \
+--file_set ${file_set} \
--config ${PARMdir}/data_locations.yml \
--cycle_date ${EXTRN_MDL_CDATE} \
--data_stores ${data_stores} \
3 changes: 3 additions & 0 deletions ush/machine/hera.yaml
@@ -32,3 +32,6 @@ platform:
FIXsfc: /scratch2/BMC/det/UFS_SRW_App/develop/fix/fix_sfc_climo
FIXshp: /scratch2/BMC/det/UFS_SRW_App/develop/NaturalEarth
EXTRN_MDL_DATA_STORES: hpss aws nomads
data:
obs:
RAP_obs: /scratch2/BMC/public/data/grids/rap/obs
5 changes: 5 additions & 0 deletions ush/machine/jet.yaml
@@ -40,3 +40,8 @@ data:
netcdf: /public/data/grids/gfs/anl/netcdf
RAP: /public/data/grids/rap/full/wrfprs/grib2
HRRR: /public/data/grids/hrrr/conus/wrfprs/grib2
obs:
RAP_obs: /public/data/grids/rap/obs
GFS_obs:
prepbufr: /public/data/grids/gfs/prepbufr
tcvitals: /public/data/grids/gfs/bufr
1 change: 1 addition & 0 deletions ush/machine/noaacloud.yaml
@@ -24,6 +24,7 @@ platform:
FIXorg: /contrib/EPIC/UFS_SRW_App/develop/fix/fix_orog
FIXsfc: /contrib/EPIC/UFS_SRW_App/develop/fix/fix_sfc_climo
FIXshp: /contrib/EPIC/UFS_SRW_App/develop/NaturalEarth
FIXgsi: /contrib/EPIC/UFS_SRW_App/develop/fix/fix_gsi
EXTRN_MDL_DATA_STORES: aws nomads
data:
ics_lbcs:
47 changes: 30 additions & 17 deletions ush/retrieve_data.py
@@ -32,6 +32,7 @@
import shutil
import subprocess
import sys
+import glob
from textwrap import dedent
import time
from copy import deepcopy
@@ -81,7 +82,7 @@ def copy_file(source, destination, copy_cmd):
"""

if not os.path.exists(source):
-logging.info(f"File does not exist on disk \n {source}")
+logging.info(f"File does not exist on disk \n {source} \n try using: --input_file_path <your_path>")
return False

# Using subprocess here because system copy is much faster than
@@ -224,7 +225,7 @@ def fill_template(template_str, cycle_date, templates_only=False, **kwargs):
if templates_only:
return f'{",".join((format_values.keys()))}'
return template_str.format(**format_values)


def create_target_path(target_path):

@@ -306,7 +307,7 @@ def get_file_templates(cla, known_data_info, data_store, use_cla_tmpl=False):
if isinstance(file_templates, dict):
if cla.file_type is not None:
file_templates = file_templates[cla.file_type]
-file_templates = file_templates[cla.anl_or_fcst]
+file_templates = file_templates[cla.file_set]
if not file_templates:
msg = "No file naming convention found. They must be provided \
either on the command line or on in a config file."
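
To make the lookup order concrete, the sketch below applies the same two-step selection (optional `file_type` first, then `file_set`) to trimmed copies of the new `GFS_obs` and `RAP_obs` entries. `select_templates` is a hypothetical standalone function written for illustration. It is not the PR's `get_file_templates`, and the nesting of the dictionaries is assumed from the key order in `parm/data_locations.yml`.

```python
def select_templates(file_templates, file_set, file_type=None):
    """Narrow a nested template mapping down to a list of file names."""
    if isinstance(file_templates, dict):
        # Entries with several file types are keyed by type first.
        if file_type is not None:
            file_templates = file_templates[file_type]
        # The remaining level is keyed by file set (anl, fcst, obs, fix).
        file_templates = file_templates[file_set]
    return file_templates


# Trimmed, assumed-shape copies of the new config entries.
gfs_obs_names = {
    "prepbufr": {"obs": ["{yy}{jjj}{hh}00.gfs.t{hh}z.prepbufr.nr"]},
    "tcvitals": {"obs": ["{yy}{jjj}{hh}00.gfs.t{hh}z.syndata.tcvitals.tm00"]},
}
rap_obs_names = {"obs": ["{yyyymmddhh}.rap.t{hh}z.prepbufr.tm00"]}

gfs_selected = select_templates(gfs_obs_names, "obs", file_type="tcvitals")
rap_selected = select_templates(rap_obs_names, "obs")
print(gfs_selected)
print(rap_selected)
```

Under this reading, `RAP_obs` needs no `--file_type`, while `GFS_obs` requires one of the new `prepbufr` or `tcvitals` choices to reach its `obs` list.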
@@ -478,7 +479,7 @@ def hpss_requested_files(cla, file_names, store_specs, members=-1, ens_group=-1)
archive_file_names = archive_file_names[cla.file_type]

if isinstance(archive_file_names, dict):
-archive_file_names = archive_file_names[cla.anl_or_fcst]
+archive_file_names = archive_file_names[cla.file_set]

unavailable = {}
existing_archives = {}
@@ -505,7 +506,7 @@ def hpss_requested_files(cla, file_names, store_specs, members=-1, ens_group=-1)

archive_internal_dirs = store_specs.get("archive_internal_dir", [""])
if isinstance(archive_internal_dirs, dict):
-archive_internal_dirs = archive_internal_dirs.get(cla.anl_or_fcst, [""])
+archive_internal_dirs = archive_internal_dirs.get(cla.file_set, [""])

# which_archive matters for choosing the correct file names within,
# but we can safely just try all options for the
@@ -683,6 +684,7 @@ def setup_logging(debug=False):
user-defined level for logging in the script."""

level = logging.WARNING
+level = logging.INFO
if debug:
level = logging.DEBUG

@@ -743,7 +745,7 @@ def main(argv):
cla.members = arg_list_to_range(cla.members)

setup_logging(cla.debug)
-print("Running script retrieve_data.py with args:\n", f"{('-' * 80)}\n{('-' * 80)}")
+print("Running script retrieve_data.py with args:", f"\n{('-' * 80)}\n{('-' * 80)}")
for name, val in cla.__dict__.items():
if name not in ["config"]:
print(f"{name:>15s}: {val}")
@@ -896,30 +898,33 @@ def parse_args(argv):

# Required
parser.add_argument(
-"--anl_or_fcst",
-choices=("anl", "fcst"),
-help="Flag for whether analysis or forecast \
-files should be gathered",
+"--file_set",
+choices=("anl", "fcst", "obs", "fix"),
+help="Flag for whether analysis, forecast, \
+fix, or observation files should be gathered",
required=True,
)
parser.add_argument(
"--config",
help="Full path to a configuration file containing paths and \
naming conventions for known data streams. The default included \
in this repository is in parm/data_locations.yml",
required=True,
type=config_exists,

)
parser.add_argument(
"--cycle_date",
help="Cycle date of the data to be retrieved in YYYYMMDDHH \
format.",
-required=True,
+required=False, # relaxed this arg option, and set a benign value when not used
+default="1999123100",
type=to_datetime,
)
parser.add_argument(
"--data_stores",
help="List of priority data_stores. Tries first list item \
-first. Choices: hpss, nomads, aws, disk",
+first. Choices: hpss, nomads, aws, disk, remote.",
nargs="*",
required=True,
type=to_lower,
@@ -928,14 +933,17 @@ def parse_args(argv):
"--external_model",
choices=(
"FV3GFS",
+"GFS_obs",
"GDAS",
"GEFS",
"GSMGFS",
"HRRR",
"NAM",
"RAP",
"RAPx",
+"RAP_obs",
"HRRRx",
+"GSI-FIX",
),
help="External model label. This input is case-sensitive",
required=True,
@@ -946,25 +954,30 @@ def parse_args(argv):
one fhr will be processed. If 2 or 3 arguments, a sequence \
of forecast hours [start, stop, [increment]] will be \
processed. If more than 3 arguments, the list is processed \
-as-is.",
+as-is. default=[0]",
nargs="+",
-required=True,
+required=False, # relaxed this arg option, and set a default value when not used
+default=[0],
type=int,
)
parser.add_argument(
"--output_path",
help="Path to a location on disk. Path is expected to exist.",
-required=True,
+required=True,
type=os.path.abspath,
)
parser.add_argument(
"--ics_or_lbcs",
choices=("ICS", "LBCS"),
help="Flag for whether ICS or LBCS.",
-required=True,
+required=True
)

# Optional
+parser.add_argument(
+"--version", # for file patterns that dont conform to cycle_date [TBD]
+help="Version number of package to download, e.g. x.yy.zz",
+)
parser.add_argument(
"--symlink",
action="store_true",
@@ -984,7 +997,7 @@ def parse_args(argv):
)
parser.add_argument(
"--file_type",
-choices=("grib2", "nemsio", "netcdf"),
+choices=("grib2", "nemsio", "netcdf", "prepbufr", "tcvitals"),
help="External model file format",
)
parser.add_argument(
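
As a quick check on the relaxed arguments, the following self-contained sketch shows how `argparse` treats the new defaults. It reproduces only three options. The `--fcst_hrs` flag name and the body of `to_datetime` are assumptions (the diff shows neither), and `build_parser` is a hypothetical helper rather than the PR's `parse_args`.

```python
import argparse
from datetime import datetime


def to_datetime(arg):
    """Parse a YYYYMMDDHH string (assumed behaviour of the real converter)."""
    return datetime.strptime(arg, "%Y%m%d%H")


def build_parser():
    parser = argparse.ArgumentParser(description="Illustrative subset only")
    parser.add_argument(
        "--file_set",
        choices=("anl", "fcst", "obs", "fix"),
        required=True,
    )
    # argparse applies `type` to string defaults, so the placeholder
    # date still arrives as a datetime object.
    parser.add_argument("--cycle_date", default="1999123100", type=to_datetime)
    # Hypothetical flag name; non-string defaults are used as given.
    parser.add_argument("--fcst_hrs", nargs="+", default=[0], type=int)
    return parser


# A fix-file request no longer needs a cycle date or forecast hours.
args = build_parser().parse_args(["--file_set", "fix"])
print(args.cycle_date, args.fcst_hrs)
```

One consequence worth noting for review: because the placeholder date is a valid cycle, a caller who forgets `--cycle_date` for `anl` or `fcst` data would get a 1999 lookup rather than an argument error.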