- `1_all_raw_data.yml`: Pulls down the raw data files from Google Drive. Depends on the `rawfile_to_id_crosswalk.csv`.
- `1_all_wqp`: Pulls down data for each NHD lake, based on an existing crosswalk of NHD sites to WQP sites. Depends on the `master_lake.csv` file for the NHD lakes to check.
- `1_data_s3_assimilate`: Munges all the files from `1_all_raw_data.yml` into a standard format with four columns: DateTime, depth in meters, temp in Celsius, and a column named for the corresponding state ID system (e.g. DOW). The last column is used to regroup the data into separate NHD files. Each file requires its own parsing function (named parse_), since there is no standard format. The parsing function should convert units if necessary and add an ID if none is provided in the original file. Depends on `rawfile_to_id_crosswalk.csv` for the list of files to parse, and also needs to rebuild if `1_all_raw_data` changes.
- `2_get_model_files`: Sets of nml and driver files for GLM for each NHD lake. Only depends on `master_lake.csv`. The cleaning function in `3_regroup_data` depends on this step, since it filters observations based on the maximum depth from the `nml` file for each lake.
- `3_regroup_data`: Takes the munged files from `1_data_s3_assimilate` and regroups the data into files by NHD ID. It also does some 'universal' cleaning, like checking for duplicates and filtering out data points beyond the maximum lake depth, and beyond the start, stop and max temp values set in the base model config file. Depends on the `lake_master.csv` to relate state IDs to NHDs, and needs to rebuild if `1_data_s3_assimilate` changes.
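
The regrouping and 'universal' cleaning described for `3_regroup_data` can be sketched roughly as follows. This is only an illustration in Python, not the project's actual code: the function names, the row keys (`DateTime`, `depth`, `temp`, `DOW`), the example IDs, and the threshold arguments are all hypothetical. It also assumes that the start and stop values are date bounds, that a duplicate means an identical DateTime/depth/temp triple, and that rows without a crosswalk entry are skipped.

```python
from collections import defaultdict
from datetime import datetime


def regroup_by_nhd(rows, state_to_nhd, id_column="DOW"):
    """Group munged rows by NHD ID using a state-ID -> NHD-ID mapping."""
    grouped = defaultdict(list)
    for row in rows:
        nhd_id = state_to_nhd.get(row[id_column])
        if nhd_id is None:
            continue  # assumption: rows with no crosswalk entry are dropped
        grouped[nhd_id].append(row)
    return dict(grouped)


def clean_observations(rows, max_depth_m, start, stop, max_temp_c):
    """Drop exact duplicates and observations outside the allowed ranges."""
    seen = set()
    kept = []
    for row in rows:
        key = (row["DateTime"], row["depth"], row["temp"])
        if key in seen:
            continue  # duplicate observation
        seen.add(key)
        if row["depth"] > max_depth_m:
            continue  # deeper than the lake's maximum depth
        if not (start <= row["DateTime"] <= stop):
            continue  # outside the assumed start/stop date window
        if row["temp"] > max_temp_c:
            continue  # warmer than the configured maximum temperature
        kept.append(row)
    return kept


# Hypothetical example data: two state IDs mapped to two NHD lakes.
example_rows = [
    {"DateTime": datetime(2015, 7, 1, 12), "depth": 1.0, "temp": 22.5, "DOW": "dow_001"},
    {"DateTime": datetime(2015, 7, 1, 12), "depth": 1.0, "temp": 22.5, "DOW": "dow_001"},
    {"DateTime": datetime(2015, 7, 1, 12), "depth": 40.0, "temp": 8.0, "DOW": "dow_001"},
    {"DateTime": datetime(2015, 7, 2, 12), "depth": 2.0, "temp": 21.0, "DOW": "dow_002"},
]
example_crosswalk = {"dow_001": "nhd_100", "dow_002": "nhd_200"}

by_lake = regroup_by_nhd(example_rows, example_crosswalk)
cleaned = clean_observations(
    by_lake["nhd_100"],
    max_depth_m=30.0,  # would come from the lake's nml file
    start=datetime(1980, 1, 1),
    stop=datetime(2020, 1, 1),
    max_temp_c=40.0,
)
```

In this sketch the maximum depth is passed in per lake, which mirrors why the cleaning function depends on `2_get_model_files`: the depth limit has to be read from each lake's `nml` file before observations can be filtered.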