
First guess test

Cristian Lussana edited this page Dec 16, 2023 · 11 revisions

fgt (First Guess Test, FGT) implements a streamlined version of the Spatial Consistency Test (SCT). It is less computationally intensive than SCTs based on Optimal Interpolation (OI) because it avoids expensive operations such as matrix inversions, which makes it a faster alternative. The algorithms of fgt are similar to those of sct_resistant. The key difference is that fgt compares observations directly against background values and skips the additional OI spatial-analysis step that sct_resistant requires. fgt can therefore serve as an efficient pre-processing step in the quality control chain, quickly removing likely erroneous observations before the more detailed analysis by sct or sct_resistant.

The input and output parameters of fgt are similar to those of sct_resistant; for details, see the corresponding wiki page, Spatial-consistency-test-resistant. The one parameter specific to fgt is background_uncertainties, a p-vector. This optional argument is relevant only when background_elab_type is set to "External", where it lets users quantify the uncertainty of each background value they supply. By default each element is 1, so chi in the core fgt algorithm reduces to the absolute difference between observation and background.

For practical application, consider a scenario where background values are sourced from an ensemble of gridded fields produced by a numerical weather prediction model. In this case, background_values might be the ensemble means, with background_uncertainties set to the corresponding standard deviations.
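A minimal sketch of this ensemble setup follows. It assumes the ensemble has already been interpolated to the station locations and stored in a hypothetical matrix `ens` (one row per station, one column per member). The matrix name, its layout and the floor on the spread are illustrative assumptions, not part of titanlib.

```r
# Hypothetical ensemble values at 4 stations (rows) from 3 members (columns).
ens <- matrix(c(1.0, 2.0, 3.0,
                4.0, 4.0, 4.0,
                0.0, 1.0, 2.0,
                5.0, 7.0, 9.0),
              nrow = 4, byrow = TRUE)

# Ensemble mean and spread per station, intended as the external background.
background_values <- rowMeans(ens)
background_uncertainties <- apply(ens, 1, sd)

# A zero spread would put a zero in the denominator of chi, so enforce a
# floor (the value 0.1 is an arbitrary assumption for this sketch).
background_uncertainties <- pmax(background_uncertainties, 0.1)
background_elab_type <- "External"
```

With these inputs, a station where the ensemble members agree closely gets a small uncertainty, so even a modest deviation of the observation from the ensemble mean yields a large chi.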

FGT can also be used as a buddy check: when background_values and background_uncertainties are derived from the observations surrounding the ones under scrutiny, FGT effectively compares each observation with its neighbours. Note that all background types in sct_resistant except "External" inherently perform a buddy check, as outlined in the Spatial-consistency-test-resistant documentation.
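One way to build such a buddy-check background by hand is sketched below in base R. The helper `buddy_background`, the planar x/y coordinates in metres, the minimum of two buddies and the choice of mean and standard deviation are all assumptions made for illustration; none of them comes from titanlib.

```r
# Hypothetical helper: background from neighbours within `radius` metres,
# excluding the observation itself. x and y are planar coordinates in metres.
buddy_background <- function(x, y, values, radius) {
  n <- length(values)
  bg <- rep(NA_real_, n)
  unc <- rep(NA_real_, n)
  for (i in seq_len(n)) {
    d <- sqrt((x - x[i])^2 + (y - y[i])^2)
    buddies <- which(d <= radius & seq_len(n) != i)
    # At least two buddies are needed for a standard deviation.
    if (length(buddies) >= 2) {
      bg[i] <- mean(values[buddies])
      unc[i] <- sd(values[buddies])
    }
  }
  list(values = bg, uncertainties = unc)
}

# Four stations on a line, 1 km apart; the last observation looks suspicious.
x <- c(0, 1000, 2000, 3000)
y <- c(0, 0, 0, 0)
obs <- c(10, 11, 12, 30)
bb <- buddy_background(x, y, obs, radius = 2500)
```

Stations with fewer than two buddies keep NA in both vectors in this sketch, so they would need to be handled before the vectors are passed on as an external background.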

Input parameters

| Parameter | Type | Unit | Description |
| --- | --- | --- | --- |
| points | Points | | Point object with station position |
| values | vec | ou | Observations |
| obs_to_check | | | Observations that will be checked (observations that should not be checked can also be passed in). 1=check the corresponding observation |
| background_values | | | External background value (not used if background_elab_type!=external) |
| background_uncertainties | | | Uncertainty of the external background value (not used if background_elab_type!=external, optional when background_elab_type=external) |
| background_elab_type | enum | | One of: vertical_profile, vertical_profile_Theil_Sen, mean_outer_circle, external |
| num_min_outer | int | | Minimum number of observations inside the outer circle to compute FGT |
| num_max_outer | int | | Maximum number of observations inside the outer circle used |
| num_min_prof | int | | Minimum number of observations to compute vertical profile |
| inner_radius | float | m | Radius for flagging |
| outer_radius | float | m | Radius for computing the background |
| num_iterations | int | | Number of FGT iterations |
| min_elev_diff | float | m | Minimum elevation difference to compute vertical profile |
| value_mina | vec | ou | Minimum admissible value |
| value_maxa | vec | ou | Maximum admissible value |
| value_minv | vec | ou | Minimum valid value |
| value_maxv | vec | ou | Maximum valid value |
| tpos | vec | | FGT-score threshold. Positive deviation allowed |
| tneg | vec | | FGT-score threshold. Negative deviation allowed |
| debug | | | Verbose output |

ou = Unit of the observation

Returned parameters

| Parameter | Type | Unit | Description |
| --- | --- | --- | --- |
| flags | ivec | | Quality control flag (0=OK, 1=bad) |
| scores | vec | | FGT-score. The higher the score, the more likely a gross measurement error is present |

Algorithm

The main algorithm is the same as for sct_resistant, except for the "core" part, where the SCT-core is replaced by the FGT-core. The FGT-core is a robust method for detecting outliers that considers both the absolute deviation of the observations from the background and the variability of these deviations. The FGT-core algorithm is:

  1. Collecting Statistics
  • Calculate chi for each observation, where chi = abs(observation - background) / background_uncertainty.
  • Only include observations where the background is within a predefined range of admissible values.
  2. Determining the Test Version
  • If the user chooses the basic version (basic = true), then z = chi.
  • Otherwise, normalize chi to get z, where z = (chi - median(chi)) / interquartile_range(chi).
  3. Testing Observations
  • Check whether any of the p_test observations to be tested has a z value outside the specified thresholds (tpos and tneg).
  • If yes, flag the observation with the worst z value as bad. Its z value is the score returned by fgt.
  • If no, flag all observations as good.
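
The three steps above can be sketched in base R as follows. This is not titanlib's implementation. The function `fgt_core_sketch` and its arguments `bg_min`, `bg_max` and `test` are hypothetical. Two further points are assumptions, because the description above does not spell them out: the sign of (observation - background) selects between tpos and tneg, and the "worst" observation is the one with the largest ratio of z to its threshold.

```r
# Sketch of the FGT-core steps for one set of observations.
# Assumptions: the sign of (obs - background) selects tpos or tneg, and
# the "worst" observation is the one with the largest z / threshold ratio.
fgt_core_sketch <- function(obs, background, uncertainty,
                            tpos, tneg, bg_min, bg_max,
                            test = rep(TRUE, length(obs)), basic = TRUE) {
  n <- length(obs)
  flags <- rep(0L, n)
  # 1. Collect statistics where the background is admissible
  ok <- background >= bg_min & background <= bg_max
  dev <- obs - background
  chi <- abs(dev) / uncertainty
  # 2. Basic version uses chi directly; otherwise normalise robustly
  if (basic) {
    z <- chi
  } else {
    z <- (chi - median(chi[ok])) / IQR(chi[ok])
  }
  # 3. Compare z with the threshold matching the sign of the deviation
  threshold <- ifelse(dev >= 0, tpos, tneg)
  ratio <- ifelse(ok & test, z / threshold, 0)
  worst <- which.max(ratio)
  score <- NA_real_
  if (ratio[worst] > 1) {
    flags[worst] <- 1L   # only the single worst observation is flagged
    score <- z[worst]
  }
  list(flags = flags, score = score, z = z)
}

# Five observations against a constant background of 1 with unit uncertainty.
obs <- c(1.0, 1.5, 0.5, 9.0, 1.2)
res_sketch <- fgt_core_sketch(obs, background = rep(1, 5),
                              uncertainty = rep(1, 5),
                              tpos = rep(5, 5), tneg = rep(5, 5),
                              bg_min = -20, bg_max = 20)
```

In this example only the fourth observation (chi = 8, above the threshold of 5) is flagged, and its z value is returned as the score.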

Example (R-code)

# Assumes the titanlib R bindings are loaded and that lats, lons, elevs and
# temp_obs are numeric vectors of equal length (one element per station).
N = length(lats)
obs_to_check = rep(1, N)   # 1 = check the corresponding observation
# External background inputs (only used when background_elab_type is "External")
background_values = rep(0, N)
background_uncertainties = rep(1, N)
background_elab_type = "MedianOuterCircle"
num_min_outer = 3
num_max_outer = 10
inner_radius = 20000   # m
outer_radius = 50000   # m
num_iterations = 10
num_min_prof = 0
min_elev_diff = 100    # m
# FGT-score thresholds for positive and negative deviations
tpos = rep(5, N)
tneg = rep(5, N)
# Admissible and valid ranges, in the unit of the observations
values_mina = temp_obs - 20
values_maxa = temp_obs + 20
values_minv = temp_obs - 1
values_maxv = temp_obs + 1
debug = TRUE
basic = TRUE
points = Points(lats, lons, elevs)
res <- fgt(points, temp_obs, obs_to_check, background_values,
           background_uncertainties, background_elab_type,
           num_min_outer, num_max_outer, inner_radius, outer_radius,
           num_iterations, num_min_prof, min_elev_diff,
           values_mina, values_maxa, values_minv, values_maxv,
           tpos, tneg, debug, basic)
# flags (0=good; 1=bad)
res[[1]]
# z-score (only when flag==1)
res[[2]]
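
When fgt is used as a pre-processing step, the flags can be used to drop suspect observations before a later test. The sketch below uses a made-up list shaped like the output above (flags first, scores second); `res_example` and `temp_example` are hypothetical names and the numbers are invented for illustration.

```r
# Hypothetical output shaped like the list returned by fgt:
# element 1 holds the flags, element 2 the scores (values invented here).
res_example <- list(c(0, 1, 0, 0), c(0, 7.3, 0, 0))
temp_example <- c(2.1, 15.8, 1.9, 2.4)

flags <- res_example[[1]]
bad <- which(flags == 1)                 # indices of flagged observations
temp_passed <- temp_example[flags == 0]  # observations kept for the next test
```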