Buddy check

Thomas Nipen edited this page May 11, 2022 · 32 revisions

The buddy check compares an observation against its neighbours (i.e. buddies) and flags outliers.

The check looks for buddies in a neighbourhood specified by radius [m], the radius of a circle around the observation being checked. At least num_min observations must be available inside the circle, and the range of elevations in the circle must not exceed max_elev_diff meters. The number of iterations is set by num_iterations.
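The neighbourhood rules above can be sketched in plain NumPy. This is an illustration only, not titanlib's implementation: the helper names `great_circle_distance` and `find_buddies` are hypothetical, the haversine distance and Earth radius are assumptions, and the library may differ in details such as whether the centre observation counts towards num_min or how the elevation limit is applied to individual buddies.

```python
import numpy as np

EARTH_RADIUS_M = 6_371_000.0  # mean Earth radius; an assumption of this sketch


def great_circle_distance(lat0, lon0, lats, lons):
    """Haversine distance in metres from one point to an array of points."""
    lat0, lon0, lats, lons = map(np.radians, (lat0, lon0, lats, lons))
    a = (np.sin((lats - lat0) / 2) ** 2
         + np.cos(lat0) * np.cos(lats) * np.sin((lons - lon0) / 2) ** 2)
    return 2 * EARTH_RADIUS_M * np.arcsin(np.sqrt(a))


def find_buddies(i, lats, lons, elevs, radius, num_min, max_elev_diff):
    """Return the indices of the buddies of observation i, or None if the
    neighbourhood does not qualify for a check (hypothetical helper)."""
    dist = great_circle_distance(lats[i], lons[i], lats, lons)
    inside = dist <= radius
    inside[i] = False  # assumption: an observation is not its own buddy
    buddies = np.flatnonzero(inside)
    if buddies.size < num_min:
        return None  # too few buddies inside the circle
    if max_elev_diff >= 0:
        # Elevation range over everything in the circle, centre included
        elev_in_circle = elevs[np.append(buddies, i)]
        if elev_in_circle.max() - elev_in_circle.min() > max_elev_diff:
            return None
    return buddies


# Three stations about 1 km apart and one roughly 110 km away (made-up data)
lats = np.array([60.0, 60.01, 60.02, 61.0])
lons = np.full(4, 10.0)
elevs = np.array([100.0, 120.0, 150.0, 110.0])
print(find_buddies(0, lats, lons, elevs, radius=5000, num_min=2, max_elev_diff=200))
```

A negative max_elev_diff skips the elevation test entirely, which mirrors the behaviour described for that parameter.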

The buddy check flags an observation if the absolute difference between the observation and the average of its neighbours, normalized by the standard deviation of the values in the circle, is greater than a predefined threshold. If the standard deviation of values in the neighbourhood is less than min_std, then min_std is used instead. min_std should be roughly equal to the standard deviation of the error of a typical observation. If it is too low, too many observations will be flagged in areas where the variability is low.
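As a rough illustration of the flagging rule, the following sketch computes the normalized difference for a single observation. The function names are hypothetical, and two details are assumptions rather than statements about titanlib: the mean and standard deviation are taken over the buddies only, and the population standard deviation (NumPy's default) is used.

```python
import numpy as np


def buddy_statistic(value, buddy_values, min_std):
    """Absolute difference from the buddy mean, in units of the buddy
    standard deviation, with the standard deviation floored at min_std."""
    buddy_values = np.asarray(buddy_values, dtype=float)
    mean = buddy_values.mean()
    std = max(buddy_values.std(), min_std)  # never divide by less than min_std
    return abs(value - mean) / std


def is_flagged(value, buddy_values, threshold, min_std):
    """True if the observation would be flagged under this sketch."""
    return buddy_statistic(value, buddy_values, min_std) > threshold


# Buddies that agree exactly: their standard deviation is 0, so min_std decides
calm_buddies = [2.0, 2.0, 2.0]
print(is_flagged(2.5, calm_buddies, threshold=2, min_std=1.0))
print(is_flagged(2.5, calm_buddies, threshold=2, min_std=0.1))
```

The two calls differ only in min_std. With the small value the same 0.5-unit departure lies many "standard deviations" from the mean, which is how a min_std that is too low over-flags in areas of low variability.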

In the case of temperature, elevation differences should be taken into account: all observations are adjusted to the elevation of the centroid observation (the observation being checked) before averaging. A linear vertical rate of change of temperature can be set by elev_gradient. A recommended value is elev_gradient=-0.0065 °C/m (as defined in the ICAO international standard atmosphere). If max_elev_diff is negative, the elevation difference is not checked and the observed values are not corrected.
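The elevation adjustment amounts to a linear correction. A minimal sketch, assuming each buddy value is shifted by elev_gradient times the elevation difference to the target station (the function name `adjust_to_elevation` is hypothetical and not part of titanlib):

```python
import numpy as np


def adjust_to_elevation(buddy_values, buddy_elevs, target_elev, elev_gradient):
    """Shift buddy values to target_elev using a linear vertical gradient.

    elev_gradient is in observation units per metre, e.g. -0.0065 °C/m.
    """
    buddy_values = np.asarray(buddy_values, dtype=float)
    buddy_elevs = np.asarray(buddy_elevs, dtype=float)
    return buddy_values + elev_gradient * (target_elev - buddy_elevs)


# A buddy at 300 m reporting 10 °C, adjusted down to a station at 100 m,
# becomes warmer by 0.0065 °C/m * 200 m = 1.3 °C
adjusted = adjust_to_elevation([10.0, 12.0], [300.0, 100.0], 100.0, -0.0065)
print(adjusted)
```

A buddy at the same elevation as the target is left unchanged, so the correction only matters where the terrain varies within the circle.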

An optional vector obs_to_check can be passed to specify which observations should be checked. Its length must be the same as that of the vector of values to check. The buddy check is performed only for values whose corresponding obs_to_check element is set to 1, while all values are always used as buddies.
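One way to build obs_to_check is from a boolean mask. In this sketch the `is_trusted` array is made-up data marking stations that should serve as buddies without being checked themselves. The commented-out call assumes obs_to_check is passed as the last positional argument, as the ordering of the parameter table suggests.

```python
import numpy as np

# Hypothetical mask: True for stations from a network we already trust
is_trusted = np.array([True, False, False, True, False])

# 1 = check this observation, 0 = use it only as a buddy
obs_to_check = np.where(is_trusted, 0, 1)
print(obs_to_check)

# Assumed usage, with the other arguments defined as in the Python example:
# flags = titanlib.buddy_check(points, temp_obs, radius, num_min, threshold,
#                              max_elev_diff, elev_gradient, min_std,
#                              num_iterations, obs_to_check)
```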

Input parameters

| Parameter | Type | Unit | Description |
| --- | --- | --- | --- |
| points | Points | | Point object with station positions |
| values | vec | ou | Observations |
| radius | vec | m | Search radius |
| num_min | ivec | | The minimum number of buddies a station must have |
| threshold | float | σ | The threshold, in standard deviations, for flagging a station |
| max_elev_diff | float | m | The maximum difference in elevation for a buddy (if negative, the height difference is not checked) |
| elev_gradient | float | ou/m | Linear elevation gradient with height |
| min_std | float | ou | If the standard deviation of values in a neighbourhood is less than min_std, min_std is used instead |
| num_iterations | int | | The number of iterations to perform |
| obs_to_check* | ivec | | Observations that will be checked (observations that are not checked can still be passed in as buddies). 1=check the corresponding observation |

* optional, ou = Unit of the observation, σ = Standard deviations

Returned parameters

| Parameter | Type | Unit | Description |
| --- | --- | --- | --- |
| flags | ivec | | Quality control flag (0=OK, 1=bad) |

Python examples

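import numpy as np
import titanlib

# lats, lons, elevs and temp_obs are assumed to be arrays of equal length
# holding the station coordinates, elevations and temperature observations
points = titanlib.Points(lats, lons, elevs)

radius = np.full(points.size(), 5000)  # search radius [m], one per station
num_min = np.full(points.size(), 5)    # minimum number of buddies, one per station
threshold = 2                          # flagging threshold [standard deviations]
max_elev_diff = 200                    # maximum elevation difference [m]
elev_gradient = -0.0065                # temperature gradient [°C/m]
min_std = 1                            # lower bound on the standard deviation
num_iterations = 5

flags = titanlib.buddy_check(
    points,
    temp_obs,
    radius,
    num_min,
    threshold,
    max_elev_diff,
    elev_gradient,
    min_std,
    num_iterations,
)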

R examples

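# R code
# lats, lons, elevs and temp_obs are assumed to be vectors of length npoints
radius <- rep(500000, npoints)   # search radius [m], one per station
num_min <- rep(5, npoints)       # minimum number of buddies, one per station
threshold <- 2                   # flagging threshold [standard deviations]
max_elev_diff <- 200             # maximum elevation difference [m]
elev_gradient <- -0.0065         # temperature gradient [°C/m]
min_std <- 1                     # lower bound on the standard deviation
num_iterations <- 5
points <- Points(lats, lons, elevs)
flags <- buddy_check(points, temp_obs, radius, num_min, threshold,
                     max_elev_diff, elev_gradient, min_std,
                     num_iterations)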