De-spiking algorithm

From Atomix

An algorithm that seems to effectively remove spikes from shear-probe data uses the following steps.

  1. The shear-probe data are high-pass filtered with a first-order Butterworth filter with a cutoff frequency of [math]\displaystyle{ 0.1\, \mathrm{Hz} }[/math] to remove any offset and very low-frequency signals.
  2. The shear data are rectified by taking their absolute value.
  3. A copy of the rectified shear-probe data is smoothed with a first-order low-pass filter with a cutoff frequency that is usually in the range of [math]\displaystyle{ 0.25 }[/math] to [math]\displaystyle{ 2\ \mathrm{Hz} }[/math].
  4. Those samples for which the ratio of the absolute to the smoothed absolute shear exceeds a threshold (8 is a typical choice) are identified as spikes.
  5. A number [math]\displaystyle{ N }[/math] of samples after a spike and [math]\displaystyle{ N/2 }[/math] samples before a spike are replaced by a constant value equal to the mean shear over an interval of one-half second before and after the replaced region.
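
The five steps above can be sketched in plain Python as a single de-spiking pass. This is a minimal illustration under stated assumptions, not a reference implementation: the function names `first_order_butterworth` and `despike_once` are hypothetical, the filters are applied in a single forward pass (the text does not say whether zero-phase filtering is intended), the low-pass filter is started at the median of the first neighbourhood to avoid a start-up transient, the replacement is applied to the unfiltered input, and the half-second reference interval is taken on each side of the replaced region.

```python
import math
import statistics


def first_order_butterworth(x, cutoff_hz, fs, kind, y0=0.0):
    """Single forward pass of a first-order Butterworth filter (bilinear transform).

    kind is "low" or "high"; y0 is the assumed filter output before the record starts.
    """
    k = math.tan(math.pi * cutoff_hz / fs)
    a1 = (k - 1.0) / (k + 1.0)
    if kind == "low":
        b0, b1 = k / (1.0 + k), k / (1.0 + k)
    else:
        b0, b1 = 1.0 / (1.0 + k), -1.0 / (1.0 + k)
    y = []
    x_prev, y_prev = x[0], y0
    for xn in x:
        yn = b0 * xn + b1 * x_prev - a1 * y_prev
        y.append(yn)
        x_prev, y_prev = xn, yn
    return y


def despike_once(shear, fs, threshold=8.0, smooth_fc=0.5, n_after=None):
    """One de-spiking pass; returns (cleaned samples, indices of detected spikes)."""
    n = len(shear)
    if n == 0:
        return [], []
    if n_after is None:
        n_after = round(0.04 * fs)          # N = 0.04 * fs
    n_before = n_after // 2                 # N / 2 samples before the spike
    half_sec = round(0.5 * fs)

    # Steps 1 and 2: high-pass at 0.1 Hz, then rectify.
    rect = [abs(v) for v in first_order_butterworth(shear, 0.1, fs, "high")]

    # Step 3: smooth a copy of the rectified data.
    # Assumption: start the filter at the median level of the first neighbourhood.
    n0 = min(n, max(1, round(fs / smooth_fc)))
    level0 = statistics.median(rect[:n0])
    smooth = first_order_butterworth(rect, smooth_fc, fs, "low", y0=level0)

    # Step 4: flag samples whose ratio to the smoothed level exceeds the threshold.
    spikes = [i for i in range(n)
              if smooth[i] > 0.0 and rect[i] / smooth[i] > threshold]

    # Step 5: mark N/2 samples before and N samples after each spike.
    bad = [False] * n
    for i in spikes:
        for j in range(max(0, i - n_before), min(n, i + n_after + 1)):
            bad[j] = True

    # Replace each marked run by the mean of the unmarked samples within
    # half a second on either side of the run.
    cleaned = list(shear)
    i = 0
    while i < n:
        if not bad[i]:
            i += 1
            continue
        start = i
        while i < n and bad[i]:
            i += 1
        ref = [shear[j] for j in range(max(0, start - half_sec), start) if not bad[j]]
        ref += [shear[j] for j in range(i, min(n, i + half_sec)) if not bad[j]]
        fill = sum(ref) / len(ref) if ref else 0.0
        for j in range(start, i):
            cleaned[j] = fill
    return cleaned, spikes
```

Calling `despike_once(shear, fs)` with a list of shear samples and the sampling rate in Hz returns the cleaned series together with the indices flagged in step 4. The defaults (`threshold=8.0`, `smooth_fc=0.5`) sit within the ranges quoted in the steps and should be adjusted to the instrument and its speed.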

The purpose of the low-pass filter is to establish the typical level of shear in a neighbourhood whose duration is roughly equal to the inverse of the low-pass cutoff frequency. A shear sample is anomalous if its magnitude exceeds the typical magnitude of its neighbourhood by more than the threshold factor. Thus, an anomaly of a given size is detected when the variance of shear is small, but remains undetected when the variance of shear is large. That is, only anomalies that have the potential to bias the variance are removed.

What is a suitable neighbourhood and low-pass cutoff frequency? Turbulent patches in the ocean are seldom thinner than about [math]\displaystyle{ 0.5\ \mathrm{m} }[/math] in the vertical direction. This can serve as a lower limit to the neighbourhood and an upper limit to the cutoff frequency. For example, a vertical profiler moving at a speed of [math]\displaystyle{ 0.5\ \mathrm{m\, s^{-1}} }[/math] passes through such a patch in about one second, so the cutoff frequency should be no higher than [math]\displaystyle{ 1\ \mathrm{Hz} }[/math]. Gliders that move more slowly and at an angle of [math]\displaystyle{ 30^{\circ} }[/math] with respect to the horizontal take longer to pass through a patch of turbulence, and should therefore use a lower cutoff frequency to establish a neighbourhood for comparison of the ratio of signals.
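
The upper limit on the cutoff frequency can be estimated as the inverse of the time needed to cross the thinnest expected patch. The helper below is a hypothetical sketch: the name `max_smoothing_cutoff`, the use of along-path speed, and the glider speed of 0.3 m/s in the usage lines are assumptions made for illustration only.

```python
import math


def max_smoothing_cutoff(speed, patch_thickness=0.5, path_angle_deg=90.0):
    """Upper limit (Hz) for the low-pass cutoff used to smooth the rectified shear.

    speed: along-path speed of the instrument in m/s.
    patch_thickness: thinnest expected turbulent patch, in metres (vertical).
    path_angle_deg: angle of the path from the horizontal (90 for a vertical profiler).
    """
    vertical_speed = speed * math.sin(math.radians(path_angle_deg))
    crossing_time = patch_thickness / vertical_speed   # seconds spent inside the patch
    return 1.0 / crossing_time


profiler_limit = max_smoothing_cutoff(0.5)                     # about 1 Hz
glider_limit = max_smoothing_cutoff(0.3, path_angle_deg=30.0)  # about 0.3 Hz
```

The glider case gives a lower limit than the profiler case because only the vertical component of its speed carries it through the patch.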

The number of points to replace around a spike is determined by the typical relaxation time of the shear probe after a collision with zooplankton. Such anomalies seem to last about [math]\displaystyle{ 0.04\ \mathrm{s} }[/math], and so a good choice is [math]\displaystyle{ N=0.04\, f_s }[/math] where [math]\displaystyle{ f_s }[/math] is the sampling rate. For example, a sampling rate of [math]\displaystyle{ 512\ \mathrm{Hz} }[/math] gives [math]\displaystyle{ N \approx 20 }[/math] samples.

This algorithm is often applied iteratively until no more anomalies are detected.
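
Iterative application can be sketched as a loop around any single-pass routine. In the illustration below, `despike_until_clean` is a hypothetical name, the `max_passes` cap is an added safeguard that the text does not mention, and `toy_pass` is only a stand-in for a real de-spiking pass so that the example runs on its own.

```python
def despike_until_clean(shear, single_pass, max_passes=10):
    """Repeat a de-spiking pass until it reports no spikes or max_passes is reached.

    single_pass is any callable that maps a sequence of samples to
    (cleaned_samples, spike_indices).
    Returns (cleaned samples, total spikes found, number of passes run).
    """
    cleaned = list(shear)
    n_found = 0
    n_passes = 0
    for n_passes in range(1, max_passes + 1):
        cleaned, spikes = single_pass(cleaned)
        if not spikes:
            break
        n_found += len(spikes)
    return cleaned, n_found, n_passes


def toy_pass(samples, limit=10.0):
    """Stand-in pass: zero the first sample whose magnitude exceeds limit."""
    for i, v in enumerate(samples):
        if abs(v) > limit:
            out = list(samples)
            out[i] = 0.0
            return out, [i]
    return list(samples), []


cleaned, n_found, n_passes = despike_until_clean([1.0, 50.0, 2.0, 60.0, 3.0], toy_pass)
```

Here the stand-in removes one anomaly per pass, so two passes find spikes and a third confirms that none remain.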



