It is often useful to calculate descriptive statistics over a subsection (i.e., window) of a full dataset. Octave provides the function movfun which will call an arbitrary function handle with windows of data and accumulate the results. Many of the most commonly desired functions, such as the moving average over a window of data (movmean), are already provided.
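As a minimal sketch of the relationship described above, a moving average can be computed either with the generic movfun or with the dedicated movmean, and the two results should agree. The data vector x is made up for illustration.

```octave
x = [1 2 3 4 5 6];                    # illustrative data, not from the manual

y_generic   = movfun (@mean, x, 3);   # generic: accepts any function handle
y_dedicated = movmean (x, 3);         # ready-made moving average

disp (isequal (y_generic, y_dedicated))
```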
Apply function fcn to a moving window of length wlen on data x.
If wlen is a scalar, the function fcn is applied to a moving window of length wlen. When wlen is an odd number the window is symmetric and includes (wlen - 1) / 2 elements on either side of the central element. For example, when calculating the output at index 5 with a window length of 3, movfun uses data elements [4, 5, 6]. If wlen is an even number, the window is asymmetric and has wlen/2 elements to the left of the central element and wlen/2 - 1 elements to the right of the central element. For example, when calculating the output at index 5 with a window length of 4, movfun uses data elements [3, 4, 5, 6].
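One way to check which elements a scalar window covers is to use data whose values equal their indices and take the moving minimum and maximum. This is only an illustrative sketch; the variable names are not part of the manual.

```octave
x = 1:10;    # values equal their indices, so results reveal the window

## Odd length 3: the window at index 5 should cover elements [4, 5, 6]
lo3 = movfun (@min, x, 3)(5);
hi3 = movfun (@max, x, 3)(5);

## Even length 4: the window at index 5 should cover elements [3, 4, 5, 6]
lo4 = movfun (@min, x, 4)(5);
hi4 = movfun (@max, x, 4)(5);

printf ("wlen = 3: elements %d..%d\n", lo3, hi3);
printf ("wlen = 4: elements %d..%d\n", lo4, hi4);
```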
If wlen is an array with two elements [nb, na], the function is applied to a moving window -nb:na. This window includes nb elements before the current element and na elements after the current element. The current element is always included. For example, given wlen = [3, 0], the data used to calculate index 5 is [2, 3, 4, 5].
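The two-element form is convenient for trailing (backward-looking) windows. The sketch below, using made-up data, confirms the window bounds for wlen = [3, 0] and then computes a trailing sum over the same window.

```octave
x = 1:10;    # values equal their indices

## Trailing window [nb, na] = [3, 0]: three elements before, none after
first_used = movfun (@min, x, [3, 0])(5);
last_used  = movfun (@max, x, [3, 0])(5);
printf ("index 5 uses elements %d..%d\n", first_used, last_used);

## Trailing sum over the current element and the three before it
s = movfun (@sum, x, [3, 0]);
disp (s(5))
```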
During calculations the data input x is reshaped into a 2-dimensional wlen-by-N matrix and fcn is called on this new matrix. Therefore, fcn must accept an array input argument and apply the computation along dimension 1, i.e., down the columns of the array.
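A user-supplied fcn therefore needs to reduce along dimension 1. The following sketch defines a hypothetical helper, colrange, that passes the dimension explicitly so it behaves the same for a full window matrix and for a single shortened window; the name and data are illustrative only.

```octave
## Hypothetical helper: range (max - min) computed down each column.
## The explicit dimension argument 1 keeps the reduction column-wise.
colrange = @(w) max (w, [], 1) - min (w, [], 1);

x = [1 4 2 8 5];               # illustrative data
y = movfun (colrange, x, 3);   # default "shrink" endpoints
disp (y)
```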
When applied to an array (possibly multi-dimensional) with n columns, fcn may return a result in either of two formats:
Format 1) an array of size 1-by-n-by-dim3-by-…-by-dimN. This is the typical output format from Octave core functions. Type demo ("movfun", 5) for an example of this use case.
Format 2) a row vector of length n * numel_higher_dims where numel_higher_dims is prod (size (x)(3:end)). The output of fcn for the i-th input column must be found in the output at indices i:n:(n*numel_higher_dims). This format is useful when concatenating functions into arrays, or when using nthargout. Type demo ("movfun", 6) for an example of this case.
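The two formats can be contrasted with a small sketch. Here @min stands in for a typical core function (Format 1), and the hypothetical anonymous function minmax concatenates two results into one row vector (Format 2). The data is made up for illustration.

```octave
x = (1:10).';                      # column vector of illustrative data

## Format 1: a core function that reduces each column to one value
y1 = movfun (@min, x, 3);

## Format 2: two results concatenated into a single row vector,
## all minima first, then all maxima, one entry per input column
minmax = @(w) [min(w), max(w)];
y2 = movfun (minmax, x, 3);

disp (size (y1))
disp (size (y2))
```

With a column-vector input, y2 is expected to hold one column per concatenated result, so y2(:,1) would be the moving minimum and y2(:,2) the moving maximum.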
The calculation can be controlled by specifying property/value pairs. Valid properties are
"dim"
Operate along the dimension specified, rather than the default of the first non-singleton dimension.
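A short sketch of the "dim" property on a matrix follows. The 4-by-5 matrix is illustrative; with the default, the window slides down each column, while "dim", 2 slides it along each row.

```octave
x = reshape (1:20, 4, 5);    # 4-by-5 matrix of illustrative data

y_cols = movfun (@max, x, 3);             # default: first non-singleton dim (1)
y_rows = movfun (@max, x, 3, "dim", 2);   # slide along each row instead

disp (y_cols(:,1).')   # first column, processed top to bottom
disp (y_rows(1,:))     # first row, processed left to right
```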
"Endpoints"
This property controls how results are calculated at the boundaries (endpoints) of the window. Possible values are:
"shrink"
(default) The window is truncated at the beginning and end of the array to exclude elements for which there is no source data. For example, with a window of length 3, y(1) = fcn (x(1:2)), and y(end) = fcn (x(end-1:end)).
"discard"
Any y values that use a window extending beyond the original data array are deleted. For example, with a 10-element data vector and a window of length 3, the output will contain only 8 elements. The first element would require calculating the function over indices [0, 1, 2] and is therefore discarded. The last element would require calculating the function over indices [9, 10, 11] and is therefore discarded.
"fill"
Any window elements outside the data array are replaced by NaN. For example, with a window of length 3, y(1) = fcn ([NaN, x(1:2)]), and y(end) = fcn ([x(end-1:end), NaN]). This option usually results in y having NaN values at the boundaries, although it is influenced by how fcn handles NaN, and also by the property "nancond".
user_value
Any window elements outside the data array are replaced by the specified value user_value which must be a numeric scalar. For example, with a window of length 3, y(1) = fcn ([user_value, x(1:2)]), and y(end) = fcn ([x(end-1:end), user_value]). A common choice for user_value is 0.
"same"
Any window elements outside the data array are replaced by the value of x at the boundary. For example, with a window of length 3, y(1) = fcn ([x(1), x(1:2)]), and y(end) = fcn ([x(end-1:end), x(end)]).
"periodic"
The window is wrapped so that any missing data elements are taken from the other side of the data. For example, with a window of length 3, y(1) = fcn ([x(end), x(1:2)]), and y(end) = fcn ([x(end-1:end), x(1)]).
Note that for some of these choices, the window size at the boundaries is not the same as for the central part, and fcn must work in these cases.
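The "Endpoints" choices can be compared side by side on a short vector. This is an illustrative sketch with made-up data; the numeric value 10 is an arbitrary stand-in for user_value.

```octave
x = [1 2 3 4 5];     # illustrative data

y_shrink   = movfun (@sum, x, 3);                           # default
y_discard  = movfun (@sum, x, 3, "Endpoints", "discard");
y_fill     = movfun (@sum, x, 3, "Endpoints", "fill");
y_value    = movfun (@sum, x, 3, "Endpoints", 10);          # user_value = 10
y_same     = movfun (@sum, x, 3, "Endpoints", "same");
y_periodic = movfun (@sum, x, 3, "Endpoints", "periodic");

disp (y_shrink); disp (y_discard); disp (y_fill);
disp (y_value);  disp (y_same);    disp (y_periodic);
```

Following the definitions above, only the first and last outputs should differ between the options (for example, "same" would sum 1+1+2 at the start and 4+5+5 at the end), while "discard" returns three values instead of five.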
"nancond"
Controls whether NaN or NA values should be included (value: "includenan"), or excluded (value: "omitnan"), from the data passed to fcn. The default is "includenan". Caution: The "omitnan" option is not yet implemented.
"outdim"
A row vector that selects which dimensions of the calculation will appear in the output y. This is only useful when fcn returns an N-dimensional array in Format 1. The default is to return all output dimensions.
Programming Note: The property "outdim" can be used to save memory when the output of fcn has many dimensions, or when a wrapper to the base function that selects the desired outputs is too costly. When memory is not an issue, the easiest way to select output dimensions is to first calculate the complete result with movfun and then filter that result with indexing. If code complexity is not an issue then a wrapper can be created using anonymous functions. For example, if basefcn is a function returning a K-dimensional row output, and only dimension D is desired, then the following wrapper could be used.
fcn = @(x) basefcn (x)(:,size(x,2) * (D-1) + (1:size(x,2)));
y = movfun (fcn, …);
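As a hedged sketch of the two approaches mentioned in the note, the hypothetical function minmax below returns two outputs per column; "outdim" keeps only the second one, and plain indexing of the full result is shown for comparison. The data and names are illustrative assumptions.

```octave
x = (1:10).';                     # illustrative data
minmax = @(w) [min(w), max(w)];   # hypothetical fcn with two outputs per column

## Keep only the second output (the maximum) via "outdim"
y_max = movfun (minmax, x, 3, "outdim", 2);

## Alternative when memory is not a concern: compute everything, then index
y_all  = movfun (minmax, x, 3);
y_max2 = y_all(:,2);

disp (isequal (y_max(:), y_max2(:)))
```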
Generate indices to slice a vector of length N into windows of length wlen.
FIXME: Document inputs N, wlen
FIXME: Document outputs slcidx, C, Cpre, Cpost, win.
See also: movfun.
Minimum of x over a sliding window of length wlen.
FIXME: Need explanation of all options. Write once and then replicate.
See also: movfun.
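A minimal usage sketch for movmin follows, with made-up data. It assumes the default endpoint handling is the same "shrink" behavior described for movfun, since the options are not yet documented here.

```octave
x = [4 2 7 1 5];       # illustrative data
y = movmin (x, 3);     # minimum over a centered window of 3 elements
disp (y)
```

The result should match movfun (@min, x, 3).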