Theory of ARX model estimation

The ARX (AutoRegressive with eXogenous input) model is defined as follows:

\[ y_t = \theta' \psi_t + \rho e_t \]

where $y_t$ is the system output, $[\theta,\rho]$ is the vector of unknown parameters, $\psi_t$ is a vector of data-dependent regressors, and the noise $e_t$ is assumed to be normally distributed, $\mathcal{N}(0,1)$.
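For illustration (the model order and the offset term are chosen here only as an example), a second-order ARX model with one delayed input could use the regressor vector

\[ \psi_t = \left[y_{t-1},\, y_{t-2},\, u_{t-1},\, 1\right]' \]

with $\theta$ collecting the corresponding autoregressive coefficients, the input coefficient and the offset.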

Special cases include, e.g., the AR model (regressors built only from delayed outputs) and static linear regression (regressors independent of past outputs).

Off-line estimation

This particular model belongs to the exponential family, hence it has a conjugate distribution (i.e. both the prior and the posterior) of the Gauss-inverse-Wishart form. See [ref].

Estimation of this family can be achieved by accumulation of sufficient statistics. The sufficient statistics of the Gauss-inverse-Wishart density are composed of:

Information matrix
which is a sum of outer products

\[ V_t = \sum_{i=0}^{t} \left[\begin{array}{c}y_{i}\\ \psi_{i}\end{array}\right] \left[y_{i}',\,\psi_{i}'\right] \]

"Degree of freedom"
which is an accumulator of the number of data records

\[ \nu_t = \sum_{i=0}^{t} 1 \]
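The following sketch only illustrates this accumulation (it is not the bdm::ARX implementation; the data layout with extended vectors $[y_i;\,\psi_i]$ stored record by record is an assumption):

```cpp
#include <cstddef>
#include <vector>

using Vec = std::vector<double>;
using Mat = std::vector<Vec>;

// Gauss-inverse-Wishart sufficient statistics.
struct GiWStats {
    Mat V;      // information matrix: sum of outer products d_i d_i'
    double nu;  // degree of freedom: number of processed data records
};

// Off-line accumulation over a batch of extended data vectors d_i = [y_i; psi_i].
GiWStats accumulate(const std::vector<Vec>& data) {
    const std::size_t dim = data.empty() ? 0 : data[0].size();
    GiWStats s{Mat(dim, Vec(dim, 0.0)), 0.0};
    for (const Vec& d : data) {
        for (std::size_t i = 0; i < dim; ++i)
            for (std::size_t j = 0; j < dim; ++j)
                s.V[i][j] += d[i] * d[j];   // outer product contribution
        s.nu += 1.0;                        // one more data record
    }
    return s;
}
```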

On-line estimation

On-line estimation with stationary parameters can be achieved simply by collecting the sufficient statistics described above recursively.

Extension to non-stationary parameters, $ \theta_t, \rho_t $, can be achieved by an operation called forgetting. This is an approximation of Bayesian filtering, see [Kulhavy]. The resulting algorithm is defined by the following manipulation of the sufficient statistics:

Information matrix
which is now an exponentially weighted sum of outer products

\[ V_t = \phi V_{t-1} + \left[\begin{array}{c}y_{t}\\ \psi_{t}\end{array}\right] \left[y_{t}',\,\psi_{t}'\right] + (1-\phi) V_0 \]

"Degree of freedom"
which accumulates the number of data records with the same exponential weighting

\[ \nu_t = \phi \nu_{t-1} + 1 + (1-\phi) \nu_0 \]

where $ \phi \in [0,1] $ is the forgetting factor, typically chosen close to 1; it roughly corresponds to the effective length of an exponential data window via the relation:

\[ \mathrm{win\_length} = \frac{1}{1-\phi} \]

Hence, $ \phi=0.9 $ corresponds to estimation on an exponential window of effective length 10 samples.

The statistics $ V_0, \nu_0 $ are called alternative statistics; their role is to stabilize the estimation. It is easy to show that in the absence of new data records, the statistics $ V_t, \nu_t $ converge to the alternative statistics.
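A one-step sketch of this update is given below (again only illustrative, with an assumed data layout; it is not the bdm::ARX code). Note that for $ \phi = 1 $ it reduces to the plain recursive accumulation used for stationary parameters:

```cpp
#include <cstddef>
#include <vector>

using Vec = std::vector<double>;
using Mat = std::vector<Vec>;

// One step of the forgetting update of the statistics V, nu.
void forgetting_update(Mat& V, double& nu,
                       const Vec& d,               // d = [y_t; psi_t]
                       const Mat& V0, double nu0,  // alternative statistics
                       double phi) {               // forgetting factor
    const std::size_t dim = d.size();
    for (std::size_t i = 0; i < dim; ++i)
        for (std::size_t j = 0; j < dim; ++j)
            // V_t = phi * V_{t-1} + d_t d_t' + (1 - phi) * V_0
            V[i][j] = phi * V[i][j] + d[i] * d[j] + (1.0 - phi) * V0[i][j];
    // nu_t = phi * nu_{t-1} + 1 + (1 - phi) * nu_0
    nu = phi * nu + 1.0 + (1.0 - phi) * nu0;
}
```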

Structure estimation

For this model, structure estimation is a form of model selection. Specifically, we compare the hypothesis that the data were generated by the full model with hypotheses that some regressors in the vector $\psi$ are redundant. The number of possible hypotheses is then the number of all possible subsets of the regressors in $\psi$, i.e. it grows exponentially with the length of $\psi$.

However, due to a property known as nesting in the exponential family, these hypotheses can be tested using only the posterior statistics. (This property does not hold for forgetting with $ \phi<1 $.) Hence, for low-dimensional problems, this can be done by a tree search (method bdm::ARX::structure_est()), or by a more sophisticated algorithm [ref Ludvik].
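As a rough illustration of the exhaustive variant only (this is not bdm::ARX::structure_est(); the scoring callback is an assumed placeholder, e.g. the log marginal likelihood of a sub-model computed from the posterior statistics), the hypotheses can be enumerated as bit masks over the regressors:

```cpp
#include <cstdint>
#include <functional>
#include <limits>

// Exhaustive comparison of all 2^n hypotheses "only the regressors whose bit
// is set in `mask` are relevant".  Evaluating a hypothesis from the posterior
// statistics is delegated to a user-supplied scoring function; this sketch only
// organizes the search and is practical for small n only.
std::uint32_t best_structure(unsigned n_regressors,
                             const std::function<double(std::uint32_t)>& score) {
    std::uint32_t best_mask = 0;
    double best_score = -std::numeric_limits<double>::infinity();
    for (std::uint32_t mask = 0; mask < (1u << n_regressors); ++mask) {
        const double s = score(mask);
        if (s > best_score) {
            best_score = s;
            best_mask = mask;
        }
    }
    return best_mask;  // bit pattern of the winning hypothesis
}
```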

Software Image

Estimation of the ARX model is implemented in class bdm::ARX.

How to try

The best way to experiment with this object is to run the Matlab script arx_test.m located in the directory ./library/tutorial. See Running experiment estimator with ARX data fields for a detailed description.

