/*!
\page tut_arx Theory of ARX model estimation
\addindex Theory of ARX estimation

The \c ARX (AutoRegressive with eXogenous input) model is defined as follows:
---|
\f[
y_t = \theta' \psi_t + \rho e_t
\f]
where \f$y_t\f$ is the system output, \f$[\theta,\rho]\f$ is the vector of unknown parameters, \f$\psi_t\f$ is a
vector of data-dependent regressors, and the noise \f$e_t\f$ is assumed to be normally distributed, \f$e_t \sim \mathcal{N}(0,1)\f$.
---|

Special cases include:
\li estimation of the unknown mean and variance of a Gaussian density from independent samples.

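The model above is straightforward to simulate. The following is a minimal sketch in plain Python (purely illustrative, not part of the bdm library) for a first-order ARX model with regressor \f$\psi_t = [y_{t-1}, u_t]'\f$; the parameter values are assumptions chosen for the example.

```python
import random

# Illustrative ARX simulation: y_t = theta' * psi_t + rho * e_t,
# with psi_t = [y_{t-1}, u_t] and e_t ~ N(0, 1).
# All names and values here are assumptions for the sketch.
random.seed(0)

theta = [0.9, 0.5]   # assumed "true" parameters
rho = 0.1            # noise scaling

u = [random.gauss(0.0, 1.0) for _ in range(100)]   # exogenous input
y = [0.0]                                          # initial output
for t in range(1, 100):
    psi = [y[t - 1], u[t]]                         # regressor vector
    y.append(sum(a * b for a, b in zip(theta, psi))
             + rho * random.gauss(0.0, 1.0))
```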
---|
\section off Off-line estimation
This particular model belongs to the exponential family; hence, it has a conjugate distribution (i.e. both prior and posterior) of the Gauss-inverse-Wishart form. See [ref].

Estimation within this family can be achieved by accumulation of sufficient statistics. The sufficient statistics of the Gauss-inverse-Wishart density are composed of:
<dl>
<dt>Information matrix</dt> <dd>which is a sum of outer products \f[
V_t = \sum_{i=0}^{t} \left[\begin{array}{c}y_{i}\\ \psi_{i}\end{array}\right]
[y_{i}',\,\psi_{i}']
\f]</dd>
<dt>"Degree of freedom"</dt> <dd>which is an accumulator of the number of data records \f[
\nu_t = \sum_{i=0}^{t} 1
\f]</dd>
</dl>
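The accumulation can be sketched for the scalar special case mentioned above (unknown mean and variance of a Gaussian, i.e. \f$\psi_t = 1\f$). This is a hypothetical illustration in plain Python, not the bdm implementation; the variable names are assumptions.

```python
# Accumulate the sufficient statistics V (information matrix) and
# nu (degree of freedom) for the special case psi_t = 1.
# V is the sum of outer products of the extended vector [y_t; psi_t].
data = [1.2, 0.8, 1.1, 0.9, 1.0]

V = [[0.0, 0.0], [0.0, 0.0]]   # 2x2 information matrix
nu = 0
for y_t in data:
    d = [y_t, 1.0]             # extended data vector [y_t; psi_t]
    for i in range(2):
        for j in range(2):
            V[i][j] += d[i] * d[j]
    nu += 1

# V[0][1] / V[1][1] recovers the least-squares estimate of theta,
# which for psi_t = 1 is simply the sample mean.
print(V[0][1] / V[1][1], nu)
```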
---|

\section on On-line estimation
On-line estimation with stationary parameters can be easily achieved by collecting the sufficient statistics described above recursively.

Extension to non-stationary parameters, \f$ \theta_t, \rho_t \f$, can be achieved by an operation called forgetting, which is an approximation of Bayesian filtering, see [Kulhavy]. The resulting algorithm is defined by the following manipulation of the sufficient statistics:
---|
<dl>
<dt>Information matrix</dt> <dd>which becomes a discounted sum of outer products \f[
V_t = \phi V_{t-1} + \left[\begin{array}{c}y_{t}\\ \psi_{t}\end{array}\right]
[y_{t}',\,\psi_{t}']
+(1-\phi) V_0
\f]</dd>
<dt>"Degree of freedom"</dt> <dd>which becomes a discounted accumulator of the number of data records \f[
\nu_t = \phi \nu_{t-1} + 1 + (1-\phi) \nu_0
\f]</dd>
</dl>
---|
where \f$ \phi \f$ is the forgetting factor, typically \f$ \phi \in [0,1] \f$, roughly corresponding to the effective length of an exponential window via the relation \f[
\mathrm{win\_length} = \frac{1}{1-\phi}.\f]
Hence, \f$ \phi=0.9 \f$ corresponds to estimation on an exponential window of effective length 10 samples.

The statistics \f$ V_0, \nu_0 \f$ are called the alternative statistics; their role is to stabilize the estimation. It is easy to show that, for zero data, the statistics \f$ V_t, \nu_t \f$ converge to the alternative statistics.
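The recursive update with forgetting can be sketched as follows, again for the scalar special case \f$\psi_t = 1\f$. This is an illustrative plain-Python sketch, not the bdm API; the choice of \f$V_0\f$, \f$\nu_0\f$, and the data are assumptions.

```python
# Forgetting update, per record:
#   V_t  = phi*V_{t-1} + [y_t; psi_t][y_t', psi_t'] + (1 - phi)*V_0
#   nu_t = phi*nu_{t-1} + 1 + (1 - phi)*nu_0
phi = 0.9                        # forgetting factor, window ~ 1/(1-phi) = 10
V0 = [[1.0, 0.0], [0.0, 1.0]]    # assumed alternative (stabilizing) statistics
nu0 = 1.0

V = [row[:] for row in V0]       # start from the alternative statistics
nu = nu0
for y_t in [1.2, 0.8, 1.1, 0.9, 1.0]:
    d = [y_t, 1.0]               # extended data vector [y_t; psi_t]
    for i in range(2):
        for j in range(2):
            V[i][j] = phi * V[i][j] + d[i] * d[j] + (1 - phi) * V0[i][j]
    nu = phi * nu + 1 + (1 - phi) * nu0
```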
---|

\section str Structure estimation
For this model, structure estimation is a form of model selection.
Specifically, we compare the hypothesis that the data were generated by the full model with hypotheses that some regressors in the vector \f$\psi\f$ are redundant. The number of possible hypotheses is then the number of all possible combinations of the regressors.

However, due to a property known as nesting in the exponential family, these hypotheses can be tested using only the posterior statistics. (This property does not hold for forgetting \f$ \phi<1 \f$.) Hence, for low-dimensional problems, this can be done by a tree search (method bdm::ARX::structure_est()), or by a more sophisticated algorithm [ref Ludvik].
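The nesting property can be illustrated as follows: the posterior statistics of a sub-model that omits some regressors are obtained by deleting the corresponding rows and columns of \f$V\f$ (while \f$\nu\f$ is unchanged). This is a hypothetical sketch, not the bdm::ARX::structure_est() implementation, and the matrix values are made up.

```python
# Illustration of nesting: testing the hypothesis that regressor
# psi_2 is redundant amounts to deleting its row and column from V.
V = [
    [4.0, 2.0, 1.0],   # rows/cols ordered as [y, psi_1, psi_2]
    [2.0, 3.0, 0.5],
    [1.0, 0.5, 2.0],
]
keep = [0, 1]          # keep y and psi_1, drop psi_2
V_sub = [[V[i][j] for j in keep] for i in keep]
```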
---|

\section soft Software Image
Estimation of the ARX model is implemented in class bdm::ARX.
\li Models from the exponential family share some properties; these are encoded in class bdm::BMEF, which is the parent of bdm::ARX.
\li One of the parameters of bdm::BMEF is the forgetting factor, which is stored in attribute \c frg.
\li The posterior density is stored inside the estimator in the form of bdm::egiw.
\li References to the statistics of the internal \c egiw class, i.e. attributes \c V and \c nu, are established for convenience.
---|

\section try How to try
The best way to experiment with this object is to run the Matlab script \c arx_test.m located in directory \c ./library/tutorial. See \ref arx_ui for a detailed description.

\li In the default setup, the parameters converge to the true values as expected.
\li Try changing the forgetting factor, field \c estimator.frg, to values <1. You should see increased lower and upper bounds on the estimates.
\li Try a different set of parameters, field \c system.theta; you should note that poles close to zero are harder to identify.


*/
---|