03userguide_estim.dox @ 1417

Revision 1054, 10.7 kB (checked in by smidl, 14 years ago)
doc

Line
1	/*!
2	\page userguide_estim BDM Use - Estimation and Bayes Rule
3
4	Bayesian theory is predominantly used in system identification, or estimation, problems.
5	This section is concerned with recursive estimation, as implemented in the prepared scenario \c estimator.
6
7	Table of contents:
8	\ref ug2_theory
9	\ref ug2_arx_basic
10	\ref ug2_model_sel
11	\ref ug2_bm_composition
12	\ref ug_est_ext
13
14	The function of the \c estimator is graphically illustrated:
15	\dot
16	digraph estimation{
17	node [shape=box];
18	{rank="same"; "Data Source"; "Bayesian Model"}
19	"Data Source" -> "Bayesian Model" [label="data"];
20	"Bayesian Model" -> "Result Logger" [label="estimation\n result"];
21	"Data Source" -> "Result Logger" [label="Simulated\n data"];
22	}
23	\enddot
24
25	Here,
26	\li Data Source is an object (class DS) providing sequential data, \f$ [d_1, d_2, \ldots, d_t] \f$,
27	\li Bayesian Model is an object (class BM) performing Bayesian filtering,
28	\li Result Logger is an object (class logger) dedicated to storing important data from the experiment.
29
30	Since the data source and logger objects have already been introduced in section \ref userguide_sim, it remains to introduce
31	the object \c Bayesian \c Model (bdm::BM).
32
33	\section ug2_theory Bayes rule and estimation
34	The object bdm::BM is the basic software image of the Bayes rule:
35	\f[ f(x_t\|d_1\ldots d_t) \propto f(d_t\|x_t,d_1\ldots d_{t-1}) f(x_t\| d_1\ldots d_{t-1}) \f]
36
37	Since this operation cannot be defined universally, the object is defined as an abstract class with methods for:
38	- <b> Bayes rule </b> as defined above, operation bdm::BM::bayes(), which expects to get the current data record \c dt, \f$ d_t \f$,
39	- <b> evidence </b> i.e. the numerical value of \f$ f(d_t\|d_1\ldots d_{t-1})\f$, a typical side-product, since it is the normalizing denominator of the above formula.
40	For some models, computation of this value may require extra effort, and can be switched off.
41	- <b> prediction </b> the object has enough information to create the one-step-ahead predictor, i.e. \f[ f(d_{t+1}\| d_1 \ldots d_{t}). \f]
42
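The three operations can be illustrated numerically on a grid. The following is a minimal Matlab sketch that does not use BDM at all: the scalar state, the Gaussian prior N(0,1), the Gaussian observation model with variance \c r, and the data values are assumptions chosen only for illustration, and the state is assumed static for the prediction step.
\code
% Grid approximation of one Bayes-rule step for a scalar x_t (illustration only)
x  = linspace(-5, 5, 1001);                 % grid of candidate values of x_t
dx = x(2) - x(1);                           % grid spacing used for numerical integration
prior = exp(-0.5 * x.^2) / sqrt(2*pi);      % f(x_t | d_1...d_{t-1}), assumed N(0,1)

dt = 1.3;                                   % current data record d_t (made-up value)
r  = 0.5;                                   % assumed observation variance
lik = exp(-0.5 * (dt - x).^2 / r) / sqrt(2*pi*r);   % f(d_t | x_t)

joint     = lik .* prior;                   % numerator of the Bayes rule
evidence  = sum(joint) * dx;                % f(d_t | d_1...d_{t-1})
posterior = joint / evidence;               % f(x_t | d_1...d_t)
post_mean = sum(x .* posterior) * dx;       % exact value for this Gaussian case: dt/(1+r)

% One-step-ahead predictor evaluated at a candidate value of d_{t+1}
dnext = 0.0;
lik_next = exp(-0.5 * (dnext - x).^2 / r) / sqrt(2*pi*r);
pred = sum(lik_next .* posterior) * dx;     % f(d_{t+1} | d_1...d_t) at dnext
\endcode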
43	Implementation of these operations depends heavily on the specific class of the prior pdf, or its approximations. We can identify only a few principal approaches to this problem: for example, analytical estimation, which is possible within the Exponential Family with sufficient statistics, or estimation where both the prior and the posterior are approximated by empirical densities.
44	These approaches form the first level of descendants of class \c BM: classes bdm::BMEF and bdm::PF, respectively.
45
46	List of all available <a href="annotated_bdm_BM.html"> Bayesian Models </a>.
47
48	\section ug2_arx_basic Estimation of ARX models
49
50	Autoregressive models have already been introduced in \ref ug_arx_sim, where their simulator was presented.
51	We will use results of simulation of the ARX datasource defined there to provide data for estimation using MemDS.
52
53	The following code is from bdmtoolbox/tutorial/userguide/arx_basic_example.m
54	\code
55	A1.class = 'ARX';
56	A1.rv = y;
57	A1.rgr = RVtimes([y,y],[-3,-1]) ;
58	A1.log_level = 'logbounds,logevidence';
59	\endcode
60	This is the minimal configuration of an ARX estimator.
61
62	The first three fields are self-explanatory: they identify which data are predicted (field \c rv) and which are in the regressor (field \c rgr).
63	The field \c log_level is a string of options passed to the object. In particular, class \c BM understands only options related to storing results:
64	- logbounds - store also lower and upper bounds on estimates (obtained by calling BM::posterior().qbounds()),
65	- logevidence - store also evidence of each step of the Bayes rule.
66	These values are stored in the given logger (\ref ug_store). By default, only mean values of the estimate are stored.
67
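A minimal sketch of running this configuration is given here. It assumes that \c DS is the MemDS data source already prepared from the simulation in \ref ug_arx_sim; the call form follows the \c estimator call used in the MPF example later in this guide.
\code
% DS is assumed to be configured beforehand (see the simulation section)
M = estimator(DS, {A1});   % run the estimator scenario with the single model A1
\endcode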
68	Storing the evidence is useful, e.g., in model selection tasks where two models are compared.
69
70	The bounds are useful, e.g., for visualization of the results. Running the example should produce a result like the following:
71	\image latex arx_basic_example.png "Typical run of tutorial/userguide/arx_basic_example.m" width=\linewidth
72
73	\section ug2_model_sel Model selection
74
75	In the Bayesian framework, model selection is done by comparing the evidence (marginal likelihood) of the recorded data under each model. See [some theory].
76
77	A trivial example of how this can be done is presented in the file bdmtoolbox/tutorial/userguide/arx_selection_example.m. The code extends the basic A1 object as follows:
78	\code
79	A2=A1;
80	A2.constant = 0;
81
82	A3=A2;
83	A3.frg = 0.95;
84	\endcode
85	That is, two other ARX estimators are created:
86	- A2, which is the same as A1 except that it does not model the constant term in the linear regression. Note that if the constant was set to zero in the simulated system, then this is the correct model.
87	- A3, which is the same as A2, but assumes time-variant parameters with forgetting factor 0.95.
88
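As a hedged illustration of what a forgetting factor does, the textbook form of exponential forgetting flattens the previous posterior before each new data record is processed. Here \f$ \theta \f$ stands for the unknown parameters and \f$ \phi \f$ for the forgetting factor; both symbols are assumptions of this sketch, and the exact variant implemented in bdm::ARX may differ:
\f[ f(\theta_t\|d_1\ldots d_{t-1}) \propto \left[ f(\theta_{t-1}\|d_1\ldots d_{t-1}) \right]^{\phi}, \quad 0 < \phi \le 1. \f]
In this form, \f$ \phi = 1 \f$ corresponds to time-invariant parameters, while a value such as \f$ \phi = 0.95 \f$ progressively discounts older data.
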
89	Since all estimators were configured to store values of the marginal log-likelihood, we can easily compare them by computing the total log-likelihood for each of them and converting these totals to probabilities. Typically, the results should look like:
90	\code
91	Model_probabilities =
92
93	0.0002 0.7318 0.2680
94	\endcode
95	Hence, the true model A2 was correctly identified as the most likely to produce this data.
96
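The conversion of log-evidence to model probabilities can be sketched in plain Matlab as follows. The numbers are made up for illustration and are not output of the tutorial; in practice the per-step values would be taken from the logged results. A uniform prior over the models is assumed.
\code
% Hypothetical per-step log-evidence, one column per model (A1, A2, A3)
logev = [ -1.2 -0.9 -1.0;
          -1.5 -0.8 -0.9;
          -1.1 -0.7 -0.8 ];

total = sum(logev, 1);               % total log-likelihood of each model
w = exp(total - max(total));         % subtract the maximum to avoid numerical underflow
Model_probabilities = w / sum(w)     % normalize so that the probabilities sum to one
\endcode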
97	For this task, additional technical adjustments were needed:
98	\code
99	A1.name='A1';
100	A2.name='A2';
101	A2.rv_param = RV({'a2th', 'r'},[2,1],[0,0]);
102	A3.name='A3';
103	A3.rv_param = RV({'a3th', 'r'},[2,1],[0,0]);
104	\endcode
105	First, in order to distinguish the estimators from each other, the estimators were given names. Hence, the results will be logged with a prefix given by the name, such as M.A1_evidence.
106
107	Second, if the parameters of an ARX model are not specified, they are automatically named \c theta and \c r. However, in this case, \c A1 and \c A2 differ in size, hence their random variables differ and cannot use the same name. Therefore, we have explicitly used other names (RVs) for the parameters.
108
109	\section ug2_bm_composition Composition of estimators
110
111	Similarly to pdfs, which can be composed via \c mprod, Bayesian models can be composed together. However, the justification of this step is less clear than in the case of epdfs.
112
113	One possible theoretical basis of composition is the Marginalized particle filter, which splits the prior and the posterior into two parts:
114	\f[ f(x_t\|d_1\ldots d_t)=f(x_{1,t}\|x_{2,t},d_1\ldots d_t)f(x_{2,t}\|d_1\ldots d_t) \f]
115	Each of these parts is estimated using a different approach. The first part is assumed to be analytically tractable, while the second is approximated by an empirical density.
116
117	The whole algorithm runs by parallel evaluation of many \c BMs for estimation of \f$ x_{1,t}\f$, each of them conditioned on the value of one sample of \f$ x_{2,t}\f$.
118
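Written out, the empirical approximation of the second part and the resulting joint posterior take the following standard form. The symbols \f$ w_t^{(i)} \f$ (particle weights), \f$ x_{2,t}^{(i)} \f$ (samples) and \f$ n \f$ (number of particles) are notation introduced for this sketch only:
\f[ f(x_{2,t}\|d_1\ldots d_t) \approx \sum_{i=1}^{n} w_t^{(i)} \delta(x_{2,t}-x_{2,t}^{(i)}), \f]
\f[ f(x_t\|d_1\ldots d_t) \approx \sum_{i=1}^{n} w_t^{(i)} f(x_{1,t}\|x_{2,t}^{(i)},d_1\ldots d_t)\, \delta(x_{2,t}-x_{2,t}^{(i)}). \f]
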
119	For example, the forgetting factor \f$ \phi \f$ of an ARX model can be considered unknown. Then, the whole parameter space \f$ [\theta_t, r_t, \phi_t]\f$ is decomposed as follows:
120	\f[ f(\theta_t, r_t, \phi_t) = f(\theta_t, r_t\| \phi_t) f(\phi_t) \f]
121	Note that for a known trajectory of \f$ \phi_t \f$ the standard ARX estimator can be used, provided we find a way to feed the changing \f$ \phi_t \f$ into it.
122	This is achieved by a trivial extension using the inherited method bdm::BM::condition().
123
124	The extension of the standard ARX estimator to a conditional estimator is implemented as class bdm::ARXfrg. The only difference from the standard ARX is that this object obtains its forgetting factor externally, as a conditioning variable.
125	Informally, the name 'ARXfrg' means: "if anybody calls your condition(0.9), it tells you the new value of the forgetting factor".
126
127	The MPF estimator for this case is specified as follows:
128	\code
129	%%%%%% ARX estimator conditioned on frg
130
131	A1.class = 'ARXfrg';
132	A1.rv = y;
133	A1.rgr = RVtimes([y,u],[-3,-1]) ;
134	A1.log_level ='logbounds,logevidence';
135	A1.frg = 0.9;
136	A1.name = 'A1';
137
138	%%%%%% Random walk on frg - Dirichlet
139	phi_pdf.class = 'mDirich'; % random walk on coefficient phi
140	phi_pdf.rv = RV('phi',2); % 2D random walk - frg is the first element
141	phi_pdf.k = 0.01; % width of the random walk
142	phi_pdf.betac = [0.01 0.01]; % stabilizing element of random walk
143
144	%%%%%% Particle
145	p.class = 'MarginalizedParticle';
146	p.parameter_pdf = phi_pdf; % Random walk is the parameter evolution model
147	p.bm = A1;
148
149	% prior on ARX
150	%%%%%% Combining estimators in Marginalized particle filter
151	E.class = 'PF';
152	E.particle = p; % ARX is the analytical part
153	E.res_threshold = 1.0; % resampling parameter
154	E.n = 100; % number of particles
155	E.prior.class = 'eDirich'; % prior on non-linear part
156	E.prior.beta = [2 1]; %
157	E.log_level = 'logbounds';
158	E.name = 'MPF';
159
160	M=estimator(DS,{E});
161
162	\endcode
163
164	Here, the configuration structure \c A1 is a description of an ARX model, as used in previous examples; the only difference is in its class name, 'ARXfrg'.
165
166	The configuration structure \c phi_pdf defines a random walk on the forgetting factor. It was chosen as Dirichlet, hence it will produce a 2-dimensional vector \f$[\phi, 1-\phi]\f$. The class \c ARXfrg was designed to read only the first element of its condition.
167	The random walk of type mDirich is:
168	\f[ f(\phi_t\|\phi_{t-1}) = Di (\phi_{t-1}/k + \beta_c) \f]
169	where \f$ k \f$ influences the spread of the walk and \f$ \beta_c \f$ has a stabilizing role, helping the walk avoid getting trapped in corner cases such as [0,1] and [1,0].
170	The value of \f$ \beta_c \f$ has a considerable influence on the results.
171
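The random walk formula above can be simulated on its own, outside BDM and independently of the tutorial files. The following minimal sketch is not the implementation of \c mDirich; it only draws from the stated Dirichlet density by normalizing Gamma samples, and it assumes that \c gamrnd (Statistics Toolbox) is available. The starting point and the number of steps are arbitrary choices. Since the Dirichlet parameters sum to roughly 1/k, a larger \c k gives a wider walk.
\code
k     = 0.01;            % spread of the walk (value taken from the configuration above)
betac = [0.01 0.01];     % stabilizing element (value taken from the configuration above)
phi   = [0.9 0.1];       % assumed starting point [phi, 1-phi]
T     = 50;              % number of simulated steps (arbitrary)
traj  = zeros(T, 2);     % storage for the simulated trajectory

for t = 1:T
    alpha = phi / k + betac;   % Dirichlet parameter of f(phi_t | phi_{t-1})
    g     = gamrnd(alpha, 1);  % independent Gamma(alpha_i, 1) draws
    phi   = g / sum(g);        % normalized Gamma draws are Dirichlet distributed
    traj(t, :) = phi;          % first column is the simulated forgetting factor
end
\endcode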
172	The MPF example above is implemented as bdmtoolbox/tutorial/userguide/frg_example.m
173	Its typical run should look like the following:
174	\image html frg_example_small.png
175	\image latex frg_example.png "Typical run of tutorial/userguide/frg_example.m" width=\linewidth
176
177	Note: error bars in this case are not directly comparable with those of previous examples. The MPF class implements the qbounds function as the minimum and maximum of the bounds over the considered set of particles (even those whose weight is extremely small). Hence, the bounds of the MPF are probably larger than they should be. Nevertheless, they are of great help when designing and tuning algorithms.
178
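The effect described in the note can be illustrated in plain Matlab. The arrays below are hypothetical bounds reported by three particles for two parameters; they are not BDM output, and the code only mimics the minimum and maximum rule described above. The third particle has a negligible weight, yet it determines the lower bound of the first parameter and the upper bound of the second.
\code
lb = [ 0.1 0.8;  0.2 0.7; -0.4 0.9];   % lower bounds, one row per particle
ub = [ 0.5 1.2;  0.6 1.1;  0.3 1.4];   % upper bounds, one row per particle
w  = [0.70 0.29 0.01];                 % particle weights, ignored by the envelope

mpf_lb = min(lb, [], 1);               % gives [-0.4 0.7]
mpf_ub = max(ub, [], 1);               % gives [ 0.6 1.4]
\endcode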
179	\section ug_est_ext Matlab extensions of the Bayesian estimators
180
181	Similarly to the extension of pdfs, the estimators (or filters) can be extended via the prepared class \c mexBM in directory bdmtoolbox/mex/mex_classes.
182
183	An example of such a class is mexLaplaceBM in \<toolbox_dir\>/tutorial/userguide/laplace_example.m
184
185	Note that Matlab-extended classes of mexEpdf, specifically mexDirac and mexLaplace, are used as outputs of methods posterior and epredictor, respectively.
186
187	In order to create a new extension of an estimator, copy the file with class mexLaplaceBM.m and redefine the methods therein. If needed, create new classes for pdfs by inheriting from mexEpdf, in the same way as in the mexLaplace.m example class.
188
189	For a list of all Matlab estimators, see <a href="annotated.html"> list </a>.
190
191	*/
