01userguide.dox @ 659

Revision 659, 16.3 kB (checked in by mido, 16 years ago)
synchronization of documentation pages names

/*!
\page userguide BDM Use - System, Data, Simulation

This section serves as an introduction to the scenario of data simulation. Since it is the simplest of all scenarios defined in \ref 005userguide0, it also serves as an introduction to the configuration of an experiment (see \ref ui) and to the basic decision making objects (bdm::RV and bdm::DS).

All experiments are demonstrated on the scenario \c simulator, which can be either a standalone application or a mex file (simulator.mex**).


\section ug_config Configuration of an experiment

A configuration file (or config structure) is organized as a tree of information. Higher levels represent complex structures; the leaves of the tree are basic data elements such as strings, numbers or vectors.

Objects receive special treatment. Since BDM is designed as an object-oriented library, the configuration was designed to honor the rule of inheritance: an offspring of a class can be used in place of its predecessor. Hence, objects (instances of classes) are configured by a structure with a compulsory field \c class. This is a string variable containing the name of the class to be used.

The configuration can be given in two forms:
- a configuration file using the syntax of libconfig (see \ref ui),
- a matlab structure.
For the purpose of this tutorial, we will use the matlab notation.
The two forms can be converted into one another using the prepared mex files config2mxstruct.mex and mxstruct2config.mex. Naturally, these mex files require matlab to run. If it is not available, manual conversion is relatively simple; the major difference is in the types of brackets used (see \ref ui).

\subsection ug_first First experiment

The first experiment that can be performed is:
\code
DS.class='MemDS';
DS.Data =[1 2 3 4 5 6];
\endcode
which can be found in file bdmtoolbox/tutorials/userguide/memds_example.m.

The code above is the minimum information necessary to run the scenario \c simulator in matlab.
To actually do so, make sure that matlab can find the simulator.mex file, e.g. by running:
\code
>> addpath _path_to_/bdmtoolbox/mex/
\endcode

The expected result in matlab is:
\code
>> M=simulator(DS)

M =

ch0: [6x1 double]
\endcode

If you see this result, you have configured BDM correctly and you have successfully run your first experiment. Otherwise, please check your installation, see \ref install.
All that the simulator did was copy \c DS.Data to \c M.ch0. An explanation of the experiment and the logic behind it follows.

\section ug_sim Systems and DataSources

In standard system theory, a system is typically illustrated graphically as:
\dot
digraph sys{
node [shape=box];
{"System"}
node [shape=plaintext]
{rank="same"; "u"; "System"; "y"}
"u" -> "System" -> "y" [nodesep=2];
}
\enddot
where \c u typically denotes the input and \c y the output of the system. A causal dependence between input and output is typically presumed.

We are predominantly concerned with discrete-time systems; hence, we add the index \f$ _t \f$ to both input and output, \f$ u_t \f$ and \f$ y_t \f$. We presume that the causal dependence is such that \f$ u_t \f$ comes before \f$ y_t \f$.

One of the definitions of a system is that a system is a "set of variables observed on a part of the world". Under this definition, a system is understood as a generator of data. This definition may be considered too simplistic, but it serves well as a description of what the software object \c DataSource is.

DataSource is an object that is essentially:
-# able to return data observed at time \f$ t \f$ (bdm::DS::getdata()),
-# able to perform one time step (bdm::DS::step()),
-# able to describe what these data are (bdm::DS::_drv()).

No further specification is given, e.g. whether the data are pre-recorded or computed on-the-fly.
The specific behaviour of various DataSources is implemented as a specialization of the root class bdm::DS.


\section ug_memds DataSource of pre-recorded data -- MemDS

The first experiment run in \ref ug_first was actually an instance of a DataSource of pre-recorded data stored in memory, i.e. the bdm::MemDS class.

The operation of such an object is trivial: the data are stored as a matrix and the general operations defined above are specialized as follows:
-# data observed at time \f$ t \f$ are columns of the matrix, getdata() returns the current column,
-# the time step itself is performed by increasing the column index,
-# the rows are named "ch0","ch1",...

This is the default behavior. It can be customized using the UI mechanism.
When an object of class MemDS is created, it calls the method bdm::MemDS::from_setting() and the input structure is parsed for settings. All available settings are documented in the method, see bdm::MemDS::from_setting(). The options are:
\code
DS.class = 'MemDS';
DS.Data = (...); // Data matrix or data vector
--- optional ---
DS.drv = RV({"ch0",...} ); // Identification how rows of the matrix Data will be known to others
DS.time = 0; // Index of the first column to use
DS.rowid = [1,2,3...]; // ids of rows to be used
\endcode
The compulsory fields are listed at the beginning; the optional fields are separated by the string "--- optional ---".

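As an illustration only, the default behaviour of MemDS described above can be mimicked in plain matlab without BDM. This is a sketch and not the BDM implementation: the variables \c col and \c dt are hypothetical, and only the names \c Data and \c ch0 are taken from the text.
\code
Data = [1 2 3 4 5 6];              % one row (one channel), six time steps
ncols = size(Data, 2);
M.ch0 = zeros(ncols, 1);           % storage for the channel named 'ch0'
col = 1;                           % column index of the current time step
while col <= ncols
    dt = Data(:, col);             % counterpart of getdata(): the current column
    M.ch0(col) = dt(1);            % store the first (and only) row under 'ch0'
    col = col + 1;                 % counterpart of step(): advance the column index
end
\endcode
Under these assumptions \c M.ch0 is a 6x1 vector holding the entries of \c Data, which agrees with the shape reported in \ref ug_first.
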
Fields \c time and \c rowid are self-explanatory. Field \c drv is the one that specifies the identification of the data elements (point 3 of the general requirements on a DataSource).

All optional fields are filled with default values, in this case:
\code
DS.drv = RV({'ch0'},1,0);
DS.time = 0;
DS.rowid = [1];
\endcode
where the first line specifies a universal identification structure: the random variable (bdm::RV).

\section ug_rvs What is RV and how to use it

RV stands for \c random \c variable; it is a description of a random variable or of its realization. This object plays the role of an identifier of elements of data vectors (in datasources), of expected inputs to functions (in pdfs), or of required results (in operations such as conditioning).

The mathematical interpretation of RV is straightforward. Consider a pdf \f$ f(a)\f$; then \f$ a \f$ is the part represented by RV. Explicit naming of random variables may seem unnecessary for many operations with pdfs, e.g. for generating a uniform sample from <0,1> it is not necessary to specify any random variable. For this reason, RVs are often optional. However, the considered scenario \c simulator is built in a way that requires RVs to be given.

In software, \c RV has three compulsory properties:
- <b>name</b>, a unique identifier; two RVs with the same name are considered to be identical,
- <b>size</b>, the size of the random variable; if not given, it is assumed to be 1,
- <b>time</b>, more exactly the time shift from \f$ t \f$; defaults to 0.
For example, scalar \f$ x_{t-2} \f$ is encoded as (name='x',size=1,time=-2).
Each RV stores an array of these elements, hence an RV with:
\code
names={'a', 'b'};
sizes=[ 2 , 3];
times=[-1, 1];
\endcode
denotes the 5-dimensional vector \f$ [a_{t-1}, b_{t+1}] \f$.

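The bookkeeping behind the 5-dimensional vector can be sketched in plain matlab. The sketch assumes that the elements are stacked in the order in which the names are listed; the variables \c total, \c first and \c last are hypothetical helpers and not part of BDM.
\code
names = {'a', 'b'};
sizes = [2, 3];
times = [-1, 1];

total = sum(sizes);              % dimension of the whole vector, here 5
last  = cumsum(sizes);           % last element of each block:  [2 5]
first = last - sizes + 1;        % first element of each block: [1 3]
for i = 1:numel(names)
    fprintf('%s_{t%+d} occupies elements %d to %d of %d\n', ...
            names{i}, times(i), first(i), last(i), total);
end
\endcode
With this stacking order, \c a fills elements 1 to 2 and \c b fills elements 3 to 5.
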
\subsection ug_rv_alg Algebra on RVs
Algebra on RVs (adding, searching in, subtraction, intersection, etc.) is implemented, see bdm::RV.

For convenience in Matlab, the following operations are defined:
- RV(names,sizes,times) creates a configuration structure for an RV,
- RVjoin(rvs) joins the configuration structures of an array of RVs, rvs=[rv1,rv2,...],
- RVtimes(rvs,times) assigns times to the corresponding rvs.

See examples in bdmtoolbox/tutorial/userguide

\subsection ug_rv_connect Connecting DataSources and loggers

The \c simulator scenario connects the DataSource to the second basic class of BDM, bdm::logger. The logger is a class that takes care of storing results -- in this case, results of the simulation.
The connection between these blocks is made automatically. The logger stores the results of the simulation under the names specified in \c drv.
Readers familiar with the Simulink environment may look at RVs as unique identifiers of inputs and outputs of simulation blocks. The inputs are connected automatically with the outputs with matching RV. This view is, however, very incomplete; RVs have more roles than this.


\section loggers Loggers for flexible handling of results
Loggers are universal objects for storing and manipulating the results of an experiment. Similarly to a DataSource, every logger has to provide basic functionality:
-# initialize its storage (bdm::logger::init()),
-# assign a connection point to each interested object (bdm::logger::logadd()),
-# accept data to be logged to a given connection (bdm::logger::logit()),
-# finalize the storage when the experiment is finished.

These abstract operations can be specialized in many ways, for example by storing all results in memory and writing them to disc when finished (bdm::memlog), storing data in a matlab structure (bdm::mexlog), writing them out in ascii (bdm::stdlog), or using a more sophisticated buffered output to a hard drive (bdm::dirfilelog).

Since all experiments are performed in matlab, the default mexlog class will be used. However, the way the results are stored can be configured by a configuration structure with the fields documented in \c from_setting of the chosen logger, passed as the third argument to \c simulator.

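The four logger operations listed in this section can be pictured with a plain matlab structure. This is only a conceptual sketch: it does not call or reproduce the bdm::logger interface, and the names \c store and \c ndat as well as the logged values are hypothetical.
\code
ndat = 5;
store = struct();                 % 1. initialize the storage
store.y = nan(ndat, 1);           % 2. reserve a connection point named 'y'
for t = 1:ndat
    store.y(t) = 0.1 * t;         % 3. log the value produced at time step t
end
M = store;                        % 4. finalize: hand over the filled structure
\endcode
The field name \c y plays the role that the names given in \c drv play for the real logger.
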
\section ug_datasource Class inheritance and DataSources

As mentioned above, the scenario \c simulator is written to accept any datasource (i.e. any offspring of bdm::DS). For the full list of offsprings, see Classes > Class Hierarchy.

At the time of writing this tutorial, the available datasources are
bdm::DS
- bdm::EpdfDS
- bdm::MemDS
- bdm::FileDS
- bdm::CsvFileDS
- bdm::ITppFileDS
- bdm::MpdfDS
- bdm::stateDS

The MemDS has already been introduced in the example in \ref ug_memds.
However, any of the classes listed above can be used to replace it in the example.
This will be demonstrated on the \c EpdfDS class.

The brief description of the class states that EpdfDS "Simulate data from a static pdf (epdf)". A static pdf means an unconditional pdf in the sense that the random variable is conditioned on numerical values only. In mathematical notation it could be both \f$ f(a) \f$ and \f$ f(x_t \|d_1 \ldots d_t)\f$. The latter case applies only when all \f$ d \f$ denote observed values.

For example, we wish to simulate realizations of a uniform pdf on the interval <-1,1>.
This is achieved by plugging an object representing the uniform pdf into the general simulator of independent random samples, EpdfDS. The uniform density is implemented as class bdm::euni.
An instance of \c euni can again be created by the method \c from_setting, in this case bdm::euni::from_setting(). Using its documentation, we define it with the following code:
\code
U.class='euni';
U.rv = RV({'a'});
U.high = 1.0;
U.low = -1.0;
\endcode
which encodes the information:\f[
f(a) = \mathcal{U}(-1,1)
\f]

The datasource itself, i.e. the instance of \c EpdfDS, can then be configured via:
\code
DS.class = 'EpdfDS';
DS.epdf = U;
\endcode
where \c U is the structure defined above.

Contrary to the previous example, we need to tell the algorithm \c simulator how many samples from the data source we need. This is configured by the variable \c experiment.ndat. The configuration is finalized by:
\code
experiment.ndat = 10;
M=simulator(DS,experiment);
\endcode

The result is, as expected, in the field \c M.a, the name of which corresponds to the name of \c U.rv.

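For comparison, independent samples from the same uniform density can be drawn directly in plain matlab. This sketch does not use BDM, so the generated numbers will differ from those returned by \c simulator; the variable names mirror the configuration fields only for readability.
\code
low  = -1.0;                              % corresponds to U.low
high =  1.0;                              % corresponds to U.high
ndat = 10;                                % corresponds to experiment.ndat
a = low + (high - low) * rand(ndat, 1);   % ndat independent uniform samples
assert(all(a >= low & a <= high));        % every sample lies in <-1,1>
\endcode
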
If the task was only to generate random realizations, configuring a datasource would indeed be a very clumsy way of doing it. However, the power of the proposed approach will be revealed in more demanding examples, one of which follows next.

\section ug_arx_sim Simulating autoregressive model

Consider the following autoregressive model:
\f[
f(y_t\|y_{t-3},u_{t-1}) = \mathcal{N}( a y_{t-3} + b u_{t-1}, r)
\f]
where \f$ a,b \f$ are known constants, and \f$ r \f$ is a known variance.

Direct application of \c EpdfDS is not possible, since the pdf above is conditioned on the values of \f$ y_{t-3}\f$ and \f$ u_{t-1}\f$.
We need to handle two issues:
-# the extra unsimulated variable \f$ u \f$,
-# the time delays of the values.

The first issue can be handled in two ways. First, \f$ u \f$ can be considered as an input and as such it could be given to the datasource externally. This solution is used in the scenario \c closedloop.
However, for the \c simulator scenario we will apply the second option, that is, we complement \f$ f(y_{t}\|y_{t-3},u_{t-1})\f$ by an extra pdf:\f[
f(u_t) = \mathcal{N}(0, r_u)
\f]
where \f$ r_u \f$ is another known constant.
Thus, the joint density is now:\f[
f(y_{t},u_{t}\|y_{t-3},u_{t-1}) = f(y_{t}\|y_{t-3},u_{t-1})f(u_{t})
\f]
and we have no need for an input, since the datasource has all the necessary information inside. All that is required is to store the simulated values and copy them to the appropriate places.

This is done automatically by the dedicated class bdm::datalink_buffered. The only issue a user may need to take care of is the missing initial conditions for the simulation.
By default these are set to zeros. Using the default values, the full configuration of this system is:
\code
% random variables y_t and u_t (scalar, no time shift)
y = RV({'y'});
u = RV({'u'});

% f(y_t | y_{t-3}, u_{t-1}): mean 0.5*y_{t-3} - 0.9*u_{t-1}, variance 0.1
fy.class = 'mlnorm<ldmat>';
fy.rv = y;
fy.rvc = RV({'y','u'}, [1 1], [-3, -1]);
fy.A = [0.5, -0.9];
fy.const = 0;
fy.R = 0.1;

% f(u_t): zero mean, variance 0.2
fu.class = 'enorm<ldmat>';
fu.rv = u;
fu.mu = 0;
fu.R = 0.2;

% datasource built from the product of both densities
DS.class = 'MpdfDS';
DS.mpdf.class = 'mprod';
DS.mpdf.mpdfs = {fy, epdf2mpdf(fu)};
\endcode

The explanation of this example requires a few remarks:
- the class of the \c fy object is 'mlnorm<ldmat>', which is a Normal pdf with mean value given by a linear function and covariance matrix stored in LD decomposition, see bdm::mlnorm for details.
- the naming convention 'mlnorm<ldmat>' relates to the concept of templates in C++. For those unfamiliar with this concept, it is basically a way to share code between different flavours of the same object. Note that mlnorm exists in three versions: mlnorm<ldmat>, mlnorm<chmat>, mlnorm<fsqmat>. These classes act identically; the only difference is that the internal data are stored in LD decomposition, Cholesky decomposition or full matrices, respectively.
- the same concept is used for enorm, where enorm<chmat> and enorm<fsqmat> are also possible. In this particular use, these objects are equivalent. In specific situations, e.g. a Kalman filter implemented on Cholesky decomposition (bdm::KalmanCh), only enorm<chmat> is appropriate.
- class 'mprod' represents the chain rule of probability. Attribute \c mpdfs of its configuration structure is a list of conditional densities. A conditional density \f$ f(a\|b)\f$ is represented by class \c mpdf and its offsprings. Class \c RV is used to describe both the variables before the conditioning sign (field \c rv) and after it (field \c rvc).
- for simplicity of implementation, mprod accepts only conditional densities in the field \c mpdfs. Hence, the pdf \f$ f(u_t)\f$ must be converted to a conditional density with empty conditioning, \f$ f(u_t\| \{\})\f$. This is achieved by calling the function epdf2mpdf, which is only a trivial wrapper creating class bdm::mepdf.


The code above can be run immediately, using the same execution sequence of \c simulator as above.

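To see what the configured datasource is expected to generate, the same model can be simulated by hand in plain matlab. This is a sketch without BDM, assuming that \c fy.R and \c fu.R are variances and that the entries of \c fy.A follow the order of \c fy.rvc; the variables \c ndat and \c lag are hypothetical choices.
\code
a = 0.5;  b = -0.9;      % entries of fy.A
r = 0.1;  ru = 0.2;      % variances fy.R and fu.R
ndat = 10;               % number of simulated time steps
lag  = 3;                % longest delay in the model, y_{t-3}

% The first 'lag' entries hold the zero initial conditions.
y = zeros(ndat + lag, 1);
u = zeros(ndat + lag, 1);
for k = lag+1 : lag+ndat
    u(k) = sqrt(ru) * randn;                            % u_t ~ N(0, r_u)
    y(k) = a * y(k-3) + b * u(k-1) + sqrt(r) * randn;   % y_t ~ N(a y_{t-3} + b u_{t-1}, r)
end
y = y(lag+1:end);        % drop the initial conditions
u = u(lag+1:end);
\endcode
The explicit history handling in the loop is the part that the datasource takes over from the user.
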
\subsection ug_ini Initializing simulation

When zeros are not appropriate initial conditions, the correct conditions can be set using additional commands (see bdm::MpdfDS::from_setting()):
\code
DS.init_rv = RV({'y','y','y'}, [1,1,1], [-1,-2,-3]);
DS.init_values = [0.1, 0.2, 0.3];
\endcode

The values of \c init_values are copied to the places in history identified by the corresponding entries of \c init_rv.
The initial data are not checked for completeness, i.e. values of random variables missing from \c init_rv (in this case all occurrences of \f$ u \f$) are still initialized to 0.


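The effect of these initial conditions on the first simulated value can be checked by hand. The following plain matlab sketch assumes the model of \ref ug_arx_sim with \f$ a = 0.5 \f$; the history vector \c hist_y is a hypothetical stand-in for the internal buffer and not a BDM structure.
\code
a = 0.5;                          % coefficient of y_{t-3}
init_times  = [-1, -2, -3];       % time shifts listed in DS.init_rv
init_values = [0.1, 0.2, 0.3];    % DS.init_values

hist_y = zeros(1, 3);             % hist_y(k) stands for y_{t-k}; zeros by default
hist_y(-init_times) = init_values;

% u is missing from init_rv, so u_{t-1} stays 0 and contributes nothing.
first_mean = a * hist_y(3);       % mean of the first simulated y: 0.5 * 0.3
\endcode
With zero initial conditions the same mean would be 0.
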
\section ug_store Storing results of simulation

If the simulated data are to be analyzed off-line, it may be advantageous to store them for later use.
This operation is straightforward if the class of the logger used in the \c simulator is compatible with some datasource class.

For example, the output of \c MemDS can be stored as an .it file (the filename is specified in the configuration structure), which can later be read by bdm::ITppFileDS.

In matlab, the output of mexlog is a structure of vectors or matrices. The results can be saved in a matlab file using:
\code
Data=[M.y; M.u];
drv = RVjoin({y,u});
save mpdfds_results Data drv
\endcode
Such data can later be provided e.g. by MemDS:
\code
mxDS.class = 'MemDS';
mxDS.Data = 'Data';
mxDS.drv = drv;
\endcode

*/
