Changeset 617 for library/doc/tutorial

Timestamp:
09/15/09 23:47:47 (15 years ago)
Author:
smidl
Message:

simulator + doc

Files:
1 modified

  • library/doc/tutorial/01userguide.dox

    r616 r617  
    11/*! 
    2 \page user_guide Howto Use BDM - Introduction 
    3 \addindex Howto Use BDM - Introduction 
    4  
    5 BDM is a library of basic components for Bayesian decision making, hence its direct use is not possible. In order to use BDM the components must be pulled together in order to achieve desired functionality. We expect two kinds of users: 
    6  
    7  - <b> Basic users </b> who run prepared scripts with different parameterizations and analyze their results, 
    8  - <b> Advanced users </b> who are able to understand the logic of BDM and extend its functionality to new applications. 
    9  
    10 The primary design aim of BDM was to ease development of complex algorithms, hence the target user is the advanced one.  
    11 However, running experiments is the first task to learn for both types of users. 
    12  
    13 \section param Experiment is fully parameterized before execution 
    14  
    15 Experiments in BDM can be performed using either standalone applications or function bindings in high-level environment. A typical example of the latter being mex file in Matlab environment. 
    16  
    17 The main logic behind the experiment is that all necessary information about it are gathered in advance in a configuration file (for standalone applications) or in configuration structure (Matlab). 
    18 This approach was designed especially for time consuming  experiments and Monte-Carlo studies for which it suits the most.  
    19  
    20 For smaller decision making tasks, interactive use of the experiment can be achieved by showing the full configuration structure (or its selected parts), running the experiment on demand and showing the results. 
    21  
    22 Semi-interactive experiments can be designed by sequential run of different algorithms. This topic will be covered in advanced documentation. 
     2\page user_guide Howto Use BDM - System, Data, Simulation 
     3 
     4This section serves as an introduction to the scenario of data simulation. Since it is the simplest of all scenarios defined in \ref user_guide0, it also serves as an introduction to configuration of an experiment (see \ref ui) and to the basic decision making objects (bdm::RV and bdm::DS). 
     5 
     6All experiments are demonstrated on the scenario simulator, which can be either a standalone application or a mex file (simulator.mex**). 
     7 
    238 
    249\section config Configuration of an experiment 
    2510 
    26 Configuration file (or config structure) is organized as a tree of information. High levels represent bigger structures, leafs of the structures are basic data elements such as strings, numbers or vectors. 
     11Configuration file (or config structure) is organized as a tree of information. High levels represent complex structures; leaves of the tree are basic data elements such as strings, numbers or vectors. 
    2712 
    2813Specific treatment was developed for objects. Since BDM is designed as an object-oriented library, the configuration was designed to honor the rule of inheritance. That is, an offspring of a class can be used in place of its predecessor. Hence, objects (instances of classes) are configured by a structure with the compulsory field \c class. This is a string variable corresponding to the name of the class to be used. 
    2914 
    30 Consider the following example: 
    31 \code 
    32 DS = {class="MemDS"; 
    33    data = [1, 2, 3, 4, 5, 6, 7]; 
    34 } 
    35 \endcode 
    36 or written equivalently in Matlab as 
     15The configuration has two possible options: 
     16 - configuration file using syntax of libconfig (see \ref ui), 
     17 - matlab structure. 
     18For the purpose of tutorial, we will use the matlab notation.  
     19These two formats can be converted into each other using the prepared mex files config2mxstruct.mex and mxstruct2config.mex. Naturally, these mex files require matlab to run. If it is not available, manual conversion is straightforward; the major difference is in the types of brackets used (see \ref ui). 
     20 
     21\subsection first First experiment 
     22 
     23The first experiment that can be performed is: 
    3724\code 
    3825DS.class='MemDS'; 
    3926DS.Data =[1 2 3 4 5 6]; 
    4027\endcode 
    41  
    42 The code above is the minimum necessary information to run a pre-made algorithm implemented as executable \c estimator or Matlab mex file \c estimator. The expected result for Matlab is: 
    43 \code 
    44 >> M=estimator(DS,{}) 
     28which can be found in file bdmtoolbox/tutorials/userguide/memds_example.m. 
     29 
     30The code above is the minimum necessary information to run scenario \c simulator in matlab.  
     31To actually do so, make sure that matlab can find the simulator.mex file, e.g. by running: 
     32\code  
     33>> addpath _path_to_/bdmtoolbox/mex/ 
     34\endcode 
     35 
     36The expected result for Matlab is: 
     37\code 
     38>> M=simulator(DS) 
    4539 
    4640M =  
     
    4943\endcode 
    5044 
    51 The structure \c M has one field called \c ch0 to which the data from \c DS.Data were copied. This was configured to be the default behavior which can be easily changed by adding more information to the configuration structure. 
    52  
    53 First, we will have a look at all options of MemDS. 
    54  
    55 \section memds How to understand configuration of classes 
    56  
    57 As a first step, the estimator algorithm has created an object of class MemDS and called its method  bdm::MemDS::from_setting(). 
    58 This is a universal method called when creating an instance of class from configuration. Object that does not implement this method can not be created automatically from configuration. 
    59  
    60 The documentation contains the full structure which can be loaded. e.g.: 
    61 \code 
    62 { class = 'MemDS'; 
    63         Data = (...);            // Data matrix or data vector 
     45If you see this result, you have configured BDM correctly and you have successfully run your first experiment. Otherwise, please check your installation, see \ref installation.  
     46All the simulator actually did was copy \c DS.Data to \c M.ch0. An explanation of the experiment and the logic used there follows. 
     47 
     48\section sim Systems and DataSources 
     49 
     50In standard system theory, the system is typically illustrated graphically as: 
     51\dot  
     52digraph sys{ 
     53        node [shape=box]; 
     54        {"System"} 
     55        node [shape=plaintext] 
     56        {rank="same"; "u"; "System"; "y"} 
     57        "u" -> "System" -> "y" [nodesep=2]; 
     58} 
     59\enddot 
     60Where \c u typically denotes input and \c y denotes output of the system. A causal dependence between input and output is typically presumed. 
     61 
     62We are predominantly concerned with discrete-time systems, hence we will add indices \f$ _t \f$ to both input and output, \f$ u_t \f$ and \f$ y_t \f$. We presume that the causal dependence is such that \f$ u_t \f$ comes before \f$ y_t \f$. 
     63 
     64One of the definitions of a system is that a system is a "set of variables observed on a part of the world". Under this definition, a system is understood as a generator of data. This definition may be considered too simplistic, but it serves well as a description of what the software object \c DataSource is. 
     65 
     66DataSource is an object that is essentially: 
     67 -# able to return data observed at time \f$ t \f$ (bdm::DS::getdata()), 
     68 -# able to perform one time step (bdm::DS::step()), 
     69 -# able to describe what these data are (bdm::DS::_drv()). 
     70 
     71No further specification, e.g. whether the data are pre-recorded or computed on-the-fly, is given. 
     72Specific behaviour of various DataSources is implemented as specialization of the root class bdm::DS. 
     73 
     74 
     75\section memds DataSource of pre-recorded data -- MemDS 
     76 
     77The first experiment run in \ref first was actually an instance of DataSource of pre-recorded data that were stored in memory, i.e. the bdm::MemDS class. 
     78 
     79Operation of such object is trivial, the data are stored as a matrix and the general operations defined above are specialized as follows: 
     80 -# data observed at time \f$ t \f$  are columns of the matrix, getdata() returns the current column, 
     81 -# time step itself is performed by increasing the column index, 
     82 -# each row is named as "ch0","ch1",... 
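
The three specializations listed above can be illustrated with a minimal Python sketch. It is an analogue written for this tutorial only: the class name MemDSSketch is hypothetical and is not part of BDM; only the method names getdata and step and the default row names are taken from the text, and drv here stands in for bdm::DS::_drv().

```python
class MemDSSketch:
    """Hypothetical Python analogue of a memory data source (not the BDM API)."""

    def __init__(self, data, time=0):
        # Accept a vector (one channel) or a list of rows (one row per channel).
        rows = data if isinstance(data[0], (list, tuple)) else [data]
        self.rows = [list(row) for row in rows]
        self.time = time  # index of the current column

    def getdata(self):
        # Data observed at the current time are the current column of the matrix.
        return [row[self.time] for row in self.rows]

    def step(self):
        # A time step only increases the column index.
        self.time += 1

    def drv(self):
        # Default names of the rows: "ch0", "ch1", ...
        return ["ch%d" % i for i in range(len(self.rows))]


ds = MemDSSketch([1, 2, 3, 4, 5, 6])
first = ds.getdata()   # current column at time 0
ds.step()
second = ds.getdata()  # current column at time 1
names = ds.drv()
```

With the vector from the first experiment, the sketch holds a single row, so every call to getdata returns a one-element column.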
     83 
     84This is the default behavior. It can be customized using the UI mechanism. 
     85When the object of class MemDS is created it calls method  bdm::MemDS::from_setting() and the input structure is parsed for settings. All available settings are documented in the method, see bdm::MemDS::from_setting(). The options are: 
     86\code 
     87DS.class = 'MemDS'; 
     88DS.Data = (...);            // Data matrix or data vector 
    6489        --- optional --- 
    65         drv = {class='RV'; ...} // Identification how rows of the matrix Data will be known to others 
    66         time = 0;               // Index of the first column to user_info, 
    67         rowid = [1,2,3...];     // ids of rows to be used 
    68 } 
    69 \endcode 
    70 for MemDS. The compulsory fields are listed at the beginning; the optional fields are separated by string "--- optional ---". 
    71  
    72 For the example given above, the missing fields were filled as follows: 
    73 \code 
    74   drv  = {class="RV"; names="{ch0 }"; sizes=[1];}; 
    75   time = 0; 
    76   rowid = [1]; 
    77 \endcode 
    78 Meaning that the data will be read from the first column (time=0), all rows of data are to be read (rowid=[1]), and this row will be called "ch0".  
    79  
    80 \note <b>Mixtools reference</b> This object replaces global variables DATA and TIME. In BDM, data can be read and written to a range of \c datasources, objects derived from bdm::DS. 
     90DS.drv = RV({"ch0",...} ); // Identification how rows of the matrix Data will be known to others 
     91DS.time = 0;               // Index of the first column to user_info, 
     92DS.rowid = [1,2,3...];     // ids of rows to be used 
     93\endcode 
     94The compulsory fields are listed at the beginning; the optional fields are separated by string "--- optional ---". 
     95 
     96Fields \c time and \c rowid are self-explanatory. Field \c drv is the one that specifies identification of the data elements (point 3 of the general requirements of a DataSource).  
     97 
     98All optional fields will be filled with default values, in this case: 
     99\code 
     100DS.drv  = RV({'ch0'},1,0); 
     101DS.time = 0; 
     102DS.rowid = [1]; 
     103\endcode 
     104Where the first line specifies a universal identification structure: random variable (bdm::RV). 
    81105 
    82106\section rvs What is RV and how to use it 
     
    84108RV stands for \c random \c variable, which is a description of a random variable or its realization. This object plays the role of an identifier of elements of vectors of data (in datasources), expected inputs to functions (in pdfs), or required results (operations conditioning).  
    85109 
    86 \note <b>Mixtools reference </b> RV is generalization of "structures" \c str in Mixtools. It replaces channel numbers by string names, and adds extra field size for each record. 
    87  
    88 Mathematical interpretation of RV is straightforward. Consider pdf \f$ f(a)\f$, then \f$ a \f$ is the part represented by RV. Explicit naming of random variables may seem unnecessary for many operations with pdf, e.g. for generation of a uniform sample from <0,1> it is not necessary to specify any random variable. For this reason, RV are often optional information to specify. However, the considered algorithm \c estimator is build in a way that requires RV to be given. 
    89  
    90 The \c estimator use-case expects to join the data source with an array of estimators, each of which declaring its input vector of data. The connection will be made automatically using the mechanism of datalinks (bdm::datalink). 
    91 Readers familiar with Simulink environment may look at the RV as being unique identifiers of inputs and outputs of simulation blocks. The inputs are connected automatically with the outputs with matching RV. This view is however, very incomplete, RV are much more powerful than this.  
     110Mathematical interpretation of RV is straightforward. Consider pdf \f$ f(a)\f$, then \f$ a \f$ is the part represented by RV. Explicit naming of random variables may seem unnecessary for many operations with pdf, e.g. for generation of a uniform sample from <0,1> it is not necessary to specify any random variable. For this reason, RVs are often optional information to specify. However, the considered scenario \c simulator is built in a way that requires RV to be given. 
     111 
     112The \c simulator scenario connects the DataSource to the second basic class of BDM, bdm::logger. The logger is a class that takes care of storing results, in this case results of simulation. 
     113The connection between these blocks is done automatically. The logger stores results of simulations under the names specified in drv.   
     114Readers familiar with the Simulink environment may look at the RV as being unique identifiers of inputs and outputs of simulation blocks. The inputs are connected automatically with the outputs with matching RV. This view is, however, very incomplete; RVs have more roles than this. 
     115 
     116\section loggers Loggers for flexible handling of results 
     117Loggers are universal objects for storing and manipulating the results of an experiment. Similar to DataSource, every logger has to provide basic functionality: 
     118 -# initialize its storage (bdm::logger.init()), 
     119 -# assign a connection point to each interested object (bdm::logger.logadd()), 
     120 -# accept data to be logged to given connection (bdm::logger.logit()), 
     121 -# finalize the storage when experiment is finished. 
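
The four operations above can be illustrated with a minimal in-memory logger written in Python. This is a sketch under assumptions, not the BDM implementation: the class MemLogSketch is hypothetical, the method name finalize is invented here because the text does not name it, and the use of integer connection ids is also an assumption.

```python
class MemLogSketch:
    """Hypothetical in-memory logger; init, logadd and logit mirror the list above."""

    def init(self):
        # Initialize the storage: one list of records per connection point.
        self.names = []
        self.storage = {}

    def logadd(self, name):
        # Assign a connection point (here an integer id) to an interested object.
        self.names.append(name)
        self.storage[name] = []
        return len(self.names) - 1

    def logit(self, conn_id, value):
        # Accept data to be logged to the given connection.
        self.storage[self.names[conn_id]].append(value)

    def finalize(self):
        # Finalize the storage; this sketch simply returns the collected records.
        return dict(self.storage)


log = MemLogSketch()
log.init()
ch0 = log.logadd("ch0")
for value in [1, 2, 3, 4, 5, 6]:
    log.logit(ch0, value)
result = log.finalize()
```

The returned dictionary plays the role of the structure \c M from the first experiment: one field per logged name.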
     122 
     123These abstract operations can be specialized in many ways. For example, storing all results in memory and writing them to disc when finished (bdm::memlog), storing data in a matlab structure (bdm::mexlog), writing them out in ascii (bdm::stdlog) or more sophisticated buffered output to hard drive (bdm::dirfilelog). 
     124 
     125Since all experiments are performed in matlab, the default mexlog class will be used. However, the way the results are stored can be configured using a configuration structure filled with fields from \c from_setting of the chosen logger, and passing it as the third argument to \c simulator. 
    92126 
    93127\section datasource Class inheritance and DataSources 
    94128 
    95 As mentioned above, the algorithm \c estimator is written to accept any datasource (i.e. any offspring of bdm::DS). For full list of offsprings, click Classes > Class Hierarchy. 
     129As mentioned above, the scenario \c simulator is written to accept any datasource (i.e. any offspring of bdm::DS). For the full list of offspring, see Classes > Class Hierarchy. 
    96130 
    97131At the time of writing this tutorial, available datasources are 
     
    111145The brief description of the class states that EpdfDS "Simulate data from a static pdf (epdf)". The static pdf means unconditional pdf in the sense that the random variable is conditioned by numerical values only. In mathematical notation it could be both \f$ f(a) \f$ and \f$ f(x_t |d_1 \ldots d_t)\f$. The latter case is true only when all \f$ d \f$ denote observed values. 
    112146 
    113 For example, we wish to simulate realizations of a Uniform density on interval <-1,1>. Uniform density is represented by class bdm::euni. 
    114 From bdm::euni.from_setting() we can find that the code is: 
    115 \code  
    116 U={class="euni"; high=1.0; low = -1.0;} 
    117 \endcode 
    118 for configuration file, and  
     147For example, we wish to simulate realizations of a uniform pdf on the interval <-1,1>.  
     148This is achieved by plugging an object representing the uniform pdf into a general simulator of independent random samples, EpdfDS. The uniform density is implemented as class bdm::euni. 
     149An instance of \c euni can again be created by the method \c from_setting, in this case bdm::euni.from_setting(). Using the documentation, we define it with the following code: 
    119150\code 
    120151U.class='euni'; 
     152U.rv   = RV({'a'}); 
    121153U.high = 1.0; 
    122154U.low  = -1.0; 
    123 U.rv.class = 'RV'; 
    124 U.rv.names = {'a'}; 
    125 \endcode 
    126 for Matlab. 
     155\endcode 
     156which encodes information:\f[ 
     157f(a) = \mathcal{U}(-1,1) 
     158\f] 
    127159   
    128 The datasource itself, can be then configured via 
    129 \code 
    130 DS = {class='EpdfDS'; epdf=@U;}; 
    131 \endcode 
    132 in config file, or  
     160The datasource itself, i.e. the instance of \c EpdfDS, can then be configured via: 
    133161\code 
    134162DS.class = 'EpdfDS'; 
    135163DS.epdf  = U; 
    136164\endcode 
    137 in Matlab. 
    138  
    139 Contrary to the previous example, we need to tell to algorithm \c estimator how many samples from the data source we need. This is configured by variable \c experiment.ndat. The configuration has to be finalized by: 
     165where \c U is the structure defined above. 
     166 
     167Contrary to the previous example, we need to tell the scenario \c simulator how many samples from the data source we need. This is configured by variable \c experiment.ndat. The configuration has to be finalized by: 
    140168\code  
    141169experiment.ndat = 10; 
    142 M=estimator(DS,{},experiment); 
     170M=simulator(DS,experiment); 
    143171\endcode 
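
What this configuration asks for can be mimicked in plain Python: draw \c ndat independent samples from the uniform pdf on <-1,1>. The function simulate_uniform and its seed parameter are hypothetical helpers used only for illustration; they are not part of BDM.

```python
import random


def simulate_uniform(low, high, ndat, seed=0):
    """Draw ndat independent samples from a uniform pdf on <low, high>."""
    rng = random.Random(seed)  # fixed seed only to make the sketch repeatable
    return [rng.uniform(low, high) for _ in range(ndat)]


# Analogue of U.low = -1.0, U.high = 1.0 and experiment.ndat = 10.
samples = simulate_uniform(low=-1.0, high=1.0, ndat=10)
```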
    144172 
     
    151179Consider the following autoregressive model:  
    152180\f[ 
    153 y_t \sim \mathcal{N}( a y_{t-3} + b u_{t-1}, r) 
     181f(y_t|y_{t-3},u_{t-1}) = \mathcal{N}( a y_{t-3} + b u_{t-1}, r) 
    154182\f] 
    155183where \f$ a,b \f$ are known constants, and \f$ r \f$ is known variance. 
     
    160188 -# time delays of the values. 
    161189 
    162 The first issue can be handled in two ways. First, \f$ u \f$ can be considered as input and as such it could be externally given to the datasource. This solution is used in algorithm use-case \c closedloop. 
    163 However, for the \c estimator scenario we will apply the second option, that is we complement \f$ f(y_{t}|y_{t-3},u_{t-1})\f$ by extra pdf:\f[ 
    164 u_t \sim \mathcal{N}(0, r_u) 
    165 \f] 
     190The first issue can be handled in two ways. First, \f$ u \f$ can be considered as input and as such it could be externally given to the datasource. This solution is used in scenario \c closedloop. 
     191However, for the \c simulator scenario we will apply the second option, that is we complement \f$ f(y_{t}|y_{t-3},u_{t-1})\f$ by extra pdf:\f[ 
     192f(u_t) = \mathcal{N}(0, r_u) 
     193\f] 
     194where \f$ r_u \f$ is another known constant. 
    166195Thus, the joint density is now:\f[ 
    167196f(y_{t},u_{t}|y_{t-3},u_{t-1}) = f(y_{t}|y_{t-3},u_{t-1})f(u_{t}) 
     
    205234\subsection ini Initializing simulation 
    206235 
    207 When zeros are not appropriate initial conditions, the correct conditions can be set using additional commands: 
     236When zeros are not appropriate initial conditions, the correct conditions can be set using additional commands (see bdm::MpdfDS.from_setting() ): 
    208237\code 
    209238DS.init_rv = RV({'y','y','y'}, [1,1,1], [-1,-2,-3]); 
     
    214243Initial data is not checked for completeness, i.e. values of random variables missing from \c init_rv (in this case all occurrences of \f$ u \f$) are still initialized to 0. 
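
The effect of the time delays and of the initial conditions can be sketched in plain Python. The sketch simulates \f$ f(y_t|y_{t-3},u_{t-1}) \f$ together with \f$ f(u_t) \f$ and keeps a short history of past values. The function simulate_ar, its arguments and the numeric values used below are hypothetical; only the model and the rule that missing initial values default to 0 come from the text.

```python
import random


def simulate_ar(a, b, r, r_u, ndat, init_y=(0.0, 0.0, 0.0), seed=0):
    """Simulate y_t ~ N(a*y_{t-3} + b*u_{t-1}, r) with u_t ~ N(0, r_u).

    init_y holds (y_{-1}, y_{-2}, y_{-3}); u_{-1} is not given, so it starts at 0.
    """
    rng = random.Random(seed)
    # Histories are stored oldest first, so index -3 is y_{t-3} and -1 is u_{t-1}.
    y_hist = [init_y[2], init_y[1], init_y[0]]
    u_hist = [0.0]
    ys, us = [], []
    for _ in range(ndat):
        mean = a * y_hist[-3] + b * u_hist[-1]
        y = rng.gauss(mean, r ** 0.5)    # r is a variance, gauss expects a std dev
        u = rng.gauss(0.0, r_u ** 0.5)
        ys.append(y)
        us.append(u)
        y_hist.append(y)
        u_hist.append(u)
    return ys, us


# Noise-free run (r = r_u = 0) that exposes the delay structure.
ys, us = simulate_ar(a=0.5, b=0.0, r=0.0, r_u=0.0, ndat=4, init_y=(1.0, 2.0, 3.0))
```

In the noise-free run the first three outputs depend only on the initial values of \f$ y \f$, and the fourth is the first one to depend on a simulated value.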
    215244 
    216 \section conc What was demonstrated in this tutorial 
    217 The purpose of this page was to introduce software image of basic elements of decision making as implemented in BDM. 
    218  
    219  - random values as identification mechanism (bdm::RV) 
    220  - unconditional pdfs (bdm::epdf), 
    221  - conditional pdfs (bdm::mpdf), 
    222   
    223 And the use of these in simulation of data and function of datasources. In the next tutorial, Bayesian models (bdm::BM) and loggers (bdm::logger) will be introduced. 
    224  
    225  
    226  
    227245 
    228246*/