Decentralised Data Fusion: A Graphical Model Approach (Summary)


Data Fusion

Data fusion is a task in which various types of data are collected from multiple sources and combined to form meaningful information. This information is then used as the input of reasoning systems for inference. Gathering information from multiple platforms in this way is more efficient and more widely applicable than accumulating it from each source separately <ref> Reference 1 </ref>. The process of fusing data is categorized into centralized and decentralized data fusion.


Centralized vs Decentralized Data Fusion

In centralized data fusion, the data is collected and combined at a central node, and the fused information is then inferred at that node. Decentralized data fusion (DDF), on the other hand, is the task of forming, communicating, and assimilating data at remote sites in a way that guarantees efficient and reliable operation; in fact, the main problem of DDF is finding a method to understand and formulate this process. Owing to its significant advantages of modularity, scalability, and robustness, decentralized data fusion has attracted many researchers, and it is the focus of this paper.


Contribution of the Paper

This paper addresses the problem of Decentralized Data Fusion (DDF) using graphical model techniques. The DDF problem can be defined as fusing data coming from different platforms, i.e. sensors, that are attempting to estimate a common state of interest which may be only partially observed by each platform. As a solution, graphical models provide a uniform network-like model of state and observation relations, along with communication and assimilation algorithms aimed at distributed message passing and local inference. The authors apply concepts from graphical models to both the formulation and the solution of the decentralized data fusion problem. For the formulation, they use the well-known Channel Filter algorithm; for the solution, they propose two ideas: graphical models based on 1) cloned variables and 2) variable cliques. The main part of this work builds on the decentralized graphical model architecture introduced by Paskin and Guestrin <ref> Reference 2 </ref>, which is applied here to the data fusion problem.


Formulation

The formulation begins by representing the physical communication links between platforms with a connectivity graph. Furthermore, it is assumed that data fusion can be done for both static and dynamic models over a static topology. The figure below shows an example of each.

[[File:photo.png]]

where in the left depiction, [math]\displaystyle{ \ x }[/math] is the state of interest and the shaded ovals are observations; in the right depiction, the [math]\displaystyle{ \ x_k }[/math] represent the state of interest at time [math]\displaystyle{ \ t_k }[/math], together forming a simple Markov chain.
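For the dynamic model on the right, a natural joint factorization (a sketch consistent with the Markov chain shown, with [math]\displaystyle{ \ z_{ik} }[/math] denoting the observation of platform [math]\displaystyle{ \ i }[/math] at time [math]\displaystyle{ \ t_k }[/math]; the notation is ours, not the paper's) is

[math]\displaystyle{ \ p(x_{1:K}, z) = p(x_1) \prod_{k=2}^{K} p(x_k|x_{k-1}) \prod_{k=1}^{K} \prod_{i} p(z_{ik}|x_k) }[/math]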


The Idea

To introduce their idea, the authors first examine the distributed data fusion technique known as the Channel Filter <ref> Reference 3 </ref> for both static and dynamic models. The idea behind the Channel Filter method is to first infer the state [math]\displaystyle{ \ x }[/math] at each platform (with the platforms configured in a spanning tree), and then synchronize the local beliefs by exchanging messages. To do so, for each pair of communicating platforms, the quantities [math]\displaystyle{ \ \phi_{ij} }[/math] and [math]\displaystyle{ \ \phi_{ji} }[/math], called channel filters, are defined and stored; these are updated whenever a message in the form of a local posterior [math]\displaystyle{ \ \psi_i(x|\overline{z_i}) }[/math] reaches the corresponding platform. Local posteriors are calculated from the local observations [math]\displaystyle{ \ z_i = \overline{z_i} }[/math] and are updated using the channel filters. The resulting message passing algorithm for the static model is:

[math]\displaystyle{ \ \phi_{ij}^* = \psi_i, \quad \phi_{ji}^* = \psi_i, \quad \psi_j^* = \frac{\phi_{ji}^*}{\phi_{ij}} \times \psi_j }[/math]
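To make the update concrete, the following is a minimal sketch of one channel filter exchange for discrete beliefs represented as arrays; the function name and the toy numbers are illustrative assumptions, not taken from the paper.

<pre>
import numpy as np

def channel_filter_exchange(psi_i, psi_j, phi_ij, phi_ji):
    """One message from platform i to platform j (static model).

    psi_i, psi_j   : local posteriors as unnormalized discrete beliefs
    phi_ij, phi_ji : channel filters holding the information already
                     shared over the (i, j) link
    """
    phi_ij_new = psi_i.copy()                 # phi_ij* = psi_i
    phi_ji_new = psi_i.copy()                 # phi_ji* = psi_i
    # psi_j* = (phi_ji* / phi_ij) * psi_j: multiply in the incoming
    # posterior and divide out what the link already carried
    psi_j_new = phi_ji_new / phi_ij * psi_j
    return psi_j_new, phi_ij_new, phi_ji_new

# Toy example: a binary state, two platforms, uninformative channels
psi_i = np.array([0.8, 0.2])
psi_j = np.array([0.6, 0.4])
phi = np.ones(2)
psi_j, phi_ij, phi_ji = channel_filter_exchange(psi_i, psi_j, phi, phi.copy())
print(psi_j / psi_j.sum())   # fused belief at platform j
</pre>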

Finally, the probability of the state of interest at each platform is obtained by computing the local posterior [math]\displaystyle{ \ p_i(x) = \psi_i }[/math], which equals the centralized solution once all messages have been passed. The same method is used for the dynamic model, with the difference that a common filtering step is obtained by iteratively marginalizing out past states.

Graphical models with cloned variables were studied as the first attempt at formulating the problem. The idea is to create one clone of the state per platform and to define deterministic equality relationships between the clones. The paper shows that for static models, encoding the deterministic equality is problematic for continuous variables (in the Gaussian representation), giving rise to numerical difficulties and added complexity. For dynamic models, there are issues in modeling the message passing technique, and the leader election algorithm used to reduce the effect of communication failure often turns out to add latency. Double counting of model information and overconfidence due to an informative prior are further issues associated with cloned variables.

To tackle these problems, the authors use the concept of a Distributed Junction Tree, which comes with the idea of distributing the potentials over cliques of random variables of the original centralized model. A junction tree is a special form of spanning clique tree satisfying the running intersection property. After constructing the junction tree and eliminating the observed variables from the model, inference can be performed on the junction model. Two algorithms, Hugin <ref> Reference 4 </ref> and Shafer-Shenoy <ref> Reference 5 </ref>, are used for inference. Although both methods use message passing, they differ in the data structures maintained between cliques. In the Hugin algorithm, message passing is done between cliques and separators by updating the potentials, whereas in the Shafer-Shenoy algorithm the separators are not defined explicitly and messages received from neighbors are not multiplied into the local potential. It is interesting to note that, once the observations have been instantiated as evidence, Hugin and the Channel Filter method send the same messages and use the same update equations.

For the Distributed Junction Tree with a dynamic model, the approach of using an external dynamic model showed better results. In this method, each distributed network is implemented using either the Hugin or the Shafer-Shenoy algorithm, and for each time slice the collected likelihoods of that slice are added to the local network. Marginalizing out old variables to cope with limited memory can also be incorporated into this dynamic model.
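As an illustration of the Hugin-style update, the following sketch runs a collect and a distribute pass on a toy two-clique junction tree over binary variables; the clique potentials and variable names are invented for the example and are not from the paper.

<pre>
import numpy as np

# Toy junction tree: cliques C1 = {A, B} and C2 = {B, C}, separator S = {B}
phi_c1 = np.array([[0.3, 0.7],            # potential over (A, B)
                   [0.6, 0.4]])
phi_c2 = np.array([[0.5, 0.5],            # potential over (B, C)
                   [0.2, 0.8]])
phi_s = np.ones(2)                        # separator potential over B

# Collect: message from C1 to C2 through the separator
phi_s_new = phi_c1.sum(axis=0)            # marginalize A out of C1
phi_c2 = phi_c2 * (phi_s_new / phi_s)[:, None]   # absorb into C2
phi_s = phi_s_new

# Distribute: message back from C2 to C1
phi_s_new = phi_c2.sum(axis=1)            # marginalize C out of C2
phi_c1 = phi_c1 * (phi_s_new / phi_s)[None, :]   # absorb into C1
phi_s = phi_s_new

# After both passes, each clique holds an (unnormalized) marginal of the
# joint, so local queries about B agree everywhere in the tree
p_b_from_c1 = phi_c1.sum(axis=0)
p_b_from_c2 = phi_c2.sum(axis=1)
print(p_b_from_c1 / p_b_from_c1.sum(), p_b_from_c2 / p_b_from_c2.sum())
</pre>

In a Shafer-Shenoy variant of the same example, the separator potential would not be stored; instead each clique would cache its incoming messages and, when sending to a neighbor, combine its local potential with the messages from all other neighbors only.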


Evaluation

The contribution of this paper is evaluated on a series of simulated tracking problems in which, in each case, a number of platforms and a single target are distributed randomly throughout the environment. Platforms may sense the target noisily within their sensor range and contact other platforms within their communication range. Simulations are run for 100 iterations, plus 20 extra iterations for settling time. Two cases are addressed:

  • Static vs Dynamic Phenomenon
  • Shafer-Shenoy vs Hugin

For a static phenomenon of interest, the Hugin and Shafer-Shenoy algorithms produce identical estimates. It is also shown that there is an information gradient across the network, as information flows outwards from the platforms closest to the target. The case of burst communications is addressed by adding a counter that allows transmission only on every nth step. In the dynamic model, the number of platforms within sensor range changes over time, and this changes the information content of the centralized solution. The settling time in this model causes a decay in centralized precision, which is essentially due to the process noise.
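As a small sketch, the burst communication gating described above could be implemented with a step counter like the following; the parameter n is illustrative, not a value from the paper.

<pre>
def can_transmit(step, n):
    """Allow transmission only on every n-th simulation step,
    mimicking the burst communication case described above."""
    return step % n == 0

# With n = 5, messages go out on steps 0, 5, 10, 15, ...
print([step for step in range(20) if can_transmit(step, 5)])
</pre>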