Data fusion, communication complexity, and decentralized detection


Decentralized detection

This is similar to the classical detection and hypothesis testing problem, except that each of several sensors receives a different set of measurements and each sensor is allowed to communicate only a few bits. These bits have to be chosen so that they are "most informative" summaries of each sensor's measurements, in a manner that optimizes some overall system objective (usually, the probability of error at the fusion center).
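As a concrete illustration, here is a small Monte Carlo sketch of the setup. All modeling details (Gaussian measurements, equal priors, one-bit threshold quantizers at the sensors, an AND rule at the fusion center) are assumptions made for the example, not part of the formulation above; the point is that the sensors' bits (here, thresholds) are chosen to minimize the fusion center's error probability.

```python
import numpy as np

rng = np.random.default_rng(0)
N = 200_000                              # Monte Carlo samples per hypothesis

# Assumed measurement model: x_i ~ N(0,1) under H0, x_i ~ N(1,1) under H1,
# independent across the two sensors conditioned on the hypothesis.
x0 = rng.normal(0.0, 1.0, size=(N, 2))   # samples under H0
x1 = rng.normal(1.0, 1.0, size=(N, 2))   # samples under H1

def error_prob(t1, t2):
    """Error probability when sensor i sends the bit 1{x_i > t_i} and the
    fusion center declares H1 iff both bits are 1 (AND rule), equal priors."""
    false_alarm = ((x0[:, 0] > t1) & (x0[:, 1] > t2)).mean()
    detection = ((x1[:, 0] > t1) & (x1[:, 1] > t2)).mean()
    return 0.5 * false_alarm + 0.5 * (1.0 - detection)

# Brute-force search for the "most informative" one-bit summaries,
# i.e., the threshold pair minimizing the system-wide error probability.
grid = np.linspace(-1.0, 2.0, 31)
best = min((error_prob(a, b), a, b) for a in grid for b in grid)
print(f"error probability ~ {best[0]:.4f} at thresholds ({best[1]:.2f}, {best[2]:.2f})")
```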

The subject is also closely related to recent "social learning" models that have been attracting attention in the social sciences. In the latter models, sensors/agents make decisions that are best in a local (myopic/selfish) sense. In contrast, the papers below correspond to an "engineering" perspective whereby each sensor makes a decision that will be most beneficial for the overall system objective.

Work along the engineering and the social science strands does have some interesting points of contact. For example, any negative (or impossibility) result for the engineering version of a problem readily implies the same negative result for a social learning formulation involving selfish/myopic agents. Furthermore, one strand can borrow mathematical techniques from the other.

A 1993 survey with a unified treatment of the subject (from the engineering perspective) can be found in:

If the measurements at the different sensors are conditionally independent (given any hypothesis), there is a lot that can be said:

But without conditional independence, the problem becomes NP-complete:


Data fusion using myopic Bayesian updates

This is an alternative model whereby each sensor keeps revising its estimate of an unknown quantity: at each stage, it forms the optimal estimate given the information currently available to it and communicates that estimate to its neighbors.

J.N. Tsitsiklis and M. Athans, "Convergence and Asymptotic Agreement in Distributed Decision Problems", IEEE Transactions on Automatic Control, Vol. 29, No. 1, 1984, pp. 42-50.
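For a minimal illustration of this update-and-communicate scheme, consider the following jointly Gaussian sketch (the two-agent setting, the prior, and the noise model are assumptions made for the example). Each agent announces its conditional-mean estimate; because that estimate is an invertible function of the agent's own observation, one exchange is enough for both agents to reach agreement, consistent with the asymptotic agreement results of the paper above.

```python
import numpy as np

rng = np.random.default_rng(1)
s2 = 0.5                                  # observation noise variance (assumed)

# Assumed model: theta ~ N(0,1); agent i observes x_i = theta + w_i,
# with w_1, w_2 ~ N(0, s2) independent of each other and of theta.
theta = rng.normal()
x1, x2 = theta + rng.normal(scale=np.sqrt(s2), size=2)

# Stage 0: each agent forms the optimal estimate given only its own data.
e1 = x1 / (1.0 + s2)                      # E[theta | x1]
e2 = x2 / (1.0 + s2)                      # E[theta | x2]

# Stage 1: the estimates are exchanged.  Since e_j is an invertible linear
# function of x_j, each agent recovers the other's observation and forms the
# estimate given both measurements, after which the two agents agree.
x1_rec, x2_rec = e1 * (1.0 + s2), e2 * (1.0 + s2)
e1_next = (x1 + x2_rec) / (2.0 + s2)      # E[theta | x1, x2]
e2_next = (x1_rec + x2) / (2.0 + s2)

print("stage 0 estimates:", e1, e2)
print("stage 1 estimates (agreement):", e1_next, e2_next)
```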


Communication complexity

Agent 1 knows x, agent 2 knows y. They want to compute a function f(x,y). How much do they need to communicate? This formulation captures much of the essence of the data fusion problem:
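To make the model concrete, here is a toy sketch of the trivial protocol (the specific function, the input size, and the bit accounting are assumptions made for the example): agent 1 sends its entire input, after which agent 2 can announce f(x,y). Communication complexity asks how far below this trivial bit count a clever protocol can go for a given f.

```python
def trivial_protocol(x: int, y: int, n_bits: int, f):
    """Agent 1 sends x in full; agent 2 then announces the one-bit answer."""
    transcript = []
    transcript.append(format(x, f"0{n_bits}b"))   # agent 1 -> agent 2: n_bits bits
    transcript.append(str(int(f(x, y))))          # agent 2 -> agent 1: 1 bit
    bits_exchanged = sum(len(message) for message in transcript)
    return f(x, y), bits_exchanged

# Example (assumed): the equality function on 16-bit inputs.
answer, cost = trivial_protocol(0xBEEF, 0xBEEF, n_bits=16, f=lambda x, y: x == y)
print(answer, cost)    # True, 17 bits: 16 for x plus 1 for the answer
```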

Unfortunately, the problem is NP-complete, in general:

But for specific types of functions f, progress is possible. When x and y are continuous variables, calculating the communication complexity may involve tools from algebraic or differential geometry:

In another variant, agent 1 knows a convex function g(.), agent 2 knows a convex function h(.), and they want to optimize g(.)+h(.) to within some accuracy epsilon:
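As a baseline for this variant, here is a naive bisection sketch (the interval [0,1], differentiability of g and h, the particular functions, and the crude "64 bits per transmitted real" accounting are all assumptions made for the example). At each midpoint, agent 1 transmits its derivative value and agent 2 replies with one bit indicating which half-interval to keep; this locates the minimizer of g+h to within epsilon, and the communication complexity question is how few bits can suffice for a given accuracy.

```python
def distributed_bisect(g_prime, h_prime, eps):
    """Locate the minimizer of g + h on [0, 1] to within eps by bisection
    on the sign of (g + h)', counting the bits exchanged along the way."""
    lo, hi, bits = 0.0, 1.0, 0
    while hi - lo > eps:
        m = 0.5 * (lo + hi)
        gp = g_prime(m)                   # agent 1 -> agent 2: one real number
        bits += 64                        # crude accounting: one double-precision value
        keep_left = gp + h_prime(m) > 0   # agent 2 checks the sign of (g+h)'(m)
        bits += 1                         # agent 2 -> agent 1: one bit
        if keep_left:
            hi = m
        else:
            lo = m
    return 0.5 * (lo + hi), bits

# Example (assumed): g(z) = (z - 0.2)^2, h(z) = (z - 0.8)^2, minimizer of g+h at z = 0.5.
z, bits = distributed_bisect(lambda z: 2 * (z - 0.2), lambda z: 2 * (z - 0.8), eps=1e-4)
print(f"approximate minimizer {z:.5f} found with {bits} bits exchanged")
```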