4

Distributed Detection in Wireless Sensor Networks

Pramod K. Varshney, Engin Masazade, Priyadip Ray and Ruixin Niu

CONTENTS

4.1      Introduction

4.2    Distributed Detection over Ideal Communication Channels

4.2.1    Bayesian Formulation

4.2.2    Neyman–Pearson Formulation

4.2.3    Design of Fusion Rules

4.2.4    Asymptotic Regime

4.2.5    Counting Rule

4.2.6    False Discovery Rate–Based Sensor Decision Rules

4.2.6.1    Review of Multiple Comparison Problems in Statistics

4.2.6.2    Algorithm to Control FDR

4.2.6.3    Design Guidelines for Distributed Detection Systems

4.2.7    Correlated Decisions

4.3    Distributed Detection over Nonideal Communication Channels

4.3.1    Distributed Detection with Partial Channel State Information

4.3.2    Distributed Detection with No Channel State Information

4.4    Conclusions

References

4.1    INTRODUCTION

There are many practical situations in which one is faced with a decision-making problem. Based on observations regarding a certain phenomenon, a particular course of action needs to be employed from a set of possible options. Decision-making structures are found in many real-world situations that include financial institutions, air-traffic control, oil exploration, medical diagnosis, military command and control, electric power networks, weather prediction, and industrial organizations. In conventional decision-making scenarios, a sensor transmits its raw observation to a processor where optimal detection is carried out based on conventional statistical techniques. The branch of statistics dealing with these types of problems is known as statistical decision theory or hypothesis testing. In the context of radar and communication theory, it is known as detection theory [1, 2, 3, 4]. More recently, the trend is to employ multiple sensors to observe a phenomenon. For decision making, raw observations from all the sensors can be transmitted to a central processor where an optimum decision rule can be designed based on conventional detection theory. However, centralized processing based on raw observations from multiple sensors is neither efficient nor necessary. It may consume excessive energy and bandwidth in communications and may impose a heavy computation burden at the central processor.

In distributed detection [1,5,6], multiple detectors (sensors) work collaboratively to distinguish between two or more hypotheses. In a binary distributed detection problem, the objective might be to determine the absence or presence of a signal of interest; in a multiple hypothesis testing problem, it might be to classify multiple signals or targets. Local sensors can carry out preliminary processing of data and communicate only the information most relevant to the global objective to each other and/or to the central processing unit, called the fusion center. As we describe later in the chapter, the global objective might be the minimization of the detection error probability or the maximization of the probability of detection given a fixed false alarm rate constraint. Deploying multiple sensors for signal detection improves system survivability and yields either improved detection performance or a shorter decision time to attain a prespecified performance level. From the signal processing perspective, two inherently different problems need to be considered for the distributed detection system: the design of the decision rule at the fusion center (often referred to as the fusion rule), which strives for an optimal system performance using compressed input from distributed sensors, and the design of local sensor signal processing algorithms. These two problems are intertwined and need to be solved jointly to optimize a prescribed performance criterion.

Recently, wireless sensor networks (WSNs) have gained much attention and interest and have become a very active research area. Due to their flexibility, enhanced surveillance coverage, robustness, mobility, and cost effectiveness, WSNs have found wide applications in areas such as military surveillance and environmental monitoring. Usually, a WSN consists of a large number of low-cost and low-power sensors, which are deployed in the environment to collect observations from an event of interest. Each sensor preprocesses and extracts information from the raw observations and has the ability to communicate with other sensor nodes or the fusion center via wireless channels. The fusion center processes all the sensor data and arrives at a global inference. The detection ability of a WSN is crucial for various applications. As an example, in a surveillance scenario, the presence or absence of a target is usually determined before its attributes, such as its position or velocity, are estimated. For WSNs, the classical distributed detection framework needs to be reconsidered by taking into account the important features and limitations of sensors and the wireless channels between the sensors and the fusion center. Since a WSN has stringent resource constraints in terms of power and/or bandwidth, the design of appropriate distributed detection algorithms should satisfy these constraints. Furthermore, error-free transmission of sensor measurements to the fusion center over wireless channels may require high transmission power and/or powerful error correction codes, which might be prohibitive for sensors with limited power and processing capabilities. Therefore, channel impairments should be taken into account in the design of distributed detection systems. A recent survey [7] summarizes the results on distributed detection, estimation, and tracking in WSNs with a special emphasis on solutions that take into account the communication network connecting the sensors and the resource constraints at the sensors.

The remainder of the chapter is organized as follows. In Section 4.2, under the conditional independence assumption, we first introduce the conventional design of decision rules at the local sensors and at the fusion center to optimize detection performance, under the Bayesian and Neyman–Pearson (NP) criteria. In many practical scenarios, it may be difficult to obtain the optimal decision rules which require information about the performance of individual sensors. Hence, decision rules that do not require this information are desirable. Later in this section, we discuss false discovery rate (FDR)-based decision fusion which does not require the knowledge of the local sensor parameters while employing nonidentical decision thresholds at each sensor. In Section 4.3, we investigate the decision fusion problem, where the channels between the sensors and the fusion center are subject to fading and noise. We review channel aware decision fusion algorithms with different degrees of channel state information. Finally, in Section 4.4, a summary of the chapter is presented and some open challenging issues for distributed detection are addressed.

4.2    DISTRIBUTED DETECTION OVER IDEAL COMMUNICATION CHANNELS

When there are two possible sets of action, the problem is a binary hypothesis testing problem. We label the two possible choices as H0 and H1. Hypothesis H0 usually represents the absence of an object or event and Hypothesis H1 corresponds to its presence. If there are M hypotheses with M > 2, it is a multiple hypothesis testing problem or M-ary detection problem. In this chapter, we focus on the binary hypothesis testing problem. More detailed treatment for the multiple hypothesis testing problem can be found in the literature [8, 9, 10, 11, 12, 13].

In the hypothesis testing problem, the source or event of interest is not directly observable. Corresponding to each hypothesis, an observation (or a set of observations), which is a random variable (vector) in the observation space, is generated according to some probabilistic law. Let us assume that there are K sensors in the WSN and the observation at each of the K sensors, zk, corresponds to either of the two hypotheses

$$\begin{aligned} H_0 &: p_0(\theta) \\ H_1 &: p_1(\theta) \end{aligned}$$

(4.1)

where p0(θ) and p1(θ) are the pdfs under H0 and H1, respectively. More specifically, if the problem is to detect the absence or presence of the signal of interest, the received observation at each sensor has the form

$$z_k=\begin{cases} n_k & \text{under } H_0 \\ \theta+n_k & \text{under } H_1 \end{cases}$$

(4.2)

where

θ represents the parameter vector that characterizes the hypothesis H1

nk represents the noise

By examining the observation, we try to infer which hypothesis is the correct one based on a certain decision rule. Usually, a decision rule partitions the observation space into decision regions corresponding to the different hypotheses. The hypothesis corresponding to the decision region where the observation falls is declared true. Whenever a decision does not match the true hypothesis, an error occurs. To obtain the fewest errors (or least cost), the decision rule plays an important role and should be designed according to the optimization criterion in use.

The parallel configuration, shown in Figure 4.1, is the most common topological structure and has been studied extensively in the literature. In the parallel topology, the sensors do not communicate with each other and there is no feedback from the fusion center to any sensor. Sensors either transmit their measurements zk directly to the fusion center or send a quantized version of their local measurements defined by the mapping rule uk = γk(zk), k ∈ {1, 2, …, K}. Based on the received information u = [u1, …, uK], the fusion center arrives at the global decision u0 = γ0(u) that favors either H1 (decides u0 = 1) or H0 (decides u0 = 0). The goal is to obtain the optimal set of decision rules Γ = (γ0, γ1, …, γK) according to the objective function under consideration, which can follow either the Bayesian formulation or the NP formulation. For general network structures, finding the optimal solution to the distributed detection problem, i.e., the optimal decision rules (γ1, …, γK), is NP-complete [14, 15, 16]. Nonetheless, under the conditional independence assumption the optimum solution becomes tractable.

Image

FIGURE 4.1 Parallel configuration.

The conditional independence assumption implies that the joint density of the observations obeys

$$p(z_1,\ldots,z_K|H_j)=\prod_{k=1}^{K}p(z_k|H_j),\quad \text{for } j=0,1$$

(4.3)

Consider a scenario in which the observations at the sensors are conditionally independent as well as identically distributed. The symmetry in the problem suggests that the decision rules at the sensors should be identical. But counterexamples have been found in which nonidentical decision rules are optimal [16, 17, 18, 19]. In the following sections, the decision rules at local sensors and the fusion center are designed according to Bayesian and NP formulations for the parallel configuration.

4.2.1    BAYESIAN FORMULATION

Let the vector of sensor decisions be denoted as u = [u1, …, uK] so that the conditional densities under the two hypotheses are p(u|H0) and p(u|H1) respectively. The observations are generated from these conditional densities which are assumed known. The a priori probabilities of the two hypotheses denoted by P(H0) and P(H1) are assumed to be known. In the binary hypothesis testing problem, four possible actions can occur. Let Ci,j, i ∊ {0, 1}, j ∊ {0, 1} represent the cost of declaring Hi true when Hj is present. The Bayes risk function is given by

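$$R=\sum_{i=0}^{1}\sum_{j=0}^{1}C_{i,j}P(H_j)P(\text{Decide } H_i|H_j \text{ is present})=\sum_{i=0}^{1}\sum_{j=0}^{1}C_{i,j}P(H_j)\int_{U_i}p(\mathbf{u}|H_j)\,d\mathbf{u}$$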

(4.4)

where Ui is the decision region corresponding to hypothesis Hi, which is declared true for any observation falling in the region Ui. Let U be the entire observation space so that U = U0 ∪ U1 and U0 ∩ U1 = ∅.

If C0,0 = C1,1 = 0 and C0,1 = C1,0 = 1, we have the minimum probability of error criterion, i.e., R = Pe = P(u0 = 1|H0)P0 + P(u0 = 0|H1)P1. The probability of error is given by

$$P_e=P(H_0)P_F+P(H_1)(1-P_D)$$

(4.5)

where

PF = P(u0 = 1|H0) denotes the probability of false alarm

PD = P(u0 = 1|H1) denotes the probability of detection

Given the vector of local sensor decisions, u, the probability of error is expressed as

$$P_e=P(H_0)P(u_0=1|H_0)+P(H_1)\left(1-P(u_0=1|H_1)\right)$$

(4.6)

which can be written as

$$P_e=P(H_1)+\sum_{\mathbf{u}}P(u_0=1|\mathbf{u})\left[P(H_0)P(\mathbf{u}|H_0)-P(H_1)P(\mathbf{u}|H_1)\right]$$

Pe is minimized if

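$$\begin{aligned} P(u_0=1|\mathbf{u})&=0 \quad \text{when } \left[P(H_0)P(\mathbf{u}|H_0)-P(H_1)P(\mathbf{u}|H_1)\right]>0 \\ P(u_0=1|\mathbf{u})&=1 \quad \text{when } \left[P(H_0)P(\mathbf{u}|H_0)-P(H_1)P(\mathbf{u}|H_1)\right]<0 \end{aligned}$$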

(4.7)

The earlier property leads to the following likelihood ratio test (LRT) at the fusion center [1]:

$$\frac{P(\mathbf{u}|H_1)}{P(\mathbf{u}|H_0)}=\prod_{k=1}^{K}\frac{P(u_k|H_1)}{P(u_k|H_0)}\underset{u_0=0}{\overset{u_0=1}{\gtrless}}\frac{P(H_0)}{P(H_1)}$$

(4.8)
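For a small number of sensors, the minimum-error fusion rule implied by (4.7) can be tabulated by brute force. The sketch below is only illustrative: it assumes conditionally independent binary decisions, and the function name `min_error_fusion` and the per-sensor lists `pf` and `pd` (holding P(uk = 1|H0) and P(uk = 1|H1)) are hypothetical names introduced here.

```python
from itertools import product
from math import prod

def min_error_fusion(pf, pd, p_h0):
    """Return (rule, pe): rule maps each decision vector u to u0 following (4.7)."""
    p_h1 = 1.0 - p_h0
    rule, pe = {}, p_h1  # Pe starts at P(H1); each u with u0 = 1 adds its bracketed term
    for u in product((0, 1), repeat=len(pf)):
        # Conditional independence: P(u|Hj) factors over the sensors.
        p_u_h0 = prod(f if uk else 1.0 - f for uk, f in zip(u, pf))
        p_u_h1 = prod(d if uk else 1.0 - d for uk, d in zip(u, pd))
        diff = p_h0 * p_u_h0 - p_h1 * p_u_h1
        rule[u] = 1 if diff < 0 else 0  # decide H1 only when doing so lowers Pe
        pe += min(diff, 0.0)
    return rule, pe
```

The loop visits all 2^K decision vectors, so this is practical only for small K; it is meant to make the structure of (4.7) and (4.8) concrete rather than to serve as a design tool.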

The quantity on the left-hand side is known as the likelihood ratio and the quantity on the right-hand side is the threshold. Let

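$$\begin{aligned} \mathbf{u}^{i}&=[u_1,\ldots,u_{i-1},u_{i+1},\ldots,u_K], \\ A(\mathbf{u}^{i})&=P(u_0=1|\mathbf{u}^{i1})-P(u_0=1|\mathbf{u}^{i0}), \\ \mathbf{u}^{ij}&=[u_1,\ldots,u_{i-1},u_i=j,u_{i+1},\ldots,u_K],\quad j=0,1 \end{aligned}$$

and CF = P0(C10 − C00), CD = (1 − P0)(C01 − C11). Then, the LRT at each sensor has the form

$$\frac{p(z_i|H_1)}{p(z_i|H_0)}\underset{u_i=0}{\overset{u_i=1}{\gtrless}}\frac{\sum_{\mathbf{u}^{i}}C_F\,A(\mathbf{u}^{i})\prod_{k=1,k\neq i}^{K}P(u_k|H_0)}{\sum_{\mathbf{u}^{i}}C_D\,A(\mathbf{u}^{i})\prod_{k=1,k\neq i}^{K}P(u_k|H_1)}\quad \text{for } i=1,\ldots,K$$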

(4.9)

The conditional independence assumption and the optimality of LRTs at the local sensors do not completely solve the problem. The LRT thresholds at the sensors are coupled with each other and affect the system performance in an interdependent manner. The local sensor thresholds are almost invariably found with the so-called person-by-person optimization (PBPO) approach, where each sensor’s threshold is optimized assuming fixed decision rules at all other sensors and the fusion center [20]. Unfortunately, the PBPO algorithm does not necessarily lead to a globally optimal solution and may only reach a local minimum of the solution space. Multiple initializations may be needed to obtain the global optimum.

4.2.2    NEYMAN–PEARSON FORMULATION

The NP formulation of the distributed detection problem can be stated as follows: Let α be a prescribed bound on the global probability of false alarm such that PF = P(u0 = 1|H0) ≤ α. Then the problem is to find (optimum) local and global decision rules that maximize the probability of detection PD = P(u0 = 1|H1) given PF = P(u0 = 1|H0) ≤ α.

Under the conditional independence assumption, the mapping rules at the sensors as well as the decision rule at the fusion center are threshold rules based on the appropriate likelihood ratios [21,22]:

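$$\frac{p(z_k|H_1)}{p(z_k|H_0)}\begin{cases} >t_k, & \text{then } u_k=1 \\ =t_k, & \text{then } u_k=1 \text{ with probability } \epsilon_k \\ <t_k, & \text{then } u_k=0 \end{cases}$$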

(4.10)

for k = 1, …, K, and

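$$\prod_{k=1}^{K}\frac{P(u_k|H_1)}{P(u_k|H_0)}\begin{cases} >\lambda_0, & \text{decide } H_1 \text{ or set } u_0=1 \\ =\lambda_0, & \text{randomly decide } H_1 \text{ with probability } \epsilon \\ <\lambda_0, & \text{decide } H_0 \text{ or set } u_0=0 \end{cases}$$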

(4.11)

If the likelihood ratio in (4.10) is a continuous random variable with no point mass, then the randomization is unnecessary and εk can be assumed to be zero without losing optimality. The threshold λ0 in (4.11) as well as the local thresholds tk in (4.10) need to be determined so as to maximize PD for a given PF = α. This can still be quite difficult even though the local decision rules and the global fusion rule are LRTs [1]. Since (4.11) is known to be a monotone fusion rule, one can solve for the set of optimal local thresholds {tk, k = 1, …, K} for a given monotone fusion rule and compute the corresponding PD. One can then successively consider other possible monotone fusion rules and obtain the corresponding detection probabilities. The final optimal solution is the monotone fusion rule and the corresponding local decision rules that provide the largest PD. An iterative gradient method was proposed in [23] to find the thresholds satisfying the preassigned false alarm probability. Finding the optimal solution in this fashion is possible only for very small values of K. The complexity increases with K, because (1) the number of monotone rules grows exponentially with K, and (2) finding the optimal {tk, k = 1, …, K} for a given fusion rule is an optimization problem involving a K − 1 dimensional search (one dimension fewer than K because of the constraint PF = α).
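As a concrete instance of the local test (4.10), consider the sensing model (4.2) with a known scalar θ > 0 and zero-mean Gaussian noise of standard deviation σ. The Gaussian model and the names `q_function` and `local_lrt` are assumptions introduced here for illustration. In this case the likelihood ratio is continuous and increasing in zk, so no randomization is needed and the LRT reduces to comparing zk with an observation threshold τ.

```python
from math import erfc, log, sqrt

def q_function(x):
    """Gaussian tail probability Q(x) = P(N(0, 1) > x)."""
    return 0.5 * erfc(x / sqrt(2.0))

def local_lrt(theta, sigma, t):
    """Map the LRT threshold t of (4.10) to an observation threshold tau.

    Assumes z = n under H0 and z = theta + n under H1 with n ~ N(0, sigma^2)
    and theta > 0, so that LR(z) > t is equivalent to z > tau.
    Returns (tau, pf, pd), the threshold and the local operating point.
    """
    tau = sigma**2 * log(t) / theta + theta / 2.0
    pf = q_function(tau / sigma)            # P(z > tau | H0)
    pd = q_function((tau - theta) / sigma)  # P(z > tau | H1)
    return tau, pf, pd
```

For example, `local_lrt(2.0, 1.0, 1.0)` places τ midway between the two means. Raising t moves τ upward and lowers both the local false alarm and detection probabilities.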

4.2.3    DESIGN OF FUSION RULES

Given the local detectors, the problem is to determine the fusion rule to combine local decisions optimally. Let us first consider the case where local detectors make only hard decisions, i.e., uk can take only two values 0 or 1 corresponding to the two hypotheses H0 and H1. Then, the fusion rule is essentially a logical function with K binary inputs and one binary output. There are $2^{2^K}$ possible fusion rules in general, and an exhaustive search strategy is not feasible for large K.

Let Pf,k and Pd,k denote the probabilities of false alarm and detection of sensor k, respectively, i.e., Pf,k = P(uk = 1|H0) and Pd,k = P(uk = 1|H1). According to (4.8) and (4.11), the optimum fusion rule is given by the LRT:

$$\prod_{k=1}^{K}\frac{P(u_k|H_1)}{P(u_k|H_0)}\underset{u_0=0}{\overset{u_0=1}{\gtrless}}\lambda$$

(4.12)

Here, λ is determined by the optimization criterion in use. The left-hand side of (4.12) can be written as

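$$\prod_{k=1}^{K}\frac{P(u_k|H_1)}{P(u_k|H_0)}=\prod_{k=1}^{K}\left(\frac{P(u_k=1|H_1)}{P(u_k=1|H_0)}\right)^{u_k}\left(\frac{P(u_k=0|H_1)}{P(u_k=0|H_0)}\right)^{1-u_k}=\prod_{k=1}^{K}\left(\frac{P_{d,k}}{P_{f,k}}\right)^{u_k}\left(\frac{1-P_{d,k}}{1-P_{f,k}}\right)^{1-u_k}$$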

(4.13)

Taking the logarithm of both sides of (4.12), we have the Chair–Varshney fusion rule [24]

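$$\sum_{k=1}^{K}\left[u_k\log\frac{P_{d,k}}{P_{f,k}}+(1-u_k)\log\frac{1-P_{d,k}}{1-P_{f,k}}\right]\underset{u_0=0}{\overset{u_0=1}{\gtrless}}\log\lambda$$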

(4.14)

This rule can also be expressed as

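$$\sum_{k=1}^{K}\left[\log\frac{P_{d,k}(1-P_{f,k})}{P_{f,k}(1-P_{d,k})}\right]u_k\underset{u_0=0}{\overset{u_0=1}{\gtrless}}\log\lambda+\sum_{k=1}^{K}\log\frac{1-P_{f,k}}{1-P_{d,k}}$$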

(4.15)

Thus, the optimum fusion rule can be implemented by forming a weighted sum of the incoming local decisions and comparing it with a threshold. The weights and the threshold are determined by the local probabilities of detection and false alarm. If the local decisions have the same statistics, i.e., Pf,k = Pf,l and Pd,k = Pd,l for k ≠ l, the Chair–Varshney fusion rule reduces to a T-out-of-K form or a counting rule, i.e., the global decision u0 = 1 if T or more sensor decisions are one. This structure of the fusion rule reduces the computational complexity considerably.
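The weighted-sum form (4.15) translates almost directly into code. The following is a minimal sketch, assuming the local operating points Pf,k and Pd,k are known at the fusion center and lie strictly between 0 and 1. The function name `chair_varshney` is hypothetical, and ties are resolved in favor of H0, which is a choice made here because (4.15) does not specify the tie case.

```python
from math import log

def chair_varshney(u, pf, pd, lam=1.0):
    """Return the global decision u0 from the weighted-sum form (4.15).

    u  : list of local decisions (0 or 1)
    pf : list of local false alarm probabilities Pf,k
    pd : list of local detection probabilities Pd,k
    lam: LRT threshold lambda from (4.12)
    """
    weights = [log(d * (1 - f) / (f * (1 - d))) for f, d in zip(pf, pd)]
    threshold = log(lam) + sum(log((1 - f) / (1 - d)) for f, d in zip(pf, pd))
    statistic = sum(w * uk for w, uk in zip(weights, u))
    return 1 if statistic > threshold else 0  # a tie falls to H0 (assumption)
```

With one reliable sensor (Pf = 0.05, Pd = 0.95) and two weak ones (Pf = 0.3, Pd = 0.6) and λ = 1, a lone detection from the reliable sensor outweighs two detections from the weak sensors. With three identical sensors (Pf = 0.1, Pd = 0.8) the same function behaves as a 2-out-of-3 counting rule.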

So far, we have assumed that the parameters characterizing a hypothesis, θ, are fixed and known leading to the conditional independence assumption. In many situations, these parameters can take unknown values or a range of values. Such hypotheses are called composite hypotheses and the corresponding detection problem is known as composite hypothesis testing. If θ is characterized as a random vector with known probability densities under the two hypotheses, the LRT can be extended to composite hypothesis testing in a straightforward manner:

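$$\Lambda(\mathbf{u})=\frac{\int_{\Theta_1}p(\mathbf{u}|\theta,H_1)\,p(\theta|H_1)\,d\theta}{\int_{\Theta_0}p(\mathbf{u}|\theta,H_0)\,p(\theta|H_0)\,d\theta}\underset{u_0=0}{\overset{u_0=1}{\gtrless}}\eta$$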

(4.16)

If θ is nonrandom, i.e., fixed but unknown constant, one would like to be able to obtain uniformly most powerful (UMP) results for an optimum scheme based on an NP test. If a UMP test does not exist, we can use the maximum likelihood estimates of its value under the two hypotheses as the true values in an LRT, resulting in the so-called generalized likelihood ratio test (GLRT):

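$$\Lambda(\mathbf{u})=\frac{\max_{\theta\in\Theta_1}p(\mathbf{u}|\theta,H_1)}{\max_{\theta\in\Theta_0}p(\mathbf{u}|\theta,H_0)}\underset{u_0=0}{\overset{u_0=1}{\gtrless}}\eta$$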

(4.17)

Note that the optimum NP or Bayesian detectors involve an LRT as in (4.12). Although the NP and Bayesian detectors are optimum in the sense of maximizing PD for a fixed PF, and minimizing the Bayes risk, the associated LRTs require the complete knowledge of the pdfs p(u|H1) and p(u|H0) which may not always be available in a practical application. Also, there are many detection problems where the exact form of the LRT is too complicated to implement. Therefore, simpler and more robust suboptimal detectors are used in numerous applications [25]. For some suboptimal detectors, the detection performance can be improved by adding an independent noise to the observations under certain conditions which is known as stochastic resonance (SR) noise [26]. The work in [27] first discusses the improvability of the detection performance by adding SR noise given a suboptimal fixed detector. If the performance can be improved, then the best noise type is determined in order to maximize PD without increasing PF. The work in [28] discusses variable detectors.

In this chapter, we have focused on fixed-sample-size detection problems for the parallel architecture. Solutions for arbitrary topologies such as serial [1,29, 30, 31] and tree have been derived and are discussed in [32, 33, 34]. In fixed-sample-size detection, the fusion center arrives at a decision after receiving the entire set of sensor observations or decisions. Sequential detectors may choose to stop at any time and make a final decision or continue to take additional observations [35, 36, 37, 38, 39]. Moreover, in consensus-based detection [40, 41, 42], which requires no fusion center, sensors first collect sufficient observations over a period of time and then run a consensus algorithm to fuse their local log likelihood ratios.

4.2.4    ASYMPTOTIC REGIME

In this section, we describe some results when the number of sensors becomes very large, i.e., we discuss some asymptotic results. It has been shown that identical decision rules are optimal in the asymptotic regime where the number of sensors increases to infinity [16,43]. In other words, the identical decision rule assumption often results in little or no loss of optimality. Therefore, identical local decision rules are frequently assumed in many situations, which reduces the computational complexity considerably.

For any reasonable collection of decision rules Γ, the probability of error at the fusion center goes to zero exponentially as the number of sensors K grows unbounded. It is then adequate to compare decision rules based on their exponential rate of convergence to zero:

$$\lim_{K\to\infty}\frac{\log P_e(\Gamma)}{K}$$

(4.18)

It was shown that for the binary hypothesis testing problem, use of identical local decision rules for all the sensor nodes is asymptotically optimal in terms of the error exponent [43]. In [44], the exact asymptotics of the minimum error probabilities achieved by the optimal parallel fusion network and the system obtained by imposing the identical decision rule constraint was investigated. It was shown analytically that the restriction of identical decision rules leads to little or no loss of performance. Asymptotic regimes applied to distributed detection are convenient because they capture the dominating behaviors of large systems. This leads to valuable insights into the problem structure and its solution.
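The exponential decay behind (4.18) can be observed numerically. The sketch below assumes identical, conditionally independent binary sensors fused with a T-out-of-K rule. The operating point (Pf = 0.2, Pd = 0.7), the equal priors, and the function name `min_pe_counting` are illustrative assumptions. It prints −log(Pe)/K, the negative of the quantity in (4.18), for a growing number of sensors.

```python
from math import comb, log

def min_pe_counting(K, pf, pd, p_h0=0.5):
    """Smallest error probability over all thresholds T of a T-out-of-K rule."""
    def tail(T, p):
        # P(Lambda >= T) for Lambda ~ Binomial(K, p)
        return sum(comb(K, k) * p**k * (1 - p)**(K - k) for k in range(T, K + 1))
    # T = 0 always decides H1 and T = K + 1 always decides H0.
    return min(p_h0 * tail(T, pf) + (1 - p_h0) * (1 - tail(T, pd))
               for T in range(K + 2))

for K in (10, 20, 40, 80):
    pe = min_pe_counting(K, pf=0.2, pd=0.7)
    print(K, pe, -log(pe) / K)
```

The error probability shrinks rapidly with K while the normalized exponent changes only slowly, which is the behavior the error-exponent comparison relies on. At finite K the printed values are only a rough indication of the limit.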

In the asymptotic regime, it has been shown in [45] that if there exists a binary quantization function γb whose Chernoff information exceeds half of the information contained in an unquantized observation, then transmitting binary decisions from sensors to the fusion center becomes optimal. The requirement is fulfilled by many practical applications [46] such as the problem of detecting deterministic signals in Gaussian noise and the problem of detecting fluctuating signals in Gaussian noise using a square-law detector. In these scenarios, the gain offered by having more sensor nodes outperforms the benefits of getting detailed information from each sensor.

4.2.5    COUNTING RULE

Most of the results discussed so far on distributed detection are based on the assumption that the local sensors’ detection performances, namely, either the local sensors’ signal to noise ratio (SNR) or their probability of detection and false alarm rate, are known to the fusion center. For a WSN consisting of passive sensors, it might be very difficult to estimate local sensors’ performances via experiments because sensors’ distances from the signal of interest might be unknown to the fusion center and to the local sensors. Even if the local sensors can somehow estimate their detection performances in real time, it can be still very expensive to transmit them to the fusion center, especially for a WSN with very limited system resources. Hence, the knowledge of the local sensors’ performances cannot be taken for granted and a fusion rule that does not require local sensors’ performances is highly preferable. Without the knowledge of local sensors’ detection performances and their positions, an approach at the fusion center is to treat every sensor equally. An intuitive solution is to use the total number of “1”s as a statistic since the information about which sensor reports a “1” is of little use to the fusion center. In [47, 48, 49], a counting-based fusion rule is proposed, which uses the total number of detections (“1”s) transmitted from local sensors as the statistic,

$$\Lambda(\mathbf{u})=\sum_{k=1}^{K}u_k\underset{u_0=0}{\overset{u_0=1}{\gtrless}}T$$

(4.19)

where T is the threshold at the fusion center, which can be decided by a prespecified probability of false alarm PF. This fusion rule is called the counting rule. It is an attractive solution, since it is quite simple to implement, and achieves very good detection performance in a WSN with randomly and densely deployed low-cost sensor nodes.

The performance of a distributed detection system, that is, the probability of false alarm and the probability of detection at the fusion center, needs to be calculated from

$$\begin{aligned} P_F&=P(u_0=1|H_0)=P(\Lambda(\mathbf{u})>\eta\,|H_0) \\ P_D&=P(u_0=1|H_1)=P(\Lambda(\mathbf{u})>\eta\,|H_1) \end{aligned}$$

(4.20)

which requires the probability density function of the test statistic Λ(u). For the counting rule as in (4.19), under hypothesis H0, the total number of detections $\Lambda=\sum_{k=1}^{K}u_k$ follows a binomial distribution. For a given threshold T, the false alarm rate can be calculated as follows:

$$P_F=\sum_{k=T}^{K}\binom{K}{k}P_f^{k}(1-P_f)^{K-k}$$

(4.21)

where Pf,1 = … = Pf,K = Pf. For the sensing model in (4.2) where θ is fixed and known, the detection probability can be obtained from

$$P_D=\sum_{k=T}^{K}\binom{K}{k}P_d^{k}(1-P_d)^{K-k}$$

(4.22)

where all the sensors use identical decision thresholds. In many practical scenarios, while computing PD, decisions are not independent of each other under hypothesis H1, since the decisions are all dependent on the target and sensors coordinates which can also be random variables. For such cases, several approximations for computing the distribution of Λ(u) under H1 can be found in [47, 48, 49].
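The binomial sums (4.21) and (4.22) are straightforward to evaluate, and (4.21) can be inverted numerically to pick the fusion threshold T for a prescribed false alarm level. The sketch below assumes identical and conditionally independent local decisions. The names `binomial_tail` and `smallest_threshold` are hypothetical, and no randomization is used, so the achieved PF is generally below the target α rather than equal to it.

```python
from math import comb

def binomial_tail(K, T, p):
    """P(Lambda >= T) when Lambda ~ Binomial(K, p), as in (4.21) and (4.22)."""
    return sum(comb(K, k) * p**k * (1 - p)**(K - k) for k in range(T, K + 1))

def smallest_threshold(K, pf, alpha):
    """Smallest T whose false alarm rate (4.21) does not exceed alpha."""
    for T in range(K + 2):  # T = K + 1 never declares H1, so PF = 0 there
        if binomial_tail(K, T, pf) <= alpha:
            return T

# Usage: choose T for K = 10 sensors with Pf = 0.1 and a target PF of 0.05,
# then evaluate the detection probability (4.22) for an assumed Pd = 0.6.
T = smallest_threshold(10, 0.1, 0.05)
PD = binomial_tail(10, T, 0.6)
```

Choosing the smallest admissible T keeps PD as large as possible, because both tails decrease as T grows.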

The calculation of PD and PF may become difficult since it requires the probability density function of the decision rule Λ(u). Deflection coefficient is a useful performance measure when the statistical properties of the received measurements are limited to moments up to a given order as

$$D(\Lambda)=\frac{\left(E[\Lambda|H_1]-E[\Lambda|H_0]\right)^{2}}{\mathrm{Var}(\Lambda|H_0)}$$

(4.23)

which requires the first two moments of the decision test statistic Λ.
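As a worked example, consider the counting statistic of (4.19) under the assumption that every sensor has the same local false alarm probability Pf and the same local detection probability Pd, and that the decisions are independent under H0. Then Λ is binomial under H0, and its mean under H1 follows from linearity of expectation, so the deflection coefficient takes a closed form:

$$E[\Lambda|H_0]=KP_f,\qquad E[\Lambda|H_1]=KP_d,\qquad \mathrm{Var}(\Lambda|H_0)=KP_f(1-P_f)\quad\Longrightarrow\quad D(\Lambda)=\frac{K(P_d-P_f)^{2}}{P_f(1-P_f)}$$

Under these assumptions the deflection grows linearly with the number of sensors K. Independence of the decisions under H1 is not needed for this expression, since only the mean of Λ under H1 enters (4.23).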

Our previous survey [50] also summarizes the decision fusion results based on identical decision rules at each sensor. Next, we summarize FDR-based decision fusion which uses nonidentical decision thresholds at each sensor.

4.2.6    FALSE DISCOVERY RATE–BASED SENSOR DECISION RULES

Let us consider a detection scenario where the sensors located within the target’s finite radius of influence receive an identical target signal and the rest of the sensors do not receive any target signal. This “disk” target signal model may be applied to scenarios such as oil or chemical leaks [51] or to approximate more general electromagnetic or acoustic target models. Though this is a very simple model, it clearly captures the scenario where the sensors in the network receive nonidentical target signals (the primary assumption in the distributed detection literature has been that all sensors receive an identical target signal). As mentioned earlier, design of the optimum local and global decision rules for such problems is very difficult. Earlier related work [47,49] assumes that all the sensors use an identical local threshold for an LRT to obtain a local decision. Since the probability of detection of each sensor is unknown due to unknown target and sensor locations, the optimal Chair–Varshney fusion rule cannot be used for this problem. An intuitive choice is to constrain the fusion center decision statistic to be linear in the total number of local detections, i.e., employ the “count” as the statistic, and perform a threshold test to obtain the global decision. This approach may also be viewed as performing multiple hypothesis tests (each sensor performing a binary hypothesis test locally), with the fusion center using the results of these tests (i.e., the outcomes of the local hypothesis tests) to come up with a global decision. Therefore, the detection problem essentially reduces to obtaining the optimal set of the two design parameters, the local and global decision thresholds. Hence, from here on we will use the terms “decision rules” and “decision thresholds” interchangeably in this chapter. Note that optimization of distributed detection systems where the local sensor SNRs may be unknown has been investigated in [52, 53, 54]. However, the optimization techniques in [52, 53, 54] require the knowledge or an estimate of the local sensor SNRs. The estimation of the local sensor SNRs is very difficult, as they are a function of the sensor and target locations, which are generally unknown. In [55], the authors propose a detection scheme based on the control of FDR, which employs nonidentical local sensor decision rules without increasing the total number of design parameters. Also, the FDR-based detection strategy proposed in [55] does not require an estimate of the local sensor SNRs. The FDR-based scheme is discussed in some detail in this section. Since FDR was first proposed in the context of multiple hypotheses problems (also known as multiple comparison problems [MCPs]) in statistics, we next provide a brief review of MCPs.

4.2.6.1    Review of Multiple Comparison Problems in Statistics

Multiple comparisons refer to multiple simultaneous hypothesis tests. When a family of tests is conducted, it is often meaningful to define an error measure for the family instead of for the individual tests. One of the most common measures is the family-wise error rate (FWER) [56], defined as the probability of committing any type I error or false alarm. If the error rate for each test is α, then the FWER αF for k tests is given by

$$\alpha_F=P(V\geq 1)=1-(1-\alpha)^{k}$$

(4.24)

where V is defined in Table 4.1. As can be seen from Equation 4.24 for a single comparison, αF = α. When the number of comparisons increases, α remains constant but αF increases. This is a fundamental problem of MCPs and classical multiple comparison procedures aim to control this error measure. A method to control FWER, known as the Bonferroni procedure, controls the FWER in the strong sense, i.e., under all conditions. The method is based on the Bonferroni inequality, which says that the probability of the union of a number of events is less than or equal to the sum of their individual probabilities:

$$P(A_1\cup A_2\cup\cdots\cup A_k)\leq\sum_{i=1}^{k}P(A_i)$$

(4.25)

Hence, if each individual test is performed at the probability of false alarm α* = αF/k, the FWER for the family of tests is maintained at αF. But this procedure is very conservative and results in significantly reduced probability of detection (reduced power). A radically different and more liberal approach proposed by Benjamini and Hochberg [57] controls FDR, defined as the fraction of false rejections among those hypotheses rejected. Table 4.1 defines some terms leading to the definition of FWER and FDR for a binary hypothesis testing problem involving two hypotheses H0 and H1.
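The growth of the FWER in (4.24) and the effect of the Bonferroni correction are easy to check numerically. The following is a small sketch; the function names `fwer` and `bonferroni_level` are hypothetical, and the closed form in (4.24) is evaluated under the assumption that the k tests are independent.

```python
def fwer(alpha, k):
    """Family-wise error rate (4.24) for k independent tests, each at level alpha."""
    return 1.0 - (1.0 - alpha) ** k

def bonferroni_level(alpha_f, k):
    """Per-test level alpha* = alpha_F / k used by the Bonferroni procedure."""
    return alpha_f / k

# Usage: 20 tests at alpha = 0.05 each, then the same family after correction.
uncorrected = fwer(0.05, 20)
corrected = fwer(bonferroni_level(0.05, 20), 20)
```

With 20 tests at α = 0.05 each, the uncorrected FWER is well above one half, while running each test at 0.05/20 keeps the family-wise rate just under 0.05. The price is a much stricter per-test threshold, which is the loss of power noted above.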

TABLE 4.1
Notations to Define FDR

|         | Declared H0 | Declared H1 | Total  |
|---------|-------------|-------------|--------|
| H0 True | U           | V           | K0     |
| H1 True | T           | S           | K − K0 |
| Total   | K − R       | R           | K      |

FDR is defined as the expected ratio of the number of false alarms (declared H1 when H0 is true) to the total number of detections (consisting of both true and false detections). The fraction of false alarms to the total number of detections can be viewed through the random variable defined as

$$Q=\begin{cases} \dfrac{V}{V+S}, & \text{if } V+S\neq 0 \\ 0, & \text{if } V+S=0 \end{cases}$$

(4.26)

FDR (Qe) is defined to be the expectation of Q,

$$Q_e=E(Q)$$

(4.27)

Along with this metric, Benjamini and Hochberg [57] also proposed the following algorithm to control FDR for multiple comparisons.

4.2.6.2    Algorithm to Control FDR

Suppose p1, p2, …, pK are the p-values for K tests and p(1), p(2), …, p(K) denote the ordered p-values. The p-value for an observation sk is defined as

$$p_k=\int_{s_k}^{\infty}f_0(s)\,ds$$

(4.28)

where f0(s) is the probability density function of the observation under H0.

The algorithm by Benjamini and Hochberg [57] which keeps the FDR below a value γ, is provided as follows:

1.  Calculate the p-values of all the observations and arrange them in ascending order.

2.  Let d be the largest k for which p(k) ≤ kγ/K.

3.  Declare all observations corresponding to p(k), k = 1, …, d, as H1.
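The three steps above can be sketched in a few lines of Python. This is a centralized illustration only, with a hypothetical function name; it is not the decentralized one-bit scheme of [55]. If no index satisfies the condition in step 2, the sketch takes d = 0 and declares H0 for every observation.

```python
from typing import List, Sequence


def benjamini_hochberg(p_values: Sequence[float], gamma: float) -> List[bool]:
    """Return one flag per observation: True means the observation is declared H1."""
    K = len(p_values)
    # Step 1: indices of the p-values in ascending order of p-value.
    order = sorted(range(K), key=lambda i: p_values[i])

    # Step 2: d is the largest rank k (1-based) with p_(k) <= k * gamma / K.
    d = 0
    for rank, idx in enumerate(order, start=1):
        if p_values[idx] <= rank * gamma / K:
            d = rank

    # Step 3: declare H1 for the d smallest p-values.
    declared = [False] * K
    for idx in order[:d]:
        declared[idx] = True
    return declared
```

For example, `benjamini_hochberg([0.5, 0.03, 0.07, 0.04], gamma=0.1)` declares H1 for the three smallest p-values: the smallest one (0.03) exceeds its own threshold 0.025, but the third smallest (0.07) falls below 0.075, so d = 3.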

Under the assumption of independence of the test statistics corresponding to the true null hypotheses (H0), this procedure controls the FDR at γ. It was later proved in [58] that the same procedure also controls the FDR when the test statistics have positive regression dependency on each of the test statistics corresponding to the true null hypotheses. Note that the FDR-based decision-making system looks for the largest index k = d such that p(d) ≤ dγ/K. There may be other indices k = l, with l < d, for which the condition p(l) ≤ lγ/K is also true, but the FDR-based decision system looks for the largest value of k for which it holds. The reason behind this, as discussed in [57], is to achieve the largest probability of detection while constraining the FDR to be less than or equal to γ. A detailed proof of the control of FDR by this algorithm is provided in [57]. It should also be noted that the assumption of independence of the test statistics corresponding to the false null hypotheses (H1) is not needed for the proof of the theorem.

As the ordering of p-values is required for the FDR control procedure described in [57], the procedure conventionally needs centralized processing. For the distributed detection problem considered earlier, the sensors can only send one bit to the fusion center and hence a distributed ordering scheme is necessary. A decentralized FDR procedure has been proposed in [55] which requires only one-bit communication capability for each sensor and achieves the same performance as the centralized Benjamini–Hochberg procedure. The maximum communication cost for the entire network is less than or equal to K bits per detection round, where K is the total number of sensors in the network.

An important property of FDR is now presented in the following proposition [57].

Proposition 1

If all MCP hypotheses are true H0s, i.e., K0 = K, control of FDR is equivalent to the control of FWER. However, if some of the MCP hypotheses are true H1s, i.e., K0 < K, the FDR is smaller than or equal to FWER.

As seen from Proposition 1, FDR is the expectation of a ratio and hence the control of FDR is more liberal compared to the control of FWER in general, and as the number of true H1s increases, the local detection probability increases. Also, as seen from the algorithm provided earlier, the control of FDR results in a data dependent rejection region (decision region) unlike conventional statistical tests where the rejection region is fixed a priori. This characteristic of FDR, as illustrated next, is the primary motivation behind the control of FDR for distributed detection to design local decision thresholds.

4.2.6.3    Design Guidelines for Distributed Detection Systems

Based on the earlier discussion on MCPs, if K sensors employ an identical decision threshold equal to τ (or, equivalently, a p-value threshold of Q(τ)), the FWER is controlled at a value of KQ(τ) under all conditions. However, an FDR-based threshold selection scheme, with FDR parameter γ, will result in control of the FWER at γ when there is no target in the ROI, i.e., when all MCP hypotheses are true H0s. In the presence of a target, i.e., when some MCP hypotheses are true H1s, as seen from Proposition 1, the FWER is at least as large as the FDR. Thus, when there is no target, an FDR-based scheme may be designed to control the FWER at any arbitrary level. But the same scheme, in the presence of a target, is more liberal (in the sense of permitting more local detections) at the cost of higher FWER. Hence, the total number of detections (irrespective of whether they are true or false local detections) over the sensor field increases significantly in the presence of a target compared to an identical threshold scheme. Thus, the control of FDR provides better separation of the probability mass functions (pmfs) of the “count” under the global hypothesis G0 (target absent in ROI) and the global hypothesis G1 (target present in ROI) compared to a scheme that controls the FWER. Here “better separation” means that, for the FDR-based detection scheme, the distance (quantifiable in terms of metrics such as the deflection coefficient) between the pmfs of the “count” under hypotheses G0 and G1 is likely to be larger than for an identical threshold approach.

As discussed earlier, the two design parameters for the distributed detection system are the local sensor decision threshold parameter (γ for the FDR-based strategy) and the global decision threshold parameter, denoted by T. For any observed count Δ ∊ Z (Z denotes the set of integers {0, …, K}), the binary hypothesis testing problem at the fusion center is given by

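$$\begin{aligned} G_0 &: P(\Delta = i; G_0) = p_0(\Delta) : \text{Target absent} \\ G_1 &: P(\Delta = i; G_1) = p_1(\Delta) : \text{Target present} \end{aligned}$$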

(4.29)

If T(Δ) is the decision statistic, the optimal test under the NP criterion is given by a randomized decision rule which chooses the hypothesis G1 with probability δT(Δ), where

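$$\delta_T(\Delta) = \begin{cases} 1, & \text{if } T(\Delta) > T \\ \kappa, & \text{if } T(\Delta) = T \\ 0, & \text{if } T(\Delta) < T \end{cases}$$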

(4.30)

where

T is the global threshold

κ is the randomization parameter

T(Δ) is the likelihood ratio

However, for the problem considered here, the optimal NP detector is very complex. Hence, a simplified detector is adopted in which the test statistic is linear in the “count,” i.e., T(Δ) = Δ. The threshold T and the randomization constant κ are chosen such that the system-wide probability of false alarm is controlled. The system-wide probability of false alarm PFA for this simplified detector is given by

$$P_{FA} = P(\Delta > T; G_0) + \kappa P(\Delta = T; G_0)$$

(4.31)

The system-wide probability of detection PD for this simplified detector is given by

$$P_D = P(\Delta > T; G_1) + \kappa P(\Delta = T; G_1)$$

(4.32)

For the FDR-based detector, for any arbitrary FDR parameter γ, the parameters T and κ are selected such that the system-level probability of false alarm is constrained. The system-level probability of false alarm for a threshold T and randomization constant κ is given by [55]

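$$P_{FA} = \sum_{k=T+1}^{K} \binom{K}{k} (1-\gamma) \left(\frac{k\gamma}{K}\right)^{k} \left(1 - \frac{k\gamma}{K}\right)^{K-k-1} + \kappa \binom{K}{T} (1-\gamma) \left(\frac{T\gamma}{K}\right)^{T} \left(1 - \frac{T\gamma}{K}\right)^{K-T-1}$$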

(4.33)
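As a rough illustration of how T and κ might be selected, the sketch below evaluates the summand of Equation 4.33 as the pmf of the count under G0 and then searches for the threshold and randomization constant that meet a target false alarm probability with equality. The function names and the search strategy are assumptions made for this example; they are not taken from [55].

```python
from math import comb
from typing import List, Tuple


def null_count_pmf(K: int, gamma: float) -> List[float]:
    """P(count = k; G0) for k = 0..K, taken from the summand of Equation 4.33 (requires 0 < gamma < 1)."""
    pmf = []
    for k in range(K + 1):
        x = k * gamma / K
        pmf.append(comb(K, k) * (1.0 - gamma) * x**k * (1.0 - x) ** (K - k - 1))
    return pmf


def system_pfa(pmf0: List[float], T: int, kappa: float) -> float:
    """P(count > T; G0) + kappa * P(count = T; G0), as in Equation 4.31."""
    return sum(pmf0[T + 1:]) + kappa * pmf0[T]


def choose_threshold(pmf0: List[float], pfa_target: float) -> Tuple[int, float]:
    """Find (T, kappa) such that the randomized count test has false alarm probability pfa_target."""
    K = len(pmf0) - 1
    tail = 0.0  # running value of P(count > T; G0)
    for T in range(K, -1, -1):
        if tail + pmf0[T] > pfa_target:
            # P(count > T) <= target < P(count >= T): randomize on the boundary value T.
            return T, (pfa_target - tail) / pmf0[T]
        tail += pmf0[T]
    return 0, 1.0  # target of 1 or more: always decide G1


if __name__ == "__main__":
    pmf0 = null_count_pmf(K=20, gamma=0.1)
    T, kappa = choose_threshold(pmf0, pfa_target=0.05)
    print(T, kappa, system_pfa(pmf0, T, kappa))
```

Note that the k = 0 term equals 1 − γ, which matches the statement that the FDR-based scheme controls the FWER at γ when no target is present.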

Also, for any arbitrary FDR parameter γ, T and κ, the system-wide probability of detection is given by [55]

$$P_D = \sum_{k=T+1}^{K} P(\Delta = k; G_1) + \kappa P(\Delta = T; G_1)$$

(4.34)

where P(Δ = k; G1) is the probability of observing “count” k for a target present in the ROI [55]. For large K [55], the system-wide probability of detection may be approximated by

$$P_D \approx Q\left(\frac{T - K\bar{p}_d}{\sqrt{K\bar{p}_d(1-\bar{p}_d)}}\right)$$

(4.35)

where $\bar{p}_d$ is the average probability of detection for a sensor.
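The large-K approximation in Equation 4.35 is straightforward to evaluate. The sketch below implements it and, as a sanity check, compares it with an exact binomial tail under the simplifying assumption (made here only for illustration) that all K sensors detect independently with the same probability. The function names are hypothetical.

```python
from math import comb, erfc, sqrt


def q_function(x: float) -> float:
    """Tail probability of the standard normal distribution, Q(x) = P(Z > x)."""
    return 0.5 * erfc(x / sqrt(2.0))


def pd_gaussian_approx(T: float, K: int, p_bar_d: float) -> float:
    """Approximate system-wide detection probability of Equation 4.35."""
    mean = K * p_bar_d
    std = sqrt(K * p_bar_d * (1.0 - p_bar_d))
    return q_function((T - mean) / std)


def pd_binomial_exact(T: int, K: int, p_d: float) -> float:
    """P(count > T) when every sensor detects independently with the same probability p_d."""
    return sum(comb(K, k) * p_d**k * (1.0 - p_d) ** (K - k) for k in range(T + 1, K + 1))


if __name__ == "__main__":
    K, p_d, T = 400, 0.3, 110
    print(pd_gaussian_approx(T, K, p_d), pd_binomial_exact(T, K, p_d))
```

The approximation ignores the randomization term κP(Δ = T; G1), which becomes negligible as K grows.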

The choice of the optimum FDR parameter γ, where optimality is with respect to system-level detection performance, is a difficult problem. Receiver operating characteristic (ROC)-based optimization procedures to obtain the best γ or τ are computationally prohibitive. A computationally less intensive approach is to obtain γ or τ via optimization of the deflection coefficient. Under Gaussian assumptions, it is known that maximizing the deflection coefficient maximizes the detection performance [59] in terms of the ROC. Under non-Gaussian conditions, however, there is no general result showing that a larger deflection coefficient achieves better performance in terms of ROC curves. It is nevertheless intuitive that an increased deflection coefficient generally implies greater separation between P(Δ; G0) and P(Δ; G1) and hence is likely to lead to a better detector design. Hence, the FDR parameter γ is set at a value such that the deflection coefficient is maximized. A comparison of the detection performance of an FDR-based scheme and an identical threshold scheme is shown in Figure 4.2. It is observed that the FDR-based detection approach shows significant improvement in performance over the classically used identical decision threshold approach.
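The chapter does not restate the deflection coefficient at this point. The sketch below assumes the commonly used definition, D = (E[Δ; G1] − E[Δ; G0])² / Var(Δ; G0), and computes it from the two pmfs of the count. Both the definition and the function name are assumptions for illustration; the exact form used in [55,59] should be checked against those references.

```python
from typing import Sequence


def deflection_coefficient(pmf0: Sequence[float], pmf1: Sequence[float]) -> float:
    """Deflection of the count statistic, with pmf0[k] = P(count = k; G0) and pmf1[k] = P(count = k; G1)."""
    mean0 = sum(k * p for k, p in enumerate(pmf0))
    mean1 = sum(k * p for k, p in enumerate(pmf1))
    var0 = sum((k - mean0) ** 2 * p for k, p in enumerate(pmf0))
    return (mean1 - mean0) ** 2 / var0
```

In a design loop, one would evaluate this quantity on a grid of candidate γ values, using the pmfs of the count that each γ induces, and keep the γ with the largest deflection.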

4.2.7    CORRELATED DECISIONS

An important result in distributed detection is that, for the classical framework, LRTs at the local sensors are optimal if the observations are conditionally independent given each hypothesis [16]. This property drastically reduces the search space for an optimal set of local decision rules. Although the resulting problem is not necessarily easy, it is amenable to analysis in many contexts. In general, it is reasonable to assume conditional independence across sensor nodes if the uncertainty comes mainly from device and ambient noise. However, the assumption does not necessarily hold for arbitrary sensor systems. For instance, when sensors lie in close proximity to one another, we expect their observations to be strongly correlated. If the observed signal is random in nature or the sensors are subject to common external noise, the conditional independence assumption may also fail. Without the conditional independence assumption, the joint density of the observations, given the hypothesis, cannot be written as the product of the marginal densities, as in (4.3). The optimal tests at the sensors are no longer of the threshold type based solely on the likelihood ratio of the observations at the individual sensors. In general, finding the optimal solution to the distributed detection problem becomes intractable [14]. Distributed detection with conditionally dependent observations is known to be a challenging problem in decentralized inference.


FIGURE 4.2 Detection performance comparison of FDR-based scheme and identical threshold scheme.

One may restrict attention to the set of likelihood ratio–based tests and employ algorithms to determine the best solution from this restricted set. The resulting system may yield acceptable performance. This approach has been adopted in [60] where detection of known and unknown signals in correlated noise was considered. For the case of two sensors observing a shift-in-mean of Gaussian data, Chen and Papamarcou [61] develop sufficient conditions for the optimality of each sensor implementing a local LRT. Aalo and Viswanathan [62] assume local LRTs at multiple sensors and study the effect of correlated noise on the performance of a distributed detection system. The detection of a known signal in additive Gaussian and Laplacian noise is considered. System performance deteriorates when the correlation increases. In [63], two correlation models are considered. In one, the correlation coefficient between any two sensors decreases geometrically as the sensor separation increases. In the other model, the correlation coefficient between any two sensors is a constant. Asymptotic performance with Gaussian noise when the number of sensors goes to infinity is examined. In [64], Blum et al. study distributed detection of known signals in correlated non-Gaussian noise, where the noise is restricted to be circularly symmetric. Lin and Blum examine two-sensor distributed detection of known signals in correlated t-distributed noise in [65]. Simulation results show that in some specific cases the optimum local decision rules are better than LRTs. A distributed M-ary hypothesis testing problem when observations are correlated is examined from a numerical perspective in [66]. Willett et al. study the two detector case with dependent Gaussian observations, the simplest meaningful problem one can consider, in [67]. They discover that the nature of the local decision rules can be quite complicated. 
The recent work presented in [68] proposes a new framework for distributed detection under conditionally dependent observations which builds a hierarchical conditional independence model. Through the introduction of a hidden variable that induces conditional independence among the sensor observations, the proposed model unifies distributed detection with dependent or independent observations.

Even when the local sensor decision rules are constrained to be suboptimal binary quantizers for the dependent observations problem, the global detection performance can still be improved by taking into account the correlation of local decisions while designing the fusion rule. Towards this end, the design of fusion rules using correlated decisions has been proposed in [69,70]. In [69], Drakopoulos and Lee developed an optimum fusion rule based on the NP criterion for correlated decisions, assuming that the correlation coefficients between the sensor decisions are known and that the local sensor thresholds generating the correlated decisions are given. Using a special correlation structure, they studied the performance of the detection system versus the degree of correlation and showed how the performance advantage obtained by using a large number of sensors degrades as the degree of correlation between local decisions increases. In [70], the authors employed the Bahadur–Lazarsfeld series expansion of probability density functions to derive the optimum fusion rule for correlated local decisions. With this expansion, the pdf of the correlated local binary decisions can be represented by the pdf of independent random variables multiplied by a correlation factor. In many practical situations, conditional correlation coefficients beyond a certain order can be assumed to be zero. Thus, computation of the optimal fusion rule becomes less burdensome. When all the conditional correlation coefficients are zero, the optimal fusion rule reduces to the Chair–Varshney rule. Here, the implementation of the fusion rule was carried out assuming that the joint density of sensor observations is multivariate Gaussian, which takes into consideration the linear dependence of sensor observations by using the Pearson correlation coefficient in the covariance matrix. An implicit assumption is that individual sensor observations are also Gaussian distributed.

In many applications, dependence can manifest itself in many different nonlinear ways. As a result, more general descriptors of correlation than the Pearson correlation coefficient, which only characterizes linear dependence, may be required [71]. Moreover, the marginal distributions of sensor observations characterizing their univariate statistics may not be identical. Here, emphasis should be laid on the fact that multivariate density (or mass) functions do not necessarily exist for arbitrary marginal density (or mass) functions. In other words, given arbitrary marginal distributions, their joint distribution function cannot be written in a straightforward manner.

An interesting approach for the fusion of correlated decisions, that does not necessarily require prior information about the joint statistics of the sensor observations or decisions, is described next. Its novelty lies in the usage of copula theory [72]. The application of copula theory is widespread in the fields of econometrics and finance. However, its use for signal processing applications has been quite limited. The authors in [73,74] employ copula theory for signal detection problems involving correlated observations as well as for heterogeneous sensors observing a common scene. For the fusion of correlated decisions, copula theory does not require prior information about the joint statistics of the sensor observations or decisions and constructs the joint statistics based on a copula selection procedure. Note that the copula function–based fusion will fail to perform better than the Chair–Varshney rule if the constructed joint distribution using a particular parametric copula function does not adequately model the underlying joint distribution of the sensor observations. Therefore, training is necessary in order to select the best copula function. The topic of copula function selection for the distributed detection problem is considered in [75].

4.3    DISTRIBUTED DETECTION OVER NONIDEAL COMMUNICATION CHANNELS

For systems employing high SNR and/or effective channel error correction coding, communication may have extremely low error rates and can be assumed lossless, meaning that the local decisions can be transmitted to the fusion center without errors. On the other hand, the lossless communication assumption should be subject to careful scrutiny in WSNs. Increasing power and/or employing powerful error correction codes may not always be possible because of the stringent resources of WSNs. Furthermore, in a hostile environment, the power of transmitted signal should be kept to a minimum to attain a low probability of intercept/detection (LPI/LPD). Therefore, it may be necessary in many situations to tolerate the loss during data transmission to some extent. To overcome this loss, it is highly desirable to integrate the communication and decision fusion functions intelligently to achieve an acceptable system performance without spending extra system resources. This motivates the study of fusion of local decisions corrupted during the transmission process due to channel fading/noise impairment.

The model for a distributed detection system in the presence of fading channels is illustrated in Figure 4.3. Decisions at local sensors, denoted by uk for k = 1, …, K, are transmitted over parallel channels that are assumed to undergo independent fading. In this section, we consider a discrete-time Rayleigh flat fading channel with a stationary and ergodic complex gain of $h_k e^{j\phi_k}$ between the kth sensor and the fusion center. Note that hk and ϕk denote the fading envelope and the phase of the channel, respectively. It is assumed that the channel gain remains constant during the transmission of a decision and that the channels are independent of each other. We further simplify the analysis by assuming binary signaling and replace uk ∊ {0, 1} by sk ∊ {−1, 1}, so that the effect of the fading channel reduces to a real scalar multiplication for phase coherent reception. Phase coherent reception can be accomplished either through limited training for stationary channels or, at a small cost of SNR degradation, by employing differential encoding for fast fading channels, which results in the same signal model. The received signal model for sensor k is given by


FIGURE 4.3 Parallel fusion model in the presence of fading and noisy channels between local sensors and the fusion center. uk is the binary decision made by the kth sensor, hk is the fading channel gain, nk is a zero-mean Gaussian random variable with variance σ², and yk is the observation received by the fusion center from the kth sensor, where k ∊ {1, …, K}.

$$\tilde{y}_k = h_k e^{j\phi_k} s_k + \nu_k$$

(4.36)

where νk is zero-mean complex Gaussian noise with independent real and imaginary parts having identical variance $\sigma_n^2$, i.e., $\nu_k \sim \mathcal{CN}(0, 2\sigma_n^2)$. Note that the notation $\mathcal{CN}$ represents the complex Gaussian distribution. Without loss of generality, we make the assumption of Rayleigh fading channels with unit power, i.e., $h_k e^{j\phi_k} \sim \mathcal{CN}(0, 1)$, and therefore $E[h_k^2] = 1$. Using the knowledge of the channel phase at the receiver, the observation model at the fusion center for the kth sensor can be obtained as

$$y_k = h_k s_k + n_k$$

(4.37)

Since νk follows a circularly symmetric complex Gaussian distribution, the noise term $n_k = \mathrm{Re}\{\nu_k e^{-j\phi_k}\}$ is real WGN with variance $\sigma_n^2$, i.e., $n_k \sim \mathcal{N}(0, \sigma_n^2)$.

Optimal Likelihood Ratio–Based Fusion Rule: By assuming instantaneous channel state knowledge regarding the fading channel and the local sensor performance indices, i.e., the Pd,k and Pf,k values, the optimal likelihood ratio (LR)-based fusion rule has been derived in [76], with the fusion statistic (LR) given by

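$$\Lambda(\mathbf{y}) = \log\left[\frac{p(\mathbf{y}|H_1)}{p(\mathbf{y}|H_0)}\right] = \sum_{k=1}^{K} \log\left[\frac{P_{d,k}\exp\left(-\frac{(y_k - h_k)^2}{2\sigma_n^2}\right) + (1 - P_{d,k})\exp\left(-\frac{(y_k + h_k)^2}{2\sigma_n^2}\right)}{P_{f,k}\exp\left(-\frac{(y_k - h_k)^2}{2\sigma_n^2}\right) + (1 - P_{f,k})\exp\left(-\frac{(y_k + h_k)^2}{2\sigma_n^2}\right)}\right]$$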

(4.38)

where y = [y1, …, yK]T is a vector containing the data received from all K sensors. Note that this fusion rule requires both the local sensor performance indices and instantaneous CSI. Given exact channel state information and under the conditional independence assumption under both hypotheses, the distribution of the optimal LR-based fusion statistic is given in [77]. Several suboptimum fusion rules that relax the requirements on a priori knowledge have also been proposed in [76].
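A direct evaluation of the fusion statistic in Equation 4.38 might look like the following sketch. The function and argument names are hypothetical, and no protection against numerical underflow is included, so it is only suitable for moderate values of yk and hk relative to the noise standard deviation.

```python
from math import exp, log
from typing import Sequence


def lr_fusion_statistic(
    y: Sequence[float],
    h: Sequence[float],
    pd: Sequence[float],
    pf: Sequence[float],
    sigma_n: float,
) -> float:
    """Log-likelihood ratio of Equation 4.38 for received values y and known channel envelopes h."""
    two_var = 2.0 * sigma_n**2
    total = 0.0
    for yk, hk, pdk, pfk in zip(y, h, pd, pf):
        like_plus = exp(-((yk - hk) ** 2) / two_var)   # proportional to p(y_k | s_k = +1)
        like_minus = exp(-((yk + hk) ** 2) / two_var)  # proportional to p(y_k | s_k = -1)
        numerator = pdk * like_plus + (1.0 - pdk) * like_minus
        denominator = pfk * like_plus + (1.0 - pfk) * like_minus
        total += log(numerator / denominator)
    return total
```

When hk = 0 the two likelihoods coincide and the sensor contributes nothing to the statistic. When the noise is very small, each term approaches log(Pd,k/Pf,k) or log((1 − Pd,k)/(1 − Pf,k)) depending on the sign of yk.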

Chair–Varshney Fusion Rule: In [76], the Chair–Varshney fusion statistic [24] has been shown to be a high-SNR approximation to (4.38)

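$$\Lambda_1 = \sum_{\operatorname{sign}(y_k) = 1} \log\frac{P_{d,k}}{P_{f,k}} + \sum_{\operatorname{sign}(y_k) = -1} \log\frac{1 - P_{d,k}}{1 - P_{f,k}}$$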

(4.39)

where Λ1 does not require any knowledge regarding the channel gain but does require Pd,k and Pf,k for all k. The probability distribution of the Chair–Varshney statistic, which is very helpful for performance analysis, has also been shown in [78]. This approach may suffer significant performance loss at low to moderate channel SNR.

Maximum Ratio Combining (MRC) Fusion Rule: It has been shown in [76] that for small values of channel SNR, Λ in (4.38) reduces to

$$\hat{\Lambda}_2 = \sum_{k=1}^{K} (P_{d,k} - P_{f,k})\, h_k y_k$$

(4.40)

Further, if the local sensors are identical, i.e., Pd,k = Pd and Pf,k = Pf for all k, then $\hat{\Lambda}_2$ further reduces to a form analogous to an MRC statistic:

$$\Lambda_2 = \frac{1}{K}\sum_{k=1}^{K} h_k y_k$$

(4.41)

Λ2 in (4.41) does not require the knowledge of Pd and Pf provided Pd − Pf > 0. Knowledge of the channel gain is, however, required.

Equal Gain Combining (EGC) Fusion Rule: Motivated by the fact that Λ2 resembles an MRC statistic for diversity combining, a third alternative in the form of an EGC has been proposed, which requires minimum amount of information:

$$\Lambda_3 = \frac{1}{K}\sum_{k=1}^{K} y_k$$

(4.42)

Interestingly enough, Λ3 outperforms both Λ1 and Λ2 for a wide range of SNR in terms of its detection performance [76].
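The three suboptimal statistics in Equations 4.39, 4.41, and 4.42 differ mainly in the side information they need, which the following sketch makes explicit through the function arguments. The function names are hypothetical, and the treatment of yk = 0 as a positive sign is an arbitrary choice made for this example.

```python
from math import log
from typing import Sequence


def chair_varshney_statistic(y: Sequence[float], pd: Sequence[float], pf: Sequence[float]) -> float:
    """Equation 4.39: needs P_d,k and P_f,k but no channel knowledge."""
    total = 0.0
    for yk, pdk, pfk in zip(y, pd, pf):
        if yk >= 0:  # sign(y_k) = 1 (a tie at zero is counted as positive here)
            total += log(pdk / pfk)
        else:        # sign(y_k) = -1
            total += log((1.0 - pdk) / (1.0 - pfk))
    return total


def mrc_statistic(y: Sequence[float], h: Sequence[float]) -> float:
    """Equation 4.41: needs the channel envelopes h_k but not P_d or P_f."""
    return sum(hk * yk for hk, yk in zip(h, y)) / len(y)


def egc_statistic(y: Sequence[float]) -> float:
    """Equation 4.42: needs neither channel gains nor sensor performance indices."""
    return sum(y) / len(y)
```

Each statistic would then be compared with a threshold chosen to meet the desired system-level false alarm probability.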

4.3.1    DISTRIBUTED DETECTION WITH PARTIAL CHANNEL STATE INFORMATION

The optimal LR-based fusion rule presented in Equation 4.38 requires instantaneous CSI, i.e., hk and ϕk, for all the sensors in the WSN. However, for a WSN with very limited resources (energy and bandwidth), it is prohibitive to spend resources on estimating the channel gain every time a local sensor sends its decision to the fusion center. Thus, it is imperative to avoid channel estimation and conserve resources at the possible cost of a relatively small performance degradation. This is the reasoning behind the exploration of new fusion rules that do not require the instantaneous channel gains hk. In many WSN scenarios, the statistics of the fading (random) channel and the additive Gaussian noise can be estimated in advance and used as prior information. The goal is to develop a new LR-based fusion rule that uses only this prior information about the channel statistics instead of the instantaneous CSI.

Under hypothesis Hj, we have

$$p(y_k|H_j) = \sum_{u_k} \left[p(u_k|H_j)\, p(y_k|s_k)\right] = P(u_k = 1|H_j)\, p(y_k|s_k = 1) + P(u_k = 0|H_j)\, p(y_k|s_k = -1)$$

and

$$p(y_k|s_k) = \int_{0}^{\infty} p(y_k|h_k, s_k)\, p(h_k)\, dh_k$$

(4.43)

By assuming a Rayleigh fading channel with unit power (i.e., $E[h_k^2] = 1$), the pdf of hk is

$$p(h_k) = 2h_k e^{-h_k^2}, \quad h_k \ge 0$$

(4.44)

and

$$p(y_k|h_k, s_k) = \frac{1}{\sqrt{2\pi}\,\sigma_n}\exp\left(-\frac{(y_k - h_k s_k)^2}{2\sigma_n^2}\right)$$

(4.45)

Then, the log LR based on the knowledge of channel statistics and local detection performance indices is expressed as [78]

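$$\Lambda_4 = \log\left[\frac{f(\mathbf{y}|H_1)}{f(\mathbf{y}|H_0)}\right] = \sum_{k=1}^{K} \log\left\{\frac{1 + \left[P_{d,k} - Q(a y_k)\right]\sqrt{2\pi}\, a y_k\, e^{(a y_k)^2/2}}{1 + \left[P_{f,k} - Q(a y_k)\right]\sqrt{2\pi}\, a y_k\, e^{(a y_k)^2/2}}\right\}$$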

(4.46)


FIGURE 4.4 ROC curves for various fusion statistics for the Rayleigh fading channel with average channel SNR = 4 dB. There are K = 8 sensors with Pd,k = 0.6 and Pf,k = 0.05.

where $a = 1/(\sigma_n\sqrt{1 + 2\sigma_n^2})$. As shown in Figure 4.4, the optimal LR-based fusion rule provides the best detection performance; however, it requires the instantaneous channel gain. On the other hand, its performance can be approached closely by the LRT fusion rule with partial channel knowledge (LRT-CS): the LRT-CS fusion rule performs slightly worse than the optimal LR-based fusion rule with instantaneous channel gains and better than the three suboptimal schemes.
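The LRT-CS statistic of Equation 4.46 needs only the noise standard deviation and the local performance indices. The sketch below is a minimal implementation with hypothetical names. It evaluates the exponential term directly, so it can overflow or lose precision when a·yk is large in magnitude; a production implementation would need a more careful evaluation.

```python
from math import erfc, exp, log, pi, sqrt
from typing import Sequence


def q_function(x: float) -> float:
    """Tail probability of the standard normal distribution."""
    return 0.5 * erfc(x / sqrt(2.0))


def lrt_cs_statistic(
    y: Sequence[float],
    pd: Sequence[float],
    pf: Sequence[float],
    sigma_n: float,
) -> float:
    """Equation 4.46 for a unit-power Rayleigh fading channel and noise standard deviation sigma_n."""
    a = 1.0 / (sigma_n * sqrt(1.0 + 2.0 * sigma_n**2))
    total = 0.0
    for yk, pdk, pfk in zip(y, pd, pf):
        z = a * yk
        c = sqrt(2.0 * pi) * z * exp(z * z / 2.0)  # common factor sqrt(2*pi) * a*y_k * exp((a*y_k)^2 / 2)
        q = q_function(z)
        total += log((1.0 + (pdk - q) * c) / (1.0 + (pfk - q) * c))
    return total
```

A received value of yk = 0 carries no information under this model and contributes zero to the sum, while positive values push the statistic toward H1 whenever Pd,k > Pf,k.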

4.3.2    DISTRIBUTED DETECTION WITH NO CHANNEL STATE INFORMATION

Acquiring phase information of transmission channels can be costly as it typically requires training overhead. This overhead may be substantial for time-selective fading channels when mobile sensors are involved or the fusion center is constantly moving. Thus, an incoherent-detection-based decision fusion rule has been introduced in Ref. [79]. In the incoherent case, the fusion statistics are based on the received envelope or, equivalently, the received power from each sensor. Denoting $r_k = |y_k|^2$, given the channel state information hk, the pdf of the received power at the kth channel output is given by

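$$\begin{aligned} p(r_k|h_k, u_k = 0) &= \frac{1}{2\sigma_n^2}\exp\left(-\frac{r_k}{2\sigma_n^2}\right) \\ p(r_k|h_k, u_k = 1) &= \frac{1}{2\sigma_n^2}\, I_0\left(\frac{h_k\sqrt{r_k}}{\sigma_n^2}\right)\exp\left(-\frac{h_k^2 + r_k}{2\sigma_n^2}\right) \end{aligned}$$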

(4.47)

where I0(.) is the zeroth-order modified Bessel function of the first kind. Given p(hk) as in Equation 4.44,

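$$\begin{aligned} p(r_k|u_k = 0) &= \frac{1}{2\sigma_n^2}\exp\left(-\frac{r_k}{2\sigma_n^2}\right) \\ p(r_k|u_k = 1) &= \frac{1}{1 + 2\sigma_n^2}\exp\left(-\frac{r_k}{1 + 2\sigma_n^2}\right) \end{aligned}$$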

(4.48)

Then the LLR (log-likelihood ratio) can be given as

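$$\Lambda(\mathbf{r}) = \log\left[\frac{p(\mathbf{r}|H_1)}{p(\mathbf{r}|H_0)}\right] = \sum_{k=1}^{K} \log\left[\frac{P_{d,k}\,\frac{1}{1 + 2\sigma_n^2}\exp\left(-\frac{r_k}{1 + 2\sigma_n^2}\right) + (1 - P_{d,k})\,\frac{1}{2\sigma_n^2}\exp\left(-\frac{r_k}{2\sigma_n^2}\right)}{P_{f,k}\,\frac{1}{1 + 2\sigma_n^2}\exp\left(-\frac{r_k}{1 + 2\sigma_n^2}\right) + (1 - P_{f,k})\,\frac{1}{2\sigma_n^2}\exp\left(-\frac{r_k}{2\sigma_n^2}\right)}\right]$$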

(4.49)
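Equation 4.49 can be evaluated from the received powers alone. The sketch below is a minimal illustration with hypothetical names; it uses the two exponential densities of Equation 4.48 and does not guard against underflow for very large rk.

```python
from math import exp, log
from typing import Sequence


def incoherent_llr(
    r: Sequence[float],
    pd: Sequence[float],
    pf: Sequence[float],
    sigma_n: float,
) -> float:
    """Log-likelihood ratio of Equation 4.49 based on received powers r_k."""
    noise_mean = 2.0 * sigma_n**2    # mean received power when u_k = 0
    signal_mean = 1.0 + noise_mean   # mean received power when u_k = 1 (unit-power Rayleigh fading)
    total = 0.0
    for rk, pdk, pfk in zip(r, pd, pf):
        p1 = exp(-rk / signal_mean) / signal_mean  # p(r_k | u_k = 1)
        p0 = exp(-rk / noise_mean) / noise_mean    # p(r_k | u_k = 0)
        total += log((pdk * p1 + (1.0 - pdk) * p0) / (pfk * p1 + (1.0 - pfk) * p0))
    return total
```

Large received powers are far more likely under uk = 1, so each term approaches log(Pd,k/Pf,k) as rk grows, whereas very small powers favor uk = 0 and yield a negative contribution when Pd,k > Pf,k.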

For the case of known fading statistics, Ricean and Nakagami fading channels have also been considered in [79]. In this section, we have investigated channel aware decision fusion algorithms with different degrees of channel state information for single-hop networks [76, 77, 78, 79]. Extensions to multi-hop WSNs can be found in [80,81], while channel-optimized local quantizer design methods are provided in [82, 83, 84]. To counter sensor or channel failures, robust binary quantizer design has been proposed in [85]. Channel aware distributed detection has also been studied in the context of cooperative relay networks [86,87].

4.4    CONCLUSIONS

In this section, we summarize and further discuss distributed detection and decision fusion for a multi-sensor system. In a conventional distributed detection framework, it is assumed that the local sensors’ performance indices are known and that the communication channels between the sensors and the fusion center are perfect. Under these assumptions, the design of the optimal decision fusion rule at the fusion center and of the optimal local decision rules at the sensors was discussed under the Bayesian and NP criteria. For a WSN consisting of passive sensors, it might be very difficult to estimate the local sensors’ performance indices, and it can be very expensive to transmit them to the fusion center. The counting rule is an intuitive solution that uses the total number of “1”s as a decision statistic, since the information about which sensor reports a “1” is of little use to the fusion center. Recent research shows that FDR-based decision fusion with nonidentical thresholds can substantially improve the detection performance compared to the counting rule with identical thresholds.

In a WSN setting with severe constraints on energy, bandwidth, and delay, transmitting sensor decisions to the fusion center over error-free channels may become unrealistic, since error-free transmission may require high transmission power and/or powerful error correction codes. Therefore, channel impairments should be taken into account in the design of distributed detection systems. Channel-aware decision fusion algorithms that assume different degrees of channel state information have been reviewed.

For distributed detection in WSNs, in [55], it has been assumed that the communication channels between the sensors and the fusion center are perfect. It will be interesting to study the effect of imperfect communication channels on the detection performance of the proposed FDR-based framework. Also, the FDR framework has been proposed for the detection of a single target in the ROI. Extension of the FDR framework to detection of multiple targets in the ROI is an interesting and challenging research problem. It is also assumed that every sensor has identical noise power. Extension of the proposed framework to include the scenario of nonidentical noise power at each sensor is an interesting research problem.

Dense deployment of sensors in the WSN introduces redundancy in coverage, so selecting a subset of sensors may still provide information with the desired quality. Adaptive sensor management policies can be applied in distributed detection which select a subset of active sensors or distribute the available resources among the informative sensors while meeting the application requirements in terms of quality of service [36].

In this chapter, we have focused on the parallel decision fusion architecture, where sensors transmit their observations directly to the fusion center. For serial decision fusion, information processing that deals with distributed data in the context of accurate signal detection and energy-efficient routing is currently emerging as a fruitful research area [88,89].

REFERENCES

1.  P.K. Varshney, Distributed Detection and Data Fusion, Springer, New York, 1997.

2.  H.L. Van Trees, Detection, Estimation and Modulation Theory, Vol. 1, Wiley, New York, 1968.

3.  H.V. Poor, An Introduction to Signal Detection and Estimation, Springer-Verlag, New York, 1988.

4.  C.W. Helstrom, Elements of Signal Detection and Estimation, Prentice-Hall, Englewood Cliffs, NJ, 1995.

5.  R. Viswanathan and P.K. Varshney, Distributed detection with multiple sensors: Part I—Fundamentals, Proceedings of the IEEE, 85(1), 54–63, January 1997.

6.  R.S. Blum, S.A. Kassam, and H.V. Poor, Distributed detection with multiple sensors: Part II—Advanced topics, Proceedings of the IEEE, 85(1), 64–79, January 1997.

7.  V. Veeravalli and P.K. Varshney, Distributed inference in wireless sensor networks, Philosophical Transactions of the Royal Society, 370(1958), 100–117, January 2012.

8.  J.P. Shaffer, Multiple hypothesis testing, Annual Review of Psychology, 46(1), 561–584, 1995.

9.  M. Schwartz, W.R. Bennett, and S. Stein, Communication Systems and Techniques, Wiley, New York, 1995.

10.  B. Eisenberg, Multihypothesis problems, in Handbook of Sequential Analysis, B.K. Ghosh and P.K. Sen, Eds. New York, Marcel Dekker, Vol. 118, pp. 229–244, 1991.

11.  X. Zhu, Y. Yuan, C. Rorres, and M. Kam, Distributed M-ary hypothesis testing with binary local decisions, Information Fusion, 5(3), 157–167, 2004.

12.  Q. Zhang and P.K. Varshney, Decentralized M-ary detection via hierarchical binary decision fusion, Information Fusion, 2(1), 3–16, 2001.

13.  C.W. Baum and V.V. Veeravalli, A sequential procedure for multihypothesis testing, IEEE Transactions on Information Theory, 40(6), 1994–2007, 1994.

14.  J. Tsitsiklis and M. Athans, On the complexity of decentralized decision making and detection problems, IEEE Transactions on Automatic Control, 30, 440–446, May 1985.

15.  N.S.V. Rao, Computational complexity issues in synthesis of simple distributed detection networks, IEEE Transactions on Systems, Man, Cybernetics, 21, 1071–1081, September/October 1991.

16.  J.N. Tsitsiklis, Decentralized detection, in Advances in Statistical Signal Processing, H.V. Poor and J.B. Thomas, Eds. JAI Press, Greenwich, CT, 1993.

17.  J.N. Tsitsiklis, On threshold rules in decentralized detection, in Proceedings of the 25th IEEE Conference on Decision and Control, Athens, Greece, 1986, pp. 232–236.

18.  P. Willett and D. Warren, Decentralized detection: When are identical sensors identical, in Proceedings Conference on Information Science and Systems, Princeton, NJ, 1991, pp. 287–292.

19.  M. Cherikh and P.B. Kantor, Counterexamples in distributed detection, IEEE Transactions on Information Theory, 38, 162–165, January 1992.

20.  Z.B. Tang, K.R. Pattipati, and D. Kleinman, An algorithm for determining the detection thresholds in a distributed detection problem, IEEE Transactions on Systems, Man, and Cybernetics, 21, 231–237, January/February 1991.

21.  A.R. Reibman, Performance and Fault-Tolerance of Distributed Detection Networks, PhD thesis, Duke University, Durham, NC, 1987.

22.  S.C.A. Thomopoulos, R. Viswanathan, and D.K. Bougoulias, Optimal distributed decision fusion, IEEE Transactions on Aerospace and Electronic Systems, 25, 761–765, September 1989.

23.  C.W. Helstrom, Gradient algorithms for quantization levels in distributed detection systems, IEEE Transactions on Aerospace and Electronic Systems, 31, 390–398, January 1995.

24.  Z. Chair and P.K. Varshney, Optimal data fusion in multiple sensor detection systems, IEEE Transactions on Aerospace and Electronic Systems, 22, 98–101, January 1986.

25.  J.B. Thomas, Nonparametric detection, Proceedings of the IEEE, 58(5), 623–631, 1970.

26.  S. Kay, Can detectability be improved by adding noise? IEEE Signal Processing Letters, 7(1), 8–10, January 2000.

27.  H. Chen, P.K. Varshney, S.M. Kay, and J.H. Michels, Theory of the stochastic resonance effect in signal detection: Part I; fixed detectors, IEEE Transactions on Signal Processing, 55(7), 3172–3184, July 2007.

28.  H. Chen and P.K. Varshney, Theory of the stochastic resonance effect in signal detection: Part II; variable detectors, IEEE Transactions on Signal Processing, 56(10), 5031–5041, October 2008.

29.  P.F. Swaszek, On the performance of serial networks in distributed detection, IEEE Transactions on Aerospace and Electronic Systems, 29(1), 254–260, January 1993.

30.  Z.B. Tang, K.R. Pattipati, and D.L. Kleinman, Optimization of detection networks. I. Tandem structures, IEEE Transactions on Systems, Man and Cybernetics, 21(5), 1044–1059, September/October 1991.

31.  W.P. Tay, J.N. Tsitsiklis, and M.Z. Win, On the subexponential decay of detection error probabilities in long tandems, IEEE Transactions on Information Theory, 54(10), 4767–4771, October 2008.

32.  Z.-B. Tang, K.R. Pattipati, and D.L. Kleinman, Optimization of detection networks. II. Tree structures, IEEE Transactions on Systems, Man and Cybernetics, 23(1), 211–221, January/February 1993.

33.  W.P. Tay, J.N. Tsitsiklis, and M.Z. Win, On the impact of node failures and unreliable communications in dense sensor networks, IEEE Transactions on Signal Processing, 56(6), 2535–2546, June 2008.

34.  W.P. Tay, J.N. Tsitsiklis, and M.Z. Win, Bayesian detection in bounded height tree networks, IEEE Transactions on Signal Processing, 57(10), 4042–4051, October 2009.

35.  Q. Zou, S. Zheng, and A.H. Sayed, Cooperative sensing via sequential detection, IEEE Transactions on Signal Processing, 58(12), 6266–6283, December 2010.

36.  Q. Cheng, P.K. Varshney, K.G. Mehrotra, and C.K. Mohan, Bandwidth management in distributed sequential detection, IEEE Transactions on Information Theory, 51(8), 2954–2961, August 2005.

37.  H. Chen, P.K. Varshney, and J.H. Michels, Improving sequential detection performance via stochastic resonance, IEEE Signal Processing Letters, 15, 685–688, 2008.

38.  V.V. Veeravalli, Decentralized quickest change detection, IEEE Transactions on Information Theory, 47(4), 1657–1665, May 2001.

39.  R. Niu and P.K. Varshney, Sampling schemes for sequential detection with dependent data, IEEE Transactions on Signal Processing, 58(3), 1469–1481, March 2010.

40.  D. Bajovic, D. Jakovetic, J. Xavier, B. Sinopoli, and J.M.F. Moura, Distributed detection via Gaussian running consensus: Large deviations asymptotic analysis, IEEE Transactions on Signal Processing, 59(9), 4381–4396, September 2011.

41.  Z. Li, F.R. Yu, and M. Huang, A distributed consensus-based cooperative spectrum-sensing scheme in cognitive radios, IEEE Transactions on Vehicular Technology, 59(1), 383–393, January 2010.

42.  S. Stankovic, N. Ilic, M.S. Stankovic, and K.H. Johansson, Distributed change detection based on a consensus algorithm, IEEE Transactions on Signal Processing, 59(12), 5686–5697, December 2011.

43.  J.N. Tsitsiklis, Decentralized detection with a large number of sensors, Mathematics of Control, Signals, and Systems, 1, 167–182, 1988.

44.  P. Chen and A. Papamarcou, New asymptotic results in parallel distributed detection, IEEE Transactions on Information Theory, 39(6), 1847–1863, November 1993.

45.  J.F. Chamberland and V.V. Veeravalli, Decentralized detection in sensor networks, IEEE Transactions on Signal Processing, 51, 407–416, February 2003.

46.  J.F. Chamberland and V.V. Veeravalli, Asymptotic results for decentralized detection in power constrained wireless sensor networks, IEEE Journal on Selected Areas in Communications, 22(6), 1007–1015, August 2004.

47.  R. Niu, P.K. Varshney, and Q. Cheng, Distributed detection in a large wireless sensor network, International Journal on Information Fusion, 7(4), 380–394, December 2006.

48.  R. Niu and P.K. Varshney, Distributed detection and fusion in a large wireless sensor network of random size, EURASIP Journal on Wireless Communications and Networking, 5(4), 462–472, September 2005.

49.  R. Niu and P.K. Varshney, Performance analysis of distributed detection in a random sensor field, IEEE Transactions on Signal Processing, 56(1), 339–349, January 2008.

50.  Q. Cheng, R. Niu, A. Sundaresan, and P.K. Varshney, Distributed detection and decision fusion with applications to wireless sensor networks, in Integrated Tracking, Classification, and Sensor Management: Theory and Applications, Wiley/IEEE, June 2012.

51.  B. Krishnamachari and S. Iyengar, Distributed Bayesian algorithms for fault-tolerant event region detection in wireless sensor networks, IEEE Transactions on Computers, 53(3), 241–250, March 2004.

52.  F. Gini, F. Lombardini, and L. Verrazzani, Decentralized CFAR detection with binary integration in Weibull clutter, IEEE Transactions on Aerospace and Electronic Systems, 33(2), 396–407, April 1997.

53.  F. Gini, F. Lombardini, and L. Verrazzani, Decentralised detection strategies under communication constraints, IEE Proceedings—Radar, Sonar and Navigation, 145(4), 199–208, August 1998.

54.  F. Gini, F. Lombardini, and P.K. Varshney, On distributed signal detection with multiple local free parameters, IEEE Transactions on Aerospace and Electronic Systems, 35(4), 1457–1466, October 1999.

55.  P. Ray and P.K. Varshney, False discovery rate based sensor decision rules for the network-wide distributed detection problem, IEEE Transactions on Aerospace and Electronic Systems, 47(3), 1785–1799, July 2011.

56.  E.L. Lehmann and J.P. Romano, Testing Statistical Hypotheses, Springer, New York, 3rd edn., 2008.

57.  Y. Benjamini and Y. Hochberg, Controlling the false discovery rate: A practical and powerful approach to multiple testing, Journal of the Royal Statistical Society, Series B, 57(1), 289–300, 1995.

58.  Y. Benjamini and D. Yekutieli, The control of the false discovery rate in multiple testing under dependency, Annals of Statistics, 29, 1165–1188, 2001.

59.  B. Picinbono, On deflection as a performance criterion in detection, IEEE Transactions on Aerospace and Electronic Systems, 31(3), 1072–1081, July 1995.

60.  G.S. Lauer and N.R. Sandell Jr., Distributed detection with waveform observations: Correlated observation processes, in Proceedings of the 1982 American Control Conference, Arlington, VA, 1982, Vol. 2, pp. 812–819.

61.  P. Chen and A. Papamarcou, Likelihood ratio partitions for distributed signal detection in correlated Gaussian noise, in Proceedings of IEEE International Symposium on Information Theory, Whistler, Canada, September 1995, p. 118.

62.  V. Aalo and R. Viswanathan, On distributed detection with correlated sensors: Two examples, IEEE Transactions on Aerospace and Electronic Systems, 25, 414–421, May 1989.

63.  V. Aalo and R. Viswanathan, Asymptotic performance of a distributed detection system in correlated Gaussian noise, IEEE Transactions on Signal Processing, 40, 211–213, January 1992.

64.  R. Blum, P. Willett, and P. Swaszek, Distributed detection of known signals in non-Gaussian noise which is dependent from sensor to sensor, in Proceedings of Conference of the Information Sciences and Systems, Baltimore, MD, March 1997, pp. 825–830.

65.  X. Lin and R. Blum, Numerical solutions for optimal distributed detection of known signals in dependent t-distributed noise: The two-sensor problem, in Proceedings of the Asilomar Conference on Signals, Systems, and Computers, Pacific Grove, CA, November 1998, pp. 613–617.

66.  Z. Tang, K. Pattipati, and D. Kleinman, A distributed M-ary hypothesis testing problem with correlated observations, IEEE Transactions on Automatic Control, 37, 1042–1046, July 1992.

67.  P.K. Willett, P.F. Swaszek, and R.S. Blum, The good, bad, and ugly: Distributed detection of a known signal in dependent Gaussian noise, IEEE Transactions on Signal Processing, 48, 3266–3279, December 2000.

68.  H. Chen, P.K. Varshney, and B. Chen, A novel framework for distributed detection with dependent observations, IEEE Transactions on Signal Processing, 60(3), 1409–1419, March 2012.

69.  E. Drakopoulos and C.-C. Lee, Optimum multisensor fusion of correlated local decisions, IEEE Transactions on Aerospace and Electronic Systems, 27(4), 593–606, July 1991.

70.  M. Kam, Q. Zhu, and W.S. Gray, Optimal data fusion of correlated local decisions in multiple sensor detection systems, IEEE Transactions on Aerospace and Electronic Systems, 28(3), 916–920, July 1992.

71.  D.D. Mari and S. Kotz, Correlation and Dependence, Imperial College Press, London, U.K., 2001.

72.  R.B. Nelsen, An Introduction to Copulas, Springer-Verlag, New York, 1999.

73.  A. Sundaresan, P.K. Varshney, and N.S.V. Rao, Copula-based fusion of correlated decisions, IEEE Transactions on Aerospace and Electronic Systems, 47(1), 454–471, 2011.

74.  S.G. Iyengar, P.K. Varshney, and T. Damarla, A parametric copula-based framework for hypothesis testing using heterogeneous data, IEEE Transactions on Signal Processing, 59(5), 2308–2319, May 2011.

75.  A. Sundaresan, Detection and source location estimation of random signal sources using sensor networks, PhD thesis, Syracuse University, Syracuse, NY, 2010.

76.  B. Chen, R. Jiang, T. Kasetkasem, and P.K. Varshney, Channel aware decision fusion for wireless sensor networks, IEEE Transactions on Signal Processing, 52, 3454–3458, December 2004.

77.  I. Bahceci, G. Al-Regib, and Y. Altunbasak, Parallel distributed detection for wireless sensor networks: Performance analysis and design, in IEEE Global Telecommunications Conference, GLOBECOM, St. Louis, MO, 2005. IEEE, Piscataway, NJ, Vol. 4, p. 5.

78.  R. Niu, B. Chen, and P.K. Varshney, Fusion of decisions transmitted over Rayleigh fading channels in wireless sensor networks, IEEE Transactions on Signal Processing, 54(3), 1018–1027, March 2006.

79.  R. Jiang and B. Chen, Fusion of censored decisions in wireless sensor networks, IEEE Transactions on Wireless Communications, 4(6), 2668–2673, November 2005.

80.  Y. Lin, B. Chen, and P.K. Varshney, Decision fusion rules in multi-hop wireless sensor networks, IEEE Transactions on Aerospace and Electronic Systems, 41, 475–488, April 2005.

81.  I. Bahceci, G. Al-Regib, and Y. Altunbasak, Serial distributed detection for wireless sensor networks, in International Symposium on Information Theory, ISIT, Adelaide, Australia, 2005, pp. 830–834.

82.  B. Chen and P.K. Willett, On the optimality of likelihood ratio test for local sensor decision rules in the presence of non-ideal channels, IEEE Transactions on Information Theory, 51(2), 693–699, 2005.

83.  B. Liu and B. Chen, Channel optimized quantizers for decentralized detection in wireless sensor networks, IEEE Transactions on Information Theory, 52, 3349–3358, July 2006.

84.  B. Liu and B. Chen, Decentralized detection in wireless sensor networks with channel fading statistics, EURASIP Journal on Wireless Communications and Networking, 2007, 11, January 2007.

85.  Y. Lin, B. Chen, and B. Suter, Robust binary quantizers for distributed detection, IEEE Transactions on Wireless Communications, 6(6), 2172–2181, June 2007.

86.  B. Liu, B. Chen, and R.S. Blum, Minimum error probability cooperative relay design, IEEE Transactions on Signal Processing, 55(2), 656–664, February 2007.

87.  H. Chen, P.K. Varshney, and B. Chen, Cooperative relay for decentralized detection, in Proceedings of the 2008 IEEE International Conference on Acoustics, Speech and Signal Processing, Las Vegas, NV, March 2008, pp. 2293–2296.

88.  Y. Yang, R.S. Blum, and B.M. Sadler, Energy-efficient routing for signal detection in wireless sensor networks, IEEE Transactions on Signal Processing, 57(6), 2050–2063, 2009.

89.  Y. Sung, S. Misra, L. Tong, and A. Ephremides, Cooperative routing for distributed detection in large sensor networks, IEEE Journal on Selected Areas in Communications, 25(2), 471–483, 2007.

*  Note that in this section, multiple hypothesis testing refers to multiple binary hypothesis tests; a formal definition is provided in the next section. In the previous sections, multiple hypothesis testing refers to M-ary tests.

*  The Q function is the complementary distribution function of the standard Gaussian, which is defined as $Q(y) = \frac{1}{\sqrt{2\pi}} \int_{y}^{\infty} \exp(-z^{2}/2)\,dz$.

*  Note that because the global test statistic is discrete, the randomization constant is a third design parameter.
