14

Nonmyopic Sensor Management

Viswanath Avasarala and Tracy Mullen

CONTENTS

14.1  Introduction

14.2  Stochastic Dynamic Programming

14.3  Market-Oriented Programming

14.4  Performance Evaluation Using a Simulation Test Bed

14.4.1  Simulation Platform

14.5  Conclusion

References

14.1  INTRODUCTION

Sensor management may be defined as “a process, which seeks to manage or coordinate the use of sensing resources in a manner that improves the process of data fusion and ultimately that of perception, synergistically” (Manyika and Durrant-Whyte 1994). Recent advances in sensor and communication technologies have led to a proliferation of sensor network applications (Balazinska et al. 2007, Hall and McMullen 2004). Traditional sensor management approaches apply network resource policies myopically, so that allocation decisions are optimized only for the current time. Myopic sensor management is suboptimal because current network allocations have significant future consequences. To illustrate, consider the following very simple scenario (Scenario I). A single sensor X is used to track targets of type A and type B. The utility of tracking targets of type A (U_A) is much greater than the utility of tracking targets of type B (U_B). The sensor X has energy reserves that are sufficient for operating only for T time schedules. Assume that the environment contains only a target of type B in the first T time schedules, after which a target of type A appears. If a greedy sensor management approach were used, the energy reserves of X would be exhausted by the time the high priority task becomes available.

Myopic behavior is suboptimal even when the only network resources that need to be considered are sensor schedules alone. For example, consider the scenario illustrated in Figure 14.1 (Scenario II). A sensor network spans a particular area with varying coverage. Two targets, A and B, exist in the environment. The sensor network’s task is to reduce the uncertainty associated with estimating target positions to below a certain threshold. The tasks of tracking A and B have the same utility. Target A is expected to move along the path shown in Figure 14.1 from region R1 to R2. Target B is expected to move from region R1 to R3. Since R3 has much higher sensor coverage than R2, an optimal sensor manager should consider allocating more tracking resources to A than to B while both A and B are in R1. However, myopic sensor managers (SMs) optimize only for the current schedule and therefore do not prioritize between A and B in R1.


FIGURE 14.1 Sample scenario for nonmyopic sensor management.

As Figure 14.1 illustrates with a simple scenario, an optimal sensor manager should be nonmyopic. That is, it should have a far-sighted strategy in which sensor allocations are optimized over a time horizon. However, the resource allocation problem is exponentially complex in the length of the time horizon. Furthermore, modeling the uncertainty associated with future possibilities is also a challenging problem. As a result, sensor management problems have traditionally used myopic approaches in which only optimality for the current schedule is considered (refer to Avasarala [2006] for a good review of greedy techniques). However, recent breakthroughs in stochastic optimization techniques, combined with increases in computational speed, have made nonmyopic sensor management techniques feasible in real time.

In this chapter, we review two of the most promising techniques for nonmyopic sensor management: stochastic dynamic programming and market-oriented programming. This chapter is organized as follows. Section 14.2 introduces the stochastic dynamic programming approach and elaborates on a few crucial aspects of this approach. Section 14.3 introduces market-oriented programming. Our own work in this domain is presented as an example implementation in Section 14.4. Finally, Section 14.5 provides some concluding remarks.

14.2  STOCHASTIC DYNAMIC PROGRAMMING

Stochastic dynamic programming (Bertsekas and Tsitsiklis 1996) is based on the following principle (Gelly et al. 2006): “Take the decision at time step t such that the sum ‘cost at time step t due to your decision’ plus ‘expected cost from time steps t + 1 to T from the state resulting from your decision’ is minimal. Bellman’s optimality principle states that this strategy is optimal. Unfortunately, it can only be applied if the expected cost from time steps t + 1 to T can be guessed, depending on the current state of the system and the decision.”

These methods solve equations of the form:

V_t = g_t(\Lambda_t) + \alpha\, E\big[ V_{t+1}(S_{t+1}) \mid S_t, \Lambda_t \big]

(14.1)

where

g_t(Λ_t) is the utility of taking action Λ_t at time t

S_t is the “state” of the system at time t

E[V_{t+1}(S_{t+1}) | S_t, Λ_t] is the expected future value of current actions

α is the time discount factor
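To make Equation 14.1 concrete, the following Python sketch computes V_t by backward induction for a deliberately tiny model. The battery-level state space, the transition model, and all numbers here are our own illustrative assumptions, not part of any cited formulation:

# Toy finite-horizon model: the state is a sensor's remaining battery,
# and the action is whether to sense or rest during the current schedule.
states = [0, 1, 2, 3]           # remaining battery units
actions = ["rest", "sense"]
alpha = 0.9                     # time discount factor
T = 4                           # horizon length

def g(state, action):
    """Immediate utility g_t(a): sensing pays off only while battery remains."""
    return 1.0 if action == "sense" and state > 0 else 0.0

def transitions(state, action):
    """P(s' | s, a): sensing deterministically drains one battery unit."""
    if action == "sense" and state > 0:
        return {state - 1: 1.0}
    return {state: 1.0}

# Backward induction: V_T = 0, then
# V_t(s) = max_a [ g_t(a) + alpha * sum_s' P(s'|s,a) V_{t+1}(s') ]  (cf. Eq. 14.1).
V = {s: 0.0 for s in states}
for t in reversed(range(T)):
    V = {s: max(g(s, a) + alpha * sum(p * V[s2] for s2, p in transitions(s, a).items())
                for a in actions)
         for s in states}

print(V)  # nonmyopic value of each battery state at t = 0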

In recent years, stochastic dynamic programming approaches have been applied to sensor management problems. Castañon (1995) considered a multigrid, single-sensor detection problem. Under certain assumptions about the target distributions and the probability distribution of sensor measurements, Castañon solved the problem to optimality. The optimal allocation policy was to search either of the two most likely target locations during each round of scheduling. Castañon (1997) further demonstrated, using a simulation study, that this optimal policy outperforms a greedy information-theoretic approach. However, except for trivial cases, solving stochastic dynamic programming problems to optimality is not possible because of their complexity. As a result, researchers have more generally used approximation techniques, including various approximation methods for the sensor management problem of tracking targets. Washburn et al. (2002) formulate a single-sensor, multitarget scheduling problem as a stochastic scheduling problem and use the Gittins index rule to develop approximate solutions. Williams et al. (2005) consider a single-target, multisensor allocation problem with communication constraints and use adaptive Lagrangian relaxation to solve the constrained dynamic programming problem. Schneider and Chong (2006) used approximate dynamic programming to allocate gimbaled radars for detecting and tracking targets over a multiperiod time horizon. As explained earlier, these more complex stochastic dynamic programming problems have been solved using greedy approximations.

As a sample illustration, consider Schneider and Chong’s (2006) rollout approximation-based approach. In this approach, a base sensor allocation policy is adopted for myopic allocation. The authors used certain heuristics as the base policy, but more generally any greedy single-period scheduling approach, such as information-theoretic sensor management (Kastella 1996, Walsh et al. 1998), can be used. Given a base policy, the stochastic dynamic programming problem is solved as follows: find the optimal action at time t_k by assuming that all allocations for times t > t_k are calculated using the base policy. This method can be applied straightforwardly when the only resources to be allocated are sensor schedules. However, for scenarios that involve other network resources such as energy (as explained in Scenario I), the rollout algorithm’s results can be inadequate. This is because, by definition, a greedy base policy uses a sensor for tasks in the current schedule whenever its energy reserves permit; using such a policy for future schedules therefore leads to very conservative energy utilization in the current allocation. An alternative approach that avoids this pitfall when heterogeneous network resources have to be modeled is to use a weighted sum of performance on multiple objectives, such as track accuracy, energy reserves, and communication bandwidth. This is similar to the approach presented in Williams et al. (2005).
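The rollout idea can be sketched as follows; this is a minimal illustration under our own simplifying assumptions (a toy environment and hypothetical function names), not Schneider and Chong’s actual implementation:

def rollout_value(state, action, base_policy, simulate_step, horizon, n_samples=20):
    """Score `action` by applying it now, then following the greedy base
    policy for the remaining periods, averaging over sampled trajectories."""
    total = 0.0
    for _ in range(n_samples):
        s, a, ret = state, action, 0.0
        for _ in range(horizon):
            s, r = simulate_step(s, a)   # (possibly stochastic) environment model
            ret += r
            a = base_policy(s)           # myopic allocations thereafter
        total += ret
    return total / n_samples

def rollout_policy(state, candidate_actions, base_policy, simulate_step, horizon):
    """Choose the current action with the best rollout score."""
    return max(candidate_actions,
               key=lambda a: rollout_value(state, a, base_policy,
                                           simulate_step, horizon))

# Toy demo: state = remaining battery; actions 0 (cheap task) and 1 (valuable task).
def simulate_step(s, a):
    reward = (1.0 if a == 1 else 0.2) if s > 0 else 0.0
    return max(s - 1, 0), reward

print(rollout_policy(5, [0, 1], base_policy=lambda s: 0,
                     simulate_step=simulate_step, horizon=3))  # picks action 1

The base policy here is deliberately myopic; the rollout scores each candidate current action by what the greedy policy would subsequently earn.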

So far, research on stochastic dynamic programming approaches to sensor management has considered a single type of task at a time. For example, the detection problem was considered in Castañon (1995), the classification problem in McIntyre and Hintz (1996), and the tracking problem in Washburn et al. (2002), Williams et al. (2005), and Schneider and Chong (2006). Research studies that analyze performance in a more generic multisensor, multitask sensor network environment are not currently available in the public domain.

14.3  MARKET-ORIENTED PROGRAMMING

Market-oriented programming techniques use market algorithms for resource allocation in distributed environments (Wellman et al. 2001). The genesis of market-oriented programming lies in the Artificial Intelligence (AI) community’s work on developing economic mechanisms for distributed problem solving and can first be traced to the contract net protocol (Davis and Smith 1983); Sandholm (1993) extended the contract net protocol by integrating the concepts of cost and price. For a detailed review of market-oriented programming, refer to Wellman et al. (2001) and Wellman (1991). We have developed the market architecture for sensor management (MASM), based on market-oriented programming techniques, for multitask, multiconsumer, multisensor sensor management. MASM assumes that all sensors belong to a single platform where a centralized SM is responsible for allocating resources to tasks. For smart dust environments, Mainland et al. (2005) have proposed similar price-based mechanisms that do not require a central resource allocation agent.

The design of MASM is shown in Figure 14.2. The Mission Manager (MM) is responsible for allocating task responsibilities and budgets to the various agents in the market. The actual details of the sensor network are invisible to the consumer agents. Instead, consumers are allowed to bid on high-level tasks, like “Track Target X to an accuracy Y.” The SM stipulates the type of tasks that the consumers can bid on, so that consumer bids are restricted to tasks the SM knows how to accomplish. Consumer bids are of the type <t, p>, where t is the task description, which includes the task type and the final task quality desired by the consumer, and p is the price that the consumer is willing to pay. For example, the task description for a bid to search a particular grid, x, so that the uncertainty of target presence (as measured by entropy) is reduced below 0.001 is as follows:

(type: search

entity: grid no x

quality: (entropy < 0.001))


FIGURE 14.2 MASM architecture.

Since MASM models network resources as commodities to be sold in the market, it has to set prices for these resources during each round of scheduling. In the current version of MASM, we use a pricing protocol similar to the tâtonnement process for determining resource prices. Tâtonnement is an iterative procedure for finding equilibrium prices by adjusting a search parameter (e.g., price or quantity) (Samuelson 1947, Walras 1954, Waldspurger et al. 1992). The price adjustment process starts with an auctioneer communicating an arbitrary price set to the users. The users compute their demand for the first good at the given prices and communicate it to the auctioneer. Depending on whether the aggregate excess demand for the first good is positive or negative, the auctioneer either increases or decreases its price. This process continues until a price at which the aggregate excess demand for the first good equals zero is reached. The process is then repeated for the second good and so on. At the end of the first cycle, only the last good is guaranteed to have zero excess demand, but assuming gross substitutability (i.e., when the price of good j goes up, there is a positive increase in the demand for every other good by each user), the price set arrived at after each cycle is closer to equilibrium than the previous one. More refined algorithms using partial derivatives of the demand functions have been developed to search for equilibrium in parallel (Shoven and Whalley 1992, Ygge 1998). Though the gross substitutability assumption is often violated (as in sensor networks), the tâtonnement process has been found to give satisfactory results (Cheng et al. 2005).
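A minimal sketch of this price adjustment loop follows; the linear excess-demand functions, step size, and tolerance are invented for illustration and satisfy gross substitutability:

def tatonnement(excess_demand_fns, prices, step=0.1, tol=1e-3, max_cycles=100):
    """Adjust each good's price in turn until its aggregate excess demand is
    (near) zero; one pass over all goods constitutes one cycle."""
    for _ in range(max_cycles):
        for j, zd in enumerate(excess_demand_fns):
            # Raise the price while demand exceeds supply, lower it otherwise.
            while abs(z := zd(prices)) > tol:
                prices[j] = max(0.0, prices[j] + step * z)
        # Stop once all goods clear simultaneously.
        if all(abs(f(prices)) <= tol for f in excess_demand_fns):
            break
    return prices

# Two goods with illustrative linear excess-demand functions (gross substitutes).
zd_fns = [lambda p: 4.0 - p[0] + 0.5 * p[1],
          lambda p: 3.0 - p[1] + 0.5 * p[0]]
print([round(x, 2) for x in tatonnement(zd_fns, [0.0, 0.0])])  # ~[7.33, 6.67]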

To use the tâtonnement process in MASM, we model the supply and demand functions for a resource at a particular price. MASM estimates these functions using the current resource usage rate. Prices for individual resources are initialized to zero during sensor network initialization. After each round of scheduling, the price ϑ_S^{t+1} of resource S for the next round of scheduling is calculated based on the current usage rate of the resource, r_S^t, and its available usage rate, a_S^t:

\vartheta_S^{t+1} = \max\big( 0,\ \vartheta_S^t + \tau\, (r_S^t - a_S^t) \big), \qquad \vartheta_S^0 = 0

(14.2)

where τ is a constant that determines the rate at which prices are updated.

The definitions of r_S^t and a_S^t depend on the resource being modeled. For sensors, for example, we have used the available battery power. Let sensor A be endowed with initial battery power b_i, and assume that sensor A needs to be available for a total operating time of T. At time t, if the available battery power is b_t, then

r_A^t = \begin{cases} (b_i - b_t)/t & \text{if } t > 0 \\ 0 & \text{otherwise} \end{cases} \qquad a_A^t = \begin{cases} b_t/(T - t) & \text{if } t < T \\ 0 & \text{otherwise} \end{cases}

(14.3)
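Equations 14.2 and 14.3 combine into a simple per-sensor price update, sketched below with made-up battery numbers (the function names and the constant τ = 0.05 are illustrative assumptions):

def usage_rates(b_init, b_now, t, T):
    """Equation 14.3: current usage rate r_A^t and available usage rate a_A^t
    for a sensor with initial battery b_init and remaining battery b_now."""
    r = (b_init - b_now) / t if t > 0 else 0.0
    a = b_now / (T - t) if t < T else 0.0
    return r, a

def update_price(price, r, a, tau=0.05):
    """Equation 14.2: raise the price when the resource is consumed faster
    than its sustainable rate; prices never go negative."""
    return max(0.0, price + tau * (r - a))

# Illustration: a sensor meant to last T = 100 rounds is drained at 5 units
# per round, so its price climbs round after round, discouraging further use.
price, b = 0.0, 100.0
for t in range(1, 11):
    b -= 5.0
    r, a = usage_rates(100.0, b, t, 100)
    price = update_price(price, r, a)
print(round(price, 3))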

Once the resource prices are formulated, the SM uses a combinatorial auction mechanism to find the optimal allocation given the consumer bids. However, the consumer bids are on high-level tasks, whereas the sensor resources are raw sensor schedules, communication bandwidth, and so on. Thus, some method for bundling the goods produced by various sellers is essential to create commodities that consumers are interested in and can bid for. The SM uses a special-purpose protocol, the continuous combinatorial auction (CCA), to decompose the high-level consumer bids into bids for sensor resources. CCA has been designed to decrease the computational and communication complexity of the market; its details are available in Avasarala (2006). We briefly describe its salient features here. CCA uses discrete time slots to schedule resources. For each time slot, a combinatorial auction is held to determine the optimal allocation of resources. As explained earlier, there is a dichotomy between what the consumers bid on (high-level tasks) and what resources the sensor network has. To address this dichotomy, the CCA uses a bid translation function to create low-level resource bids from high-level task bids. For each time slot t, the auctioneer constructs bids on each resource set, S, that can be allotted to task X. To construct these bids, the auctioneer needs to calculate the price associated with the bid on resource set S, for task X, based on the consumer bid price P. For this purpose, a novel price calculation mechanism has been devised. For task X, the auctioneer computes the bid price for a resource set S as the percentage of the consumer task completed by the resource set, multiplied by the actual consumer bid price P. The percentage of the task completed by a resource set S is computed in terms of readings of a canonical sensor, A, as follows. Let task X require, on average, n_a consecutive schedules of the standard sensor A to be completed. If, instead, resource bundle S is allocated to task X during the current round of scheduling, the expected number of standard sensor readings required changes to n_a′. The percentage of the task completed by resource set S is equal to the percentage savings in the required number of sensor A readings:

f_{S,X} = \frac{n_a - n_a'}{n_a}

(14.4)

Then, the bid price for resource set S at time t for task X with bid price P is

p_{S,X} = f_{S,X} \cdot P - \vartheta_S^t

(14.5)

where ϑ_S^t is the price of the bundle during the tth round of scheduling, calculated as the sum of the prices of the individual resources comprising S. Once the resource bids and their corresponding bid prices are formulated, MASM uses a standard combinatorial auction winner determination algorithm (Andersson et al. 2000) to find the optimal resource allocation.
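A small sketch of the bid translation arithmetic of Equations 14.4 and 14.5, with invented reading counts and prices:

def task_fraction(n_a, n_a_prime):
    """Equation 14.4: fraction of task X completed by resource set S, i.e.,
    the relative saving in required canonical-sensor readings."""
    return (n_a - n_a_prime) / n_a

def resource_bid_price(P, n_a, n_a_prime, bundle_price):
    """Equation 14.5: translated bid price for resource set S, netting the
    bundle's current resource prices out of the consumer's bid price P."""
    return task_fraction(n_a, n_a_prime) * P - bundle_price

# Illustration: a bundle that cuts the expected remaining work from 10 to 6
# canonical-sensor readings, under a consumer bid of P = 5.0 and a current
# bundle price of 0.8.
print(resource_bid_price(5.0, 10, 6, 0.8))   # 0.4 * 5.0 - 0.8 = 1.2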

The number of resource combinations that can be allocated to a task is exponential in the number of resources. For large sensor networks, we have formulated a special genetic algorithm that solves the bid translation problem in polynomial time (Avasarala 2006). Its representation schema resembles that of the SGA algorithm we developed for determining combinatorial auction winners (Avasarala et al. 2006). Let the number of bids be k and the number of sensors be n. The representation schema used by the genetic algorithm is a string of length n, where each string element is a real-valued number between 1 and k; the jth string member represents the bid to which sensor j is allocated. A string’s overall utility is the sum of the utilities obtained from the individual bids, each calculated as shown in Equation 14.5. The genetic algorithm is an anytime algorithm and has polynomial complexity.
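The sketch below shows only the representation and the fitness evaluation just described, with a made-up valuation function; selection, crossover, and mutation would operate on these strings in the usual way and are omitted here:

import random

def random_assignment(n_sensors, k_bids):
    """One chromosome: position j holds the bid to which sensor j is allocated."""
    return [random.randrange(k_bids) for _ in range(n_sensors)]

def fitness(assignment, bid_utility):
    """Overall utility of a chromosome: sum over bids of the utility of the
    sensor set allocated to that bid (cf. Equation 14.5)."""
    sensors_of = {}
    for sensor, bid in enumerate(assignment):
        sensors_of.setdefault(bid, set()).add(sensor)
    return sum(bid_utility(bid, sensors) for bid, sensors in sensors_of.items())

# Made-up valuation with diminishing returns, for demonstration only.
utility = lambda bid, sensors: (bid + 1) * len(sensors) ** 0.5
population = [random_assignment(n_sensors=6, k_bids=3) for _ in range(50)]
best = max(population, key=lambda a: fitness(a, utility))
print(best, round(fitness(best, utility), 2))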

Clearly, the efficiency of the market algorithm depends to a great extent on the bidding strategies of the consumers. Formulating an optimal bidding strategy for MASM consumers is not straightforward, for the following reasons. In MASM, the SM accepts bids only on a set of predefined tasks. The consumer agent is responsible for deconstructing its high-level tasks or goals into a sequence of SM-acceptable subtasks on which it can bid. Furthermore, it is responsible for calculating appropriate bid prices for these subtasks in order to bid in the market. The resource requestors have utility for the tasks that they are trying to accomplish, but not necessarily for the tasks the SM accepts bids for. Therefore, the consumer agents have to formulate a bidding strategy to arrive at optimal prices for their auction bids. For example, assume that a consumer has utility u_t for destroying a target T. The high-level task of destroying T might consist of the following two subtasks, for which the SM accepts bids: (a) search for target T and (b) track target T so that the uncertainty about its position is reduced. The consumer then has to divide its overall utility u_t into utilities for the two subtasks so that it can formulate bid prices. The optimal weights of the individual subtasks depend on the system conditions. For example, it might be difficult to search for targets in some environments, whereas in others tracking targets to high accuracy might be the bottleneck. Utilities for the subtasks should also take into consideration the competition for resources. For example, sensors that can be used for velocity estimation, like Doppler sensors, might be abundant in the network, making the tracking task a less competitive one. Moreover, MASM uses a combinatorial auction–based mechanism for resource allocation, and predicting optimal bidding strategies in combinatorial auctions is a difficult problem (Cramton et al. 2005). Furthermore, future market conditions depend on a number of unpredictable variables, including the future competing consumer task load and the other consumers’ bidding strategies. As an initial study, we implemented a consumer bidding strategy that uses some simplified market assumptions (Avasarala et al. 2009).

We frame this problem as follows. Let φ_i (i = 1 to m) be the set of tasks that the consumer has a utility for. Let u_i be the consumer utility of accomplishing the ith consumer task. Let ϕ_j (j = 1 to n) be the set of tasks that the sensor network accepts bids for and can accomplish. We assume that each consumer task φ_i consists of a collection of sensor network tasks ϕ_j accomplished in a certain sequence. The sequence of sensor tasks for the ith consumer task is denoted by χ_i. We also assume that there is a one-to-many mapping between consumer tasks and sensor network tasks; that is, each sensor network task can be a subtask of one and only one consumer task. For example, in the MASM simulation scenario, the consumer has a utility for destroying targets (φ_1 = “destroy targets”). To accomplish this task, the consumer has to use the sensor network resources to first search for and detect targets (ϕ_1 = “search for targets”). Then, the consumer has to track targets (ϕ_2 = “track targets”). In this case,

χ_1 = {“search for targets”, “track targets”}.
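A hypothetical encoding of this decomposition, reusing the example’s task names; the final assertion checks the one-to-many assumption that no sensor task is shared between consumer tasks:

# Consumer tasks phi_i with utilities u_i, and their sensor-task sequences chi_i.
utilities = {"destroy targets": 1.0}
chi = {"destroy targets": ["search for targets", "track targets"]}

# One-to-many assumption: each sensor task belongs to exactly one consumer task.
all_subtasks = [s for seq in chi.values() for s in seq]
assert len(all_subtasks) == len(set(all_subtasks))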

Clearly, establishing the optimal bid prices involves solving a stochastic, multiperiod optimization. However, we make the following assumptions to reduce consumer bidding optimization to a single-period, deterministic optimization problem.

1.  Consumers model the market as a fixed-price market. In a fixed-price market, the different commodities have fixed prices and the only choice consumers have is whether to buy them or not (Cliff and Bruten 1997). For fixed-price markets, consumers can construct the price–quality of service (QoS) mapping as a deterministic relationship.

2.  Consumers model the price–QoS to be independent of time. That is, consumers assume that the current price–QoS mapping will persist throughout the sensor network operation.

3.  Consumers optimize bidding prices under the assumption that they use a constant price for each sensor network task throughout the sensor network operation.

Since the consumers assume that the market is a fixed-price market, they can model the market behavior with a series of monotonically increasing functions γ_{ϕ_j} (j = 1 to n), where γ_{ϕ_j}(P_{ϕ_j}) represents the fraction of sensor task ϕ_j that will be completed in any round of scheduling if the consumer pays a price P_{ϕ_j}. The number of schedules required to complete sensor network task ϕ_j using P_{ϕ_j} as the bid price is

\tau_{\phi_j} = \left\lceil \frac{1}{\gamma_{\phi_j}(P_{\phi_j})} \right\rceil

(14.6)

where ⌈x⌉ = min{n ∈ Z | n ≥ x}, Z being the set of integers.

The consumer earns a utility u_i for completing a consumer task φ_i. If χ_i is the set of sensor tasks that comprise φ_i, then the estimated number of bidding cycles to complete consumer task φ_i can be calculated as

\Gamma_{\varphi_i} = \sum_{k=1}^{|\chi_i|} \tau_{\chi_i^k}

(14.7)

where χ_i^k is the kth element of χ_i.
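Equations 14.6 and 14.7 reduce to a few lines of arithmetic; the saturating price–QoS curves below are invented for illustration:

import math

def schedules_needed(gamma, price):
    """Equation 14.6: ceil(1 / gamma(price)) scheduling rounds to finish one
    sensor task, where gamma maps a bid price to the fraction done per round."""
    return math.ceil(1.0 / gamma(price))

def total_cycles(gammas, prices):
    """Equation 14.7: total bidding cycles Gamma for a consumer task whose
    sequence chi_i contributes one (gamma, price) pair per sensor subtask."""
    return sum(schedules_needed(g, p) for g, p in zip(gammas, prices))

# Invented price-QoS curves for a search subtask and a track subtask.
gamma_search = lambda p: min(1.0, 0.1 + 0.2 * p)
gamma_track = lambda p: min(1.0, 0.05 + 0.1 * p)
print(total_cycles([gamma_search, gamma_track], [1.0, 2.0]))   # 4 + 4 = 8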

Let p_{χ_i} be the vector of bidding prices p_{χ_i^k} (k = 1 to |χ_i|) for the sensor network tasks that comprise φ_i. To calculate the optimal set of bidding prices for the ith consumer task, consumers have to solve the optimization problem:

\max_{p_{\chi_i}} \frac{u_i - \sum_{k=1}^{|\chi_i|} p_{\chi_i^k}\, \tau_{\chi_i^k}}{\Gamma_{\varphi_i}}

(14.8)

In the earlier equation, we have assumed that consumers are profit-maximizing agents. Instead, in a cooperative environment where consumers are unselfish and intend to maximize the number of tasks completed subject to budget constraints, the optimization problem faced by the consumers is of the form:

\min_{p_{\chi_i}} \Gamma_{\varphi_i}

subject to the condition that

u_i - \sum_{k=1}^{|\chi_i|} p_{\chi_i^k}\, \tau_{\chi_i^k} \ge 0

(14.9)

In the earlier formulation, we assumed that the consumer can simultaneously pursue multiple consumer tasks, with one sensor task pertaining to each consumer task active at any time. For example, in the MASM simulation scenario involving target destruction explained previously, this translates to the assumption that a consumer can either search for targets or track a particular target at any given time, but cannot do both simultaneously. These assumptions were guided by the design of the consumer agents in the MASM simulation environment (see the next section). For alternate scenarios, Equations 14.8 and 14.9 have to be reformulated accordingly. Since both Equations 14.8 and 14.9 are deterministic optimization problems, straightforward heuristic-based approaches like genetic algorithms can be used to solve them. However, the results of this optimization depend heavily on the price–QoS mappings γ_{ϕ_j}, which are constructed based on current market conditions.
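Because Equation 14.8 is deterministic once the price–QoS mappings are fixed, even a coarse grid search illustrates the computation; it stands in for the heuristic optimizers mentioned above, and the curves, grid, and utility value are our own assumptions:

import itertools, math

def profit_rate(u, prices, gammas):
    """Objective of Equation 14.8: (utility minus total payment) per bidding
    cycle, with the cycle counts tau taken from Equation 14.6."""
    taus = [math.ceil(1.0 / g(p)) for g, p in zip(gammas, prices)]
    payment = sum(p * t for p, t in zip(prices, taus))
    return (u - payment) / sum(taus)

def best_prices(u, gammas, grid):
    """Exhaustive search over a coarse price grid, standing in for a
    heuristic optimizer such as a genetic algorithm."""
    return max(itertools.product(grid, repeat=len(gammas)),
               key=lambda ps: profit_rate(u, ps, gammas))

# Invented price-QoS curves and a consumer utility of u_i = 1.0.
gamma_search = lambda p: min(1.0, 0.1 + 0.2 * p)
gamma_track = lambda p: min(1.0, 0.05 + 0.1 * p)
grid = [0.25 * i for i in range(1, 20)]
print(best_prices(1.0, [gamma_search, gamma_track], grid))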

Directly using the values obtained by solving Equation 14.8 leads to excessive reliance on current market conditions. For example, a particular consumer agent that is tracking a highly important target might bid aggressively for tracking resources. As a result, the estimate of the price–QoS mapping for the track subtask might show a temporary shift. If all the consumer agents adapt rapidly to such new auction outcomes, they cause a permanent price increase in the market. To avoid this speculative behavior, we implemented Widrow–Hoff learning to buffer consumer responses, similar to the approach used by Cliff and Bruten’s Zero Intelligence Plus (ZIP) traders (Cliff and Bruten 1997).

ZIP traders were originally designed to extend the zero-intelligence agent–based simulations of Gode and Sunder (1993). The Widrow–Hoff learning rule with momentum (Widrow and Hoff 1960) is used by the consumers to adapt their bid prices, based on their bid prices in the previous auction round and the calculated optimal prices. Assume that the bidding price used by the consumer for task φ_i in the previous round of scheduling is p_{φ_i}^{t−1}. Let the optimal bidding price calculated according to Equation 14.8 or 14.9 be p*_{φ_i}. The current bidding price, p_{φ_i}^t, is calculated as follows:

e = p_{\varphi_i}^{*} - p_{\varphi_i}^{t-1}, \qquad \tau_{\text{updated}} = \kappa\, \tau_{\text{current}} + (1 - \kappa)\, e, \qquad p_{\varphi_i}^{t} = p_{\varphi_i}^{t-1} + \lambda\, \tau_{\text{updated}}

(14.10)

The learning rate parameter, λ, determines the rate at which the bid price is changed. During each iteration, the bid price is updated using the updated momentum, τ_updated, which is a weighted sum of the momentum from the previous iteration, τ_current, and the current error, e. The current error is defined as the difference between the optimal bid price and the current bid price. The momentum rate parameter, κ, determines the weight given to past changes in the calculation of the momentum.
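A sketch of this buffered update (Equation 14.10), with invented parameter values, showing how a transient spike in the computed optimum produces only a gradual price response:

def buffered_update(p_prev, p_star, m_prev, lam=0.3, kappa=0.5):
    """Equation 14.10: move the bid price toward the newly computed optimum,
    smoothed with momentum so one unusual auction round cannot cause a jump."""
    e = p_star - p_prev                       # current error
    m = kappa * m_prev + (1.0 - kappa) * e    # updated momentum
    return p_prev + lam * m, m

# Illustration: the per-round optimum briefly spikes, but the bid price
# responds gradually and settles back (all numbers are made up).
p, m = 1.0, 0.0
for p_star in [1.0, 1.0, 3.0, 1.0, 1.0]:
    p, m = buffered_update(p, p_star, m)
    print(round(p, 3))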

14.4  PERFORMANCE EVALUATION USING A SIMULATION TEST BED

14.4.1  SIMULATION PLATFORM

We developed a multisensor, multiconsumer, multitarget platform to serve as a test bed for MASM. The design of the sensor network and the communication channel is derived from McIntyre and Hintz (1997). The complete details of the simulation environment are available in Avasarala (2006). The simulation environment represents a two-dimensional area in which targets are uniformly distributed. These targets move around the search area with constant velocity. The search area has several different kinds of sensors, including sensors that provide range and bearing, bearings-only sensors, and electronic support measure (ESM) sensors. The simulation has a set of software agents that search for and destroy targets. The agents are not provided any sensing resources of their own; they depend on the sensor network for obtaining information about the environment. They bid for sensor resources and update their status based on information provided by the SM. They use the sensor network’s resources to search for potential targets and, if the probability of target existence within their range exceeds a certain threshold, initialize target tracks. Once a target track is initialized, the agents attack the target if their confidence in the target position is greater than a certain threshold. This is again accomplished by buying sensing resources from the sensor network. Agents are assumed to have a utility u_t (=1.0) for destroying a target. To divide the overall utility into utilities for the search and track tasks, agents initially use equal priorities. During the simulation run, agents update the search-to-track budget ratio using the learning mechanism previously described. The consumer agents converged to a budget of 0.95 for the tracking task and 0.05 for the search task (Avasarala 2006). In this simulation environment, we found that MASM outperforms information-theoretic sensor management (ITSM). For a detailed review of these results, refer to Avasarala (2006). Here, we elaborate on the nonmyopic nature of MASM.

We found that MASM uses resource prices to implicitly reserve resources for future use by high priority tasks, even when no high priority tasks are currently in progress. For example, consider a situation where the first user is tracking a target and the rest of the users are in search mode. Both MASM and ITSM give highest priority to the track task. The first user has a high budget for a track bid and bids accordingly. However, during the tracking task, the prices associated with the sensing resources increase, since the rate of their battery power usage during tracking is high. After the tracking task is completed and only detection tasks are in progress, the prices of the sensor schedules will have increased. Consequently, sensors are used at a slower rate during the detection phase, effectively reserving sensors for future higher-priority tasks. ITSM, by contrast, has no method of prioritizing between two tasks except when both tasks are currently in progress. Figure 14.3 shows the number of sensors used during different rounds of scheduling using MASM: the number of sensors used when tracking tasks are in progress is higher than the number used when only detection tasks are in progress. When only detection tasks are present, a significant percentage of sensors rest, thereby preserving their battery power for future use.

This explains how MASM addresses the situation presented in Scenario I.

Although the current simulation is not equipped to generate situations similar to Scenario II, it is easy to see how MASM would generate nongreedy behavior in this situation, even while using the simple consumer learning behavior explained earlier. If MASM were used for resource allocation in Scenario II, it would learn, after a certain period of operation, to price the sensors in region R2 higher than the sensors in region R3, since they are scarce. Consequently, the price–QoS relationship for tracking A, constructed using historic data, will yield a lower QoS for the same price than the price–QoS relationship for tracking B. Bids on task B will therefore be priced lower than bids on task A, since the bidding agents will learn that task B can be completed at a lower overall cost. Consequently, in R1, tracking tasks involving targets that are expected to move to R2 receive preferential resource allocation over tracking tasks involving targets expected to move to R3.


FIGURE 14.3 A comparison of the number of sensors used for measurements, based on whether target tracks are currently in progress or not.

14.5  CONCLUSION

In this chapter, we presented two promising techniques for implementing nonmyopic sensor management. Approximate dynamic programming approaches offer a rigorous framework for this problem but require elaborate problem-specific formulation. Market-oriented programming approaches offer a comprehensive framework for multisensor, multiuser, multitask nonmyopic sensor management; however, their implementation involves a few heuristics that might have to be tuned specifically for the domain concerned.

REFERENCES

Andersson, A., M. Tenhunen, and F. Ygge. 2000. Integer programming for combinatorial auction winner determination. Presented at Fourth International Conference on Multi-agent Systems (ICMAS), Boston, MA.

Avasarala, V. 2006. Multi-agent systems for data-rich, information-poor environments. PhD dissertation, College of Information Sciences and Technology, The Pennsylvania State University, State College, PA.

Avasarala, V., T. Mullen, D. Hall, and S. Tumu. 2009. An experimental study on agent learning for market-based sensor management. IEEE Symposium on Computational Intelligence in Multicriteria Decision-Making (MCDM 2009), Nashville, TN, pp. 30–37.

Avasarala, V., H. Polavarapu, and T. Mullen. 2006. An approximate algorithm for resource allocation using combinatorial auctions. Proceedings of IAT 2006, Hong Kong, China, pp. 571–578.

Balazinska, M., A. Deshpande, M. Franklin et al. 2007. Data management in the Worldwide Sensor Web. IEEE Pervasive Computing, 6(2), 30–40.

Bertsekas, D. P. and J. N. Tsitsiklis. 1996. Neuro-Dynamic Programming. Belmont, MA: Athena Scientific.

Castañon, D. A. 1995. Optimal search strategies for dynamic hypothesis testing. IEEE Transactions on Systems, Man, and Cybernetics, 25, 1130–1138.

Castañon, D. A. 1997. Approximate dynamic programming for sensor management. Proceedings 36th Conference on Decision and Control, IEEE, December, San Diego, CA, Vol. 2, pp. 1202–1207.

Cheng, E., K. Leung, K. Lochner et al. 2005. Walverine: A Walrasian trading agent. Decision Support Systems, 39, 169–184.

Cliff, D. and J. Bruten. 1997. Minimal-intelligence agents for bargaining behaviors in market-based environments. Technical Report, HPL-97-91, HP Labs, Bristol, U.K.

Cramton, P., Y. Shoham, and R. Steinberg, eds. 2005. Introduction to combinatorial auctions. In Combinatorial Auctions. Cambridge, MA: MIT Press.

Davis, R. and R. G. Smith. 1983. Negotiation as a metaphor for distributed problem solving. Artificial Intelligence, 20, 63–109.

Gelly, S., J. Mary, and O. Teytaud. 2006. Learning for stochastic dynamic programming. 11th European Symposium on Artificial Neural Networks, ESANN 2006, April, Bruges, Belgium, pp. 191–196.

Gode, D. K. and S. Sunder. 1993. Allocative efficiency of markets with zero-intelligence traders: Market as a partial substitute for individual rationality. Journal of Political Economy, 101, 119–137.

Hall, D. L. and S. A. H. McMullen. 2004. Mathematical Techniques in Multisensor Data Fusion, 2nd edn. Norwood, MA: Artech House Publishers.

Kastella, K. 1996. Discrimination gain for sensor management in multi-target detection and tracking. Proceedings IEEE-SMC and IMACS Multi-Conference CESA’96, July, Lille, France, Vol. 1, pp. 167–172.

Mainland, G., D. C. Parkes, and M. Welsh. 2005. Decentralized, adaptive resource allocation for sensor networks. Presented at 2nd USENIX/ACM Symposium on Networked Systems Design & Implementation (NSDI ’05), Boston, MA.

Manyika, J. M. and H. F. Durrant-Whyte. 1994. Data Fusion and Sensor Management: A Decentralized Information-Theoretic Approach. New York: Ellis Horwood.

McIntyre, G. A. and K. J. Hintz. 1996. An information theoretic approach to sensor scheduling. Proceedings of SPIE, 2755, 304–312.

McIntyre, G. A. and K. J. Hintz. 1997. Sensor management simulation and comparative study. Proceedings of SPIE, 68, 250–260.

Samuelson, P. A. 1947. Foundations of Economic Analysis. Cambridge, MA: Harvard University Press.

Sandholm, T. 1993. An implementation of the contract net protocol based on marginal cost calculations. Proceedings of the National Conference on Artificial Intelligence, Washington, DC, pp. 256–262, Association for Advancement of Artificial Intelligence (AAAI), Palo Alto, CA.

Schneider, M. K. and C.-Y. Chong. 2006. A rollout algorithm to coordinate multiple sensor resources to track and discriminate targets. Proceedings of the SPIE Conference on Signal Processing, Sensor Fusion and Target Recognition, April, Orlando, FL, Vol. 6235, 62350E1–62350E10.

Shoven, J. B. and J. Whalley. 1992. Applying General Equilibrium. New York: Cambridge University Press.

Waldspurger, C. A., T. Hogg, B. A. Huberman, J. O. Kephart, and S. Stornetta. 1992. Spawn: A distributed computational economy. IEEE Transactions on Software Engineering, 18, 103–117.

Walras, L. 1954. Elements of Pure Economics. Homewood, IL: Irwin.

Walsh, W. E., M. P. Wellman, P. R. Wurman, and J. K. MacKie-Mason. 1998. Some economics of market-based distributed scheduling. Proceedings of the 18th International Conference on Distributed Computing Systems, Amsterdam, the Netherlands, pp. 612–621.

Washburn, R., M. Schneider, and J. Fox. 2002. Stochastic dynamic programming based approaches to sensor resource management. Proceedings of the 5th International Conference on Information Fusion, Annapolis, MD, pp. 608–615.

Wellman, M. P. 1991. Review of Huberman. Artificial Intelligence, 52, 205–218.

Wellman, M. P., W. Walsh, P. Wurman, and J. MacKie-Mason. 2001. Auction protocols for decentralized scheduling. Games and Economic Behavior, 35, 271–303.

Widrow, B. and M. E. Hoff. 1960. Adaptive switching circuits. IRE WESCON Convention Record, 4, 96–104.

Williams, J. L., J. W. Fisher III, and A. S. Willsky. 2005. An approximate dynamic programming approach to a communication constrained sensor management problem. Proceedings of the 8th International Conference Information Fusion, July, Philadelphia, PA, pp. 1–8.

Ygge, F. 1998. Market-oriented programming and its application to power load management. PhD dissertation, Department of Computer Science, Lund University, Lund, Sweden.
