
Asymptotic noise analysis of high dimensional consensus

2009 Conference Record of the Forty-Third Asilomar Conference on Signals, Systems and Computers

Asymptotic Noise Analysis of High Dimensional Consensus

Usman A. Khan, Soummya Kar and José M. F. Moura
Department of Electrical and Computer Engineering
Carnegie Mellon University, Pittsburgh, PA 15213 USA
Email: ukhan@ece.cmu.edu, soummyak@andrew.cmu.edu, moura@ece.cmu.edu

Abstract—The paper studies the effect of noise on the asymptotic properties of high dimensional consensus (HDC). HDC offers a unified framework for studying a broad class of distributed algorithms, with applications to average consensus, leader-follower dynamics in multi-agent networks and distributed sensor localization. We show that, under a broad range of perturbations, including inter-sensor communication noise, random data packet dropouts and algorithmic parameter uncertainty, a modified version of the HDC converges almost surely (a.s.). We characterize the asymptotic mean squared error (m.s.e.) from the desired agreement states of the sensors (which, in general, vary from sensor to sensor) and give broad conditions on the noise leading to zero asymptotic m.s.e. The convergence proof of the modified HDC algorithm is based on stochastic approximation arguments and offers a general framework for studying the convergence properties of distributed algorithms in the presence of noise.

Index Terms—High Dimensional Consensus, Random Link Failures, Communication Noise, Almost Sure Convergence, Stochastic Approximation.

I. INTRODUCTION

Distributed signal processing is an active research area and offers a scalable, iterative and low complexity alternative to centralized data processing. Extensive research efforts have fostered the development of basic network utility schemes, including distributed average consensus, distributed parameter estimation and inference algorithms, distributed sensor network localization and distributed compressed sensing, to name a few.
To envision extensive real-time implementation and deployment of such networks and schemes, it is important to analyze and design robust versions of these algorithms. Indeed, such networks operate under scarce resources, such as limited bandwidth and power, unpredictable environments and random operating conditions. In [1], [2] a generic class of algorithms, called HDC (High Dimensional Consensus), was presented, which unified a large class of distributed network algorithms and offered a common framework for their analysis and design. In this paper, we continue the study of HDC, with a focus on its robustness properties. Specifically, we consider a modified version of the HDC, called M-HDC, designed to cope with the effects of environmental randomness on the operation of the HDC. Under a broad range of random perturbations, including inter-sensor communication noise, random data packet dropouts and uncertainty in model matrices (to be explained later), we prove a.s. convergence of the M-HDC algorithm and explicitly characterize the deviation from the desired equilibrium state in terms of the residual m.s.e. The M-HDC is a stochastic approximation based algorithm and employs a temporally decreasing weight sequence in the iterative scheme. As explained in the paper, the adaptively chosen weight sequence controls the effect of environmental noise, leading to robust behavior. On the other hand, incorporating robustness in this way incurs a slower convergence rate in comparison to the non-robust HDC, which converges at a geometric rate in the absence of noise.

This work was partially supported by the DARPA DSO Advanced Computing and Mathematics Program Integrated Sensing and Processing (ISP) Initiative under ARO grant # DAAD 19-02-1-0180, by NSF under grants # ECS-0225449 and # CNS-0428404, by the Office of Naval Research under MURI N000140710747, and by an IBM Faculty Award. 978-1-4244-5827-1/09/$26.00 ©2009 IEEE
We comment briefly on the organization of the rest of the paper. Section II reviews prior work on HDC, whereas the robust version M-HDC, together with the noise assumptions, is presented in Section III. The main convergence results are presented in Section IV, and Section V discusses several applications of the robust M-HDC. Finally, Section VI concludes the paper.

II. PRIOR WORK

Consider a network of N nodes communicating over a directed graph, G = (Θ, A). The N network agents are divided into two classes, namely anchors and sensors, with κ denoting the set of anchors and Ω the set of sensors, such that Θ = κ ∪ Ω. Let u_k ∈ R^{1×m} be the state associated with the k-th anchor, and let x_l ∈ R^{1×m} be the state associated with the l-th sensor. We are interested in studying linear iterative algorithms of the form

  u_k(t+1) = u_k(t),    k ∈ κ,    (1)

  x_l(t+1) = Σ_{j ∈ K_Ω(l) ∪ {l}} p_{lj} x_j(t) + Σ_{k ∈ K_κ(l)} b_{lk} u_k(0),    l ∈ Ω,    (2)

where: t ≥ 0 is the discrete-time iteration index; the p_{lj}'s and b_{lk}'s are the state updating coefficients; and K_Ω(l) and K_κ(l) denote the sensor and anchor neighborhoods of sensor l, respectively. We assume that the updating coefficients are constant over the components of the m-dimensional state. The study of such distributed iterative algorithms was initiated in [2], [3], [4], where we introduced the term Higher Dimensional Consensus (HDC) to describe them (see [4] for a justification of the nomenclature). In particular, we note that eqns. (1)–(2) consider the HDC algorithm under no randomness in communication and weight computation, as treated in [4]. In this section we review some results from [4] on the properties of HDC in a noise-free environment; in later sections we consider the effect of random perturbations on its steady state behavior.

For the purpose of analysis, we write the HDC (1)–(2) in matrix form. Define

  U(t) = [u_1(t)^T, ..., u_K(t)^T]^T,    (3)

  X(t) = [x_{K+1}(t)^T, ..., x_N(t)^T]^T,    (4)

  P = {p_{lj}} ∈ R^{M×M},    B = {b_{lk}} ∈ R^{M×K}.    (5)

Note that U(t) ∈ R^{K×m} and X(t) ∈ R^{M×m}.
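Written per node, the update (1)–(2) is just a local weighted combination of neighbor states and anchor initial states. The following is a minimal sketch on a hypothetical fully connected toy network; all sizes, weights and names below are illustrative stand-ins, not taken from the paper.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical toy network: K = 2 anchors, M = 3 sensors, scalar states (m = 1).
K, M = 2, 3
B = rng.uniform(0.0, 0.2, size=(M, K))   # anchor weights b_lk (zero outside K_kappa(l))
P = rng.uniform(0.0, 0.2, size=(M, M))   # sensor weights p_lj (zero outside K_Omega(l) ∪ {l})
u0 = rng.uniform(size=K)                 # anchor states u_k(0); anchors never update, per (1)
x0 = rng.uniform(size=M)                 # initial sensor states x_l(0)

def hdc_step(x):
    """One synchronous pass of the sensor update (2), written per node."""
    x_new = np.empty_like(x)
    for l in range(M):
        # combine neighbors' current states with the anchors' *initial* states
        x_new[l] = P[l] @ x + B[l] @ u0
    return x_new

x1 = hdc_step(x0)
```

The per-node loop is, of course, equivalent to one matrix-vector iteration with P and B, which is exactly the compact form developed next.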
With the above notation, we write (1)–(2) concisely as

  [ U(t+1) ]   [ I  0 ] [ U(t) ]
  [ X(t+1) ] = [ B  P ] [ X(t) ],    (6)

i.e.,

  C(t+1) = Υ C(t).    (7)

Note that the graph, G_Υ, associated with the N × N iteration matrix, Υ, must be a subgraph of G. In other words, the sparsity of Υ is dictated by the sparsity of the underlying sensor network. In the iteration matrix Υ: its submatrix, P, collects the updating coefficients of the M sensors with respect to the M sensors; and its submatrix, B, collects the updating coefficients of the M sensors with respect to the K anchors. From (6), the matrix form of the HDC in (2) is

  X(t+1) = P X(t) + B U(0),    t ≥ 0.    (8)

As discussed in Section III, the HDC algorithm is implemented as (1)–(2), and its matrix representation is given by (7). As with all distributed algorithms of interest, there are two objectives behind the study of HDC algorithms. First, given the sensor network and the weight matrices, P and B, it is of interest to determine the asymptotic properties of the HDC iterates, i.e., the limiting sensor states, provided they converge. The second problem is the inverse or learning problem, where it is desired to design the weight matrices P and B, respecting the sparsity of the given communication network, such that the sensors converge to some desired state. In [4] we consider the learning problem in detail; it leads to a multi-objective design criterion exhibiting the trade-off between convergence rate and steady state m.s.e. The current paper deals solely with the forward or analysis problem, where we assume that appropriate weight matrices P and B are given and the objective is to analyze the asymptotic properties of the HDC under random perturbations. The study of the forward HDC problem (referred to as the HDC for brevity in the sequel) can be divided into the following two cases: (A) no anchors; and (B) with anchors. We briefly review these two cases separately, their applications being considered in Section V.

A. No anchors: B = 0

In this case, the HDC reduces to

  X(t+1) = P X(t) = P^{t+1} X(0).    (9)

An important problem covered by this case is average consensus. As is well known, when

  ρ(P) = 1,    (10)

and under some minimal assumptions on P and on the network connectivity, (9) converges to the average of the initial sensors' states. For more precise and general statements in this regard, see, for instance, [5], [6]. Average consensus, thus, is a special case of the HDC, when B = 0 and ρ(P) = 1. This problem has been studied in great detail; a detailed set of references is provided in [2]. The rest of this paper deals entirely with the case ρ(P) < 1, and the term HDC subsumes the ρ(P) < 1 case, unless explicitly noted. Note that, when B = 0, the HDC (with ρ(P) < 1) leads to X_∞ = 0, which is not interesting.

B. With anchors: B ≠ 0

This extends average consensus to "higher dimensions" (as explained in [4]). Lemma 1 establishes: (i) the conditions under which the HDC converges; (ii) the limiting state of the network; and (iii) the rate of convergence of the HDC.

Lemma 1 ([4]): Let B ≠ 0 and U(0) ∉ N(B), where N(B) is the null space of B. If

  ρ(P) < 1,    (11)

then the limiting state of the sensors, X_∞, is given by

  X_∞ ≜ lim_{t→∞} X(t+1) = (I − P)^{−1} B U(0),    (12)

and the error, E(t) = X(t) − X_∞, decays exponentially to 0 with exponent ln(ρ(P)), i.e.,

  lim sup_{t→∞} (1/t) ln ‖E(t)‖ ≤ ln(ρ(P)).    (13)

The above lemma shows that the limiting state of the sensors, X_∞, is independent of the sensors' initial conditions and is given by (12) for any X(0) ∈ R^{M×m}. It is also straightforward to show that if ρ(P) ≥ 1, then the HDC algorithm (8) diverges for all U(0) ∉ N(B). Clearly, the case U(0) ∈ N(B) is not interesting, as it leads to X_∞ = 0.
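Lemma 1 is easy to check numerically. The sketch below uses random stand-in weight matrices, rescaled so that ρ(P) = 0.9 < 1; the matrices and sizes are illustrative, not from the paper.

```python
import numpy as np

rng = np.random.default_rng(1)

# Hypothetical stand-in weights with spectral radius rho(P) = 0.9 < 1.
M, K = 4, 2
P = rng.uniform(size=(M, M))
P *= 0.9 / max(abs(np.linalg.eigvals(P)))   # rescale to force rho(P) = 0.9
B = rng.uniform(size=(M, K))
U0 = rng.uniform(size=K)

# Run the HDC iteration X(t+1) = P X(t) + B U(0) from an arbitrary X(0).
X = rng.uniform(size=M)
for _ in range(500):
    X = P @ X + B @ U0

# Limiting state (12): X_inf = (I - P)^{-1} B U(0), independent of X(0).
X_inf = np.linalg.solve(np.eye(M) - P, B @ U0)
```

After 500 iterations the residual is on the order of ρ(P)^500, i.e., numerically zero, consistent with the geometric rate in (13).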
The following section considers a modified version of the HDC algorithm presented in [4], designed to cope with the effect of environmental randomness, including inter-sensor communication noise, channel effects and incorrect weight design.

III. PROBLEM FORMULATION

In this section we present M-HDC, a modified version of the HDC algorithm that copes with the effect of noise. Before stating the assumptions on the random operating conditions formally, we consider a relaxed version of the HDC (1)–(2).

Relaxed HDC: In vector form, the relaxed HDC iterations take the form

  U(t+1) = U(0),    t ≥ 0,    (14)

  X(t+1) = (1 − α) X(t) + α [P X(t) + B U(0)],    t ≥ 0,    (15)

where the relaxation parameter 0 ≤ α ≤ 1 is a design constant. Under an appropriate choice of α, it can be shown that the convergence properties of the (unrelaxed) HDC are retained. Since the iterations above can be separated over the columns of U(t) and X(t), an alternative representation of the HDC takes the form

  U^j(t+1) = U^j(0),    t ≥ 0,    (16)

  X^j(t+1) = (1 − α) X^j(t) + α [P X^j(t) + B U^j(0)],    t ≥ 0,    (17)

where U^j(t) and X^j(t) denote the j-th columns of the state matrices U(t) and X(t), respectively. Thus, to implement the sequence of iterations in (17) perfectly, the n-th sensor at iteration t needs the corresponding rows of the matrices P and B and, in addition, the current states, U_l^j(t), X_l^j(t), l ∈ Θ_n (the j-th components of the l-th node's state), of its neighbors. The computation of the matrices P and B may require online sensing or measurement (for example, in the case of distributed sensor localization [1], the weight matrices P, B consist of local barycentric coordinates computed from inter-sensor distance measurements, which are susceptible to random ranging errors) and hence can only be estimated to a limited degree of accuracy. Also, because of imperfect communication, each sensor receives only noisy versions of its neighbors' current states.
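A quick numerical check that the relaxation in (15) preserves the HDC fixed point: the iteration matrix becomes (1 − α)I + αP, whose spectral radius is still below 1 whenever ρ(P) < 1 and 0 < α ≤ 1. The matrices below are hypothetical stand-ins, and α = 0.5 is chosen arbitrarily.

```python
import numpy as np

rng = np.random.default_rng(2)

M, K, alpha = 4, 2, 0.5                      # alpha: relaxation constant in (15)
P = rng.uniform(size=(M, M))
P *= 0.8 / max(abs(np.linalg.eigvals(P)))    # rescale so rho(P) = 0.8 < 1
B = rng.uniform(size=(M, K))
U0 = rng.uniform(size=K)

X = rng.uniform(size=M)
for _ in range(1000):
    X = (1 - alpha) * X + alpha * (P @ X + B @ U0)   # relaxed update (15)

# Same fixed point as the unrelaxed HDC: X* = (I - P)^{-1} B U(0).
X_star = np.linalg.solve(np.eye(M) - P, B @ U0)
```

The relaxation only slows the contraction (factor at most (1 − α) + α ρ(P) per step); it does not move the equilibrium, which is why a time-varying α(t) can later be used against noise without changing the target state.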
Hence, in a random environment, the iteration sequence in (17) must be modified accordingly, because each sensor has only partial, imperfect information about the system. In the following, we identify the sources of randomness that may enter (17) because of imperfect communication and noisy distance computation, and state them formally as assumptions. Also, note that the choice α = 1 reduces (15) to the update (8), i.e., the unrelaxed version of the HDC considered in Section II. The key point to note here is that, if α < 1, unlike the unrelaxed case, the relaxed HDC attaches a non-trivial weight (constant over time) to the previous sensor states. As formulated later, this relaxation is an important step in making HDC resilient to noisy perturbations: the relaxation parameter α is chosen to be a time-varying sequence, so that the cumulative effect of noise is controlled as the algorithm progresses. We now present the key technical assumptions on the different types of random perturbations before presenting the M-HDC algorithm.

(C1) Randomness in system matrices: At each iteration, each sensor needs the corresponding row of the system matrices B and P, which can be random due to the effect of imperfect online sensing. We assume that at each iteration the n-th sensor can only obtain estimates, B̃_n(t) and P̃_n(t), of the corresponding rows of the B and P matrices, respectively. In the generic imperfect communication case, we have

  B̃(t) = B + S_B + S̃_B(t),    (18)

where {S̃_B(t)}_{t≥0} is an independent sequence of random matrices with

  E[S̃_B(t)] = 0, ∀t,    sup_{t≥0} E‖S̃_B(t)‖² = k_B < ∞.    (19)

Here, S_B is the mean measurement error. Similarly, for the case of P, we have

  P̃(t) = P + S_P + S̃_P(t),    (20)

where {S̃_P(t)}_{t≥0} is an independent sequence of random matrices with

  E[S̃_P(t)] = 0, ∀t,    sup_{t≥0} E‖S̃_P(t)‖² = k_P < ∞.    (21)

Note that this way of writing B̃(t), P̃(t) does not require the noise model to be additive. It only says that any random object may be written as the sum of a deterministic mean part and the corresponding zero mean random part. The moment assumptions in (19) and (21) are very weak and, in particular, are satisfied if the sequences {B̃(t)}_{t≥0} and {P̃(t)}_{t≥0} are i.i.d.

(C2) Random link failures: We assume that the inter-sensor communication links fail randomly. This happens, for example, in wireless sensor network applications, where data packets are occasionally dropped. To this end, if sensors n and l share a communication link (i.e., l ∈ Θ_n), we assume that the link fails with probability 1 − q_nl at each iteration, where 0 < q_nl ≤ 1. We associate with each such potential network link a binary random variable, e_nl(t), where e_nl(t) = 1 indicates that the corresponding network link is active at time t, whereas e_nl(t) = 0 indicates a link failure.

(C3) Additive channel noise: Define the family of independent zero mean random variables, {v_nl^j(t)}_{n,l,j,t}, such that

  sup_{n,l,j,t} E[v_nl^j(t)]² = k_v < ∞.    (22)

We assume that at the t-th iteration, if the network link (n, l) is active, sensor n receives only a corrupt version, y_nl^j(t), of sensor l's state, c_l^j(t), given by

  y_nl^j(t) = c_l^j(t) + v_nl^j(t).    (23)

This models the channel noise. It is to be noted that the moment assumption in (22) is very weak and holds, in particular, if the channel noise is i.i.d.

(C4) Independence: We assume that the sequences {S̃_B(t), S̃_P(t)}_{t≥0}, {e_nl(t)}_{n,l,t}, and {v_nl^j(t)}_{n,l,j,t} are mutually independent.

Note that, in the above, we do not put restrictions on the distributional form of the random errors, but only assume that they obey some weak moment conditions. Clearly, under the random environment model detailed in Assumptions (C1)–(C4), the sensors cannot update their states according to the iterations given in (17).
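The role of the activation probabilities q_nl in (C2) can be seen in a two-line experiment: scaling each received contribution by e_nl(t)/q_nl makes it unbiased in expectation, which is exactly the compensation applied in the M-HDC update (24). The parameter values below are illustrative.

```python
import numpy as np

rng = np.random.default_rng(3)

# (C2): a link with activation probability q_nl drops packets w.p. 1 - q_nl.
q = 0.7
e = (rng.random(200_000) < q).astype(float)   # e_nl(t) ~ Bernoulli(q_nl), i.i.d. over t

# Since E[e_nl(t)] = q_nl, dividing each received term by q_nl gives
# E[e_nl(t) / q_nl] = 1: dropped packets add variance, but no systematic bias.
empirical_mean = np.mean(e / q)
```

This is why random link failures affect only the variance of the perturbation in the convergence analysis, while any residual asymptotic error comes solely from the bias terms S_B, S_P of (C1).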
We now consider the following state update recursion for the random environment case.

Algorithm M-HDC:

  x_n^j(t+1) = (1 − α(t)) x_n^j(t)
      + α(t) Σ_{l ∈ Ω ∩ Θ_n} (e_nl(t) P̃_nl(t) / q_nl) (x_l^j(t) + v_nl^j(t))
      + α(t) Σ_{l ∈ κ ∩ Θ_n} (e_nl(t) B̃_nl(t) / q_nl) (u_l^j + v_nl^j(t)),
      n ∈ Ω,  1 ≤ j ≤ m.    (24)

We refer to the algorithm in (24) as M-HDC in the sequel. To write the algorithm M-HDC in a compact form, we introduce some notation. Define the random matrices, B̄(t) ∈ R^{M×K} and P̄(t) ∈ R^{M×M}, entrywise as

  B̄_nl(t) = B̃_nl(t) (e_nl(t)/q_nl − 1),    (25)

  P̄_nl(t) = P̃_nl(t) (e_nl(t)/q_nl − 1).    (26)

Clearly, by (C4), the matrices B̄(t) and P̄(t) are zero mean. Also, by the bounded moment assumptions in (C1), we have

  sup_{t≥0} E‖B̄(t)‖² < ∞,    sup_{t≥0} E‖P̄(t)‖² < ∞.    (27)

Hence, the iterations in (24) can be written in vector form as

  x^j(t+1) = (1 − α(t)) x^j(t) + α(t) [ (P̃(t) + P̄(t)) x^j(t) + (B̃(t) + B̄(t)) u^j + η^j(t) ],    (28)

where the n-th element of the vector η^j(t) is given by

  η_n^j(t) = Σ_{l≠n} (P̃_nl(t) + P̄_nl(t)) v_nl^j(t) + Σ_{l≠n} (B̃_nl(t) + B̄_nl(t)) v_nl^j(t).    (29)

By (C1)–(C4), the sequence {η^j(t)}_{t≥0} is zero mean and independent, with

  sup_t E‖η^j(t)‖² = k_η < ∞.    (30)

From (C1), the iteration sequence in (28) can be written as

  x^j(t+1) = x^j(t) − α(t) [ (I − P − S_P) x^j(t) − (B + S_B) u^j
      − (S̃_P(t) + P̄(t)) x^j(t) − (S̃_B(t) + B̄(t)) u^j − η^j(t) ].    (31)

We now make two additional design assumptions on the algorithm:

(D1) Persistence condition:

  α(t) > 0,    Σ_{t≥0} α(t) = ∞,    Σ_{t≥0} α²(t) < ∞.    (32)

This condition, commonly assumed in the adaptive control and adaptive signal processing literature, requires that the weights decay to zero, but not too fast.

(D2) Low error bias: We assume the small bias condition

  ρ(P + S_P) < 1,    (33)

in addition to ρ(P) < 1. We note that this condition ensures that the matrix (I − P − S_P) is invertible.

In the following section, we study the convergence properties of the M-HDC algorithm under the assumptions (C1)–(C4), (D1)–(D2).

IV. CONVERGENCE AND RELATED RESULTS

We have the following main result regarding the limiting state of the M-HDC algorithm:

Theorem 2: Let {x^j(t)}_{t≥0}, 1 ≤ j ≤ m, be the state sequences generated by the iterations in (24), under the assumptions (C1)–(C4), (D1)–(D2). Then

  P[ lim_{t→∞} x^j(t) = (I − P − S_P)^{−1} (B + S_B) u^j,  ∀ 1 ≤ j ≤ m ] = 1.    (34)

The proof relies on stochastic approximation based arguments ([7]) and follows reasoning similar to that adopted in [8], [1] for proving convergence of distributed network algorithms; it is omitted. We discuss the consequences of Theorem 2.

Remark 3: Theorem 2 shows that the M-HDC converges a.s. to a deterministic state under the broad noise assumptions discussed above. However, in the steady state there is a residual error from the desired convergence state (I − P)^{−1} B U(0). This is due to the presence of the non-zero error bias terms, S_P and S_B, in the estimation of the weight matrices P and B, respectively. In particular, if the error biases S_P, S_B are zero, then, irrespective of the randomness in communication and packet dropouts, the M-HDC converges a.s. to the desired state (I − P)^{−1} B U(0), i.e., the state to which the HDC converges in a noise-free environment. However, it is important to note here that the decreasing weight sequence α(t) lowers the convergence rate of the M-HDC algorithm as compared to the geometric convergence rate attained by the HDC. In other words, robustness is achieved by sacrificing convergence rate. Typical stochastic approximation based arguments [9] suggest that a weight sequence {α(t)} of the form α(t) = a/(t+1) (satisfying the persistence condition) leads to a convergence rate of the order of 1/√t, i.e., the transient error ‖X(t) − (I − P − S_P)^{−1} (B + S_B) U(0)‖ decreases at a rate of 1/√t as t → ∞.
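Theorem 2 and Remark 3 can be illustrated in simulation. The sketch below is a minimal, zero-bias instance (S_P = S_B = 0, i.i.d. Gaussian channel noise, no link failures or matrix noise); the helper `run` and all parameter values are our illustrative choices, not the paper's.

```python
import numpy as np

rng = np.random.default_rng(4)

# Hypothetical stand-ins; with zero biases, Theorem 2 says the iterates should
# recover the noise-free HDC limit x* = (I - P)^{-1} B u almost surely.
M, K = 4, 2
P = rng.uniform(size=(M, M))
P = 0.5 * P / P.sum(axis=1, keepdims=True)     # nonnegative, row sums 0.5 => rho(P) = 0.5
B = rng.uniform(size=(M, K))
u = rng.uniform(size=K)
x_star = np.linalg.solve(np.eye(M) - P, B @ u)

def run(steps, a=2.0, sigma=0.5):
    """x(t+1) = (1 - a(t)) x(t) + a(t) (P x(t) + B u + noise), a(t) = a/(t+1).

    The decreasing weights satisfy the persistence condition (D1):
    sum a(t) = inf, sum a(t)^2 < inf, so the injected noise averages out."""
    x = rng.uniform(size=M)
    for t in range(steps):
        alpha = a / (t + 1)
        noise = sigma * rng.standard_normal(M)   # i.i.d. channel noise, cf. (C3)
        x = (1 - alpha) * x + alpha * (P @ x + B @ u + noise)
    return x

residual = np.linalg.norm(run(100_000) - x_star)  # shrinks slowly, roughly like 1/sqrt(t)
```

Replacing α(t) = a/(t+1) with a constant step size would leave the iterates fluctuating at a noise-dependent level instead of converging; the decreasing weights buy this robustness at the cost of the sub-geometric rate discussed in Remark 3.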
V. APPLICATIONS

We briefly comment on applications of HDC where robustness to random environmental perturbations is desired. A very important and canonical application is robust distributed average consensus, where random inter-sensor data packet dropouts, quantized transmission and additive channel noise require the design of robust consensus algorithms; an extensive treatment along these lines can be found in [8], [10]. As mentioned earlier, another application, which in addition requires robustness to random weight matrices, comes from distributed sensor localization [1], where the matrices P and B are local inter-sensor barycentric coordinates computed from incorrect distance estimates.

VI. CONCLUSIONS

The paper presents a modified version of the HDC algorithm, M-HDC, which adds robustness to HDC in the presence of random environmental perturbations. We show that the M-HDC converges a.s. to a deterministic state and characterize the residual m.s.e. from the desired convergence state. In particular, the residual m.s.e. is zero if the error biases resulting from the computation of the weight matrices P and B are zero. Finally, we demonstrate the usefulness of the M-HDC algorithm as a unified framework for studying several important distributed applications in sensor networks, including average consensus and distributed sensor localization.

REFERENCES

[1] U. A. Khan, S. Kar, and J. M. F. Moura, "Distributed sensor localization in random environments using minimal number of anchor nodes," IEEE Transactions on Signal Processing, vol. 57, no. 5, May 2009. [Online]. Available: http://arxiv.org/abs/0802.3563
[2] U. A. Khan, S. Kar, and J. M. F. Moura, "Distributed algorithms in sensor networks," in Handbook on Sensor and Array Processing, S. Haykin and K. J. R. Liu, Eds. New York, NY: Wiley-Interscience, 2009, to appear.
[3] U. A. Khan, S. Kar, and J. M. F. Moura, "Higher dimensional consensus algorithms in sensor networks," in IEEE 34th International Conference on Acoustics, Speech, and Signal Processing, Taipei, Taiwan, Apr. 2009.
[4] U. A. Khan, S. Kar, and J. M. F. Moura, "Higher dimensional consensus: learning in large-scale networks," Apr. 2009, accepted for publication in the IEEE Transactions on Signal Processing. [Online]. Available: http://arxiv.org/abs/0904.1840
[5] L. Xiao and S. Boyd, "Fast linear iterations for distributed averaging," Systems & Control Letters, vol. 53, no. 1, pp. 65–78, Apr. 2004.
[6] R. Olfati-Saber, J. A. Fax, and R. M. Murray, "Consensus and cooperation in networked multi-agent systems," Proceedings of the IEEE, vol. 95, no. 1, pp. 215–233, Jan. 2007.
[7] M. B. Nevelson and R. Z. Hasminskii, Stochastic Approximation and Recursive Estimation. Providence, RI: American Mathematical Society, 1973.
[8] S. Kar and J. M. F. Moura, "Distributed consensus algorithms in sensor networks: Quantized data," Nov. 2007, submitted for publication. [Online]. Available: http://arxiv.org/abs/0712.1609
[9] S. Kar, J. M. F. Moura, and K. Ramanan, "Distributed parameter estimation in sensor networks: Nonlinear observation models and imperfect communication," Aug. 2008, submitted for publication. [Online]. Available: http://arxiv.org/abs/0809.0009
[10] S. Kar and J. M. F. Moura, "Distributed consensus algorithms in sensor networks: Link failures and channel noise," IEEE Transactions on Signal Processing, 2008, accepted for publication. [Online]. Available: http://arxiv.org/abs/0711.3915