
Commun Nonlinear Sci Numer Simulat 16 (2011) 2999–3004
Contents lists available at ScienceDirect
Journal homepage: www.elsevier.com/locate/cnsns

Short communication

Using information to generate derivative coordinates from noisy time series

B.P. Mann a,⁎, F.A. Khasawneh a, R. Fales b

a Department of Mechanical Engineering & Material Science, Duke University, Durham, NC 27708, USA
b Department of Mechanical & Aerospace Engineering, University of Missouri, Columbia, MO 65211, USA

Article history: Received 25 August 2010; Received in revised form 16 November 2010; Accepted 21 November 2010; Available online 3 December 2010.

Keywords: Information theoretic; Signal derivatives; Derivative coordinates

Abstract

This paper describes an approach for recovering a signal, along with the derivatives of the signal, from a noisy time series. To mimic an experimental setting, noise was superimposed onto a deterministic time series. Data smoothing was then used to successfully recover the derivative coordinates; however, the appropriate level of data smoothing must be determined.
To investigate the level of smoothing, an information theoretic is applied to show that a loss of information occurs for increased levels of noise; conversely, data smoothing is shown to recover information by removing noise. An approximate criterion is then developed to balance the notion of information recovery through data smoothing against the observation that nearly negligible information changes occur for a sufficiently smoothed time series.

© 2010 Elsevier B.V. All rights reserved.

1. Introduction

It is rarely practical to measure each state variable in an experimental setting. To circumvent this issue, methods exist for reconstructing a pseudo-state space from a small number of measured observables [1–3]. A primary benefit of the reconstruction is that it produces a pseudo-state space with dynamics equivalent to those of the original state space. As a consequence of this equivalence, an attractor in the reconstructed state space has the same invariants, such as Lyapunov exponents and dimension, as the original attractor [4]. While delayed embedding is the predominant choice for attractor reconstruction [3,5–8], the use of derivative coordinates is appealing, given their obvious physical meaning. For instance, many physical systems can be described by a set of equations with states that are related through their derivatives. While the numerical derivatives of a signal can be used for noise-free data, the presence of noise renders the signal derivatives poor approximations, owing to noise amplification in the differentiation.

The inherent goal of signal analysis is to extract useful information about a system from the observed data. Consider a continuous and deterministic system that has produced the scalar time series q(t).
To mimic realistic data from an experiment, noise is superimposed onto the noise-free data as follows: g(t) = q(t) + σa(t), where a(t) is a time series of normally distributed random noise with zero mean, so that the superimposed noise has a standard deviation equal to σ. The underlying goal of this investigation is to extract q(t) and the derivatives of q(t) from the noisy time series g(t). This work investigates smoothing g(t) to recover q(t) and its derivatives. While the process of data smoothing is well known, the question of how much smoothing yields accurate derivative coordinates is unclear. To answer this question, we explored the use of an information theoretic, known as the average mutual information, to develop a criterion for the appropriate amount of data smoothing.

⁎ Corresponding author. Tel./fax: +1 919 660 5214. E-mail address: brian.mann@duke.edu (B.P. Mann).

doi:10.1016/j.cnsns.2010.11.011

The work of this paper is organized as follows. The next section describes the data smoothing technique and the average mutual information tools used in our analyses. These discussions are followed by a series of example results that apply the average mutual information to investigate the influence of noise, along with an approximate criterion for the level of data smoothing.

2. Example implementation

The investigations that follow use synthetic data generated from a Duffing oscillator,

    q″ + μq′ + ω²q + βq³ = Γ cos Ωt,    (1)

where a prime denotes a derivative with respect to time. Numerical simulation was used to generate a chaotic time series for q(t), q′(t), and q″(t) with the parameters μ = 0.2, ω = 1, β = 1, Γ = 27, and Ω = 1.33; however, to mimic the realistic challenges of an experiment, we have assumed that only the noisy time series g(t) = q(t) + σa(t) is observable.
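The synthetic-data setup of Eq. (1) can be sketched in a few lines. The integrator settings, initial condition, and random seed below are illustrative assumptions, not values stated in the paper:

```python
# Sketch: simulate the Duffing oscillator of Eq. (1) and superimpose
# Gaussian noise, g(t) = q(t) + sigma*a(t).  Tolerances, the initial
# condition, and the seed are illustrative choices.
import numpy as np
from scipy.integrate import solve_ivp

mu, omega, beta, Gamma, Omega = 0.2, 1.0, 1.0, 27.0, 1.33  # paper's parameters

def duffing(t, y):
    q, qdot = y
    return [qdot,
            -mu * qdot - omega**2 * q - beta * q**3 + Gamma * np.cos(Omega * t)]

t = np.linspace(0.0, 200.0, 18_000)  # N = 18e3 points, as used in the paper
sol = solve_ivp(duffing, (t[0], t[-1]), [1.0, 0.0], t_eval=t,
                rtol=1e-9, atol=1e-9)
q = sol.y[0]

sigma = 0.1                                  # noise level of Figs. 1 and 3
rng = np.random.default_rng(0)
g = q + sigma * rng.standard_normal(q.size)  # the noisy observable g(t)
```

Only g (and t) would be available in the experimental setting the paper mimics; q and its derivatives serve as ground truth for assessing the recovery.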
The remainder of this section describes the application of data smoothing for noise removal and the use of the average mutual information to determine the appropriate level of smoothing for recovering the derivatives of q(t).

2.1. Noise removal with smoothing

Cubic splines are often applied to empirical data to estimate interim points, i.e. data points that lie between two measurements. The basic idea is to fit the data with a piecewise polynomial,

    s(t) = b_{i0} + b_{i1}(t − t_i) + b_{i2}(t − t_i)² + b_{i3}(t − t_i)³,    (2)

where the subscripts of the polynomial coefficients b_{i0}, b_{i1}, b_{i2}, and b_{i3} denote their validity between two neighboring data points, i.e. over the time interval from t_i to t_{i+1}. Cubic splines therefore provide a piecewise fit to the data while simultaneously giving a functional relationship for the derivatives of the data, obtained by differentiating Eq. (2). Furthermore, cubic splines enforce continuity of the signal and its first two derivatives at the intersections of neighboring time steps [9].

Fig. 1. Two-dimensional state space with graphs showing: (a) the actual q and q′; (b) the noisy time series g and the numerical derivative g′ when σ = 0.1; and (c) the smoothed version s of the noisy time series and the derivative s′ of the smoothed signal obtained when γ = 99.95 × 10⁻².

Smoothing splines, which differ from the typical cubic spline fitting operation, refine the idea of using a polynomial to fit empirical data. The basic difference lies in the introduction of a smoothing parameter, which reduces noise amplification in the signal derivatives by balancing a fit to the measured data g(t) against the smoothness of the second derivative [10]. In the results that follow, cubic smoothing splines have been implemented; this approach obtains the polynomial coefficients of Eq. (2) by minimizing
    (1 − γ) Σ_{i=1}^{N} |g_i − s_i|² + γ ∫_{t_1}^{t_N} |d²s(t)/dt²|² dt,    (3)

where the notation g_i = g(t_i) and s_i = s(t_i) has been applied over the N data points. The smoothing parameter γ provides a weight that balances the smoothness of the second derivative against a fit to the measured data.

The smoothing process can be illustrated with the following example. Consider the superposition of the scalar time series q(t), generated from Eq. (1), with normally distributed random noise; this takes the form g(t) = q(t) + σa(t). In the absence of noise, accurate derivative coordinates may be obtained from the numerical derivatives of the observed signal, since g(t) = q(t) when σ = 0 (see the attractor in Fig. 1a). If normally distributed random noise with a standard deviation of σ = 0.1 is added to the signal, Fig. 1b shows that the attractor generated by g(t) and g′(t) differs significantly from the attractor given by the states q(t) and q′(t). While the numerical derivative, denoted by g′(t), contains an amplified amount of noise, the noise in the higher-order numerical derivatives often overtakes the deterministic component of the signal. Here we note that the numerical derivative and the derivative of a cubic spline fit without smoothing are identical; both yield the time series shown in Fig. 1b. The attractor of Fig. 1c shows the potential of smoothing the data, since the time series for s(t) and s′(t) closely replicate the time series shown in Fig. 1a for q(t) and q′(t).

Fig. 2. Average mutual information for different levels of noise intensity. Graphs denote the mutual information between the observable signal g and the following signals: (a) q, (b) q′, (c) q″.
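Smoothing-spline fits of the kind minimized in Eq. (3) are available in common libraries. The sketch below uses SciPy's UnivariateSpline, whose residual-bound parameter `s` differs from the weight γ of Eq. (3), so the parameterization is not the paper's; the sine signal is likewise only an illustrative stand-in for q(t):

```python
# Smoothing-spline sketch.  UnivariateSpline bounds the residual sum of
# squares via `s` rather than weighting the objective as gamma does in
# Eq. (3); the qualitative effect (trading fit against smoothness, with
# derivatives available after differentiation) is the same.
import numpy as np
from scipy.interpolate import UnivariateSpline

t = np.linspace(0.0, 20.0, 2000)
q = np.sin(t)                              # illustrative stand-in for q(t)
rng = np.random.default_rng(1)
g = q + 0.1 * rng.standard_normal(t.size)  # noisy observable g(t)

# s ~ N*sigma^2 is a common starting heuristic for the residual bound
spl = UnivariateSpline(t, g, k=3, s=t.size * 0.1**2)
s_fit = spl(t)                 # smoothed signal s(t)
s_dot = spl.derivative(1)(t)   # s'(t), an estimate of q'(t)

# compare against the raw numerical derivative of the noisy signal
raw_dot = np.gradient(g, t)
err_raw = np.mean(np.abs(raw_dot - np.cos(t)))   # large: noise amplified
err_smooth = np.mean(np.abs(s_dot - np.cos(t)))  # much smaller after smoothing
```

The raw derivative amplifies the noise roughly in proportion to the sampling rate, while the spline derivative tracks the true derivative, mirroring the contrast between Figs. 1b and 1c.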
2.2. Average mutual information

The mutual information between a measurement x_i = x(t_i), drawn from a set of measurements x, and a measurement y_j = y(t_j), drawn from a set of measurements y, is the amount learned by the measurement of x_i about the measurement of y_j. The average amount of information learned by comparing the measurements of signal x to the measurements of signal y is

    I(x, y) = Σ_{x_i, y_j} p(x_i, y_j) log [ p(x_i, y_j) / (p(x_i) p(y_j)) ],    (4)

where the logarithm is taken to be the natural log and p(x_i, y_j) is the joint probability density that measurements drawn from the signals x and y will result in the values x_i and y_j [3,11]. The individual probability densities for the measurements x_i in x and y_j in y are given by p(x_i) and p(y_j), respectively. If the two signals are completely independent of each other, which indicates no information transfer, the joint probability factorizes to p(x_i, y_j) = p(x_i)p(y_j) [12]. For computational purposes, it is convenient to rewrite Eq. (4) as

    I(x, y) = Σ_{x_i, y_j} p(x_i, y_j) log p(x_i, y_j) − Σ_{x_i} p(x_i) log p(x_i) − Σ_{y_j} p(y_j) log p(y_j).    (5)

Fig. 3. Average mutual information and error results for a range of smoothing values and a fixed noise level σ = 0.1. Graphs (a) and (b) are error plots for the smoothed velocity and acceleration, respectively. Graphs (c)–(f) show average mutual information plots for the corresponding variables. The smoothing level that gives the smallest error is marked on the plots for s′ and s″.

Fig. 2 shows a series of plots comparing the mutual information between the observable signal g and the signals q(t), q′(t), and q″(t).
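The sums in Eqs. (4) and (5) can be estimated by binning the two signals into a joint histogram; a minimal sketch, where the bin count is an illustrative implementation choice not specified in the text:

```python
# Histogram-based estimate of the average mutual information, Eq. (4).
import numpy as np

def average_mutual_information(x, y, bins=64):
    """Estimate I(x, y) in nats from a joint 2-D histogram."""
    pxy, _, _ = np.histogram2d(x, y, bins=bins)
    pxy /= pxy.sum()               # joint probabilities p(x_i, y_j)
    px = pxy.sum(axis=1)           # marginal p(x_i)
    py = pxy.sum(axis=0)           # marginal p(y_j)
    nz = pxy > 0                   # skip empty cells to avoid log(0)
    return float(np.sum(pxy[nz] * np.log(pxy[nz] / np.outer(px, py)[nz])))

rng = np.random.default_rng(2)
a = rng.standard_normal(50_000)
b = rng.standard_normal(50_000)
i_indep = average_mutual_information(a, b)  # near zero: independent signals
i_self = average_mutual_information(a, a)   # large: identical signals
```

For independent signals the joint probability factorizes, so the estimate is near zero (up to a small positive histogram bias); for identical signals it equals the entropy of the binned signal.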
For each case, the shared information between the signals rapidly decreases as the noise intensity increases; conversely, these graphs also show that a reduction in noise, whether from smoothing the data or through an alternative approach, increases the shared information between the signals. Before investigating the usefulness of these shared-information and noise-level trends, the following error equation is introduced to ascertain the goodness of fit between any two signals:

    E(x, y) = (1/N) Σ_{i=1}^{N} |x_i − y_i|,    (6)

where E(x, y) is the average error obtained when the time series x and y are compared at each t_i, i ∈ [1, N]. For the investigations that follow, N = 18 × 10³ data points were used.

The effect of different levels of data smoothing is illustrated in Figs. 3 and 4. While the top graphs show the error obtained when estimating q′(t) and q″(t) by smoothing the noisy observable g(t), the lower graphs show the mutual information between various signals. Focusing on the results of Fig. 3, it is evident that smoothing initially causes a drastic increase in the mutual information between g(t) (or the smoothed version s(t)) and the smoothed estimates of q′(t) and q″(t), which have been labeled s′(t) and s″(t). However, once a sufficient amount of noise has been removed, the mutual information curve essentially plateaus, indicating that oversmoothing the data will not significantly alter the shared information; in addition, the error graphs show that oversmoothing will increase the errors of estimating q′(t) and q″(t) with s′(t) and s″(t).

While the results of Fig. 3 used σ = 0.1, an increased noise level of σ = 0.5 was used for the analyses shown in Fig. 4. Along with the multiple other cases investigated, these two figures demonstrate nearly identical trends. For instance, both figures show that different levels of smoothing are required to produce the most accurate derivative coordinates.
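For reference, the error measure of Eq. (6) is a plain mean absolute difference between two equal-length series; as a sketch:

```python
# Eq. (6) as code: the average absolute difference between two series.
import numpy as np

def series_error(x, y):
    """E(x, y) of Eq. (6) for equal-length series x and y."""
    x, y = np.asarray(x, dtype=float), np.asarray(y, dtype=float)
    return float(np.mean(np.abs(x - y)))
```

For example, series_error([1, 2, 3], [1, 2, 4]) gives 1/3.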
The mutual information graphs also indicate that the best level of smoothing, the value that produces the most accurate derivative coordinates, occurs just after the bend in the mutual information curves.

Fig. 4. Average mutual information and error results for a range of smoothing values and a fixed noise level σ = 0.5. Graphs (a) and (b) are error plots for the smoothed velocity and acceleration, respectively. Graphs (c)–(f) show average mutual information plots for the corresponding variables. The smoothing level that gives the smallest error is marked on the plots for s′ and s″.

2.3. Approximate smoothing criterion

While it was initially tempting to search for an exact smoothing level that yields the absolute minimum error, we found this approach impractical, given that the noisy time series is the only observable. Instead, we have focused on an approximate criterion that leverages the trends in the mutual information curves. For instance, Fig. 5 shows unit-normalized curvature and slope plots of the mutual information curves of Figs. 3 and 4; these were obtained by fitting the function

    f(γ) = (c₀ + c₁γ)(1 + e^{−c₂γ})

to the I(x, y) curves, where c₀ through c₂ are fitted constants, and taking the derivatives with respect to γ. Here, Ĩ_{x,y} represents the normalized slope of I(x, y) and κ(I_{x,y}) denotes the normalized curvature of I(x, y). The correlation coefficients obtained when fitting f(γ) to the mutual information curves of Figs. 3 and 4 are (a) 0.96, (b) 0.99, (c) 0.97, and (d) 0.96, where the letters (a)–(d) correspond to the graphs of Fig. 5. The curves shown in Fig. 5 indicate that the slope is initially large but rapidly decays to a constant value, indicating a leveling off in the shared information between the smoothed signals for further levels of smoothing.
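The normalized slope and curvature of such a curve, and their first intersection, can be computed numerically. The sketch below uses an illustrative I(γ) curve with the rise-then-plateau shape of Figs. 3 and 4, and the standard plane-curve curvature formula; both are assumptions for illustration, since the excerpt does not spell out the curvature computation:

```python
# Sketch of the slope/curvature comparison on an illustrative mutual
# information curve I(gamma): steep rise, then plateau.
import numpy as np

gamma = np.linspace(0.0, 10.0, 4001)
I_curve = 2.0 * (1.0 - np.exp(-1.5 * gamma)) + 0.02 * gamma  # illustrative

dI = np.gradient(I_curve, gamma)            # slope of I(gamma)
d2I = np.gradient(dI, gamma)                # second derivative
kappa = np.abs(d2I) / (1.0 + dI**2) ** 1.5  # plane-curve curvature

slope_n = dI / dI.max()                     # unit-normalized slope
kappa_n = kappa / kappa.max()               # unit-normalized curvature

# Threshold: first intersection of the normalized slope and curvature.
cross = np.nonzero(np.diff(np.sign(slope_n - kappa_n)))[0][0]
eps = slope_n[cross]

# Acceptable smoothing range: where normalized curvature exceeds eps.
acceptable = gamma[kappa_n >= eps]
```

The curvature peaks near the bend of the curve, so the resulting range of smoothing levels brackets the bend, consistent with the observation that the best smoothing occurs just after it.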
Thus one idea would be to choose a small value ε and to select the smoothing level after the normalized slope drops below ε. The problem with this approach is that the choice of ε is arbitrary. As an alternative, a threshold value could be adopted using the normalized curvature; for instance, a range of accurate smoothing levels could be identified wherever the curvature exceeds a threshold ε. However, once again, the choice of ε is rather arbitrary, unless the two ideas are combined. Specifically, we contend that the first intersection of the normalized curvature and slope, where Ĩ_{x,y} = κ(I_{x,y}), should be used as the threshold ε; the acceptable range for γ is then defined by the normalized curvature exceeding this threshold, κ(I_{x,y}) ≥ ε. This was the approach taken to successfully smooth the data shown in Fig. 1.

To summarize, the proposed criterion considers a range of acceptable smoothing levels as opposed to a single value. The idea stems from the notion that data smoothing can remove noise and therefore recover information; this is confirmed by the shape of the I(x, y) curves and quantified through the normalized slope and curvature.

Fig. 5. Unit-normalized slope (solid line) and curvature (dotted line) plots of the mutual information curves of Figs. 3 and 4. Graphs give the slope and curvature for: (a) I(s, s′) and σ = 0.1, (b) I(s, s″) and σ = 0.1, (c) I(s, s′) and σ = 0.5, and (d) I(s, s″) and σ = 0.5. The smoothing level that gives the smallest error is marked on the plots for s′ and s″.

3. Conclusions

This paper described an approach to recover the derivative coordinates from a noisy time series. We have assumed the underlying system was deterministic and continuous. Although a data smoothing technique was shown to successfully recover the derivative coordinates, a criterion was needed to ascertain the appropriate level of data smoothing. To investigate
the best level of smoothing, we applied an information theoretic to show information loss for increased levels of noise intensity. Conversely, we also showed that data smoothing can recover information. An approximate criterion was then developed that balanced the notion of recovering information through data smoothing against the observation that nearly negligible information changes occur once the time series has been sufficiently smoothed. In particular, the slopes and curvatures of the mutual information curves were used to describe an approximate criterion for an appropriate range of data smoothing, i.e. the range that provides the most accurate recovery of the derivative coordinates.

Acknowledgement

Support from the National Science Foundation (CMMI-0900266) is gratefully acknowledged.

References

[1] Packard NH, Crutchfield JP, Farmer JD, Shaw RS. Geometry from time series. Phys Rev Lett 1980;45:712–6.
[2] Takens F. Detecting strange attractors in turbulence. In: Rand D, Young LS, editors. Dynamical Systems and Turbulence. Lecture Notes in Mathematics, vol. 898. New York: Springer-Verlag; 1981. p. 366–81.
[3] Abarbanel HD. Analysis of Observed Chaotic Data. 2nd ed. New York, NY: Springer; 1996.
[4] Nayfeh AH, Balachandran B. Applied Nonlinear Dynamics. 1st ed. New York, NY: John Wiley & Sons, Inc.; 1995.
[5] Kennel MB, Abarbanel HD. False neighbors and false strands: a reliable minimum embedding dimension algorithm. Phys Rev E 2002;66:026209.
[6] Fraser AM. Reconstructing attractors from scalar time series: a comparison of singular and redundancy criteria. Physica D 1989;34:391–404.
[7] Fraser AM. Independent coordinates for strange attractors from mutual information. Phys Rev A 1986;33:1134–40.
[8] Garcia SP, Almeida JS. Nearest neighbor embedding with different time delays. Phys Rev E 2005;71:037204.
[9] Kreyszig E.
Advanced Engineering Mathematics. 8th ed. New York, NY: John Wiley & Sons, Inc.; 1999.
[10] de Boor C. A Practical Guide to Splines. New York, NY: Springer-Verlag; 1978.
[11] Nichols JM. Inferences about information flow and dispersal for spatially extended population systems using time-series data. Proc R Soc London B 2005;272:871–6.
[12] Nichols JM, Seaver M, Trickey ST, Salvino LW, Pecora DL. Detecting impact damage in experimental composite structures: an information-theoretic approach. Smart Mater Struct 2006;15:424–34.