
WO2018086025A1 - Node identification in distributed adaptive networks - Google Patents


Info

Publication number
WO2018086025A1
Authority
WO
WIPO (PCT)
Prior art keywords
data information
node
data
observation point
information
Prior art date
Application number
PCT/CN2016/105324
Other languages
French (fr)
Inventor
Guangyue LU
Fan Wang
Jingyu FENG
Original Assignee
Nokia Technologies Oy
Nokia Technologies (Beijing) Co., Ltd.
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Nokia Technologies Oy and Nokia Technologies (Beijing) Co., Ltd.
Priority to PCT/CN2016/105324 (WO2018086025A1)
Priority to CN201680091991.6A (CN110168557B)
Publication of WO2018086025A1

Classifications

    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L 67/00 Network arrangements or protocols for supporting network services or applications
    • H04L 67/01 Protocols
    • H04L 67/10 Protocols in which an application is distributed across nodes in the network
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F 18/00 Pattern recognition
    • G06F 18/20 Analysing
    • G06F 18/22 Matching criteria, e.g. proximity measures
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F 18/00 Pattern recognition
    • G06F 18/20 Analysing
    • G06F 18/24 Classification techniques
    • G06F 18/243 Classification techniques relating to the number of classes
    • G06F 18/2433 Single-class perspective, e.g. one-against-all classification; Novelty detection; Outlier detection
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F 18/00 Pattern recognition
    • G06F 18/20 Analysing
    • G06F 18/25 Fusion techniques
    • G06F 18/251 Fusion techniques of input or preprocessed data
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V 10/00 Arrangements for image or video recognition or understanding
    • G06V 10/94 Hardware or software architectures specially adapted for image or video understanding
    • G06V 10/95 Hardware or software architectures specially adapted for image or video understanding structured as a network, e.g. client-server architectures
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L 67/00 Network arrangements or protocols for supporting network services or applications
    • H04L 67/01 Protocols
    • H04L 67/12 Protocols specially adapted for proprietary or special-purpose networking environments, e.g. medical networks, sensor networks, networks in vehicles or remote metering networks
    • H04L 67/125 Protocols specially adapted for proprietary or special-purpose networking environments, e.g. medical networks, sensor networks, networks in vehicles or remote metering networks involving control of end-device applications over a network
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L 67/00 Network arrangements or protocols for supporting network services or applications
    • H04L 67/50 Network services
    • H04L 67/56 Provisioning of proxy services
    • H04L 67/568 Storing data temporarily at an intermediate stage, e.g. caching

Definitions

  • the present disclosure generally relates to communication networks, and more specifically, relates to distributed adaptive networks.
  • Communication networks can be deployed in a centralized or distributed manner.
  • in a centralized deployment, all nodes transmit their information to a central authority for decision making or data fusion, which can fully benefit from information collected throughout the network.
  • however, the central authority can easily become a bottleneck of the network: the failure of this centralized node, for example under a denial-of-service attack, can be a disaster.
  • a distributed strategy without a central authority attracts more attention for its good performance and robustness to such attacks. In this case, many kinds of distributed algorithms are needed to perform parameter estimation.
  • nevertheless, the networks are not always in a secure situation.
  • possible malicious nodes may tamper with their own values and transmit them to their neighbor nodes for data fusion, which can degrade the performance of the entire network during data diffusion among the nodes. How to identify a malicious node and prevent its value from being transmitted to its neighbor nodes has therefore become a crucial issue for distributed networks.
  • the present description proposes a novel solution for distributed networks, which may be utilized to identify some specified data information and network nodes, for example, abnormal data and related nodes in the distributed networks.
  • a method comprising: collecting first data information transmitted at an observation point from a first node to a second node in a distributed network, wherein the second node is a neighbor node of the first node; comparing the first data information with second data information transmitted at the observation point by the second node to its neighbor nodes in the distributed network; and determining whether the first data information is to be used by the second node in data fusion associated with the observation point, based at least in part on the comparison of the first data information with the second data information.
  • an apparatus comprising at least one processor and at least one memory comprising computer program code, the at least one memory and the computer program code configured to, with the at least one processor, cause the apparatus at least to: collect first data information transmitted at an observation point from another apparatus to the apparatus in a distributed network, wherein the apparatus is a neighbor of the another apparatus; compare the first data information with second data information transmitted at the observation point by the apparatus to its neighbors in the distributed network; and determine whether the first data information is to be used by the apparatus in data fusion associated with the observation point, based at least in part on the comparison of the first data information with the second data information.
  • a computer program product comprising a computer-readable medium bearing computer program codes embodied therein for use with a computer, the computer program codes comprising: code for collecting first data information transmitted at an observation point from a first node to a second node in a distributed network, wherein the second node is a neighbor node of the first node; code for comparing the first data information with second data information transmitted at the observation point by the second node to its neighbor nodes in the distributed network; and code for determining whether the first data information is to be used by the second node in data fusion associated with the observation point, based at least in part on the comparison of the first data information with the second data information.
  • an apparatus comprising: collection means for collecting first data information transmitted at an observation point from another apparatus to the apparatus in a distributed network, wherein the apparatus is a neighbor of the another apparatus; comparison means for comparing the first data information with second data information transmitted at the observation point by the apparatus to its neighbors in the distributed network; and determination means for determining whether the first data information is to be used by the apparatus in data fusion associated with the observation point, based at least in part on the comparison of the first data information with the second data information.
  • the apparatus according to the second/fourth aspect of the present disclosure may comprise the second node, and the another apparatus according to the second/fourth aspect of the present disclosure may comprise the first node.
  • the comparison of the first data information with the second data information may comprise calculating a similarity distance between the first data information and the second data information.
  • the similarity distance may represent a distance between values of the first data information and the second data information.
  • the first data information is to be used by the second node in the data fusion associated with the observation point, in response to the similarity distance between the first data information and the second data information not satisfying a first criterion.
  • the first data information is not to be used by the second node in the data fusion associated with the observation point, in response to the similarity distance between the first data information and the second data information satisfying a first criterion.
  • the first data information may be replaced with the second data information by the second node in the data fusion associated with the observation point.
  • the method according to the first aspect of the present disclosure may further comprise: updating first identification information for the first node, in response to determining that the first data information is not to be used by the second node in the data fusion associated with the observation point.
  • the first identification information may indicate, within an observation cycle comprising one or more observation points, a number of observation points at which data information from the first node is determined not to be used in data fusion by the second node.
  • the method according to the first aspect of the present disclosure may further comprise: updating second identification information for the first node, in response to the first identification information at the end of the observation cycle satisfying a second criterion.
  • the second identification information may indicate a number of observation cycles at the end of which the first identification information for the first node satisfies the second criterion.
  • the method according to the first aspect of the present disclosure may further comprise: triggering a predefined action for the first node, in response to the second identification information satisfying a third criterion.
  • the predefined action may comprise raising an alarm.
  • Fig. 1 exemplarily illustrates a distributed network topology and data transmissions among nodes in accordance with an embodiment of the present disclosure
  • Fig. 2 is a flowchart illustrating a method for distributed networks in accordance with an embodiment of the present disclosure
  • Fig. 3 exemplarily illustrates the effect on network performance as the predefined fusion threshold changes, in the absence of attacks, in accordance with an embodiment of the present disclosure
  • Fig. 4 is a flowchart illustrating an identification procedure in accordance with an embodiment of the present disclosure
  • Fig. 5 exemplarily illustrates the system architecture of a data management module in accordance with an embodiment of the present disclosure
  • Fig. 6 exemplarily illustrates a network topology used in the simulations in accordance with an embodiment of the present disclosure
  • Fig. 7 exemplarily illustrates the detection of the number of attacks under two kinds of attack modes in accordance with an embodiment of the present disclosure
  • Fig. 8 exemplarily illustrates the comparison of DLMS performance in the unsafe situation in accordance with an embodiment of the present disclosure
  • Fig. 9 exemplarily illustrates the impact of the increased number of malicious nodes on the network performance in accordance with an embodiment of the present disclosure.
  • Fig. 10 is a simplified block diagram of an apparatus suitable for use in practicing exemplary embodiments of the present disclosure.
  • Distributed adaptation over networks has emerged as an attractive and challenging research area with the advent of multiagent (wireless or wired) networks.
  • Distributed adaptive networks are well-suited for the decentralized information processing and optimization tasks to model various types of self-organized and complex behaviors encountered in engineering, such as target localization, cooperative spectrum sensing, distributed estimation, etc.
  • Distributed adaptive networks consist of a collection of nodes with processing and learning abilities. The interconnected nodes form a connecting topology, and cooperate with each other through local interactions to perform preassigned tasks in real time.
  • the nodes share information with each other, and the continuous information diffusion across the network enables the nodes to adapt their performance to the data stream and network conditions. It can also achieve improved adaptation and learning performance, as compared with non-cooperative nodes.
  • the distributed networks may be in an unsecure situation where the possible malicious nodes may deteriorate the performance of entire network during the data diffusion among the nodes.
  • the malicious node can launch different kinds of attacks, with a certain probability, on its neighbor nodes at any time via different ways of falsifying data. It is therefore hard to control and predict the attack behavior of the malicious node.
  • the malicious node may transmit its falsified value to the regular nodes, which can cause catastrophic damage to the performance of the entire network, since the falsified value is used in data fusion, and thus render a well designed distributed system useless.
  • with more malicious nodes, the regular nodes would suffer fiercer attacks, which may lead to more serious damage to the entire network.
  • one way to detect malicious nodes is the sort-based method.
  • the central authority sorts the values received from its neighbors and regards the values at both ends of them as malicious.
  • the values of malicious nodes may change randomly, which makes it hard to identify a malicious node only via its value position.
  • the effectiveness of the method is sensitive to the network topology and the type of malicious nodes.
  • Another possible scheme relies on the differences between the values of each node and the average value of all neighbor nodes to identify the malicious nodes, which assumes that the malicious node has the biggest difference from the average value.
  • methods, apparatus, and computer program products provided according to exemplary embodiments of the present disclosure can provide a novel and effective solution to identify the malicious nodes in distributed networks.
  • the node identification scheme proposed in the present disclosure could almost eliminate the influence of malicious nodes and greatly improve the performance of the entire network.
  • Fig. 1 exemplarily illustrates a distributed network topology and data transmissions among nodes in accordance with an embodiment of the present disclosure.
  • the distributed network as shown in Fig. 1 is a connected network consisting of N nodes. Each node has the ability of communicating with its neighbors and processing data. It will be realized that Fig. 1 merely shows schematically a topology structure and data transmissions of the exemplary distributed network and other possible implementations of the distributed network may also be conceived in view of the exemplary embodiments of the present disclosure.
  • a node in the distributed network as shown in Fig. 1 may generate data at a time instant and broadcast it to all of its neighbor nodes for cooperation.
  • node l may generate data (which may be represented by a data vector ψ_l(i) with M components) at time i and transmit it to all of its neighbor nodes (such as neighbor node 2, node 3 and node k).
  • the node may receive data from its neighbor nodes and perform data fusion by using the received data and its own data.
  • ideally, the values of these nodes are the same.
  • the value of a node may refer to the specific value of data transmitted by the node. For example, if all nodes cooperate with each other to estimate the temperature, then the value of a node refers to the temperature estimated and transmitted by the node.
  • node l as one neighbor of node k may be one of the malicious nodes in the network.
  • the vector ψ_l(i) may be falsified with a certain probability in one of two forms: the same-direction attack and the Gaussian attack, respectively.
  • the attacker (such as node l) may launch attacks at time i, with probability p_l, by tampering with its value either in the increasing way or in the decreasing way, which places ψ_l(i) at either end of the sorted values collected by node k.
  • the same-direction attack can be expressed as follows: at time i, node l, if launching an attack (with probability p_l), multiplies the j-th component of ψ_l(i) by λ(i); if not attacking (with probability 1 - p_l), it does nothing.
  • the Gaussian attack may randomly generate a Gaussian value as a component of ψ_l(i).
  • this attack mode can be written as follows: at time i, with probability p_l, the j-th component is drawn from a Gaussian distribution with mean μ_l and a given variance; if not attacking (with probability 1 - p_l), node l does nothing.
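The two attack modes described above can be sketched as follows. This is an illustrative simplification, not the patent's exact equations: `same_direction_attack` scales the whole estimate by a single factor `lam` (standing in for λ(i)) rather than per component, and `gaussian_attack` takes the standard deviation as an explicit parameter; the function names and signatures are assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)

def same_direction_attack(psi, p, lam):
    """With probability p, tamper with the estimate by scaling it by lam
    (lam > 1 attacks in the increasing way, 0 < lam < 1 in the decreasing
    way); with probability 1 - p, do nothing."""
    return lam * psi if rng.random() < p else psi

def gaussian_attack(psi, p, mu, sigma):
    """With probability p, replace each component with a draw from a
    Gaussian with mean mu and standard deviation sigma; with probability
    1 - p, do nothing."""
    if rng.random() < p:
        return rng.normal(mu, sigma, size=psi.shape)
    return psi
```

With `p = 0` both functions return the estimate untouched, matching the "no attacking" branch taken with probability 1 - p_l.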
  • nodes in the distributed network or distributed adaptive network are rational nodes. From the social perspective, for example, a person is willing to allow his/her friends into his/her home because of the close relationships between them. Conversely, the malicious person who intends to harm others will not get any permission because of his/her deteriorated relationship with regular persons. Thus, there may be a standard, as a designed metric, to measure this kind of relationship. In view of this analysis, a novel and effective solution is proposed in the present disclosure to identify some specified nodes (such as the malicious nodes) and solve the attack problem in the distributed adaptive network.
  • Fig. 2 is a flowchart illustrating a method for distributed networks in accordance with an embodiment of the present disclosure.
  • the method illustrated in Fig. 2 may be performed, for example, at a node or any other network element with node functionality in a distributed network.
  • the distributed network to which the proposed method may be applicable may comprise the network as shown in Fig. 1 or any other adaptive network with distributed deployment.
  • a plurality of nodes comprising a first node and a second node in the distributed network may cooperate with each other through data communications and information exchanges.
  • the second node which is a neighbor node of the first node may collect first data information transmitted at an observation point (such as time instant i) from the first node to the second node in the distributed network, as shown in block 202.
  • the first data information may comprise data information measured, sensed, generated and/or derived by the first node.
  • the second node may also collect data information transmitted at this observation point from its other neighbor nodes in addition to the first node.
  • the collected data information may be used by the second node with its own data information to perform information diffusion or data diffusion.
  • data diffusion across the network may increase both the performance and, in unsafe situations, the risk. It is therefore desirable to find ways to prevent the values of malicious nodes from entering the process of data fusion.
  • the second node may compare the first data information with second data information transmitted at the observation point by the second node to its neighbor nodes in the distributed network.
  • the second node can determine whether the first data information is to be used by the second node in data fusion associated with the observation point, based at least in part on the comparison of the first data information with the second data information.
  • the comparison of the first data information with the second data information may comprise calculating a similarity distance between the first data information and the second data information.
  • the similarity distance may represent a distance between values of the first data information and the second data information.
  • the similarity distance γ_l,k(i) may be defined, for example, as the Euclidean distance between the two data vectors: γ_l,k(i) = ||ψ_l(i) - ψ_k(i)||, where
  • ψ_l(i) and ψ_k(i) respectively represent the data or data information transmitted by node l and node k at time i, and
  • γ_l,k(i) represents the distance between the value of node k and that of its neighbor node l at time i.
  • if node l is malicious, the distance between ψ_l(i) and ψ_k(i) tends to be greater than the distance between ψ_k(i) and the data of other regular neighbor nodes. The similarity distance γ_l,k(i) can thus be used as an indicator of a malicious node, and can be calculated before the data fusion at node k.
  • This principle may be taken into account in the proposed method described with respect to Fig. 2.
  • the first data information may be replaced with the second data information by the second node in the data fusion associated with the observation point. This enables the second node to eliminate or reduce the adverse effects on the data fusion caused by the abnormal data from the first node, while maintaining the normal number of contributors to the data fusion.
  • the first criterion may comprise the similarity distance γ_l,k(i) being larger than a predefined fusion threshold r, that is, γ_l,k(i) > r.
  • if γ_l,k(i) > r, ψ_l(i) would be considered as the value of a malicious node and needs to be processed before data fusion at node k. Otherwise, ψ_l(i) may be regarded as safe for data fusion.
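A minimal sketch of this gating rule, assuming for illustration that the similarity distance is the Euclidean distance between the two data vectors; the function name and the tuple return are illustrative choices:

```python
import numpy as np

def gate_neighbor_estimate(psi_l, psi_k, r):
    """Return (estimate to use, is_safe) for neighbor l at node k: if the
    similarity distance exceeds the fusion threshold r, psi_l is treated
    as malicious and replaced with node k's own estimate psi_k, keeping
    the number of fusion contributors unchanged."""
    gamma = np.linalg.norm(psi_l - psi_k)  # assumed Euclidean distance
    if gamma > r:
        return psi_k, False   # flagged: do not use psi_l in data fusion
    return psi_l, True        # safe: use psi_l as received
```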
  • Fig. 3 exemplarily illustrates the effect on network performance as the predefined fusion threshold changes, in the absence of attacks, in accordance with an embodiment of the present disclosure.
  • the Mean Square Deviation (MSD) curve may be used to measure the estimation performance of the entire network; MSD can be expressed as MSD(i) = (1/N) Σ_k E||w_0 - w_k(i)||², where
  • w_0 is an M × 1 target vector which may be estimated by using the conventional Diffusion Least Mean Square (DLMS) algorithm (a kind of algorithm for distributed networks, described later)
  • w k (i) is the estimation or estimated vector of node k at time i
  • N represents the total number of nodes in the whole network
  • E denotes mathematical expectation.
  • the equation amounts to averaging the differences between the estimated vectors of the individual nodes and the target vector at time i.
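In simulation, the expectation E is commonly approximated by this average over the nodes at each time instant; a minimal sketch of that computation (the function name and array layout are illustrative):

```python
import numpy as np

def msd(w0, W):
    """Mean Square Deviation of the network at one time instant: the
    squared estimation errors ||w0 - w_k(i)||^2 averaged over all N
    nodes. W is an N x M array whose k-th row is node k's estimate."""
    return np.mean(np.sum((W - w0) ** 2, axis=1))
```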
  • the predefined fusion threshold r can be obtained through simulations after the network configuration.
  • Fig. 3 shows the determination of r in distributed estimation.
  • as the threshold r increases, more estimations from neighbor nodes are allowed into the process of data fusion, gradually improving the performance.
  • a larger threshold r thus benefits performance, but bigger is not always better: an overly large r may allow the estimations of malicious nodes to enter the process of data fusion. It will be realized that the predefined fusion threshold r may also be set to any other suitable value as required.
  • the above method described in combination with Fig. 2 can eliminate the impact of the attack on regular nodes at time i by preventing abnormal or falsified data from entering data fusion.
  • a node cannot be judged malicious based only on short-term effects, which may be caused by changing environmental factors such as channel fading and shadow effects. A long-time observation may therefore increase the accuracy in identifying an abnormal node or determining a malicious node.
  • first identification information (such as m l described with respect to Fig. 4) for the first node may be defined to indicate, within an observation cycle or period (such as T described with respect to Fig. 4) comprising one or more observation points, a number of observation points at which data information from the first node is determined not to be used in data fusion by the second node.
  • the first identification information for the first node may be updated in response to determining that the first data information is not to be used by the second node in the data fusion associated with the observation point.
  • second identification information (such as Λ_l described with respect to Fig. 4) for the first node may be defined to indicate a number of observation cycles at the end of which the first identification information for the first node satisfies a second criterion.
  • the second identification information for the first node may be updated in response to the first identification information at the end of the observation cycle satisfying the second criterion.
  • the second criterion may comprise the attack probability (which may be represented by the ratio of the value of the first identification information to the observation cycle) being equal to or larger than a predefined identification threshold α.
  • a predefined action may be triggered for the first node in response to the second identification information satisfying a third criterion.
  • the third criterion may comprise the value of the second identification information being equal to or larger than a predefined warning threshold β.
  • the predefined action may comprise raising an alarm.
  • first, second and third criteria as previously described may be predefined or specified as required.
  • the parameters, thresholds and rules associated with these criteria as well as the relationships thereof may vary, for example, as the network environment and the type of nodes which need to be identified change.
  • the proposed solution for node identification in the distributed network described with respect to Fig. 2 may be implemented with various suitable functional modules and/or network components.
  • the node identification scheme designed in accordance with exemplary embodiments of the present disclosure may be performed with three functional modules, such as an outlier data detection module, an evaluation and warning module, and a data management module.
  • the outlier data detection module may be employed to identify abnormal information from neighbor nodes.
  • the evaluation and warning module may be in charge of determining malicious nodes and raising an alarm in time.
  • the data management module may be used to manage and update various data.
  • Fig. 4 is a flowchart illustrating an identification procedure in accordance with an embodiment of the present disclosure.
  • This identification procedure may be divided into two parts, the outlier data identification process and the malicious node identification process, which are shown on the left and right of the flowchart, respectively.
  • the outlier data identification process (as shown in blocks 401-405) may be performed to prevent outlier data from entering into data fusion at a node so as to generate beneficial data, which would then be transmitted to the neighbors of the node.
  • the malicious node identification process (as shown in blocks 406-412) may be performed to identify the malicious node and then to solve the attack problem in distributed adaptive networks.
  • the exemplary identification procedure may begin at block 401, at a node such as node k.
  • the similarity distance γ_l,k(i) can be used to determine whether node l is a malicious node, and can be calculated before the data fusion at node k, as shown in block 403.
  • the calculated γ_l,k(i) may be compared with a predefined fusion threshold r at block 404. If γ_l,k(i) ≤ r, ψ_l(i) is regarded as safe for data fusion, and the procedure proceeds to block 402, where the observation time i may be updated. Otherwise, ψ_l(i) would be considered as the value of a malicious node and needs to be processed before data fusion at node k.
  • each node in the network can be evaluated before the process of data fusion via the similarity distance to judge whether the received data may be used in data fusion.
  • the malicious node l can launch an attack with a certain probability p_l, so P_l can be defined to approximate the attack probability p_l as an evaluation indicator of neighbor node l.
  • P_l can be expressed as P_l = m_l / T,
  • where T is the observation cycle and m_l is the number of attacks detected during T.
  • One calculation of P_l may not be enough for analyzing the behavior of node l, so a long-time value accumulated over many observation cycles may be needed to make a more reliable decision.
  • at the end of an observation cycle, the malicious node identification process may be invoked, and the procedure proceeds from block 406 to block 407, where P_l is calculated for node l. Otherwise, the procedure returns to block 402, where the observation time i may be updated.
  • parameters α and β are introduced here, which may be called the identification threshold and the warning threshold, respectively.
  • α and β can be preset. As shown in block 408, if P_l ≥ α, node l may be suspected as a malicious node, and the counter Λ_l is incremented by 1 until Λ_l ≥ β, as shown in blocks 409 and 410. Otherwise, the procedure returns to block 402, where the observation time i may be updated. If Λ_l ≥ β at block 410, node l may be judged as a malicious node, and a warning device or an alarm system may be activated at block 411. Thus, the operator can find and fix malicious nodes in time, which keeps the network in a safe state.
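The per-neighbor bookkeeping of this evaluation can be sketched as follows. The class name and the boolean return convention are illustrative choices; the parameters mirror the observation cycle T, the identification threshold alpha, the warning threshold beta, the attack counter m and the cycle counter Lambda described above.

```python
class NeighborMonitor:
    """Per-neighbor bookkeeping: m counts detected attacks within one
    observation cycle of T points, P = m / T approximates the attack
    probability, and Lambda accumulates suspicious cycles until the
    warning threshold beta is reached."""

    def __init__(self, T, alpha, beta):
        self.T, self.alpha, self.beta = T, alpha, beta
        self.m = 0        # attacks detected in the current cycle
        self.Lambda = 0   # suspicious cycles accumulated so far
        self.points = 0   # observation points seen in this cycle

    def observe(self, flagged):
        """Record one observation point; return True when the neighbor
        should be judged malicious and an alarm raised."""
        self.points += 1
        if flagged:
            self.m += 1
        if self.points == self.T:      # end of observation cycle
            P = self.m / self.T        # approximate attack probability
            if P >= self.alpha:        # this cycle looks suspicious
                self.Lambda += 1
            self.points = 0
            self.m = 0
            if self.Lambda >= self.beta:   # persistent: raise alarm
                return True
        return False
```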
  • the proposed solution introduces the concept of similarity distance and sets the fusion threshold to determine whether the data information received from a node is normal and can be used for data fusion.
  • the proposed solution can effectively identify the malicious nodes launching attacks and solve the attack problem in the distributed adaptive network.
  • the data used, obtained and/or generated in performing the proposed solution may be stored in a distributed manner without a central database.
  • Fig. 5 exemplarily illustrates the system architecture of a data management module in accordance with an embodiment of the present disclosure.
  • a node (such as node k) in the distributed adaptive network may have a data management module or a data manager which is responsible for collecting the data from its neighbors and updating the data and/or indicators used in the related components or modules (such as the evaluation and warning module).
  • the data manager may store the data for node identification in the distributed adaptive network by a database (as shown in Table 1) .
  • the node may not store the data for all nodes of the whole network but only for its neighbors, due to its limited computation capability.
  • Fig. 5 and Table 1 merely schematically show some basic functional components of the data manager and some exemplary data stored in the database.
  • Other possible operational elements and/or information configurations suitable for the proposed identification scheme may also be conceived in view of the exemplary embodiments of the present disclosure.
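One possible shape for such a node-local data manager, sketched below. The record fields, class names and method signatures are illustrative assumptions (Table 1 in the original defines the actual stored data); only neighbor entries are kept, per the limited-capability note above.

```python
from dataclasses import dataclass, field

@dataclass
class NeighborRecord:
    """One row of the node-local identification database; field names
    are illustrative, not taken from Table 1."""
    last_data: list = field(default_factory=list)  # latest data received
    m: int = 0        # attacks detected in the current observation cycle
    Lambda: int = 0   # suspicious observation cycles accumulated

class DataManager:
    """Per-node data manager: stores records only for the node's own
    neighbors, not for all nodes of the whole network."""
    def __init__(self, neighbor_ids):
        self.table = {nid: NeighborRecord() for nid in neighbor_ids}

    def update(self, nid, data, flagged):
        """Record the latest data from neighbor nid; bump its attack
        counter when the outlier check flagged the data."""
        rec = self.table[nid]
        rec.last_data = list(data)
        if flagged:
            rec.m += 1
```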
  • the proposed solution in accordance with the present disclosure does not need a central authority to identify the malicious nodes, and is therefore applicable in distributed networks, where it can achieve good performance.
  • the proposed solution is not limited by mixed attack modes or by the number of malicious nodes, and thus can effectively solve the attack problem in distributed networks and greatly improve their performance.
  • less computation may be needed in the proposed solution, which makes it easy to implement.
  • the effectiveness and advantages of the proposed solution may be demonstrated in distributed parameter estimation via the simulations and performance analysis.
  • Fig. 6 exemplarily illustrates a network topology used in the simulations in accordance with an embodiment of the present disclosure.
  • a distributed network with 16 nodes (comprising one or more malicious nodes) is considered in the simulations. It will be appreciated that the considered distributed network may comprise more or fewer than 16 nodes and may employ other suitable network topologies different from that shown in Fig. 6.
  • MSD as expressed in equation (4) may be used to measure the performance of estimation of the entire network.
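The MSD metric can be sketched as follows. Equation (4) itself is not reproduced in this excerpt, so the network-average mean-square deviation used here is an assumption, albeit a common definition of MSD.

```python
import numpy as np

def network_msd(w0, estimates):
    """Network-average mean-square deviation: (1/N) * sum_k ||w0 - w_k||^2.

    w0: the unknown true parameter vector.
    estimates: list of per-node estimate vectors w_k.
    """
    w0 = np.asarray(w0, dtype=float)
    return float(np.mean([np.sum((w0 - np.asarray(w)) ** 2) for w in estimates]))
```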
  • the unknown M × 1 vector w 0 in MSD may be estimated by using the conventional DLMS algorithm, considering a signal model of the form d k (i) = u k (i) w 0 + v k (i) , where node k has access to the observed realizations {d k (i) , u k (i) } of zero-mean random data, with d k (i) a scalar measurement and u k (i) a 1 × M regression row vector, both at time i; and v k (i) is the background noise.
  • the conventional DLMS algorithm, for example in its adapt-then-combine form, can be described by the adaptation step ψ k (i) = w k (i-1) + μ k u k T (i) [d k (i) - u k (i) w k (i-1) ] followed by the combination step w k (i) = Σ l∈N_k a l, k ψ l (i) , where:
  • w k (i) is the estimation of node k at time i
  • ⁇ k is the step size
  • N k is the collection of node k and its neighbor nodes
  • a l, k is a nonnegative combination weight that node k assigns to data ⁇ l (i) arriving from a member (such as node l) of N k .
  • the coefficients a l, k may be designed to satisfy the following conditions: a l, k ≥ 0, Σ l∈N_k a l, k = 1, and a l, k = 0 if l ∉ N k .
  • the DLMS algorithm may comprise two operations: adaptation and combination.
  • each node updates its estimator using the observed data {d k (i) , u k (i) } in the adaptation stage and collects the estimators from its neighbors in the combination stage. Note that the details of this algorithm are not discussed at length here so as not to unnecessarily obscure the present disclosure.
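The adapt-then-combine recursion described above can be sketched as follows. This is a generic illustration of the standard diffusion LMS update, not code from the disclosure; the argument names and shapes are assumptions.

```python
import numpy as np

def dlms_step(w_prev, d, u, mu, A, neighbors):
    """One adapt-then-combine DLMS iteration over all N nodes.

    w_prev: N x M array of current estimates w_k(i-1).
    d: length-N measurements d_k(i); u: N x M regression rows u_k(i).
    mu: per-node step sizes; A: N x N combination matrix, A[l, k] being
    the weight node k assigns to node l's intermediate estimate.
    neighbors: dict mapping k to the index set N_k (k and its neighbors).
    """
    N, M = w_prev.shape
    psi = np.empty_like(w_prev)
    for k in range(N):                      # adaptation: local LMS update
        err = d[k] - u[k] @ w_prev[k]
        psi[k] = w_prev[k] + mu[k] * err * u[k]
    w_new = np.empty_like(w_prev)
    for k in range(N):                      # combination: fuse neighbor estimates
        w_new[k] = sum(A[l, k] * psi[l] for l in neighbors[k])
    return w_new
```

In an unsafe network, a malicious neighbor would falsify its ψ before the combination step, which is exactly where the proposed fusion check intervenes.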
  • the simulation may be started with two kinds of attack modes such as the same-direction attack and the Gaussian attack.
  • Some parameters and/or thresholds used in the simulation may be estimated or predefined as appropriate.
  • the fusion threshold r may be set to 0.004 as illustrated in Fig. 3.
  • other values of the fusion threshold are also possible.
  • Various performance advantages provided by the proposed solution (which is also denoted as “new scheme” in Figs. 8-9) in accordance with exemplary embodiments are exemplarily illustrated in Figs. 7-9.
  • the proposed solution in accordance with exemplary embodiments can enable a node to detect abnormal data from its neighbors at every time instant under different attack modes and to calculate an approximation of the attack probability.
  • node 1 and node 3 as shown in Fig. 6 are malicious, with attack probabilities of 0.2 and 0.6, respectively.
  • the results at node 2 are observed during 10 observation cycles, with T set to 1000 time instants.
  • Fig. 7 exemplarily illustrates the detection of the number of attacks (denoted as m) under two kinds of attack modes in accordance with an embodiment of the present disclosure. As shown in Fig. 7, under either the same-direction attack mode or the Gaussian attack mode, node 2 can accurately detect the number of attacks in every observation cycle T.
  • the calculated attack probabilities P 1 and P 3 are around 0.2 and 0.6, respectively, approaching the real attack probabilities.
  • Fig. 8 exemplarily illustrates the comparison of DLMS performance in the unsafe situation in accordance with an embodiment of the present disclosure.
  • Fig. 8 shows the comparison between the conventional DLMS algorithm and the DLMS algorithm using the proposed new scheme when node 3 is malicious.
  • the MSD curves of the conventional DLMS algorithm begin to rapidly deteriorate with the increase of attack probability.
  • Fig. 9 exemplarily illustrates the impact of the increased number of malicious nodes on the network performance in accordance with an embodiment of the present disclosure.
  • the various blocks shown in Figs. 1-6 may be viewed as method steps, and/or as operations that result from operation of computer program code, and/or as a plurality of coupled logic circuit elements constructed to carry out the associated function (s) .
  • the schematic flow chart diagrams described above are generally set forth as logical flow chart diagrams. As such, the depicted order and labeled steps are indicative of specific embodiments of the presented methods. Other steps and methods may be conceived that are equivalent in function, logic, or effect to one or more steps, or portions thereof, of the illustrated methods. Additionally, the order in which a particular method occurs may or may not strictly adhere to the order of the corresponding steps shown.
  • Fig. 10 is a simplified block diagram of an apparatus suitable for use in practicing exemplary embodiments of the present disclosure.
  • the apparatus 1000 shown in Fig. 10 may be configured to perform various operations and functions of a node in distributed networks according to exemplary embodiments as described in connection with Fig. 1-6. Particularly, the apparatus 1000 may be configured to function as the second node according to exemplary embodiments as described in connection with Fig. 2.
  • the apparatus 1000 may comprise a processor (PROC) 1010, a memory (MEM) 1020 which stores computer program codes (PROG) 1030, and a suitable communication unit (COM) 1040 (such as a transceiver, a receiver and/or a transmitter, optionally in communication with an antenna) for coupling or connecting to another apparatus such as a network node, a communication device, an interactive entity, wireless terminal, wired terminal, a user equipment, a server, a client, a peripheral device, a database and so on.
  • the communication unit 1040 may be configured to support the apparatus 1000 to transmit/receive signals and messages to/from another apparatus.
  • the processor 1010 may be used for processing these signals and messages.
  • one processor and one memory are shown in Fig. 10, but it will be appreciated that other examples may utilize more than one processor and/or more than one memory (for example, same or different processor/memory types) .
  • the processor 1010 may be embodied as various means for implementing the various functionalities of exemplary embodiments of the present disclosure comprising, for example, a microprocessor, a coprocessor, a controller, a general purpose computer, a special-purpose integrated circuit such as, for example, an Application Specific Integrated Circuit (ASIC) , a Field Programmable Gate Array (FPGA) , or a hardware accelerator, processing circuitry or the like.
  • the processor 1010 may be representative of a plurality of processors, or one or more multiple core processors, operating in concert.
  • the processor 1010 may, but need not, comprise one or more accompanying Digital Signal Processors (DSPs) .
  • the processor 1010 is configured to execute instructions stored in the memory device or instructions otherwise accessible to the processor 1010.
  • the processor 1010 may be configured to operate such that the processor 1010 causes the apparatus 1000 to perform various functionalities described herein.
  • the memory 1020 may be one or more computer-readable storage media that may comprise volatile and/or non-volatile memory.
  • the memory 1020 comprises Random Access Memory (RAM) including dynamic and/or static RAM, on-chip or off-chip cache memory, and/or the like.
  • the memory 1020 may comprise non-volatile memory, which may be embedded and/or removable, and may include, for example, read-only memory, flash memory, magnetic storage devices (for example, hard disks, floppy disk drives, magnetic tape, etc. ) , optical disc drives and/or media, non-volatile random access memory (NVRAM) , and/or the like.
  • the memory 1020 may comprise a cache area for temporary storage of data.
  • the memory 1020 may be included within the processor 1010. Further, the memory 1020 may be configured to store information, data, applications, computer-readable program code instructions, and/or the like for enabling the processor 1010 and the apparatus 1000 to carry out various functions in accordance with exemplary embodiments described herein. For example, the memory 1020 may be configured to buffer input data for processing by the processor 1010. Additionally or alternatively, the memory 1020 may be configured to store instructions for execution by the processor 1010.
  • the computer program codes 1030 may be stored on a memory device, such as the memory 1020, and executed by a processor, such as the processor 1010, to enable the apparatus 1000 to operate in accordance with the exemplary embodiments, as discussed above. That is, the exemplary embodiments of the present disclosure may be implemented at least in part by computer software executable by the processor 1010, or by hardware, or by a combination of software and hardware. As will be appreciated, any such computer program codes may be loaded onto a computer or other programmable apparatus from a computer-readable storage medium to produce a particular machine, such that the particular machine becomes means for implementing the functions specified in the flowchart’s block (s) or operation (s) .
  • These computer program codes may also be stored in a computer-readable storage medium that can direct a computer, a processor, or other programmable apparatus to function in a particular manner to thereby generate a particular machine or particular article of manufacture that becomes means for implementing the functions specified in the flowchart’s block (s) or operation (s) .
  • the computer program codes may be retrieved from a computer-readable storage medium and loaded into a computer, processor, or other programmable apparatus to configure the computer, processor, or other programmable apparatus to execute operations to be performed on or by the computer, processor, or other programmable apparatus.
  • the apparatus 1000 may comprise various means and/or modules for implementing functions of the foregoing steps and methods in Figs. 1-6.
  • the apparatus 1000 may serve as the second node as described with respect to Fig. 2 and comprise: collection means for collecting first data information transmitted at an observation point from another apparatus (such as the first node as described with respect to Fig. 2) to the apparatus 1000 in the distributed network; comparison means for comparing the first data information with second data information transmitted at the observation point by the apparatus 1000 to its neighbors in the distributed network; and determination means for determining whether the first data information is to be used by the apparatus 1000 in data fusion associated with the observation point, based at least in part on the comparison of the first data information with the second data information.
  • the various exemplary embodiments may be implemented in hardware or special purpose circuits, software, logic or any combination thereof.
  • some aspects may be implemented in hardware, while other aspects may be implemented in firmware or software which may be executed by a controller, microprocessor or other computing device, although the disclosure is not limited thereto.
  • While various aspects of the exemplary embodiments of this disclosure may be illustrated and described as block diagrams, flow charts, or using some other pictorial representation, it is well understood that these blocks, apparatus, systems, techniques or methods described herein may be implemented in, as non-limiting examples, hardware, software, firmware, special purpose circuits or logic, general purpose hardware or controller or other computing devices, or some combination thereof.
  • exemplary embodiments of the disclosures may be embodied in computer-executable instructions, such as in one or more program modules, executed by one or more computers or other devices.
  • program modules include routines, programs, objects, components, data structures, etc. that perform particular tasks or implement particular abstract data types when executed by a processor in a computer or other device.
  • the computer executable instructions may be stored on a computer readable medium such as a hard disk, optical disk, removable storage media, solid state memory, RAM, etc.
  • the functionality of the program modules may be combined or distributed as desired in various embodiments.
  • the functionality may be embodied in whole or in part in firmware or hardware equivalents such as integrated circuits, FPGA, and the like.


Abstract

A method for distributed networks may comprise: collecting first data information transmitted at an observation point from a first node to a second node in a distributed network, wherein the second node is a neighbor node of the first node; comparing the first data information with second data information transmitted at the observation point by the second node to its neighbor nodes in the distributed network; and determining whether the first data information is to be used by the second node in data fusion associated with the observation point, based at least in part on the comparison of the first data information with the second data information.

Description

NODE IDENTIFICATION IN DISTRIBUTED ADAPTIVE NETWORKS FIELD OF THE INVENTION
The present disclosure generally relates to communication networks, and more specifically, relates to distributed adaptive networks.
BACKGROUND
Communication networks can be deployed in a centralized or distributed manner. In centralized strategies, all nodes transmit their information to a central authority for decision making or data fusion, which can fully benefit from the information collected throughout the network. Unfortunately, the central authority can easily become a bottleneck of the network: a failure of the central node, for example due to a denial-of-service attack, could be disastrous. To overcome this limitation, distributed strategies, which operate without a central authority, have attracted more attention for their good performance and robustness to attacks. In such a case, many kinds of distributed algorithms are needed to fulfill the parameter estimation.
On the other hand, networks are not always in a secure situation. For example, malicious nodes may tamper with their own values and transmit them to their neighbor nodes for data fusion, which can degrade the performance of the entire network during the data diffusion among the nodes. How to identify a malicious node and then prevent its values from being transmitted to its neighbor nodes has therefore become a crucial issue for distributed networks.
SUMMARY
The present description proposes a novel solution for distributed networks,  which may be utilized to identify some specified data information and network nodes, for example, abnormal data and related nodes in the distributed networks.
According to a first aspect of the present disclosure, there is provided a method comprising: collecting first data information transmitted at an observation point from a first node to a second node in a distributed network, wherein the second node is a neighbor node of the first node; comparing the first data information with second data information transmitted at the observation point by the second node to its neighbor nodes in the distributed network; and determining whether the first data information is to be used by the second node in data fusion associated with the observation point, based at least in part on the comparison of the first data information with the second data information.
According to a second aspect of the present disclosure, there is provided an apparatus comprising at least one processor and at least one memory comprising computer program code, the at least one memory and the computer program code configured to, with the at least one processor, cause the apparatus at least to: collect first data information transmitted at an observation point from another apparatus to the apparatus in a distributed network, wherein the apparatus is a neighbor of the another apparatus; compare the first data information with second data information transmitted at the observation point by the apparatus to its neighbors in the distributed network; and determine whether the first data information is to be used by the apparatus in data fusion associated with the observation point, based at least in part on the comparison of the first data information with the second data information.
According to a third aspect of the present disclosure, there is provided a computer program product comprising a computer-readable medium bearing computer program codes embodied therein for use with a computer, the computer program codes comprising: code for collecting first data information transmitted at an observation point from a first node to a second node in a distributed network, wherein  the second node is a neighbor node of the first node; code for comparing the first data information with second data information transmitted at the observation point by the second node to its neighbor nodes in the distributed network; and code for determining whether the first data information is to be used by the second node in data fusion associated with the observation point, based at least in part on the comparison of the first data information with the second data information.
According to a fourth aspect of the present disclosure, there is provided an apparatus comprising: collection means for collecting first data information transmitted at an observation point from another apparatus to the apparatus in a distributed network, wherein the apparatus is a neighbor of the another apparatus; comparison means for comparing the first data information with second data information transmitted at the observation point by the apparatus to its neighbors in the distributed network; and determination means for determining whether the first data information is to be used by the apparatus in data fusion associated with the observation point, based at least in part on the comparison of the first data information with the second data information.
In accordance with an exemplary embodiment, the apparatus according to the second/fourth aspect of the present disclosure may comprise the second node, and the another apparatus according to the second/fourth aspect of the present disclosure may comprise the first node.
In accordance with an exemplary embodiment, the comparison of the first data information with the second data information may comprise calculating a similarity distance between the first data information and the second data information. The similarity distance may represent a distance between values of the first data information and the second data information.
In accordance with an exemplary embodiment, it is determined that the first data information is to be used by the second node in the data fusion associated with the  observation point, in response to the similarity distance between the first data information and the second data information not satisfying a first criterion.
In accordance with another exemplary embodiment, it is determined that the first data information is not to be used by the second node in the data fusion associated with the observation point, in response to the similarity distance between the first data information and the second data information satisfying a first criterion. For example, the first data information may be replaced with the second data information by the second node in the data fusion associated with the observation point.
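The fusion decision described in the two preceding embodiments can be sketched as follows. The squared Euclidean distance is an assumption for illustration: the disclosure only requires some similarity distance between the values of the two pieces of data information, compared against a first criterion (here a fusion threshold r).

```python
import numpy as np

def similarity_distance(x_l, x_k):
    """Distance between the first data information x_l (from the neighbor)
    and the second data information x_k (the node's own transmitted data).
    Squared Euclidean distance is used here as an illustrative choice."""
    x_l, x_k = np.asarray(x_l, dtype=float), np.asarray(x_k, dtype=float)
    return float(np.sum((x_l - x_k) ** 2))

def fuse_or_replace(x_l, x_k, r):
    """Return the data actually used in fusion: keep x_l if the similarity
    distance stays below the fusion threshold r (first criterion not
    satisfied); otherwise replace it with the node's own data x_k."""
    return x_l if similarity_distance(x_l, x_k) < r else x_k
```

With the fusion threshold r = 0.004 used in the simulations, slightly noisy but honest data passes through while a falsified value is replaced by the node's own data.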
Optionally, the method according to the first aspect of the present disclosure may further comprise: updating first identification information for the first node, in response to determining that the first data information is not to be used by the second node in the data fusion associated with the observation point. The first identification information may indicate, within an observation cycle comprising one or more observation points, a number of observation points at which data information from the first node is determined not to be used in data fusion by the second node.
In accordance with an exemplary embodiment, the method according to the first aspect of the present disclosure may further comprise: updating second identification information for the first node, in response to the first identification information at the end of the observation cycle satisfying a second criterion. The second identification information may indicate a number of observation cycles at the end of which the first identification information for the first node satisfies the second criterion.
In accordance with an exemplary embodiment, the method according to the first aspect of the present disclosure may further comprise: triggering a predefined action for the first node, in response to the second identification information satisfying a third criterion. For example, the predefined action may comprise raising an alarm.
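The identification bookkeeping of the three preceding paragraphs can be sketched as follows. The names cycle_length, suspect_ratio, and max_suspect_cycles stand in for the observation cycle T and the preset second and third criteria; they are assumptions of this sketch rather than terms from the disclosure.

```python
class NeighborMonitor:
    """Tracks, per neighbor, the first and second identification information."""
    def __init__(self, cycle_length, suspect_ratio, max_suspect_cycles):
        self.T = cycle_length                 # observation points per cycle
        self.suspect_ratio = suspect_ratio    # second criterion (per cycle)
        self.max_suspect_cycles = max_suspect_cycles  # third criterion
        self.rejected = 0       # first identification information
        self.suspect_cycles = 0 # second identification information
        self.seen = 0

    def record(self, data_rejected):
        """Call once per observation point; returns True when the predefined
        action (e.g. raising an alarm) should be triggered."""
        self.seen += 1
        if data_rejected:
            self.rejected += 1
        alarm = False
        if self.seen == self.T:               # end of the observation cycle
            if self.rejected / self.T > self.suspect_ratio:
                self.suspect_cycles += 1
                if self.suspect_cycles >= self.max_suspect_cycles:
                    alarm = True
            self.seen = 0                     # start a new cycle
            self.rejected = 0
        return alarm
```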
BRIEF DESCRIPTION OF THE DRAWINGS
The disclosure itself, the preferable mode of use and further objectives are best understood by reference to the following detailed description of the embodiments when read in conjunction with the accompanying drawings, in which:
Fig. 1 exemplarily illustrates a distributed network topology and data transmissions among nodes in accordance with an embodiment of the present disclosure;
Fig. 2 is a flowchart illustrating a method for distributed networks in accordance with an embodiment of the present disclosure;
Fig. 3 exemplarily illustrates the effects of network performance as the change of the predefined fusion threshold in the absence of attacks in accordance with an embodiment of the present disclosure;
Fig. 4 is a flowchart illustrating an identification procedure in accordance with an embodiment of the present disclosure;
Fig. 5 exemplarily illustrates the system architecture of a data management module in accordance with an embodiment of the present disclosure;
Fig. 6 exemplarily illustrates a network topology used in the simulations in accordance with an embodiment of the present disclosure;
Fig. 7 exemplarily illustrates the detection of the number of attacks under two kinds of attack modes in accordance with an embodiment of the present disclosure;
Fig. 8 exemplarily illustrates the comparison of DLMS performance in the unsafe situation in accordance with an embodiment of the present disclosure;
Fig. 9 exemplarily illustrates the impact of the increased number of malicious nodes on the network performance in accordance with an embodiment of the present disclosure; and
Fig. 10 is a simplified block diagram of an apparatus suitable for use in  practicing exemplary embodiments of the present disclosure.
DETAILED DESCRIPTION OF THE INVENTION
The embodiments of the present disclosure are described in details with reference to the accompanying drawings. Reference throughout this specification to features, advantages, or similar language does not imply that all of the features and advantages that may be realized with the present disclosure should be or are in any single embodiment of the disclosure. Rather, language referring to the features and advantages is understood to mean that a specific feature, advantage, or characteristic described in connection with an embodiment is included in at least one embodiment of the present disclosure. Furthermore, the described features, advantages, and characteristics of the disclosure may be combined in any suitable manner in one or more embodiments. One skilled in the relevant art will recognize that the disclosure may be practiced without one or more of the specific features or advantages of a particular embodiment. In other instances, additional features and advantages may be recognized in certain embodiments that may not be present in all embodiments of the disclosure.
Distributed adaptation over networks has emerged as an attractive and challenging research area with the advent of multiagent (wireless or wired) networks. Distributed adaptive networks are well-suited for the decentralized information processing and optimization tasks to model various types of self-organized and complex behaviors encountered in engineering, such as target localization, cooperative spectrum sensing, distributed estimation, etc. Distributed adaptive networks consist of a collection of nodes with processing and learning abilities. The interconnected nodes form a connecting topology, and cooperate with each other through local interactions to perform preassigned tasks in real time.
In the adaptive networks, the nodes share information with each other, and the continuous information diffusion across the network enables the nodes to adapt their performance to the data stream and network conditions. It can also achieve improved adaptation and learning performance compared with non-cooperative nodes. However, the distributed networks may be in an insecure situation where malicious nodes may deteriorate the performance of the entire network during the data diffusion among the nodes.
When a small number of malicious nodes exist in the adaptive networks, some technical problems have to be confronted. For example, a malicious node can launch different kinds of attacks, with a certain probability, against its neighbor nodes at any time via different ways of falsifying data. Therefore, it is hard to control and predict the attack behavior of a malicious node. A malicious node may transmit its falsified value to the regular nodes, which can cause catastrophic damage to the performance of the entire network, due to the use of the falsified value in data fusion, and thus render a well-designed distributed system useless. With an increased number of malicious nodes, the regular nodes would suffer fiercer attacks, which may lead to more serious damage to the entire network.
There are some possible schemes designed to deal with the problem of malicious nodes in distributed adaptive networks. For example, one way to detect malicious nodes is the sort-based method. In this method, the central authority sorts the values received from its neighbors and regards the values at both ends of the sorted list as malicious. However, when confronting different attack modes, the values of malicious nodes may change randomly, which makes it hard to identify a malicious node via its value position alone. The effectiveness of the method is also sensitive to the network topology and the type of malicious nodes. Another possible scheme relies on the differences between the value of each node and the average value of all neighbor nodes to identify the malicious nodes, assuming that the malicious node has the biggest difference from the average value. However, this scheme is only useful when facing a constant attack; that is, it is also limited by the attack modes and the increased number of malicious nodes. An improved scheme also uses the differences between the value of a single node and the average value of all nodes to identify malicious nodes. However, it needs a central authority to complete the whole procedure, and the central authority plays a key role for the entire network, which imposes too much pressure on it.
It can be seen that the mentioned schemes are only for particular attack modes, and the number of malicious nodes is also a great limitation on the performance of such schemes. However, there actually exist malicious nodes with one or mixed attack mode in real networks.
In comparison to the current solutions, methods, apparatus, and computer program products provided according to exemplary embodiments of the present disclosure can provide a novel and effective solution to identify the malicious nodes in distributed networks. The node identification scheme proposed in the present disclosure could almost eliminate the influences of malicious nodes and greatly improve the performance of entire network.
Fig. 1 exemplarily illustrates a distributed network topology and data transmissions among nodes in accordance with an embodiment of the present disclosure. The distributed network as shown in Fig. 1 is a connected network consisting of N nodes. Each node has the ability of communicating with its neighbors and processing data. It will be realized that Fig. 1 merely shows schematically a topology structure and data transmissions of the exemplary distributed network and other possible implementations of the distributed network may also be conceived in view of the exemplary embodiments of the present disclosure.
For data communications, a node in the distributed network as shown in Fig. 1 may generate data at a time instant and broadcast it to all of its neighbor nodes for cooperation. For example, node l may generate data (which may be represented by a data vector ψ l (i) with M components) at time i and transmit it to all of its neighbor nodes (such as neighbor node 2, node 3 and node k) . On the other hand, the node may receive data from its neighbor nodes and perform data fusion by using the received data and its own data. For example, the data of node k and all data from its neighbors (which may be denoted as the data collection φ k (i) = [ψ 1 (i) , … ψ k-1 (i) , ψ k (i) , … ψ N (i) ] ) would be used for data fusion at node k.
If the entire network is safe, these nodes cooperate with each other to make beneficial contributions to the network, which is the ideal situation. In this situation, the values of these nodes are the same. Here the value of a node may refer to the specific value of data transmitted by the node. For example, if all nodes cooperate with each other to estimate the temperature, then the value of a node refers to the temperature estimated and transmitted by the node.
However, a few malicious nodes in the network will change the situation. For example, node l, as one neighbor of node k, may be one of the malicious nodes in the network. Though there are various kinds of attack modes, the vector ψ l (i) may be falsified with a certain probability in two forms: the same-direction attack and the Gaussian attack.
For the same-direction attack, the attacker (such as node l) may launch attacks at time i, with probability p l , via tampering with its value either in an increasing way or in a decreasing way, which results in ψ l (i) lying at either end of φ k (i) after the sorting operation on φ k (i) . For example, the same-direction attack can be expressed as follows: the j-th component ψ l j (i) becomes λ (i) ψ l j (i) with probability p l , and remains ψ l j (i) with probability 1-p l ; that is, node l, if launching an attack, may multiply the j-th component ψ l j (i) by λ (i) at time i with probability p l and, if not attacking, do nothing with probability 1-p l .
Differing from the same-direction attack, the Gaussian attack may randomly generate a Gaussian value as a component of ψl. Mathematically, this attack mode can be written as:

ψl, j (i) ~ N (αl, σl²) with probability pl, or remains unchanged with probability 1-pl                    (2)

where the j-th component ψl, j (i) may be drawn from the Gaussian distribution with mean αl and variance σl² at time i with probability pl, and, if not attacking, node l does nothing with probability 1-pl.
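The two attack modes above can be sketched in code. The per-component application of the attack probability, and all function and parameter names, are illustrative assumptions rather than details taken from the disclosure.

```python
import numpy as np

def same_direction_attack(psi, p, lam, rng):
    """Same-direction attack: scale components of psi by a common factor lam,
    each with probability p, so the falsified value drifts in one direction."""
    psi = psi.copy()
    mask = rng.random(psi.shape) < p
    psi[mask] *= lam
    return psi

def gaussian_attack(psi, p, alpha, sigma, rng):
    """Gaussian attack: replace components of psi, each with probability p,
    by draws from a Gaussian with mean alpha and standard deviation sigma."""
    psi = psi.copy()
    mask = rng.random(psi.shape) < p
    psi[mask] = rng.normal(alpha, sigma, size=int(mask.sum()))
    return psi

rng = np.random.default_rng(0)
psi = np.ones(4)  # honest data vector of node l
falsified = same_direction_attack(psi, p=1.0, lam=3.0, rng=rng)
untouched = gaussian_attack(psi, p=0.0, alpha=5.0, sigma=1.0, rng=rng)
```

With p=1.0 every component is scaled (here to 3.0), while with p=0.0 the vector passes through unchanged; the honest vector psi itself is never modified in place.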
It is worth noting that most nodes in the distributed network or distributed adaptive network are rational nodes. From the social perspective, for example, a person is willing to allow his/her friends into his/her home because of the close relationships between them. Conversely, the malicious person who intends to harm others will not get any permission because of his/her deteriorated relationship with regular persons. Thus, there may be a standard, as a designed metric, to measure this kind of relationship. In view of this analysis, a novel and effective solution is proposed in the present disclosure to identify some specified nodes (such as the malicious nodes) and solve the attack problem in the distributed adaptive network.
Fig. 2 is a flowchart illustrating a method for distributed networks in accordance with an embodiment of the present disclosure. The method illustrated in Fig. 2 may be performed, for example, at a node or any other network element with node functionality in a distributed network. It will be realized that the distributed network to which the proposed method may be applicable may comprise the network as shown in Fig. 1 or any other adaptive network with distributed deployment.
A plurality of nodes comprising a first node and a second node in the distributed network may cooperate with each other through data communications and information exchanges. According to the method illustrated in Fig. 2, the second node which is a neighbor node of the first node may collect first data information  transmitted at an observation point (such as time instant i) from the first node to the second node in the distributed network, as shown in block 202. For example, the first data information may comprise data information measured, sensed, generated and/or derived by the first node.
In accordance with an exemplary embodiment, the second node may also collect data information transmitted at this observation point from its other neighbor nodes in addition to the first node. The collected data information may be used by the second node, together with its own data information, to perform information diffusion or data diffusion. Data diffusion across the network may increase the performance, but also the risk in unsafe situations. So it is desirable to find ways to prevent the values of malicious nodes from entering the process of data fusion.
Since all nodes in the distributed network may have the same task, such as estimation or tracking, the values of regular nodes are similar to each other. However, the values of malicious nodes will deviate from the normal range, which can be used to detect the malicious nodes. Accordingly, data information from a detected malicious node may be prevented from being used in data fusion.
Thus, as shown in block 204, the second node may compare the first data information with second data information transmitted at the observation point by the second node to its neighbor nodes in the distributed network. In block 206, the second node can determine whether the first data information is to be used by the second node in data fusion associated with the observation point, based at least in part on the comparison of the first data information with the second data information.
In accordance with an exemplary embodiment, the comparison of the first data information with the second data information may comprise calculating a similarity distance between the first data information and the second data information. The similarity distance may represent a distance between values of the first data information and the second data information. For example, the similarity distance ηl, k (i) may be defined as:

ηl, k (i) = ‖ψl (i) -ψk (i) ‖                    (3)

where ψl (i) and ψk (i) respectively represent the data information transmitted by node l and node k at time i, and ηl, k (i) represents the distance between the value of node k and that of its neighbor node l at time i.
If node l is assumed to be a malicious node, the distance between ψl (i) and ψk (i) is greater than the distance between ψk (i) and the data of the other regular neighbor nodes. Then, the similarity distance ηl, k (i) can be used as an indicator of the malicious node, and can be calculated before the data fusion at node k.
This principle may be taken into account in the proposed method described with respect to Fig. 2. As such, in response to the similarity distance between the first data information and the second data information not satisfying a first criterion, it is determined that the first data information is to be used by the second node in the data fusion associated with the observation point. Otherwise, in response to the similarity distance between the first data information and the second data information satisfying the first criterion, it is determined that the first data information is not to be used by the second node in the data fusion associated with the observation point.
If the first data information is determined not to be used by the second node in the data fusion, for example, the first data information may be replaced with the second data information by the second node in the data fusion associated with the observation point. This enables the second node to eliminate or reduce the adverse effects on the data fusion caused by the abnormal data from the first node, while maintaining the normal number of contributors to the data fusion.
In accordance with an exemplary embodiment, the first criterion may comprise the similarity distance ηl, k (i) being larger than a predefined fusion threshold r. As such, if the first criterion is satisfied (ηl, k (i) >r) , then ψl (i) would be considered as the value of a malicious node and needs to be processed before data fusion at node k. Otherwise, ψl (i) may be regarded as safe for data fusion. Through this identification scheme, each node in the network can be evaluated, via the similarity distance, before the process of data fusion, to judge whether it may be considered a malicious node or not.
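The similarity-distance check against the fusion threshold can be sketched as follows; the use of the Euclidean norm and the helper names are assumptions for illustration, and r = 0.004 is the threshold value arrived at in the Fig. 3 discussion.

```python
import numpy as np

FUSION_THRESHOLD_R = 0.004  # value determined via simulation (see Fig. 3)

def similarity_distance(psi_l, psi_k):
    """Distance between the values of neighbor node l and node k at time i."""
    return float(np.linalg.norm(psi_l - psi_k))

def accept_for_fusion(psi_l, psi_k, r=FUSION_THRESHOLD_R):
    """First criterion: psi_l enters data fusion only if the distance <= r."""
    return similarity_distance(psi_l, psi_k) <= r

psi_k = np.array([1.000, 2.000])          # node k's own data
psi_regular = np.array([1.001, 2.001])    # close to node k's value
psi_malicious = np.array([3.000, 2.000])  # falsified value
ok = accept_for_fusion(psi_regular, psi_k)
rejected = not accept_for_fusion(psi_malicious, psi_k)
```

The regular neighbor's vector lies within the threshold and is accepted, while the falsified vector is flagged for processing before fusion.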
Fig. 3 exemplarily illustrates the effects on network performance as the predefined fusion threshold changes, in the absence of attacks, in accordance with an embodiment of the present disclosure. The Mean Square Deviation (MSD) curve may be used to measure the performance of estimation of the entire network, and MSD can be expressed as:

MSD (i) = (1/N) ∑k=1, …, N E‖w0-wk (i) ‖²                    (4)

where w0 is an M×1 target vector which may be estimated by using the conventional Diffusion Least Mean Square (DLMS) algorithm (which is a kind of algorithm in distributed networks and will be described later) , wk (i) is the estimated vector of node k at time i, N represents the total number of nodes in the whole network, and E denotes mathematical expectation. The equation amounts to averaging the squared differences between the estimated vectors of the individual nodes and the target vector at time i.
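For a single simulation run, the expectation in the MSD can be approximated by averaging over the nodes. A minimal sketch, with assumed array shapes:

```python
import numpy as np

def msd_db(w0, estimates):
    """Mean Square Deviation in dB for one time instant.

    w0        : (M,) target vector
    estimates : (N, M) array whose row k is node k's estimate wk(i)
    The expectation E is approximated by the average over the N nodes.
    """
    msd = np.mean(np.sum((estimates - w0) ** 2, axis=1))
    return 10.0 * np.log10(msd)

w0 = np.zeros(2)
estimates = np.array([[0.1, 0.0],
                      [0.0, 0.1]])  # two nodes, each off by 0.1 in one component
value = msd_db(w0, estimates)
```

Here each node's squared deviation is 0.01, so the averaged MSD is 0.01, i.e. -20 dB.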
The predefined fusion threshold r can be obtained through simulations after the network configuration. Fig. 3 shows the determination of r in distributed estimation. As seen in Fig. 3, the nodes of the entire network work independently without any information transmission when r = 0, leading to the worst performance of estimation. With the increase of the fusion threshold r, the estimations from neighbor nodes are allowed into the process of data fusion, gradually improving the performance. The performance curve at r = 0.004 is almost the same as that of the DLMS algorithm without any attacks. Thus, r = 0.004 may be chosen as the value of the predefined fusion threshold. It is worth noting that increasing the threshold r benefits performance, but that does not mean the bigger the better: too large a threshold may allow the estimations of malicious nodes to enter the process of data fusion. It will be realized that the predefined fusion threshold r may also be set to any other suitable value as required.
The above method described in combination with Fig. 2 can eliminate the impact of an attack on regular nodes at time i by preventing abnormal or falsified data from entering data fusion. However, a node cannot be judged as malicious or not only through short-term effects, which may be caused by changes of environmental factors such as channel fading and shadow effects. So a long-time observation may increase the accuracy in identifying an abnormal node or determining the malicious node.
Thus, according to an exemplary embodiment, first identification information (such as ml described with respect to Fig. 4) for the first node may be defined to indicate, within an observation cycle or period (such as T described with respect to Fig. 4) comprising one or more observation points, a number of observation points at which data information from the first node is determined not to be used in data fusion by the second node. Thus, the first identification information for the first node may be updated in response to determining that the first data information is not to be used by the second node in the data fusion associated with the observation point.
According to the exemplary embodiment, second identification information (such as τl described with respect to Fig. 4) for the first node may be defined to indicate a number of observation cycles at the end of which the first identification information for the first node satisfies a second criterion. Thus, the second identification information for the first node may be updated in response to the first identification information at the end of the observation cycle satisfying the second criterion.
For example, the second criterion may comprise the attack probability (which  may be represented by a ratio of the value of the first identification information to the observation cycle) being equal to or larger than a predefined identification threshold ρ. As such, if the second criterion is satisfied, which means that the attack probability is not below the predefined identification threshold ρ, then the first node may be suspected as a malicious node; otherwise, the first node would not be regarded as a malicious node.
In an exemplary embodiment, a predefined action may be triggered for the first node in response to the second identification information satisfying a third criterion. For example, the third criterion may comprise the value of the second identification information being equal to or larger than a predefined warning threshold ε. As such, if the third criterion is satisfied, which means that the first node has been suspected as an abnormal node for no fewer than ε observation cycles, then at least one action may be triggered for the first node. For example, the predefined action may comprise raising an alarm.
It will be realized that the first, second and third criteria as previously described may be predefined or specified as required. The parameters, thresholds and rules associated with these criteria as well as the relationships thereof may vary, for example, as the network environment and the type of nodes which need to be identified change.
The proposed solution for node identification in the distributed network described with respect to Fig. 2 may be implemented with various suitable functional modules and/or network components. For example, the node identification scheme designed in accordance with exemplary embodiments of the present disclosure may be performed with three functional modules, such as an outlier data detection module, an evaluation and warning module, and a data management module. The outlier data detection module may be employed to identify abnormal information from neighbor nodes. The evaluation and warning module may be in charge of determining  malicious nodes and raising an alarm in time. The data management module may be used to manage and update various data.
Fig. 4 is a flowchart illustrating an identification procedure in accordance with an embodiment of the present disclosure. This identification procedure may be divided into two parts, the outlier data identification process and the malicious node identification process, which are shown in the left and right parts of the flowchart, respectively. Specifically, the outlier data identification process (as shown in blocks 401-405) may be performed to prevent outlier data from entering into data fusion at a node so as to generate beneficial data, and then the beneficial data would be transmitted to the neighbors of the node. The malicious node identification process (as shown in blocks 406-412) may be performed to identify the malicious node and then to solve the attack problem in distributed adaptive networks.
The exemplary identification procedure may begin at block 401. As shown in Fig. 4, a node (such as node k) may collect information from its neighbors at block 402. For example, this node can obtain the data collection (such as φk (i) = [ψl (i) , … ψk-1 (i) , ψk (i) , … ψN (i) ] ) comprising its own data and the data collected from all of its neighbors for data fusion associated with observation time i. As described with respect to Fig. 2, the similarity distance ηl, k (i) can be used to determine whether node l is a malicious node, and can be calculated before the data fusion at node k, as shown in block 403. The calculated ηl, k (i) may be compared with a predefined fusion threshold r at block 404. If ηl, k (i) ≤r, ψl (i) is regarded as safe for data fusion, and the procedure proceeds to block 402, where the observation time i may be updated. Otherwise, ψl (i) would be considered as the value of a malicious node and needs to be processed before data fusion at node k. For example, as shown in block 405, the number of attacks from node l detected during an observation cycle T may be updated by ml = ml+1, and ψl (i) may be replaced with ψk (i) to prevent the falsified value of malicious node l from entering the data fusion at node k. Through this outlier data identification process, each node in the network can be evaluated before the process of data fusion via the similarity distance to judge whether the received data may be used in data fusion.
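One pass of the outlier data identification process (blocks 402-405) can be sketched as follows; the dictionary-based bookkeeping and function name are illustrative assumptions.

```python
import numpy as np

def outlier_screen(psi_k, neighbor_data, attack_counts, r=0.004):
    """Screen each neighbor's vector before data fusion at node k.

    neighbor_data : {node_id: psi_l} received at observation time i
    attack_counts : {node_id: m_l}, updated in place when an attack is detected
    Returns the data actually fed into fusion, with outliers replaced by psi_k.
    """
    fused_input = {}
    for l, psi_l in neighbor_data.items():
        if np.linalg.norm(psi_l - psi_k) > r:                 # block 404: distance > r
            attack_counts[l] = attack_counts.get(l, 0) + 1    # block 405: m_l = m_l + 1
            fused_input[l] = psi_k                            # replace the falsified value
        else:
            fused_input[l] = psi_l                            # safe for data fusion
    return fused_input

psi_k = np.array([1.0, 1.0])
received = {1: np.array([1.001, 1.0]),   # regular neighbor
            3: np.array([5.0, 1.0])}     # falsified value
counts = {}
safe = outlier_screen(psi_k, received, counts)
```

Replacing the rejected vector with psi_k keeps the number of contributors to the fusion unchanged, as described above.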
As described with respect to Fig. 1, the malicious node l can launch an attack with certain probability pl, so Pl can be defined to approximate the attack probability pl as the evaluating indicator of neighbor node l. Pl can be expressed as follows:
Pl = ml/T                    (5)
where T is the observation cycle and ml is the number of attacks detected during T. A single calculation of Pl may not be enough for analyzing the behavior of node l, so a value accumulated over many observation cycles may be needed to make a more reliable decision. If the number of estimations accumulated in the current observation cycle (which may be represented by the latest updated value of i) is equal to the observation cycle T, the malicious node identification process may be invoked. Then the procedure proceeds from block 406 to block 407, where Pl is calculated for node l. Otherwise, the procedure returns to block 402, where the observation time i may be updated.
In order to improve the procedure, the parameters ρ and ε are introduced here, which may be called the identification threshold and the warning threshold, respectively. According to different conditions and requirements, ρ and ε can be preset. As shown in block 408, if Pl≥ρ, node l may be suspected as a malicious node, and the counter τl is incremented by 1 until τl≥ε, as shown in blocks 409 and 410. Otherwise, the procedure returns to block 402, where the observation time i may be updated. If τl≥ε at block 410, node l may be judged as a malicious node and a warning device or an alarm system may be activated at block 411. Thus, the operator can find and fix malicious nodes in time, which keeps the network in a safe state.
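The end-of-cycle evaluation (blocks 406-411) can be sketched as follows. The concrete values of ρ (rho) and ε (eps) are illustrative only, since the disclosure states merely that they can be preset.

```python
def end_of_cycle_check(m_l, T, tau_l, rho=0.1, eps=3):
    """Evaluate neighbor l at the end of one observation cycle.

    Returns (P_l, updated tau_l, alarm): P_l = m_l / T approximates the
    attack probability; tau_l counts suspicious cycles; alarm becomes True
    once tau_l >= eps, at which point the warning device is activated.
    """
    P_l = m_l / T
    if P_l >= rho:           # block 408: node l suspected this cycle
        tau_l += 1           # block 409: increment the counter
    return P_l, tau_l, tau_l >= eps

tau = 0
for _ in range(3):           # three suspicious cycles in a row
    P, tau, alarm = end_of_cycle_check(m_l=200, T=1000, tau_l=tau)
```

With m_l = 200 out of T = 1000, P_l approximates an attack probability of 0.2; after eps = 3 suspicious cycles the alarm is raised.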
The proposed solution introduces the concept of similarity distance and sets the fusion threshold to determine whether the data information received from a node is normal and can be used for data fusion. In addition, the proposed solution can effectively identify the malicious nodes launching attacks and solve the attack problem in the distributed adaptive network. According to an exemplary embodiment, the data used, obtained and/or generated in performing the proposed solution may be stored in a distributed manner without a central database.
Fig. 5 exemplarily illustrates the system architecture of a data management module in accordance with an embodiment of the present disclosure. As shown in Fig. 5, a node (such as node k) in the distributed adaptive network may have a data management module or a data manager which is responsible for collecting the data from its neighbors and updating some data and/or indicators used in the related components or modules (such as the evaluation and warning module) . For example, the data manager may store the data for node identification in the distributed adaptive network by a database (as shown in Table 1) .
Table 1: Description of database at node k
(Table 1 is provided as an image in the original publication. )
In accordance with an exemplary embodiment, the node may not store the data for all nodes of the whole network but only for its neighbors, due to its limited computation capability. It will be realized that Fig. 5 and Table 1 merely schematically show some basic functional components of the data manager and some exemplary data stored in the database. Other possible operational elements and/or information configurations suitable for the proposed identification scheme may also be conceived in view of the exemplary embodiments of the present disclosure.
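A minimal per-neighbor record at node k might look as follows; the field names mirror the indicators used above (ml, Pl, τl) but are assumptions, since the contents of Table 1 are only shown as an image.

```python
from dataclasses import dataclass

@dataclass
class NeighborRecord:
    """Assumed per-neighbor state kept by the data manager at node k."""
    m_l: int = 0        # attacks detected in the current observation cycle
    P_l: float = 0.0    # approximated attack probability
    tau_l: int = 0      # number of cycles in which the neighbor was suspected

class DataManager:
    """Stores records only for node k's neighbors, not for the whole network."""
    def __init__(self, neighbor_ids):
        self.records = {l: NeighborRecord() for l in neighbor_ids}

    def reset_cycle(self):
        """Clear the per-cycle attack counters at the start of a new cycle T."""
        for rec in self.records.values():
            rec.m_l = 0

dm = DataManager([1, 2, 3])
dm.records[3].m_l = 600
dm.reset_cycle()
```

Keeping state only for direct neighbors keeps both storage and computation bounded at each node, in line with the distributed design.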
The proposed solution in accordance with the present disclosure does not need a central authority to identify the malicious nodes, but may be applied in distributed networks, for which it can achieve optimal performance. Compared with the current approaches, which are effective only for one kind of attack mode and can be easily affected by an increase in the number of malicious nodes, the proposed solution is not limited by mixed kinds of attack modes or by the number of malicious nodes, and thus can effectively solve the attack problem in the distributed networks and greatly improve the performance. Moreover, less computation may be needed in the proposed solution, which makes it easy to realize. The effectiveness and advantages of the proposed solution may be demonstrated in distributed parameter estimation via the simulations and performance analysis.
Fig. 6 exemplarily illustrates a network topology used in the simulations in accordance with an embodiment of the present disclosure. As shown in Fig. 6, a distributed network with 16 nodes (comprising one or more malicious nodes) is considered in the simulations. It will be appreciated that the considered distributed network may comprise more or fewer than 16 nodes and may employ other suitable network topologies different from that shown in Fig. 6.
Similar to Fig. 3, MSD as expressed in equation (4) may be used to measure the performance of estimation of the entire network. The unknown M×1 vector w0 in MSD may be estimated by using the conventional DLMS algorithm. Consider a signal model denoted as follows:

dk (i) =uk (i) w0+vk (i) , i = 0, 1, …                    (6)

where node k has access to the observed realizations {dk (i) , uk (i) } of zero-mean random data, with dk (i) a scalar measurement and uk (i) a 1×M regression row vector, both at time i; and vk (i) is the background noise.
The conventional DLMS algorithm can be described as follows:

ψk (i+1) = wk (i) +μk ukT (i) [dk (i) -uk (i) wk (i) ]                    (7)

wk (i+1) = ∑l∈Nk al, k ψl (i+1)                    (8)

where wk (i) is the estimation of node k at time i, μk is the step size, ukT (i) denotes the transpose of uk (i) , Nk is the collection of node k and its neighbor nodes, and al, k is a nonnegative combination weight that node k assigns to data ψl (i) arriving from a member (such as node l) of Nk. The coefficients al, k may be designed to satisfy the following conditions: al, k = 0 if l∉Nk, and ∑l∈Nk al, k = 1.
The DLMS algorithm may comprise two operations: adaptation and combination. In the adaptation stage, each node updates its estimator using the observed data {dk (i) , uk (i) } ; in the combination stage, it collects the estimators from its neighbors. The details of this algorithm are not discussed further here so as not to unnecessarily obscure the present disclosure.
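The adaptation-then-combination recursion can be sketched compactly as follows; the network size, step size, noise level and uniform combination weights are illustrative assumptions, not parameters from the disclosure.

```python
import numpy as np

def dlms(neighbors, A, w0, mu=0.05, iters=2000, noise=0.01, seed=0):
    """Diffusion LMS: per-node adaptation followed by neighborhood combination.

    neighbors : list of sets N_k (each set includes k itself)
    A         : (N, N) matrix with A[l, k] = a_{l,k}; each column sums to 1
    w0        : (M,) target vector used to generate dk(i) = uk(i) w0 + vk(i)
    """
    rng = np.random.default_rng(seed)
    N, M = len(neighbors), len(w0)
    w = np.zeros((N, M))
    for _ in range(iters):
        psi = np.empty((N, M))
        for k in range(N):                          # adaptation stage
            u = rng.standard_normal(M)              # regression vector uk(i)
            d = u @ w0 + noise * rng.standard_normal()
            psi[k] = w[k] + mu * u * (d - u @ w[k])
        for k in range(N):                          # combination stage
            w[k] = sum(A[l, k] * psi[l] for l in neighbors[k])
    return w

# Tiny fully connected 3-node network with uniform combination weights.
w0 = np.array([0.5, -0.3])
neighbors = [{0, 1, 2}, {0, 1, 2}, {0, 1, 2}]
A = np.full((3, 3), 1.0 / 3.0)
w = dlms(neighbors, A, w0)
```

After convergence, every node's estimate approaches the target vector w0, which is what the MSD curves in Figs. 3 and 8-9 quantify.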
The simulation may be started with two kinds of attack modes, namely the same-direction attack and the Gaussian attack. Some parameters and/or thresholds used in the simulation may be estimated or predefined as appropriate. For example, the fusion threshold r may be set to 0.004 as illustrated in Fig. 3. However, other values of the fusion threshold are also possible. Various performance advantages provided by the proposed solution (which is also denoted as “new scheme” in Figs. 8-9) in accordance with exemplary embodiments are exemplarily illustrated in Figs. 7-9.
As described with respect to various figures, the proposed solution in accordance with exemplary embodiments can enable a node to detect the abnormal data from its neighbors at each time instant under different attack modes and calculate the approximation of the attack probability. Assume that node 1 and node 3 as shown in Fig. 6 are malicious, with attack probabilities 0.2 and 0.6 respectively. The results at node 2 are observed during 10 observation cycles, with T set to 1000. Fig. 7 exemplarily illustrates the detection of the number of attacks (denoted as m) under two kinds of attack modes in accordance with an embodiment of the present disclosure. As shown in Fig. 7, whether under the same-direction attack mode or the Gaussian attack mode, node 2 can accurately detect the number of attacks in every observation cycle T. Node 1 and node 3 are highly suspected as malicious because their attacks, with m1 = 200 and m3 = 600 per observation period T respectively, are much fiercer than those of node 4 and node 5. The calculated attack probabilities P1 and P3 are around 0.2 and 0.6, which approach the real attack probabilities.
Under both kinds of attack modes, the proposed solution can greatly improve the performance of the entire network and also eliminate the influence of different attack probabilities. Fig. 8 exemplarily illustrates the comparison of DLMS performance in the unsafe situation in accordance with an embodiment of the present disclosure. Specifically, Fig. 8 shows the comparison between the conventional DLMS algorithm and the DLMS algorithm using the proposed new scheme when node 3 is malicious. As shown in Fig. 8, under either attack mode, the MSD curves of the conventional DLMS algorithm begin to rapidly deteriorate with the increase of the attack probability. The performance of the conventional DLMS algorithm worsens to -10dB (Fig. 8 (a) ) and 5dB (Fig. 8 (b) ) under constant attacks (p=1) . However, the proposed new scheme used in DLMS can greatly improve the estimation performance of the whole network, and its MSD curves do not change with the attack probability, always staying around -35dB, the same as the state of no attacks (p=0) .
Even with an increased number of malicious nodes, the proposed new scheme can keep the whole network high-performing and robust. Fig. 9 exemplarily illustrates the impact of the increased number of malicious nodes on the network performance in accordance with an embodiment of the present disclosure. As shown in Fig. 9, regardless of the attack mode, the MSD curve when only node 1 is malicious is almost the same as the curve when both node 1 and node 3 are malicious; both stay around -35dB, the same as the state of no attacks (p=0) .
The various blocks or information flows shown in Figs. 1-6 may be viewed as method steps, and/or as operations that result from operation of computer program code, and/or as a plurality of coupled logic circuit elements constructed to carry out the associated function (s) . The schematic flow chart diagrams described above are generally set forth as logical flow chart diagrams. As such, the depicted order and labeled steps are indicative of specific embodiments of the presented methods. Other steps and methods may be conceived that are equivalent in function, logic, or effect to one or more steps, or portions thereof, of the illustrated methods. Additionally, the order in which a particular method occurs may or may not strictly adhere to the order of the corresponding steps shown.
Fig. 10 is a simplified block diagram of an apparatus suitable for use in practicing exemplary embodiments of the present disclosure. The apparatus 1000 shown in Fig. 10 may be configured to perform various operations and functions of a node in distributed networks according to exemplary embodiments as described in connection with Fig. 1-6. Particularly, the apparatus 1000 may be configured to function as the second node according to exemplary embodiments as described in connection with Fig. 2.
The apparatus 1000 may comprise a processor (PROC) 1010, a memory (MEM) 1020 which stores computer program codes (PROG) 1030, and a suitable communication unit (COM) 1040 (such as a transceiver, a receiver and/or a  transmitter, optionally in communication with an antenna) for coupling or connecting to another apparatus such as a network node, a communication device, an interactive entity, wireless terminal, wired terminal, a user equipment, a server, a client, a peripheral device, a database and so on. For example, the communication unit 1040 may be configured to support the apparatus 1000 to transmit/receive signals and messages to/from another apparatus. The processor 1010 may be used for processing these signals and messages. In this example only one processor and one memory are shown in Fig. 10, but it will be appreciated that other examples may utilize more than one processor and/or more than one memory (for example, same or different processor/memory types) .
The processor 1010 may be embodied as various means for implementing the various functionalities of exemplary embodiments of the present disclosure comprising, for example, a microprocessor, a coprocessor, a controller, a general purpose computer, a special-purpose integrated circuit such as, for example, an Application Specific Integrated Circuit (ASIC) , a Field Programmable Gate Array (FPGA) , or a hardware accelerator, processing circuitry or the like. According to an exemplary embodiment, the processor 1010 may be representative of a plurality of processors, or one or more multiple core processors, operating in concert. The processor 1010 may, but need not, comprise one or more accompanying Digital Signal Processors (DSPs) . In some exemplary embodiments, the processor 1010 is configured to execute instructions stored in the memory device or instructions otherwise accessible to the processor 1010. The processor 1010 may be configured to operate such that the processor 1010 causes the apparatus 1000 to perform various functionalities described herein.
The memory 1020 may be one or more computer-readable storage media that may comprise volatile and/or non-volatile memory. In some exemplary embodiments, the memory 1020 comprises Random Access Memory (RAM) including dynamic  and/or static RAM, on-chip or off-chip cache memory, and/or the like. Further, the memory 1020 may comprise non-volatile memory, which may be embedded and/or removable, and may include, for example, read-only memory, flash memory, magnetic storage devices (for example, hard disks, floppy disk drives, magnetic tape, etc. ) , optical disc drives and/or media, non-volatile random access memory (NVRAM) , and/or the like. The memory 1020 may comprise a cache area for temporary storage of data. In this regard, at least a portion or the entire memory 1020 may be included within the processor 1010. Further, the memory 1020 may be configured to store information, data, applications, computer-readable program code instructions, and/or the like for enabling the processor 1010 and the apparatus 1000 to carry out various functions in accordance with exemplary embodiments described herein. For example, the memory 1020 may be configured to buffer input data for processing by the processor 1010. Additionally or alternatively, the memory 1020 may be configured to store instructions for execution by the processor 1010.
The computer program codes 1030 may be stored on a memory device, such as the memory 1020, and executed by a processor, such as the processor 1010, to enable the apparatus 1000 to operate in accordance with the exemplary embodiments, as discussed above. That is, the exemplary embodiments of the present disclosure may be implemented at least in part by computer software executable by the processor 1010, or by hardware, or by a combination of software and hardware. As will be appreciated, any such computer program codes may be loaded onto a computer or other programmable apparatus from a computer-readable storage medium to produce a particular machine, such that the particular machine becomes means for implementing the functions specified in the flowchart’s block (s) or operation (s) . These computer program codes may also be stored in a computer-readable storage medium that can direct a computer, a processor, or other programmable apparatus to function in a particular manner to thereby generate a particular machine or particular article of manufacture that becomes means for implementing the functions specified  in the flowchart’s block (s) or operation (s) . The computer program codes may be retrieved from a computer-readable storage medium and loaded into a computer, processor, or other programmable apparatus to configure the computer, processor, or other programmable apparatus to execute operations to be performed on or by the computer, processor, or other programmable apparatus.
Alternatively or additionally, the apparatus 1000 may comprise various means and/or modules for implementing functions of the foregoing steps and methods in Figs. 1-6. In an exemplary embodiment, the apparatus 1000 may serve as the second node as described with respect to Fig. 2 and comprise: collection means for collecting first data information transmitted at an observation point from another apparatus (such as the first node as described with respect to Fig. 2) to the apparatus in a distributed network, wherein the apparatus is a neighbor of the another apparatus; comparison means for comparing the first data information with second data information transmitted at the observation point by the apparatus to its neighbors in the distributed network; and determination means for determining whether the first data information is to be used by the apparatus in data fusion associated with the observation point, based at least in part on the comparison of the first data information with the second data information.
In general, the various exemplary embodiments may be implemented in hardware or special purpose circuits, software, logic or any combination thereof. For example, some aspects may be implemented in hardware, while other aspects may be implemented in firmware or software which may be executed by a controller, microprocessor or other computing device, although the disclosure is not limited thereto. While various aspects of the exemplary embodiments of this disclosure may be illustrated and described as block diagrams, flow charts, or using some other pictorial representation, it is well understood that these blocks, apparatus, systems, techniques or methods described herein may be implemented in, as non-limiting  examples, hardware, software, firmware, special purpose circuits or logic, general purpose hardware or controller or other computing devices, or some combination thereof.
It will be appreciated that at least some aspects of the exemplary embodiments of the disclosures may be embodied in computer-executable instructions, such as in one or more program modules, executed by one or more computers or other devices. Generally, program modules include routines, programs, objects, components, data structures, etc. that perform particular tasks or implement particular abstract data types when executed by a processor in a computer or other device. The computer executable instructions may be stored on a computer readable medium such as a hard disk, optical disk, removable storage media, solid state memory, RAM, etc. As will be realized by one skilled in the art, the functionality of the program modules may be combined or distributed as desired in various embodiments. In addition, the functionality may be embodied in whole or in part in firmware or hardware equivalents such as integrated circuits, FPGAs, and the like.
Although specific embodiments of the disclosure have been disclosed, those having ordinary skill in the art will understand that changes can be made to the specific embodiments without departing from the spirit and scope of the disclosure. The scope of the disclosure is therefore not to be restricted to the specific embodiments, and it is intended that the appended claims cover any and all such applications, modifications, and embodiments within the scope of the present disclosure.
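As a non-normative illustration of the comparison and determination functions described above (the collection, comparison, and determination means), the following Python sketch computes a similarity distance between a neighbor's data and a node's own data at one observation point and decides whether the neighbor's data enters data fusion. The Euclidean metric, the threshold-based "first criterion", and all function names are assumptions made for illustration only; the disclosure does not fix a particular metric or threshold value.

```python
import math


def similarity_distance(first_data, second_data):
    """Distance between the values of the first and second data information.

    Euclidean distance is one possible choice of metric; the disclosure
    leaves the exact metric open.
    """
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(first_data, second_data)))


def select_for_fusion(first_data, second_data, threshold):
    """Return the data the node should use in data fusion at this observation point.

    If the similarity distance satisfies the first criterion (here: meets or
    exceeds a hypothetical threshold), the neighbor's data is rejected and
    replaced with the node's own data; otherwise the neighbor's data is used.
    """
    if similarity_distance(first_data, second_data) >= threshold:
        # First criterion satisfied: reject neighbor's data, substitute own data.
        return second_data
    # First criterion not satisfied: accept neighbor's data for fusion.
    return first_data
```

In this sketch a small distance means the neighbor's report is consistent with the node's own observation and is kept; a large distance flags it as anomalous and the node falls back on its own data.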

Claims (28)

  1. A method, comprising:
    collecting first data information transmitted at an observation point from a first node to a second node in a distributed network, wherein the second node is a neighbor node of the first node;
    comparing the first data information with second data information transmitted at the observation point by the second node to its neighbor nodes in the distributed network; and
    determining whether the first data information is to be used by the second node in data fusion associated with the observation point, based at least in part on the comparison of the first data information with the second data information.
  2. The method according to claim 1, wherein the comparison of the first data information with the second data information comprises:
    calculating a similarity distance between the first data information and the second data information, wherein the similarity distance represents a distance between values of the first data information and the second data information.
  3. The method according to claim 2, wherein it is determined that the first data information is to be used by the second node in the data fusion associated with the observation point, in response to the similarity distance between the first data information and the second data information not satisfying a first criterion.
  4. The method according to claim 2, wherein it is determined that the first data information is not to be used by the second node in the data fusion associated with the observation point, in response to the similarity distance between the first data information and the second data information satisfying a first criterion.
  5. The method according to claim 4, wherein the first data information is replaced with the second data information by the second node in the data fusion associated with the observation point.
  6. The method according to any one of claims 1 to 5, further comprising:
    updating first identification information for the first node, in response to determining that the first data information is not to be used by the second node in the data fusion associated with the observation point,
    wherein the first identification information indicates, within an observation cycle comprising one or more observation points, a number of observation points at which data information from the first node is determined not to be used in data fusion by the second node.
  7. The method according to claim 6, further comprising:
    updating second identification information for the first node, in response to the first identification information at the end of the observation cycle satisfying a second criterion,
    wherein the second identification information indicates a number of observation cycles at the end of which the first identification information for the first node satisfies the second criterion.
  8. The method according to claim 7, further comprising:
    triggering a predefined action for the first node, in response to the second identification information satisfying a third criterion.
  9. The method according to claim 8, wherein the predefined action comprises raising an alarm.
  10. An apparatus, comprising:
    at least one processor; and
    at least one memory comprising computer program code,
    the at least one memory and the computer program code configured to, with the at least one processor, cause the apparatus at least to:
    collect first data information transmitted at an observation point from another apparatus to the apparatus in a distributed network, wherein the apparatus is a neighbor of the another apparatus;
    compare the first data information with second data information transmitted at the observation point by the apparatus to its neighbors in the distributed network; and
    determine whether the first data information is to be used by the apparatus in data fusion associated with the observation point, based at least in part on the comparison of the first data information with the second data information.
  11. The apparatus according to claim 10, wherein the comparison of the first data information with the second data information comprises:
    calculating a similarity distance between the first data information and the second data information, wherein the similarity distance represents a distance between values of the first data information and the second data information.
  12. The apparatus according to claim 11, wherein it is determined that the first data information is to be used by the apparatus in the data fusion associated with the observation point, in response to the similarity distance between the first data information and the second data information not satisfying a first criterion.
  13. The apparatus according to claim 11, wherein it is determined that the first data information is not to be used by the apparatus in the data fusion associated with the observation point, in response to the similarity distance between the first data information and the second data information satisfying a first criterion.
  14. The apparatus according to claim 13, wherein the first data information is replaced with the second data information by the apparatus in the data fusion associated with the observation point.
  15. The apparatus according to any one of claims 10 to 14, wherein the at least one memory and the computer program code are configured to, with the at least one processor, cause the apparatus at least further to:
    update first identification information for the another apparatus, in response to determining that the first data information is not to be used by the apparatus in the data fusion associated with the observation point,
    wherein the first identification information indicates, within an observation cycle comprising one or more observation points, a number of observation points at which data information from the another apparatus is determined not to be used in data fusion by the apparatus.
  16. The apparatus according to claim 15, wherein the at least one memory and the computer program code are configured to, with the at least one processor, cause the apparatus at least further to:
    update second identification information for the another apparatus, in response to the first identification information at the end of the observation cycle satisfying a second criterion,
    wherein the second identification information indicates a number of observation cycles at the end of which the first identification information for the another apparatus satisfies the second criterion.
  17. The apparatus according to claim 16, wherein the at least one memory and the computer program code are configured to, with the at least one processor, cause the apparatus at least further to:
    trigger a predefined action for the another apparatus, in response to the second identification information satisfying a third criterion.
  18. The apparatus according to claim 17, wherein the predefined action comprises raising an alarm.
  19. A computer program product comprising a computer-readable medium bearing computer program codes embodied therein for use with a computer, the computer program codes comprising:
    code for collecting first data information transmitted at an observation point from a first node to a second node in a distributed network, wherein the second node is a neighbor node of the first node;
    code for comparing the first data information with second data information transmitted at the observation point by the second node to its neighbor nodes in the distributed network; and
    code for determining whether the first data information is to be used by the second node in data fusion associated with the observation point, based at least in part on the comparison of the first data information with the second data information.
  20. The computer program product according to claim 19, wherein the comparison of the first data information with the second data information comprises:
    calculating a similarity distance between the first data information and the second data information, wherein the similarity distance represents a distance between values of the first data information and the second data information.
  21. The computer program product according to claim 20, wherein it is determined that the first data information is to be used by the second node in the data fusion associated with the observation point, in response to the similarity distance between the first data information and the second data information not satisfying a first criterion.
  22. The computer program product according to claim 20, wherein it is determined that the first data information is not to be used by the second node in the data fusion associated with the observation point, in response to the similarity distance between the first data information and the second data information satisfying a first criterion.
  23. The computer program product according to claim 22, wherein the first data information is replaced with the second data information by the second node in the data fusion associated with the observation point.
  24. The computer program product according to any one of claims 19 to 23, wherein the computer program codes further comprise:
    code for updating first identification information for the first node, in response to determining that the first data information is not to be used by the second node in the data fusion associated with the observation point,
    wherein the first identification information indicates, within an observation cycle comprising one or more observation points, a number of observation points at which data information from the first node is determined not to be used in data fusion by the second node.
  25. The computer program product according to claim 24, wherein the computer program codes further comprise:
    code for updating second identification information for the first node, in response to the first identification information at the end of the observation cycle satisfying a second criterion,
    wherein the second identification information indicates a number of observation cycles at the end of which the first identification information for the first node satisfies the second criterion.
  26. The computer program product according to claim 25, wherein the computer program codes further comprise:
    code for triggering a predefined action for the first node, in response to the second identification information satisfying a third criterion.
  27. The computer program product according to claim 26, wherein the predefined action comprises raising an alarm.
  28. An apparatus, comprising:
    collection means for collecting first data information transmitted at an observation point from another apparatus to the apparatus in a distributed network, wherein the apparatus is a neighbor of the another apparatus;
    comparison means for comparing the first data information with second data information transmitted at the observation point by the apparatus to its neighbors in the distributed network; and
    determination means for determining whether the first data information is to be used by the apparatus in data fusion associated with the observation point, based at least in part on the comparison of the first data information with the second data information.
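The per-neighbor bookkeeping recited in the claims above (first identification information counted per observation cycle, second identification information counted over cycles, and a predefined action such as an alarm) can be sketched as follows. The class name, counter names, and threshold values are hypothetical placeholders; the "second criterion" and "third criterion" are modeled here simply as numeric limits, which the disclosure does not specify.

```python
class NeighborMonitor:
    """Tracks, for one neighbor node, how often its data is rejected.

    first_count plays the role of the "first identification information"
    (rejections within the current observation cycle); second_count plays
    the role of the "second identification information" (cycles whose
    rejection count satisfied the second criterion).
    """

    def __init__(self, per_cycle_limit=3, cycle_limit=2):
        self.per_cycle_limit = per_cycle_limit  # hypothetical second criterion
        self.cycle_limit = cycle_limit          # hypothetical third criterion
        self.first_count = 0
        self.second_count = 0
        self.alarm_raised = False

    def record_rejection(self):
        # Data from the neighbor was determined not to be used in data fusion
        # at one observation point: update the first identification information.
        self.first_count += 1

    def end_of_cycle(self):
        # At the end of the observation cycle, update the second identification
        # information if the first identification information satisfies the
        # second criterion, then reset for the next cycle.
        if self.first_count >= self.per_cycle_limit:
            self.second_count += 1
        self.first_count = 0
        # If the second identification information satisfies the third
        # criterion, trigger the predefined action (here: raise an alarm flag).
        if self.second_count >= self.cycle_limit:
            self.alarm_raised = True
```

A node would keep one such monitor per neighbor, calling `record_rejection` whenever the similarity comparison rejects that neighbor's data and `end_of_cycle` once per observation cycle.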
PCT/CN2016/105324 2016-11-10 2016-11-10 Node identification in distributed adaptive networks WO2018086025A1 (en)

Priority Applications (2)

Application Number Priority Date Filing Date Title
PCT/CN2016/105324 WO2018086025A1 (en) 2016-11-10 2016-11-10 Node identification in distributed adaptive networks
CN201680091991.6A CN110168557B (en) 2016-11-10 2016-11-10 Node identification in a distributed adaptive network

Publications (1)

Publication Number Publication Date
WO2018086025A1 true WO2018086025A1 (en) 2018-05-17

Family

ID=62110128

Country Status (2)

Country Link
CN (1) CN110168557B (en)
WO (1) WO2018086025A1 (en)

Families Citing this family (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113744038A (en) * 2021-09-03 2021-12-03 上海晓途网络科技有限公司 Object parameter determination method, device, equipment and storage medium

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20060133289A1 (en) * 2004-12-16 2006-06-22 Philippe Golle Method and apparatus for detecting and correcting malicious data in an ad-hoc network
US20110179492A1 (en) * 2010-01-21 2011-07-21 Athina Markopoulou Predictive blacklisting using implicit recommendation
CN102945320A (en) * 2012-10-29 2013-02-27 河海大学 Time series data abnormity detection method and device
CN103916860A (en) * 2014-04-16 2014-07-09 东南大学 Outlier data detection method based on space-time correlation in wireless sensor cluster network

Family Cites Families (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN101848095B (en) * 2009-03-23 2013-03-06 中国科学院计算技术研究所 Method and system for treating credibility of feedback reports in distributed network
US8898468B2 (en) * 2009-12-08 2014-11-25 Bae Systems Information And Electronic Systems Integration Inc. Method for ensuring security and privacy in a wireless cognitive network
CN102244577A (en) * 2010-05-10 2011-11-16 东北大学技术转移中心 Node authentication
CN103813355B (en) * 2014-02-21 2018-07-27 厦门大学 The recognition methods of synchronous abnormal point is cooperateed in a kind of distributed network
CN104601553A (en) * 2014-12-26 2015-05-06 北京邮电大学 Internet-of-things tampering invasion detection method in combination with abnormal monitoring
CN104618908B (en) * 2014-12-31 2017-11-10 重庆邮电大学 The method and apparatus that distributed cognition wireless network is attacked anti-distort perception data

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
ZHU, YANSONG ET AL.: "A Dimensional Trust Based Security Data Aggregation Method in Wireless Sensor Networks", JOURNAL OF WUHAN UNIVERSITY, 30 April 2013 (2013-04-30), pages 194 *

Cited By (10)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110954153A (en) * 2019-11-08 2020-04-03 电子科技大学 A Distributed Adaptive Combination Coefficient Optimization Method
CN111614438A (en) * 2020-05-09 2020-09-01 云南电网有限责任公司电力科学研究院 A data fusion system and method based on power carrier communication
CN111614438B (en) * 2020-05-09 2022-09-02 云南电网有限责任公司电力科学研究院 Data fusion system and method based on power line carrier communication
CN115022186A (en) * 2022-06-27 2022-09-06 中国联合网络通信集团有限公司 Topology generation method and device and storage medium
CN115022186B (en) * 2022-06-27 2023-05-12 中国联合网络通信集团有限公司 Topology generation method, device and storage medium
WO2024001783A1 (en) * 2022-06-27 2024-01-04 中国联合网络通信集团有限公司 Topology generation method and apparatus, and storage medium
CN115396320A (en) * 2022-08-10 2022-11-25 中国联合网络通信集团有限公司 Method, device, equipment and storage medium for determining port connection relation
CN115396320B (en) * 2022-08-10 2023-07-28 中国联合网络通信集团有限公司 Port connection relation determination method, device, equipment and storage medium
CN116488847A (en) * 2023-02-27 2023-07-25 西北工业大学 A network information traceability method, device and medium
CN116488847B (en) * 2023-02-27 2024-01-30 西北工业大学 A network information traceability method, device, equipment and medium

Also Published As

Publication number Publication date
CN110168557B (en) 2023-08-18
CN110168557A (en) 2019-08-23

Similar Documents

Publication Publication Date Title
WO2018086025A1 (en) Node identification in distributed adaptive networks
Kurt et al. Online cyber-attack detection in smart grid: A reinforcement learning approach
Zhang et al. A trust based framework for secure data aggregation in wireless sensor networks
JP6847591B2 (en) Anomaly detection system, model generator, anomaly detection device, anomaly detection method, model generation program, and anomaly detection program
US20220083916A1 (en) System and method for detecting and rectifying concept drift in federated learning
Xie et al. Histogram-based online anomaly detection in hierarchical wireless sensor networks
US20180007578A1 (en) Machine-to-Machine Anomaly Detection
US8583585B2 (en) Trust management system for decision fusion in networks and method for decision fusion
Lu et al. Robust occupancy inference with commodity WiFi
JP6200076B2 (en) Method and system for evaluating measurements obtained from a system
Warriach et al. A machine learning approach for identifying and classifying faults in wireless sensor network
CN108601026A (en) Perception data error attack detection method based on random sampling consistency
Karthik et al. Data trust model for event detection in wireless sensor networks using data correlation techniques
US20170346834A1 (en) Relating to the monitoring of network security
Ghalem et al. A probabilistic multivariate copula-based technique for faulty node diagnosis in wireless sensor networks
Karthik et al. Sensor data modeling for data trustworthiness
CN110662220B (en) Anomaly detection method for wireless sensor network based on spatiotemporal correlation and information entropy
Lalou et al. Least squares method for diffusion source localization in complex networks
Islam et al. Measuring trustworthiness of IoT image sensor data using other sensors’ complementary multimodal data
Sun et al. Dynamic adaptive trust management system in wireless sensor networks
Perez-Toro et al. RDAS: reputation-based resilient data aggregation in sensor network
Moshtaghi et al. Exponentially weighted ellipsoidal model for anomaly detection
Laoudias et al. ftTRACK: fault-tolerant target tracking in binary sensor networks
Nakayama et al. Detection of false data injection attacks in cyber-physical systems using dynamic invariants
CN114969756A (en) A Trusted Participant Selection Method Through Historical Data Interpolation Test

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 16921173

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE

122 Ep: pct application non-entry in european phase

Ref document number: 16921173

Country of ref document: EP

Kind code of ref document: A1