CN109362113B - Underwater acoustic sensor network cooperation exploration reinforcement learning routing method - Google Patents
- Publication number
- CN109362113B
- Authority
- CN
- China
- Prior art keywords
- node
- value
- packet
- data packet
- data
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Active
Classifications
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04W—WIRELESS COMMUNICATION NETWORKS
- H04W40/00—Communication routing or communication path finding
- H04W40/02—Communication route or path selection, e.g. power-based or shortest path routing
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04W—WIRELESS COMMUNICATION NETWORKS
- H04W40/00—Communication routing or communication path finding
- H04W40/02—Communication route or path selection, e.g. power-based or shortest path routing
- H04W40/22—Communication route or path selection, e.g. power-based or shortest path routing using selective relaying for reaching a BTS [Base Transceiver Station] or an access point
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04W—WIRELESS COMMUNICATION NETWORKS
- H04W52/00—Power management, e.g. Transmission Power Control [TPC] or power classes
- H04W52/02—Power saving arrangements
- H04W52/0203—Power saving arrangements in the radio access network or backbone network of wireless communication networks
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04W—WIRELESS COMMUNICATION NETWORKS
- H04W84/00—Network topologies
- H04W84/18—Self-organising networks, e.g. ad-hoc networks or sensor networks
-
- Y—GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
- Y02—TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
- Y02D—CLIMATE CHANGE MITIGATION TECHNOLOGIES IN INFORMATION AND COMMUNICATION TECHNOLOGIES [ICT], I.E. INFORMATION AND COMMUNICATION TECHNOLOGIES AIMING AT THE REDUCTION OF THEIR OWN ENERGY USE
- Y02D30/00—Reducing energy consumption in communication networks
- Y02D30/70—Reducing energy consumption in communication networks in wireless communication networks
Landscapes
- Engineering & Computer Science (AREA)
- Computer Networks & Wireless Communication (AREA)
- Signal Processing (AREA)
- Data Exchanges In Wide-Area Networks (AREA)
Abstract
The invention relates to the technical field of underwater acoustic sensor networks and underwater acoustic routing protocols, and in particular to a cooperative-exploration reinforcement learning routing method for underwater acoustic sensor networks. The method comprises the following steps: (1) initialize the Q value and V value of each node; (2) judge whether $|V_s^{t+1} - V_s^t| < \varepsilon$ holds; (3) a relay node receives a data packet or control packet, updates its neighbor list, and decides whether to continue forwarding; (4) the sink receives the data packet and the transmission ends. A routing protocol based on reinforcement learning can approximate the global optimum when selecting a path and can combine several factors that affect performance. In the invention, while the algorithm has not converged, the source node sends several control packets along with each data packet to speed up convergence; otherwise it sends only data packets. After the algorithm converges, an approximately globally optimal path is obtained by selecting the next-hop node with the highest V value, which balances network energy consumption, prolongs network lifetime, and solves the slow convergence of reinforcement learning.
Description
Technical Field
The invention relates to the technical field of underwater acoustic sensor networks and underwater acoustic routing protocols, and in particular to a cooperative-exploration reinforcement learning routing method for underwater acoustic sensor networks.
Background
Underwater acoustic sensor networks (UASNs) consist of sensor nodes deployed underwater and sink nodes that receive data. They support applications such as environmental monitoring, tactical surveillance, resource exploration, assisted navigation, and disaster prevention. Because radio waves suffer high transmission loss in water, underwater communication usually uses acoustic waves. UASNs also face distinctive challenges: limited battery capacity, high bit error rate, high end-to-end delay, and limited available bandwidth.
Because of the inherent high latency, high energy consumption, and low bandwidth of UASNs, their network topology is usually distributed. A major problem for their routing protocols is finding efficient, energy-saving paths. Reinforcement learning algorithms, which interact with the environment by trial and error to maximize the expected reward, have been applied to UASNs. With a routing protocol based on reinforcement learning, each node can approximate the global optimum when selecting a path without knowing the topology of the whole network. Reinforcement learning lets nodes learn and adapt to their dynamic environment and can combine several factors that affect routing performance, so routing decisions take more into account. In the present invention, the convergence speed of the source node's V value is used to characterize the convergence speed of reinforcement learning.
In UASNs, as the network grows, reinforcement learning converges more slowly and network energy consumption is high. When the network topology changes, the algorithm cannot track the change well, which degrades network performance.
Summary of the Invention
The purpose of the invention is to address the above shortcomings of the prior art by proposing a cooperative-exploration reinforcement learning routing method for underwater acoustic sensor networks. While the algorithm has not converged, the source node sends several control packets along with each data packet to explore paths cooperatively, which accelerates convergence of its V value. This solves the slow convergence of reinforcement learning, reduces network energy consumption, and prolongs network lifetime.
The invention can be implemented through the following technical solution:
A cooperative-exploration reinforcement learning routing method for underwater acoustic sensor networks, comprising the following steps:
(1) Initialize the Q value and V value of each node;
(2) Determine the V value $V_s^{t+1}$ of the source node s at the next time step;
(3) Based on the Q value and V value of each node, judge whether $|V_s^{t+1} - V_s^t| < \varepsilon$ holds:
(3.1) If it holds, the source node sends only the data packet;
(3.2) If it does not hold, the source node sends control packets along with the data packet;
(4) A relay node receives the data packet or control packet sent by the source node and reads the packet header;
(5) The relay node updates its routing table from the received data and judges whether the packet is addressed to this node; if it is, the node calculates the Q value, writes the updated V value into the packet header, and continues transmitting the packet;
(6) Judge whether the sink node has received the data packet:
(6.1) If the sink has received the data packet, the transmission ends;
(6.2) If the sink has not received the data packet, repeat steps (2) through (6) until it does.
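The decision in step (3) can be sketched in a few lines of Python. This is a minimal illustration, not the patented implementation: it assumes the convergence test has the form |V(t+1) - V(t)| < ε, and the function name `packets_to_send` and the default of two control packets (taken from the embodiment described later) are illustrative choices.

```python
from typing import List


def packets_to_send(v_prev: float, v_next: float, epsilon: float,
                    n_control: int = 2) -> List[str]:
    """Return the packet types the source node sends in this round.

    Assumption: the source is treated as converged when its V value
    changes by less than epsilon between consecutive time steps.
    """
    converged = abs(v_next - v_prev) < epsilon
    if converged:
        # Step (3.1): converged, so only the data packet is sent.
        return ["data"]
    # Step (3.2): not converged, so control packets explore extra paths.
    return ["data"] + ["control"] * n_control
```

A larger `n_control` would explore more paths per round at the cost of more transmissions; the source text only says "several" control packets.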
Step (1) comprises the following steps:
(1.1) Determine the reward function;
(1.2) From the reward function, determine the Q-value iteration function of each node;
The reward function $R_{nm}$ in step (1.1) is the immediate reward obtained after the first node n finishes transmitting a data packet or control packet to the second node m. It is calculated as:
$$R_{nm} = -g - \alpha_1 c + \alpha_2 d$$
where g is the fixed cost a node incurs when transmitting data, c is the node residual-energy consumption function, d is the node energy distribution, $\alpha_1$ is the weight of c, and $\alpha_2$ is the weight of d;
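As a small worked example of the reward formula, the sketch below evaluates R_nm for given inputs. The text does not say how c and d are computed, so they are passed in as plain numbers; the function name `reward` and the sample values are illustrative assumptions.

```python
def reward(g: float, c: float, d: float, alpha1: float, alpha2: float) -> float:
    """Immediate reward R_nm = -g - alpha1 * c + alpha2 * d for a hop n -> m.

    g: fixed transmission cost; c: residual-energy consumption term;
    d: energy-distribution term; alpha1, alpha2: their weights.
    """
    return -g - alpha1 * c + alpha2 * d


# Hypothetical inputs: the fixed cost dominates, so the reward is negative.
r_nm = reward(g=1.0, c=0.5, d=0.25, alpha1=0.5, alpha2=0.5)
```

With these inputs the three terms are -1.0, -0.25, and +0.125.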
The Q-value iteration function in step (1.2) is calculated as follows:
[formula not preserved in the source text]
where $Q_n^{t+1}$ is the Q value of the first node n at time t+1, α is the update rate of the Q value, γ is the discount factor, and $Q_m^t$ is the Q value of the second node m at time t.
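The iteration formula itself did not survive in this text. The sketch below therefore assumes the textbook temporal-difference form of the Q-learning update, Q_n(t+1) = (1 - α)·Q_n(t) + α·(R_nm + γ·Q_m(t)), which uses the same quantities the paragraph names; the patent's actual formula may differ, and the function name `q_update` is illustrative.

```python
def q_update(q_n: float, q_m: float, r_nm: float,
             alpha: float, gamma: float) -> float:
    """One assumed Q-learning step for node n forwarding to node m.

    q_n: Q value of node n at time t; q_m: Q value of node m at time t;
    r_nm: immediate reward; alpha: update rate; gamma: discount factor.
    This is the standard update rule, used here as a stand-in.
    """
    return (1.0 - alpha) * q_n + alpha * (r_nm + gamma * q_m)
```

With `alpha = 0` the old value is kept unchanged, and with `alpha = 1` it is fully replaced by the new estimate.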
In the judgment condition $|V_s^{t+1} - V_s^t| < \varepsilon$ of step (3), $V_s^t$ is the V value of the source node at time t, $V_s^{t+1}$ is the V value of the source node at time t+1, and ε is a very small value greater than 0;
If the judgment in step (3.1) holds, the source node stops transmitting control packets and, using the routing table together with the Q-value iteration formula, computes the optimal path and transmits data packets upward until they reach the sink;
The V value of the source node at the next time step is calculated as:
[formula not preserved in the source text]
where α has the same value as α in the Q-value iteration function of step (1.2); here it is the learning rate, i.e. the update rate of the V value. ω is the normalization parameter for the paths probed by control packets, and the remaining term represents the experience gained by each data packet or control packet in exploring its path.
If, as described in step (5), the packet is addressed to this node, the node computes Q values with the Q-value iteration function, selects the node whose Q value is the maximum $Q_{max}$ as the next hop, updates its V value to $Q_{max}$, rewrites the node information in the packet header, and continues transmission.
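The next-hop rule in the last paragraph (choose the neighbor with the largest Q value and set V to that maximum) can be sketched as follows. The dictionary layout and the name `select_next_hop` are illustrative assumptions; the Q values are taken as already computed.

```python
from typing import Dict, Tuple


def select_next_hop(q_values: Dict[int, float]) -> Tuple[int, float]:
    """Pick the candidate with the largest Q value.

    q_values maps a candidate next-hop node ID to its Q value.
    Returns (next_hop_id, new_v), where new_v equals Q_max.
    """
    next_hop = max(q_values, key=q_values.get)
    return next_hop, q_values[next_hop]


# Hypothetical Q values for three neighbors of the current node.
hop, v = select_next_hop({1: -3.0, 3: -1.5, 5: -2.0})
```

Here neighbor 3 has the largest Q value, so it becomes the next hop and the node's V value becomes -1.5.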
Compared with the prior art, the beneficial effects of the invention are:
(1) The invention provides a cooperative-exploration reinforcement learning routing algorithm for underwater acoustic sensor networks. While the algorithm has not converged, the source node sends data packets and control packets at the same time, which speeds up convergence of the source node's V value.
(2) After the algorithm converges, the invention obtains an approximately globally optimal path by selecting the next-hop node with the highest V value, thereby balancing network energy consumption and prolonging network lifetime.
Description of the Drawings
Figure 1 is a structural diagram of the underwater acoustic sensor network.
Figure 2 is a schematic diagram of the cooperative-exploration reinforcement learning routing method.
Figure 3 is a flowchart of the source node implementing the cooperative-exploration reinforcement learning algorithm.
Figure 4 is a flowchart of routing and forwarding.
Detailed Description
The invention is further described below with reference to the accompanying drawings.
The described embodiments are only some of the embodiments of the invention, not all of them. The following detailed description of the embodiments shown in the drawings is therefore not intended to limit the scope of the claimed invention; it merely represents selected embodiments. All other embodiments that a person skilled in the art obtains from these embodiments without creative effort fall within the scope of protection of the invention.
The invention provides a cooperative-exploration reinforcement learning routing method for underwater acoustic sensor networks. A routing protocol based on reinforcement learning can approximate the global optimum when selecting a path and can combine several factors that affect performance. In the invention, while the algorithm has not converged, the source node sends, along with each data packet, several control packets that contain only header information to explore paths cooperatively and so accelerate convergence of the source node's V value; otherwise it sends only data packets. The invention solves the slow convergence of reinforcement learning, reduces network energy consumption, and prolongs network lifetime. It comprises the following steps:
(1) Initialize the Q value and V value of each node.
(2) Determine the V value $V_s^{t+1}$ of the source node s at the next time step.
(3) Judge whether $|V_s^{t+1} - V_s^t| < \varepsilon$ holds, where $V_s^t$ is the V value of the source node at time t and ε is a very small value greater than 0. If it holds, the source node sends only the data packet; otherwise, it sends control packets along with the data packet.
(4) A relay node receives the data packet or control packet, updates its neighbor list, and decides whether to continue forwarding.
(5) The sink receives the data packet and the transmission ends.
In step (2), the V-value iteration function of the source node is:
[formula not preserved in the source text]
where α is the learning rate, i.e. the update rate of the V value; it controls how much of the difference between the previous V value and the new V value is taken into account. γ is the discount factor, which sets how strongly experience influences the current V value. ω is the normalization parameter for the paths probed by control packets, and the remaining term represents the experience gained by each data packet or control packet in exploring its path. In step (3), when $|V_s^{t+1} - V_s^t| < \varepsilon$, the source node sends only data packets; when $|V_s^{t+1} - V_s^t| \ge \varepsilon$, the source node sends control packets along with data packets.
Figure 1 is the structural diagram of the underwater acoustic sensor network provided by an embodiment of the invention, and Figure 2 is the schematic diagram of the cooperative-exploration reinforcement learning routing method provided by an embodiment of the invention. With reference to these, this embodiment discloses an implementation of the cooperative-exploration reinforcement learning routing protocol for underwater acoustic sensor networks, as shown in Figures 3 and 4. The details are as follows:
(1) Initialize the Q value and V value of each node.
(2) Determine the reward function.
In this embodiment, the reward function $R_{nm}$ is the immediate reward obtained after node n finishes transmitting a data packet or control packet to node m.
$$R_{nm} = -g - \alpha_1 c + \alpha_2 d$$
g is the fixed cost a node incurs when transmitting data, c is the node residual-energy consumption function, d is the node energy distribution, and $\alpha_1$ and $\alpha_2$ are the weights of c and d respectively.
(3) Determine the Q-value iteration function of each node.
[formula not preserved in the source text]
$Q_n^{t+1}$ is the Q value of node n at time t+1, α is the update rate of the Q value, γ is the discount factor, and $Q_m^t$ is the Q value of node m at time t.
(4) Determine the V-value calculation function of the source node.
[formula not preserved in the source text]
$V_s^{t+1}$ is the V value of the source node at the next time step, ω is the normalization parameter for the paths probed by control packets, and the remaining term represents the experience gained by each data packet or control packet in exploring its path.
(5) The source node performs cooperative exploration.
The structure of the underwater acoustic sensor network of the invention is shown in Figure 1. For simplicity, the network in this embodiment has a single source and a single sink. The source node collects data and transmits it upward through the underwater acoustic network, hop by hop via relay nodes, until it reaches the sink. The sink receives data from the relay nodes below the sea surface through an underwater acoustic receiver and sends it to the base station by radio; the base station then analyzes and processes the data.
With reference to Figure 2, when $|V_s^{t+1} - V_s^t| \ge \varepsilon$, the source node sends data packets and control packets at the same time. For ease of explanation, this embodiment uses two control packets.
The source node computes Q values with the Q-value iteration function and updates its V value. Based on the result, it selects one node to receive the data packet and two nodes to receive control packets. In this embodiment, the source node selects its neighbor node 3 as the next hop for the data packet and nodes 1 and 5 as the next hops for the control packets.
When nodes 1, 3, and 5 overhear a data packet or control packet, each reads the packet header and records the previous-hop node's information in its own neighbor list. If the packet is addressed to the node, the node computes Q values with the Q-value iteration function, selects the node whose Q value is $Q_{max}$ as the next hop, updates its V value to $Q_{max}$, rewrites the node information in the packet header, and continues transmission.
The neighbors of nodes 1, 3, and 5 repeat these actions until the data packet or control packet reaches the sink.
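The relay behavior described above (overhear, record the previous hop, forward only if addressed) can be sketched as a small Python class. This is an illustration under assumptions: the header fields, the class names `Header` and `Relay`, and the choice to pass in precomputed Q values are all hypothetical, since the text specifies neither the header layout nor the surviving Q formula.

```python
from dataclasses import dataclass, field
from typing import Dict, Optional


@dataclass
class Header:
    sender: int      # node that transmitted this copy of the packet
    next_hop: int    # node the sender addressed the packet to
    sender_v: float  # V value the sender wrote into the header


@dataclass
class Relay:
    node_id: int
    v: float = 0.0
    neighbor_v: Dict[int, float] = field(default_factory=dict)

    def on_packet(self, hdr: Header,
                  q_values: Dict[int, float]) -> Optional[Header]:
        """Handle an overheard data or control packet.

        q_values holds this node's already-computed Q value for each
        candidate next hop (the iteration formula is outside this sketch).
        """
        # Every overheard packet refreshes the neighbor list.
        self.neighbor_v[hdr.sender] = hdr.sender_v
        if hdr.next_hop != self.node_id:
            return None  # overheard only; not addressed to this node
        # Addressed to us: pick the max-Q next hop and set V to Q_max.
        best = max(q_values, key=q_values.get)
        self.v = q_values[best]
        # Rewrite the header with this node's information and forward.
        return Header(sender=self.node_id, next_hop=best, sender_v=self.v)
```

For example, a relay with ID 3 that receives `Header(sender=0, next_hop=3, sender_v=-4.0)` and holds Q values `{6: -2.0, 7: -1.0}` would forward toward node 7 and advertise a V value of -1.0, while a packet addressed to node 5 would only update its neighbor list.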
(6) When $|V_s^{t+1} - V_s^t| < \varepsilon$, the source node stops sending control packets.
Once the source node determines that $|V_s^{t+1} - V_s^t| < \varepsilon$ holds, it stops transmitting control packets and, using the routing table together with the Q-value iteration formula, computes the optimal path and transmits data packets upward until they reach the sink.
Claims (1)
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201811310120.6A CN109362113B (en) | 2018-11-06 | 2018-11-06 | Underwater acoustic sensor network cooperation exploration reinforcement learning routing method |
Publications (2)
Publication Number | Publication Date |
---|---|
CN109362113A | 2019-02-19 |
CN109362113B | 2022-03-18 |
Family
ID=65344072
Family Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201811310120.6A Active CN109362113B (en) | 2018-11-06 | 2018-11-06 | Underwater acoustic sensor network cooperation exploration reinforcement learning routing method |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN109362113B (en) |
Families Citing this family (16)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN110719617B (en) * | 2019-09-30 | 2023-02-03 | 西安邮电大学 | Q Routing Method Based on Arctangent Learning Rate Factor |
CN110868727A (en) * | 2019-10-28 | 2020-03-06 | 辽宁大学 | Optimization method of data transmission delay in wireless sensor network |
CN111629440A (en) * | 2020-05-19 | 2020-09-04 | 哈尔滨工程大学 | A Convergence Judgment Method of MAC Protocol Using Q-learning |
CN112351400B (en) * | 2020-10-15 | 2022-03-11 | 天津大学 | A Routing Policy Generation Method for Underwater Multimodal Networks Based on Improved Reinforcement Learning |
CN112469103B (en) * | 2020-11-26 | 2022-03-08 | 厦门大学 | Underwater sound cooperative communication routing method based on reinforcement learning Sarsa algorithm |
CN112867089B (en) * | 2020-12-31 | 2022-04-05 | 厦门大学 | Underwater sound network routing method based on information importance and Q learning algorithm |
CN112954769B (en) * | 2021-01-25 | 2022-06-21 | 哈尔滨工程大学 | Underwater wireless sensor network routing method based on reinforcement learning |
CN113141592B (en) * | 2021-04-11 | 2022-08-19 | 西北工业大学 | Long-life-cycle underwater acoustic sensor network self-adaptive multi-path routing method |
CN113783782B (en) * | 2021-09-09 | 2023-05-30 | 哈尔滨工程大学 | Opportunity routing candidate set node ordering method for deep reinforcement learning |
CN114828141B (en) * | 2022-04-25 | 2024-04-19 | 广西财经学院 | A multi-hop routing method for UWSNs based on AUV networking |
CN114786236B (en) * | 2022-04-27 | 2024-05-31 | 曲阜师范大学 | Method and device for heuristic learning of routing protocol by wireless sensor network |
CN115175268B (en) * | 2022-07-01 | 2023-07-25 | 重庆邮电大学 | Heterogeneous network energy-saving routing method based on deep reinforcement learning |
CN115987886B (en) * | 2022-12-22 | 2024-06-04 | 厦门大学 | A Q-learning routing method for underwater acoustic networks based on meta-learning parameter optimization |
CN115843083B (en) * | 2023-02-24 | 2023-05-12 | 青岛科技大学 | Underwater wireless sensor network routing method based on multi-agent reinforcement learning |
CN118611781B (en) * | 2024-08-08 | 2024-10-18 | 中山大学 | Underwater network data communication method and system based on reinforcement learning and power control |
CN118869095B (en) * | 2024-08-15 | 2025-04-29 | 中国科学院声学研究所 | Cross-layer routing protocol method for underwater acoustic communication network based on channel quality |
Citations (2)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN106358308A (en) * | 2015-07-14 | 2017-01-25 | 北京化工大学 | Resource allocation method for reinforcement learning in ultra-dense network |
CN107809781A (en) * | 2017-11-02 | 2018-03-16 | 中国科学院声学研究所 | A kind of loop free route selection method of load balancing |
Family Cites Families (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
ITUB20155144A1 (en) * | 2015-10-16 | 2017-04-16 | Univ Degli Studi Di Roma La Sapienza Roma | METHOD OF ADAPTING AND JOINING THE JOURNEY POLICY AND A RETRANSMISSION POLICY OF A KNOT IN A SUBMARINE NETWORK, AND THE MEANS OF ITS IMPLEMENTATION |
Non-Patent Citations (5)
Title |
---|
"AUV-Aided Communication Method for Underwater Mobile Sensor Network";冯晓宁;《IEEE》;20160413;全文 * |
"基于L-π演算的WSN路由协议形式化方法";冯晓宁;《吉林大学学报(工学版)》;20140527;全文 * |
"基于反馈的合作强化学习水下路由算法";卜任菲;《通信技术》;20170810;全文 * |
"多普勒辅助水下传感器网络时间同步机制研究";王卓;《通信学报》;20170125;全文 * |
金志刚."基于指向性换能器水声传感器网络功率控制算法".《华中科技大学学报(自然科学版) 2017-07-14 》.2017,全文. * |
Also Published As
Publication number | Publication date |
---|---|
CN109362113A (en) | 2019-02-19 |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||