CN112911600B

CN112911600B - In-band full duplex cognitive wireless network self-interference suppression method and device

Info

Publication number: CN112911600B
Application number: CN202110062014.6A
Authority: CN
Inventors: 秦航; 杨强
Original assignee: Yangtze University
Current assignee: Yangtze University
Priority date: 2021-01-18
Filing date: 2021-01-18
Publication date: 2023-10-24
Anticipated expiration: 2041-01-18
Also published as: CN112911600A

Abstract

The invention relates to a self-interference suppression method and device for an in-band full duplex cognitive wireless network and a computer readable storage medium, wherein the method comprises the following steps: equipping a full duplex antenna, establishing different cooperative operation modes for in-band full duplex nodes, acquiring the switching and transmission costs of the nodes, and acquiring an interference source subject to an average power constraint; acquiring an in-band full duplex reaction scanning interference source according to the interference source subject to the average power constraint, and determining defense strategies of the nodes in the different cooperative operation modes according to the reaction scanning interference source and the switching and transmission costs of the nodes; and determining an anti-interference strategy according to the defense strategies of the nodes in the different cooperative operation modes, and acquiring, according to the anti-interference strategy, an optimal defense strategy that maximizes the expected return utility and an optimal defense strategy for the average return utility. The method provided by the invention improves the interference suppression capability of the in-band full duplex cognitive wireless network.

Description

In-band full duplex cognitive wireless network self-interference suppression method and device

Technical Field

The invention relates to the technical field of in-band full duplex cognitive wireless networks, in particular to a self-interference suppression method and device for an in-band full duplex cognitive wireless network and a computer readable storage medium.

Background

"duplexing" in a wireless network refers to the ability of two systems to communicate with each other, i.e., both systems are capable of data transmission and reception. By combining in-band full duplex, the channel awareness functionality may be greatly enhanced, allowing users to receive data simultaneously on the same frequency band and/or on different frequency bands while transmitting data. Applications of in-band full duplex systems span different areas, aimed at combining transmitters with continuously perceived receivers, including wireless relay with symmetric transmit and receive loads, bi-directional links for data backhaul, and variable-size network topologies of multiple nodes, etc.

In cognitive wireless networks, dynamic spectrum access policies replace traditional static spectrum management that is inefficient. The in-band full duplex technology is integrated into the cognitive wireless network, so that new dimensions for improving the spectrum utilization rate and the network capacity can be explored, and new network architecture and protocol design are brought. With the development of self-interference suppression/elimination technology, the in-band full duplex cognitive wireless network can reduce data loss to the greatest extent, does not need to interrupt transmission to perform channel sensing, improves the frequency spectrum utilization rate, and increases the overall network capacity. These in-band full duplex advantages come at the cost of increased power consumption and increased hardware complexity. While in-band full duplex radio doubles the throughput of the wireless link, it is more susceptible to interference than out-of-band full duplex. The interference suppression capability of the existing in-band full duplex cognitive wireless network is poor.

Disclosure of Invention

In view of the foregoing, it is necessary to provide a method, an apparatus and a computer readable storage medium for suppressing self-interference of an in-band full duplex cognitive wireless network, which are used for solving the problem of poor interference suppression capability of the existing in-band full duplex cognitive wireless network.

The invention provides a self-interference suppression method for an in-band full duplex cognitive wireless network, which comprises the following steps:

equipping a full duplex antenna, establishing different cooperative operation modes for in-band full duplex nodes, acquiring the switching and transmission costs of the nodes, and acquiring an interference source subject to an average power constraint;

acquiring a reaction scanning interference source of in-band full duplex according to the interference source of average power constraint, and determining defense strategies of different cooperative operation modes of the nodes according to the reaction scanning interference source and the switching and transmission cost of the nodes;

and determining an anti-interference strategy according to the defense strategies of the nodes in different cooperative operation modes, and acquiring an optimal defense strategy corresponding to the expected return maximization utility and an optimal defense strategy of average return utility according to the anti-interference strategy.

Further, establishing different cooperative operation modes for the in-band full duplex nodes specifically includes: establishing a transmission-reception or transmission-detection cooperative operation mode for the in-band full duplex nodes, the transmission-detection cooperative operation mode being a mode in which one node only receives data packets from the other node.

Further, according to the interference source constrained by the average power, acquiring a reaction scanning interference source with full duplex, which specifically includes:

determining a power-limited in-band full duplex reaction scanning interference source according to the interference source subject to the average power constraint, and treating the attack and defense strategies between the reaction scanning interference source and the nodes as an arms race; in each round of the arms race, the interference source scans the channels and the node updates its strategy; if the interference source does not find any action on a channel, it leaves the channel and continues the current scanning period; once the interference source has learned the node's response to the fixed scanning mode in a round of the arms race, it modifies its strategy.

Further, the node updating strategy specifically includes:

the node switches channels in anticipation of an interference attack on the current channel; the interference source identifies the node's action on the channel it interferes with; if the interference source and the node are on the same channel, the interference source continues to interfere until the node hops to a different channel.

Further, determining the defense strategies of the nodes in the different cooperative operation modes according to the reaction scanning interference source and the switching and transmission costs of the nodes specifically includes:

if a transmission failure in the transmission-reception cooperative operation mode is caused by an interference attack, the node will be attacked again, so the node switches to the transmission-detection cooperative operation mode in the next time slot; once the node has fully learned the scanning mode of the reaction scanning interference source, it can avoid interference with minimum switching cost: in each scanning period, the node hops when the interference source scans its channel and operates in the transmission-reception cooperative operation mode on the new channel, whereby the node obtains the highest average throughput;

the node learns the strategy of the scanning interference source, estimates the probability of interference in the current time slot, and actively exits the channel if the probability exceeds a preset value; in each time slot, the node decides whether to keep or leave the current channel and the cooperative operation mode; under the node's optimal policy, the node operates in the transmission-reception cooperative operation mode on a channel and switches to the transmission-detection cooperative operation mode after a set number of time slots.

Further, determining a strategy for resisting interference according to the defense strategies of the nodes in different cooperative operation modes specifically comprises the following steps:

defining a state space and an action space, representing the interaction between the in-band full duplex radio and the interference source as a constrained zero-sum stochastic (Markov) game, adopting frequency hopping as the strategy for resisting interference, and representing each state in the state space as the result of the node's transmission at the end of a time slot;

representing the actions in the action space as: the node stays on the channel used in the previous time slot and operates in the transmission-detection cooperative operation mode; the node hops to a new randomly selected channel and operates in the transmission-detection cooperative operation mode; the node stays on the same channel and operates in the transmission-reception cooperative operation mode; or the node hops to a new randomly selected channel and operates in the transmission-reception cooperative operation mode; the node takes an action according to the current state, and the state transitions follow a Markov chain.

Further, according to the interference-resistant strategy, an optimal defense strategy corresponding to the expected return maximization utility is obtained, which specifically includes:

calculating the returns and transition probabilities of the Markov decision process according to the strategy of the interference source, obtaining the optimal policy of the node and the Bellman equation that maximizes the expected return utility, and obtaining the optimal defense strategy corresponding to the maximized expected return utility by value iteration.
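The value-iteration step can be illustrated with a minimal sketch. The two states, two actions, transition probabilities, rewards and discount factor below are all hypothetical placeholders chosen for illustration; the patent does not give numerical values, and its full model uses four actions and a constrained game rather than this toy Markov decision process.

```python
# Toy value iteration for a node facing a fixed jammer strategy.
# All numbers are illustrative assumptions, not values from the patent.

def q_value(s, a, V, P, R, gamma):
    """One-step lookahead: immediate return plus discounted expected value."""
    return R[s][a] + gamma * sum(p * V[s2] for s2, p in P[s][a].items())

def value_iteration(states, actions, P, R, gamma=0.9, tol=1e-8):
    """Iterate the Bellman optimality update until the values stop changing."""
    V = {s: 0.0 for s in states}
    while True:
        delta = 0.0
        for s in states:
            best = max(q_value(s, a, V, P, R, gamma) for a in actions)
            delta = max(delta, abs(best - V[s]))
            V[s] = best
        if delta < tol:
            break
    policy = {s: max(actions, key=lambda a: q_value(s, a, V, P, R, gamma))
              for s in states}
    return V, policy

states = ["ok", "jammed"]    # hypothetical outcome of the last slot
actions = ["stay", "hop"]    # hypothetical reduced action set
P = {
    "ok":     {"stay": {"ok": 0.7, "jammed": 0.3}, "hop": {"ok": 0.9, "jammed": 0.1}},
    "jammed": {"stay": {"jammed": 1.0},            "hop": {"ok": 0.9, "jammed": 0.1}},
}
R = {
    "ok":     {"stay": 1.0, "hop": 0.6},    # hopping pays a switching cost
    "jammed": {"stay": 0.0, "hop": -0.4},
}
V, policy = value_iteration(states, actions, P, R)
```

In this toy setting the computed policy hops away from a jammed channel, because staying yields no return while the jammer remains on the channel.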

Further, according to the interference-resistant strategy, an optimal defense strategy for average return utility is obtained, which specifically includes:

representing the interaction between the in-band full-duplex node and the interference source as a constrained zero-sum Markov game, wherein the in-band full-duplex node stays on the same channel for a certain number of time slots before hopping, and determining the hopping time and the cooperative operation mode of the node from the equilibrium of the constrained zero-sum Markov game.

The invention also provides a self-interference suppression device of the in-band full-duplex cognitive wireless network, which comprises a processor and a memory, wherein the memory is stored with a computer program, and when the computer program is executed by the processor, the self-interference suppression method of the in-band full-duplex cognitive wireless network according to any one of the technical schemes is realized.

The invention also provides a computer readable storage medium, on which a computer program is stored, which is characterized in that when the computer program is executed by a processor, the method for suppressing self-interference of the in-band full duplex cognitive wireless network according to any one of the technical schemes is realized.

Compared with the prior art, the invention has the beneficial effects that: by providing a full duplex antenna, different cooperative operation modes are established for in-band full duplex nodes, node switching and transmission cost is obtained, and an interference source with average power constraint is obtained; acquiring a reaction scanning interference source of in-band full duplex according to the interference source of average power constraint, and determining defense strategies of different cooperative operation modes of the nodes according to the reaction scanning interference source and the switching and transmission cost of the nodes; determining an anti-interference strategy according to the defense strategies of the nodes in different cooperative operation modes, and acquiring an optimal defense strategy corresponding to the expected return maximization utility and an optimal defense strategy of average return utility according to the anti-interference strategy; the interference suppression capability of the in-band full duplex cognitive wireless network is improved.

Drawings

Fig. 1 is a schematic flow chart of a self-interference suppression method for an in-band full duplex cognitive wireless network provided by the invention;

Fig. 2 is a schematic diagram of an antenna topology with two full duplex functions according to the present application;

Fig. 3 is an architecture diagram of a secondary user base station equipped with a full duplex antenna according to the present application.

Detailed Description

The following detailed description of preferred embodiments of the application is made in connection with the accompanying drawings, which form a part hereof, and together with the description of the embodiments of the application, are used to explain the principles of the application and are not intended to limit the scope of the application.

Example 1

The embodiment of the application provides a self-interference suppression method of an in-band full duplex cognitive wireless network, which is shown in a flow chart in fig. 1, and comprises the following steps:

S1, a full duplex antenna is equipped, different cooperative operation modes are established for in-band full duplex nodes, node switching and transmission cost is acquired, and an interference source with average power constraint is acquired;

S2, acquiring a reaction scanning interference source with full duplex according to the interference source constrained by the average power, and determining defense strategies of different cooperative operation modes of the nodes according to the reaction scanning interference source, the switching and transmission cost of the nodes;

S3, determining an anti-interference strategy according to the defense strategies of the nodes in different cooperative operation modes, and acquiring an optimal defense strategy corresponding to the expected return maximization utility and an optimal defense strategy of the average return utility according to the anti-interference strategy.

Preferably, establishing different cooperative operation modes for the in-band full duplex nodes specifically includes: establishing a transmission-reception or transmission-detection cooperative operation mode for the in-band full duplex nodes, the transmission-detection cooperative operation mode being a mode in which one node only receives data packets from the other node.

It should be noted that the node is equipped with an in-band full duplex radio, and uses time slot transmission, where multiple data packets are transmitted per time slot, and where the transmitter state and the interferer state in the time slot remain unchanged. The node transmits at a fixed power in each time slot, and the interferer produces interference on the channel, degrading the quality of the received signal. Self-interference suppression by base station and user equipment nodes is adopted, and a self-interference suppression factor is defined as the percentage of throughput loss caused by imperfect self-interference suppression, and the value of the self-interference suppression factor depends on the hardware capacity of the nodes for suppressing self-interference.

In one embodiment, the packet detection rate is defined as the ratio of the number of successfully decoded data packets to the number of received data packets, and is used as an indicator of potential interference activity; if the packet detection rate is low, the node switches to the "transmit-detect" mode. In addition to the in-band full duplex "transmit-receive" mode (cooperative operation mode), in which the nodes transmit and receive simultaneously on the link, an additional "transmit-detect" mode is introduced, in which one node only receives data packets from the other node; while one node operates in the "transmit-detect" mode, the other node receives the ambient noise and measures its intensity; if both nodes are under interference attack, either one can operate in the "transmit-detect" mode while the other measures the ambient noise strength on the same channel.

In specific implementation, the cognitive wireless network is considered to communicate in the presence of interference. The node is equipped with an in-band full duplex radio and operates on any one of $m$ non-overlapping channels $C = \{C_1, \dots, C_m\}$, each channel being subject to independent and identically distributed additive white Gaussian noise; transmission is time-slotted, a plurality of data packets are transmitted in each time slot, and the state of the transmitter and the state of the interference source remain unchanged within a time slot; the node transmits with fixed power in each time slot, and the interference source generates interference on the channel, degrading the quality of the received signal; the attenuation (fading) process is described by a two-state GE (Gilbert-Elliott) channel; the channel is in the attenuated state with probability $1-p$; if the channel is not attenuated, the transmission always succeeds in the absence of interference; if the channel is in the attenuated state, the transmission always fails, whether or not interference exists. Self-interference suppression enables the capacity of the cognitive wireless network to be doubled; let $H$ be the throughput of the uplink or downlink in the active state and $\theta_i \in [0,1]$ the self-interference suppression factor at node $i$ (i.e., the percentage of throughput loss due to imperfect self-interference suppression); the net throughput without interference is $(\theta_1 + \theta_2)H$, where the value of $\theta_i$ depends on the hardware capability of the node to suppress self-interference.
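As a minimal sketch of the channel and throughput model described above: the function names are hypothetical, each slot is drawn independently (a simplification of the two-state GE channel), and the outcome "not attenuated but interfered with" is assumed to be a failure, which the text implies but does not state explicitly.

```python
import random

def net_throughput(theta1: float, theta2: float, H: float) -> float:
    """Interference-free net throughput (theta1 + theta2) * H from the text."""
    for theta in (theta1, theta2):
        if not 0.0 <= theta <= 1.0:
            raise ValueError("each self-interference suppression factor must lie in [0, 1]")
    return (theta1 + theta2) * H

def slot_succeeds(p: float, interfered: bool, rng: random.Random) -> bool:
    """Outcome of one time slot.

    The channel is attenuated with probability 1 - p. An attenuated slot
    always fails; a non-attenuated slot succeeds only if there is no
    interference (the interfered case is an assumption of this sketch).
    """
    attenuated = rng.random() >= p
    if attenuated:
        return False
    return not interfered

rng = random.Random(0)
successes = sum(slot_succeeds(0.9, False, rng) for _ in range(1000))
```

With `p = 0.9` and no interference, roughly nine slots in ten succeed; with two nodes at `theta = 1` the net throughput reaches `2 * H`, which corresponds to the doubling of capacity mentioned above.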

A secondary user base station is used in the full-duplex cognitive wireless network to obtain the required spectrum efficiency and performance. Fig. 2 shows an antenna topology with two full duplex functions: the secondary user base station is equipped with antenna 1 and antenna 2 for sensing and transmitting operations, and each secondary user is likewise provided with two antennas; an increase in the power of the transmitting antenna increases the self-interference at the sensing antenna, so the power also serves as a self-interference suppression control factor. The topology comprises two secondary users with full duplex function antennas: in the first case, antenna 1 on secondary user 1 senses signals from the secondary users while antenna 2 transmits signals and generates self-interference; in the second case, antenna 1 transmits the signal and antenna 2 senses the signal.

In the "transmit-detect" mode, one node acts as a receiver and the other node simultaneously transmits and receives (listens); the packet detection rate is defined as the ratio of the number of successfully decoded data packets to the number of received data packets, and is used as an indicator of potential interference activity (if the packet detection rate is low, the node switches to the "transmit-detect" mode); the node evaluates the link quality by measuring the received signal strength; if the received signal strength is high and the packet detection rate is low, interference exists; if both the received signal strength and the packet detection rate are low, the degradation of link quality is attributed to attenuation; the packet detection rate is measured over one frame time, and the received signal strength is measured only when the packet detection rate stays below a certain threshold for a short time; the duration of the sampling window of the received signal strength is adjusted according to the traffic rate, the measurement accuracy and the detection confidence.
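The decision rule in this paragraph can be sketched as follows. The threshold values and the function names are hypothetical; the patent only speaks of "low" and "high" values without quantifying them.

```python
def packet_detection_rate(decoded: int, received: int) -> float:
    """Ratio of successfully decoded packets to received packets."""
    if received <= 0:
        raise ValueError("at least one received packet is required")
    return decoded / received

def diagnose_link(pdr: float, rss_dbm: float,
                  pdr_threshold: float = 0.5,         # assumed value
                  rss_threshold_dbm: float = -80.0    # assumed value
                  ) -> str:
    """Classify the link state from packet detection rate and signal strength."""
    if pdr >= pdr_threshold:
        return "ok"
    # Low packet detection rate: the received signal strength tells the causes apart.
    if rss_dbm >= rss_threshold_dbm:
        return "interference"   # strong signal but packets cannot be decoded
    return "attenuation"        # weak signal and packets cannot be decoded

state = diagnose_link(packet_detection_rate(2, 10), rss_dbm=-60.0)
```

Measuring the received signal strength only after the packet detection rate has dropped, as the text describes, corresponds to evaluating the second branch lazily.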

A cooperative working scheme with two operation modes, "transmit-receive" and "transmit-detect", is established; to avoid interference effectively, the node must perform interference detection, as shown in fig. 2; in the "transmit-receive" mode, the nodes transmit and receive simultaneously on the link; in the "transmit-detect" mode, one node only receives data packets from the other node; while one node operates in the "transmit-detect" mode, the other node receives the ambient noise and measures its intensity; if both nodes are interfered with, either one can operate in the "transmit-detect" mode while the other measures the ambient noise intensity on the same channel; if an interference source can attack only one node and not the other, i.e., the interference source is located near one node but is "hidden" from the other, then the packet detection rate of the node next to the interference source is low; if the other node is operating in the "transmit-detect" mode, the ambient noise power can be measured; the interference detection strategy can be adapted to other situations: for example, if the interference source attacks only one node and not the other, only one node has a higher packet detection rate, and the node with the low packet detection rate can switch modes to facilitate attack detection by the other node.

In one embodiment, when the node hops between channels it must reconfigure its radio frequency components, which takes a device-dependent setup time and results in throughput loss; additional loss arises from the lack of synchronization between the transition times of the transmitter and the receiver; the total cost consists of the switching cost, which is the average throughput loss due to frequency hopping, and the transmission cost, which is the average throughput loss due to interference. To re-establish communication after interference, control packets must be exchanged, which does not increase throughput; each interference source attacks a channel in a time slot, observes the actions of the legitimate node, and learns from the result of the attack; the interference sources coordinate with each other by attacking non-overlapping channels, which increases the probability of success; such a multi-interference-source attack is equivalent to a single interference source that attacks multiple channels in sequence within one time slot.

The interference model comprises an active interference source and a reactive interference source; at the beginning of a time slot, the in-band full duplex reactive interference source continuously transmits Gaussian white noise on a channel while using its in-band full duplex function to listen for the node's action on that channel; if the interference source detects a transmission action, it continues to attack this channel until no transmission is detected; if the interference source does not detect any node action for a period of time, it attacks other channels; in the power-limited interference environment, the interference source is equipped with a limited number of radios; in each time slot the interference source may choose whether or not to interfere, and more than two power levels may be used for finer power control.

In particular embodiments, to re-establish communication after interference, control packets must be exchanged, which does not increase throughput. The optimal defense strategy of the transmitter needs to consider the switching cost and the transmission cost jointly; the interference source attacks channels in the channel set $C$ in a time slot, observes the actions of the legitimate node, and learns from the result of the attack; an interference source that detects action on a channel attacks the same channel together with the other interference sources, so that the link quality is degraded as much as possible; this multi-interference-source attack is equivalent to a single interference source that attacks $n$ channels in sequence within one time slot; thus, a single interference source that attacks $n$ channels in sequence within a time slot, with $n < m$, is considered; the interference source must attack a channel for a sufficiently long time to be effective, since otherwise the attacked node easily recovers the lost data packets from the short interruption. Fig. 3 shows the architecture of a secondary user base station equipped with full duplex antennas; the secondary user base station has a full duplex function, and realizes energy-limited spectrum sensing and transmission-oriented self-interference suppression; the secondary user base station is responsible for spectrum sensing and allocates idle channels to secondary user 1, secondary user 2 and secondary user 3 of the base station, so that spectrum efficiency is achieved by exploiting the power-throughput tradeoff.

It should be noted that the active interference source varies its power to satisfy various constraints; the reactive interference source exhibits more complex behavior, transmitting power only when a legitimate transmission is detected; at the beginning of a time slot, the interference source continuously transmits Gaussian white noise on a channel while using its in-band full duplex function to listen for the node's action on that channel; the interference source monitors the channel activity and interferes only when activity is detected, so that interference power is saved, and the effectiveness is reduced; if the interference source detects a transmission action, it continues to attack the channel until no transmission action is detected; if the interference source does not detect any node activity for a period of time, it attacks other channels.

In the power-limited interference environment, the interference source is equipped with a limited number of radios and attacks $n$ channels in each time slot with maximum power $P_M$; the interference source has average power $P_A$, with $P_A \le P_M$. Due to the power constraint, the interference source cannot transmit in all time slots; in each time slot it chooses between interfering and not interfering, i.e., its power is taken from $P_K = \{0, P_M\}$. Let $\phi_{P_M}$ and $\phi_0$ be the probabilities that the interference source uses power $P_M$ and power $0$, respectively, in a time slot. The interference source may adopt more than two power levels for finer power control. The strategy space of the interference source is the set of available power distributions $(\phi_0, \phi_{P_M})$ that satisfy the average power constraint; let $\Phi$ denote this set of feasible interference strategies $\phi$. The average power constraint applies only to the interference source and not to the node.
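A feasibility check for an interference strategy might look like the sketch below. The explicit condition used here (the two probabilities sum to one and the expected power does not exceed the average power) is an assumption reconstructed from the wording, since the defining formula for the feasible set did not survive in the text; the function names are hypothetical.

```python
def expected_power(phi_0: float, phi_pm: float, p_max: float) -> float:
    """Expected per-slot power of a two-level strategy over {0, P_M}."""
    return phi_0 * 0.0 + phi_pm * p_max

def is_feasible(phi_0: float, phi_pm: float, p_max: float, p_avg: float,
                tol: float = 1e-9) -> bool:
    """Assumed membership test for the set of feasible interference strategies."""
    is_distribution = (phi_0 >= 0.0 and phi_pm >= 0.0
                       and abs(phi_0 + phi_pm - 1.0) <= tol)
    within_budget = expected_power(phi_0, phi_pm, p_max) <= p_avg + tol
    return is_distribution and within_budget

def max_interference_probability(p_max: float, p_avg: float) -> float:
    """Largest phi_PM allowed by the assumed average power constraint."""
    return min(1.0, p_avg / p_max)
```

For example, with `p_max = 2.0` and `p_avg = 1.0` the interference source can be active in at most half of the time slots under this reading of the constraint.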

Preferably, acquiring the in-band full duplex reaction scanning interference source according to the interference source subject to the average power constraint specifically includes:

In one embodiment, the attack and defense strategies between the interference source and the node follow an arms race in the game-theoretic sense; the optimal attack or defense strategy of each side depends on the strategy adopted by its adversary, and the node hops between channels to avoid interference; if the interference source attacks channels at random, all channels are equally vulnerable whether or not the node stays on the same channel; if the interference source follows a deterministic scanning mode, the node can avoid the attacked channels in a given time slot and thus resist the interference attack effectively; once the interference source becomes aware of the node's response, it randomizes the scanning mode after completing a scanning period.

Preferably, the node updating policy specifically includes: the node switches channels in anticipation of an interference attack on the current channel; the interference source identifies the node's action on the channel it interferes with; if the interference source and the node are on the same channel, the interference source continues to interfere until the node hops to a different channel.

In one particular embodiment, considering the power-limited in-band full duplex "reaction-scanning" interference source, the arms race is divided into "rounds"; in each round of the arms race, the interference source scans the channels and the node updates its strategy; once the node is interfered with in a given scanning period, it turns off its transmitter; the interference source, finding no action on the channel, leaves the channel and continues the current scanning period; the node then resumes transmission on the same channel and is not attacked again until the current scanning period is over; when the gain from hopping is small, the node prefers to stay on the same channel and tolerate the throughput loss caused by interference attacks, which is smaller than the switching loss; under the updated node policy, the node does not stay on the same channel but switches channels in anticipation of an interference attack on the current channel.

During specific implementation, the attack and defense strategies between the interference source and the node follow an arms race in the game-theoretic sense; the optimal attack or defense strategy of each side depends on the strategy of its adversary. Knowing that the node hops between channels to avoid interference, the interference source can adopt a simple attack strategy: in each time slot it randomly selects, with equal probability, $n$ of the $m$ channels to attack; all channels are then equally vulnerable, and whether the node hops or stays on the same channel, the probability of being attacked in each time slot is $n/m$. Anticipating the node's response, the interference source instead traverses the $m$ channels sequentially, interfering with $n$ channels in each time slot and then moving on to the next $n$ channels; if the interference source follows such a deterministic scanning mode, the node can avoid the channels attacked in a given time slot and counter the interference effectively; once the interference source becomes aware of the node's response, it further randomizes the scanning mode after finishing a scanning period.
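The two attack strategies in this paragraph can be contrasted with a short sketch. The helper names are hypothetical, channels are indexed from 0, and the handling of a final, smaller channel group when $n$ does not divide $m$ is an assumption.

```python
from fractions import Fraction

def random_attack_probability(m: int, n: int) -> Fraction:
    """Per-slot attack probability n/m when n of m channels are chosen uniformly."""
    if not 0 < n < m:
        raise ValueError("expected 0 < n < m")
    return Fraction(n, m)

def sweep_schedule(m: int, n: int) -> list:
    """Deterministic scan: slot k attacks channels k*n .. k*n + n - 1."""
    return [list(range(start, min(start + n, m))) for start in range(0, m, n)]

def attacked_slot(channel: int, n: int) -> int:
    """Slot of the scanning period in which a given channel is attacked."""
    return channel // n

schedule = sweep_schedule(6, 2)
```

Because the deterministic schedule is predictable, a node that knows it can vacate a channel exactly in the slot returned by `attacked_slot`, which is why the text has the interference source randomize the scan afterwards.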

For the power-limited in-band full duplex "reaction-scanning" interference source, the arms race is divided into "rounds"; the interference source scans the channel group by simultaneously interfering with a portion of the channels; in each round of the arms race, the interference source scans and the node updates its strategy. Once the node is interfered with in a given scanning period, it turns off its transmitter; the interference source, finding no action on the channel, leaves the channel and continues the current scanning period; the node then resumes transmission on the same channel and is not attacked again until the current scanning period is over. The node is thus attacked at most once in each scanning period, and the average throughput of a node operating in the "transmit-receive" mode is $2(m-n)\theta H/m$; if the transmitter uses frequency hopping, the throughput is improved by at most $2\theta n H/m$, while channel switching causes throughput loss; when this gain is small, the node prefers to stay on the same channel and tolerate the throughput loss caused by interference attacks, which is smaller than the switching loss.
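The two throughput expressions can be evaluated directly. In the sketch below, the function names are hypothetical and the simple comparison of gain against switching loss is an assumed reading of the stay-or-hop trade-off described above.

```python
def stay_throughput(m: int, n: int, theta: float, H: float) -> float:
    """Average throughput 2(m - n) * theta * H / m when the node never hops."""
    return 2 * (m - n) * theta * H / m

def max_hopping_gain(m: int, n: int, theta: float, H: float) -> float:
    """Upper bound 2 * theta * n * H / m on the improvement from frequency hopping."""
    return 2 * theta * n * H / m

def hopping_worthwhile(m: int, n: int, theta: float, H: float,
                       switching_loss: float) -> bool:
    """Assumed rule: hop only if the best-case gain exceeds the switching loss."""
    return max_hopping_gain(m, n, theta, H) > switching_loss

base = stay_throughput(10, 2, 0.9, 5.0)
gain = max_hopping_gain(10, 2, 0.9, 5.0)
```

The two quantities add up to $2\theta H$, the interference-free throughput of two nodes with equal suppression factor $\theta$, so the bound on the gain is exactly the throughput lost to the scan.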

In each round, once the interference source becomes aware of the node's response to the fixed scanning mode, it can modify its strategy and use a randomized scanning mode in the new scanning period; the node, in turn, does not stay on the same channel but switches channels in anticipation of an interference attack on the current channel, and the interference source identifies the node's action on the channel it interferes with; if the interference source and the node are on the same channel, the interference source continues to interfere until the node hops to a different channel; an interference source adopting this strategy in each round of the arms race is called an in-band full duplex "reaction-scanning" interference source; in each time slot, the interference source attacks a selected channel and observes the activity on that channel; if action is observed, it stays on the channel and continues to detect the node's action; otherwise, it selects the next channel, and starts a new scanning period once the current one ends.
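The behavior of the "reaction-scanning" interference source can be sketched as a small state machine. The class below is a hypothetical illustration that attacks one channel per slot; the random permutation used for each scanning period and the `None` convention for a silent node are assumptions of this sketch.

```python
import random

class ReactiveScanningInterferer:
    """Attacks one channel per slot; stays while the node is active there."""

    def __init__(self, m: int, rng: random.Random):
        self.m = m
        self.rng = rng
        self.current = None     # channel being held, if any
        self._pending = []      # channels left in the current scanning period

    def _next_channel(self) -> int:
        if not self._pending:   # period finished: start a new randomized one
            self._pending = list(range(self.m))
            self.rng.shuffle(self._pending)
        return self._pending.pop()

    def step(self, node_channel):
        """Return the channel attacked this slot.

        node_channel is the channel the node transmits on, or None if silent.
        """
        if self.current is None:
            self.current = self._next_channel()
        attacked = self.current
        if node_channel != attacked:
            self.current = None   # no action observed: move on next slot
        return attacked

interferer = ReactiveScanningInterferer(4, random.Random(0))
trace = [interferer.step(0) for _ in range(8)]   # node never leaves channel 0
```

A node that never hops is found within one scanning period and is then attacked in every subsequent slot, which illustrates why the node has to hop or fall silent.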

Preferably, determining a defense strategy of different cooperative operation modes of the node according to the response scanning interference source, the switching and transmission cost of the node specifically includes:

If an interference attack causes a transmission failure in the "transmit-receive" cooperative operation mode and the node is attacked again, the node switches to the "transmit-detect" cooperative operation mode in the next time slot. When the node fully learns the scanning pattern of the reaction scanning interference source, it can avoid interference with minimum switching cost: in each scanning period, the node hops when the interference source scans its channel and operates in the "transmit-receive" cooperative operation mode on the new channel, whereby the node obtains the highest average throughput.

In one embodiment, when the packet detection rate is low, the node switches to the "transmit-detect" mode to determine the cause of the failure before acting. If an interference attack causes a transmission failure in the "transmit-receive" mode and the node is attacked again, the node switches to the "transmit-detect" mode in the next time slot; if the node is attacked in the "transmit-detect" mode, it learns the cause of the transmission failure and leaves the channel. When the node fully learns the scanning pattern of the interference source, it can avoid interference with minimum switching cost: in every scanning period, the node hops when the interference source scans its channel and operates in the "transmit-receive" mode on the new channel. This process is repeated in each round, and the node obtains the highest average throughput.

The node adopts a policy that achieves the average throughput of each round so that regret is minimized. The node learns the strategy of the scanning interference source, estimates the likelihood of interference in the current time slot, and proactively leaves the channel. In each time slot, the node decides whether to stay on or leave the current channel and which operation mode to use. The best policy of the node is to operate in the "transmit-receive" mode on the channel and to switch to the "transmit-detect" mode after some time slots in order to detect whether there is interfering action.

In a specific implementation, if the packet detection rate of the node is low, hopping is not necessarily the best strategy: the new channel may also be fading, and switching channels causes throughput loss. If the node verifies that the interference source is present, it must hop to another channel, or it will be interfered with continuously. Therefore, when the packet detection rate is low, the node switches to the "transmit-detect" mode to determine the cause of the failure before acting: if interference causes a transmission failure in the "transmit-receive" mode and the node is attacked again, it switches to the "transmit-detect" mode in the next time slot; if the node is attacked in the "transmit-detect" mode, it leaves the channel.
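The mode-switching rule described above can be written as a small decision function. The sketch below is one possible reading under stated assumptions: the node moves to "transmit-detect" on any failure seen in "transmit-receive", hops and resumes "transmit-receive" when interference is confirmed, and returns to "transmit-receive" on the same channel when the failure turns out not to be interference. The names `Mode` and `next_move` are hypothetical.

```python
from enum import Enum
from typing import Tuple


class Mode(Enum):
    TR = "transmit-receive"
    TD = "transmit-detect"


def next_move(mode: Mode, failed: bool, jamming_detected: bool) -> Tuple[Mode, bool]:
    """Return (mode for the next slot, whether to hop to another channel)."""
    if mode is Mode.TR:
        # In TR the cause of a failure is unknown, so probe before paying for a hop.
        return (Mode.TD, False) if failed else (Mode.TR, False)
    if jamming_detected:
        # Interference confirmed in TD: leave the channel.
        return Mode.TR, True
    # Assumed: the failure was due to fading, so stay and resume TR.
    return Mode.TR, False
```

Probing in "transmit-detect" first avoids paying the switching cost for failures that are caused by fading rather than by the interference source.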

If the node is attacked in the "transmit-receive" mode, the links must be re-established and the throughput loss is $2TC$; in the "transmit-detect" mode, the throughput loss due to interference is $TC$, since the node only needs to re-establish one link. Operating in the "transmit-receive" mode brings a higher throughput of $2\theta H$, but interference cannot be detected and the transmission loss is high. Let $k$ be the round index, and define the return of the node in round $k$

in which the brackets denote the success, interference, and hop events, respectively; the indicator in a bracket is 1 if the event occurs and 0 otherwise. When the node is attacked or hops, the return is negative; the worst case is that an interference attack causes a throughput loss for both nodes.

When the node fully learns the scanning pattern of the interference source, it avoids interference with minimum switching cost: in every scanning period, the node hops when the interference source scans its channel and operates in the "transmit-receive" mode on the new channel. If the node hops to a channel that the interference source has already scanned, the node is no longer interfered with. Repeating this process, the node obtains the highest average throughput per round, $H_n = p(2\theta H) - (1-p)(2TC) - 2nSC/m$, where $p$ is the probability that the channel is not fading.
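As a quick numerical illustration, the expression for $H_n$ can be evaluated directly. The function name and the example values below are assumptions chosen only to show how the three terms (useful throughput, loss when the channel fades, and switching cost spread over the scanning period) combine.

```python
def avg_throughput_known_scan(p: float, theta: float, H: float,
                              TC: float, SC: float, m: int, n: int) -> float:
    """H_n = p(2*theta*H) - (1-p)(2*TC) - 2*n*SC/m, as stated in the text."""
    useful = p * (2 * theta * H)      # both directions succeed when the channel is not fading
    fade_loss = (1 - p) * (2 * TC)    # transmission cost paid when the channel fades
    switch_loss = 2 * n * SC / m      # switching cost averaged over the scanning period
    return useful - fade_loss - switch_loss


# Example with assumed values: p = 0.9, theta = 0.5, H = 10, TC = 1, SC = 2, m = 10, n = 1
example = avg_throughput_known_scan(0.9, 0.5, 10.0, 1.0, 2.0, 10, 1)
```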

The node adopts a policy to achieve the average throughput of each round. Let $\xi$ be the policy, $H_\xi(k)$ the throughput of the $k$-th round under that policy, and $T$ the period over which the regret of policy $\xi$ is defined. The goal of the node is to adopt a policy that minimizes this regret. The node learns the strategy of the scanning interference source, estimates the likelihood of interference in the current time slot, and proactively decides to leave the channel. In each time slot, the node decides whether to stay on or leave the current channel and which operation mode to use; the best policy of the node is to operate in the "transmit-receive" mode on the channel and to switch to the "transmit-detect" mode after some time slots to detect whether there is interfering action. The node leaves the current channel when the interference source is detected on, or arrives at, the current channel, and the node's current decision affects the throughput in subsequent time slots.

Preferably, determining a policy against interference according to the defense policies of the nodes in different cooperative operation modes specifically includes:

In one embodiment, the interaction between the in-band full duplex radio and the interference source is represented as a Markov game, and the node adopts frequency hopping as its strategy against interference attacks. The state represents the outcome of the node's transmission at the end of a time slot, and the state space contains states with detected interference and states with undetected interference. The node observes the current state and takes an action at the end of each time slot; on observing a failure in the "transmit-receive" mode, it either hops or operates in the "transmit-detect" mode. The node takes actions according to the current state, the state evolves according to a Markov chain, and the transition probabilities of all possible state-action pairs are calculated. The longer the node has succeeded on a channel, the higher the chance that a hop to a new channel succeeds, because the interference source has by then scanned more of the channels on which the node is not operating. For a given channel dwell time, the node balances the probability that the current channel is interfered with against the probability that it is not. The policy of each participant determines the action taken in each state; under a stationary Markov strategy, the node takes actions based on the current state and follows the same policy in every time slot.

In a specific implementation, a state space and an action space are defined, the interaction between the in-band full duplex radio and the interference source is expressed as a constrained zero-sum stochastic Markov game, and frequency hopping is adopted as the strategy against interference. The node operates on channel $c$ but does not know which channel the interference source is scanning; if the node has succeeded on channel $c \in C$ for $m$ slots, it infers that the interference source has not scanned channel $c$ during the last $m$ slots.

The state represents the outcome of the node's transmission at the end of a time slot. The state space $S$ contains two types of states, namely detected interference and undetected interference. The former contains a single state $R$, which indicates a transmission failure due to an interference attack (interference detected); the node learns of such a failure in the "transmit-detect" mode, and the transmitter is in state $R$ only in that mode. The latter consists of two subclasses, $X$ and $Y$. State $x_m \in X$ denotes that the node has remained on one channel for $m$ consecutive slots (counted from the last hop to the channel, including the failed transmission and the current slot), that no interference has been explicitly detected, and that the transmission succeeded; state $y_m \in Y$ denotes that the node has remained on one channel for $m$ consecutive slots, that no interference has been explicitly detected, and that the transmission failed. Both subclasses have m to-1 states: n channels are interfered with in each slot, so the node remains on the same channel for at most m to-1 slots without interference. State $x_m$ marks a successful transmission in the current slot and state $y_m$ a failed transmission in the current slot. Let $s \in S$ denote a generic state; the current state of the Markov chain is visible only to the node.

The node may use the actions $Q = \{(u, \text{transmit-detect}), (v, \text{transmit-detect}), (u, \text{transmit-receive}), (v, \text{transmit-receive})\}$. The node observes the current state and takes an action at the end of each slot: (1) action $u_1 = (u, \text{transmit-detect})$ means that the node stays on the channel used in the last slot and operates in the "transmit-detect" mode; (2) action $v_1 = (v, \text{transmit-detect})$ means that the node hops to a new, randomly selected channel and operates in the "transmit-detect" mode; (3) action $u_2 = (u, \text{transmit-receive})$ means that the node stays on the same channel and operates in the "transmit-receive" mode; (4) action $v_2 = (v, \text{transmit-receive})$ means that the node hops to a new, randomly selected channel and operates in the "transmit-receive" mode. After a failure in the "transmit-receive" mode (state $y_m$), the node either hops (using $v_1$ or $v_2$) or operates in the "transmit-detect" mode, but it does not remain in the "transmit-receive" mode on the same channel (action $u_2$ is not used in $y_m$); if the channel is being interfered with, the node can thus leave state $y_m$ quickly.
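The action set and the restriction in state $y_m$ can be written down compactly. The following Python sketch is illustrative only; the string labels and the helper `allowed_actions` are hypothetical, and leaving the states of type $x$ and $R$ unrestricted is an assumption, since the text only states the restriction for $y_m$.

```python
from typing import List, Tuple

Action = Tuple[str, str]  # (channel move, operation mode)

CHANNEL_MOVES = ("u", "v")  # u: stay on the same channel, v: hop to a random new channel
MODES = ("transmit-detect", "transmit-receive")

# Ordered as in the text: u1, v1, u2, v2
Q: List[Action] = [(move, mode) for mode in MODES for move in CHANNEL_MOVES]
U1, V1, U2, V2 = Q


def allowed_actions(state_kind: str) -> List[Action]:
    """state_kind is 'x' (success), 'y' (undiagnosed failure) or 'R' (interference detected)."""
    if state_kind == "y":
        # After a failure in transmit-receive, u2 (stay and keep transmit-receive) is excluded.
        return [a for a in Q if a != U2]
    return list(Q)  # assumption: no restriction stated for other states
```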

Let $q \in Q$ denote a generic action, and let $V(s, s', q_1, q_2)$ be the return to the transmitter when it takes action $q_1 \in Q$, the interference source takes action $q_2 \in P_R$, and the state transitions from $s \in S$ to $s' \in S$. Under different conditions the return takes the values $V(s, s', q_1, q_2) = H$, $V(s, s', q_1, q_2) = -TC$, and $V(s, s', q_1, q_2) = -TC - SC$.

The node takes actions according to the current state, and the state evolves according to a Markov chain. Let $P(s' \mid s, q_1, q_2)$ be the probability that the node moves from state $s \in S$ to state $s' \in S$ when the node takes action $q_1 \in Q$ and the interference source takes action $q_2$; the transition probabilities of all possible state-action pairs are calculated. The longer the node has succeeded on a channel, the higher the chance that a hop to a new channel succeeds, because the interference source has by then scanned more of the channels on which the node is not operating, so the new channel may already have been scanned when the node hops to it. At the same time, the longer the node remains on its channel in the current scanning period, the greater the likelihood that this channel is interfered with. The node should therefore balance, for a given channel dwell time, the probability that the current channel is interfered with against the probability that it is not.

The policy of each participant determines the action taken in each state. A stationary Markov strategy is adopted: the node takes actions according to the current state and follows the same policy in every time slot. Let $\psi(Q)$ be the set of distributions on $Q$; a policy is a map $\xi: S \to \psi(Q)$ with $\xi(s) = \{\xi(s, q_1),\ q_1 \in Q\}$, where $\xi(s, q_1)$ is the probability that the node selects action $q_1 \in Q$ in state $s \in S$. Let $\Xi$ be the family of policies. The interference policy is defined as $j: S \to \psi(P_R)$ with $j(s) = \{j(s, q_2),\ q_2 \in P_R\}$, where $j(s, q_2)$ is the probability of selecting action $q_2 \in P_R$ in state $s \in S$. The interference source is agnostic to the state; for any interference strategy $\varphi = (\varphi_0, \varphi_{P_M}) \in \Phi$, there is

Preferably, according to the interference-resistant policy, an optimal defense policy corresponding to the expected return maximizing utility is obtained, which specifically includes:

In one embodiment, the goal of the full duplex link is to maximize the expected return of the node, and the goal of the interference source is to minimize it. The policy space of the interference source is constrained while that of the node is not, and the policy pair satisfies a constrained Nash equilibrium. The policy of the interference source is independent of the state, since the interference source knows the state of the node only when the node is interfered with. For a fixed interference strategy, the average return of the node in each state is obtained for each game action; because the strategy of the interference source is fixed and independent of the state, the optimal policy of the node is obtained by calculating the returns and transition probabilities of the Markov decision process induced by that strategy. The best defense strategy is obtained using value iteration, based on the Bellman equation of the expected-utility maximization problem.

In a specific implementation, starting from state $s$, the immediate return of the transmitter is $h(s, q_1, q_2)$, with $h: S \times Q \times P_R \to \mathbb{R}$, where $q_1$ and $q_2$ are the actions taken by the node and the interference source, respectively, and $\eta$ is the discount factor. For a policy $\xi \in \Xi$ and an interference policy $\varphi \in \Phi$, the expected return of a node with initial state $s \in S$ is denoted $R(s, \xi, \varphi)$. Here $\{(S_k, Q_{1k}, Q_{2k}),\ k = 1, 2, \ldots\}$ is a sequence of random variables representing the state and the actions of the node and the interference source in each slot; the sequence varies with the initial state and the policy pair $(\xi, \varphi)$, and the operator $E^{\xi,\varphi}$ is the expectation under $(\xi, \varphi)$.

The goal of the full duplex link is to select a policy $\xi$ that, starting from any initial state $s \in S$, maximizes the expected return $R(s, \xi, \varphi)$ of the node; the goal of the interference source is to select a policy $\varphi$ that minimizes the node's expected discounted return $R(s, \xi, \varphi)$. The policy space of the interference source is constrained, while that of the node is not. A policy pair $(\xi^*, \varphi^*)$ is a constrained Nash equilibrium if it meets two conditions: $\varphi^* \in \Phi$, and an equilibrium condition that holds for all $s \in S$, $\xi \in \Xi$, and $\varphi \in \Phi$.
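For orientation, the quantities named above are commonly written as follows. This is a sketch in standard discounted zero-sum game notation and is not taken from the source formulas: the discount exponent $k-1$, the range $0 < \eta < 1$, and the saddle-point form of the equilibrium condition are assumptions consistent with the surrounding definitions.

```latex
% Expected discounted return of the node (assumed standard form)
R(s,\xi,\varphi) \;=\; E^{\xi,\varphi}\!\left[\,\sum_{k=1}^{\infty} \eta^{\,k-1}\,
    h\!\left(S_k, Q_{1k}, Q_{2k}\right) \;\middle|\; S_1 = s \right],
    \qquad 0 < \eta < 1

% Constrained Nash equilibrium as a saddle point (assumed standard form)
R(s,\xi,\varphi^{*}) \;\le\; R(s,\xi^{*},\varphi^{*}) \;\le\; R(s,\xi^{*},\varphi)
    \qquad \text{for all } s \in S,\ \xi \in \Xi,\ \varphi \in \Phi
```

Under this reading, neither the node nor the interference source can improve its own outcome by deviating unilaterally from $(\xi^*, \varphi^*)$.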

The interference source cannot observe the node state, so its policy is independent of the state, and the zero-sum game has a stationary constrained Nash equilibrium. The optimal strategy of the interference source is to attack the node as much as possible with all of its energy, i.e., to transmit power on the selected channel with probability $P_A/P_M$ in each slot. For a fixed interference strategy $\varphi \in \Phi$, the average return of the node in state $s$ for game action $q_1$ is $h_\varphi(s, q_1) = \varphi_0 h(s, q_1, 0) + \varphi_1 h(s, q_1, P_M)$, and the probability of a transition to state $s'$ is $P_\varphi(s' \mid s, q_1) = \varphi_0 P(s' \mid s, q_1, 0) + \varphi_1 P(s' \mid s, q_1, P_M)$. Since the strategy of the interference source is fixed and does not depend on the state, the returns and transition probabilities of the Markov decision process are calculated for the strategy of the interference source, in particular for its optimal strategy, in order to obtain the optimal policy of the node. The node's optimal policy against a given interference strategy $\varphi$ is derived as the optimal policy of this Markov decision process and is a deterministic strategy; for the optimal interference strategy, the dependence of $h_\varphi$, $P_\varphi$, and the related quantities on $\varphi$ can be dropped from the notation.
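The averaging over the interference strategy can be illustrated with a short Python sketch. It assumes a two-point strategy in which the interference source is idle with probability $\varphi_0$ and transmits at $P_M$ with probability $\varphi_1 = P_A/P_M$ (capped at 1); the dictionary layout and the function name `mix_over_jammer` are hypothetical.

```python
from typing import Dict, Hashable, Tuple

Key = Tuple[Hashable, Hashable]          # (state s, node action q1)
Dist = Dict[Hashable, float]             # next state -> probability


def mix_over_jammer(h_idle: Dict[Key, float], h_jam: Dict[Key, float],
                    P_idle: Dict[Key, Dist], P_jam: Dict[Key, Dist],
                    P_A: float, P_M: float):
    """Return (h_phi, P_phi): returns and transitions averaged over the interferer's action."""
    phi1 = min(1.0, P_A / P_M)           # probability that the interferer transmits
    phi0 = 1.0 - phi1                    # probability that it stays idle
    h_phi = {k: phi0 * h_idle[k] + phi1 * h_jam[k] for k in h_idle}
    P_phi: Dict[Key, Dist] = {}
    for k in P_idle:
        next_states = set(P_idle[k]) | set(P_jam[k])
        P_phi[k] = {s2: phi0 * P_idle[k].get(s2, 0.0) + phi1 * P_jam[k].get(s2, 0.0)
                    for s2 in next_states}
    return h_phi, P_phi
```

Once the interferer's action has been averaged out in this way, the node faces an ordinary single-agent Markov decision process.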

This yields the Bellman equation of the expected-utility maximization problem
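For reference, a Bellman equation of the usual discounted form, written with the averaged return $h(s,q)$ and transition probability $P(s' \mid s, q)$ after the dependence on $\varphi$ has been dropped, would read as below. This is the textbook form and is given as an assumption; the source formula is not reproduced in the text.

```latex
R(s) \;=\; \max_{q \in Q} \Big\{\, h(s,q) \;+\; \eta \sum_{s' \in S} P(s' \mid s, q)\, R(s') \Big\},
    \qquad s \in S
```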

The optimal defense strategy is obtained using value iteration, and the optimal policy $\xi^*$ satisfies the following. There exist constants $m^* \in \{1, \ldots, m-1\}$ and $i^* \in \{1, 2\}$ that characterize the policy, and there exists a constant $m_1$ such that $\xi(x_m) = u_2$ for all $1 \le m \le m_1$ and $\xi(x_m) = u_1$ on a complementary range of $m$; moreover, $R(x_m) \ge R(x_{m+1})$ for all $m \le m^*$, and $R(x_m) < R(x_{m+1})$ for $m \ge m^*$;
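Value iteration itself can be sketched generically. The code below is a minimal illustration on an arbitrary finite Markov decision process, not the patented procedure: the data layout (`h[(s, q)]` for returns, `P[(s, q)]` for next-state distributions), the stopping tolerance, and the toy two-state example are all assumptions.

```python
from typing import Dict, Hashable, List, Tuple

State = Hashable
Action = Hashable


def backup(s, q, R, h, P, eta: float) -> float:
    """One-step lookahead: h(s,q) + eta * sum_s' P(s'|s,q) R(s')."""
    return h[(s, q)] + eta * sum(p * R[s2] for s2, p in P[(s, q)].items())


def value_iteration(states: List[State], actions: Dict[State, List[Action]],
                    h: Dict[Tuple[State, Action], float],
                    P: Dict[Tuple[State, Action], Dict[State, float]],
                    eta: float, tol: float = 1e-9, max_iter: int = 10_000):
    """Return (value function R, greedy deterministic policy)."""
    R = {s: 0.0 for s in states}
    for _ in range(max_iter):
        new_R = {s: max(backup(s, q, R, h, P, eta) for q in actions[s]) for s in states}
        delta = max(abs(new_R[s] - R[s]) for s in states)
        R = new_R
        if delta < tol:
            break
    policy = {s: max(actions[s], key=lambda q, s=s: backup(s, q, R, h, P, eta))
              for s in states}
    return R, policy


# Toy example (assumed numbers): "A" is a clean channel state, "B" an interfered one.
toy_states = ["A", "B"]
toy_actions = {"A": ["stay", "hop"], "B": ["stay", "hop"]}
toy_h = {("A", "stay"): 1.0, ("A", "hop"): 0.5, ("B", "stay"): -1.0, ("B", "hop"): -0.5}
toy_P = {("A", "stay"): {"A": 1.0}, ("A", "hop"): {"A": 1.0},
         ("B", "stay"): {"B": 1.0}, ("B", "hop"): {"A": 1.0}}
toy_R, toy_policy = value_iteration(toy_states, toy_actions, toy_h, toy_P, eta=0.5)
```

In the toy example the greedy policy stays in the clean state and hops out of the interfered one, which mirrors the stay-then-hop threshold behavior described in the text on a much smaller scale.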

This gives the optimal strategy. If the node has succeeded on a channel for $m^*$ time slots, it leaves the channel; on the new channel, the node should operate for the next $m^*$ time slots if it is not interfered with. While it remains on the new channel, the node operates in the "transmit-receive" mode for an initial run of time slots and then switches to the "transmit-detect" mode for the following time slots; if the node is interfered with in the "transmit-detect" mode, it hops immediately. For some parameter sets the optimal strategy is such that the node does not use the "transmit-receive" mode at all, and for others it is such that the node does not use the "transmit-detect" mode at all. In state $y_m$, the node either uses $u_1$ or hops, depending on the switching cost and the transmission cost; the corresponding threshold is increasing in $m$ and decreasing in the switching and transmission costs.

Preferably, the obtaining the best defense strategy of the average return utility according to the anti-interference strategy specifically includes:

In one embodiment, the choice of the average return criterion depends on the channel quality; an optimal pure-strategy Nash equilibrium is used to maximize the average return of the transmitter, and the pure-strategy form is described as an optimization problem. The interaction between the in-band full duplex node and the interference source is expressed as a constrained zero-sum Markov game; the optimal strategy has a threshold structure, in which the in-band full duplex node stays on the same channel until a certain number of time slots is reached. The best defense strategy of the node is determined from the equilibrium of the Markov game: it determines when the node hops (how long it stays on the same channel) and which cooperative mode it uses (the "transmit-detect" mode or the "transmit-receive" mode), so as to maximize the throughput of the in-band full duplex node under interference attack.

In a specific implementation, let $P$ be the $|S| \times |S|$ stochastic transition probability matrix whose element $P(s, s')$ is the transition probability from state $s$ to state $s'$ under the stationary policy $\xi$. Let $h^{(t)}(s, \xi)$ be the expected return of the transmitter at time $t$ when starting from initial state $s$, and let $h^{(t)}(\xi) = (h^{(t)}(1, \xi), \ldots, h^{(t)}(|S|, \xi))$ be the expected return vector over all initial states $s \in S$. Then $h^{(t)}(\xi) = P^t h(\xi)$, where $h(\xi) = (h(1, \xi), \ldots, h(|S|, \xi))$ is the expected return vector of the transmitter for all $|S|$ initial states. If the Markov chain under the given stationary policy $\xi \in \Xi$ is irreducible, the average return of the transmitter from state $s$ is
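The relation $h^{(t)}(\xi) = P^t h(\xi)$ suggests a simple way to approximate a long-run average numerically. The sketch below assumes that the average return is the time average of $h^{(t)}$ over a long horizon, which is the usual definition but is not confirmed by the source formula; the function names are hypothetical and only the Python standard library is used.

```python
from typing import List

Matrix = List[List[float]]


def mat_vec(P: Matrix, v: List[float]) -> List[float]:
    """Multiply a row-stochastic matrix P by a column vector v."""
    return [sum(p_ij * v_j for p_ij, v_j in zip(row, v)) for row in P]


def average_return(P: Matrix, h: List[float], horizon: int) -> List[float]:
    """Approximate (1/T) * sum_{t=0}^{T-1} P^t h for every initial state."""
    total = [0.0] * len(h)
    current = list(h)                     # P^0 h
    for _ in range(horizon):
        total = [a + b for a, b in zip(total, current)]
        current = mat_vec(P, current)     # advance to P^(t+1) h
    return [x / horizon for x in total]


# Toy chain (assumed numbers): two states visited equally often in the long run.
toy_avg = average_return([[0.5, 0.5], [0.5, 0.5]], [1.0, 0.0], horizon=1000)
```

For an irreducible chain the result is nearly the same for every initial state, which is why the average return is well defined in that case.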

For any policy that uses action $v_1$ or $u_1$ with non-zero probability, the transmitter visits state $R$ and, according to the transition probabilities, transitions to the states $x_m$ and $y_m$ with non-zero probability; in this case the Markov chain is irreducible and the average return is well defined. The irreducibility of the Markov chain cannot always be ensured, however: when the fading probability $1-p$ is so small that a transmission failure indicates interference, the transmitter does not need to determine the cause of the failure and simply hops to another channel, so that state $R$ is never visited. The choice of the average return criterion therefore depends on the channel quality.

The optimal pure-strategy Nash equilibrium $\xi^*$ is adopted to maximize the average return of the transmitter, and the pure-strategy form is described as the optimization problem

maximize $\sum_{s \in S} \sum_{q \in Q} h(s, q)\, c_{s,q}$

where $\mathbf{1}$ is the all-ones $1 \times 4|S|$ vector, $V$ is a $4|S| \times |S|$ matrix, and the $4|S| \times 1$ vector $c = [c_{s,q}] = ([c(1, u_1), \ldots, c(1, v_2)], \ldots, [c(|S|, u_1), \ldots, c(|S|, v_2)])$ is the solution of the optimization problem. The matrix elements are $V(s', (s, q)) = -P(s' \mid s, q)$ if $s' \ne s$ and $V(s', (s, q)) = 1 - P(s' \mid s, q)$ if $s' = s$. If $c(s, q) > 0$, then $\xi^*(s, q) = 1$; otherwise $\xi^*(s, q) = 0$. The interaction between the in-band full duplex node and the interference source is expressed as a constrained zero-sum Markov game; the optimal strategy has a threshold structure, in which the in-band full duplex node stays on the same channel for a certain number of time slots before hopping. The best defense strategy of the node is determined from the equilibrium of the Markov game, which determines when the node hops and which operation mode it uses so as to maximize the throughput of the in-band full duplex node under interference attack.
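The constraints of this optimization problem are not reproduced in the text. Given the stated dimensions of $\mathbf{1}$, $V$, and $c$, one consistent reading is the standard linear program for an average-return Markov decision process, shown below as an assumption rather than as the source formula.

```latex
\begin{aligned}
\text{maximize}\quad   & \sum_{s \in S} \sum_{q \in Q} h(s,q)\, c_{s,q} \\
\text{subject to}\quad & c^{\top} V = 0, \\
                       & \mathbf{1}\, c = 1, \\
                       & c \ge 0 .
\end{aligned}
```

Under this reading, $c_{s,q}$ is the long-run fraction of time slots spent in state $s$ while taking action $q$: the first constraint balances the flow into and out of each state, and the second normalizes the fractions to sum to one.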

Example 2

The embodiment of the invention provides an in-band full-duplex cognitive wireless network self-interference suppression device, which comprises a processor and a memory, wherein a computer program is stored in the memory, and when the computer program is executed by the processor, the in-band full-duplex cognitive wireless network self-interference suppression method described in the embodiment 1 is realized.

Example 3

An embodiment of the present invention provides a computer readable storage medium, on which a computer program is stored, where the computer program, when executed by a processor, implements the in-band full duplex cognitive wireless network self-interference suppression method as described in embodiment 1.

The invention discloses a self-interference suppression method and device for an in-band full duplex cognitive wireless network and a computer readable storage medium. A full duplex antenna is provided, different cooperative operation modes are established for the in-band full duplex nodes, the node switching and transmission costs are acquired, and an interference source with an average power constraint is acquired; an in-band full duplex reaction scanning interference source is acquired according to the interference source with the average power constraint, and the defense strategies of the different cooperative operation modes of the nodes are determined according to the reaction scanning interference source and the switching and transmission costs of the nodes; an anti-interference strategy is determined according to the defense strategies of the nodes in the different cooperative operation modes, and the optimal defense strategy corresponding to the expected return maximization utility and the optimal defense strategy of the average return utility are acquired according to the anti-interference strategy. The interference suppression capability of the in-band full duplex cognitive wireless network is thereby improved.

To counter the interference threat, the technical scheme makes optimal use of the simultaneous transmit and receive capability of the equipment. It addresses power-limited interference with an in-band full duplex capability, namely the in-band full duplex "reaction-scanning" interference source, and uses the ratio of the number of successfully decoded data packets to the number of received data packets as an indicator of potential interfering action. In addition to the in-band full duplex "transmit-receive" operation mode, the technical scheme of the invention introduces a "transmit-detect" operation mode, which gives the in-band full duplex radio better cognitive ability with respect to interference: when the packet detection rate is low, the node is allowed to switch to the "transmit-detect" mode to detect interference effectively, which improves the throughput of the in-band full duplex node under interference attack.

The technical scheme of the invention combines the switching cost and the transmission cost to establish an optimal anti-interference strategy for the in-band full duplex node, and represents the interaction between the in-band full duplex radio and the power-limited in-band full duplex "reaction-scanning" interference source as a Markov game solved at equilibrium. By jointly optimizing frequency hopping and the switching between the two modes, the in-band full duplex node is made more tolerant of interference than with adaptive frequency hopping alone, and the system throughput is maximized by determining when the network node hops and which of the two operation modes it uses.

The present invention is not limited to the above-mentioned embodiments, and any changes or substitutions that can be easily understood by those skilled in the art within the technical scope of the present invention are intended to be included in the scope of the present invention.

Claims

1. The self-interference suppression method for the in-band full duplex cognitive wireless network is characterized by comprising the following steps of:

providing a full duplex antenna, establishing different cooperative operation modes for the in-band full duplex nodes, acquiring the node switching and transmission costs, and acquiring an interference source with an average power constraint;

determining an anti-interference strategy according to the defense strategies of the nodes in different cooperative operation modes, and acquiring an optimal defense strategy corresponding to the expected return maximization utility and an optimal defense strategy of average return utility according to the anti-interference strategy;

the establishing different cooperative operation modes for the in-band full duplex node specifically comprises the following steps: establishing a transmission receiving or transmission detection cooperative operation mode for an in-band full duplex node, wherein the transmission detection cooperative operation mode is that one node only receives a data packet from the other node;

According to the response scanning interference source, the switching and transmission cost of the nodes, determining defense strategies of different cooperative operation modes of the nodes specifically comprises the following steps:

2. The method for self-interference suppression of an in-band full duplex cognitive wireless network of claim 1, wherein the obtaining the in-band full duplex reactive scanning interference source according to the average power constrained interference source specifically comprises:

3. The method for self-interference suppression of an in-band full duplex cognitive wireless network according to claim 2, wherein the node update policy specifically comprises:

4. The method for self-interference suppression of an in-band full duplex cognitive wireless network according to claim 1, wherein determining the strategy for interference resistance according to the defense strategies of the nodes in different cooperative operation modes specifically comprises:

5. The method for self-interference suppression in an in-band full duplex cognitive wireless network of claim 4, wherein the obtaining an optimal defense strategy corresponding to a desired return maximization utility according to the strategy for anti-interference comprises:

6. The method for self-interference suppression in an in-band full duplex cognitive wireless network of claim 5, wherein the obtaining the best defense strategy for average return utility according to the strategy for anti-interference comprises:

7. An in-band full duplex cognitive wireless network self-interference suppression device, comprising a processor and a memory, wherein the memory stores a computer program, and the computer program, when executed by the processor, implements the in-band full duplex cognitive wireless network self-interference suppression method according to any one of claims 1-6.

8. A computer readable storage medium having stored thereon a computer program, which when executed by a processor, implements the in-band full duplex cognitive radio network self-interference suppression method according to any of claims 1-6.