CN107483486A

CN107483486A - Network defense strategy selection method based on stochastic evolutionary game model

Info

Publication number: CN107483486A
Application number: CN201710827946.9A
Authority: CN
Inventors: 黄健明; 张恒巍; 王衡军; 王晋东; 王娜; 寇广
Original assignee: PLA Information Engineering University
Current assignee: PLA Information Engineering University
Priority date: 2017-09-14
Filing date: 2017-09-14
Publication date: 2017-12-15
Anticipated expiration: 2037-09-14
Also published as: CN107483486B

Abstract

The invention belongs to the technical field of network security and relates in particular to a network defense strategy selection method based on a stochastic evolutionary game model. The method comprises: constructing an asymmetric network attack-defense stochastic evolutionary game model based on a stochastic dynamical system, and obtaining the network attack-defense stochastic evolutionary game system from Itô stochastic differential equations; solving that system numerically with the Milstein method to obtain the equilibrium solution of the attack-defense evolution; and, for that equilibrium solution, analyzing the stability of both sides' strategy selection states according to the stability theorem for solutions of stochastic differential equations, and outputting the network security defense strategy in the equilibrium solution. The invention addresses the insufficient accuracy of traditional deterministic game models in network defense strategy selection, analyzes more accurately the stochastic dynamic evolution between boundedly rational attack and defense decision makers, makes security defense strategy selection more practical, and offers useful guidance for network security defense technology.

Description

Network defense strategy selection method based on stochastic evolutionary game model

Technical field

The invention belongs to the technical field of network security and relates in particular to a method for selecting a network defense strategy based on a stochastic evolutionary game model.

Background

Network attack techniques are becoming increasingly complex, intelligent, and diverse, and attackers are increasingly motivated by economic gain. Facing the many challenges in cyberspace security, strengthening network defense capability and securing cyberspace have become urgent problems. Game theory is a decision theory that studies the direct interaction between the behaviors of decision makers; its characteristics of opposed goals, non-cooperative relationships, and strategy interdependence match the basic features of network attack and defense. Applying game theory to modeling and analyzing the attack-defense process has therefore become a research focus in recent years. Existing results share one limitation, however: all of the models and methods are built on deterministic attack-defense conditions. In a real confrontation, the choice of attack technique, changes in the system operating environment, and interference from other external factors all carry a degree of randomness, so taking random factors into account can improve the effectiveness and accuracy of models and methods.

The essence of network security lies in attack-defense confrontation, so studying network security analysis methods and defense technology from that perspective is of great practical significance. Because most traditional game models assume fully rational actors, which does not match reality, evolutionary game theory based on incomplete rationality fits actual attack-defense confrontation better. However, the most widely used learning mechanism, replicator dynamics, does not account for the various random disturbances present in the attack-defense process, and this deterministic character reduces the practical value of such models.

The network attack-defense evolutionary game model ADEGM (Attack-Defense Evolutionary Game Model) is expressed as a 4-tuple, ADEGM = (N, S, P, U). N = (N_D, N_A) is the player space of the evolutionary game, where N_D is the defender and N_A is the attacker. S = (DS, AS) is the strategy space, where DS = {DS₁, DS₂, …, DS_n} is the defender's set of available strategies and AS = {AS₁, AS₂, …, AS_m} is the attacker's set of available strategies. P = (p, q) is the belief set, where p_i is the probability that the attacker chooses attack strategy AS_i and q_j is the probability that the defender chooses defense strategy DS_j. U = (U_D, U_A) is the set of payoff functions, giving the players' payoffs, which are jointly determined by the strategies of all players.

Applying traditional game theory to network security defense strategy selection has the following shortcomings: (1) The assumption of fully rational actors in classical game models does not match reality. Human decision-making ability is limited, so decision makers are in fact not fully rational individuals. Ignoring bounded rationality significantly affects the outcome of the game, so the resulting equilibrium deviates considerably from reality, which reduces the effectiveness of the model and method. (2) Traditional evolutionary game theory is based on the replicator dynamics learning mechanism, in which decision makers adjust their strategies through learning to maximize their own payoff, but it does not consider the interference of the various random factors present in the game process. In a real confrontation, the choice of attack technique, changes in the system operating environment, and interference from other external factors all carry a degree of randomness, so ignoring random factors reduces the effectiveness and accuracy of models and methods.

Summary of the invention

In view of the deficiencies of the prior art, the invention provides a network defense strategy selection method based on a stochastic evolutionary game model. It addresses the insufficient accuracy of traditional deterministic game models when applied to network defense strategy selection, analyzes more accurately the stochastic dynamic evolution between boundedly rational attack and defense decision makers, and makes security defense strategy selection more practical and more useful as guidance.

According to the design provided by the invention, a network defense strategy selection method based on a stochastic evolutionary game model comprises:

constructing an asymmetric network attack-defense stochastic evolutionary game model based on a stochastic dynamical system and, drawing on Gaussian white noise, using Itô stochastic differential equations to obtain the network attack-defense stochastic evolutionary game system;

solving the network attack-defense stochastic evolutionary game system numerically with the Milstein method to obtain the equilibrium solution of the attack-defense evolution;

for the equilibrium solution of the attack-defense evolution, analyzing the stability of both sides' strategy selection states according to the stability theorem for solutions of stochastic differential equations, and outputting the network security defense strategy in the equilibrium solution.

In the above method, the network attack-defense stochastic evolutionary game model is represented by a five-tuple.

Preferably, the network attack-defense stochastic evolutionary model is ADEGM = (N, S, P, Δ, U), where N = (N_D, N_A) is the player space of the evolutionary game, N_D denoting the defender and N_A the attacker; S = (DS, AS) is the strategy space, DS denoting the defender's set of available strategies and AS the attacker's; P = (q, p) is the belief set, q denoting the set of probabilities with which the defender selects the different defense strategies and p the set of probabilities with which the attacker selects the different attack strategies; Δ = {δ₁, δ₂} is the set of random disturbance intensity coefficients, δ₁ denoting the intensity of the random disturbance acting on the defender and δ₂ that acting on the attacker, with δ₁ > 0 and δ₂ > 0; U = (U_D, U_A) is the set of payoff functions, U_D denoting the defender's payoff and U_A the attacker's, the payoff values being jointly determined by the strategies that the attack and defense decision makers select.

Preferably, the defender's set of available strategies is DS = {DS₁, DS₂}, where DS₁ denotes a strong defense strategy and DS₂ a weak defense strategy; the attacker's set of available strategies is AS = {AS₁, AS₂}, where AS₁ denotes a strong attack strategy and AS₂ a weak attack strategy.
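As an illustration only, the five-tuple and the two-strategy sets described above could be held in a small data structure such as the following. This is a minimal sketch: the class name `ADEGM`, its field names, and all numeric payoff values are assumptions made for the example and are not taken from the patent.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class ADEGM:
    """Illustrative container for the five-tuple (N, S, P, Delta, U)."""
    players: Tuple[str, str]                 # N = (N_D, N_A)
    defense_strategies: Tuple[str, ...]      # DS
    attack_strategies: Tuple[str, ...]       # AS
    q: Tuple[float, ...]                     # defender's selection probabilities
    p: Tuple[float, ...]                     # attacker's selection probabilities
    delta: Tuple[float, float]               # (delta_1, delta_2), both > 0
    # U is stored as two tables indexed [attack strategy][defense strategy]
    payoff_defender: Tuple[Tuple[float, ...], ...]
    payoff_attacker: Tuple[Tuple[float, ...], ...]

    def __post_init__(self):
        if len(self.q) != len(self.defense_strategies):
            raise ValueError("q needs one probability per defense strategy")
        if len(self.p) != len(self.attack_strategies):
            raise ValueError("p needs one probability per attack strategy")
        if abs(sum(self.q) - 1.0) > 1e-9 or abs(sum(self.p) - 1.0) > 1e-9:
            raise ValueError("q and p must each sum to 1")
        if min(self.delta) <= 0:
            raise ValueError("disturbance intensities must be positive")


# Hypothetical instance with strong/weak strategies; payoffs are made up.
model = ADEGM(
    players=("N_D", "N_A"),
    defense_strategies=("DS1_strong", "DS2_weak"),
    attack_strategies=("AS1_strong", "AS2_weak"),
    q=(0.5, 0.5),
    p=(0.5, 0.5),
    delta=(0.5, 0.5),
    payoff_defender=((-2.0, -6.0), (-1.0, 0.0)),
    payoff_attacker=((1.0, 5.0), (0.0, 0.0)),
)
```

Validating the constraints δ₁ > 0, δ₂ > 0 and the normalization of q and p at construction time keeps later simulation code free of repeated checks.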

Preferably, obtaining the network attack-defense stochastic evolutionary game system comprises:

A1) constructing the defender's type space D = {d_i, i ≥ 1}, and constructing the defender's available strategy space DS = {DS_j, 1 ≤ j ≤ m}, where m is the number of strategies available to the attacker's decision maker;

A2) in response to the attack strategy selected by the attacker, selecting defense strategy DS_i with probability q_i, where 1 ≤ i ≤ m;

A3) calculating the defender's average payoff, and constructing the set of attack-defense random disturbance intensity coefficients Δ = {δ₁, δ₂}, where δ₁ > 0 and δ₂ > 0;

A4) drawing on Gaussian white noise and using stochastic differential equations to describe the random disturbance in the evolutionary game between the two sides, obtaining the stochastic replicator dynamic differential equations of the defender and the attacker;

A5) combining the stochastic replicator dynamic differential equations of the defender and the attacker to obtain the network attack-defense stochastic evolutionary game system.
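Steps A1) to A5) can be sketched in code for the two-strategy case. The example below is a sketch under stated assumptions and not the patented system: the drift is the ordinary two-strategy replicator term, the diffusion term δ·x·(1−x) is an assumed form of the white-noise disturbance (the patent's own equations are not reproduced in this text), the integrator is plain Euler-Maruyama, and the function name and payoff tables are hypothetical.

```python
import numpy as np


def simulate_stochastic_replicator(A, B, q0, p0, delta, T=10.0, steps=1000, seed=0):
    """Simulate shares q (strong defense) and p (strong attack) over [0, T].

    A[i][j], B[i][j]: attacker and defender payoffs for attack i, defense j.
    delta = (delta_1, delta_2): disturbance intensities for defender, attacker.
    ASSUMPTION: noise enters as delta * x * (1 - x) dW; this form is illustrative.
    """
    rng = np.random.default_rng(seed)
    A = np.asarray(A, dtype=float)
    B = np.asarray(B, dtype=float)
    h = T / steps
    q, p = float(q0), float(p0)
    path = np.empty((steps + 1, 2))
    path[0] = (q, p)
    for n in range(steps):
        pv = np.array([p, 1.0 - p])
        qv = np.array([q, 1.0 - q])
        u_def = pv @ B            # expected payoffs of DS_1 and DS_2
        u_att = A @ qv            # expected payoffs of AS_1 and AS_2
        dw = rng.normal(0.0, np.sqrt(h), size=2)   # Brownian increments
        q_new = q + q * (1 - q) * (u_def[0] - u_def[1]) * h + delta[0] * q * (1 - q) * dw[0]
        p_new = p + p * (1 - p) * (u_att[0] - u_att[1]) * h + delta[1] * p * (1 - p) * dw[1]
        # Clip so that discretization error cannot push a share outside [0, 1].
        q = float(np.clip(q_new, 0.0, 1.0))
        p = float(np.clip(p_new, 0.0, 1.0))
        path[n + 1] = (q, p)
    return path
```

Running the function with several values of `delta` and a fixed `seed` gives a quick way to compare how the disturbance intensity changes the speed at which the shares settle.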

Preferably, calculating the defender's average payoff in A3) comprises: obtaining the game payoff matrix from the network attack-defense game tree; and calculating the average payoffs of both sides from the payoff matrix, where the defender's average payoff is the defender's expected payoff.

Preferably, in A5), the network attack-defense stochastic evolutionary game system is expressed as:

[Equation omitted in the source text.]

where C_d is the defense cost incurred when the defender chooses the strong defense strategy; C_a is the attack cost incurred when the attacker chooses the strong attack strategy; V_a is the attack return the attacker obtains with the strong attack strategy when the defender chooses the weak defense strategy; V_ad is the attack return the attacker obtains with the strong attack strategy when the defender chooses the strong defense strategy, with V_a > V_ad; q(t) and 1 − q(t) are, as functions of time, the proportions of defenders selecting the two defense strategies; and ω(t) is a one-dimensional standard Brownian motion describing how random disturbances affect the evolution of the game during network attack and defense.

Preferably, obtaining the equilibrium solution of the attack-defense evolution comprises:

B1) applying a stochastic Taylor expansion, based on the Itô stochastic differential equation, to the stochastic evolution differential equations of both the defender and the attacker in the network attack-defense stochastic evolutionary game system;

B2) solving the differential equations of the network attack-defense stochastic evolutionary game system numerically with the Milstein method to obtain the corresponding attack-defense evolutionary equilibrium solution.

Further, in B1), the Itô stochastic differential equation is expressed as dx(t) = f(t, x(t))dt + g(t, x(t))dω(t), where t ∈ [t₀, T], x(t₀) = x₀, and x₀ ∈ R; ω(t) is a one-dimensional standard Brownian motion following the normal distribution N(0, t), and the increment dω(t) follows the normal distribution N(0, Δt); T denotes the extent of the time horizon and R is the set of real numbers.
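For a scalar Itô equation of the form just given, the textbook Milstein update with step h and Brownian increment Δω is x_{n+1} = x_n + f·h + g·Δω + ½·g·(∂g/∂x)·(Δω² − h). The sketch below implements that generic scheme; it is not the patent's specific solver, and the function name, the argument layout, and the linear test equation in the usage lines are assumptions chosen for illustration.

```python
import numpy as np


def milstein(f, g, dg_dx, x0, t0, T, dw):
    """Milstein scheme for dx = f(t, x) dt + g(t, x) dw on [t0, T].

    dw: array of Brownian increments, each drawn from N(0, h) with
        h = (T - t0) / len(dw); passing them in keeps the path reproducible.
    dg_dx: partial derivative of g with respect to x.
    """
    dw = np.asarray(dw, dtype=float)
    steps = dw.size
    h = (T - t0) / steps
    x = np.empty(steps + 1)
    x[0] = x0
    for n in range(steps):
        t = t0 + n * h
        gx = g(t, x[n])
        x[n + 1] = (x[n]
                    + f(t, x[n]) * h
                    + gx * dw[n]
                    + 0.5 * gx * dg_dx(t, x[n]) * (dw[n] ** 2 - h))
    return x


# Usage on a hypothetical linear test equation dx = a*x dt + b*x dw,
# whose exact solution is x0 * exp((a - b**2 / 2) * t + b * w(t)).
rng = np.random.default_rng(0)
steps, T = 1000, 1.0
dw = rng.normal(0.0, np.sqrt(T / steps), size=steps)
a, b = -1.0, 0.5
x = milstein(lambda t, x: a * x, lambda t, x: b * x, lambda t, x: b, 1.0, 0.0, T, dw)
exact = np.exp((a - 0.5 * b ** 2) * T + b * dw.sum())
```

Comparing `x[-1]` with `exact` on the same Brownian path is a simple sanity check of the integrator before it is applied to a system with no closed-form solution.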

In the above method, the stability analysis of both sides' strategy selection states verifies the evolutionarily stable strategy of the network attack-defense stochastic evolutionary game system as follows: when […] and C_d ≥ 1, and […] and C_a − V_ad ≥ 1, are satisfied, the system has the unique evolutionarily stable strategy ESS(0,0); when […] and C_d − V_a + V_ad + 1 ≤ 0, and […] and C_a − V_a + 1 ≤ 0, are satisfied, the system has the unique evolutionarily stable strategy ESS(1,1).

Beneficial effects of the invention:

To address the various random disturbances present in an attack-defense game system and to improve the effectiveness and accuracy of the model, the invention draws on the concept of Gaussian white noise to describe random disturbances such as changes in the system operating environment, changes in network topology, and changes in attack and defense strategies. It improves the traditional replicator dynamics evolutionary game method by using nonlinear Itô stochastic differential equations to build a stochastic network attack-defense evolutionary game model under asymmetric conditions, which describes the real-time stochastic dynamic evolution of the confrontation. The attack-defense stochastic differential equations are solved numerically, the stability of both sides' strategy selection states is analyzed according to the stability criterion for stochastic differential equations, and the security defense strategy of the stochastic attack-defense evolutionary game model is determined. Finally, simulation verifies the effect of random disturbances of different intensities on the evolution rate of attack and defense decisions, which can provide technical guidance for predicting network attack behavior and selecting security defense strategies. Compared with the prior art, the invention analyzes more accurately the stochastic dynamic evolution between boundedly rational attack and defense decision makers, and its security defense strategy selection is more practical and more useful as guidance.

Description of the drawings:

Figure 1 is an existing basic network attack-defense game tree;

Figure 2 is a schematic flowchart of the method of the invention;

Figure 3 is a schematic diagram of the network attack-defense game tree in the embodiment;

Figure 4 is a schematic flowchart of obtaining the network attack-defense stochastic evolutionary game system in the embodiment;

Figure 5 is a schematic flowchart of obtaining the equilibrium solution of the attack-defense evolution in the embodiment;

Figure 6 shows the evolution trend of the defender's strategy in the simulation example when the zero solution is stable;

Figure 7 shows the evolution trend of the attacker's strategy in the simulation example when the zero solution is stable;

Figure 8 shows the evolution trend of the defender's strategy in the simulation example when the zero solution is unstable;

Figure 9 shows the evolution trend of the attacker's strategy in the simulation example when the zero solution is unstable.

Detailed description:

To make the purpose, technical solution, and advantages of the invention clearer, the invention is described in further detail below with reference to the drawings and the technical solution. The technical terms used in the embodiments are as follows:

Evolutionary game theory: Originating in Darwin's theory of biological evolution, it inherits biology's theoretical account of species evolution. Starting from the bounded rationality of individuals and taking group behavior as its object of study, it explains the evolutionary game process of biological behavior in describing the development and evolutionary selection of species. Through long-term trial and error, imitation, and improvement, all players tend toward some stable strategy, which may persist in the group over the long term. This stable strategy equilibrium closely resembles the evolutionarily stable strategy of biological evolution and amounts to a relatively harmonious equilibrium state of the game.

Replicator dynamics: In a population of boundedly rational players, players keep trying, learning, and improving their strategies, so strategies that yield better-than-average results are gradually adopted by more players, and the proportions of players using each strategy change accordingly.

Nash equilibrium: In a game G = {S₁, …, S_n; u₁, …, u_n}, let (s₁*, …, s_n*) be a strategy profile consisting of one strategy per player. If the strategy s_i* of every player i satisfies u_i(s₁*, …, s_{i−1}*, s_i*, s_{i+1}*, …, s_n*) ≥ u_i(s₁*, …, s_{i−1}*, s_ij, s_{i+1}*, …, s_n*) for every s_ij ∈ S_i, then (s₁*, …, s_n*) is called a Nash equilibrium of the game G.

Bounded rationality: Traditional game theory generally assumes fully rational actors, that is, actors who find the optimal strategy through game analysis and never deviate from it through forgetting, mistakes, or caprice. Bounded rationality means instead that the actor's ability to judge and choose is limited and that the actor will "make mistakes" in the decision-making process.

Evolutionarily stable strategy (ESS): A strategy that, under a precise definition, cannot be invaded by mutants. It is the equilibrium strategy with genuine stability and strong predictive power in evolutionary games. In biological evolution theory it is a robust equilibrium concept that resists disturbance and can still "recover" after being disturbed, and it is the core equilibrium concept in evolutionary game analysis.
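The Nash equilibrium condition can be checked mechanically for a finite two-player game. The following minimal sketch enumerates pure-strategy equilibria of a bimatrix game; the function name and the example payoff tables are hypothetical, and the row/column roles (attacker rows, defender columns) are chosen only to mirror the attacker and defender discussed in this document.

```python
def pure_nash_equilibria(A, B):
    """Return all pure-strategy Nash equilibria (i, j) of a bimatrix game.

    A[i][j]: payoff of the row player (for example the attacker).
    B[i][j]: payoff of the column player (for example the defender).
    """
    rows, cols = len(A), len(A[0])
    equilibria = []
    for i in range(rows):
        for j in range(cols):
            # Row player cannot gain by switching rows while column j is fixed.
            row_best = all(A[i][j] >= A[k][j] for k in range(rows))
            # Column player cannot gain by switching columns while row i is fixed.
            col_best = all(B[i][j] >= B[i][l] for l in range(cols))
            if row_best and col_best:
                equilibria.append((i, j))
    return equilibria


# Hypothetical payoffs with the structure of a prisoner's dilemma.
A_example = [[3, 0], [5, 1]]
B_example = [[3, 5], [0, 1]]
result = pure_nash_equilibria(A_example, B_example)
```

The check only covers pure strategies; mixed-strategy equilibria, which the belief sets p and q describe, need a different method.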

The existing network attack-defense evolutionary game model ADEGM (Attack-Defense Evolutionary Game Model) can be expressed as a 4-tuple, ADEGM = (N, S, P, U), where N = (N_D, N_A) is the player space of the evolutionary game, N_D being the defender and N_A the attacker. S = (DS, AS) is the strategy space; DS = {DS₁, DS₂, …, DS_n} is the defender's set of available strategies and AS = {AS₁, AS₂, …, AS_m} is the attacker's. P = (p, q) is the belief set; p_i is the probability that the attacker chooses attack strategy AS_i and q_j is the probability that the defender chooses defense strategy DS_j. U = (U_D, U_A) is the set of payoff functions, giving the players' payoffs, which are jointly determined by the strategies of all players.

In a network attack-defense confrontation, the decision makers of attacker A and defender D each have several strategies to choose from. Suppose their available strategy sets are {AS₁, AS₂, …, AS_m} and {DS₁, DS₂, …, DS_n} respectively (where m, n ∈ N and m, n ≥ 2). At different stages of the game the strategies are adopted with different probabilities, and these probabilities change continually over time under the learning mechanism, so strategy selection becomes a dynamic process. The resulting attack-defense game tree is shown in Figure 1. p_i is the probability of choosing attack strategy AS_i and q_j is the probability of choosing defense strategy DS_j. Each pairing of strategies produces corresponding payoffs: a_ij and b_ij are the payoffs of the attacker and the defender, respectively, when they adopt AS_i and DS_j. The defender has n possible strategies; its decision maker selects each defense strategy DS_i with probability q_i, subject to q₁ + q₂ + … + q_n = 1. Likewise, the attacker's decision maker selects each of its m attack strategies AS_i with probability p_i, subject to p₁ + p₂ + … + p_m = 1.

Based on the above conditions, the expected payoff of each of the defender's defense strategies and the defender's average payoff are calculated.
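The formulas for these two quantities are not reproduced in this text. Under the definitions already given (p_i, q_j, and the defender payoff b_ij for the pairing AS_i, DS_j), the usual expressions would be the following; the symbols U_{DS_j} and \bar{U}_D are notation assumed here for illustration.

```latex
U_{DS_j} = \sum_{i=1}^{m} p_i \, b_{ij}, \qquad
\bar{U}_D = \sum_{j=1}^{n} q_j \, U_{DS_j}
```

The first expression averages the defender's payoff for DS_j over the attacker's mixed choice; the second averages those expected payoffs over the defender's own selection probabilities.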

Because defenders with lower payoffs learn from and imitate the strategies chosen by those with higher payoffs, the proportion of the population choosing each of the available strategies {DS₁, DS₂, …, DS_n} in the defense strategy set changes over time. Let q_i(t) denote the proportion choosing defense strategy DS_i, with q₁(t) + q₂(t) + … + q_n(t) = 1. For a particular defense strategy DS_i, the proportion choosing it is a function of time, and its rate of change can be expressed by a replicator dynamic equation.

Similarly, for the available strategies {AS₁, AS₂, …, AS_m} in the attacker's strategy set, the proportion of the population choosing each strategy changes dynamically over time and is denoted p_i(t), with p₁(t) + p₂(t) + … + p_m(t) = 1. For any available attack strategy AS_i, a corresponding replicator dynamic equation is obtained.
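In its standard form, a replicator dynamic equation sets the growth rate of a strategy's share proportional to the gap between that strategy's expected payoff and the population average, for example dq_j/dt = q_j·(U_{DS_j} − Ū_D) for the defender. The sketch below integrates that standard deterministic form for both populations with a simple Euler step; the function names and the payoff tables in any call are hypothetical, and the patent's own equations may use different notation.

```python
import numpy as np


def replicator_rhs(p, q, A, B):
    """Right-hand sides of the standard two-population replicator equations.

    p: attacker shares over AS_1..AS_m; q: defender shares over DS_1..DS_n.
    A[i][j], B[i][j]: attacker and defender payoffs for the pair (AS_i, DS_j).
    """
    A = np.asarray(A, dtype=float)
    B = np.asarray(B, dtype=float)
    u_att = A @ q                     # expected payoff of each attack strategy
    u_def = p @ B                     # expected payoff of each defense strategy
    dp = p * (u_att - p @ u_att)      # share grows if payoff beats the average
    dq = q * (u_def - q @ u_def)
    return dp, dq


def integrate_replicator(p0, q0, A, B, h=0.01, steps=2000):
    """Explicit Euler integration; returns the shares after steps * h time units."""
    p = np.asarray(p0, dtype=float)
    q = np.asarray(q0, dtype=float)
    for _ in range(steps):
        dp, dq = replicator_rhs(p, q, A, B)
        p = p + h * dp
        q = q + h * dq
    return p, q
```

Points where both right-hand sides vanish are the candidate equilibrium states of the deterministic game; the stochastic model adds a disturbance term to each equation.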

Combining the two replicator dynamic equations above and setting both rates of change to zero yields, on solving, the equilibrium state points of the network attack-defense evolutionary game, which allows security defense strategy selection to be analyzed and predicted. However, evolutionary game theory is based on the replicator dynamics learning mechanism: decision makers adjust their strategies through learning to maximize their own payoff, but the interference of the various random factors present in the game process is not considered. In a real confrontation, the choice of attack technique, changes in the system operating environment, and interference from other external factors all carry a degree of randomness, so ignoring random factors reduces the effectiveness and accuracy of models and methods. In view of this, an embodiment of the invention provides a network defense strategy selection method based on a stochastic evolutionary game model which, as shown in Figure 2, comprises:

101. constructing an asymmetric network attack-defense stochastic evolutionary game model based on a stochastic dynamical system and, drawing on Gaussian white noise, using Itô stochastic differential equations to obtain the network attack-defense stochastic evolutionary game system;

102. solving the network attack-defense stochastic evolutionary game system numerically with the Milstein method to obtain the equilibrium solution of the attack-defense evolution;

103. for the equilibrium solution of the attack-defense evolution, analyzing the stability of both sides' strategy selection states according to the stability theorem for solutions of stochastic differential equations, and outputting the network security defense strategy in the equilibrium solution.

This addresses the insufficient accuracy of traditional deterministic game models when applied to network defense strategy selection. To improve the effectiveness and accuracy of the model, the invention draws on the concept of Gaussian white noise to describe random disturbances in the attack-defense game such as changes in the system operating environment, changes in network topology, and changes in attack and defense strategies. A stochastic network attack-defense evolutionary game model under asymmetric conditions is built to describe the real-time stochastic dynamic evolution of the confrontation. The Itô stochastic differential equations of both sides are solved numerically, and the stability of both sides' strategy selection states is analyzed according to the stability criterion for stochastic differential equations. The model and method describe the dynamics of attack and defense strategy selection more accurately.

Based on stochastic dynamical systems, combined with the characteristics of network attack and defense and founded on evolutionary game theory, an asymmetric network attack-defense stochastic evolutionary game model under bounded rationality is constructed. In another embodiment of the present invention, the model is represented by a quintuple. Further, the network attack-defense stochastic evolutionary game model is $ADEGM = (N, S, P, \Delta, U)$, where $N = (N_D, N_A)$ is the player space of the evolutionary game, $N_D$ denoting the defender and $N_A$ the attacker; $S = (DS, AS)$ is the strategy space, $DS$ being the defender's set of optional strategies and $AS$ the attacker's; $P = (q, p)$ is the belief set, $q$ being the set of probabilities with which the defender selects the different defense strategies and $p$ the set of probabilities with which the attacker selects the different attack strategies; $\Delta = \{\delta_1, \delta_2\}$ is the set of random disturbance intensity coefficients, $\delta_1$ being the intensity with which random disturbance affects the defender and $\delta_2$ the intensity with which it affects the attacker, with $\delta_1 > 0$ and $\delta_2 > 0$; $U = (U_D, U_A)$ is the set of payoff functions, $U_D$ being the defender's payoff and $U_A$ the attacker's, and the payoff values are jointly determined by the strategies chosen by the attack and defense decision makers.
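As an illustration only, the quintuple can be held in a small Python container. The class name `ADEGM` follows the text, but every field name below is a hypothetical choice, and the payoff set $U$ is left out because it depends on the payoff matrix introduced later.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class ADEGM:
    """Illustrative container for (N, S, P, Delta); field names are assumptions."""
    players: Tuple[str, str]             # N = (N_D, N_A)
    defense_strategies: Tuple[str, ...]  # DS
    attack_strategies: Tuple[str, ...]   # AS
    q: Tuple[float, ...]                 # defender's selection probabilities
    p: Tuple[float, ...]                 # attacker's selection probabilities
    delta: Tuple[float, float]           # (delta_1, delta_2), both > 0

    def __post_init__(self) -> None:
        if len(self.q) != len(self.defense_strategies):
            raise ValueError("one probability per defense strategy is required")
        if len(self.p) != len(self.attack_strategies):
            raise ValueError("one probability per attack strategy is required")
        for probs in (self.q, self.p):
            if any(x < 0 for x in probs) or abs(sum(probs) - 1.0) > 1e-9:
                raise ValueError("probabilities must be non-negative and sum to 1")
        if min(self.delta) <= 0:
            raise ValueError("disturbance intensities must be positive")


# Two strategies per side, as in the strong/weak split used in the description.
model = ADEGM(
    players=("N_D", "N_A"),
    defense_strategies=("DS1", "DS2"),
    attack_strategies=("AS1", "AS2"),
    q=(0.5, 0.5),
    p=(0.5, 0.5),
    delta=(0.5, 0.5),
)
```

The validation in `__post_init__` only enforces what the text states: probabilities form a distribution and both intensity coefficients are positive.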

For the attack-defense confrontation process, and for convenience of analysis, defense strategies are divided by strength into strong and weak defense strategies, giving the defender's optional strategy set $DS = \{DS_1, DS_2\}$, where $DS_1$ means the defender adopts a strong defense strategy and $DS_2$ means the defender adopts a weak defense strategy. Similarly, attack strategies are divided into strong and weak attack strategies, giving the attacker's optional strategy set $AS = \{AS_1, AS_2\}$, where $AS_1$ means the attacker implements a strong attack strategy and $AS_2$ means the attacker implements a weak attack strategy. In another embodiment of the present invention, as shown in Figure 4, obtaining the network attack-defense stochastic evolutionary game system comprises the following:

201) Construct the defender's type space set $D = \{d_i, i \ge 1\}$; construct the defender's optional strategy space set $DS = \{DS_j, 1 \le j \le m\}$, where $m$ is the number of strategies available to the defender's decision maker;

202) Against the attack strategy chosen by the attacker, select defense strategy $DS_i$ with probability $q_i$, where $1 \le i \le m$;

203) Calculate the defender's average payoff; construct the set of attack-defense random disturbance intensity coefficients $\Delta = \{\delta_1, \delta_2\}$, where $\delta_1 > 0$ and $\delta_2 > 0$;

204) Drawing on Gaussian white noise, use stochastic differential equations to describe the random disturbances in the evolutionary game between the two sides, obtaining the stochastic replicator dynamic differential equations of the defender and the attacker;

205) Combine the stochastic replicator dynamic differential equations of the defender and the attacker to obtain the network attack-defense stochastic evolutionary game system.

During network attack-defense confrontation, the probability that a strategy is adopted by the attack and defense decision makers differs across stages of the game, and this probability changes continuously over time under the learning mechanism, so that strategy selection becomes a dynamic process. The corresponding network attack-defense game tree is shown in Figure 3: $p$ is the probability that the attacker selects attack strategy $AS_1$ and $1-p$ the probability of selecting $AS_2$, with $p \in [0,1]$; $q$ is the probability that the defender selects defense strategy $DS_1$ and $1-q$ the probability of selecting $DS_2$, with $q \in [0,1]$. $d_{ij}$ denotes the defense payoff produced by the strategy pair $(AS_i, DS_j)$, and $a_{ij}$ denotes the attack payoff produced by the strategy pair $(AS_i, DS_j)$. The payoff matrix of the game is shown in Table 1.

Table 1. Payoff matrix of the network attack-defense game

Here, $V_n$ is the fixed revenue that the defender's own information assets can bring;

$C_d$ is the defense cost incurred when the defender chooses the strong defense strategy;

$C_a$ is the attack cost incurred when the attacker chooses the strong attack strategy;

$V_a$ is the attack return the attacker obtains by choosing the strong attack strategy when the defender chooses the weak defense strategy;

$V_{ad}$ is the attack return the attacker obtains by choosing the strong attack strategy when the defender chooses the strong defense strategy, with $V_a > V_{ad}$.

In the game, the cost of the weak attack and weak defense strategies is assumed to be 0 relative to the strong strategies.

On this basis, the defender's expected payoff for each defense strategy and its average payoff are calculated.
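The entries of Table 1 are not reproduced in this text, so the following Python sketch uses an assumed payoff matrix. It is chosen so that the difference between the strong and weak defense payoffs equals $(V_a - V_{ad})p - C_d$, the drift term that appears later in the description; the class and function names are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Payoffs:
    V_n: float   # fixed revenue of the defender's information assets
    C_d: float   # cost of the strong defense strategy
    C_a: float   # cost of the strong attack strategy
    V_a: float   # attack return against weak defense
    V_ad: float  # attack return against strong defense (V_a > V_ad)


def defender_expected(pay: Payoffs, p: float) -> tuple:
    """Expected payoffs of (strong, weak) defense when strong attack has probability p.

    ASSUMED matrix entries: a strong attack costs the defender V_ad under
    strong defense and V_a under weak defense; a weak attack costs nothing.
    """
    strong = p * (pay.V_n - pay.C_d - pay.V_ad) + (1 - p) * (pay.V_n - pay.C_d)
    weak = p * (pay.V_n - pay.V_a) + (1 - p) * pay.V_n
    return strong, weak


def defender_average(pay: Payoffs, p: float, q: float) -> float:
    """Average defender payoff when strong defense is chosen with probability q."""
    strong, weak = defender_expected(pay, p)
    return q * strong + (1 - q) * weak


# Parameter values taken from the first simulation case in the description.
pay = Payoffs(V_n=20, C_d=10, C_a=10, V_a=10, V_ad=5)
strong, weak = defender_expected(pay, p=0.5)
average = defender_average(pay, p=0.5, q=0.5)
```

With these inputs the payoff difference `strong - weak` equals `(V_a - V_ad) * p - C_d`, which is the only property of the matrix that the later equations rely on.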

During attack and defense, as the game is repeated, different defense decision makers learn from one another and adjust their strategies so as to optimize them. The number of defenders choosing each defense strategy therefore changes dynamically, and the proportions choosing the two defense strategies are functions of time, denoted $q(t)$ and $1-q(t)$. For the strong defense strategy ($DS_1$), its dynamic evolution can be described by the following replicator dynamic equation:

$$\frac{dq(t)}{dt} = q(t)\,\bigl(1-q(t)\bigr)\,\bigl[(V_a - V_{ad})\,p(t) - C_d\bigr]$$

Since $1-q(t) \in [0,1]$, this factor does not affect the outcome of the evolution of defense strategy selection, so the equation can be transformed into the following form:

$$\frac{dq(t)}{dt} = q(t)\,\bigl[(V_a - V_{ad})\,p(t) - C_d\bigr]$$

The analysis shows that the rate of change $dq(t)/dt$ of the proportion of defense decision makers choosing strategy $DS_1$ is positively correlated with the difference between the expected payoff of the strong defense strategy and the expected payoff of the weak defense strategy.

To describe the actual network attack-defense game more accurately, the concept of Gaussian white noise is borrowed and a stochastic differential equation is used to describe the random disturbances acting on the defender in the game system, such as random changes in defense strategy, changes in the information system environment, and changes in network structure. This gives the defender's stochastic replicator dynamic differential equation:

$$dq(t) = q(t)\,\bigl[(V_a - V_{ad})\,p(t) - C_d\bigr]\,dt + \delta_1\sqrt{q(t)\,\bigl(1-q(t)\bigr)}\;d\omega(t) \tag{10}$$

Similarly, the attacker's expected payoff for each attack strategy and its average payoff can be obtained.

This gives the attacker's evolutionary game replicator dynamic equation, which after the same simplification reads:

$$\frac{dp(t)}{dt} = p(t)\,\bigl[V_a - C_a - (V_a - V_{ad})\,q(t)\bigr]$$

In the same way, the attacker's stochastic replicator dynamic differential equation is:

$$dp(t) = p(t)\,\bigl[V_a - C_a - (V_a - V_{ad})\,q(t)\bigr]\,dt + \delta_2\sqrt{p(t)\,\bigl(1-p(t)\bigr)}\;d\omega(t) \tag{15}$$

The stochastic replicator dynamic differential equations of the two sides are Itô stochastic differential equations, widely used in stochastic analysis, and describe the dynamic evolution of the attacker and the defender respectively. Here $\omega(t)$ is a one-dimensional standard Brownian motion, an irregular random fluctuation that describes well how the game evolution is affected by random disturbances during attack and defense. For a given time $t$, $\omega(t)$ follows the normal distribution $N(0,t)$; $d\omega(t)$ represents the random disturbance, and for $t>0$ and step size $h>0$ its increment $\Delta\omega(t) = \omega(t+h) - \omega(t)$ follows the normal distribution $N(0,h)$. $\delta_i$ is the random disturbance intensity of each side, with $\delta_i > 0$. The evolution of $p(t)$ and $q(t)$ therefore also becomes a stochastic process, and the two stochastic replicator dynamic differential equations together form the stochastic attack-defense evolution system.
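The increment property above is what a numerical scheme actually samples. A minimal Python sketch, using only the standard library and hypothetical function names, draws increments with variance $h$ and accumulates them into a path:

```python
import math
import random


def brownian_increments(n_steps: int, h: float, seed: int = 0) -> list:
    """Draw n_steps independent increments of standard Brownian motion over step h."""
    rng = random.Random(seed)
    sigma = math.sqrt(h)  # standard deviation of one increment, since Var = h
    return [rng.gauss(0.0, sigma) for _ in range(n_steps)]


def brownian_path(increments: list) -> list:
    """Cumulative sum of the increments, starting from omega(0) = 0."""
    path = [0.0]
    for dw in increments:
        path.append(path[-1] + dw)
    return path


dw = brownian_increments(n_steps=20000, h=0.01)
path = brownian_path(dw)
sample_mean = sum(dw) / len(dw)
sample_var = sum((x - sample_mean) ** 2 for x in dw) / len(dw)
```

The sample mean and variance of `dw` should be close to 0 and to `h` respectively; the seed is fixed only to make the sketch reproducible.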

During the evolution of the attack-defense game, many disturbance factors affect the stability of the system, both external and internal, and no single factor plays a decisive role in system stability.

The factors $\sqrt{p(t)(1-p(t))}$ and $\sqrt{q(t)(1-q(t))}$ keep the values of $p(t)$ and $q(t)$ within the interval $[0,1]$, consistent with their meaning as proportions.

$\sqrt{q(t)(1-q(t))}$ and $\sqrt{p(t)(1-p(t))}$ reach their maximum if and only if $1-q(t) = q(t)$ and $1-p(t) = p(t)$ respectively, which is where the disturbance is largest. When the proportions of decision makers choosing the two strategies are equal, the stability of the system is most easily disturbed; conversely, when the two proportions differ greatly, the disturbance is small.
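As a quick check of the statement about the maximum, assuming the disturbance factor has the form $\sqrt{x(1-x)}$ with $x$ standing for either $q(t)$ or $p(t)$:

```latex
\frac{d}{dx}\,x(1-x) = 1 - 2x = 0 \;\Longrightarrow\; x = \tfrac12,
\qquad
\max_{x\in[0,1]}\sqrt{x(1-x)} = \sqrt{\tfrac14} = \tfrac12,
\qquad
\sqrt{x(1-x)}\,\Big|_{x=0} = \sqrt{x(1-x)}\,\Big|_{x=1} = 0 .
```

The factor vanishes at both ends of $[0,1]$, so the noise term switches off exactly where a proportion would otherwise leave the interval.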

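Combining the stochastic replicator dynamic differential equations of the two sides gives the network attack-defense stochastic evolutionary game system:

$$\begin{cases} dq(t) = q(t)\,\bigl[(V_a - V_{ad})\,p(t) - C_d\bigr]\,dt + \delta_1\sqrt{q(t)\,\bigl(1-q(t)\bigr)}\;d\omega(t) \\ dp(t) = p(t)\,\bigl[V_a - C_a - (V_a - V_{ad})\,q(t)\bigr]\,dt + \delta_2\sqrt{p(t)\,\bigl(1-p(t)\bigr)}\;d\omega(t) \end{cases}$$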

Because the stochastic attack-defense evolution system established above consists of nonlinear Itô stochastic differential equations, an analytical solution cannot be obtained directly. For this reason, in another embodiment of the present invention, referring to Figure 5, obtaining the equilibrium solution of the attack-defense evolution specifically comprises:

301) Applying the stochastic Taylor expansion for Itô stochastic differential equations to the stochastic evolution differential equations of both the defender and the attacker in the network attack-defense stochastic evolutionary game system;

302) Solving the differential equations of the network attack-defense stochastic evolutionary game system numerically with the Milstein method to obtain the corresponding attack-defense evolutionary equilibrium solution.

The stochastic replicator dynamic differential equations of the two sides are expanded and solved by combining the stochastic Taylor expansion with the Itô formula.

Consider the Itô stochastic differential equation $dx(t) = f(t,x(t))\,dt + g(t,x(t))\,d\omega(t)$, where $t \in [t_0, T]$, $x(t_0) = x_0$, $x_0 \in \mathbb{R}$, and $\omega(t)$ is a one-dimensional standard Brownian motion with $\omega(t) \sim N(0,t)$ and $d\omega(t) \sim N(0,\Delta t)$. Let $h = (T - t_0)/N$ and $t_n = t_0 + nh$. The stochastic Taylor expansion of the equation is

$$x(t_{n+1}) = x(t_n) + K_0\,f(x(t_n)) + K_1\,g(x(t_n)) + K_{11}\,M^1 g(x(t_n)) + K_{00}\,M^0 f(x(t_n)) + R$$

where $R$ is the remainder of the expansion, and

$$K_0 = h;\quad K_1 = \Delta\omega_n;\quad K_{11} = \tfrac12\bigl[(\Delta\omega_n)^2 - h\bigr];\quad K_{00} = \tfrac12 h^2;\quad M^1 = g\,\frac{\partial}{\partial x};\quad M^0 = f\,\frac{\partial}{\partial x} + \tfrac12 g^2\,\frac{\partial^2}{\partial x^2}$$

On this basis, the Itô stochastic differential equation can be expressed as

$$x(t_{n+1}) = x(t_n) + h\,f + \Delta\omega_n\,g + \tfrac12\bigl[(\Delta\omega_n)^2 - h\bigr]\,g\,g' + \tfrac12 h^2\bigl[f f' + \tfrac12 g^2 f''\bigr] + R$$

with $f$, $g$ and their derivatives evaluated at $x(t_n)$.

Substituting the defender's drift and disturbance terms into this expansion gives the stochastic Taylor expansion of the defender's stochastic evolution differential equation. In the same way, the stochastic Taylor expansion of the attacker's stochastic evolution differential equation is obtained.

Here $R_1$ and $R_2$ denote the remainders of the defender's and attacker's expansions respectively. The stochastic Taylor expansion is the basis for the numerical solution of stochastic differential equations. The Euler method and the Milstein method are the schemes generally used, and both are obtained by truncating the Taylor expansion after a certain number of terms. For the network attack-defense stochastic evolutionary game model established by the present invention, the Milstein method is used to solve the attack and defense stochastic differential equations numerically. The Milstein method is expressed as follows:

$$x(t_{n+1}) = x(t_n) + h\,f(x(t_n)) + \Delta\omega_n\,g(x(t_n)) + \tfrac12\bigl[(\Delta\omega_n)^2 - h\bigr]\,g(x(t_n))\,g'(x(t_n))$$

With this formula, the network attack-defense stochastic evolution differential equations (10) and (15) can be solved numerically to obtain the corresponding attack-defense evolutionary equilibrium solution.
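The numerical step can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the patent's implementation: the drift terms are $q[(V_a - V_{ad})p - C_d]$ and $p[V_a - C_a - (V_a - V_{ad})q]$, the disturbance terms are assumed to be $\delta\sqrt{x(1-x)}$, the two sides are driven by independent increments, and the result is clipped to $[0,1]$ as an added safeguard. All function names are hypothetical.

```python
import math
import random


def milstein_step(q, p, dw_q, dw_p, h, pay, delta):
    """Advance (q, p) by one Milstein step of size h.

    pay = (C_d, C_a, V_a, V_ad); delta = (delta_1, delta_2).
    """
    C_d, C_a, V_a, V_ad = pay
    d1, d2 = delta
    f_q = q * ((V_a - V_ad) * p - C_d)        # defender drift
    f_p = p * (V_a - C_a - (V_a - V_ad) * q)  # attacker drift
    g_q = d1 * math.sqrt(q * (1.0 - q))       # assumed defender disturbance term
    g_p = d2 * math.sqrt(p * (1.0 - p))       # assumed attacker disturbance term
    # For g(x) = d * sqrt(x (1 - x)), the product g(x) g'(x) is d^2 (1 - 2x) / 2.
    gg_q = 0.5 * d1 ** 2 * (1.0 - 2.0 * q)
    gg_p = 0.5 * d2 ** 2 * (1.0 - 2.0 * p)
    q_new = q + f_q * h + g_q * dw_q + 0.5 * gg_q * (dw_q ** 2 - h)
    p_new = p + f_p * h + g_p * dw_p + 0.5 * gg_p * (dw_p ** 2 - h)
    # Clipping to [0, 1] is an added safeguard, not part of the Milstein scheme.
    return min(1.0, max(0.0, q_new)), min(1.0, max(0.0, p_new))


def simulate(q0, p0, n_steps, h, pay, delta, seed=0):
    """Return the list of (q, p) states along one simulated trajectory."""
    rng = random.Random(seed)
    sigma = math.sqrt(h)
    q, p = q0, p0
    trajectory = [(q, p)]
    for _ in range(n_steps):
        dw_q = rng.gauss(0.0, sigma)  # independent increments per side (assumption)
        dw_p = rng.gauss(0.0, sigma)
        q, p = milstein_step(q, p, dw_q, dw_p, h, pay, delta)
        trajectory.append((q, p))
    return trajectory


# Noise-free single step: only the drift acts.
q1, p1 = milstein_step(0.5, 0.5, 0.0, 0.0, 0.01, (10, 10, 10, 5), (0.0, 0.0))
trajectory = simulate(0.5, 0.5, 50, 0.01, (10, 10, 10, 5), (0.5, 0.5))
```

Both proportions are updated from the same previous state, so the step treats the two equations as one coupled system rather than updating one side with the other's new value.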

For the equilibrium solutions of the game system, the stability of both sides' strategy selection states is analyzed according to the stability criterion theorem for stochastic differential equations.

Consider the stochastic differential equation:

$$dx(t) = f(t,x(t))\,dt + g(t,x(t))\,d\omega(t),\quad x(t_0) = x_0 \tag{21}$$

Let $x(t) = x(t, x_0)$ denote the solution of this equation. For convenience of analysis, assume that $x(t)$, $f(t,x(t))$ and $g(t,x(t))$ are all scalars. Suppose there exist a function $V(t,x)$ and positive constants $c_1$, $c_2$ such that

$$c_1|x|^p \le V(t,x) \le c_2|x|^p,\quad t \ge 0.$$

(1) If there exists a positive constant $\gamma$ such that

$$LV(t,x) \le -\gamma V(t,x),\quad t \ge 0,$$

then the zero solution of differential equation (21) is $p$-th moment exponentially stable, and

$$E|x(t,x_0)|^p < (c_2/c_1)\,|x_0|^p\,e^{-\gamma t},\quad t \ge 0.$$

(2) If there exists a positive constant $\gamma$ such that

$$LV(t,x) \ge \gamma V(t,x),\quad t \ge 0,$$

then the zero solution of differential equation (21) is $p$-th moment exponentially unstable, and

$$E|x(t,x_0)|^p \ge (c_1/c_2)\,|x_0|^p\,e^{\gamma t},\quad t \ge 0.$$
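The operator $L$ is not defined in this passage. Assuming it is the usual generator associated with a scalar Itô equation $dx = f\,dt + g\,d\omega$, the choice $V(t,x) = x$ that the description makes for the defender and attacker equations reduces $LV$ to the drift term:

```latex
LV(t,x) = \frac{\partial V}{\partial t}
        + f(t,x)\,\frac{\partial V}{\partial x}
        + \frac12\,g^2(t,x)\,\frac{\partial^2 V}{\partial x^2},
\qquad
V(t,x) = x \;\Longrightarrow\;
\frac{\partial V}{\partial t} = 0,\;
\frac{\partial V}{\partial x} = 1,\;
\frac{\partial^2 V}{\partial x^2} = 0
\;\Longrightarrow\; LV(t,x) = f(t,x).
```

For $x \in [0,1]$ this choice also satisfies $c_1|x|^p \le V \le c_2|x|^p$ with $c_1 = c_2 = 1$ and $p = 1$, so the disturbance intensity does not enter the resulting criteria.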

From the above, the stability criteria of the stochastic attack-defense evolution system can be obtained by analysis.

For the defender's stochastic evolution differential equation, let $V(t,q(t)) = q(t)$, $q(t) \in [0,1]$, $c_1 = c_2 = 1$, $p = 1$, $\gamma = 1$; then $LV(t,q(t)) = f(t,q(t))$, and the following hold:

(1) when $p(t) \le \dfrac{C_d - 1}{V_a - V_{ad}}$ and $C_d \ge 1$, the zero solution of stochastic differential equation (10) is moment exponentially stable;

(2) when $p(t) \ge \dfrac{C_d + 1}{V_a - V_{ad}}$ and $C_d - V_a + V_{ad} + 1 \le 0$, the zero solution of stochastic differential equation (10) is moment exponentially unstable.

Proof. (1) For the defender's stochastic evolution differential equation, we have $c_1 = c_2 = 1$, $p = 1$, $\gamma = 1$, $V(t,q(t)) = q(t)$, $q(t) \in [0,1]$, and $LV(t,q(t)) = f(t,q(t)) = q(t)\,[(V_a - V_{ad})\,p(t) - C_d]$. For the zero solution of the defender's equation to be moment exponentially stable, it is required that

$$LV(t,q(t)) \le -\gamma V(t,q(t))$$

that is,

$$q(t)\,[(V_a - V_{ad})\,p(t) - C_d] \le -q(t)$$

which gives

$$q(t)\,[(V_a - V_{ad})\,p(t) - (C_d - 1)] \le 0$$

Since $q(t) \in [0,1]$,

$$(V_a - V_{ad})\,p(t) - (C_d - 1) \le 0$$

and because $V_a > V_{ad}$,

$$p(t) \le \frac{C_d - 1}{V_a - V_{ad}}$$

which requires

$$\frac{C_d - 1}{V_a - V_{ad}} \ge 0$$

that is,

$p(t) \le \dfrac{C_d - 1}{V_a - V_{ad}}$ and $C_d \ge 1$. This completes the proof.

(2) For the zero solution of the defender's stochastic evolution differential equation to be moment exponentially unstable, it is required that

$$LV(t,q(t)) \ge \gamma V(t,q(t))$$

that is,

$$q(t)\,[(V_a - V_{ad})\,p(t) - C_d] \ge q(t)$$

which gives

$$q(t)\,[(V_a - V_{ad})\,p(t) - (C_d + 1)] \ge 0$$

Since $q(t) \in [0,1]$,

$$(V_a - V_{ad})\,p(t) - (C_d + 1) \ge 0$$

and because $V_a > V_{ad}$,

$$p(t) \ge \frac{C_d + 1}{V_a - V_{ad}}$$

which requires

$$\frac{C_d + 1}{V_a - V_{ad}} \le 1$$

that is,

$p(t) \ge \dfrac{C_d + 1}{V_a - V_{ad}}$ and $C_d - V_a + V_{ad} + 1 \le 0$. This completes the proof.

It follows that when $p(t) \le \dfrac{C_d - 1}{V_a - V_{ad}}$ and $C_d \ge 1$, as the attack-defense game is repeated the network defenders eventually choose the weak defense strategy and reach an evolutionarily stable state. Conversely, when $p(t) \ge \dfrac{C_d + 1}{V_a - V_{ad}}$ and $C_d - V_a + V_{ad} + 1 \le 0$, as the game proceeds the network defenders are more inclined to choose the strong defense strategy: those who chose the weak defense strategy keep adjusting and switch to the strong defense strategy so as to maximize their own payoff.
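The two defender criteria can be turned into a small check. The Python sketch below follows the inequalities $(V_a - V_{ad})p - (C_d - 1) \le 0$ and $(V_a - V_{ad})p - (C_d + 1) \ge 0$ from the proof; the function name and the "undetermined" outcome for parameter sets covered by neither sufficient condition are assumptions added here.

```python
def defender_regime(p: float, C_d: float, V_a: float, V_ad: float) -> str:
    """Classify the defender's tendency from the two sufficient conditions."""
    if V_a <= V_ad:
        raise ValueError("the model requires V_a > V_ad")
    gap = V_a - V_ad
    if C_d >= 1 and p <= (C_d - 1) / gap:
        return "weak"          # zero solution stable: q(t) tends to 0
    if C_d - V_a + V_ad + 1 <= 0 and p >= (C_d + 1) / gap:
        return "strong"        # zero solution unstable: q(t) moves away from 0
    return "undetermined"      # neither sufficient condition applies


# Parameter values from the two simulation cases in the description, with p = 0.5.
case_1 = defender_regime(p=0.5, C_d=10, V_a=10, V_ad=5)
case_2 = defender_regime(p=0.5, C_d=5, V_a=15, V_ad=2)
```

Both conditions are sufficient rather than necessary, which is why a third outcome is needed for inputs that satisfy neither.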

For the attacker's stochastic evolution differential equation, let $V(t,p(t)) = p(t)$, $p(t) \in [0,1]$, $c_1 = c_2 = 1$, $p = 1$, $\gamma = 1$; then $LV(t,p(t)) = f(t,p(t))$, and the following hold:

(1) when $q(t) \ge \dfrac{V_a - C_a + 1}{V_a - V_{ad}}$ and $C_a - V_{ad} \ge 1$, the zero solution of stochastic differential equation (15) is moment exponentially stable;

(2) when $q(t) \le \dfrac{V_a - C_a - 1}{V_a - V_{ad}}$ and $C_a - V_a + 1 \le 0$, the zero solution of stochastic differential equation (15) is moment exponentially unstable.

It follows that when $q(t) \ge \dfrac{V_a - C_a + 1}{V_a - V_{ad}}$ and $C_a - V_{ad} \ge 1$, as the attack-defense game is repeated the network attackers eventually choose the weak attack strategy and the game system reaches an evolutionarily stable state. When $q(t) \le \dfrac{V_a - C_a - 1}{V_a - V_{ad}}$ and $C_a - V_a + 1 \le 0$, attacking is profitable, so the attackers are more inclined toward the strong attack strategy and keep learning and adjusting their strategies to maximize their payoff.

Combining the above results for the stochastic evolution differential equations of both sides: when $p(t) \le \dfrac{C_d - 1}{V_a - V_{ad}}$ and $C_d \ge 1$, and $q(t) \ge \dfrac{V_a - C_a + 1}{V_a - V_{ad}}$ and $C_a - V_{ad} \ge 1$, the network attack-defense game system has a unique evolutionarily stable strategy ESS(0,0), in which the attacker implements the weak attack strategy and the defender chooses the weak defense strategy. When $p(t) \ge \dfrac{C_d + 1}{V_a - V_{ad}}$ and $C_d - V_a + V_{ad} + 1 \le 0$, and $q(t) \le \dfrac{V_a - C_a - 1}{V_a - V_{ad}}$ and $C_a - V_a + 1 \le 0$, the game system has a unique evolutionarily stable strategy ESS(1,1), in which the attacker implements the strong attack strategy and the defender chooses the strong defense strategy. This is consistent with the continual escalation observed in real network attack-defense confrontation.
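A compact way to apply the combined criteria is sketched below in Python. The defender thresholds follow the inequalities in the proof above. The attacker thresholds are an assumption: they are derived from an attacker drift of the form $p[V_a - C_a - (V_a - V_{ad})q]$, which reproduces the stated conditions $C_a - V_{ad} \ge 1$ and $C_a - V_a + 1 \le 0$. The function name is hypothetical.

```python
def classify_ess(p, q, C_d, C_a, V_a, V_ad):
    """Return (0, 0), (1, 1), or None when neither set of conditions holds."""
    gap = V_a - V_ad
    if gap <= 0:
        raise ValueError("the model requires V_a > V_ad")
    defender_stable = C_d >= 1 and p <= (C_d - 1) / gap
    defender_unstable = C_d - V_a + V_ad + 1 <= 0 and p >= (C_d + 1) / gap
    # Attacker thresholds below are reconstructed under the stated assumption.
    attacker_stable = C_a - V_ad >= 1 and q >= (V_a - C_a + 1) / gap
    attacker_unstable = C_a - V_a + 1 <= 0 and q <= (V_a - C_a - 1) / gap
    if defender_stable and attacker_stable:
        return (0, 0)   # weak attack, weak defense
    if defender_unstable and attacker_unstable:
        return (1, 1)   # strong attack, strong defense
    return None


# The two parameter sets used in the simulation section, evaluated at p = q = 0.5.
ess_case_1 = classify_ess(0.5, 0.5, C_d=10, C_a=10, V_a=10, V_ad=5)
ess_case_2 = classify_ess(0.5, 0.5, C_d=5, C_a=4, V_a=15, V_ad=2)
```

The check evaluates the conditions at a single state $(p, q)$; whether they continue to hold along a whole trajectory is a separate question that this sketch does not address.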

The basic idea for obtaining the security defense strategy is to establish the attack-defense stochastic evolutionary game model, solve the model for its evolutionary equilibrium, and select the security defense strategy from the evolutionarily stable equilibrium solution obtained. For the defender, this embodiment provides a security defense strategy selection algorithm based on stochastic evolutionary game theory, shown as Algorithm 1:

Algorithm 1: Security defense strategy selection algorithm based on the stochastic evolutionary game model

Input: network attack-defense game tree

Output: security defense strategy

BEGIN

1. Initialize;

2. Construct the defender's type space set $D = \{d_i, i \ge 1\}$;

3. Construct the defender's optional strategy space set $DS = \{DS_j, 1 \le j \le m\}$;

4. Against the attack strategy chosen by the attacker, select a reasonable defense strategy $DS_i$ with probability $q_i$ ($1 \le i \le m$), where $\sum_{i=1}^{m} q_i = 1$;

5. For the strategy pair $\{AS_i, DS_j\}$ chosen by the two sides, obtain its defense payoff value $b_{ij}$;

6. Calculate the expected payoff of each defense strategy, where $n$ denotes the number of attacker strategies;

7. Calculate the defender's average payoff;

8. Construct the set of attack-defense random disturbance intensity coefficients $\Delta = \{\delta_1, \delta_2\}$, where $\delta_1 > 0$ and $\delta_2 > 0$;

9. Establish the defender's stochastic replicator dynamic evolution equation;

10. Apply the stochastic Taylor expansion to the defender's stochastic evolution differential equation;

11. Solve the attack and defense stochastic differential equations numerically with the Milstein method;

12. Output the security defense strategy in the equilibrium solution;

END
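For the two-strategy case, steps 8 to 12 can be sketched end to end in Python. This is an illustration under assumptions rather than the patent's implementation: the disturbance terms are taken as $\delta\sqrt{x(1-x)}$, the two sides use independent increments, states are clipped to $[0,1]$, and the final decision rule (strong defense if the final $q \ge 0.5$) is a hypothetical stand-in for reading the strategy off the equilibrium solution.

```python
import math
import random


def select_defense_strategy(C_d, C_a, V_a, V_ad, delta=(0.5, 0.5),
                            q0=0.5, p0=0.5, h=0.01, n_steps=500, seed=0):
    """Simulate the coupled system and report the defense strategy favoured at the end."""
    rng = random.Random(seed)
    sigma = math.sqrt(h)
    gap = V_a - V_ad
    q, p = q0, p0
    for _ in range(n_steps):
        dw_q, dw_p = rng.gauss(0.0, sigma), rng.gauss(0.0, sigma)
        # Milstein update: drift + disturbance + 0.5 * g * g' * (dw^2 - h),
        # where g * g' = delta^2 * (1 - 2x) / 2 for g = delta * sqrt(x (1 - x)).
        q_next = (q + q * (gap * p - C_d) * h
                  + delta[0] * math.sqrt(q * (1 - q)) * dw_q
                  + 0.25 * delta[0] ** 2 * (1 - 2 * q) * (dw_q ** 2 - h))
        p_next = (p + p * (V_a - C_a - gap * q) * h
                  + delta[1] * math.sqrt(p * (1 - p)) * dw_p
                  + 0.25 * delta[1] ** 2 * (1 - 2 * p) * (dw_p ** 2 - h))
        # Clipping keeps the proportions valid; it is an added safeguard.
        q = min(1.0, max(0.0, q_next))
        p = min(1.0, max(0.0, p_next))
    # Hypothetical decision rule: pick the strategy held by the majority at the end.
    strategy = "strong defense (DS1)" if q >= 0.5 else "weak defense (DS2)"
    return strategy, q, p


# First simulation case from the description: C_a = C_d = 10, V_a = 10, V_ad = 5.
strategy, q_final, p_final = select_defense_strategy(C_d=10, C_a=10, V_a=10, V_ad=5)
```

In this parameter set both drift terms are negative for every state in $(0,1]$, so the simulated proportions are pulled toward 0 and the sketch reports the weak defense strategy.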

The time cost of the algorithm is dominated by solving the stochastic differential equations, and its time complexity is $O((m+n)^2)$. The space cost is dominated by storing the payoff values and the intermediate results of the equilibrium solution, and its space complexity is $O(nm)$.

To verify the effectiveness of the present invention, specific simulation experiments are analyzed below. The stochastic attack-defense evolutionary game model and the solution and analysis process proposed by the invention are simulated numerically in Matlab 2014. Both sides are assumed to have two optional strategies, AS = {strong attack strategy, weak attack strategy} and DS = {strong defense strategy, weak defense strategy}. The simulation step size is $h = 0.01$, and the strategy evolution of both sides is simulated under different conditions. The initial state of strategy selection is assumed to be $q(0) = 0.5$, $p(0) = 0.5$. For given game payoffs, the random disturbance intensity coefficient $\delta_i$ is varied to observe its effect on the evolution of the game between the two sides.

(1) In the attack-defense game, assume the attack cost is $C_a = 10$, the defense cost is $C_d = 10$, the defender's asset revenue is $V_n = 20$, the attack return when the defender chooses the weak defense strategy is $V_a = 10$, and the attack return when the defender chooses the strong defense strategy is $V_{ad} = 5$. In this case $(C_d - 1)/(V_a - V_{ad}) = 1.8$. The defender's stochastic evolution therefore satisfies the moment exponential stability condition for the zero solution of stochastic differential equation (10), namely $p(t) \le (C_d - 1)/(V_a - V_{ad})$ and $C_d \ge 1$, so the network defenders tend to choose the weak defense strategy. As the game proceeds, the defender eventually stabilizes at the evolutionary state $q(t) = 0$, in which all defenders choose the weak defense strategy.

For the defender's strategy evolution, the Milstein method is used for numerical simulation with random disturbance intensity coefficients $\delta_1 = 0.5$, $\delta_1 = 2$ and $\delta_1 = 5$, in order to analyze how the defense strategy evolves under different random disturbances. Figure 6 shows the evolution trend of the defender's strategy in the stable zero-solution case, where the abscissa $N$ is the number of samples and the ordinate $q(t)$ is the proportion choosing the strong defense strategy.

Figure 6 shows that the selection of the strong defense strategy fluctuates during the evolution, indicating that the random disturbances present in the system influence the evolution of the defense strategy. In addition, the smaller the disturbance intensity $\delta_1$, the fewer simulation iterations the defense strategy needs to reach a stable state (with $\delta_1 = 0.5$ it reaches a stable state after 16 iterations, whereas with $\delta_1 = 5$ it takes 31), indicating that the weaker the random disturbance, the more the defender tends to choose the weak defense strategy.

Similarly, for the attacker's stochastic evolution, $(V_a - C_a + 1)/(V_a - V_{ad}) = 0.2$ and $C_a - V_{ad} = 5$, which satisfies the moment exponential stability condition for the zero solution of stochastic differential equation (15), namely $q(t) \ge (V_a - C_a + 1)/(V_a - V_{ad})$ and $C_a - V_{ad} \ge 1$. The network attackers therefore tend to implement the weak attack strategy. As the game proceeds, the attacker eventually stabilizes at the evolutionary state $p(t) = 0$, in which all attackers choose the weak attack strategy.

For the attacker's strategy evolution, the random disturbance intensity coefficient takes the values $\delta_2 = 0.5$, $\delta_2 = 2$ and $\delta_2 = 5$, in order to analyze how the attack strategy evolves under different random disturbances. Figure 7 shows the evolution trend of the attacker's strategy in the stable zero-solution case, where the abscissa $N$ is the number of samples and the ordinate $p(t)$ is the proportion implementing the strong attack strategy.

Figure 7 shows that the smaller the disturbance intensity $\delta_2$, the fewer simulation iterations the strong attack strategy needs to reach a stable state (with $\delta_2 = 0.5$ the attack strategy reaches a stable state after 16 iterations, whereas with $\delta_2 = 5$ it takes 29), indicating that the weaker the random disturbance, the more the attacker tends to implement the weak attack strategy.

(2)在攻防博弈过程中，假定攻击成本为C_a＝4，防御成本为C_d＝5，防御方的资产收益为V_n＝20，当防御方选取弱防御策略时的攻击回报为V_a＝15，当防御方选取强防御策略时的攻击回报为V_ad＝2。此时，且C_d-V_a+V_ad+1＝-7。针对防御方的随机演化过程，满足随机微分方程(10)的零解矩指数不稳定条件且C_d-V_a+V_ad+1≤0，网络防御者将倾向于选取强防御策略，随着博弈的进行，防御方最终将稳定在q(t)＝1的演化状态，即所有防御者选择强防御策略。(2) During the attack-defense game, assume that the attack cost is C _a = 4, the defense cost is C _d = 5, the asset income of the defender is V _n = 20, and the attack return when the defender chooses a weak defense strategy is V _a =15, when the defender chooses a strong defense strategy, the attack reward is V _ad =2. at this time, And C _d −V _a +V _ad +1=−7. For the random evolution process of the defender, satisfy the zero-solution-moment exponential instability condition of the stochastic differential equation (10) And C _d -V _a +V _ad +1≤0, the network defender will tend to choose a strong defense strategy. As the game progresses, the defender will eventually stabilize at the evolution state of q(t)=1, that is, all defenses choose a strong defense strategy.

基于上述条件，采用Milstein方法对防御方选取强防御策略的演化进行数值模拟，对随机扰动强度系数取值δ₁＝0.5，δ₁＝2，δ₁＝5，用于分析不同随机干扰强度下的防御策略演化规律。防御方的零解非稳定策略演化趋势如图8所示。Based on the above conditions, the Milstein method is used to numerically simulate the evolution of the defender's selection of a strong defense strategy, and the values of the random disturbance intensity coefficients are δ ₁ = 0.5, δ ₁ = 2, and δ ₁ = 5, which are used to analyze the The evolution law of the defense strategy. The evolution trend of the defender's zero-solution non-stable strategy is shown in Figure 8.

As Figure 8 shows, the proportion of defenders choosing the strong defense strategy fluctuates during the evolution, indicating that the random interference present in the system affects the evolution of the defense strategy. Moreover, the smaller the interference intensity δ₁, the more simulations the defense strategy needs to reach a stable state (39 simulations with δ₁ = 0.5, versus 27 with δ₁ = 5), indicating that the weaker the random interference, the more inclined the defender is to choose the weak defense strategy.

Similarly, C_a − V_a + 1 = −10. The attacker's stochastic evolution therefore satisfies the zero-solution moment-exponential instability condition of stochastic differential equation (15), which includes the requirement C_a − V_a + 1 < 0, so network attackers tend to choose the strong attack strategy. As the game proceeds, the attacker eventually stabilizes at the evolutionary state p(t) = 1, that is, all attackers choose to carry out strong network attacks.
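As a quick check of the two payoff inequalities, substituting the values stated for this example (C_a = 4, C_d = 5, V_a = 15, V_ad = 2) gives the quoted results. Only the payoff inequalities are verified here; any further part of the instability conditions is not covered by this arithmetic.

```latex
\begin{aligned}
C_d - V_a + V_{ad} + 1 &= 5 - 15 + 2 + 1 = -7 \le 0,\\
C_a - V_a + 1 &= 4 - 15 + 1 = -10 < 0.
\end{aligned}
```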

For the attacker's strategy evolution, the random disturbance intensity coefficient is set to δ₂ = 0.5, δ₂ = 2, and δ₂ = 5 in order to analyze how the attack strategy evolves under different random disturbances. The evolution trend of the attacker's strategy in the zero-solution unstable case is shown in Figure 9.

As Figure 9 shows, the smaller the interference intensity δ₂, the more simulations the strong attack strategy needs to evolve to a stable state (37 simulations with δ₂ = 0.5, versus 24 with δ₂ = 5), indicating that the weaker the random interference, the more inclined the attacker is to choose the weak attack strategy.
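One way to read "reaches a stable state after N simulations" from a simulated trajectory is sketched below. It assumes the count means the first index from which the trajectory stays within a tolerance of the target state; the helper name `steps_to_settle` and the tolerance value are hypothetical, and the description does not specify how the counts were obtained.

```python
def steps_to_settle(path, target, tol=0.01):
    """Return the first index from which every later value stays within
    tol of target, or None if the path never settles."""
    settle = None
    for i, x in enumerate(path):
        if abs(x - target) <= tol:
            if settle is None:
                settle = i      # start of a candidate settled stretch
        else:
            settle = None       # left the tolerance band, so reset
    return settle


# Illustrative trajectory (made-up values, not data from Figure 9).
example_path = [0.2, 0.999, 0.5, 0.995, 1.0, 1.0]
settled_at = steps_to_settle(example_path, target=1.0)
```

Resetting the candidate index whenever the path leaves the band means that a brief visit to the target does not count as settling, which matters for noisy trajectories.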

In summary, different random interference intensities affect the evolution rate of the attack-defense game system differently: the greater the interference intensity, the more inclined the defender is to choose the strong defense strategy and the attacker to choose the strong attack strategy. This experimental result is consistent with the stability-seeking behavior of systems in stochastic control theory: when random interference is present, the system strengthens attack and defense intensity to prevent the disturbance from destroying its stability.

The present invention addresses the various random interference factors present in attack-defense game systems. To improve the validity and accuracy of the model, it draws on the concept of Gaussian white noise to describe the random disturbances that arise during the attack-defense game, such as changes in the system operating environment, changes in network topology, and changes in attack and defense strategies. It improves the traditional replicator-dynamics evolutionary game method and uses a nonlinear Itô stochastic differential equation to build a stochastic network attack-defense evolutionary game model under asymmetric conditions, which describes the real-time stochastic dynamic evolution of the network attack-defense confrontation. The attack-defense stochastic differential equations are solved numerically, the strategy selection states of both sides are analyzed for stability according to the stability criteria for stochastic differential equations, and a security defense strategy selection algorithm based on the stochastic attack-defense evolutionary game model is designed. Simulation verifies the influence of random interference of different intensities on the evolution rate of attack and defense decisions, which can provide guidance for predicting network attack behavior and selecting security defense strategies. Compared with the prior art, the present invention analyzes the stochastic dynamic evolution between boundedly rational attack and defense decision makers more accurately, and its selection of security defense strategies is more practical and offers stronger guidance.

The embodiments in this specification are described progressively: each embodiment focuses on its differences from the others, and the parts that are the same or similar can be cross-referenced. Because the device disclosed in the embodiments corresponds to the method disclosed in the embodiments, its description is brief; for the relevant details, refer to the description of the method.

The units and method steps of the examples described in connection with the embodiments disclosed herein can be implemented in electronic hardware, computer software, or a combination of the two. To illustrate the interchangeability of hardware and software clearly, the composition and steps of each example have been described above in general terms of their functions. Whether these functions are performed in hardware or in software depends on the particular application and design constraints of the technical solution. A person of ordinary skill in the art may implement the described functions in different ways for each particular application, but such implementations should not be considered to go beyond the scope of the present invention.

A person of ordinary skill in the art will understand that all or some of the steps of the above method can be carried out by a program instructing the relevant hardware, and that the program can be stored in a computer-readable storage medium such as a read-only memory, a magnetic disk, or an optical disk. Optionally, all or some of the steps of the above embodiments can also be implemented with one or more integrated circuits. Accordingly, each module or unit in the above embodiments can be implemented in the form of hardware or in the form of a software function module. The present invention is not limited to any particular combination of hardware and software.

The above description of the disclosed embodiments enables a person skilled in the art to make or use the present application. Various modifications to these embodiments will be readily apparent to those skilled in the art, and the general principles defined herein may be implemented in other embodiments without departing from the spirit or scope of the application. The present application is therefore not limited to the embodiments shown herein, but is to be accorded the widest scope consistent with the principles and novel features disclosed herein.

Claims

1. A network defense strategy selection method based on a stochastic evolutionary game model, characterized by comprising the following steps:

constructing an asymmetric network attack-defense stochastic evolutionary game model based on a stochastic dynamical system; drawing on the concept of Gaussian white noise, obtaining a network attack-defense stochastic evolutionary game system by using an Itô stochastic differential equation;

numerically solving the network attack-defense stochastic evolutionary game system by the Milstein method to obtain the equilibrium solution of the attack-defense evolution;

for the equilibrium solution of the attack-defense evolution, performing stability analysis on the strategy selection states of both the attacker and the defender according to the stability theorem for solutions of stochastic differential equations, and outputting the network security defense strategy in the equilibrium solution.

2. The network defense strategy selection method based on a stochastic evolutionary game model according to claim 1, wherein the network attack-defense stochastic evolutionary game model is expressed as a quintuple.

3. The network defense strategy selection method based on a stochastic evolutionary game model according to claim 2, wherein the network attack-defense stochastic evolutionary game model is ADEGM = (N, S, P, Δ, U), where N = (N_D, N_A) is the participant space of the evolutionary game, N_D denoting the defender and N_A denoting the attacker; S = (DS, AS) is the game strategy space, DS denoting the defender's set of optional strategies and AS denoting the attacker's set of optional strategies; P = (q, p) is the game belief set, q denoting the set of probabilities with which the defender selects the different defense strategies and p denoting the set of probabilities with which the attacker selects the different attack strategies; Δ = {δ₁, δ₂} is the set of random interference intensity coefficients, δ₁ denoting the intensity coefficient of the influence of random interference on the defender and δ₂ denoting the intensity coefficient of the influence of random interference on the attacker, satisfying δ₁ > 0, δ₂ > 0; U = (U_D, U_A) is the set of game payoff functions, U_D denoting the defender's game payoff and U_A denoting the attacker's game payoff, the values of the attack and defense payoffs being jointly determined by the strategies selected by the attack and defense decision makers.

4. The network defense strategy selection method based on a stochastic evolutionary game model according to claim 3, wherein the defender's optional strategy set is DS = {DS₁, DS₂}, where DS₁ denotes the defender adopting the strong defense strategy and DS₂ denotes the defender adopting the weak defense strategy; and the attacker's optional strategy set is AS = {AS₁, AS₂}, where AS₁ denotes the attacker implementing the strong attack strategy and AS₂ denotes the attacker implementing the weak attack strategy.

5. The network defense strategy selection method based on a stochastic evolutionary game model according to claim 4, wherein obtaining the network attack-defense stochastic evolutionary game system comprises:

A1) constructing the defender type space set D = {D_i, i ≥ 1}; constructing the defender's optional strategy space set DS = {DS_j, 1 ≤ j ≤ m}, where m is the number of strategies available to the decision maker;

A2) for the attack strategy selected by the attacker, selecting the defense strategy DS_i with probability q_i;

A3) calculating the average payoff of the defender; constructing the set of attack-defense random interference intensity coefficients Δ = {δ₁, δ₂}, where δ₁ > 0, δ₂ > 0;

A4) drawing on the concept of Gaussian white noise, describing the random interference in the evolutionary game between the attacker and the defender with stochastic differential equations, and obtaining the stochastic replicator dynamic differential equations of the defender and the attacker;

A5) combining the stochastic replicator dynamic differential equations of the defender and the attacker to obtain the network attack-defense stochastic evolutionary game system.

6. The network defense strategy selection method based on a stochastic evolutionary game model according to claim 5, wherein calculating the average payoff of the defender in A3) comprises: obtaining the game payoff matrix in combination with the network attack-defense game tree; and calculating the average payoffs of the attacker and the defender from the game payoff matrix, wherein the average payoff of the defender is the defender's expected payoff.

7. The network defense strategy selection method based on a stochastic evolutionary game model according to claim 5, wherein in A5) the network attack-defense stochastic evolutionary game system is expressed as:

，

wherein C_d denotes the defense cost the defender incurs when selecting the strong defense strategy; C_a denotes the attack cost the attacker incurs when selecting the strong attack strategy; V_a denotes the attack return the attacker obtains by selecting the strong attack strategy when the defender selects the weak defense strategy; V_ad denotes the attack return the attacker obtains by selecting the strong attack strategy when the defender selects the strong defense strategy, satisfying V_a > V_ad; q(t) and 1 − q(t) respectively denote, as functions of time, the proportions of defenders selecting the different defense strategies; and ω(t) is a one-dimensional standard Brownian motion describing the influence of random interference factors on the game evolution during network attack and defense.

8. The network defense strategy selection method based on a stochastic evolutionary game model according to claim 1, wherein obtaining the equilibrium solution of the attack-defense evolution specifically comprises:

B1) performing a stochastic Taylor expansion of the stochastic evolution differential equations of both the defender and the attacker in the network attack-defense stochastic evolutionary game system according to the Itô stochastic differential equation;

B2) numerically solving the differential equations of the network attack-defense stochastic evolutionary game system by the Milstein method to obtain the corresponding equilibrium solution of the attack-defense evolution.

9. The network defense strategy selection method based on a stochastic evolutionary game model according to claim 8, wherein in B1) the Itô stochastic differential equation is expressed as dx(t) = f(t, x(t))dt + g(t, x(t))dω(t), where t ∈ [t₀, T], x(t₀) = x₀, x₀ ∈ R, ω(t) is a one-dimensional standard Brownian motion obeying the normal distribution N(0, t), dω(t) obeys the normal distribution N(0, Δt), T denotes the extent of the time dimension, and R is the set of real numbers.

10. The network defense strategy selection method based on a stochastic evolutionary game model according to claim 7, wherein performing stability analysis on the strategy selection states of both the attacker and the defender to verify the evolutionarily stable strategy of the network attack-defense stochastic evolutionary game system comprises: when the stability conditions, which include C_d ≥ 1 and C_a − V_ad ≥ 1, are satisfied, the network attack-defense stochastic evolutionary game system has a unique evolutionarily stable strategy ESS (0, 0); and when the instability conditions, which include C_d − V_a + V_ad + 1 ≤ 0 and C_a − V_a + 1 ≤ 0, are satisfied, the network attack-defense stochastic evolutionary game system has a unique evolutionarily stable strategy ESS (1, 1).
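The quintuple ADEGM = (N, S, P, Δ, U) of claims 2 to 4 could be held in a small data structure such as the one sketched below. This is an illustrative assumption rather than part of the claimed method: the class and field names are hypothetical, each belief is stored as a single probability (valid only for the two-strategy case of claim 4, where the other probability is its complement), and the payoff function is a placeholder because the payoff matrix is not reproduced here.

```python
from dataclasses import dataclass
from typing import Callable, Tuple


@dataclass(frozen=True)
class ADEGM:
    """Hypothetical container for the quintuple (N, S, P, Δ, U)."""
    players: Tuple[str, str]                # N = (N_D, N_A)
    defense_strategies: Tuple[str, str]     # DS = {DS1, DS2}: strong, weak
    attack_strategies: Tuple[str, str]      # AS = {AS1, AS2}: strong, weak
    q: float                                # probability the defender picks DS1
    p: float                                # probability the attacker picks AS1
    delta1: float                           # interference intensity on the defender
    delta2: float                           # interference intensity on the attacker
    payoff: Callable[[str, str], Tuple[float, float]]  # (DS, AS) -> (U_D, U_A)

    def __post_init__(self):
        # The claims require both intensity coefficients to be positive.
        if self.delta1 <= 0 or self.delta2 <= 0:
            raise ValueError("delta1 and delta2 must be positive")
        if not (0.0 <= self.q <= 1.0 and 0.0 <= self.p <= 1.0):
            raise ValueError("q and p must be probabilities in [0, 1]")


def placeholder_payoff(ds, as_):
    """Placeholder only: the real payoff matrix is not given in this text."""
    return (0.0, 0.0)


model = ADEGM(
    players=("N_D", "N_A"),
    defense_strategies=("DS1", "DS2"),
    attack_strategies=("AS1", "AS2"),
    q=0.5, p=0.5,            # assumed initial beliefs
    delta1=0.5, delta2=0.5,  # one of the intensity values used in the simulations
    payoff=placeholder_payoff,
)
```

Validating the coefficients at construction time means that an invalid model fails early instead of producing a meaningless simulation.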