
CN110209547B - SRAM type FPGA single event upset reinforcement timing refresh frequency determination method and system - Google Patents


Info

Publication number
CN110209547B
Authority
CN
China
Prior art keywords
fpga
circuit
rate
error
fpga circuit
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN201910368564.3A
Other languages
Chinese (zh)
Other versions
CN110209547A (en)
Inventor
贾晓宇
王颖
王建昭
李衍存
秦珊珊
郑玉展
张庆祥
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Beijing Institute of Spacecraft System Engineering
Original Assignee
Beijing Institute of Spacecraft System Engineering
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Beijing Institute of Spacecraft System Engineering
Priority to CN201910368564.3A
Publication of CN110209547A
Application granted
Publication of CN110209547B
Legal status: Active
Anticipated expiration

Classifications

    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06F: ELECTRIC DIGITAL DATA PROCESSING
    • G06F11/00: Error detection; Error correction; Monitoring
    • G06F11/22: Detection or location of defective computer hardware by testing during standby operation or during idle time, e.g. start-up testing
    • G06F11/26: Functional testing
    • G06F11/261: Functional testing by simulating additional hardware, e.g. fault simulation
    • Y: GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02: TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02D: CLIMATE CHANGE MITIGATION TECHNOLOGIES IN INFORMATION AND COMMUNICATION TECHNOLOGIES [ICT], I.E. INFORMATION AND COMMUNICATION TECHNOLOGIES AIMING AT THE REDUCTION OF THEIR OWN ENERGY USE
    • Y02D10/00: Energy efficient computing, e.g. low power processors, power management or thermal management

Landscapes

  • Engineering & Computer Science (AREA)
  • Computer Hardware Design (AREA)
  • General Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Quality & Reliability (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Logic Circuits (AREA)

Abstract

A method and system for determining the timed-refresh frequency used in single-event-upset (SEU) hardening of an SRAM-based FPGA (field-programmable gate array). Hardening measures applied to an SRAM-based FPGA must be verified and evaluated. The method uses fault injection as the verification means: upsets are injected into the configuration area of the FPGA, and the running state of the FPGA application is observed to determine the effectiveness of its SEU-hardening design. This solves the problem that the prior art cannot quantitatively evaluate the hardening effectiveness of an FPGA application, and it allows the usable range of refresh frequencies for the FPGA to be determined.

Description

A method and system for determining the timed-refresh frequency for single-event-upset hardening of an SRAM-based FPGA

Technical field

The invention relates to a method and system for determining the timed-refresh frequency for single-event-upset hardening of an SRAM-based FPGA, and belongs to the field of fault-tolerant design for FPGA applications.

Background

In recent years, space missions have placed higher demands on on-orbit data processing and stricter limits on the volume, weight, and power consumption of electronic equipment. At the same time, application satellites are moving toward miniaturization and integration, which makes the need for low-cost, rapid development more urgent. Commercial SRAM (static random-access memory) based FPGAs (field-programmable gate arrays) offer high integration, low development cost, high performance, low power consumption, and in-system reconfigurability. They have therefore attracted attention for space applications and are gradually being adopted in the aerospace field. However, SRAM-based FPGAs are prone to single-event upsets (SEUs). When an SRAM-based FPGA is used in space, a targeted protection design against SEUs is required to ensure that the FPGA application meets the mission's requirement for stable on-orbit operation.

Redundancy techniques, including error detection and correction and triple modular redundancy, together with timed refresh, are common SEU-hardening measures. Engineering practice shows that redundancy or refresh alone cannot achieve the desired hardening effect; the two must be combined. In space applications, hardening an SRAM-based FPGA generally also requires both redundancy and refresh. Hardening of an FPGA application is not always effective: if the hardening approach is inappropriate, it consumes system resources without reaching the intended hardening goal. At present there is no mature engineering method for verifying the SEU hardening of SRAM-based FPGAs. The available methods include:

Heavy-ion irradiation testing: heavy-ion irradiation is the most direct and accurate means of assessing hardening effectiveness. However, beam time at irradiation facilities is limited, the test design and associated techniques are complex, and dedicated test circuits must be prepared. It is therefore used mostly for research and technology validation and is not suitable as a routine engineering assessment of FPGA designs.

Simulated fault injection: a variety of simulation and analysis methods exist for predicting the error rate. They are inexpensive but insufficiently accurate. Simulated fault injection also requires the specific FPGA design and a certain degree of domain knowledge. The method is therefore suitable for unit-level designers checking their own FPGA hardening design, but it is not an ideal assessment method.

Fault injection is a direct and easy-to-use verification method that lies between irradiation testing and simulation analysis: upsets are injected into the FPGA, and the running state of the FPGA application is observed to determine the effectiveness of its SEU-hardening design. Fault injection currently falls into two types, dynamic and static. The dynamic approach is relatively direct but requires a more complex system design. Static fault injection is simple to implement and general, but its test results cannot be applied directly to circuit designs that use a refresh strategy.

Summary of the invention

The technical problem solved by the invention is as follows. In the prior art, engineering practice lacks a general and easily implemented method for quantitatively evaluating the effectiveness of SEU hardening in FPGA applications. Based on static fault injection, the invention proposes an assessment method for the SEU-hardening measures of SRAM-based FPGAs. The method quantitatively obtains the on-orbit upset rate of the FPGA application circuit and helps designers choose the timed-refresh frequency of the application.

The invention solves the above technical problem through the following technical solution:

A method for determining the timed-refresh frequency for SEU hardening of an SRAM-based FPGA, with the following steps:

(1) Harden the design of the FPGA circuit. Based on the space radiation environment and the static upset cross-section curve of the FPGA circuit, analyze the on-orbit static upset rate of the FPGA circuit to obtain its on-orbit static upset rate u;

(2) Rewrite the configuration file of the FPGA circuit in its initial state by writing N randomly located bit flips, then run the FPGA circuit. Over M trials, record the number of trials in which an output error or functional interruption occurs for each number of flipped bits, and calculate the error ratio $\lambda_N$ of the FPGA circuit;

(3) Gradually increase the number N of flipped bits in the configuration file of the FPGA circuit. When the error ratio $\lambda_N$ exceeds 50%, stop the fault injection test and record the error ratio $\lambda_N$;

(4) From the data recorded in step (3), calculate the average sensitive-bit ratio $r_N$ under N-bit flips, and determine whether the factor by which $r_N$ grows as N increases is greater than the change threshold K. If so, proceed to step (5) to check the minimum on-orbit upset rate; otherwise stop the test and return to step (1) to redo the hardening design of the FPGA circuit;

(5) Calculate the minimum on-orbit upset rate $E_{min}$ of the current hardened design and determine whether $E_{min}$ is less than the circuit error rate $E_0$ set by the mission. If $E_{min}$ is not less than $E_0$, the radiation tolerance of the hardened FPGA circuit is insufficient; return to step (1) and redo the hardening design. If $E_{min}$ is less than $E_0$, the hardening design is effective; proceed to step (6);

(6) Apply timed refresh to the FPGA circuit whose hardening design is effective, and calculate the relationship between the FPGA circuit error rate and the timed-refresh frequency f;

(7) Select any refresh frequency f within the range permitted by the mission, calculate the FPGA circuit error rate $E_f$ at that refresh frequency, and compare it with the circuit error rate $E_0$ set by the mission. If $E_f$ is less than $E_0$, the current refresh frequency is usable; otherwise it is too low, so increase the timed-refresh frequency and recalculate the FPGA circuit error rate at the new frequency;

Record each timed-refresh frequency f that satisfies the condition in step (7) together with the corresponding FPGA application error rate $E_f$, and repeat step (7) to obtain the $E_f$-f curve over the refresh-frequency range permitted by the mission, from which the user can select a refresh frequency.

In step (2), the number of injected bit flips N is initially 1. The error ratio $\lambda_N$ of the FPGA circuit is the ratio of the number of trials in which an output error or functional interruption of the FPGA circuit is observed to the total number of trials M, where M is set as follows:

Continue testing until 50 or more output errors or functional interruptions have been observed. If test time is limited, the required number of observed output errors or functional interruptions may be reduced to 20.

The initial number of flipped bits N is 1, and across the tests the number of flipped bits is increased exponentially.

If no error or functional interruption can be observed while the FPGA circuit is running, the fault injection test may be ended after $[16u/E_0]$ trials, and the 95%-confidence upper bound on the error ratio, $E_0/(4u)$, is used as the error ratio $\lambda_N$ for that test.

In step (4), the average sensitive-bit ratio $r_N$ under N-bit flips is calculated as follows:

[Formula image BDA0002049082280000041 not reproduced in this text: expression for $r_N$]

The change threshold K is the factor by which $r_N$ increases. With $N_{max}$ the largest number of flipped bits tested and $r_{N_{max}}$ the corresponding average sensitive-bit ratio:

$$K = r_{N_{max}} / r_1.$$

If K > 5, the hardening design is effective; otherwise the radiation tolerance of the hardened FPGA circuit is deemed insufficient, and the method returns to step (1) to redo the hardening design of the FPGA circuit.

In step (5), the minimum on-orbit upset rate $E_{min}$ is calculated as:

$$E_{min} = \lambda_1 u.$$

In step (6), the relationship between the error rate $E_f$ and the timed-refresh frequency f is as follows:

[Formula image BDA0002049082280000042 not reproduced in this text: expression for $E_f$]

$$P(N) = \frac{u^N f^{-N} e^{-u/f}}{N!}$$

where P(N) is the probability that N bit flips occur within one refresh cycle, u is the on-orbit static upset rate of the FPGA configuration area, and $E_f$ is the on-orbit error rate of the FPGA circuit at refresh frequency f.
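As a small numerical illustration of the Poisson term P(N) defined above, the sketch below evaluates it for a few values of N. The function name and the sample values of u and f are illustrative assumptions, not taken from the patent; u and f only need to share the same time unit so that u/f is the expected number of upsets per refresh cycle.

```python
import math

def flip_probability(n: int, u: float, f: float) -> float:
    """P(N): probability of exactly n upsets within one refresh cycle of length 1/f."""
    mean = u / f  # expected number of upsets per refresh cycle
    return mean ** n * math.exp(-mean) / math.factorial(n)

# Illustrative values only (assumed): u = 2 upsets/day, f = 24 refreshes/day.
u, f = 2.0, 24.0
probabilities = [flip_probability(n, u, f) for n in range(6)]
for n, p in enumerate(probabilities):
    print(f"P({n}) = {p:.3e}")
```

With a refresh cycle much shorter than the mean time between upsets, almost all of the probability mass sits at N = 0 and N = 1, which is why multi-bit accumulation becomes rare at high refresh frequencies.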

A system for determining the timed-refresh frequency for SEU hardening of an SRAM-based FPGA, comprising a fault injection module, an operating error ratio judgment module, an on-orbit upset rate judgment module, a circuit error rate calculation module, and a refresh frequency selection module, wherein:

Fault injection module: rewrites the configuration file of the FPGA circuit in its initial state by writing N randomly located bit flips, runs the FPGA circuit, and records, over M trials, the number of trials in which an output error or functional interruption occurs for each number of flipped bits; it also gradually increases the number of flipped bits N, repeats the test, and records the test data;

Operating error ratio judgment module: from the test data recorded by the fault injection module, calculates the error ratio $\lambda_N$ of the FPGA circuit for each number of flipped bits N; when $\lambda_N$ exceeds 50%, it sends a stop-test command to the fault injection module and records the error ratio $\lambda_N$;

On-orbit upset rate judgment module: from the error ratios $\lambda_N$ recorded by the operating error ratio judgment module for the different numbers of flipped bits N, calculates the average sensitive-bit ratio $r_N$ under N-bit flips and determines whether the factor by which $r_N$ grows as N increases is greater than the change threshold K. If so, the circuit error rate check is carried out; otherwise the test is stopped and the hardening design of the FPGA circuit is redone;

Circuit error rate calculation module: calculates the minimum on-orbit upset rate $E_{min}$ of the current hardened design and determines whether $E_{min}$ is less than the circuit error rate $E_0$ set by the mission. If $E_{min}$ is not less than $E_0$, the radiation tolerance of the hardened FPGA circuit is insufficient and the hardening design is redone; if $E_{min}$ is less than $E_0$, the hardening design is effective and refresh frequency selection is carried out;

Refresh frequency selection module: applies timed refresh to the FPGA circuit whose hardening design is effective, calculates the relationship between the FPGA circuit error rate and the timed-refresh frequency f, selects any refresh frequency f within the range permitted by the mission, calculates the FPGA circuit error rate $E_f$ at that frequency, and compares it with the circuit error rate $E_0$ set by the mission. If $E_f$ is less than $E_0$, the current refresh frequency is usable; otherwise it is too low, and the module increases the timed-refresh frequency and recalculates the FPGA circuit error rate until the mission requirement is met.

Compared with the prior art, the invention has the following advantages:

(1) The invention provides a method for determining the timed-refresh frequency of SEU-hardening measures for SRAM-based FPGAs. It combines orbital environment analysis with fault injection to predict the on-orbit error rate of an SRAM-based FPGA quantitatively, solving the current engineering problem that SEU-hardening measures for FPGA applications have no quantitative assessment. In addition, by measuring the curve of error ratio $\lambda_N$ against the number of flipped bits N, the method can identify directly whether a circuit contains redundancy. For circuits with no or little redundancy, a timed-refresh strategy is ineffective and the hardening design must be redone;

(2) The invention can guide the choice of timed-refresh frequency. Treating the occurrence of SEU events as Poisson-distributed and combining the device's on-orbit static upset rate with the mission requirements, it yields the relationship between the refresh frequency and the on-orbit error rate of the FPGA application, which helps designers choose a refresh strategy. Because it uses static fault injection, only random modifications to the FPGA configuration file are needed, and the assessment does not require the specific design files of the FPGA application. It is simple to implement and applies to different models of SRAM-based FPGA.

Description of the drawings

Fig. 1 is a flowchart of the frequency determination method provided by the invention;

Fig. 2 shows the trend of the average sensitive-bit ratio $r_N$ under N-bit flips against the number of flipped bits N;

Fig. 3 shows the relationship between the refresh frequency of circuit B and its predicted on-orbit error rate;

Detailed description

A method for determining the timed-refresh frequency for SEU hardening of an SRAM-based FPGA, as shown in Fig. 1, with the following steps:

(1) Harden the design of the FPGA circuit. Hardening generally comprises hardware-integrated redundancy, program error correction, timed refresh, and similar measures. For the hardware design, analyze the static upset rate u of the hardened FPGA circuit on its orbit from the FPGA device model used by the application, the mission orbit parameters, and the mission requirements, taking the space radiation environment into account;

The mission orbit parameters must include the perigee, apogee, and inclination of the mission orbit. Mission requirements generally include the required on-orbit functional-interruption rate of the FPGA and the required on-orbit error rates for each type of error. In the invention, a single error rate $E_0$ stands for these per-type requirements.

(2) Randomly inject N bit faults into the configuration file for the configuration area of the FPGA device in its initial state, obtaining an FPGA circuit with N flipped bits, and run that circuit. The fault injection procedure is as follows:

(2a) Write N errors at random locations into the application's configuration file;

(2b) Load the faulty configuration file into the application circuit;

(2c) Run the circuit, check whether it produces output errors or functional interruptions, and record the number of trials in which errors or anomalies occur;

(2d) If no anomaly has been observed within the specified time, stop the current trial and calculate the error ratio $\lambda_N$ of the FPGA circuit. If needed, continue the test by starting the next fault injection and repeating the steps above. The specified time must be set according to the specific application; in principle it must be long enough for errors in the FPGA application to be observed;

Carry out M trials. Once enough error events have been observed, calculate the error ratio $\lambda_N$ of the FPGA application design for each number of flipped bits N. The error ratio is the number of observed errors divided by the number of trials: $\lambda_N = \#\mathrm{Error} / \#\mathrm{Test}$;

Among the M trials, the number of trials showing an error event must not be fewer than 50: testing continues until 50 or more output errors or functional interruptions have been observed. If test time is limited, this number may be reduced to 20;

If the circuit error ratio $\lambda_N$ is extremely low, so that no errors or functional interruptions can be observed while the FPGA circuit runs, a maximum number of trials can be set as the total number of trials M, and testing stops once that maximum is reached. The maximum is determined by the reliability required of the circuit. For example, if the system requires a circuit error rate no higher than $E_0$, the corresponding circuit error ratio must not exceed $E_0/u$. If the fault injection test is required to resolve an error ratio of $E_0/(4u)$, the recommended number of trials is at least $[16u/E_0]$. In that case, even if no error is observed, the 95%-confidence upper bound on the circuit error ratio, 4/M, can be used in place of $\lambda_N$ as the error ratio of the FPGA application design;
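The trial loop of step (2) can be sketched as follows. This is a minimal illustration under stated assumptions, not the patent's implementation: the configuration file is modeled as a raw `bytes` object, `run_and_check` is a hypothetical user-supplied callable that loads a faulty configuration, runs the circuit, and returns `True` when an output error or functional interruption is observed, and the bracket in $[16u/E_0]$ is read as rounding up.

```python
import math
import random
from typing import Callable, Optional, Tuple

def inject_flips(bitstream: bytes, n_flips: int, rng: random.Random) -> bytes:
    """Return a copy of the configuration data with n_flips distinct random bits inverted."""
    data = bytearray(bitstream)
    for pos in rng.sample(range(len(data) * 8), n_flips):
        data[pos // 8] ^= 1 << (pos % 8)
    return bytes(data)

def error_ratio(bitstream: bytes, n_flips: int,
                run_and_check: Callable[[bytes], bool],
                u: float, e0: float,
                target_errors: int = 50,
                rng: Optional[random.Random] = None) -> Tuple[float, int, int]:
    """Estimate lambda_N for one flip count; returns (lambda_N, errors, trials).

    Stops once target_errors failing trials are seen (50, or 20 if time is limited),
    or after the trial cap if no error at all has been observed.
    """
    rng = rng or random.Random()
    # Assumption: [16u/E0] is interpreted as a ceiling; at least one trial is run.
    max_trials = max(1, math.ceil(16 * u / e0))
    errors = trials = 0
    while errors < target_errors and not (errors == 0 and trials >= max_trials):
        trials += 1
        if run_and_check(inject_flips(bitstream, n_flips, rng)):
            errors += 1
    if errors == 0:
        # No error observed: use the upper bound 4/M stated in the text.
        return 4 / trials, errors, trials
    return errors / trials, errors, trials
```

A campaign would call `error_ratio` once per flip count N, passing `target_errors=20` when test time is limited. The real `run_and_check` also has to enforce the observation time described in step (2d).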

(3) Gradually increase the number N of bit flips injected into the FPGA configuration area to obtain the relationship between the error ratio $\lambda_N$ and N, where:

When the number of flipped bits is large, $\lambda_N$ changes little as N increases, so it is recommended to increase N roughly exponentially. For example, the numbers of injected flips may be chosen as 1, 2, 4, 6, 10, 20, 40, 60, 100, 200, ...

Once the error ratio $\lambda_N$ exceeds 50%, the fault injection test can be ended;
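The flip-count schedule and stopping rule of step (3) can be sketched as below. The generator reproduces the example sequence from the text by multiplying 1, 2, 4, 6 by successive powers of ten, which is one reading of "roughly exponential"; `measure` is a hypothetical callable that returns the measured $\lambda_N$ for a given N, and `max_bits` is an assumed safety limit.

```python
from itertools import count, islice
from typing import Callable, Dict, Iterator

def flip_schedule() -> Iterator[int]:
    """Yield 1, 2, 4, 6, 10, 20, 40, 60, 100, 200, ..."""
    for decade in count():
        for mantissa in (1, 2, 4, 6):
            yield mantissa * 10 ** decade

def sweep(measure: Callable[[int], float], max_bits: int) -> Dict[int, float]:
    """Measure lambda_N along the schedule; stop once lambda_N exceeds 50%."""
    results: Dict[int, float] = {}
    for n in flip_schedule():
        if n > max_bits:  # never request more flips than the assumed limit
            break
        results[n] = measure(n)
        if results[n] > 0.5:
            break
    return results

first_ten = list(islice(flip_schedule(), 10))
```

The returned dictionary maps each tested N to its $\lambda_N$ and is the raw data for the $\lambda_N$ versus N curve.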

(4) Assess the effectiveness of the circuit hardening design. From the data recorded in step (3), calculate the average sensitive-bit ratio $r_N$ under N-bit flips, and determine whether the change in $r_N$ as the number of flipped bits N increases is greater than the change threshold K. If so, proceed to step (5) to check the minimum on-orbit upset rate; otherwise stop the test and redesign the hardened FPGA circuit, where:

The average sensitive-bit ratio under N-bit flips is calculated as follows:

[Formula image BDA0002049082280000071 not reproduced in this text: expression for $r_N$]

The change threshold K is the factor by which $r_N$ increases, that is, the ratio of the average sensitive-bit ratio $r_{N_{max}}$ at the largest number of flipped bits $N_{max}$ to $r_1$:

$$K = r_{N_{max}} / r_1$$

When K < 5, the circuit redundancy can be considered insufficient: timed refresh cannot reduce the on-orbit upset rate and can serve only as an anomaly-recovery measure, so the hardening design must be redone. If K > 5, the hardening design is effective;
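The threshold test on K can be sketched as follows. Because the formula for $r_N$ is not available in this text, the sketch takes already-computed $r_N$ values as input; the function names and the sample numbers are illustrative assumptions. The text does not say how K = 5 exactly is treated, so the sketch accepts only K strictly greater than 5.

```python
from typing import Dict

def change_factor(r_by_n: Dict[int, float]) -> float:
    """K = r_{N_max} / r_1, where N_max is the largest flip count tested."""
    n_max = max(r_by_n)
    return r_by_n[n_max] / r_by_n[1]

def redundancy_is_effective(r_by_n: Dict[int, float], threshold: float = 5.0) -> bool:
    """True when the sensitive-bit ratio grows by more than the threshold factor."""
    return change_factor(r_by_n) > threshold

# Assumed sample data: r_N rises with N, as expected for a circuit with redundancy.
sample = {1: 1e-4, 10: 4e-4, 100: 8e-4}
print(change_factor(sample), redundancy_is_effective(sample))
```

The reasoning behind the test is that a redundant circuit masks most single flips, so its per-bit sensitivity rises once several flips accumulate, whereas a circuit without redundancy shows a roughly constant $r_N$.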

(5) Calculate the minimum on-orbit upset rate $E_{min}$ of the hardened FPGA circuit, that is, the product of the error ratio $\lambda_1$ under single-bit injection and the on-orbit static upset rate u of the device. Compare $\lambda_1 u$, the minimum on-orbit upset rate that the circuit can reach with a timed-refresh strategy, with the circuit error rate $E_0$ set by the mission. If $\lambda_1 u \ge E_0$, the circuit's SEU tolerance is insufficient: no refresh frequency f can meet the SEU fault-tolerance requirement, and the circuit must be redesigned. If $\lambda_1 u < E_0$, the hardening design is effective and the next step, refresh-frequency analysis, can begin;

The minimum on-orbit upset rate $E_{min}$ is calculated as:

$$E_{min} = \lambda_1 u;$$
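A short worked check of step (5), with hypothetical function names and assumed sample numbers. The only requirement is that u and $E_0$ are expressed per the same time unit.

```python
def minimum_on_orbit_rate(lambda_1: float, u: float) -> float:
    """E_min = lambda_1 * u: the floor that no refresh frequency can go below."""
    return lambda_1 * u

def refresh_can_meet_requirement(lambda_1: float, u: float, e0: float) -> bool:
    """True when E_min < E_0, so a refresh-frequency analysis is worthwhile."""
    return minimum_on_orbit_rate(lambda_1, u) < e0

# Assumed values: 0.1% of single flips cause an error, 2 upsets/day, E_0 = 0.01/day.
print(minimum_on_orbit_rate(0.001, 2.0))               # 0.002 errors/day
print(refresh_can_meet_requirement(0.001, 2.0, 0.01))  # True
```

Refresh only removes accumulated flips; a single flip that already causes an error does so before the next refresh, which is why $\lambda_1 u$ acts as a lower limit.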

(6) Apply timed refresh to the FPGA circuit whose hardening design is effective and determine the relationship between the FPGA circuit error rate and the timed-refresh frequency f: using the on-orbit static upset rate u obtained in step (1) and the number of flipped bits N, calculate the relationship between the application error rate and f when f is the timed-refresh frequency;

From the FPGA configuration-area upset rate u calculated in step (1), and assuming a Poisson distribution, the probability P(N) that N bit flips occur in the FPGA configuration area within one refresh cycle 1/f is:

$$P(N) = \frac{u^N f^{-N} e^{-u/f}}{N!},$$

The probability E that the FPGA application circuit produces an error within one refresh cycle 1/f is then:

[Formula image BDA0002049082280000081 not reproduced in this text: expression for E]

Correspondingly, under a timed-refresh strategy with refresh frequency f, the relationship between the application error rate $E_f$ of the hardened FPGA circuit and the timed-refresh frequency f is:

[Formula image BDA0002049082280000082 not reproduced in this text: expression for $E_f$]

The per-cycle error probability E is usually far smaller than 1, so the above formula can also be approximated as:

[Formula image BDA0002049082280000091 not reproduced in this text: approximate expression for $E_f$]

(7) Select a refresh frequency f within the range the mission permits, compute E_f from the expression above, and compare it with the circuit error rate E_0 required by the design. If E_f is less than E_0, the current refresh frequency is usable; otherwise it is too low, so select a new timed refresh frequency and recompute the hardened design's application error rate at that frequency, repeating until the requirement is met;
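The selection loop in this step can be sketched as below. This is a sketch under stated assumptions, not the patent's exact formula, which sits in the equation figures: it assumes the per-period error probability is the P(N)-weighted sum of the measured ratios λ_N, with every N above the largest measured value counted as a certain error, and it uses the simplified conversion E_f ≈ E·f that the worked example applies. The `LAM` table is hypothetical, not measured data, and all function names are illustrative.

```python
import math

SECONDS_PER_WEEK = 7 * 24 * 3600


def poisson(n, mean):
    """Probability of exactly n upsets when `mean` upsets are expected."""
    return mean ** n * math.exp(-mean) / math.factorial(n)


def error_rate_per_week(period_s, u, lam):
    """Approximate E_f (errors per week) for a refresh period in seconds.

    lam[N] is the error ratio for N accumulated upsets and must be given
    for every N from 1 to max(lam). Larger N is treated as a certain error.
    """
    mean = u * period_s            # u / f
    covered = poisson(0, mean)     # zero upsets cannot cause an error
    e_period = 0.0
    for n in range(1, max(lam) + 1):
        p = poisson(n, mean)
        covered += p
        e_period += p * lam[n]
    e_period += max(0.0, 1.0 - covered)  # tail beyond the table, ratio 1
    return e_period / period_s * SECONDS_PER_WEEK  # E_f ~ E * f


def usable_periods(candidates_s, u, lam, e0_per_week):
    """Keep the candidate refresh periods whose E_f is below E_0."""
    return [t for t in candidates_s
            if error_rate_per_week(t, u, lam) < e0_per_week]


U = 1.94e-2  # upsets per second, worst-day value from the embodiment
# HYPOTHETICAL error ratios for N = 1..20; real values come from fault injection.
LAM = {n: min(1.0, 1.8e-5 * n * n) for n in range(1, 21)}

rates = {t: error_rate_per_week(t, U, LAM) for t in (10, 60, 360, 600)}
ok = usable_periods(rates, U, LAM, e0_per_week=1.0)
```

Plotting `rates` against the refresh frequency gives the kind of E_f versus f curve that the next step records. Because fault injection increases N exponentially, a real table would need interpolation between measured N values before it could be used this way.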

(8) Record each timed refresh frequency that satisfies the screening condition in step (7), together with the FPGA application error rate E_f at that refresh frequency f, and repeat step (7) to obtain the E_f versus f curve over the refresh frequency range the mission permits, so that the user can choose a refresh frequency.

In conjunction with the above method, a system for determining the timed refresh frequency for single event upset hardening of SRAM-based FPGAs is also designed. It comprises a fault injection module, an operation error ratio judgment module, an on-orbit upset rate judgment module, a circuit error rate calculation module, and a refresh frequency selection module, wherein:

Fault injection module: rewrites the configuration file of the FPGA circuit in its initial state, randomly writes N bit upsets, runs the FPGA circuit, and records, over M trials, the number of trials in which an output error or functional interruption occurs for each number of upset bits; it also increases the number of upset bits N step by step, repeating the test and recording the data;

Operation error ratio judgment module: from the test data recorded by the fault injection module, computes the error ratio λ_N of FPGA circuit operation for each number of upset bits N; when λ_N exceeds 50%, it sends a stop-test command to the fault injection module and records the error ratio λ_N;

On-orbit upset rate judgment module: from the error ratios λ_N recorded by the operation error ratio judgment module for each number of upset bits N, computes the average sensitive-bit ratio r_N under N-bit upsets, and judges whether the factor by which r_N grows as N increases exceeds the change threshold K. If so, the circuit error rate judgment is performed; otherwise the test is stopped and the hardening of the FPGA circuit is redesigned;

Circuit error rate calculation module: computes the minimum on-orbit upset rate E_min of the current hardening design and judges whether E_min is less than the circuit error rate E_0 set by the mission. If E_min is not less than E_0, the radiation tolerance of the hardened FPGA circuit is insufficient and the hardening is redesigned; if E_min is less than E_0, the hardening design is effective and refresh frequency selection is performed;

Refresh frequency selection module: applies timed refresh to the FPGA circuit whose hardening design is effective, computes the relationship between the FPGA circuit error rate and the timed refresh frequency f, selects any refresh frequency f within the range the mission permits, computes the FPGA circuit error rate E_f at that frequency, and compares it with the circuit error rate E_0 set by the mission. If E_f is less than E_0, the current refresh frequency is usable; otherwise it is too low, and the timed refresh frequency is increased and the error rate recomputed until the mission requirement is met.
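The check performed by the on-orbit upset rate judgment module can be illustrated with a short Python sketch. The threshold of 5 is the value the claims give for K; the function name and both r_N tables are hypothetical, chosen only to contrast a flat response with a growing one.

```python
def hardening_scales_with_upsets(r_by_n, k_threshold=5.0):
    """Return True when r_N grows by more than k_threshold from N = 1 to N_max.

    r_by_n maps the number of injected upset bits N to the average
    sensitive-bit ratio r_N, and must contain N = 1.
    """
    n_max = max(r_by_n)
    k = r_by_n[n_max] / r_by_n[1]  # K = r_{N_max} / r_1
    return k > k_threshold


# HYPOTHETICAL data: a flat response (no redundancy) and a growing one (redundant).
flat = {1: 2.0e-4, 2: 2.0e-4, 4: 2.1e-4, 8: 2.1e-4}
growing = {1: 1.8e-5, 2: 3.5e-5, 4: 7.4e-5, 8: 1.5e-4}

flat_ok = hardening_scales_with_upsets(flat)
growing_ok = hardening_scales_with_upsets(growing)
```

A design that fails this check gains little from timed refresh, so it goes back for redesign before any refresh frequency is computed.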

A specific embodiment is described below:

The invention proposes a method, based on static fault injection, for determining the timed refresh frequency of an FPGA circuit application. It can be used to determine the timed refresh frequency of an SRAM-based FPGA and to quantitatively assess the on-orbit error rate of the FPGA application.

The first step is to predict the device's on-orbit upset rate from the orbit, the device's static cross-section, and space weather constraints. For the method, see "SEU rate analysis method for triple-modular-redundancy protection structures" (11th National Annual Conference on Radiation-Hardened Electronics and Electromagnetic Pulse). Taking a Xilinx V2-series 3-million-gate FPGA as an example, its on-orbit static upset rate is shown in the table below.

Figure BDA0002049082280000101

Suppose a mission requires the device to withstand the worst-day environment, and requires that under the worst-day environment the FPGA application error rate not exceed E_0 = 1 per week.

With the device averaging 1.94×10^-2 upsets per second, about 11,733 upsets occur in an average week, or roughly 1.2×10^4 after rounding. The probability that any one upset causes an error must therefore be below 1/(1.2×10^4) = 8.3×10^-5. Note that 8.3×10^-5 is not the error ratio measured when a single upset is injected. In general, as upsets accumulate in the FPGA configuration memory, the probability that an SEU causes an error rises, so under a timed refresh strategy the average error probability per upset depends on the refresh frequency.
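The arithmetic in this paragraph can be reproduced in a few lines of Python. The variable names are illustrative; the numbers are the ones quoted in the text.

```python
SECONDS_PER_WEEK = 7 * 24 * 3600  # 604800 s

u = 1.94e-2                               # average upsets per second (worst day)
upsets_per_week = u * SECONDS_PER_WEEK    # about 11733 upsets per week

rounded_upsets = 1.2e4                    # the rounded figure used in the text
e0_per_week = 1.0                         # mission limit: one error per week
per_upset_budget = e0_per_week / rounded_upsets  # about 8.3e-5

print(f"upsets per week: {upsets_per_week:.0f}")
print(f"allowed error probability per upset: {per_upset_budget:.2e}")
```

The budget is an average over all upsets in a week, which is why it cannot be compared directly with the single-upset error ratio measured by fault injection.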

The second step is fault injection. At least 50 errors must be observed. If the circuit is hardened so well that no error is observed even after 1.2×10^4 × 16 = 19.2×10^4 trials, testing may stop, and the on-orbit application error rate is evaluated or predicted using the upper confidence limit of the circuit error ratio at 95% confidence, 2.05×10^-5.

The third step is to vary the number of injected upset bits N and obtain the curve of error ratio λ_N versus N. Once the circuit error ratio exceeds 50%, testing may stop and the analysis proceeds to the next step.

The fourth step is to analyze whether the circuit hardening is effective. Suppose the fault injection results for two circuits, A and B, are as in the table below.

Figure BDA0002049082280000111

Here the circuit error ratio λ_N has already been computed, and r_N has been computed from it. The relationship between r_N and N is shown in Figure 2.

For circuit A, the average sensitive-bit ratio r_N under N-bit upsets in the configuration memory stays almost constant as N grows. Such a circuit is typical of an unprotected design with no redundancy, and timed refresh cannot lower its on-orbit error rate. Its on-orbit error rate is u·λ_1. Taking the worst-day upset rate u = 1.94×10^-2 per second and a circuit error ratio λ_1 of about 2.0×10^-4, the minimum on-orbit error rate E_min of circuit A is 2.3 per week, which fails the requirement that the error rate not exceed E_0 = 1 per week.

For circuit B, r_N increases with N, which is typical of a circuit hardened with redundancy. Timed refresh can lower the on-orbit error rate of such a circuit, down to a minimum of λ_1·u. If the product λ_1·u of circuit B's single-upset error ratio and the device upset rate exceeds E_0 (in this example E_0 = 1 per week and the static upset rate u is 1.94×10^-2 per second, which corresponds to λ_1 > 8.3×10^-5), then no matter how often the circuit is refreshed it cannot meet the SEU tolerance requirement, and the hardening must be redesigned. Otherwise, a suitable refresh frequency can meet the mission requirement.
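The feasibility test applied to circuits A and B can be sketched in Python as follows. Function names are illustrative. The value λ_1 = 2.0×10^-4 is circuit A's figure from the text, while 1.8×10^-5 is a hypothetical stand-in for a circuit that passes.

```python
SECONDS_PER_WEEK = 7 * 24 * 3600


def min_error_rate_per_week(lambda_1, u_per_second):
    """E_min = lambda_1 * u, converted from per second to per week."""
    return lambda_1 * u_per_second * SECONDS_PER_WEEK


def refresh_can_meet_target(lambda_1, u_per_second, e0_per_week):
    """True when some refresh frequency could bring the error rate below E_0."""
    return min_error_rate_per_week(lambda_1, u_per_second) < e0_per_week


U = 1.94e-2   # worst-day upset rate, per second
E0 = 1.0      # mission limit, errors per week

e_min_a = min_error_rate_per_week(2.0e-4, U)          # circuit A
a_feasible = refresh_can_meet_target(2.0e-4, U, E0)
b_like_feasible = refresh_can_meet_target(1.8e-5, U, E0)  # hypothetical lambda_1

# Largest lambda_1 that still passes, using the unrounded weekly upset count.
lambda_1_limit = E0 / (U * SECONDS_PER_WEEK)
```

With the unrounded weekly upset count the limit comes out slightly above the 8.3×10^-5 quoted in the text, which is based on the rounded figure of 1.2×10^4 upsets per week.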

The fifth step is to determine the timed refresh frequency. The purpose of timed refresh is to minimize the accumulation of SEUs in the configuration memory and so reduce how often errors occur. Circuit B is analyzed here as an example.

Before the precise calculation, the refresh frequency can be roughly estimated. For refresh frequency f, the average number of upsets in one refresh period is n = [u/f] (square brackets denote rounding to an integer). The product r_n·u then gives a simple gauge of the errors arising within a refresh period, and E_f = r_n·u serves as a simple estimate of the error rate. Note that this gives only a preliminary estimate that helps narrow down the approximate refresh frequency range quickly; it is not suitable as a final analysis result. The preliminary results are in the table below.

Figure BDA0002049082280000121

From the table above and the requirement that the error rate not exceed 1 per week, the refresh period should initially be chosen between 5.2 and 8.6 minutes.

① Choose a refresh period of 6 minutes, which meets the requirement of no more than 1 error per week with a relatively low refresh frequency. The calculation is shown in the table below. In this example the probability of more than 20 upsets in one period is already very low, so the error ratio for more than 20 upsets is taken as 100% in order to estimate an upper bound on the circuit error rate.

Figure BDA0002049082280000122

Figure BDA0002049082280000131

From the table, the total probability that the circuit errs because of SEUs within one 360 s refresh period is 5.68×10^-4, which converts to an error rate per unit time of E_f = E·f = 1.578×10^-6 per second = 0.95 per week, meeting the mission requirement.

② Choose a refresh period of 60 seconds to improve the circuit's protection. In this example the probability of more than 10 upsets in one period is already very low, so the error ratio for more than 10 upsets is taken as 100%. The calculation is shown in the table below.

Figure BDA0002049082280000132

From the table, the probability that the circuit errs because of SEUs within one 60 s refresh period is 3.35×10^-5, which converts to an error rate per unit time of E_f = E·f = 5.58×10^-7 per second = 0.34 per week. This is only 36% of the error rate with the 360 s refresh period; raising the refresh frequency improves the circuit's SEU tolerance by 180%, a clear improvement.
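The unit conversions in the two cases above can be checked with a short Python sketch. The helper name is illustrative; the per-period probabilities are the totals quoted from the tables.

```python
SECONDS_PER_WEEK = 7 * 24 * 3600


def errors_per_week(e_period, period_s):
    """E_f = E * f with f = 1 / period_s, expressed in errors per week."""
    return e_period / period_s * SECONDS_PER_WEEK


rate_360 = errors_per_week(5.68e-4, 360)  # 6-minute refresh period
rate_60 = errors_per_week(3.35e-5, 60)    # 1-minute refresh period
ratio = rate_60 / rate_360

print(f"360 s: {rate_360:.2f} per week, 60 s: {rate_60:.2f} per week")
print(f"ratio: {ratio:.2f}")
```

The unrounded ratio is about 0.35. The 36% in the text follows from dividing the rounded figures 0.34 and 0.95.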

③ Best achievable SEU tolerance of the circuit

Here we discuss the best SEU protection the circuit can reach as the refresh frequency is raised. In the ideal case for timed refresh, upsets in the FPGA configuration memory are corrected quickly and cannot accumulate. However, a refresh strategy cannot prevent errors caused by a single configuration-bit upset. So when almost every refresh period contains at most a single bit upset, the circuit has reached its best protection, and the circuit error rate equals u·λ_1. In theory, the lowest error rate circuit B can reach with timed refresh is 0.211 per week.

In this example, the circuit error rate was computed at several refresh frequencies; the results are shown in Figure 3. As the refresh period shortens and the refresh frequency rises, the circuit error rate gradually approaches the circuit's theoretical minimum error rate u·λ_1.

In practice, a higher refresh frequency means greater consumption of circuit resources, so the refresh frequency should not be raised without limit in pursuit of maximum protection. In this example the mission requires an error rate of no more than 1 per week. A 6-minute refresh period already meets that requirement and is the lowest refresh frequency that does. To improve SEU tolerance, a 1-minute refresh period lowers the error rate to 0.34 per week, which is a reasonable choice. Refreshing every 10 seconds lowers the error rate further to 0.22 per week, but the gain in SEU hardening is small relative to the cost; improving the circuit's redundancy design is a better way to raise SEU hardening performance.

The above is only the preferred embodiment of the invention, but the scope of protection of the invention is not limited to it. Any change or substitution that a person skilled in the art could readily conceive within the technical scope disclosed by the invention shall fall within the scope of protection of the invention.

Matters not described in detail in this specification are well known to those skilled in the art.

Claims (1)

1. A method for determining the timed refresh frequency for single event upset hardening of an SRAM-based FPGA, characterized by comprising the following steps:
(1) Perform a hardening design on the FPGA circuit, and analyze the on-orbit static upset rate of the FPGA circuit from the space radiation environment and the circuit's static upset cross-section curve, obtaining the on-orbit static upset rate u of the FPGA circuit;
(2) Rewrite the configuration file of the FPGA circuit in its initial state, randomly write N bit upsets, run the FPGA circuit, record over M trials the number of trials in which an output error or functional interruption occurs for each number of upset bits, and compute the error ratio λ_N of FPGA circuit operation;
(3) Increase the number of upset bits N in the configuration file of the FPGA circuit step by step; when the error ratio λ_N of FPGA operation exceeds 50%, stop the fault injection test and record the error ratio λ_N;
(4) From the data recorded in step (3), compute the average sensitive-bit ratio r_N under N-bit upsets, and judge whether the factor by which r_N grows as N increases exceeds a change threshold K; if so, proceed to step (5) to judge the minimum on-orbit upset rate; otherwise stop the test and return to step (1) to redesign the hardening of the FPGA circuit;
(5) Compute the minimum on-orbit upset rate E_min of the current hardening design, and judge whether E_min is less than the circuit error rate E_0 set by the mission; if E_min is not less than E_0, the radiation tolerance of the hardened FPGA circuit is insufficient, and the method returns to step (1) to redesign the hardening of the FPGA circuit; if E_min is less than E_0, the hardening design is effective and the method proceeds to step (6);
(6) Apply timed refresh to the FPGA circuit whose hardening design is effective, and compute the relationship between the FPGA circuit error rate and the timed refresh frequency f;
(7) Select any refresh frequency f within the range the mission permits, compute the FPGA circuit error rate E_f at the current refresh frequency, and compare it with the circuit error rate E_0 set by the mission; if E_f is less than E_0, the current refresh frequency is usable; otherwise the current refresh frequency is too low, so increase the timed refresh frequency and recompute the FPGA circuit error rate at that frequency;
(8) Record each timed refresh frequency that satisfies the screening condition in step (7), together with the FPGA application error rate E_f at that refresh frequency f, and repeat step (7) to obtain the E_f versus f curve over the refresh frequency range the mission permits, so that the user can choose a refresh frequency;
in step (2), the number of injected upset bits N in the initial state is 1, and the error ratio λ_N of FPGA circuit operation is the ratio of the number of trials in which an output error or functional interruption of the FPGA circuit is observed to the total number of trials M; the total number of trials M is set as follows:
testing continues until 50 or more output errors or functional interruptions are observed; if test time is limited, the required number of observed output errors or functional interruptions may be reduced to 20;
the initial number of upset bits N is 1, and over the M trials the number of upset bits is increased exponentially;
if no error or functional interruption is observed while the FPGA circuit runs, the fault injection test may be ended after [16u/E_0] trials, and the upper limit of the error ratio at 95% confidence, E_0/(4u), is taken as the error ratio λ_N of that test;
In the step (5), the average sensitive-bit ratio r_N under N-bit upsets is calculated as follows:
Figure FDA0004189424140000031
the change threshold K is the factor by which r_N increases; that is, when the number of upset bits N takes its maximum value N_max, the average sensitive-bit ratio is r_{N_max}, and
$$K = r_{N_{\max}} / r_1$$
If the change threshold K is greater than 5, the hardening design is effective; otherwise the radiation tolerance of the hardened FPGA circuit is judged insufficient, and the method returns to step (1) to redesign the hardening of the FPGA circuit;
in the step (5), the minimum on-orbit upset rate E_min is calculated as:
$$E_{\min} = \lambda_1 u$$
in the step (6), the error rate E_f is related to the timed refresh frequency f as follows:
Figure FDA0004189424140000032
$$P(N) = \frac{u^{N} f^{-N} e^{-u/f}}{N!}$$
where P(N) is the probability of N bit upsets within one refresh period, u is the on-orbit static upset rate of the FPGA configuration memory, and E_f is the on-orbit error rate of the FPGA circuit at refresh frequency f;
the system for determining the timed refresh frequency for single event upset hardening of an SRAM-based FPGA, which implements the above method, comprises a fault injection module, an operation error ratio judgment module, an on-orbit upset rate judgment module, a circuit error rate calculation module, and a refresh frequency selection module, wherein:
the fault injection module: rewrites the configuration file of the FPGA circuit in its initial state, randomly writes N bit upsets, runs the FPGA circuit, and records, over M trials, the number of trials in which an output error or functional interruption occurs for each number of upset bits; it also increases the number of upset bits N step by step, repeating the test and recording the data;
the operation error ratio judgment module: from the test data recorded by the fault injection module, computes the error ratio λ_N of FPGA circuit operation for each number of upset bits N; when λ_N exceeds 50%, it sends a stop-test command to the fault injection module and records the error ratio λ_N;
the on-orbit upset rate judgment module: from the error ratios λ_N recorded by the operation error ratio judgment module for each number of upset bits N, computes the average sensitive-bit ratio r_N under N-bit upsets, and judges whether the factor by which r_N grows as N increases exceeds the change threshold K; if so, the circuit error rate judgment is performed; otherwise the test is stopped and the hardening of the FPGA circuit is redesigned;
the circuit error rate calculation module: computes the minimum on-orbit upset rate E_min of the current hardening design and judges whether E_min is less than the circuit error rate E_0 set by the mission; if E_min is not less than E_0, the radiation tolerance of the hardened FPGA circuit is insufficient and the hardening is redesigned; if E_min is less than E_0, the hardening design is effective and refresh frequency selection is performed;
the refresh frequency selection module: applies timed refresh to the FPGA circuit whose hardening design is effective, computes the relationship between the FPGA circuit error rate and the timed refresh frequency f, selects any refresh frequency f within the range the mission permits, computes the FPGA circuit error rate E_f at the current refresh frequency, and compares it with the circuit error rate E_0 set by the mission; if E_f is less than E_0, the current refresh frequency is usable; otherwise the current refresh frequency is too low, and the timed refresh frequency is increased and the FPGA circuit error rate recomputed until the mission requirement is met;
the on-orbit static upset rate of the FPGA circuit is analyzed from the space radiation environment and the circuit's static upset cross-section curve, determining the upset rate relevant to the refresh frequency;
during fault injection, the number of trials with output errors or functional interruptions is estimated;
by adjusting the number of upset bits, the error ratio parameter is recorded after the fault injection test stops, and the error ratio judgment parameter is set;
a change threshold K is set, and whether the factor by which the average sensitive-bit ratio r_N grows with the number of upset bits N exceeds K determines whether the hardening of the FPGA circuit must be redesigned;
whether the hardening is effective is judged by computing the minimum on-orbit upset rate, and the timed refresh frequency is determined so as to reduce how often errors occur.
CN201910368564.3A 2019-05-05 2019-05-05 SRAM type FPGA single event upset reinforcement timing refresh frequency determination method and system Active CN110209547B (en)


Publications (2)

Publication Number Publication Date
CN110209547A CN110209547A (en) 2019-09-06
CN110209547B true CN110209547B (en) 2023-06-16

Family

ID=67786875



Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant