CN110533150B

CN110533150B - Test generation and reuse system and method based on support vector machine regression model

Info

Publication number: CN110533150B
Application number: CN201910606331.2A
Authority: CN
Inventors: 钱忠胜; 宋涛
Original assignee: Jiangxi University of Finance and Economics
Current assignee: Jiangxi University of Finance and Economics
Priority date: 2019-07-05
Filing date: 2019-07-05
Publication date: 2023-05-23
Anticipated expiration: 2039-07-05
Also published as: CN110533150A

Abstract

The invention discloses a test generation and reuse system and method based on a support vector machine regression model. The system comprises: a test case generation unit based on the support vector machine regression model, which comprises a support vector machine regression module and a genetic algorithm module, wherein the support vector machine regression module trains a model that approximates fitness values, and the genetic algorithm module generates test cases with reference to the model trained by the support vector machine regression module; and a test case reuse unit integrating the support vector machine regression model, which comprises a test case reuse module. When a genetic algorithm is used to test a program, the support vector machine regression model is trained using individuals of the evolving population and their fitness values computed by the instrumentation method as samples; the model is then used to query the program's test case set for test data with higher fitness; and, when this test data is introduced into the iteration of a new population, each selected individual undergoes a crossover operation with a randomly selected referenced individual with a certain probability.

Description

Test generation and reuse system and method based on support vector machine regression model

Technical Field

The present invention relates to the technical field of software development, and in particular to a test generation and reuse system and method based on a support vector machine regression model.

Background Art

Software testing runs through the entire software development process and is an indispensable part of it. Typically, nearly 40% of the time and effort in the software life cycle is spent on testing. Studies have shown that the cost of software testing can be as high as 3 to 5 times the combined cost of all other software engineering stages. Effectively improving testing efficiency, and reducing both the cost of testing and the workload of test engineers, is one of the active research topics in the software field.

Test reuse means reusing already generated test resources in new software testing: when performing new tests or regression tests, test engineers use existing test cases, directly or with simple modifications, and apply them to the test. The aim is to use existing test resources multiple times so that the results of one generation effort deliver the greatest benefit, without regenerating test resources for every test, thereby improving the efficiency of software testing and the reliability of the software. Since test cases are the core resource of software testing, test case reuse is the key element of test reuse as a whole.

As a global search method inspired by biological evolution and genetic variation in nature, the genetic algorithm has produced fruitful research results in software testing in recent years. A genetic algorithm consists of population initialization, individual evaluation, selection, crossover, mutation, and a check of the termination condition. The initial population is usually generated randomly, and individual evaluation computes the fitness value of each individual in the population through a fitness function. Traditional methods usually have to feed each individual into the instrumented program to compute its fitness. Fitness determines the quality of an individual, and individuals are selected to evolve the population according to the principle of survival of the fittest.

Software testing involves a great deal of repetitive work. Testing a new program often requires generating new test cases, and failing to make full use of existing test cases increases testing costs and lowers the productivity of test engineers. When a genetic algorithm is used to generate new test cases, computing individual fitness by the traditional method takes a large amount of running time.

Summary of the Invention

In view of this, it is necessary to provide a test generation and reuse method and system based on a support vector machine regression model that improves the efficiency of test case generation and reduces the workload of software testing.

A test generation and reuse system based on a support vector machine regression model comprises:

A test case generation unit based on the support vector machine regression model, comprising a support vector machine regression module and a genetic algorithm module. The support vector machine regression module trains a model that approximates fitness values, using training samples drawn from the test cases output by the genetic algorithm module. The genetic algorithm module generates test cases with reference to the model trained by the support vector machine regression module.

A test case reuse unit integrating the support vector machine regression model, comprising a test case reuse module. When a genetic algorithm is used to test a program, the test case reuse module initializes a predetermined number of individuals and, during population evolution, trains the support vector machine regression model using individuals of the evolving population and their fitness values computed by the instrumentation method as samples, so that support vector machine regression is integrated into the test case generation performed by the genetic algorithm module. The trained model is applied to test case generation; individuals selected by the model with fitness above a predetermined value are introduced into the evolution of the population, and this test data is referenced during the iteration of each new population: each selected individual undergoes a crossover operation, with a predetermined probability, with a randomly selected referenced individual, thereby achieving test case reuse.

Further, in the test case generation unit based on the support vector machine regression model, when the genetic algorithm module generates test cases, it evolves the population through population initialization, selection, crossover, mutation, and similar operations. The fitness values of these individuals are exact values output by the instrumented program, and the individuals are fed to the support vector machine regression model as training samples. Once the prediction model has been trained, subsequent evolution uses the trained model to predict the fitness of individuals, while the other genetic operations proceed as in the traditional approach. Based on the model's predictions, a range of fitness values corresponding to coverage of the target path is set; if an individual's fitness value falls within this interval during evolution, the individual is run through the instrumented program to compute its exact fitness.

Further, the test case reuse unit integrating the support vector machine regression model combines the support vector machine regression module and the genetic algorithm module and uses the trained support vector machine regression model as the prediction model; once the prediction model has been trained, the fitness of population individuals is approximated by it. The support vector machine regression model is used to query the program's test case set for test data whose fitness exceeds a predetermined value. When a referenced individual has been referenced more than a predetermined number of times (set to three), it is removed to avoid falling into a local optimum. If the support vector machine regression model predicts individual fitness well, the high-fitness individuals it selects carry good genes favorable to survival; when these referenced individuals are combined with population individuals, the introduction of those genes accelerates the evolution of the population.

Further, the test case generation process requires a predetermined number of samples to train the constructed support vector machine regression model. Each sample $(X, d)$ consists of the input features $X$ and the corresponding true value $d$: the features $X$ are the input vector of the test case, and $d$ is its fitness value.

Further, the test cases are denoted $X_1, X_2, \ldots, X_a$, where $a$ is the number of samples, and their fitness values $d_1, d_2, \ldots, d_a$ are the true fitness values obtained by running them through the instrumented program, giving a sample set of size $a$, $\{(X_i, d_i)\}_{i=1}^{a}$. The samples are input into the support vector machine regression model, which fits a weight $w$ for each input feature from the sample data and fitness values. The relationship between the predicted value and the input feature values is given by a formula [formula not preserved in the source text].

Further, the model is trained as follows: while the genetic algorithm generates test cases, the generated test cases and their fitness values obtained by the instrumentation method are used as samples to train the support vector machine regression model; the trained model can approximate the fitness of test cases for the program under test.

Also provided is a test generation and reuse method based on a support vector machine regression model, which uses the test generation and reuse system described in any of the above. The method uses the support vector machine regression model to predict fitness values, uses the model to search the test case library of the program under test for individuals with higher fitness, and reuses those individuals in testing the program. It comprises the following steps:

Step 1: design a support vector machine training model for approximating the fitness values of test cases. During program testing, compute individual fitness values by the instrumentation method and input the population individuals and their fitness values into the training model.

Step 2: provide a method for approximating fitness with the support vector machine regression model. While generating test cases with the genetic algorithm, train the support vector machine regression model using a predetermined set of population individuals and their fitness values as samples; in subsequent population evolution, use this model to compute individual fitness in place of the traditional instrumentation-based computation.

Step 3: provide a method for locating test data with the support vector machine regression model and reusing it in program testing. The model trained by the fitness approximation method is used to search the test case library of the corresponding program for test data with higher fitness; when the genetic algorithm is used to test the program and generate the corresponding test cases, this test data is referenced and undergoes crossover operations with the evolving individuals.

Further, the method for approximating fitness with the support vector machine regression model comprises the following steps:

Step a: divide the obtained samples into training samples and prediction samples in a ratio of 8:2;

Step b: compute the weight of each feature of the training samples, which involves the risk function R in the presence of errors, the Lagrange multipliers α, and predicting the fitness values of test cases using the kernel mapping method;

Step c: correct the weights of the samples and validate them by cross-validation;

Step d: once all training samples have been processed, training of the prediction model ends;

Step e: with training complete, use the prediction samples to determine the fitness interval of individuals that cover the target path.
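Steps a and e can be sketched in a few lines of Python. This is a minimal illustration, not the patented implementation: the text does not say how the interval in step e is derived, so the rule used here (take the range of predicted fitness over held-out samples whose true fitness reaches a hypothetical `target_fitness`) is an assumption, as are the names `split_samples`, `fitness_interval`, and `margin`. The trained regression model is passed in as a plain callable.

```python
import random
from typing import Callable, List, Sequence, Tuple

# A sample pairs a test-case input vector X with its true fitness d.
Sample = Tuple[Sequence[float], float]


def split_samples(
    samples: List[Sample], train_ratio: float = 0.8, seed: int = 0
) -> Tuple[List[Sample], List[Sample]]:
    """Step a: shuffle and split into training and prediction samples (8:2 by default)."""
    shuffled = list(samples)
    random.Random(seed).shuffle(shuffled)
    cut = int(len(shuffled) * train_ratio)
    return shuffled[:cut], shuffled[cut:]


def fitness_interval(
    predict: Callable[[Sequence[float]], float],
    held_out: List[Sample],
    target_fitness: float,
    margin: float = 0.0,
) -> Tuple[float, float]:
    """Step e (assumed rule): range of predicted fitness over held-out samples
    whose true fitness reaches target_fitness, widened by margin on both sides."""
    preds = [predict(x) for x, d in held_out if d >= target_fitness]
    if not preds:
        raise ValueError("no held-out sample reaches the target fitness")
    return min(preds) - margin, max(preds) + margin
```

An individual whose predicted fitness later falls inside the returned interval would then be re-evaluated exactly by the instrumented program.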

Further, the method for generating test cases using the genetic algorithm comprises the following steps:

Step S1: initialize the population;

Step S2: run the population individuals through the instrumented program to obtain the fitness value of each individual, and input the individuals and their fitness values into the support vector machine model to train the support vector machine regression model;

Step S3: approximate the fitness of each evolving individual using the trained model;

Step S4: for individuals whose fitness values fall within the preset threshold range, compute the exact values using the instrumented program;

Step S5: if the termination condition is met or the maximum number of iterations is reached, the algorithm terminates; otherwise go to step S6;

Step S6: perform selection, crossover, and mutation on the individuals, then go to step S3.
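The loop S1 to S6 can be sketched as follows. This is an illustrative outline under stated assumptions, not the patent's code: `exact_fitness` stands in for running the instrumented program, `fit_surrogate` stands in for training the support vector machine regression model and returns a predictor, and the integer encoding, binary tournament selection, one-point crossover, and all default parameter values are choices made here for brevity.

```python
import random
from typing import Callable, List, Optional, Tuple

Individual = Tuple[int, ...]
Sample = Tuple[Individual, float]
Predictor = Callable[[Individual], float]


def surrogate_ga(
    exact_fitness: Callable[[Individual], float],        # stand-in for the instrumented program
    fit_surrogate: Callable[[List[Sample]], Predictor],  # stand-in for regression model training
    interval: Tuple[float, float],  # predicted-fitness range that triggers an exact check
    target: float,                  # exact fitness at which the target path counts as covered
    n_genes: int = 3, low: int = 0, high: int = 100,
    pop_size: int = 30, max_gen: int = 200,
    p_cross: float = 0.8, p_mut: float = 0.1, seed: int = 0,
) -> Tuple[Optional[Individual], int]:
    """Return (covering individual or None, number of exact fitness evaluations)."""
    rng = random.Random(seed)
    # S1: random initial population.
    pop = [tuple(rng.randint(low, high) for _ in range(n_genes)) for _ in range(pop_size)]
    # S2: exact fitness for the initial population, then train the surrogate.
    samples = [(ind, exact_fitness(ind)) for ind in pop]
    exact_calls = len(samples)
    for ind, fit in samples:
        if fit >= target:  # already covered by the initial population
            return ind, exact_calls
    predict = fit_surrogate(samples)

    for _ in range(max_gen):
        # S3: approximate fitness from the trained model.
        scored = [(predict(ind), ind) for ind in pop]
        # S4: exact evaluation only for individuals predicted inside the interval.
        for approx, ind in scored:
            if interval[0] <= approx <= interval[1]:
                exact_calls += 1
                if exact_fitness(ind) >= target:
                    return ind, exact_calls  # S5: termination condition met
        # S6: binary tournament selection, one-point crossover, per-gene mutation.
        nxt: List[Individual] = []
        while len(nxt) < pop_size:
            p1 = max(rng.sample(scored, 2))[1]
            p2 = max(rng.sample(scored, 2))[1]
            if n_genes > 1 and rng.random() < p_cross:
                cut = rng.randint(1, n_genes - 1)
                p1 = p1[:cut] + p2[cut:]
            nxt.append(tuple(rng.randint(low, high) if rng.random() < p_mut else g for g in p1))
        pop = nxt
    return None, exact_calls  # S5: maximum number of iterations reached
```

The second return value counts how often the exact (instrumented) evaluation ran, which is the cost the surrogate is meant to reduce.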

Further, the method for locating test data with the support vector machine regression model and reusing it in program testing comprises the following steps:

Step (1): use the trained model to select individuals with higher fitness, and determine, in the support vector machine regression module, the fitness interval of individuals that cover the target path;

Step (2): package the selected test data as genetic individuals into a database and introduce it into the genetic process of the genetic algorithm;

Step (3): while evolving the next generation, the population performs crossover, with a certain probability, with individuals drawn at random from the database;

Step (4): if an individual has been referenced more than three times, remove it from the database;

Step (5): if the termination condition is met or the maximum number of iterations is reached, the algorithm terminates; otherwise go to step (6);

Step (6): repeat steps (3) and (4) in the evolution of each new generation.
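Steps (1) to (4) amount to a small pool of reusable individuals with a reference counter. The sketch below is an assumption-laden illustration: the class name `ReusePool`, the helper `crossover_with_pool`, the one-point crossover, and the reading of step (4) as "remove once the count exceeds three" are choices made here, and `predict` stands in for the trained regression model.

```python
import random
from typing import Callable, Dict, List, Optional, Sequence, Tuple

Individual = Tuple[int, ...]


class ReusePool:
    """Reusable test data selected by the surrogate model, with reference counts."""

    def __init__(
        self,
        library: Sequence[Individual],
        predict: Callable[[Individual], float],
        interval: Tuple[float, float],
        max_refs: int = 3,
    ) -> None:
        low, high = interval
        # Steps (1)-(2): keep library test cases whose predicted fitness lies in the interval.
        self._refs: Dict[Individual, int] = {
            ind: 0 for ind in library if low <= predict(ind) <= high
        }
        self.max_refs = max_refs

    def __len__(self) -> int:
        return len(self._refs)

    def draw(self, rng: random.Random) -> Optional[Individual]:
        """Pick a random pooled individual and count the reference."""
        if not self._refs:
            return None
        ind = rng.choice(sorted(self._refs))
        self._refs[ind] += 1
        # Step (4): drop the individual once it has been referenced more than max_refs times.
        if self._refs[ind] > self.max_refs:
            del self._refs[ind]
        return ind


def crossover_with_pool(
    pop: List[Individual], pool: ReusePool, p_ref: float, rng: random.Random
) -> List[Individual]:
    """Step (3): with probability p_ref, cross each individual with a pooled one."""
    out: List[Individual] = []
    for ind in pop:
        ref = pool.draw(rng) if rng.random() < p_ref else None
        if ref is not None and len(ind) > 1:
            cut = rng.randint(1, len(ind) - 1)
            ind = ind[:cut] + ref[cut:]  # one-point crossover keeps the head of ind
        out.append(ind)
    return out
```

Calling `crossover_with_pool` once per generation corresponds to step (6); the pool shrinks over time, which limits how strongly any single reused test case can steer the population.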

The main contributions of the present invention are as follows:

1) A model is trained that can approximate the fitness of test cases. While the genetic algorithm generates test cases, the generated test cases and their fitness values obtained by the instrumentation method are used as samples to train the support vector machine regression model; the trained model can approximate the fitness of test cases for the program under test.

2) The trained model is applied to test case generation by the genetic algorithm. The traditional instrumentation method must run the program to compute each individual's fitness, which consumes a great deal of time. The support vector machine regression model can approximate an individual's fitness from the individual itself, without running the program under test, which reduces the time spent on fitness evaluation. Because the fitness values produced by the model are not exact, this software sets, according to how the model approximates individual fitness, a fitness interval that characterizes good individuals; an individual whose fitness falls within this interval may be an optimal individual covering the target path, so its exact fitness must be computed. In this way the instrumentation method is used as little as possible in finding test data that covers the target path. Experimental results also show that the method reduces test data generation time more effectively and improves testing efficiency.

3) The trained model is applied to test case reuse. The trained model is used to search the test case database of the corresponding program for individuals with higher fitness, which carry good genes favorable to the survival of the population. While the genetic algorithm generates test cases, population individuals are combined with the introduced individuals with a certain probability. Experimental results show that this test case generation method further reduces the time consumed by test case generation and improves the efficiency of program testing.

The above system and method propose test case generation and reuse integrating the support vector machine regression model, using the model to compute fitness values. Compared with neural network models, support vector machine regression shows many distinctive advantages in small-sample, nonlinear, and high-dimensional pattern recognition problems. The support vector machine regression model is used to predict fitness values and to search the test case library of the program under test for individuals with higher fitness, which are then reused in testing the program. Test case generation efficiency is improved in two ways: by reducing the time spent computing individual fitness and by accelerating population evolution.

Brief Description of the Drawings

FIG. 1 is a technical architecture flow chart of a test generation and reuse method based on a support vector machine regression model according to an embodiment of the present invention.

FIG. 2 is a flow chart of the test case generation framework of a test generation and reuse method based on a support vector machine regression model according to an embodiment of the present invention.

FIG. 3 is a flow chart of the test case reuse framework of a test generation and reuse method based on a support vector machine regression model according to an embodiment of the present invention.

FIG. 4 shows the software interface for support vector machine regression model training and its application in a test generation and reuse method based on a support vector machine regression model according to an embodiment of the present invention.

Detailed Description

This embodiment takes a test generation and reuse system and method based on a support vector machine regression model as an example. The present invention is described in detail below with reference to specific embodiments and the drawings.

Please refer to FIG. 1, FIG. 2, and FIG. 3, which show a test generation and reuse system and method based on a support vector machine regression model provided by an embodiment of the present invention.

The support vector machine (SVM) approach mainly involves defining the objective function, handling noise in the objective function, further optimizing the objective function, and solving nonlinear regression problems. The support vector machine regression model for predicting test case fitness is trained according to the following background on support vector machines.

The support vector machine is grounded in the VC dimension theory and the structural risk minimization principle of statistical learning theory; it is a generalized classifier that performs binary classification of data in a supervised learning manner to obtain the best generalization ability. Support vector machine regression (SVR) is a regression method based on penalty learning. The goal of SVR is to model the regression relationship $f(x)$ between the input $x$ and the result $y$, expressed as:
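The formula itself is missing from this text. As a sketch, assuming the standard linear form used in SVR, which matches the parameters $w$ and $b$ referred to later in this section, the regression function would be:

$$f(x) = \langle w, x \rangle + b$$

where $w$ is the weight vector and $b$ is the bias term.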

When SVR is used for classification, the data often contain noise points. To classify the noise points correctly as well, the hyperplane shifts toward the samples of the other class, which shrinks the geometric margin of the separating hyperplane and degrades the generalization performance of the model. To minimize the influence of noise points on the model, slack variables $\xi_i^-$ and $\xi_i^+$ are introduced, representing the cases of exceeding and not exceeding the $\varepsilon$ penalty interval. Combined with the parameter $C$, they form the final risk function $R$. The SVR problem is thus transformed into minimizing the risk function $R$ in the presence of errors, expressed as:
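The risk function is also absent from this text; a later passage refers to it as formula (2). A reconstruction, assuming the standard $\varepsilon$-insensitive SVR formulation is intended (the sample count $N$, the targets $y_i$, and the pairing of each slack variable with a particular constraint are notation assumed here), is:

$$\min_{w,\,b,\,\xi^{\pm}} \; R = \frac{1}{2}\lVert w \rVert^{2} + C \sum_{i=1}^{N} \left(\xi_i^{-} + \xi_i^{+}\right)$$

$$\text{subject to}\quad y_i - f(x_i) \le \varepsilon + \xi_i^{+}, \qquad f(x_i) - y_i \le \varepsilon + \xi_i^{-}, \qquad \xi_i^{-},\ \xi_i^{+} \ge 0 .$$

In this form, $C$ sets the trade-off between a flat regression function (small $\lVert w \rVert$) and the total amount by which samples are allowed to deviate beyond $\varepsilon$.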

This optimization problem is a constrained quadratic program. Introducing the Lagrange multipliers $\alpha_i^-$ and $\alpha_i^+$ converts it into a problem in quadratic form: the constraints are folded into the objective through the Lagrangian, the partial derivative with respect to each variable is taken and set to zero, and this yields the relationship between the Lagrange multipliers and $w$. The problem can then be stated with a single expression, reducing the original task of solving for the parameters $w$ and $b$ to solving only for the Lagrange multipliers $\alpha$. The original regression target can therefore be written as:
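The resulting expression did not survive extraction. Under the usual derivation for $\varepsilon$-insensitive SVR, setting the derivative of the Lagrangian with respect to $w$ to zero gives $w = \sum_{i} (\alpha_i^{+} - \alpha_i^{-})\, x_i$, and substituting this into the linear regression function yields:

$$f(x) = \sum_{i=1}^{N} \left(\alpha_i^{+} - \alpha_i^{-}\right) \langle x_i, x \rangle + b$$

Here $N$ is an assumed symbol for the number of training samples, and $\alpha_i^{+}$ is taken to be the multiplier of the constraint that bounds $y_i - f(x_i)$ from above; the opposite sign convention is equally common.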

To handle nonlinear regression, the kernel mapping method is introduced. A transformation function $\Phi$ maps the variable $x$ into a high-dimensional nonlinear space, and a kernel function $K$, with $K(x_i, x_j) = \Phi(x_i) \cdot \Phi(x_j)$, is introduced to avoid computing the inner product $\Phi(x_i) \cdot \Phi(x_j)$ explicitly in the feature space. The final form of the nonlinear regression expression is:
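The final expression is not preserved here. Assuming the standard kernelized SVR predictor, in which every inner product between a training input and the query point is replaced by a kernel evaluation, it would read:

$$f(x) = \sum_{i=1}^{N} \left(\alpha_i^{+} - \alpha_i^{-}\right) K(x_i, x) + b$$

with $N$ again an assumed symbol for the number of training samples. Only samples with $\alpha_i^{+} \neq \alpha_i^{-}$ contribute to the sum.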

The choice of kernel function determines the accuracy of the SVR model. This software uses the radial basis function (RBF) as the kernel; it models nonlinear data well and is suitable for predicting the fitness values of test cases.
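For reference, the Gaussian RBF kernel is commonly written $K(x, z) = \exp(-\gamma \lVert x - z \rVert^2)$. The text does not state which parameterization or width the software uses, so the `gamma` argument and its default value below are assumptions made for illustration.

```python
import math
from typing import Sequence


def rbf_kernel(x: Sequence[float], z: Sequence[float], gamma: float = 0.5) -> float:
    """Gaussian RBF kernel K(x, z) = exp(-gamma * ||x - z||^2)."""
    if len(x) != len(z):
        raise ValueError("inputs must have the same dimension")
    sq_dist = sum((a - b) ** 2 for a, b in zip(x, z))
    return math.exp(-gamma * sq_dist)
```

The kernel equals 1 for identical inputs and decays toward 0 as the inputs move apart, so test cases with similar input vectors are treated as having similar fitness.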

The input data whose parameters all satisfy $\xi_i^- = \xi_i^+ = 0$ are called support vectors. A support vector is a vector that lies on the $\varepsilon$ error boundary. The support vectors characterize the regression model. Adding or removing samples other than the support vectors has little effect on training, so SVR needs fewer samples for modeling than other machine learning methods. This is more pronounced when the feature dimension exceeds the number of samples.

Supervised learning is in essence the optimization of an empirical risk or structural risk function. The risk function measures the quality of the model's predictions on average, while the quality of each individual prediction is measured by a loss function. A model $f$ is selected from the hypothesis space $F$ as the decision function; for a given input $X$, $f(X)$ gives the corresponding output. The predicted value $f(X)$ may differ from the true value $Y$, and a loss function, denoted $L(Y, f(X))$, measures the degree of prediction error. This software uses the squared loss function, expressed as:
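The formula is not reproduced in this text; the squared loss has the standard definition:

$$L(Y, f(X)) = \left(Y - f(X)\right)^{2}$$

For example, a predicted fitness of 0.8 against a true fitness of 1.0 gives a loss of $(1.0 - 0.8)^2 = 0.04$.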

This software combines machine learning with the genetic algorithm, using the support vector machine regression model to approximate individual fitness values. Not every population individual has to be run through the instrumented program for fitness computation, which reduces the time spent on fitness evaluation and improves testing efficiency.

This software covers test case generation and test case reuse.

1. Test case generation based on the support vector machine regression model

When the genetic algorithm is used to generate test cases, the population is first evolved with the traditional operations of population initialization, selection, crossover, and mutation. The fitness values of these individuals are exact values output by the instrumented program, and the individuals are fed to the support vector machine regression model as training samples. Once the prediction model has been trained, subsequent evolution uses the trained model to predict the fitness of individuals, while the other genetic operations proceed as in the traditional approach. Based on the model's predictions, a range of fitness values corresponding to coverage of the target path is set; if an individual's fitness value falls within this interval during evolution, the individual is run through the instrumented program to compute its exact fitness.

During test case generation, a certain number of samples are needed to train the constructed support vector machine regression model. Each sample $(X, d)$ consists of the input features $X$ and the corresponding true value $d$: the features $X$ are the input vector of the test case, and $d$ is its fitness value.

First, a certain number of test cases are generated, where a is the number of samples. The fitness values of these test data are the true fitness values obtained by running the instrumented program, which yields a sample set of size a. When these samples are input into the support vector machine regression model, the model fits the weight w of each input feature from the sample data and their fitness values. The relationship between the predicted value and the input feature values is expressed by the following formula:

[formula not reproduced in the source text]
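For orientation, the textbook linear form of ε-insensitive support vector regression is given below. It is the standard formulation, written with this document's symbols (a samples, weights w, fitness values d); the indices, the bias b, the penalty C, the tube width ε, and the slack variables are assumptions, and the patent's own numbered formulas may differ in notation.

```latex
f(X) = \langle w, X \rangle + b,
\qquad
\min_{w,\,b,\,\xi,\,\xi^{*}} \;
\frac{1}{2}\lVert w \rVert^{2} + C \sum_{i=1}^{a} \left( \xi_i + \xi_i^{*} \right)
\quad \text{s.t.} \quad
\begin{cases}
d_i - f(X_i) \le \varepsilon + \xi_i, \\
f(X_i) - d_i \le \varepsilon + \xi_i^{*}, \\
\xi_i,\ \xi_i^{*} \ge 0, \quad i = 1, \dots, a .
\end{cases}
```

After introducing Lagrange factors and a kernel K, the same textbook derivation gives the prediction as $f(X) = \sum_{i=1}^{a} (\alpha_i - \alpha_i^{*})\, K(X_i, X) + b$.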

From the test cases supplied for a given program, the support vector machine regression model is trained to predict the fitness values of that program's test cases. To make the sample data representative, the test cases used as training samples are data randomly generated by the genetic algorithm. The sample size should be neither too large nor too small. With too few samples the trained model is inaccurate and fails to capture the underlying linear pattern in the samples; with too many, training consumes more computing resources and time, which lowers testing efficiency.
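How labeled samples (X, d) might be collected from randomly generated inputs is sketched below. The triangle program, its probes, the target path, and the fitness definition (the fraction of recorded branches that match the target path) are illustrative assumptions; the fitness function actually used by the method is defined elsewhere in this document.

```python
import random
from typing import List, Tuple

Sides = Tuple[int, int, int]


def instrumented_triangle(x: Sides) -> List[str]:
    """Toy program under test, with probes that record the branches taken."""
    a, b, c = x
    path = []
    if a + b > c and a + c > b and b + c > a:
        path.append("valid")
        if a == b == c:
            path.append("equilateral")
        elif a == b or b == c or a == c:
            path.append("isosceles")
        else:
            path.append("scalene")
    else:
        path.append("invalid")
    return path


TARGET_PATH = ["valid", "equilateral"]  # assumed target path


def fitness(x: Sides) -> float:
    """Assumed fitness: share of target-path branches matched, position by position."""
    path = instrumented_triangle(x)
    matches = sum(1 for got, want in zip(path, TARGET_PATH) if got == want)
    return matches / len(TARGET_PATH)


def collect_samples(a: int, seed: int = 0) -> List[Tuple[Sides, float]]:
    """Generate a random test inputs and label each with its exact fitness."""
    rng = random.Random(seed)
    samples = []
    for _ in range(a):
        x = tuple(rng.randint(1, 20) for _ in range(3))
        samples.append((x, fitness(x)))
    return samples


samples = collect_samples(a=50)
```

The parameter `a` is the sample size discussed above: raising it costs one run of the instrumented program per additional sample.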

Test case generation incorporating the support vector machine model is divided into two modules: the support vector machine regression module and the genetic algorithm module. The regression module trains a model that approximates fitness values, using test cases output by the genetic algorithm module as training samples. The genetic algorithm module, in turn, uses the model trained by the regression module when generating test cases.
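The role of the regression module can be illustrated with a deliberately simplified trainer. The sketch below fits a linear model under the ε-insensitive loss by plain subgradient descent; this technique is swapped in for brevity and is not the Lagrangian and kernel procedure this document describes. The hyperparameter values and the toy data are assumptions.

```python
from typing import List, Sequence, Tuple

Sample = Tuple[Sequence[float], float]
Model = Tuple[List[float], float]


def train_linear_eps_svr(
    samples: List[Sample],
    epsilon: float = 0.05,  # half-width of the insensitive tube (assumed value)
    reg: float = 1e-4,      # weight of the squared-norm penalty on w (assumed value)
    steps: int = 3000,
    lr0: float = 0.5,
) -> Model:
    """Fit f(X) = w.X + b by subgradient descent on the epsilon-insensitive loss."""
    n, dim = len(samples), len(samples[0][0])
    w, b = [0.0] * dim, 0.0
    for t in range(steps):
        lr = lr0 / (1.0 + t) ** 0.5  # diminishing step size
        grad_w, grad_b = [reg * wj for wj in w], 0.0
        for x, d in samples:
            err = sum(wj * xj for wj, xj in zip(w, x)) + b - d
            if abs(err) > epsilon:  # only points outside the tube contribute
                sign = 1.0 if err > 0 else -1.0
                for j in range(dim):
                    grad_w[j] += sign * x[j] / n
                grad_b += sign / n
        w = [wj - lr * gj for wj, gj in zip(w, grad_w)]
        b -= lr * grad_b
    return w, b


def predict(model: Model, x: Sequence[float]) -> float:
    """Approximate fitness of one individual."""
    w, b = model
    return sum(wj * xj for wj, xj in zip(w, x)) + b


# Toy data (assumption): fitness is exactly linear in one input, d = 2x + 1.
train = [([i / 20.0], 2.0 * i / 20.0 + 1.0) for i in range(21)]
model = train_linear_eps_svr(train)
```

A kernel-based regressor from an established library would normally replace this trainer in practice; the sketch only shows where the samples from the genetic algorithm module enter and what the trained model returns.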

(1) Support vector machine regression module

The test cases generated by the genetic algorithm are used as samples to train the support vector machine regression model. The main steps are as follows:

1) Obtain a certain number of samples and divide them into training samples and prediction samples in a ratio of 8:2;

2) According to formula (2), transform the objective function into the problem of minimizing the risk function R in the presence of errors;

3) According to formula (3), incorporate the constraints into the objective function through the Lagrange function, and obtain the relationship between w and the Lagrange factor α;

4) Formula (4) introduces the kernel mapping method; use the kernel function in formula (5) to handle nonlinear, high-dimensional problems;

5) As samples are added, correct the weights according to step 2), step 3), and formula (6), and verify the result by cross-validation; training of the prediction model ends once all samples have been used;

6) After training is complete, use the prediction samples to determine the range of fitness values of individuals that cover the target path.
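Steps 1) and 6) of the list above can be sketched in Python as follows. The trained model is passed in as a callable, and the test for "covers the target path" as well as the optional margin are assumptions made for illustration.

```python
import random
from typing import Callable, List, Optional, Sequence, Tuple

Sample = Tuple[Sequence[float], float]


def split_8_2(samples: List[Sample], seed: int = 0) -> Tuple[List[Sample], List[Sample]]:
    """Shuffle, then split into 80% training samples and 20% prediction samples."""
    shuffled = samples[:]
    random.Random(seed).shuffle(shuffled)
    cut = len(shuffled) * 8 // 10
    return shuffled[:cut], shuffled[cut:]


def covering_interval(
    prediction_samples: List[Sample],
    predict: Callable[[Sequence[float]], float],  # hypothetical trained model
    covers_target: Callable[[float], bool],       # hypothetical test on the true fitness d
    margin: float = 0.0,                          # assumed safety margin
) -> Optional[Tuple[float, float]]:
    """Range of predicted fitness over prediction samples that cover the target path."""
    predicted = [predict(x) for x, d in prediction_samples if covers_target(d)]
    if not predicted:
        return None  # no covering individual among the prediction samples
    return min(predicted) - margin, max(predicted) + margin


# Toy usage (assumed data): fitness equals the single input divided by 100.
samples = [([float(i)], i / 100.0) for i in range(100)]
training, prediction = split_8_2(samples)
interval = covering_interval(prediction, lambda x: x[0] / 100.0, lambda d: d >= 0.9)
```

The resulting interval is what the genetic algorithm module later uses to decide which individuals are re-evaluated with the instrumented program.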

While the genetic algorithm generates test cases, a support vector machine regression model that approximates individual fitness is trained and then applied in the subsequent evolution of the population. The genetic algorithm module therefore covers both the training of the regression model and the way the model is used.

(2) Genetic algorithm module

A genetic algorithm is used to generate the test cases. The main steps are as follows:

1) Initialize the population;

2) Run the instrumented program on the individuals of the population to obtain the fitness value of each individual, and input the individuals and their fitness values into the support vector machine regression model to train it;

3) Compute the approximate fitness of each evolving individual with the trained model;

4) For individuals whose fitness values fall within the preset threshold range, use the instrumented program to obtain their exact values;

5) If the termination condition is met or the maximum number of iterations is reached, terminate; otherwise go to step 6);

6) Apply selection, crossover, and mutation to the individuals, then go to step 3).
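The six steps above can be sketched as a small Python loop. The operators (binary tournament, one-point crossover, single-gene mutation), the parameter values, the nearest-neighbour stand-in for the trained model, and the toy fitness are all assumptions chosen to keep the example short; they are not taken from the patent.

```python
import random
from typing import Callable, List, Optional, Tuple

Individual = List[int]
Samples = List[Tuple[Individual, float]]


def nearest_neighbour_model(samples: Samples) -> Callable[[Individual], float]:
    """Stand-in for the trained regression model (assumption, not an SVR)."""
    def model(ind: Individual) -> float:
        closest = min(samples, key=lambda s: sum(abs(p - q) for p, q in zip(s[0], ind)))
        return closest[1]
    return model


def evolve(exact_fitness, train_model, interval, pop_size=20, genes=3,
           gene_max=20, max_generations=200, seed=0
           ) -> Tuple[Optional[Individual], int, int]:
    rng = random.Random(seed)
    # Step 1: population initialization.
    population = [[rng.randint(0, gene_max) for _ in range(genes)] for _ in range(pop_size)]
    # Step 2: exact fitness from the instrumented program, then model training.
    samples = [(ind, exact_fitness(ind)) for ind in population]
    exact_calls = len(samples)
    for ind, fit in samples:
        if fit >= 1.0:
            return ind, 0, exact_calls
    model = train_model(samples)
    for generation in range(max_generations):
        scored = []
        for ind in population:
            fit = model(ind)                      # Step 3: approximate fitness.
            if interval[0] <= fit <= interval[1]:
                fit = exact_fitness(ind)          # Step 4: exact value inside the range.
                exact_calls += 1
                if fit >= 1.0:                    # Step 5: target path covered.
                    return ind, generation, exact_calls
            scored.append((fit, ind))

        def pick() -> Individual:                 # binary tournament selection
            first, second = rng.sample(scored, 2)
            return (first if first[0] >= second[0] else second)[1]

        next_population = []                      # Step 6: selection, crossover, mutation.
        while len(next_population) < pop_size:
            p1, p2 = pick(), pick()
            cut = rng.randint(1, genes - 1)
            child = p1[:cut] + p2[cut:]
            if rng.random() < 0.2:
                child[rng.randrange(genes)] = rng.randint(0, gene_max)
            next_population.append(child)
        population = next_population
    return None, max_generations, exact_calls     # Step 5: iteration limit reached.


def exact(ind: Individual) -> float:
    """Toy exact fitness: 1.0 only when all three values are equal."""
    return 1.0 / (1.0 + abs(ind[0] - ind[1]) + abs(ind[1] - ind[2]))


best, generations, calls = evolve(exact, nearest_neighbour_model, interval=(0.2, 1.0))
```

The counter `calls` records how often the exact evaluation ran, which is the quantity the approach aims to keep small relative to evaluating every individual in every generation.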

When a program is tested with the genetic algorithm, the population evolves using fitness values computed by the instrumentation method until the support vector machine regression model has been trained successfully. Once the population has evolved for a certain number of generations and the model is trained, the trained model is used to approximate individual fitness in the remaining generations. Replacing the instrumentation method with the trained model for fitness calculation is intended to reduce the time required for population evolution.

The test case reuse method makes further use of the support vector machine regression model: the trained model is applied to the reuse of test cases in order to speed up the evolution of individuals that cover the target path and improve testing efficiency.

2. Test case reuse incorporating the support vector machine regression model

When support vector machine regression is integrated into test case generation by the genetic algorithm, using the trained model does not in itself speed up the evolution of the population; it improves testing efficiency only by reducing the time individuals spend running the instrumented program. The test case reuse described here applies the trained model to test case generation a second time, so that the model raises generation efficiency further. That is, individuals with higher fitness selected by the model are introduced into the evolution of the genetic population to accelerate it, which further improves the efficiency of test data generation.

The test case reuse unit incorporating the support vector machine regression model includes a test case reuse module. Using the trained support vector machine regression model, test case reuse proceeds as follows:

1) Use the trained model to select individuals with higher fitness. The support vector machine regression module has already established the range of fitness values of individuals that can cover the target path; individuals whose fitness lies in this range are called higher-fitness individuals or excellent individuals;

2) Package the selected test data, as genetic individuals, into a database and introduce them into the evolutionary process of the genetic algorithm;

3) While evolving the next generation, the population performs crossover, with a certain probability, with randomly selected individuals introduced from the database;

4) To avoid local optima, remove an individual from the database once it has been referenced more than three times;

6) Repeat steps 3) and 4) in the evolution of each new generation.
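The reuse steps above can be sketched with a small pool of referenced individuals. The class and function names, the one-point crossover, and the probability value are hypothetical. The removal rule follows the text literally: an individual is dropped once its reference count exceeds three, so it can be drawn four times in total.

```python
import random
from typing import List, Optional

Individual = List[int]


class ReferencePool:
    """Database of higher-fitness test data selected by the model."""

    def __init__(self, individuals: List[Individual], max_uses: int = 3):
        self.entries = [[list(ind), 0] for ind in individuals]  # [genes, reference count]
        self.max_uses = max_uses

    def draw(self, rng: random.Random) -> Optional[Individual]:
        """Pick a random entry; remove it once it has been referenced more than max_uses times."""
        if not self.entries:
            return None
        index = rng.randrange(len(self.entries))
        entry = self.entries[index]
        entry[1] += 1
        if entry[1] > self.max_uses:
            del self.entries[index]
        return entry[0]


def crossover_with_pool(population: List[Individual], pool: ReferencePool,
                        rng: random.Random, probability: float = 0.3) -> List[Individual]:
    """Cross each individual with a random referenced individual, with the given probability."""
    result = []
    for ind in population:
        child = list(ind)
        if len(ind) > 1 and rng.random() < probability:
            ref = pool.draw(rng)
            if ref is not None:
                cut = rng.randint(1, len(ind) - 1)
                child = ind[:cut] + ref[cut:]  # head from the individual, tail from the reference
        result.append(child)
    return result


# Toy usage (assumed data): one referenced individual, six population members.
pool = ReferencePool([[9, 9, 9]])
offspring = crossover_with_pool([[0, 0, 0]] * 6, pool, random.Random(2), probability=1.0)
```

With the probability forced to 1.0 in this toy run, the single referenced individual contributes to four offspring and is then removed, so the last two offspring are unchanged copies.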

When a program is tested with the genetic algorithm, a certain number of individuals are first initialized. During population evolution, the evolving individuals and their fitness values computed by the instrumentation method are used as samples to train the support vector machine regression model. Once the fitness prediction model has been trained, the fitness of individuals in the population is approximated by the model. In addition, the model is used to query the program's test case set for test data with higher fitness, and these test data are referenced during the iteration of new populations; the individuals so referenced are called "referenced individuals". Each selected individual undergoes crossover with a randomly chosen referenced individual with a certain probability. To preserve genetic diversity in the population and avoid being trapped in a local optimum, a referenced individual is removed once the number of times it has been referenced exceeds a certain value (set to 3 here). If the support vector machine regression model approximates individual fitness well, the higher-fitness individuals it selects carry excellent genes that favor survival, and introducing these genes accelerates the evolution of the population and improves the efficiency of test case generation.

Fig. 4 shows the running interface for training and applying the support vector machine regression model. It is a plug-in prototype that serves the main software project and extends and improves the functionality of the host software. Implementing the training of the regression prediction model, and the generation and reuse of test cases with that model, as a plug-in further simplifies the testing of a program. The plug-in is written in Java, and the development environment is MyEclipse 2010. The computer runs Windows on an Intel(R) Core(TM) i5-6500 CPU at 3.20 GHz with 8.00 GB of RAM and a 64-bit operating system.

In the menu, "Program" describes the instrumentation method for the program under test; "SVR Model" covers the selection and explanation of the parameters for training the support vector machine regression model; "TestCase" describes the role and requirements of the test case database; "Options" provides environment settings such as language, font color, and font size; and "Help" contains instructions for using the plug-in.

When the plug-in is used, the buttons on the interface are operated in numbered order, as follows:

1) Click the button "1. Select a Program" and, in the file selection dialog, choose the file containing the source code of the program under test. The file holds the instrumented version of the program.

2) Click the button "2. Train SVR Model" to train the support vector machine regression fitness prediction model. When the text box displays "Successful Training of SVR Model", the model has been trained successfully.

3) Click the button "3. Find Test Cases" and, in the dialog that appears, choose the file containing the test case database of the program under test.

4) Click the button "4. Reuse and Generate Test Cases". While the genetic algorithm generates test cases, the individuals in the population then use the trained model to compute their fitness. The trained model also searches the file for test cases with higher fitness and reuses them as excellent individuals in test case generation. The text box below the button outputs the test results.

In the test generation and reuse system and method based on the support vector machine regression model described above, a test case generation and reuse method incorporating the support vector machine regression model is proposed, in which the model is used to compute fitness values. Compared with neural network models, support vector machine regression shows many distinctive advantages on small-sample, nonlinear, and high-dimensional pattern recognition problems. The model is used to predict fitness values and to find individuals with higher fitness in the test case library of the program under test, which are then reused in testing the program. The efficiency of test case generation is thus improved in two ways: by reducing the time spent computing individual fitness and by accelerating the evolution of the population.

It should be noted that the above is only a preferred embodiment of the present invention and is not intended to limit it. For those skilled in the art, the present invention may have various modifications and changes. Any modification, equivalent replacement, or improvement made within the spirit and principle of the present invention shall fall within its scope of protection.

Claims

1. A test generation and reuse system based on a support vector machine regression model, characterized by comprising:

A test case generation unit based on a support vector machine regression model includes a support vector machine regression module and a genetic algorithm module. The support vector machine regression module is used to train a model for simulating and calculating fitness values. The training samples come from the test cases output by the genetic algorithm module. The genetic algorithm module is used to generate test cases by referencing the model trained by the support vector machine regression module.

A test case reuse unit incorporating a support vector machine regression model comprises a test case reuse module, wherein the test case reuse module is used to initialize a predetermined number of individuals when a genetic algorithm is used to test a program, and to train a support vector machine regression model during population evolution by using individuals of the evolving population and fitness values calculated by the instrumentation method as samples, so that the support vector machine regression algorithm is integrated into the generation of test cases by the genetic algorithm module, the trained model is referenced in the generation of the test cases, individuals selected by the model with fitness values higher than a predetermined value are introduced into the genetic population evolution process, test data are referenced in the new population iteration process, and each selected individual undergoes a crossover operation with a randomly selected referenced individual with a predetermined probability, so as to realize the reuse of the test cases.

2. The test generation and reuse system based on the support vector machine regression model as described in claim 1 is characterized in that, in the test case generation unit based on the support vector machine regression model, when the genetic algorithm module uses the genetic algorithm to generate test cases, it evolves the population through operations such as population initialization, selection, crossover, and mutation; the fitness value of each individual in the population is the exact value output by running the instrumented program on that individual; the population individuals are input into the support vector machine regression model as training samples; once training of the prediction model is completed, the trained model is used to predict the fitness values of the population individuals in subsequent evolution, while the other genetic operations of the population are still performed according to the traditional method; a range of fitness values corresponding to coverage of the target path is set according to the predictions of the trained model; and if the fitness value of an individual falls within this range during evolution, the individual is input into the instrumented program to calculate its exact value.

3. The test generation and reuse system based on the support vector machine regression model as described in claim 1 is characterized in that the test case reuse unit incorporating the support vector machine regression model integrates the support vector machine regression module and the genetic algorithm module, and the trained support vector machine regression model is used as a prediction model; after the individual fitness prediction model is trained, the fitness of the population individuals is approximated by the prediction model; the support vector machine regression model is used to query the test case set of the program for test data with fitness higher than a predetermined value; when the number of times a referenced individual has been referenced exceeds a predetermined value, which is set to three, the referenced individual is removed to avoid falling into a local optimum; if the support vector machine regression model approximates individual fitness well, the higher-fitness individuals selected by the model have excellent genes that are beneficial to individual survival, the referenced individuals carry excellent genes that are beneficial to population survival, and introducing these excellent genes into the population individuals accelerates the evolution of the population.

4. The test generation and reuse system based on the support vector machine regression model as claimed in claim 1 is characterized in that the test case generation process requires the use of a predetermined number of samples to train the constructed support vector machine regression model, and a sample (X, d) contains the input features and the true value corresponding to those features; the feature x is the input vector of the test case, and d is its fitness value.

5. The test generation and reuse system based on the support vector machine regression model according to claim 4, characterized in that the test cases are represented as [formula not reproduced in the source text], where a is the number of samples, the fitness values are [formula not reproduced in the source text], and the relationship between the predicted value and the input feature values is: [formula not reproduced in the source text].

6. The test generation and reuse system based on the support vector machine regression model as described in claim 2 is characterized in that the training model is a support vector machine regression model trained, in the process of generating test cases by the genetic algorithm, on the generated test cases and the fitness values obtained by the instrumentation method, and the trained model can simulate the calculation of the fitness of the test cases of the program to be tested.

7. A method for test generation and reuse based on a support vector machine regression model, characterized in that the test generation and reuse system based on a support vector machine regression model as described in any one of claims 1 to 6 is used, the support vector machine regression model is used to predict the value of fitness, and the support vector machine regression model is used to search for individuals with higher fitness in the test case library of the program to be tested, and the individuals with higher fitness found are reused in the test of the program, comprising the following steps:

Step 1: Design a support vector machine training model for simulating the calculation of test case fitness values. During program testing, use the instrumentation method to calculate individual fitness values, and input the population individuals and their fitness values into the support vector machine training model.

Step 2: Provide a method for simulating fitness calculation using a support vector machine regression model. In the process of generating test cases using a genetic algorithm, use the predetermined population individuals and their fitness values as samples to train the support vector machine regression model. In the subsequent population evolution process, use the model to calculate individual fitness in place of the traditional instrumentation-based fitness calculation.

Step 3: Provide a method for using a support vector machine regression model to find test data and reuse it in program testing: use the model trained by the method of simulating individual fitness calculation to find test data with higher fitness in the test case library of the corresponding program, and, when using a genetic algorithm to test the program and generate corresponding test cases, reference the test data to perform a crossover operation with the evolving individuals.

8. The method for test generation and reuse based on support vector machine regression model according to claim 7, characterized in that the method of simulating fitness calculation using support vector machine regression model comprises the following steps:

Step a, dividing the obtained samples into training samples and prediction samples in a ratio of 8:2;

Step b, calculating the weights of each feature of the training sample, including the risk function R, the Lagrange factor α, and the fitness value of the test case predicted by the kernel mapping method in the presence of errors;

Step c, modifying the weight of each sample and verifying it by cross-validation;

Step d: After all training samples are trained, the prediction model training ends;

Step e: After model training is completed, the range of fitness of individuals on the target path is determined using the predicted samples.

9. The method for test generation and reuse based on support vector machine regression model according to claim 7, characterized in that the method for generating test cases using genetic algorithm comprises the following steps:

Step S1, initializing the population;

Step S2, the population individuals are input into the instrumented program to obtain the fitness value of each individual, and the individuals and their fitness values are input into the support vector machine model to train the support vector machine regression model;

Step S3, the fitness of each evolving individual is approximated by the trained model;

Step S4, using the instrumentation program to find the exact value of the individuals whose fitness values are within the preset threshold;

Step S5: if the algorithm termination condition is met or the maximum number of iterations is reached, the algorithm terminates, otherwise, go to step S6;

Step S6, perform selection, crossover and mutation operations on individuals, and go to step S3.

10. The method for test generation and reuse based on support vector machine regression model according to claim 7, characterized in that the method of using support vector machine regression model to find test data and reuse it in program testing comprises the following steps:

Step (1), using the trained model to select individuals with higher fitness, and setting a range of fitness intervals covering the target path individuals in the support vector machine regression module;

Step (2), encapsulating the selected test data as genetic individuals into a database and introducing it into the genetic process of the genetic algorithm;

Step (3), in the process of evolving the next generation of new population, the population performs a crossover operation with individuals randomly introduced from the database with a certain probability;

Step (4), if an individual has been referenced more than three times, remove the individual from the database;

Step (5): if the algorithm termination condition is met or the maximum number of iterations is reached, the algorithm terminates; otherwise, go to step (6);

Step (6): Repeat steps (3) and (4) in each new generation of population evolution.