CN111611177B - A Software Performance Defect Detection Method Based on Performance Expectation of Configuration Items - Google Patents
- Publication number: CN111611177B
- Application number: CN202010610996.3A
- Authority: CN (China)
- Prior art keywords: performance, configuration item, label, software, test
- Legal status: Active (the legal status is an assumption and is not a legal conclusion; Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed)
Classifications
- G06F11/3684—Test management for test design, e.g. generating new test cases
- G06F11/3409—Recording or statistical evaluation of computer activity for performance assessment
- G06F11/3692—Test management for test results analysis
- G06F18/214—Generating training patterns; Bootstrap methods, e.g. bagging or boosting
Abstract
The invention discloses a software performance defect detection method based on the performance expectations of configuration items, and aims to provide a method that effectively detects configuration-related performance defects. The technical solution is: use configuration item performance expectations to build a performance defect detection system composed of a configuration item expectation prediction module, a test sample generation module, and a performance defect detection module; train the configuration item expectation prediction module; read in the software under test, have the expectation prediction module predict the performance expectations of its configuration items, have the test sample generation module generate test samples from the performance expectations and the software's test set, and have the performance defect detection module execute the test samples and check whether expected and actual performance agree, outputting a performance defect when they do not. The invention not only detects software performance defects effectively, but also uncovers new performance defects for software communities, and it effectively distinguishes the performance of defect-free software from that of defective software.
Description
Technical Field
The present invention relates to the field of performance defect detection in large-scale software, and in particular to a software performance defect detection method based on the performance expectations of configuration items.
Background Art
As society advances, software systems have been adopted across many fields and play a pivotal role in modern life. As software systems evolve, users demand ever higher reliability, security, and performance (running speed), so software keeps growing in scale and complexity. For example, version 2.8.0 of the Hadoop distributed open-source software contains more than 8,000 source code files and nearly ten million lines of code. At the same time, software systems provide more, and more flexible, configuration items so that users can configure the software to their needs: Apache httpd has more than 1,000 configuration items and MySQL has more than 800, and a growing share of them concern non-functional properties; such configuration items are closely tied to computing resources (CPU, memory, etc.) and performance optimization strategies. As software scale keeps increasing, improving software performance has become one of the most important tasks of software evolution and maintenance. The paper "An Empirical Study on Performance Bugs for Highly Configurable Software Systems" by Xue Han et al. (ESEM 2016) shows that configuration items have become one of the main causes of software performance problems, accounting for as much as 59% of them. In a survey of 148 companies, 92% considered improving software performance one of the most important tasks of software development. In recent years, software performance problems caused by code defects related to software configuration items have caused huge business losses.
Existing techniques detect software performance problems mainly with two classes of methods. The first class, such as "Automating Performance Bottleneck Detection using Search-Based Application Profiling" by Du Shen et al. (ISSTA 2015), uses performance bottleneck diagnosis tools such as profilers to generate test cases that make the software run slowly, and reports the function that consumes the most time while executing those cases to the developer as a performance defect. Although such methods cover performance defects well, they produce many false positives: a test case may execute slowly not because of a performance defect, but simply because the case itself takes long. In other words, this class of methods lacks an effective performance test oracle (in computing, software engineering, and software testing, a test oracle, or just oracle, is a mechanism for determining whether a test has passed or failed).
The second class, such as "Toddler: Detecting Performance Problems via Similar Memory-Access Patterns" by Adrian Nistor et al. (ICSE 2013), summarizes performance-defect code patterns and variable-read patterns in loop structures and matches them against the software under test. These methods build test oracles from defective code patterns and can effectively reduce false positives of performance faults. However, performance defects in loop structures account for only a small share of performance defects in general, so such methods are limited to detecting one specific defect type (such as defects in loops); it has been verified that they detect only 9.8% of configuration-related performance faults.
In summary, how to construct a performance test oracle with low false positives and high coverage, and to automatically generate the corresponding test samples so as to detect software performance defects effectively and comprehensively, is a hot topic among practitioners in this field.
Summary of the Invention
The technical problem to be solved by the present invention is to provide a software performance defect detection method based on the performance expectations of configuration items. The method builds a test oracle from the performance expectations of software configuration items (that is, when the actual performance of the software does not match a configuration item's performance expectation, a performance defect exists) and automatically predicts that oracle for the software under test; based on the oracle, it automatically generates test samples and effectively detects configuration-related performance defects.
To solve the above technical problem, the technical solution of the present invention is as follows. First, the configuration item performance expectations described in "Tuning backfired? not (always) your fault: understanding and detecting configuration-related performance bugs" by He Haochen (ESEC/FSE 2019) are used to build a performance defect detection system composed of a configuration item expectation prediction module, a test sample generation module, and a performance defect detection module. Then, a training data set with manually labeled configuration item expectations is read in to train the expectation prediction module. Finally, the software under test (the software itself, its bundled test set, and its configuration item user manual) is read in; the expectation prediction module predicts the performance expectations of the configuration items and sends them to the test sample generation module and the performance defect detection module; the test sample generation module generates test samples from the performance expectations and the software test set and sends them to the performance defect detection module; the performance defect detection module executes the test samples and checks whether expected and actual performance agree, outputting a performance defect when they do not.
The present invention comprises the following steps.
In the first step, a performance defect detection system is built. It consists of a configuration item expectation prediction module, a test sample generation module, and a performance defect detection module.
The configuration item expectation prediction module is a weighted voting classifier connected to the test sample generation module and the performance defect detection module. It reads the descriptions and value ranges of configuration items from the configuration item user manual of the software under test, predicts the performance expectation of each configuration item to be predicted, obtains the configuration item's performance expectation label (a label denoting the category of the performance expectation), and sends that label to the test sample generation module and the performance defect detection module.
The test sample generation module is connected to the expectation prediction module and the performance defect detection module. It receives the performance expectation labels of the configuration items from the expectation prediction module, reads test commands from the test set of the software under test, and generates a test sample set T from the performance expectation labels and the test set.
The performance defect detection module is connected to the expectation prediction module and the test sample generation module. It receives the test sample set T from the test sample generation module and the performance expectation labels from the expectation prediction module, executes the test samples in T, and checks whether the expected performance corresponding to each configuration item's label matches the actual performance; if not, it outputs a performance defect of the software under test.
Step 2: train the configuration item expectation prediction module of the performance defect detection system. Read in configuration items with manually annotated expectations together with the official document descriptions of those configuration items, and train the module.
2.1 Construct a training set by randomly selecting N (N ≥ 500) configuration items from the more than 10,000 configuration items of 12 software systems: MySQL, MariaDB, Apache-httpd, Apache-Tomcat, Apache-Derby, H2, PostgreSQL, GCC, Clang, MongoDB, RocksDB, and Squid.
2.2 According to the official document descriptions of the N configuration items, manually annotate each configuration item with its performance expectation label, as follows. Given a configuration item (denoted c) with document description (denoted d): if the purpose of adjusting c is to turn on an optimization switch (that is, the label means "turns on an optimization switch"), its performance expectation label is Label1; if the purpose is to improve performance at the expense of non-functional requirements such as reliability, its label is Label2; if the purpose is to allocate more computer resources, its label is Label3; if the purpose is to enable an additional software feature, its label is Label4; and if adjusting c is unrelated to software performance, its label is Label5. This yields the training set, in which N1 + N2 + N3 + N4 + N5 = N, where N1, N2, N3, N4, N5 are the numbers of configuration item document descriptions labeled Label1, Label2, Label3, Label4, Label5 respectively. Let c(l, il) denote the il-th configuration item with expectation label Labell in the training set, and d(l, il) its document description, a sequence of words, with 1 ≤ l ≤ 5 and 1 ≤ il ≤ Nl. Let the total number of words in d(l, il) be m, so that d(l, il) is recorded as (word1, word2, ..., wordm).
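As an illustration of the labeling scheme in step 2.2, the sketch below encodes the five expectation labels and one hypothetical labeled training item; the configuration item name and description are invented for demonstration and are not taken from the patent's training set.

```python
# The five performance-expectation label classes of step 2.2.
LABELS = {
    "Label1": "turns on an optimization switch",
    "Label2": "trades a non-functional property (e.g. reliability) for performance",
    "Label3": "allocates more computing resources",
    "Label4": "enables an additional software feature",
    "Label5": "unrelated to software performance",
}

# A toy labeled training item: (configuration item, document description, label).
example = (
    "innodb_buffer_pool_size",  # hypothetical MySQL-style item for illustration
    "The size of the buffer pool, the memory area where table data is cached.",
    "Label3",  # adjusting it allocates more computer resources
)

def is_valid_item(item):
    """Check that a training item carries one of the five expectation labels."""
    _, _, label = item
    return label in LABELS
```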
2.3 The configuration item expectation prediction module preprocesses the training set:
2.3.1 Initialize variable l = 1.
2.3.2 Initialize variable il = 1.
2.3.3 Preprocess d(l, il) as follows:
2.3.3.1 Let variable k = 1.
2.3.3.2 Convert wordk into the pair ⟨POSk, DSk⟩, where POSk is the part-of-speech tag of the word (such as noun or verb) and DSk is its computer-domain synonym (for example, the DS of both memory and CPU is resource).
2.3.3.3 If k < m, let k = k + 1 and go to 2.3.3.2; if k = m, the preprocessed d(l, il) has the form (⟨POS1, DS1⟩, ..., ⟨POSm, DSm⟩), abbreviated ⟨POS, DS⟩; go to 2.3.4.
2.3.4 If il equals Nl, go to 2.3.5; otherwise let il = il + 1 and go to 2.3.3.
2.3.5 If l equals 5, go to 2.4; otherwise let l = l + 1 and go to 2.3.2.
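The word-level preprocessing of step 2.3 can be sketched as follows. The part-of-speech tagger and the computer-domain synonym table here are toy placeholders, since the patent does not specify which tagger or synonym resource is used.

```python
# Hypothetical computer-domain synonym table (DS mapping); real coverage
# would be far larger.
DOMAIN_SYNONYMS = {
    "memory": "resource", "cpu": "resource", "disk": "resource",
    "cache": "buffer",
}

def pos_tag(word):
    """Toy part-of-speech guess; a real system would use an NLP tagger."""
    if word.endswith("ing") or word.endswith("ed"):
        return "Verb"
    return "Noun"

def preprocess(description):
    """Turn each word of a description into a (POS, DS) pair, as in 2.3.3."""
    pairs = []
    for word in description.lower().split():
        ds = DOMAIN_SYNONYMS.get(word, word)  # fall back to the word itself
        pairs.append((pos_tag(word), ds))
    return pairs
```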
2.4 The configuration item expectation prediction module mines frequent subsequences. The PrefixSpan algorithm from "PrefixSpan: Mining Sequential Patterns Efficiently by Prefix-Projected Pattern Growth" by Jian Pei et al. (ICDE 2001) is applied separately to the five sets of preprocessed descriptions (one set per label), yielding five frequent subsequence sets P1, P2, ..., P5, where Pl = {p(l,1), ..., p(l,q), ..., p(l,Ql)}. Here Q1, Q2, ..., Ql, ..., Q5 are positive integers giving, for l = 1, 2, ..., 5, the number of frequent subsequences PrefixSpan mines from the l-th set, and 1 ≤ q ≤ Ql.
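A minimal single-item-per-element variant of PrefixSpan is sketched below for illustration; the full algorithm of Pei et al. also handles itemsets within sequence elements and uses pseudo-projection for efficiency.

```python
def prefixspan(sequences, min_support):
    """Mine frequent subsequences by prefix projection (PrefixSpan sketch).

    sequences: list of item lists; min_support: absolute support count.
    Returns a dict mapping each frequent subsequence (as a tuple) to its
    support, i.e. the number of sequences containing it.
    """
    results = {}

    def project(db, prefix):
        # Count, per sequence, which items can extend the current prefix.
        counts = {}
        for seq in db:
            for item in set(seq):
                counts[item] = counts.get(item, 0) + 1
        for item, support in counts.items():
            if support < min_support:
                continue
            new_prefix = prefix + (item,)
            results[new_prefix] = support
            # Projected database: suffixes after the first occurrence of item.
            projected = [seq[seq.index(item) + 1:] for seq in db if item in seq]
            project(projected, new_prefix)

    project(sequences, ())
    return results
```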
2.5 Compute the confidence of all frequent subsequences in P1, P2, ..., P5:
2.5.1 Initialize variable l = 1.
2.5.2 Initialize variable q = 1.
2.5.3 Compute the confidence Confidence(l,q) of frequent subsequence p(l,q): Confidence(l,q) = (number of matches of p(l,q) against the preprocessed descriptions labeled Labell) / (sum of the numbers of matches of p(l,q) against the preprocessed descriptions of all five labels). Here, if p(l,q) is a subsequence of a preprocessed description, p(l,q) is judged to match that description once.
2.5.4 If q equals Ql, go to 2.5.5; otherwise let q = q + 1 and go to 2.5.3.
2.5.5 If l equals 5, the confidences of all frequent subsequences in P1, P2, ..., P5 have been obtained; go to 2.6. Otherwise let l = l + 1 and go to 2.5.2.
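The confidence of step 2.5 can be sketched as below, assuming descriptions are given as preprocessed token sequences and that a pattern matches a description at most once, as step 2.5.3 specifies.

```python
def is_subsequence(pattern, sequence):
    """True if pattern occurs in sequence in order (not necessarily contiguous)."""
    it = iter(sequence)
    return all(item in it for item in pattern)

def confidence(pattern, docs_by_label):
    """Per-label confidence of one frequent subsequence (step 2.5.3).

    docs_by_label maps a label to its list of preprocessed descriptions.
    Returns {label: matches-in-that-label / matches-across-all-labels}.
    """
    matches = {
        label: sum(1 for d in docs if is_subsequence(pattern, d))
        for label, docs in docs_by_label.items()
    }
    total = sum(matches.values())
    if total == 0:
        return {label: 0.0 for label in docs_by_label}
    return {label: m / total for label, m in matches.items()}
```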
2.6 Filter the frequent subsequences in P1, P2, ..., P5 according to their confidence:
2.6.1 Initialize variable l = 1.
2.6.2 Initialize variable q = 1.
2.6.3 If Confidence(l,q) > 1/5, where 5 is the number of expectation label classes, put p(l,q) into the set Pl'.
2.6.4 If q equals Ql, go to 2.6.5; otherwise let q = q + 1 and go to 2.6.3.
2.6.5 If l equals 5, the filtered frequent subsequence sets P1', P2', P3', P4', P5' have been obtained; go to 2.7. Otherwise let l = l + 1 and go to 2.6.2.
2.7 Train the configuration item expectation prediction module with P1', P2', P3', P4', P5':
2.7.1 Initialization: randomly select (with replacement) 100 frequent subsequences from each of P1', P2', P3', P4', P5', forming the randomly selected frequent subsequence sets P1'', P2'', P3'', P4'', P5''. Together these contain 500 frequent subsequences: {p(1,1), p(1,2), ..., p(1,r), ..., p(1,100)}, ..., {p(l,1), p(l,2), ..., p(l,r), ..., p(l,100)}, ..., {p(5,1), p(5,2), ..., p(5,r), ..., p(5,100)}, 1 ≤ r ≤ 100.
2.7.2 Compute the precision, recall, and F-score (the harmonic mean of precision and recall) of P1'', P2'', P3'', P4'', P5'' on the training data set.
2.7.3 If the estimated cumulative distribution function value of the maximum F-score is greater than a threshold δ (δ is generally 99%-99.9%), go to 2.8; if it is less than or equal to δ, go to 2.7.1.
2.8 The configuration item expectation prediction module takes the P1'', P2'', P3'', P4'', P5'' corresponding to the maximum F-score and builds the weighted voting classifier as follows. The classifier's input is any preprocessed configuration item description ⟨POSx, DSx⟩ (abbreviated x) whose expectation label is to be predicted; its output is the votes of the five expectation labels, and the performance expectation label of x is the label with the most votes. The votes of class l are the sum of the confidences of the frequent subsequences p(l,rx) in Pl'' (1 ≤ rx ≤ 100) that are subsequences of x. The classifier outputs the vote five-tuple Votes(x) = [Votes1(x), Votes2(x), ..., Votes5(x)], where Votesl(x) means: the sum of the confidences of the subsequences in Pl'' that satisfy "is a subsequence of x" (l = 1, 2, ..., 5). If some element of Votes(x) is nonzero, the index l of the maximum element of Votes(x) is the index of the performance expectation label of x, namely Labell; go to the third step. If Votes(x) = [0, 0, 0, 0, 0], the performance expectation label of x is empty; go to the third step. For example, if Votes(x) = [1.1, 1.4, 5.3, 0, 2.0], the performance expectation label of x is Label3; if Votes(x) = [0, 0, 0, 0, 0], the performance expectation label of the configuration item x is empty.
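The weighted voting of step 2.8 can be sketched as follows; here `patterns_by_label` stands in for the selected sets P1''..P5'' and `confidence_of` for the confidences computed in step 2.5 (both names are illustrative).

```python
def classify(x, patterns_by_label, confidence_of):
    """Weighted voting classifier sketch (step 2.8).

    x: preprocessed description (sequence of tokens or (POS, DS) pairs).
    Each frequent subsequence that is a subsequence of x votes for its own
    label with its confidence. Returns (winning label or None, votes dict);
    None corresponds to the patent's empty label when all votes are zero.
    """
    def is_subsequence(pattern, sequence):
        it = iter(sequence)
        return all(item in it for item in pattern)

    votes = {}
    for label, patterns in patterns_by_label.items():
        votes[label] = sum(
            confidence_of[p] for p in patterns if is_subsequence(p, x)
        )
    if all(v == 0 for v in votes.values()):
        return None, votes
    return max(votes, key=votes.get), votes
```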
In the third step, the trained configuration item expectation prediction module generates a performance expectation label set L for the software under test and sends L to the test sample generation module and the performance defect detection module, as follows. The trained expectation prediction module reads the configuration item descriptions from the configuration item user manual of the software under test, and its weighted voting classifier predicts the performance expectations of all configuration items under test C = {c1, c2, ..., cz, ..., cN'}, where 1 ≤ z ≤ N' and N' is the number of configuration items in the user manual, obtaining the performance expectation label set L = [Lab1, Lab2, ..., Labz, ..., LabN'], where Labz ∈ {Label1, Label2, Label3, Label4, Label5, null (empty)}. L is sent to the test sample generation module and the performance defect detection module.
In the fourth step, the test sample generation module generates a test sample set T for the software under test and sends T to the performance defect detection module, as follows.
4.1 The test sample generation module uses the Spex algorithm from "Do Not Blame Users for Misconfigurations" by Tianyin Xu et al. (SOSP 2013) to extract the syntax type and value range of the software configuration items in C. Spex extracts four syntax types: numeric (int), Boolean (bool), enumeration (enum), and string (string).
4.2 The test sample generation module generates, for the configuration item set C = {c1, c2, ..., cz, ..., cN'}, a set of values under test V = {V1, V2, ..., Vz, ..., VN'}, where each element of Vz is a candidate value of configuration item cz and Kz is the number of values the module generates for cz. The method is:
4.2.1 Initialize variable z = 1.
4.2.2 If the expectation label of cz is empty, let Vz contain only the default value of cz and go to 4.2.7.
4.2.3 If cz is Boolean (bool), let Vz = {0, 1} and go to 4.2.7.
4.2.4 If cz is an enumeration (enum), let Vz be the set of all possible values of cz extracted by the Spex algorithm, and go to 4.2.7.
4.2.5 If cz is a string (string), let Vz contain only the default value of cz (by the conclusion of "Tuning backfired? not (always) your fault: understanding and detecting configuration-related performance bugs" published by He Haochen at ESEC/FSE 2019, very few string configuration items cause performance defects), and go to 4.2.7.
4.2.6 If cz is numeric (int), sample the values of cz: denote the minimum and maximum values of cz extracted by Spex as Min and Max, let Vz = {Min, 10·Min, 10^2·Min, Max, 10^-1·Max, 10^-2·Max}, and go to 4.2.7.
4.2.7 If z = N', go to 4.3; otherwise let z = z + 1 and go to 4.2.2.
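The per-type value generation of step 4.2 can be sketched as below. The `meta` dictionary is a stand-in for the range information Spex extracts; reducing string items to a single default value is an assumption made here so that the later Cartesian product stays well defined, since the patent text is ambiguous on that point.

```python
def candidate_values(syntax_type, meta):
    """Generate test values for one configuration item (step 4.2 sketch).

    syntax_type: 'bool' | 'enum' | 'string' | 'int'.
    meta: assumed dict with Spex-extracted info ('values', 'min', 'max',
    'default'), invented here for illustration.
    """
    if syntax_type == "bool":
        return [0, 1]
    if syntax_type == "enum":
        return list(meta["values"])  # all extracted enum values
    if syntax_type == "string":
        # String items rarely cause performance defects (He et al. 2019),
        # so keep only the default value.
        return [meta.get("default")]
    if syntax_type == "int":
        # Log-scale sampling from both ends of the extracted range [Min, Max].
        lo, hi = meta["min"], meta["max"]
        return [lo, 10 * lo, 100 * lo, hi, hi // 10, hi // 100]
    raise ValueError("unknown syntax type: %s" % syntax_type)
```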
4.3 Take the Cartesian product of V1, V2, ..., Vz, ..., VN', obtaining VCartesian = V1 × V2 × ... × VN'.
4.4 A software performance test set is generally provided in the form of a performance testing tool, so the test sample generation module generates test commands with a performance testing tool (such as sysbench or apache-benchmark). The method is: sample the parameters of the performance testing tool with the classic pair-wise method (pair-wise testing is a combinatorial method of software testing that, for each pair of input parameters to a system, tests all possible discrete combinations of those parameters; see "Pragmatic Software Testing: Becoming an Effective and Efficient Test Professional"); then feed the parameters (such as concurrency, load type, data table size, number of data tables, read operation ratio, and write operation ratio) into the performance testing tool and output test commands, obtaining the test command set B = {b1, b2, b3, ..., by, ..., bY}, 1 ≤ y ≤ Y, where Y is the number of test commands in B.
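The pair-wise sampling of step 4.4 can be sketched with a simple greedy all-pairs generator; real tools use more sophisticated covering-array construction, and the parameter names and values below are illustrative only.

```python
from itertools import combinations, product

def pairwise_suite(parameters):
    """Greedy all-pairs sketch: keep only those full combinations that cover
    at least one not-yet-covered pair of (parameter, value) assignments.
    Covers every value pair of every two parameters, though not minimally.
    """
    names = list(parameters)

    def pairs_of(combo):
        return {((names[i], combo[i]), (names[j], combo[j]))
                for i, j in combinations(range(len(names)), 2)}

    # Enumerate every pair that must be covered.
    uncovered = set()
    for i, j in combinations(range(len(names)), 2):
        for vi in parameters[names[i]]:
            for vj in parameters[names[j]]:
                uncovered.add(((names[i], vi), (names[j], vj)))

    suite = []
    for combo in product(*parameters.values()):
        new = pairs_of(combo) & uncovered
        if new:
            suite.append(dict(zip(names, combo)))
            uncovered -= new
    return suite
```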
4.5 The test sample generation module generates the test sample set T = B × VCartesian = {t1, t2, t3, …, ta, …, tW}, 1 ≤ a ≤ W, where each ta is a two-tuple pairing a test command with one assignment of values to the configuration items, and W is the number of test samples in T. The u-th (1 ≤ u ≤ K1) possible value of c1, the h-th (1 ≤ h ≤ Kz) possible value of cz, and the j-th (1 ≤ j ≤ KN′) possible value of cN′ appear in these assignments, where K1, Kz, KN′ are the numbers of possible values of configuration items c1, cz, cN′ extracted by the Spex algorithm, all positive integers. The test sample set T is sent to the performance defect detection module;
Step 5: The performance defect detection module detects performance defects of the executable of the software under test based on T and L:
5.1 The performance defect detection module executes the test samples in T and obtains their performance values, as follows:
5.1.1 Initialize variable a = 1;
5.1.2 To guard against performance fluctuations caused by an unstable test environment, the performance defect detection module executes each test sample A times, where A is a positive integer, preferably 10; let the variable repeat = 1 (repeat records the current repetition count);
5.1.3 The performance defect detection module inputs test sample ta into the software under test, runs it, and records the performance value obtained on the repeat-th run of ta; the default performance indicator is software data throughput;
5.1.4 Determine whether repeat equals A. If so, a set of performance values for test sample ta is obtained, recorded as Ra; go to 5.1.5. Otherwise let repeat = repeat + 1 and go to 5.1.3;
5.1.5 Determine whether a equals W. If so, record the output as Out = {[t1,R1], …, [ta,Ra], …, [tW,RW]} (where the first element of the tuple [ta,Ra] is the test sample and the second is the set of performance values obtained by executing that test sample A times), and go to 5.2. Otherwise let a = a + 1 and go to 5.1.2;
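Steps 5.1.1–5.1.5 amount to a nested measurement loop; a schematic version follows (the `run_software` callable is a stand-in for actually executing the software under test, and the sample values are invented):

```python
def measure(test_samples, run_software, A=10):
    """Execute each test sample A times (step 5.1.2 prefers A = 10)
    and collect the per-sample performance value sets R_a (step 5.1.4)."""
    out = []
    for t_a in test_samples:
        R_a = [run_software(t_a) for _ in range(A)]  # repeats damp noise
        out.append((t_a, R_a))
    return out

# Toy stand-in for the software under test: here "throughput" is a
# deterministic function of the sample, purely for illustration.
fake_run = lambda t: 100.0 + t
Out = measure([1, 2, 3], fake_run, A=4)
```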
5.2 The performance defect detection module groups Out by test sample, as follows:
5.2.1 Initialize variable a = 1;
5.2.2 If [ta,Ra] has already been grouped, let a = a + 1 and go to 5.2.2; otherwise go to 5.2.3;
5.2.3 Group [ta,Ra] by the configuration item values and test command in ta: among {[t1,R1], …, [ta,Ra], …, [tW,RW]}, if ta and ta′ simultaneously satisfy the following three conditions, then [ta,Ra] and [ta′,Ra′] are placed in the same group:
Condition 1: ta and ta′ differ in the value of exactly one configuration item cz (1 ≤ z ≤ N′);
Condition 2: their test commands are the same command by;
Condition 3: [ta,Ra] and [ta′,Ra′] have not yet been grouped;
There are Numa tuples satisfying the above conditions together with [ta,Ra]; they form one group, denoted Group(z,y) = {[ta,Ra], [ta′,Ra′], [ta″,Ra″], …, [ta*,Ra*]} (where 1 ≤ a′, a″, …, a* ≤ W, Numa is a positive integer, and Numa depends on the type of cz: if cz is Boolean, Numa = 2; if enumerated, Numa = Kz; if numeric, Numa = 6; if a string, Numa = 1). For example, when ta and ta′ share a test command and differ only in the value of cz, [ta,Ra] and [ta′,Ra′] are grouped together;
5.2.4 If a = W, grouping is complete and the grouped test result set is G = {Group(1,1), Group(1,2), …, Group(1,Y), …, Group(z,y), …, Group(N′,Y)}; go to 5.3. Otherwise let a = a + 1 and go to 5.2.2;
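The grouping rule of 5.2.3 (same test command, all configuration items equal except one, not yet grouped) can be sketched as follows; the representation of a test sample as a (command, {configuration item: value}) pair and the sample data are assumptions made for the illustration:

```python
def group_results(out, config_names):
    """Group [t_a, R_a] pairs that share a test command and differ in the
    value of exactly one configuration item (conditions 1-3 of 5.2.3)."""
    groups, grouped = [], set()
    for a, ((cmd_a, cfg_a), _) in enumerate(out):
        if a in grouped:
            continue
        for z in config_names:            # the one item allowed to vary
            members = [a]
            for b, ((cmd_b, cfg_b), _) in enumerate(out):
                if b in grouped or b == a or cmd_b != cmd_a:
                    continue
                diff = [n for n in config_names if cfg_a[n] != cfg_b[n]]
                if diff == [z]:           # differs from t_a only in c_z
                    members.append(b)
            if len(members) > 1:
                groups.append([out[i] for i in members])
                grouped.update(members)
                break
    return groups

out = [(("cmd1", {"x": 0, "y": "on"}), [10.0]),
       (("cmd1", {"x": 1, "y": "on"}), [12.0]),
       (("cmd2", {"x": 0, "y": "on"}), [9.0])]
groups = group_results(out, ["x", "y"])   # one group: the two cmd1 samples
```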
5.3 Based on the performance expectation labels L of configuration item set C and the grouped test result set G, the performance defect detection module uses hypothesis testing to decide whether the software under test is defective. (Hypothesis testing, also called statistical hypothesis testing, is a statistical inference method for judging whether differences between samples, or between a sample and a population, arise from sampling error or from an essential difference. The test parameter β is a positive real number less than 1; β = 0.05 is preferred.) The principle is: if the expected label of any configuration item cz is Label1, Label2, or Label3, adjusting cz is expected to improve performance, so if the actual test result is a performance drop, the software has a performance defect; if the expected label of cz is Label4, adjusting cz is expected to cause a reasonable performance drop, so if the actual result is a drastic performance drop, the software has a performance defect; if the expected label of cz is Label5, adjusting cz is expected to leave performance unchanged, so if the actual result is a performance drop, the software has a performance defect. The method traverses every group in G and applies a hypothesis test to decide whether the software under test is defective:
5.3.1 Initialize variable z = 1;
5.3.2 Initialize variable y = 1;
5.3.3 If Labz = Label1 (where Labz is the expected label of cz), set the hypothesis to be tested H0: Ra ≤ Ra′ (where the value of cz in ta is 0 and in ta′ is 1). Go to 5.3.8;
5.3.4 If Labz = Label2, set H0: Ra ≤ Ra′ (where the value of cz in ta is greater than its value in ta′). Go to 5.3.8;
5.3.5 If Labz = Label3, set H0: Ra ≤ Ra′ (where the value of cz in ta is less than its value in ta′). Go to 5.3.8;
5.3.6 If Labz = Label4, set H0: 5·Ra ≤ Ra′ (where the value of cz in ta is 1 and in ta′ is 0). Go to 5.3.8;
5.3.7 If Labz = Label5, set the hypothesis to be tested H0: Ra = Ra′. Go to 5.3.8;
5.3.8 When the hypothesis test rejects H0 (i.e., the rejection probability computed by the hypothesis test is ≥ 1 − β), the software has a performance defect related to configuration item cz, and the test command that triggers the defect is by;
5.3.9 If y = Y, go to 5.3.10; otherwise let y = y + 1 and go to 5.3.3;
5.3.10 If z = N′, end the detection; otherwise let z = z + 1 and go to 5.3.2.
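Step 5.3 fixes only the significance level (β = 0.05), not a particular statistical test. As one concrete possibility, a one-sided Welch t statistic with a normal tail approximation can check H0: Ra ≤ Ra′ from 5.3.3; the throughput samples below are invented:

```python
import math

def one_sided_p(sample_low, sample_high):
    """Approximate p-value for H0: mean(sample_low) <= mean(sample_high),
    using a Welch t statistic with a normal tail approximation (a sketch;
    a production version would use a proper t distribution)."""
    n1, n2 = len(sample_low), len(sample_high)
    m1, m2 = sum(sample_low) / n1, sum(sample_high) / n2
    v1 = sum((x - m1) ** 2 for x in sample_low) / (n1 - 1)
    v2 = sum((x - m2) ** 2 for x in sample_high) / (n2 - 1)
    t = (m1 - m2) / math.sqrt(v1 / n1 + v2 / n2)
    return 0.5 * math.erfc(t / math.sqrt(2))  # P(T > t) under H0

beta = 0.05
# Invented throughputs for a Label1 (optimization switch) configuration item:
R_off = [120.0, 118.0, 122.0, 121.0, 119.0]   # c_z = 0 in t_a
R_on  = [80.0, 82.0, 79.0, 81.0, 80.0]        # c_z = 1 in t_a'
p = one_sided_p(R_off, R_on)   # tests H0: R_a <= R_a' as in 5.3.3
defect = p < beta              # H0 rejected -> performance defect found
```

In this made-up data, enabling the "optimization" makes the software slower, so H0 is rejected and a defect is reported, exactly the Label1 scenario described in 5.3.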
Compared with the prior art, the present invention achieves the following beneficial effects:
1. The present invention effectively detects software performance defects. Applied to 61 historical performance defects in 12 large open-source software systems (MySQL, MariaDB, Apache-httpd, Apache-Tomcat, Apache-Derby, H2, PostgreSQL, GCC, Clang, MongoDB, RocksDB, Squid), and based on the expectations of 52 configuration items, the invention ran 23,418 test samples in 178 hours and successfully detected 54 performance defects with only 7 false positives. Existing work ("Toddler: Detecting Performance Problems via Similar Memory-Access Patterns", Adrian Nistor et al., ICSE 2013, which detects performance defects through similar memory access patterns) detects only 6.
2. The present invention detected 11 new performance defects for the software community, preventing potential economic and user losses caused by software performance problems. The defect IDs are: Clang-43576, Clang-43084, Clang-44359, Clang-44518, GCC-93521, GCC-93037, GCC-91895, GCC-91852, GCC-91817, GCC-91875, GCC-93535.
3. The second step of the present invention provides a detailed classification of configuration item performance expectations, a method for automatically predicting them, and a data set containing a large number of configuration items with their expectations; based on configuration item expectations, the invention effectively distinguishes the performance of defect-free and defective software and has good application prospects.
BRIEF DESCRIPTION OF THE DRAWINGS
Fig. 1 is the overall flow chart of the present invention;
Fig. 2 is the logical structure diagram of the performance defect detection system constructed in the first step of the present invention;
Fig. 3 is the configuration item performance expectation table used in the second step of the present invention.
DETAILED DESCRIPTION
The present invention is described below in conjunction with the accompanying drawings.
As shown in Fig. 1, the present invention comprises the following steps:
Step 1: Build the performance defect detection system. As shown in Fig. 2, it consists of a configuration item expectation prediction module, a test sample generation module, and a performance defect detection module.
The configuration item expectation prediction module is a weighted voting classifier connected to the test sample generation module and the performance defect detection module. It reads the descriptions and value ranges of configuration items from the configuration item user manual of the software under test, predicts the performance expectation of each configuration item to be predicted, obtains the configuration items' performance expectation labels, and sends them to the test sample generation module and the performance defect detection module.
The test sample generation module is connected to the configuration item expectation prediction module and the performance defect detection module. It receives the configuration items' performance expectation labels from the configuration item expectation prediction module, reads test commands from the test set of the software under test, and generates the test sample set T from the labels and the test set.
The performance defect detection module is connected to the configuration item expectation prediction module and the test sample generation module. It receives the test sample set T from the test sample generation module and the configuration items' performance expectation labels from the configuration item expectation prediction module, executes the test samples in T, and checks whether the actual performance matches the expected performance indicated by each label; if not, it outputs the performance defects of the software under test.
Step 2: Train the configuration item expectation prediction module of the performance defect detection system. Read the manually labeled configuration items and their official document descriptions, and train the module.
2.1 Build the training set: randomly select N (N ≥ 500) configuration items from the more than 10,000 configuration items of 12 software systems: MySQL, MariaDB, Apache-httpd, Apache-Tomcat, Apache-Derby, H2, PostgreSQL, GCC, Clang, MongoDB, RocksDB, Squid.
2.2 Manually label the performance expectation of each of the N configuration items according to its official document description, as shown in Fig. 3. Given the document description (denoted d) of a configuration item (denoted c): if adjusting the item turns on an optimization switch, its performance expectation label is Label1; if adjusting it trades non-functional requirements such as reliability for performance, the label is Label2; if adjusting it allocates more computer resources, the label is Label3; if adjusting it turns on additional software functionality, the label is Label4; if adjusting it is unrelated to software performance, the label is Label5. This yields the training set, where N1 + N2 + N3 + N4 + N5 = N, and N1, N2, N3, N4, N5 are the numbers of configuration item descriptions labeled Label1, Label2, Label3, Label4, Label5, respectively. The il-th configuration item with expectation label Labell in the training set (1 ≤ l ≤ 5, 1 ≤ il ≤ Nl) has a document description composed of words, recorded as the word sequence (word1, word2, …).
2.3 The configuration item expectation prediction module preprocesses the training set:
2.3.1 Initialize variable l = 1;
2.3.2 Initialize variable il = 1;
2.3.3 Preprocess the description of the il-th configuration item as follows:
2.3.3.1 Initialize a word index variable;
2.3.3.2 Convert the current word into a pair consisting of its part-of-speech tag and its computer-domain synonym;
2.3.3.3 If words remain, advance to the next word and go to 2.3.3.2; otherwise the preprocessed description, a sequence of (part-of-speech tag, synonym) pairs abbreviated 《POS, DS》, is obtained; go to 2.3.4;
2.3.4 Determine whether il equals Nl; if so, go to 2.3.5, otherwise let il = il + 1 and go to 2.3.3;
2.3.5 Determine whether l equals 5; if so, go to 2.4, otherwise let l = l + 1 and go to 2.3.2;
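Step 2.3.3.2 maps each word of a description to a (part-of-speech tag, domain synonym) pair. A toy sketch of this normalization follows; the tag set and synonym table are invented for illustration, and the patent does not prescribe any particular NLP toolkit:

```python
# Hypothetical, tiny lookup tables standing in for a real POS tagger
# and a computer-domain synonym dictionary.
POS = {"enable": "VB", "enables": "VB", "cache": "NN", "caching": "NN",
       "the": "DT", "of": "IN"}
SYN = {"enables": "enable", "caching": "cache"}

def preprocess(description):
    """Turn a description d into the <POS, DS> pair sequence of 2.3.3:
    each word becomes (part-of-speech tag, domain synonym)."""
    words = description.lower().split()
    return [(POS.get(w, "NN"), SYN.get(w, w)) for w in words]

seq = preprocess("Enables caching of the cache")
```

Normalizing inflected forms ("enables", "caching") onto canonical domain terms is what lets the frequent-subsequence mining of 2.4 find the same pattern across differently worded descriptions.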
2.4 The configuration item expectation prediction module mines frequent subsequences. The PrefixSpan algorithm is applied separately to the five sets of preprocessed descriptions, yielding five frequent subsequence sets P1, P2, …, P5, where Q1, Q2, …, Ql, …, Q5 are positive integers denoting, for l = 1, 2, …, 5, the number of frequent subsequences PrefixSpan mines from the l-th set; 1 ≤ q ≤ Ql;
2.5 Compute the confidence of every frequent subsequence in P1, P2, …, P5, as follows:
2.5.1 Initialize variable l = 1;
2.5.2 Initialize variable q = 1;
2.5.3 Compute the confidence Confidence(l,q) of frequent subsequence p(l,q) as its number of matches among the descriptions labeled Labell divided by the sum of its matches among the descriptions of all five labels. Here, if p(l,q) is a subsequence of a preprocessed description, p(l,q) is judged to match that description once.
2.5.4 Determine whether q equals Ql; if so, go to 2.5.5; if not, let q = q + 1 and go to 2.5.3;
2.5.5 Determine whether l equals 5; if so, the confidences of all frequent subsequences in P1, P2, …, P5 have been obtained, go to 2.6; if not, let l = l + 1 and go to 2.5.2.
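The confidence of 2.5.3 can be sketched with a simple subsequence matcher (the word sequences below are invented; real inputs would be the 《POS, DS》 pair sequences of 2.3):

```python
def is_subseq(pattern, seq):
    """True if pattern occurs in seq as a (possibly non-contiguous)
    subsequence, the matching criterion of 2.5.3."""
    it = iter(seq)
    return all(word in it for word in pattern)

def confidence(pattern, classes, l):
    """Matches of pattern among class-l descriptions divided by its
    matches across all classes; one match per containing description."""
    hits = [sum(is_subseq(pattern, d) for d in docs) for docs in classes]
    total = sum(hits)
    return hits[l] / total if total else 0.0

classes = [
    [["enable", "query", "cache"], ["enable", "optimizer"]],  # Label1 docs
    [["disable", "sync"]],                                    # Label2 docs
]
c = confidence(["enable"], classes, 0)   # all "enable" matches are in Label1
```

A confidence close to 1 means the pattern is specific to one expectation label, which is exactly what the filtering of 2.6 selects for.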
2.6 Filter the frequent subsequences in P1, P2, …, P5 according to their confidence values, as follows:
2.6.1 Initialize variable l = 1;
2.6.2 Initialize variable q = 1;
2.6.3 If Confidence(l,q) > 1/5, where 5 is the number of expectation label classes, put p(l,q) into the set Pl′;
2.6.4 Determine whether q equals Ql; if so, go to 2.6.5; if not, let q = q + 1 and go to 2.6.3;
2.6.5 Determine whether l equals 5; if so, the filtered frequent subsequence sets P1′, P2′, P3′, P4′, P5′ have been obtained, go to 2.7; if not, let l = l + 1 and go to 2.6.2.
2.7 Train the configuration item expectation prediction module with P1′, P2′, P3′, P4′, P5′, as follows:
2.7.1 Initialization: from each of P1′, P2′, P3′, P4′, P5′, randomly select (with replacement) 100 frequent subsequences, forming the selected sets P1″, P2″, P3″, P4″, P5″, which together contain 500 frequent subsequences: {p(1,1), p(1,2), …, p(1,r), …, p(1,100)}, …, {p(l,1), p(l,2), …, p(l,r), …, p(l,100)}, …, {p(5,1), p(5,2), …, p(5,r), …, p(5,100)}, 1 ≤ r ≤ 100;
2.7.2 Compute the precision, recall, and F-score (the harmonic mean of precision and recall) of P1″, P2″, P3″, P4″, P5″ on the training data set;
2.7.3 Determine whether the estimated cumulative distribution function value of the maximum F-score exceeds a threshold δ, generally 99%–99.9%; if so, go to 2.8; if it is less than or equal to δ, go to 2.7.1;
2.8 The configuration item expectation prediction module builds a weighted voting classifier from the P1″, P2″, P3″, P4″, P5″ that maximize the F-score. The input of the classifier is the preprocessed description 《POSx, DSx》 (abbreviated x) of any configuration item whose expectation label is to be predicted, and the output is the vote totals of the five expectation labels; the performance expectation label of x is the label with the most votes. The votes for label l are the sum of the confidences of the frequent subsequences in Pl″ that are subsequences of x (1 ≤ rx ≤ 100). The classifier outputs the vote five-tuple Votes(x).
If some element of Votes(x) is nonzero, the index l of the maximum element of Votes(x) is the index of the performance expectation label of x, namely Labell; go to Step 3. If Votes(x) = [0, 0, 0, 0, 0], the performance expectation label of x is empty; go to Step 3. For example, if Votes(x) = [1.1, 1.4, 5.3, 0, 2.0], the performance expectation label of x is Label3; if Votes(x) = [0, 0, 0, 0, 0], the label of configuration item x is empty;
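The weighted voting of 2.8 sums, per label, the confidences of the selected frequent subsequences that occur in the preprocessed description x; a minimal sketch (the patterns and confidence values are invented, and plain words stand in for 《POS, DS》 pairs):

```python
def vote(x, selected):
    """selected: per-label list of (pattern, confidence) pairs (the P_l'').
    Returns the five vote totals and the 1-based winning label index,
    or None when every vote is zero (empty expectation label)."""
    def is_subseq(p, s):
        it = iter(s)
        return all(w in it for w in p)
    votes = [sum(conf for pat, conf in pats if is_subseq(pat, x))
             for pats in selected]
    if not any(votes):
        return votes, None
    return votes, votes.index(max(votes)) + 1   # labels are 1-based

selected = [
    [(("enable", "cache"), 0.9)],     # patterns voting for Label1
    [(("disable", "sync"), 0.8)],     # patterns voting for Label2
    [(("buffer", "size"), 0.7)],      # patterns voting for Label3
    [],                               # Label4
    [],                               # Label5
]
votes, label = vote(("enable", "query", "cache"), selected)
```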
Step 3: Use the trained configuration item expectation prediction module to generate the performance expectation label set L for the software under test and send L to the test sample generation module and the performance defect detection module, as follows:
The trained configuration item expectation prediction module reads the configuration item descriptions from the configuration item user manual of the software under test; its weighted voting classifier predicts the performance expectation of every configuration item under test, C = {c1, c2, …, cz, …, cN′}, 1 ≤ z ≤ N′ (where N′ is the number of configuration items in the user manual), yielding the performance expectation label set L = [Lab1, Lab2, …, Labz, …, LabN′], where Labz ∈ {Label1, Label2, Label3, Label4, Label5, null (empty)}. L is sent to the test sample generation module and the performance defect detection module.
Step 4: The test sample generation module generates the test sample set T for the software under test and sends T to the performance defect detection module, as follows:
4.1 The test sample generation module uses the Spex algorithm to extract the syntax type and value range of each software configuration item in C. The syntax types Spex extracts fall into four categories: numeric (int), Boolean (bool), enumerated (enum), and string;
4.2 The test sample generation module generates the test value set V = {V1, V2, …, Vz, …, VN′} for the configuration item set C = {c1, c2, …, cz, …, cN'}, where each element of Vz is one value of configuration item cz and Kz is the number of values the module generates for cz, as follows:
4.2.1 Initialize variable z = 1;
4.2.2 If the expectation label corresponding to cz is empty, let Vz contain only the default value of cz and go to 4.2.7;
4.2.3 If cz is of Boolean type (bool), let Vz = {0, 1} and go to 4.2.7;
4.2.4 If cz is of enumerated type (enum), let Vz be all possible values of cz extracted by the Spex algorithm and go to 4.2.7;
4.2.5 If cz is of string type (string), let Vz contain the default value of cz and go to 4.2.7;
4.2.6 If cz is of numeric type (int), sample the values of cz as follows: let Min and Max denote the minimum and maximum values of cz extracted by the Spex algorithm, and let Vz = {Min, 10·Min, 10^2·Min, Max, 10^-1·Max, 10^-2·Max}; go to 4.2.7;
4.2.7 If z = N′, go to 4.3; otherwise let z = z + 1 and go to 4.2.2;
4.3 Take the Cartesian product of V1, V2, …, Vz, …, VN′, obtaining VCartesian = V1 × V2 × … × VN′;
4.4 A software performance test set is generally provided in the form of a performance testing tool, so the test sample generation module generates test commands based on such a tool (e.g., sysbench, apache-benchmark). The method is: sample the parameters of the performance testing tool using the classic pair-wise method, then input the sampled parameters (e.g., concurrency, load type, data table size, number of data tables, read operation ratio, write operation ratio) into the tool and collect the output test commands, obtaining the test command set B = {b1, b2, b3, …, by, …, bY}, 1 ≤ y ≤ Y, where Y is the number of test commands in B;
4.5 The test sample generation module generates the test sample set T = B × VCartesian = {t1, t2, t3, …, ta, …, tW}, 1 ≤ a ≤ W, where each ta is a two-tuple pairing a test command with one assignment of values to the configuration items, and W is the number of test samples in T; K1, Kz, KN′ are the numbers of possible values of configuration items c1, cz, cN′ extracted by the Spex algorithm, all positive integers. T is sent to the performance defect detection module;
Step 5: The performance defect detection module detects performance defects of the executable of the software under test based on T and L:
5.1 The performance defect detection module executes the test samples in T and obtains their performance values, as follows:
5.1.1 Initialize variable a = 1;
5.1.2 To guard against performance fluctuations caused by an unstable test environment, the performance defect detection module executes each test sample A times, where A is a positive integer, preferably 10; let the variable repeat = 1;
5.1.3 The performance defect detection module inputs test sample ta into the software under test, runs it, and records the performance value obtained on the repeat-th run of ta; the default performance indicator is software data throughput;
5.1.4 Determine whether repeat equals A. If so, a set of performance values for test sample ta is obtained, recorded as Ra; go to 5.1.5. Otherwise let repeat = repeat + 1 and go to 5.1.3;
5.1.5 Determine whether a equals W. If so, record the output as Out = {[t1,R1], …, [ta,Ra], …, [tW,RW]} (where the first element of [ta,Ra] is the test sample and the second is the set of performance values obtained by executing it A times), and go to 5.2. Otherwise let a = a + 1 and go to 5.1.2;
5.2 The performance defect detection module groups Out by the configuration item values and test command of each test case, as follows:
5.2.1 Initialize variable a = 1;
5.2.2 If [ta,Ra] has already been grouped, set a = a + 1 and go to 5.2.2; otherwise go to 5.2.3;
5.2.3 Group [ta,Ra] by the configuration item values and the test command in ta: for [ta,Ra] and any [ta',Ra'] in {[t1,R1],…,[ta,Ra],…,[tW,RW]}, if ta and ta' simultaneously satisfy the following three conditions, place [ta,Ra] and [ta',Ra'] in the same group:
Condition 1: ta and ta' differ in the value of exactly one configuration item cz (where 1 ≤ z ≤ N');
Condition 2: ta and ta' contain the same test command;
Condition 3: neither [ta,Ra] nor [ta',Ra'] has been grouped;
Suppose there are Numa tuples that satisfy the above conditions together with [ta,Ra]; they form one group, denoted Group(z,y) = {[ta,Ra],[ta',Ra'],[ta'',Ra''],…,[ta*,Ra*]} (where 1 ≤ a',a'',…,a* ≤ W and Numa is a positive integer whose value depends on the type of cz: if cz is Boolean, Numa = 2; if enumerated, Numa = Kz; if numeric, Numa = 6; if string, Numa = 1);
5.2.4 If a = W, grouping is complete, yielding the grouped test result set G = {Group(1,1),Group(1,2),…,Group(1,Y),…,Group(z,y),…,Group(N',Y)}; go to 5.3. Otherwise set a = a + 1 and go to 5.2.2;
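The three grouping conditions of step 5.2.3 can be sketched as below. Representing a test case as a (configuration dict, command) pair is an assumption made for illustration, and this pairwise sketch covers the Boolean case (Numa = 2); enumerated and numeric items would collect all Numa variants into one group.

```python
from itertools import combinations

def differs_in_exactly_one(cfg1, cfg2):
    # Return the single differing configuration item cz if cfg1 and
    # cfg2 differ in exactly one item, else None (same key set assumed).
    diff = [k for k in cfg1 if cfg1[k] != cfg2[k]]
    return diff[0] if len(diff) == 1 else None

def group_results(out):
    # Sketch of step 5.2: out is a list of ((config_dict, command), R)
    # pairs; returns groups keyed by (differing item z, command y).
    grouped = set()
    groups = {}
    for i, j in combinations(range(len(out)), 2):
        (cfg_i, cmd_i), _ = out[i]
        (cfg_j, cmd_j), _ = out[j]
        if i in grouped or j in grouped:
            continue                  # condition 3: not yet grouped
        if cmd_i != cmd_j:
            continue                  # condition 2: same test command
        z = differs_in_exactly_one(cfg_i, cfg_j)
        if z is None:
            continue                  # condition 1: exactly one item differs
        groups.setdefault((z, cmd_i), []).append((out[i], out[j]))
        grouped.update({i, j})
    return groups

out = [
    (({"cache": 0, "workers": 4}, "read"), [1.2, 1.1]),
    (({"cache": 1, "workers": 4}, "read"), [0.9, 1.0]),
    (({"cache": 0, "workers": 4}, "write"), [2.0, 2.1]),
]
groups = group_results(out)   # the first two cases pair up on "cache"
```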
5.3 Using the performance expectation labels L of configuration item set C and the grouped test result set G, the performance defect detection module applies hypothesis testing (with hypothesis test parameter β, a positive real number less than 1; β = 0.05 is preferred) to determine whether the software under test contains defects. The principle is: if a configuration item cz carries expectation label Label1, Label2, or Label3, adjusting cz is expected to improve performance, so an observed performance degradation indicates a performance defect; if cz carries Label4, adjusting cz is expected to cause a reasonable performance degradation, so a drastic degradation indicates a performance defect; if cz carries Label5, adjusting cz is expected to leave performance unchanged, so any degradation indicates a performance defect. The method traverses each group in G and uses hypothesis testing to determine whether the software under test contains defects:
5.3.1 Initialize variable z = 1;
5.3.2 Initialize variable y = 1;
5.3.3 If Labz = Label1 (where Labz is the expectation label of cz), set the hypothesis under test H0: Ra ≤ Ra' (where cz takes value 0 in ta and value 1 in ta'). Go to 5.3.8;
5.3.4 If Labz = Label2, set the hypothesis under test H0: Ra ≤ Ra' (where the value of cz in ta is greater than its value in ta'). Go to 5.3.8;
5.3.5 If Labz = Label3, set the hypothesis under test H0: Ra ≤ Ra' (where the value of cz in ta is less than its value in ta'). Go to 5.3.8;
5.3.6 If Labz = Label4, set the hypothesis under test H0: 5·Ra ≤ Ra' (where cz takes value 1 in ta and value 0 in ta'). Go to 5.3.8;
5.3.7 If Labz = Label5, set the hypothesis under test H0: Ra ≠ Ra'. Go to 5.3.8;
5.3.8 If the hypothesis test rejects H0 (i.e., the rejection probability computed by the hypothesis test is ≥ 1 − β), the software contains a performance defect related to configuration item cz, and the defect is triggered by the test command shared by the tuples of Group(z,y);
5.3.9 If y = Y, go to 5.3.10; otherwise set y = y + 1 and go to 5.3.3;
5.3.10 If z = N', detection ends; otherwise set z = z + 1 and go to 5.3.2.
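The label-dependent checks of step 5.3 can be sketched with a stdlib-only Welch statistic and a large-sample normal approximation in place of a full t distribution; the exact test procedure the patent intends, and the direction convention for performance values, are not specified here, so this is an assumption-laden illustration rather than the patented method. For Label5 the sketch uses an ordinary two-sided equality test, a simplification of the stated H0: Ra ≠ Ra'.

```python
import math
from statistics import mean, variance

def welch_t(x, y):
    # Welch's t statistic for comparing mean(x) against mean(y).
    se = math.sqrt(variance(x) / len(x) + variance(y) / len(y))
    return (mean(x) - mean(y)) / se

def normal_sf(t):
    # Standard-normal survival function, used as a large-sample
    # approximation to the t distribution's tail probability.
    return 0.5 * math.erfc(t / math.sqrt(2))

def has_defect(label, Ra, Ra_prime, beta=0.05):
    # Sketch of steps 5.3.3-5.3.8: reject H0 at significance beta.
    # Labels 1-3: H0 is Ra <= Ra'; Label 4: H0 is 5*Ra <= Ra';
    # Label 5: two-sided equality test (a simplification).
    if label in (1, 2, 3):
        return normal_sf(welch_t(Ra, Ra_prime)) < beta
    if label == 4:
        scaled = [5 * r for r in Ra]
        return normal_sf(welch_t(scaled, Ra_prime)) < beta
    if label == 5:
        return 2 * normal_sf(abs(welch_t(Ra, Ra_prime))) < beta
    return False
```

For example, a group whose two performance-value sets differ sharply would be flagged under Label1, while the reversed comparison would not.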
Claims (13)
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202010610996.3A CN111611177B (en) | 2020-06-29 | 2020-06-29 | A Software Performance Defect Detection Method Based on Performance Expectation of Configuration Items |
Publications (2)
Publication Number | Publication Date |
---|---|
CN111611177A CN111611177A (en) | 2020-09-01 |
CN111611177B true CN111611177B (en) | 2023-06-09 |
Family
ID=72200573
Families Citing this family (7)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN112131108B (en) * | 2020-09-18 | 2024-04-02 | 电信科学技术第十研究所有限公司 | Feature attribute-based test strategy adjustment method and device |
CN114756865B (en) * | 2022-04-24 | 2024-08-13 | 安天科技集团股份有限公司 | RDP file security detection method and device, electronic equipment and storage medium |
CN114780411B (en) * | 2022-04-26 | 2023-04-07 | 中国人民解放军国防科技大学 | Software configuration item preselection method oriented to performance tuning |
CN115562645B (en) * | 2022-09-29 | 2023-06-09 | 中国人民解放军国防科技大学 | Configuration fault prediction method based on program semantics |
CN116225965B (en) * | 2023-04-11 | 2023-10-10 | 中国人民解放军国防科技大学 | IO size-oriented database performance problem detection method |
CN116561002B (en) * | 2023-05-16 | 2023-10-10 | 中国人民解放军国防科技大学 | A database performance problem detection method oriented to I/O concurrency |
CN116560998B (en) * | 2023-05-16 | 2023-12-01 | 中国人民解放军国防科技大学 | I/O (input/output) sequence-oriented database performance problem detection method |
Citations (3)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN104407971A (en) * | 2014-11-18 | 2015-03-11 | 中国电子科技集团公司第十研究所 | Method for automatically testing embedded software |
CN106201871A (en) * | 2016-06-30 | 2016-12-07 | 重庆大学 | Based on the Software Defects Predict Methods that cost-sensitive is semi-supervised |
CN106528417A (en) * | 2016-10-28 | 2017-03-22 | 中国电子产品可靠性与环境试验研究所 | Intelligent detection method and system of software defects |
Family Cites Families (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US7398469B2 (en) * | 2004-03-12 | 2008-07-08 | United Parcel Of America, Inc. | Automated test system for testing an application running in a windows-based environment and related methods |
WO2008155779A2 (en) * | 2007-06-20 | 2008-12-24 | Sanjeev Krishnan | A method and apparatus for software simulation |
US8140319B2 (en) * | 2008-02-05 | 2012-03-20 | International Business Machines Corporation | Method and system for predicting system performance and capacity using software module performance statistics |
CN107066389A (en) * | 2017-04-19 | 2017-08-18 | 西安交通大学 | The Forecasting Methodology that software defect based on integrated study is reopened |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||