0% found this document useful (0 votes)

47 views13 pages

SVM Malware Detection

This paper investigates the effectiveness of combining three malware detection techniques—Hidden Markov Models (HMMs), Simple Substitution Distance (SSD), and Opcode Graph Similarity (OGS)—using Support Vector Machines (SVMs). The authors demonstrate that the SVM can significantly enhance detection capabilities by optimizing the combination of scores from these individual techniques. The study includes a robustness analysis against morphing strategies that could potentially evade detection.

Uploaded by

nandhiniramesharec

We take content rights seriously. If you suspect this is your content, claim it here.

Available Formats

Download as PDF, TXT or read online on Scribd

0% found this document useful (0 votes)

47 views13 pages

SVM Malware Detection

Uploaded by

nandhiniramesharec

We take content rights seriously. If you suspect this is your content, claim it here.

Available Formats

Download as PDF, TXT or read online on Scribd

You are on page 1/ 13

See discussions, stats, and author profiles for this publication at: https://www.researchgate.

net/publication/283036223

Support vector machines and malware detection

Article in Journal of Computer Virology and Hacking Techniques · November 2016

DOI: 10.1007/s11416-015-0252-0

CITATIONS READS

67 3,448

5 authors, including:

Fabio Di Troia Corrado Aaron Visaggio

San Jose State University University of Sannio
79 PUBLICATIONS 1,560 CITATIONS 156 PUBLICATIONS 5,218 CITATIONS

SEE PROFILE SEE PROFILE

Thomas H. Austin Mark Stamp

San Jose State University San Jose State University
50 PUBLICATIONS 1,717 CITATIONS 263 PUBLICATIONS 5,447 CITATIONS

SEE PROFILE SEE PROFILE

All content following this page was uploaded by Mark Stamp on 04 November 2018.

The user has requested enhancement of the downloaded file.

Support vector machines and malware
detection

Tanuvir Singh, Fabio Di Troia, Visaggio

Aaron Corrado, Thomas H. Austin &
Mark Stamp

Journal of Computer Virology and

Hacking Techniques

ISSN 2274-2042

J Comput Virol Hack Tech

DOI 10.1007/s11416-015-0252-0

1 23
Your article is protected by copyright and
all rights are held exclusively by Springer-
Verlag France. This e-offprint is for personal
use only and shall not be self-archived
in electronic repositories. If you wish to
self-archive your article, please use the
accepted manuscript version for posting on
your own website. You may further deposit
the accepted manuscript version in any
repository, provided it is only made publicly
available 12 months after official publication
or later and provided acknowledgement is
given to the original source of publication
and a link is inserted to the published article
on Springer's website. The link must be
accompanied by the following text: "The final
publication is available at link.springer.com”.

1 23
Author's personal copy
J Comput Virol Hack Tech
DOI 10.1007/s11416-015-0252-0

ORIGINAL PAPER

Support vector machines and malware detection

Tanuvir Singh1 · Fabio Di Troia2 · Visaggio Aaron Corrado2 · Thomas H. Austin1 ·
Mark Stamp1

Received: 9 July 2015 / Accepted: 12 September 2015

Abstract In this research, we test three advanced mal- analysis [2], compression rates [13], and Principal Compo-
ware scoring techniques that have shown promise in previous nent Analysis [8,12].
research, namely, Hidden Markov Models, Simple Substitu- We selected three methods for our research, namely
tion Distance, and Opcode Graph based detection. We then HMMs, SSD, and OGS. We implement and test each of these
perform a careful robustness analysis by employing morph- scoring techniques, and we implement a morphing strategy to
ing strategies that cause each score to fail. We show that defeat each of the scores. For each case, we carefully analyze
combining scores using a Support Vector Machine yields the degree of modification needed to break the score, where
results that are significantly more robust than those obtained the area under the ROC curve (AUC) serves as our measure
using any of the individual scores. of success. Then we implement a support vector machine
(SVM) [19] that serves to generate an optimal combination
of scores, and we measure the success of this SVM-based
1 Introduction score in comparison to the individual scores. We show that
the SVM is able to improve on the detection capability of
Most of the advanced scoring techniques that have been pro- any of the individual scores.
posed for malware detection can be classified in one of three This paper is organized as follows. Section 2 provides an
broad (and not necessarily mutually exclusive) categories. overview of relevant background topics, including previous
Statistical-based analysis looks for statistical properties that work and some details on the specific scoring techniques
remain constant through different generations of the mal- considered in this research. In Sect. 3, we discuss our experi-
ware. Structural-based techniques are based on the principle mental design and provide our experimental results. Section 4
that common file structures may exist, even in variants of contains our conclusions and a brief consideration of future
the same malware. Graph-based techniques apply similarity work.
measures to graphs extracted from malware code.
Examples of statistical-based scores include those that
rely on hidden Markov models (HMMs) [29,33] and simple 2 Background
substitution distance (SSD) [24]. Examples of graph-based
scores include the opcode graph similarity (OGS) consid- In this section, we first consider some examples of relevant
ered in [22], as well as the Function Call Graph technique previous research. Then we discuss the specific scores used in
in [7]. Examples of structure-based scores include entropy this research. We conclude this section with a brief discussion
of ROC analysis, which we use to measure the effectiveness
B Mark Stamp of our detection techniques.
mark.stamp@sjsu.edu
1 Department of Computer Science, San Jose State University, 2.1 Related work
San Jose, USA
2 Department of Engineering, Università degli Studi del HMMs have been extensively investigated in the context of
Sannio, Benevento, Italy malware detection. For example, Xin et al. [34] and Qin

123
Author's personal copy
T. Singh et al.

et al. [21] apply HMMs to the problem of malware detection 2.2.1 Hidden Markov models
on mobile devices. In [34], the keys pressed and system func-
tion call sequences are analyzed—the pressed keys represent A HMM includes a Markov process that is “hidden” in the
the hidden states, while the system call sequences represent sense that the states cannot be directly observed. However,
the observations. This proposed solution is evaluated on a sin- we do have access to a series of observations that are proba-
gle Symbian application, with a specific focus on the SMS bilistically related to the hidden states.
sending process. In [21], a prototype HMM-based detection In our context, an HMM is trained based on features
system is proposed, but it is not implemented or evaluated. extracted from members of a given malware family. The
The research in [1] analyzes the effectiveness of a profile resulting model is then used to score other samples belonging
hidden Markov model (PHMM) for metamorphic malware to the same family, as well as representative benign samples.
detection. A total of 240 virus variants and 70 trusted samples The results are used to determine the effectiveness of a detec-
are used in the experiments. The results are mixed, with some tion strategy based on HMMs.
metamorphic families not being detected with any reasonable We use the following standard notation to describe an
accuracy. HMM [27]:
Graph techniques have been extensively studied in the
malware detection literature. The paper [6] considers a graph-
T = length of the observation sequence
based score that uses dynamically extracted API calls. The
N = number of states in the model
authors of [7] analyze a function call graph score, which
M = number of distinct observation symbols
appears to be relatively robust with respect to obvious code
Q = {q0 , q1 , . . . , q N −1 } = distinct states of the Markov
morphing strategies.
process
Various statistical-based scores have been considered. For
V = {0, 1, . . . , M − 1} = set of possible observations
example, in [29], a chi-squared analysis is performed, and it
A = state transition probabilities
is shown that such a technique can be used to improve on a
B = observation probability matrix
straightforward HMM score, such as that in [33].
π = initial state distribution
Combining various classification techniques has also been
O = (O0 , O1 , . . . , OT −1 ) = observation sequence
explored in literature [20]. For example, the authors of [35]
use Dempster–Shafer theory to create combining rules for
individual decisions based on probabilistic neural network A model is defined by A, B, and π , and hence we denote an
(PNN) classifiers. The ensemble outperforms the individual HMM as λ = (A, B, π ).
PNN classifiers. Figure 1 gives a graphical view of a generic HMM. In this
The paper [15] considers an ensemble method, called figure, the X i represent the hidden states of the underlying
SVM-AR, which combines a SVM with association rules. Markov process.
The SVM determines a hyper plane that classifies samples as Given a set of virus variants, we train a HMMs, which
malicious or trusted. Then, the association rules are applied represents certain statistical properties of the virus fam-
to determine false predictions produced by the SVM. The ily. The trained model can then be used to determine the
authors conclude that this algorithm is essentially a single probability that a given program belongs to the same virus
learning algorithm that yields better results than some ensem- family as the training set. We trained our models based on
ble techniques. opcode sequences extracted from virus files, obtained by
The authors in [17] combine five different classifiers. The disassembling the executable files. For training, we simply
ensemble was compared to other combination techniques concatenated the opcode sequences to yield one long obser-
defined in the literature. This research demonstrates that vation sequence.
some combination techniques can work better than others.

A A A A
X0 X1 X2 ··· XT −1
2.2 Scores

In this section, we discuss the scoring techniques used in the

research presented in this paper. First, we discuss the three B B B B
techniques that form the basis of our research. Specifically,
we consider HMMs, a graph similarity technique, and a dis-
tance function based on simple substitution cryptanalysis. O0 O1 O2 ··· OT −1
Then we discuss SVMs, which we use to combine the three
scores. Fig. 1 Hidden Markov model

123
Author's personal copy
Support vector machines and malware detection

After training a model, we used the resulting HMM to Table 1 Opcode sequence
compute the log likelihood (per opcode) for each virus variant Number Opcode Number Opcode
in the test set and also for each program in the comparison
set. Here, the test set consists of viruses in the same family 1 CALL 11 JMP
as those used for training, while the comparison set includes 2 JMP 12 ADD
a representative sample of benign programs. We expect that 3 ADD 13 NOP
the trained model will assign higher scores to files belonging 4 SUB 14 JMP
to the virus family used to train the model. Success is based 5 NOP 15 CALL
on how well the HMM can separate viruses in the test set 6 CALL 16 CALL
from the benign programs. 7 ADD 17 CALL
8 JMP 18 ADD
9 JMP 19 JMP
2.2.2 Opcode graph similarity 10 SUB 20 SUB

As with the HMM score, the OGS score [22] is based on

extracted opcode sequences. A weighted directed graph is
their relative frequencies. Consequently, the OGS and HMM
constructed, where each distinct opcode that appears is a
scores can yield very different results.
node in the directed graph. A directed edge is inserted from a
node to each possible successor node, that is, each successor
opcode. Edge weights give the probability of the correspond- 2.2.3 Simple substitution distance
ing successor node.
Figure 2 illustrates the opcode graph corresponding to the The SSD malware score considered in [24] is based on a fast
snippet of code in Table 1. Note that the edge weights are hill climb technique known as Jakobsen’s algorithm [11],
normalized so that for any given node, the weights from all which was developed for simple substitution cipher crypt-
outgoing edges sum to one. analysis. Although the malware families considered in this
To use the OGS score for malware detection, we first paper are not encrypted, many are obfuscated, and simple
construct an opcode graph corresponding to a collection of substitution cryptanalysis can effectively undo some com-
family viruses. Then given any file that we want to score, we mon obfuscations.
construct its opcode graph. The distance between the graphs It is well known that simple substitution ciphers are weak,
is computed as the absolute sum of the differences between and that elementary statistical analysis can be used to attack
the corresponding edge weights. such ciphers. The naïve approach to simple substitution
Note that due to the normalization used, the OGS score cryptanalysis is to guess a putative key based on monograph
weights each opcode the same. This is in contrast to the HMM statistics, then decrypt the ciphertext using this putative key,
score, where opcodes are effectively weighted according to and compute a score based on some relevant language sta-
tistics. This process is slow due to the need to decrypt the
ciphertext for each modification to the key.
1/3 In Jakobsen’s algorithm, the plaintext is decrypted once,
then all subsequent modifications to the key only require
1/2 elementary matrix manipulations. This works because the
score is based entirely on language digraph statistics. In the
2/5 1/6 malware context, we use opcode digraph statistics instead
ADD CALL JMP 1/6 of language digraph statistics, but otherwise the process is
2/5 completely analogous.
1/5
The SSD score is distinct from the HMM and OGS scores.
1/2
1/4 1/2 Whereas the HMM and OGS scores rely directly on opcodes
that are extracted from malware files, the SSD score modifies
NOP
1/3 these opcodes via the “decryption” process, allowing us to
1/4
negate certain types of obfuscation.
1/2
1/2

2.2.4 Support vector machines

SUB
One of the most popular and useful machine learning tech-
Fig. 2 Opcode graph niques is the SVM. An SVM is used for binary classification.

123
Author's personal copy
T. Singh et al.

SVM is a very general technique that can be applied in a wide Fig. 3 Maximizing the margin
variety of situations. However, an SVM is not a scoring tech-
nique in the same sense as, say an HMM. A trained HMM,
for example, can be used to generate scores, which in turn
can enable us to determine a threshold for classifying input
samples. In contrast, an SVM directly generates a classifica-
tion, eliminating the intermediate step of generating a scores
to determine a threshold.
We can apply SVMs in situations where we might consider
other scoring techniques, such as HMMs. For example, in the
context of malware detection, we could train an SVM on, say,
opcodes extracted from members of a given malware family.
Then the trained SVM could be used to classify samples as
either malware—of the type that the SVM was trained to
φ
detect—or benign. =⇒
However, due to the fact that an SVM generates a clas-
sification, it is also natural to apply the technique to a set
of scores, as opposed to the raw data itself. In our context,
we apply SVMs to the HMM, OGS, and SSD scores. In this Fig. 4 Transformation from 2 to 3 dimensions
usage, we can view the resulting SVM as operating on a
“higher plane” than the HMM, OGS, and SSD scores.
Here, we discuss SVM from an intuitive level. For more input space on the left is not linearly separable, that is, no
details on SVMs many good sources are available, includ- separating hyperplane exists. But, after transforming via the
ing [4,10,19,28]. function φ, to the feature space on the right in Fig. 4, we can
SVMs are a supervised learning technique, which means easily construct a hyperplane that separates the two data sets.
that they require labeled data. That is, we must use pre- This is the essence of the so-called kernel trick.
processed data where the labels are known. Since SVMs are In practice, it is necessary to experiment with different ker-
used for binary classification, the labels will be taken to be nel functions as this choice is indeed something of a “trick”
−1 and 1. that plays a large role in the success (or not) of the technique.
The main ideas behind the SVM technique are the follow- There are a variety of standard kernel functions and we test
ing. several of these in our experiments.

• Maximize the “margin”—given labeled training data, 2.3 ROC analysis

we separate the training data using a hyperplane. In
constructing this separating hyperplane, we want to max- In this research, we use the area under the ROC curve (AUC)
imize the separation between the two classes of data in to quantify the success of the experiments. Given a scatter-
the training set. plot, an ROC curve is obtained by plotting the false positive
• Work in a higher dimensional space—we generally try rate (FPR) against the true positive rate (TPR) as the thresh-
to reduce the dimensionality of data, due to the “curse old varies through the range of data values. An AUC of 1.0
of dimensionality.” However, in the context of SVMs, implies that there exists a threshold that results in no false
it is actually beneficial to work in higher dimensions. positives or false negatives, which is the ideal case. In gen-
By moving the problem to a higher dimension, the data eral, the AUC can be interpreted as the probability that a
points tend to be more easily separated, and hence we randomly selected positive instance scores higher than a ran-
have a better chance of finding a separating hyperplane. domly selected negative instance [3]. Therefore, an AUC of
• Kernel trick—this “trick” is the process by which we 0.5 means that the binary classifier is no better than flipping
transform the data to a higher dimension. a coin. Also, an AUC of p will yield an AUC of 1 − p if
we simply flip the classification criteria and, consequently,
Figure 3 gives an example of a separating hyperplane that the AUC can always be interpreted so that it is at least
maximizes the margin. However, often we cannot construct 0.5.
such a hyperplane in the space where the data naturally lies. An example of a scatterplot and the corresponding ROC
This is where the kernel trick comes into the picture. curve is given in Fig. 5. The red circles in the scatterplot rep-
Figure 4 provides an illustration of the potential benefit resent positive instances, while the blue squares represent
of transforming to a higher dimension. In this example, the negative instances. In the context of malware classification,

123
Author's personal copy
Support vector machines and malware detection

the circles are scores for malware files, while the squares are • Harebot is a backdoor that provides remote access to the
scores for benign files. Furthermore, we assume that higher infected system. Because of its many features, it is also
scores are “better”, that is, for this particular score, posi- considered to be a rootkit [9].
tive instances are supposed to score higher than negative • Security Shield is a Trojan that claims to be anti-virus
instances. software. Security Shield reports fake virus detection
Note that if we place the threshold below the lowest point messages and attempts to coerce the users into purchas-
in the scatterplot in Fig. 5, then ing software [23].
• NGVCK is the Next Generation Virus Construction
Kit [26]. This metamorphic family has been the object
TPR = 1 and FPR = 1.
of study in several published research papers, includ-
ing [1,2,8,12,13,22,24,33].
On the other hand, if we place the threshold above the highest • Smart HDD reports various non-existent problems with
point, then the hard drive and tries to convince the user to purchase a
product to fix these “errors”. Smart HDD is named after
S.M.A.R.T., which is a legitimate tool that monitors hard
TPR = 0 and FPR = 0.
disk drives (HDDs) [25].
• Winwebsec pretends to be anti-virus software. An
Consequently, an ROC curve must always include the points infected system displays fake messages claiming mali-
(0, 0) and (1, 1). The intermediate points on the ROC curve cious activity and attempts to convince the user to pay
are determined as the threshold passes through the range of money for software to clean the supposedly infected sys-
values. For example, if we place the threshold at the yellow tem [32].
line in the scatterplot in Fig. 5, the TPR is 0.7, since 7 of the • Zbot also known as Zeus, is a Trojan horse that com-
10 positive instances are classified correctly, while the FPR promises a system by downloading configuration files or
is 0.2, since 2 of the 10 negative cases lie on the wrong side of updates. Zbot is stealth malware that attempts to hide in
the threshold. This gives us the point (0.2, 0.7) on the ROC the file system [30].
curve, which is illustrated by the black circle on the ROC • ZeroAccess is a Trojan horse that makes use of an
graph in Fig. 5. The shaded region in Fig. 5 represents the advanced rootkit to hide itself. ZeroAccess is capable
AUC, which is 0.75 in this example. of creating a new hidden file system, it can create a back-
door on the compromised system, and it can download
additional malware [31].
3 Experiments and results

In this section, we discuss our experimental design and With the exception of NGVCK, all of these malware fam-
present our results. But first we provide details on the datasets ilies were obtained from the Malicia Project [16]; see also
used in this research. [18].
Table 2 gives the number of files used from each malware
family and the benign dataset. As in the paper [33] and else-
3.1 Datasets
where, we use Cygwin utility files [5] as our representative
set of benign samples.
Our malware samples are drawn from the following malware
families.
Table 2 Datasets

1 Family Number of files

Harebot 50
NGVCK 200
TPR

Security Shield 50
Smart HDD 50
Winwebsec 200
Zbot 200
0
0 FPR 1 ZeroAccess 200
Benign 40
Fig. 5 Scatterplot and ROC curve

123
Author's personal copy
T. Singh et al.

3.2 Experimental design 3.3 Results

In all of our experiments we use 5-fold cross validation. That We first consider results for the NGVCK malware family.
is, the malware dataset under consideration is partitioned into Then we give results for experiments where we further morph
five equal-sized subsets, say, S1 , S2 , S3 , S4 , and S5 . We then the NGVCK opcode sequences, which mimics the effect of
train a model using all files in subsets S1 , S2 , S3 , and S4 , with additional morphing applied to the binaries. The resulting
the resulting model used to score subset S5 , and all samples controlled levels of morphing enable us to compare the degra-
in the representative benign set. This process is repeated four dation of the individual scores versus that of the combined
more times, with a different subset reserved for testing in each SVM score. Finally, we conduct similar experiments on sev-
“fold”. Cross-validation serves to smooth any bias in the data, eral additional malware families.
while also maximizing the number of score computations
from a given dataset.
3.3.1 NGVCK
For each experiment, this entire scoring process is repeated
three times, once for each of the three scores (HMM, OGS,
First, we give results for the NGVCK malware dataset. For
and SSD). For the HMM experiments, we use N = 2 hidden
each of the individual scores (HMM, OGS, and SSD), we
states in all cases. The OGS and SSD scores are implemented
are able to obtain ideal separation. That is, there exists a
as discussed above.
threshold for which no false positives or false negatives occur.
Recall that the SVM is applied to the HMM, OGS, and
Consequently, the ROC curve in each case yields an AUC of
SSD scores. For the SVM, we experiment with various kernel
1.0. These results for the NGVCK metamorphic family are
functions, as discussed below.
expected, since similar results were obtained for the HMM,
OGS, and SSD scores in previous research; see [22,24,33],

(a) 0 (b)
−5 1

−10
0.8
−15

−20
0.6
Score

Score

−25

−30
0.4
−35

−40 0.2
−45 Malware Malware
Benign Benign
−50 0
0 5 10 15 20 25 30 35 40 0 5 10 15 20 25 30 35 40

0.8 0.8

0.6 0.6
Score

Score

0.4 0.4

0.2 0.2
Malware Malware
Benign Benign
0 0
0 5 10 15 20 25 30 35 40 0 5 10 15 20 25 30 35 40

Fig. 6 Score scatterplots. a Hidden Markov model. b Opcode graph similarity. c Simple substitution distance. d Support vector machine

123
Author's personal copy
Support vector machines and malware detection

respectively. These results serve to validate our implementa- 1

1.0 0.92
tion. 0.86 0.85
When we apply an SVM to the HMM, OGS, and SSD 0.8
scores for NGVCK, we also obtain ideal separation. Again,

AUC
this gives us an AUC of 1.0. 0.6
Scatterplots for each of the three individual scores, 0.4
namely, HMM, OGS, and SSD, as well as the SVM results
are given in Fig. 6a through d, respectively. In each case, the 0.2
AUC is 1.0, so we omit the corresponding ROC curves.
0.0

l
ia

ia
D

ad
eu
no

R
N
ly
3.3.2 Morphed NGVCK

Po
To generate more challenge test cases, we apply additional Fig. 7 Comparison of SVM kernels (NGVCK at 80 % morphing)
morphing to the NGVCK opcode files. Specifically, we
insert opcode sequences extracted from benign files into the
1 HMM
NGVCK opcode sequences. This process, which serves to OGS
SSD
simulate the effect of a higher degree of code morphing, SVM
has been used in several previous studies, including [13,14]. 0.9

Also, by taking the morphing code from our set of benign

files, the statistical profile of the malware will tend to merge 0.8
with that of the benign set, making detection more difficult,
AUC

at least with respect to statistical-based techniques.

0.7
We experimented with various ways of inserting the dead
code and found that “block morphing” produced the largest
adverse effect on the scores, which is consistent with previous 0.6
research. Consequently, we insert a benign opcode sequence
directly into an NGVCK file in the form of a consecutive
0.5
block. The amount of code inserted is measured as a per- 0 20 40 60 80 100 120
Morphing Percentage
centage of the number of opcodes in the original NGVCK
file. For example, if an NGVCK file contains 1000 opcodes, Fig. 8 AUC at various morphing percentages (NGVCK)
then 40 % morphing means that we extracted a block of 400
consecutive opcodes from a benign file and inserted this block
into the NGVCK opcode file. HMM
OGS
Before giving our results for the morphed NGVCK exper- SSD
iments, we need to determine a kernel function for the SVM. SVM
In experimenting with several standard kernel functions, we
1.00
obtain the results in Fig. 7. For these experiments, the radial
function yields the best results. Consequently, for all subse- 0.80
quent SVM experiments given in this paper, we use the radial
kernel. 0.60
AUC

Figure 8 gives our results, in the form of line graphs, for the
0.40
morphed NGVCK experiments, at morphing rates from 0 to
120 %. We observe that the HMM score deteriorates signif- 0.20
icantly at just 10 % morphing. The OGS score only begins
to fail at 50 % morphing, while the SSD scores begins to 0.00
D

s
t

es
bo

se
iel

fail at 60 % morphing. In contrast, the SVM achieves ideal

cc
b
e

tH
Sh

we
ar

A
H

ro
in
ar
y

separation until the morphing rate reaches 100 %. Further-

rit

Ze
W
Sm
cu
Se

more, in all cases with less than perfect separation, the SVM
exceeds the results obtained for all of the individual scores. Fig. 9 AUC comparisons for Malicia families
This clearly shows the strength of the SVM as a method
for combining malware scores into a higher-level “meta
score”.

123
Author's personal copy
T. Singh et al.

(a) 1 (b) 1

0.9 0.9

0.8 0.8

0.7 0.7

0.6 0.6
AUC

AUC
0.5 0.5

0.4 0.4

0.3 0.3

0.2 HMM 0.2 HMM

OGS OGS
0.1 SSD 0.1 SSD
SVM SVM

0 20 40 60 80 100 120 140 0 20 40 60 80 100 120 140

Morphing Percentage Morphing Percentage

0.9 0.9

0.8 0.8

0.7 0.7

0.6 0.6
AUC

AUC

0.5 0.5

0.4 0.4

0.3 0.3

0.2 HMM 0.2 HMM

OGS OGS
0.1 SSD 0.1 SSD
SVM SVM

0 20 40 60 80 100 120 140 0 20 40 60 80 100 120 140

Morphing Percentage Morphing Percentage

(e) 1 (f) 1

0.9 0.9

0.8 0.8

0.7 0.7

0.6 0.6
AUC

AUC

0.5 0.5

0.4 0.4

0.3 0.3

0.2 HMM 0.2 HMM

OGS OGS
0.1 SSD 0.1 SSD
SVM SVM

0 20 40 60 80 100 120 140 0 20 40 60 80 100 120 140

Morphing Percentage Morphing Percentage

Fig. 10 AUC comparison for morphed Malicia families. a Winwebsec, b Zeroaccess, c Zbot, d Harebot, e Security Shield, f Smart HDD

123
Author's personal copy
Support vector machines and malware detection

3.3.3 Malicia malware families We also applied SVMs to the HMM, OGS, and SSD
scores. We showed that the resulting SVM classifier was
Next, we consider several additional malware families, stronger and significantly more robust than the individual
namely, Harebot, Security Shield, Smart HDD, Winwebsec, scores—in almost every case, the SVM score was better ini-
Zbot, and ZeroAccess. Since all of these were obtained from tially, and it degraded more slowly in the face of increased
the Malicia Project [16] we refer to them collectively as the morphing. This clearly demonstrates the benefit of combin-
Malicia families. ing disparate scores using a technique such as SVM.
Figure 9 presents our experimental results, in the form of a Future work could include similar experiments involving
bar graph, for the various Malicia families. Of the individual additional scores. All of the scores considered in this research
scores, we see that the HMM score consistently performs are statistical-based. Intuitively, the more distinct the features
well, with the SSD score doing well in some cases. The OGS that we measure, the more robust the resulting combined
score is the weakest of the three scores. We also observe that score. Therefore, by including a wider variety of scores, we
the SVM achieves ideal separation for all families, even in would expect to obtain an even more robust SVM score.
cases where one (or more) of the individual scores performs Examples of additional scores that could be tested include
poorly. structural and entropy-based scores [2,13], and call graph-
based scores [7], as well as dynamically extracted features
such as API call sequences [6].
3.3.4 Morphed Malicia families
The experiments reported here use a simple code morph-
ing strategy that relies entirely on dead code insertion. While
We now give experimental results where additional morphing
this is sufficient to show that the SVM is effective as a “meta-
is applied to each of the Malicia families. These experiments
score”, it would also be interesting to quantify the effect of
are analogous to those discussed in Sect. 3.3.2 for NGVCK.
other morphing strategies, as well as combinations of mor-
Recall that the corresponding NGVCK results are given in
phing strategies. Such results would give us a better insight
Fig. 8.
into the challenges presented by metamorphic malware.
The results for all of the morphed Malicia experiments
are presented in Fig. 10. As in the NGVCK experiment, the
HMM score tends to decline significantly at low morphing
rates. But unlike the NGVCK results, the OGS score gen- References
erally gives the poorest results. The SSD score is somewhat
1. Attaluri, S., McGhee, S., Stamp, M.: Profile hidden Markov models
erratic—in some cases the score actually improves at low and metamorphic virus detection. J. Comput. Virol. 5(2), 151–169
morphing rates. Overall, SVMs clearly give the best results, (2009)
although the SSD score does do slightly better in a few cases 2. Baysa, D., Low, R.M., Stamp, M.: Structural entropy and meta-
at midrange morphing rates. morphic malware. J. Comput. Virol. Hacking Tech. 9(4), 179–192
(2013)
Suppose that the AUC for a given experiment is x, where 3. Bradley, A.P.: The use of the area under the ROC curve in the
x < 0.5. Then by simply reversing the sense of the binary evaluation of machine learning algorithms. J. Pattern Recognit.
classifier, we obtain an AUC of 1 − x > 0.5. Consequently, 30(7), 1145–1159 (1997)
some of the low AUC graphs in Fig. 10, actually represent rel- 4. Cristianini, N., Shawe-Taylor, J.: An Introduction to Support Vector
Machines and Other Kernel-Based Learning Methods. Cambridge
atively strong scores, when properly interpreted. It appears University Press, London (2000)
that the SVM is able to properly interpret such scores; for 5. Cygwin. Cygwin utility files. http://www.cygwin.com/ (2015).
example, compare the graphs in Fig. 10b, c. This is entirely Accessed 21 Sept 2015
plausible based on the geometric intuition behind the SVM 6. Damodaran, A.: Combining dynamic and static analysis for mal-
ware detection. San Jose State University, Department of Computer
technique. In any case, the results in Fig. 10 provide addi- Science, Master’s Projects, Paper 391. http://scholarworks.sjsu.
tional evidence of the strength of SVMs for this particular edu/etd_projects/391 (2015). Accessed 21 Sept 2015
application. 7. Deshpande, P.: Metamorphic detection using function call graph
analysis. San Jose State University, Department of Computer Sci-
ence, Master’s Projects, Paper 336. http://scholarworks.sjsu.edu/
etd_projects/336 (2013). Accessed 21 Sept 2015
8. Deshpande, S., Park, Y., Stamp, M.: Eigenvalue analysis for meta-
4 Conclusion and future work morphic detection. J. Comput. Virol. Hacking Tech. 10(1), 53–65
(2014)
Detection of advanced malware is a challenging research 9. Harebot. http://www.pandasecurity.com/homeusers/security-info/
problem. In this paper, we investigated the effectiveness of 220319/Harebot.M (2015). Accessed 21 Sept 2015
10. Introduction to Support Vector Machines. http://fourier.eng.hmc.
HMM, OGS, and SSD techniques for detection of malware edu/e161/lectures/svm (2015). Accessed 21 Sept 2015
families. We then implemented morphing strategies that sig- 11. Jakobsen, T.: A fast method for the cryptanalysis of substitution
nificantly degraded each of these scores. ciphers. Cryptologia 19, 265–274 (1995)

123
Author's personal copy
T. Singh et al.

12. Jidigam, R.K., Austin, T.H., Stamp, M.: Singular value decompo- 24. Shanmugam, G., Low, R., Stamp, M.: Simple substitution distance
sition and metamorphic detection. J. Comput. Virol. Hacking Tech and metamorphic detection. J. Comput. Virol. Hacking Tech. 9(3),
(2015). (To appear) 159–170 (2013)
13. Lee, J., Austin, T.H., Stamp, M.: Compression-based analysis of 25. Smart HDD. http://support.kaspersky.com/viruses/rogue?qid=
metamorphic malware. Int. J. Secur. Netw (2015). (To appear) 208286454 (2015). Accessed 21 Sept 2015
14. Lin, D., Stamp, M.: Hunting for undetectable metamorphic viruses. 26. Snakebyte. Next generation virus construction kit (NGVCK).
J. Comput. Virol. 7(3), 201–214 (2011) http://vx.netlux.org/vx.php?id=tn02 (2000). Accessed 21 Sept
15. Lu, Y.B., Din, S.C., Zeng, C.F.: Using multi-feature and classifier 2015
ensembles to improve malware detection. J. C.C.I.T 32(2), 57–72 27. Stamp, M.: A revealing introduction to hidden Markov models.
(2010) http://www.cs.sjsu.edu/~stamp/RUA/HMM.pdf (2015). Accessed
16. Malicia Project. http://malicia-project.com/ (2015). Accessed 21 21 Sept 2015
Sept 2015 28. Support vector machines (SVM) introductory overview. http://
17. Menahem, E., Shabtai, A., Rokach, L., Elovici, Y.: Improving mal- www.statsoft.com/textbook/support-vector-machines (2015).
ware detection by applying multi-inducer ensemble. Comput. Stat. Accessed 21 Sept 2015
Data Anal. 53(4), 1483–1494 (2009) 29. Toderici, A.H., Stamp, M.: Chi-squared distance and metamorphic
18. Nappa, A., Zubair Rafique, M., Caballero, J.: Driving in the cloud: virus detection. J. Comput. Virol. Hacking Tech. 9(1), 1–14 (2013)
an analysis of drive-by download operations and abuse reporting. 30. Trojan.Zbot. http://www.symantec.com/security_response/
In: Proceedings of the 10th Conference on Detection of Intrusions writeup.jsp?docid=2010-011016-3514-99 (2015). Accessed 21
and Malware and Vulnerability Assessment. Berlin (2013) Sept 2015
19. Ng, A.: Support vector machines. http://cs229.stanford.edu/notes/ 31. Trojan.ZeroAccess. http://www.symantec.com/security_
cs229-notes3.pdf (2015). Accessed 21 Sept 2015 response/writeup.jsp?docid=2011-071314-0410-99 (2015).
20. Patel, M.: Similarity tests for metamorphic virus detection. San Accessed 21 Sept 2015
Jose State University, Department of Computer Science, Master’s 32. Win32/Winwebsec. http://www.microsoft.com/security/portal/
Projects, Paper 175. http://scholarworks.sjsu.edu/etd_projects/175 threat/encyclopedia/entry.aspx?Name=Win32%2fWinwebsec
(2011). Accessed 21 Sept 2015 (2015). Accessed 21 Sept 2015
21. Qin, Z., Chen, N., Zhang, Q., Di, Y.: Mobile phone viruses detection 33. Wong, W., Stamp, M.: Hunting for metamorphic engines. J. Com-
based on HMM. In: Proceedings of International Conference on put. Virol. 2(3), 211–229 (2006)
Multimedia Information Networking and Security, pp. 516–519 34. Xin, K., Li, G., Qin, Z., Zhang, Q.: Malware detection in smart-
(2011) phones using hidden Markov model. In: Proceedings of Interna-
22. Runwal, N., Low, R.M., Stamp, M.: Opcode graph similarity and tional Conference on Multimedia Information Networking and
metamorphic detection. J. Comput. Virol. 8(1–2), 37–52 (2012) Security, pp. 857–860 (2012)
23. Security Shield. http://www.symantec.com/security_response/ 35. Zhang, B., Yin, J., Hao, J., Zhang, D., Wang, S.: Malicious codes
glossary/define.jsp?letter=s&word=security-shield. Accessed 21 detection based on ensemble learning. In: Proceedings of Auto-
Sep 2015 nomic and Trusted Computing, 4th International Conference, pp.
468–477 (2007)

123

View publication stats

Context Aware Browser
No ratings yet
Context Aware Browser
24 pages
4246 Articolopubblicato
No ratings yet
4246 Articolopubblicato
9 pages
A Comparison of Feature Extraction Methods For The Classification of Dynamic Activities From Accelerometer Data
No ratings yet
A Comparison of Feature Extraction Methods For The Classification of Dynamic Activities From Accelerometer Data
10 pages
Raptor Codes For Reliable Download Delivery in Wireless Broadcast Systems
No ratings yet
Raptor Codes For Reliable Download Delivery in Wireless Broadcast Systems
7 pages
7596 ArticleText 43643 1 10 20200317
No ratings yet
7596 ArticleText 43643 1 10 20200317
3 pages
JSportsMedPhysFitness-12118 Manuscript PDF
No ratings yet
JSportsMedPhysFitness-12118 Manuscript PDF
34 pages
The Variability of Attitudinal Repeat-Rates-5
No ratings yet
The Variability of Attitudinal Repeat-Rates-5
15 pages
Fitzgerald 2020 American College of Rheumatology G
No ratings yet
Fitzgerald 2020 American College of Rheumatology G
18 pages
Relationship Between Tactical and Technical Performance in Youth Soccer Players
No ratings yet
Relationship Between Tactical and Technical Performance in Youth Soccer Players
10 pages
As Relações Entre As Matas Ciliares, Os Rios e Os Peixes
No ratings yet
As Relações Entre As Matas Ciliares, Os Rios e Os Peixes
22 pages
Contrasting Patterns of Climatic Changes During The Holocene Across
No ratings yet
Contrasting Patterns of Climatic Changes During The Holocene Across
9 pages
Tamir 2020
No ratings yet
Tamir 2020
6 pages
Neuromuscular Junction Aging A Role For Biomarkers and Exercise
No ratings yet
Neuromuscular Junction Aging A Role For Biomarkers and Exercise
11 pages
Jbossio Megavatios 306 2006
No ratings yet
Jbossio Megavatios 306 2006
7 pages
How News May Affect Markets' Complex Structure: The Case of Cambridge Analytica
No ratings yet
How News May Affect Markets' Complex Structure: The Case of Cambridge Analytica
13 pages
Consensusstatements ITI2008 IJOMIChenetal 2009
No ratings yet
Consensusstatements ITI2008 IJOMIChenetal 2009
8 pages
Fluid Circulation Determined in The Isolated Bovin
No ratings yet
Fluid Circulation Determined in The Isolated Bovin
11 pages
Amp 562128
No ratings yet
Amp 562128
39 pages
Rethinking Acculturation
No ratings yet
Rethinking Acculturation
16 pages
Design of Low Cost Easy To Fly
No ratings yet
Design of Low Cost Easy To Fly
9 pages
Review: Winery Wastewater Quality and Treatment Options in Australia
No ratings yet
Review: Winery Wastewater Quality and Treatment Options in Australia
13 pages
George Et Al-2016-Journal of Plant Nutrition and Soil Science
No ratings yet
George Et Al-2016-Journal of Plant Nutrition and Soil Science
10 pages
Pushback Tugs Fuel
No ratings yet
Pushback Tugs Fuel
5 pages
JSOMWinter 184 Canada Author Copy 1
No ratings yet
JSOMWinter 184 Canada Author Copy 1
6 pages
Borregaard Et Al-2016-Biological Reviews
No ratings yet
Borregaard Et Al-2016-Biological Reviews
25 pages
Energeticsof Schuttle Runstheeffectsofdistanceandchangeofdirection
No ratings yet
Energeticsof Schuttle Runstheeffectsofdistanceandchangeofdirection
8 pages
Magnetic Model Calibration and Distortion Compensation For Electromagnetic Tracking in A Clinical Environment
No ratings yet
Magnetic Model Calibration and Distortion Compensation For Electromagnetic Tracking in A Clinical Environment
13 pages
Analysis of Concrete-Lined Tunnels Crossing Active Faults: June 2018
No ratings yet
Analysis of Concrete-Lined Tunnels Crossing Active Faults: June 2018
9 pages
Schmitt Branscombe Postmes Garcia 2014
No ratings yet
Schmitt Branscombe Postmes Garcia 2014
29 pages
Study of Driver PerformanceAcceptance Using Aspheric Mirrors in Light Vehicle Applications
No ratings yet
Study of Driver PerformanceAcceptance Using Aspheric Mirrors in Light Vehicle Applications
279 pages
14 Malone IJSPP 2014-0352 489 497
No ratings yet
14 Malone IJSPP 2014-0352 489 497
10 pages
Dispositional Mindfulness As A Positive Predictor of Psychological Well Being and The Role of The Private Self Consciousness Insight Factor
No ratings yet
Dispositional Mindfulness As A Positive Predictor of Psychological Well Being and The Role of The Private Self Consciousness Insight Factor
5 pages
Camarchia Et Al - The Doherty Power Amplifier Review of Recent Solutions and Trends 2015
No ratings yet
Camarchia Et Al - The Doherty Power Amplifier Review of Recent Solutions and Trends 2015
14 pages
Reg.1
No ratings yet
Reg.1
11 pages
Jenkinsetal2013PNAS Earlyedition
No ratings yet
Jenkinsetal2013PNAS Earlyedition
11 pages
Bipolar Depression: Overview and Commentary: Harvard Review of Psychiatry June 2010
No ratings yet
Bipolar Depression: Overview and Commentary: Harvard Review of Psychiatry June 2010
17 pages
Abio - Biocomposites Sailboat
No ratings yet
Abio - Biocomposites Sailboat
15 pages
ADeep Diveinto Brand Boycotts
No ratings yet
ADeep Diveinto Brand Boycotts
12 pages
Usability of A Smartphone Application To Support T
No ratings yet
Usability of A Smartphone Application To Support T
13 pages
The Case of Sarno River (Southern Italy) - Effects of Geomorphology On The Environmental Impacts (8 PP)
No ratings yet
The Case of Sarno River (Southern Italy) - Effects of Geomorphology On The Environmental Impacts (8 PP)
9 pages
CH 32
No ratings yet
CH 32
10 pages
MAP A Scalable Measurement Infrastructure For Secu
No ratings yet
MAP A Scalable Measurement Infrastructure For Secu
10 pages
Seismotectonics of The 6 February 2012 MW 6.7 Negros Earthquake, Central Philippines
No ratings yet
Seismotectonics of The 6 February 2012 MW 6.7 Negros Earthquake, Central Philippines
17 pages
Abba e Tal 2012-Mammalia
No ratings yet
Abba e Tal 2012-Mammalia
15 pages
Consumer Perceptions of Sustainable Products: A Systematic Literature Review
No ratings yet
Consumer Perceptions of Sustainable Products: A Systematic Literature Review
19 pages
Steady State and Transient Short Circuit Analysis
No ratings yet
Steady State and Transient Short Circuit Analysis
10 pages
Cryogenic Rocket Engine Development at Delft Aerospace Rocket Engineering
No ratings yet
Cryogenic Rocket Engine Development at Delft Aerospace Rocket Engineering
13 pages
Strength Trainingfor Endurance Athletes Theoryto Practice
No ratings yet
Strength Trainingfor Endurance Athletes Theoryto Practice
13 pages
Field Assessment of Surface Water-Groundwater Connectivity in A Semi Arid River Basin (Murray-Darling, Australia)
No ratings yet
Field Assessment of Surface Water-Groundwater Connectivity in A Semi Arid River Basin (Murray-Darling, Australia)
13 pages
Digitspan June232010
No ratings yet
Digitspan June232010
8 pages
Snell 2017
No ratings yet
Snell 2017
11 pages
Planning For Sustainable Utility Infrastructure: Urban Design and Planning January 2009
No ratings yet
Planning For Sustainable Utility Infrastructure: Urban Design and Planning January 2009
16 pages
Pagano Fontanella Sica Desideri2010SF
No ratings yet
Pagano Fontanella Sica Desideri2010SF
14 pages
Aerobic Exercise Effect On Neuroplasticity Post Stroke (2014)
No ratings yet
Aerobic Exercise Effect On Neuroplasticity Post Stroke (2014)
17 pages
Healthcare 11 01470
No ratings yet
Healthcare 11 01470
20 pages
Techniques On Vertical Ridge Augmentation: Indications and Effectiveness
No ratings yet
Techniques On Vertical Ridge Augmentation: Indications and Effectiveness
31 pages
International Standards For Neurological and Funct
No ratings yet
International Standards For Neurological and Funct
10 pages
A Survey of Credit Card Fraud Detection Techniques
No ratings yet
A Survey of Credit Card Fraud Detection Techniques
27 pages
Url Safety Checker
No ratings yet
Url Safety Checker
18 pages
Security Threats of URL Shortening: A User's Perspective: Journal of Advances in Computer Networks September 2015
No ratings yet
Security Threats of URL Shortening: A User's Perspective: Journal of Advances in Computer Networks September 2015
8 pages
Performance Analysis of Google Colaboratory As A T
No ratings yet
Performance Analysis of Google Colaboratory As A T
9 pages
6TH Material English
No ratings yet
6TH Material English
14 pages
30 Deep Learningtodetectskincancerusing Google Colab
No ratings yet
30 Deep Learningtodetectskincancerusing Google Colab
9 pages
Boe 13 4 1985
No ratings yet
Boe 13 4 1985
10 pages
Heuristics and Decentralized Planning
No ratings yet
Heuristics and Decentralized Planning
9 pages
Football Analytics
No ratings yet
Football Analytics
15 pages
Packer 2016 CLAMMS
No ratings yet
Packer 2016 CLAMMS
3 pages
Lecture Slides For: Ethem Alpaydin © The MIT Press, 2010
No ratings yet
Lecture Slides For: Ethem Alpaydin © The MIT Press, 2010
30 pages
A Real-Time ASL Recognition System Using Leap Motion Sensors
No ratings yet
A Real-Time ASL Recognition System Using Leap Motion Sensors
4 pages
Undergraduate Fundamentals of Machine Learning Author William J. Deuschle
No ratings yet
Undergraduate Fundamentals of Machine Learning Author William J. Deuschle
143 pages
(Series On Language Processing Pattern Recognition and Intelligent Systems Vol. 4) Neamat El Gayar, Ching Y. Suen - Computational Linguistics, Speech and Image Processing For Arabic Language-World Sci
No ratings yet
(Series On Language Processing Pattern Recognition and Intelligent Systems Vol. 4) Neamat El Gayar, Ching Y. Suen - Computational Linguistics, Speech and Image Processing For Arabic Language-World Sci
286 pages
Lecture 1
No ratings yet
Lecture 1
32 pages
Chen and Tsou PDF
No ratings yet
Chen and Tsou PDF
16 pages
Intelligent Salat Posture Monitoring
No ratings yet
Intelligent Salat Posture Monitoring
12 pages
EBSCO-FullText-07 01 2025
No ratings yet
EBSCO-FullText-07 01 2025
33 pages
Bayes Theorem for ML Enthusiasts
No ratings yet
Bayes Theorem for ML Enthusiasts
37 pages
Introduction To Artificial Intelligence QA 2
No ratings yet
Introduction To Artificial Intelligence QA 2
29 pages
Stock Forecasting with HMM & Fuzzy Logic
No ratings yet
Stock Forecasting with HMM & Fuzzy Logic
8 pages
Context-Free Grammar Explained
No ratings yet
Context-Free Grammar Explained
14 pages
Automatic Trading System
No ratings yet
Automatic Trading System
21 pages
Assignment 3
No ratings yet
Assignment 3
12 pages
NERHMM: A Tool For Named Entity Recognition Based On Hidden Markov Model
No ratings yet
NERHMM: A Tool For Named Entity Recognition Based On Hidden Markov Model
7 pages
"Speech Recognition and Voice Detection System": Bachlor of Technology IN Computer Science Engineering
No ratings yet
"Speech Recognition and Voice Detection System": Bachlor of Technology IN Computer Science Engineering
29 pages
Human Motion Analysis - A Review
No ratings yet
Human Motion Analysis - A Review
13 pages
Unit 2 NLP
No ratings yet
Unit 2 NLP
5 pages
Cis262 HMM
No ratings yet
Cis262 HMM
34 pages
NLP Sem Questions and Answers
100% (1)
NLP Sem Questions and Answers
72 pages
NLP Exercises & Solutions
No ratings yet
NLP Exercises & Solutions
47 pages
AI - 4th Unit
No ratings yet
AI - 4th Unit
19 pages
Literature Review On Automatic Speech Recognition
No ratings yet
Literature Review On Automatic Speech Recognition
7 pages
AI & ML Honours Course Overview
No ratings yet
AI & ML Honours Course Overview
16 pages
Anomaly Detection in Smart Home Operation From User Behaviors and Home Conditions
No ratings yet
Anomaly Detection in Smart Home Operation From User Behaviors and Home Conditions
10 pages
An Introduction To IoT Analytics 1st Edition Harry G Perros Instant Download
No ratings yet
An Introduction To IoT Analytics 1st Edition Harry G Perros Instant Download
133 pages
Automatic Chord Recognition Using Hidden Markov Models: Haverford College
100% (2)
Automatic Chord Recognition Using Hidden Markov Models: Haverford College
22 pages

SVM Malware Detection

Uploaded by

SVM Malware Detection

Uploaded by

See discussions, stats, and author profiles for this publication at: https://www.researchgate.

Support vector machines and malware detection

Article in Journal of Computer Virology and Hacking Techniques · November 2016

Fabio Di Troia Corrado Aaron Visaggio

SEE PROFILE SEE PROFILE

Thomas H. Austin Mark Stamp

SEE PROFILE SEE PROFILE

The user has requested enhancement of the downloaded file.

Tanuvir Singh, Fabio Di Troia, Visaggio

Journal of Computer Virology and

J Comput Virol Hack Tech

Support vector machines and malware detection

Received: 9 July 2015 / Accepted: 12 September 2015

In this section, we discuss the scoring techniques used in the

As with the HMM score, the OGS score [22] is based on

2.2.4 Support vector machines

• Maximize the “margin”—given labeled training data, 2.3 ROC analysis

1 Family Number of files

3.2 Experimental design 3.3 Results

respectively. These results serve to validate our implementa- 1

Also, by taking the morphing code from our set of benign

at least with respect to statistical-based techniques.

fail at 60 % morphing. In contrast, the SVM achieves ideal

separation until the morphing rate reaches 100 %. Further-

0.2 HMM 0.2 HMM

0 20 40 60 80 100 120 140 0 20 40 60 80 100 120 140

0.2 HMM 0.2 HMM

0 20 40 60 80 100 120 140 0 20 40 60 80 100 120 140

0.2 HMM 0.2 HMM

0 20 40 60 80 100 120 140 0 20 40 60 80 100 120 140

View publication stats

You might also like