
CN109192217A - General information hiding detection method for multi-class low-rate compressed speech steganography - Google Patents

General information hiding detection method for multi-class low-rate compressed speech steganography

Info

Publication number
CN109192217A
CN109192217A (application CN201810884205.9A)
Authority
CN
China
Prior art keywords
acl
steganography
symbol
frame
network
Prior art date
Legal status
Granted
Application number
CN201810884205.9A
Other languages
Chinese (zh)
Other versions
CN109192217B (en)
Inventor
李松斌
刘鹏
杨洁
Current Assignee
Research Station Of South China Sea Institute Of Acoustics Chinese Academy Of Sciences
Institute of Acoustics CAS
Original Assignee
Research Station Of South China Sea Institute Of Acoustics Chinese Academy Of Sciences
Institute of Acoustics CAS
Priority date
Filing date
Publication date
Application filed by Research Station Of South China Sea Institute Of Acoustics Chinese Academy Of Sciences and Institute of Acoustics CAS
Priority to CN201810884205.9A priority Critical patent/CN109192217B/en
Publication of CN109192217A publication Critical patent/CN109192217A/en
Application granted granted Critical
Publication of CN109192217B publication Critical patent/CN109192217B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Links

Classifications

    • G - PHYSICS
    • G10 - MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L - SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L19/00 - Speech or audio signals analysis-synthesis techniques for redundancy reduction, e.g. in vocoders; Coding or decoding of speech or audio signals, using source filter models or psychoacoustic analysis
    • G10L19/0018 - Speech coding using phonetic or linguistical decoding of the source; Reconstruction using text-to-speech synthesis
    • G10L19/04 - Speech or audio signals analysis-synthesis techniques for redundancy reduction, using predictive techniques
    • G10L19/16 - Vocoder architecture
    • G10L19/18 - Vocoders using multiple modes
    • G10L19/24 - Variable rate codecs, e.g. for generating different qualities using a scalable representation such as hierarchical encoding or layered encoding

Landscapes

  • Engineering & Computer Science (AREA)
  • Computational Linguistics (AREA)
  • Signal Processing (AREA)
  • Health & Medical Sciences (AREA)
  • Audiology, Speech & Language Pathology (AREA)
  • Human Computer Interaction (AREA)
  • Physics & Mathematics (AREA)
  • Acoustics & Sound (AREA)
  • Multimedia (AREA)
  • Quality & Reliability (AREA)
  • Compression, Expansion, Code Conversion, And Decoders (AREA)

Abstract


The invention discloses a general information hiding detection method for multi-class low-rate compressed speech steganography, used to detect bitstreams produced by the G.723.1 low-rate speech encoder. The method comprises the following steps. Step 1) based on the code elements in the low-rate compressed speech bitstream, construct a code element spatiotemporal correlation network (CESN) describing the intra-frame and inter-frame correlations of all code elements. Step 2) use the strongly correlated edges of the CESN to further construct a code element Bayesian network (CEBN) for steganalysis. Step 3) learn the network parameters of the CEBN from training samples and obtain a threshold J_thr. Step 4) for a given speech segment of N frames, compute the speech steganography index J from the CEBN parameters; when J < J_thr, the segment is judged to be a non-steganographic speech segment, and when J ≥ J_thr, it is judged to be a steganographic speech segment. The general steganalysis capability of the method of the invention is clearly higher than that of the general steganalysis method based on uncompressed-domain MFCC features.

Description

General information hiding detection method for multi-class low-rate compressed speech steganography
Technical field
The present invention relates to the field of information security technology, and in particular to a general information hiding detection method for multi-class low-rate compressed speech steganography.
Background technique
VoIP (Voice over Internet Protocol) speech information hiding is a technique that embeds secret information into a VoIP voice carrier so that the secret information is difficult for a regulator to discover. VoIP systems use multiple speech coders, including G.711, G.726, G.728, G.729, G.723.1, iLBC, AMR and their derived versions. These coders can be roughly divided into two classes: high-rate waveform coders based on pulse code modulation (Pulse Code Modulation, PCM), typified by G.711; and medium/low-rate hybrid coders based on analysis-by-synthesis linear predictive coding (Analysis by Synthesis-Linear Predictive Coding, AbS-LPC), typified by G.723.1, G.729 and AMR. Since high-rate speech coders are rarely used in actual VoIP systems, low-rate coders are mostly adopted to save bandwidth. At present, common speech information hiding methods embed the secret information in AbS-LPC low-rate compressed speech coding. According to the existing literature, information hiding methods based on AbS-LPC low-rate compressed speech coders can be divided into three classes according to the embedding position: the first class hides information in the pitch synthesis filter; the second class hides information in the LPC synthesis filter; the third class hides information by directly modifying code elements in the compressed speech bitstream.
Information hiding detection technology, whose purpose is to detect whether hidden information exists in a speech bitstream, is also called speech steganalysis. It has developed in step with speech information hiding; the two oppose and promote each other. Existing AbS-LPC low-rate compressed speech steganalysis methods are likewise divided into three classes according to the targeted hiding method: the first class targets information hiding based on the pitch synthesis filter; the second class targets information hiding based on the LPC synthesis filter; the third class targets information hiding by code element modification in the speech bitstream.
The above three classes of AbS-LPC low-rate compressed speech steganalysis methods mainly comprise three steps: first, various features are extracted from the AbS-LPC low-rate compressed speech; then feature fusion is performed, with feature selection or dimensionality reduction if the feature dimension is too high; finally the processed features are used as feature vectors to train a classifier with a support vector machine (Support Vector Machine, SVM), realizing AbS-LPC low-rate compressed speech steganalysis. These methods have two main characteristics: first, they follow the "feature extraction, feature processing, SVM classification" framework; second, they are designed to detect a specific information hiding method and only use specific coding information or specific code element information during feature extraction, so they cannot perform general steganalysis. To perform general steganalysis, multiple steganalysis methods would have to be combined, whose complexity is too high to meet application requirements. Some steganalysis methods extract speech features in the uncompressed domain, such as Mel-cepstral features; such methods are general steganalysis methods and can also be used for AbS-LPC low-rate speech coding steganalysis. However, they are mainly designed for steganalysis of uncompressed speech files and do not exploit the AbS-LPC coding principle, so their detection accuracy on AbS-LPC low-rate compressed speech is relatively low and their generalization ability is weak. An efficient general steganalysis method for AbS-LPC low-rate compressed speech is therefore needed.
Summary of the invention
The object of the invention is to overcome the above technical deficiencies and propose a general information hiding detection method for multi-class low-rate compressed speech steganography, used to detect the coding of the G.723.1 low-rate speech encoder. The method comprises the following steps:
Step 1) based on the code elements in the low-rate compressed speech bitstream, construct a code element spatiotemporal correlation network CESN describing the intra-frame and inter-frame correlations of all code elements;
Step 2) use the strongly correlated edges of the CESN to further construct a code element Bayesian network CEBN for steganalysis;
Step 3) learn the network parameters of the CEBN from training samples and obtain a threshold J_thr;
Step 4) for a given speech segment of N frames, compute the speech steganography index J from the CEBN parameters; when J < J_thr, the segment is judged to be a non-steganographic speech segment; when J ≥ J_thr, it is judged to be a steganographic speech segment.
As an improvement of the above method, step 1) specifically comprises:
Step 1-1) each frame of the bitstream of the G.723.1 low-rate speech encoder consists of 24 code elements: 3 LPC vector quantization indices VQ_1, VQ_2 and VQ_3; 4 adaptive codebook delays ACL_0, ACL_1, ACL_2 and ACL_3; 4 combined gains GAIN_0, GAIN_1, GAIN_2 and GAIN_3; 5 pulse position indices POS_0, POS_1, POS_2, POS_3 and MSBPOS; 4 pulse sign indices PSIG_0, PSIG_1, PSIG_2 and PSIG_3; and 4 grid indices GRID_0, GRID_1, GRID_2 and GRID_3;
Step 1-2) denote the 24 code elements as C_i, i = 1, 2, ..., 24, and construct a directed graph with the C_i as vertices and the correlations within and between frames as edges, denoted D = <V, E>, where V is the set of vertices of D, with v_i[m] denoting the i-th code element of frame m, and E is the set of edges of D, each edge being a directed edge from vertex v_i[p] to vertex v_j[q]; an edge with p = q and i ≠ j is an intra-frame edge, and an edge with q > p is an inter-frame edge; only correlations with q - p ∈ {0, 1} are analyzed, so a frame of 24 code elements yields 276 intra-frame edges and 576 inter-frame edges, 852 edges in total;
Step 1-3) compute the relative index I_c(C_i, C_j) of code elements C_i and C_j, where r_i and r_j denote the maximum values of C_i and C_j, p(C_i = c_i) and p(C_j = c_j) denote the marginal probabilities of C_i = c_i and C_j = c_j, and p(C_i = c_i, C_j = c_j) denotes their joint probability; if C_i and C_j are mutually independent, then p(C_i = c_i)p(C_j = c_j) = p(C_i = c_i, C_j = c_j) holds for arbitrary c_i and c_j, and I_c(C_i, C_j) = 0;
Step 1-4) among the 852 edges, retain the strong edges with I_c(C_i, C_j) > 0.5; the code element spatiotemporal correlation network CESN then contains 4 intra-frame strong edges and 7 inter-frame strong edges, 11 edges in total; the intra-frame strong edges are ACL_0-ACL_2, VQ_1-VQ_2, VQ_1-VQ_3 and VQ_2-VQ_3; the inter-frame strong edges are ACL_0-ACL_0, ACL_0-ACL_2, ACL_2-ACL_0, ACL_2-ACL_2, VQ_1-VQ_1, VQ_2-VQ_2 and VQ_3-VQ_3.
As an improvement of the above method, step 2) specifically comprises:
Step 2-1) take the speech frame class as the root node O; the speech frame class is either non-steganographic, denoted 0, or steganographic, denoted 1;
Step 2-2) take the 24 code element values as child nodes of the root node O; merge the four grid index nodes GRID_0, GRID_1, GRID_2 and GRID_3 into a single node, denoted GRID, with a bit allocation of 4; merge the code elements ACL_1 and ACL_3 into a single node, denoted ACL, with a bit allocation of 4; after merging there are 20 child nodes, and directed edges from O to the 20 nodes are created;
Step 2-3) from the intra-frame strong edges ACL_0-ACL_2, VQ_1-VQ_2, VQ_1-VQ_3 and VQ_2-VQ_3 obtained in step 1-4), delete the directed edge VQ_1-VQ_3 from VQ_1 to VQ_3;
Step 2-4) take the values in the adjacent frame as child nodes ACL_0', ACL_2', VQ_1', VQ_2', VQ_3' of the nodes ACL_0, ACL_2, VQ_1, VQ_2, VQ_3 respectively, creating the seven directed edges ACL_0 to ACL_0', ACL_0 to ACL_2', ACL_2 to ACL_2', ACL_2 to ACL_0', VQ_1 to VQ_1', VQ_2 to VQ_2', VQ_3 to VQ_3'; remove the two edges ACL_0 to ACL_2' and ACL_2 to ACL_0'; add the directed edges O to ACL_0', O to ACL_2', O to VQ_1', O to VQ_2', O to VQ_3' to the network;
Step 2-5) the finally constructed code element Bayesian network CEBN is a multi-layer network consisting of 26 nodes that contains all code element information, among which the 15 nodes GAIN_0, GAIN_1, GAIN_2, GAIN_3, POS_0, POS_1, POS_2, POS_3, MSBPOS, PSIG_0, PSIG_1, PSIG_2, PSIG_3, ACL and GRID are isolated nodes.
As an improvement of the above method, step 3) specifically comprises:
Step 3-1) denote the nodes of the code element Bayesian network CEBN by the random variables X_1, X_2, ..., X_26, where X_1 corresponds to the root node O and the other random variables correspond to the child nodes; denote the values of the random variables by x_1, x_2, ..., x_26; the joint probability distribution of the network is then the product over all 26 nodes of the conditional probabilities P(X_i | Pa(X_i)), where Pa(X_i) denotes the parent nodes of the random variable X_i;
Step 3-2) let the random variable X_i have K_i possible values, and let θ_ijk denote the conditional probability that X_i takes its k-th value when Pa(X_i) takes its j-th value; then θ_ijk is expressed as:
θ_ijk = P(X_i = x_ik | Pa(X_i) = Pa(X_i)_j)
Step 3-3) learn the values of the parameters θ_ijk of the code element Bayesian network CEBN by Bayesian network parameter learning;
Step 3-4) compute the threshold J_thr from the training samples.
As an improvement of the above method, step 3-3) specifically comprises:
Step 3-3-1) Bayesian network parameter learning combines the prior distribution π(θ) over the parameter θ with the sample information χ to obtain the posterior distribution π(θ | χ); the prior distribution π(θ) is a Dirichlet distribution Dir(·) with hyperparameters α_ijk, 1 ≤ k ≤ K_i, whose density is written in terms of the gamma function Γ(·);
Step 3-3-2) let β_ijk be the number of samples in χ satisfying X_i = x_ik and Pa(X_i) = Pa(X_i)_j; since the posterior distribution π(θ | χ) also obeys a Dirichlet distribution, it is determined by the hyperparameters α_ijk and the counts β_ijk;
Step 3-3-3) the maximum a posteriori estimate of the parameter θ_ijk of the network CEBN is obtained from this posterior.
As an improvement of the above method, step 3-4) specifically comprises:
Step 3-4-1) assume the training samples contain M speech segments, and compute for each segment the steganography index in the non-steganographic case and the steganography index in the steganographic case; the steganography indices of all training samples in the non-steganographic case form the set J_C = {J_c1, J_c2, ..., J_cM}, and those in the steganographic case form the set J_S = {J_s1, J_s2, ..., J_sM};
Step 3-4-2) the threshold J_thr is chosen to maximize CNT(J_C : J_cl < J_thr) + CNT(J_S : J_sl ≥ J_thr), where CNT(J_C : J_cl < J_thr) and CNT(J_S : J_sl ≥ J_thr) denote, respectively, the number of non-steganography indices in J_C satisfying J_cl < J_thr and the number of steganography indices in J_S satisfying J_sl ≥ J_thr, 1 ≤ l ≤ M.
As an improvement of the above method, step 4) specifically comprises:
Step 4-1) for a given speech segment of N frames, compute the speech steganography index measuring the overall degree of steganography of the segment, J = N_stego / N, where N_stego denotes the number of frames in the segment judged to be steganographic;
each speech frame is classified by bottom-up diagnostic reasoning, i.e. the probability of the parent node is computed from the known distributions of the child nodes: for x_1 = 0 and x_1 = 1, the resulting values are the posterior probabilities that the speech frame is a non-steganographic frame and a steganographic frame, respectively, given the observed values X_i = x_i, i = 2, 3, ..., 26; if the probability of x_1 = 1 is greater than that of x_1 = 0, the frame is considered steganographic, otherwise non-steganographic; N_stego is obtained in this way;
Step 4-2) when J < J_thr, the segment is judged to be non-steganographic speech; when J ≥ J_thr, it is judged to be steganographic speech.
The invention has the following advantages:
1. The general steganalysis capability of the method of the invention is clearly higher than that of the general steganalysis method based on uncompressed-domain MFCC features;
2. When the speech duration is short, the advantage of the method of the invention over traditional feature-extraction-plus-SVM steganography detection methods is even more obvious;
3. The method of the invention still has a high general steganalysis capability when the network complexity is low;
4. The method of the invention can be used both as a dedicated steganalysis method and as a general steganalysis method, whereas current dedicated steganalysis methods can only detect part of the steganography methods.
Description of the drawings
Fig. 1 is a diagram of the code element spatiotemporal correlation network proposed by the invention;
Fig. 2 is the code element Bayesian network proposed by the invention.
Specific embodiment
Analysis-by-synthesis linear predictive coding (Analysis by Synthesis-Linear Predictive Coding, AbS-LPC) is widely used in a variety of low-rate speech coders. Existing AbS-LPC low-rate compressed speech steganography detection methods are designed for specific types of steganography and generalize poorly. The invention therefore proposes a general information hiding detection method for multi-class low-rate compressed speech steganography. The code elements in an AbS-LPC low-rate compressed speech bitstream exhibit spatiotemporal correlations, and every AbS-LPC low-rate compressed speech steganography method ultimately changes code element values. Starting from the code elements, the invention first builds a code element spatiotemporal correlation network, then uses the strongly correlated edges of that network to build a code element Bayesian network, learns the network parameters with a Dirichlet distribution as the prior, and performs general steganography detection of multi-class low-rate compressed speech steganography by Bayesian inference.
The invention is further described below with reference to the drawings and specific embodiments.
The invention proposes a general information hiding detection method for multi-class low-rate compressed speech steganography; the specific steps of the method are as follows:
Step S1) based on the code elements in the low-rate compressed speech bitstream, construct a network of the intra-frame and inter-frame correlations of all code elements, called the code element spatiotemporal correlation network (Code Element Spatiotemporal Network, CESN);
Step S1) specifically comprises:
Step S1-1) each frame of the bitstream of the G.723.1 low-rate speech encoder consists of 24 code elements: the LPC vector quantization indices VQ_1, VQ_2, VQ_3; the adaptive codebook delays ACL_0, ACL_1, ACL_2, ACL_3; the combined gains GAIN_0, GAIN_1, GAIN_2, GAIN_3; the pulse position indices POS_0, POS_1, POS_2, POS_3, MSBPOS; the pulse sign indices PSIG_0, PSIG_1, PSIG_2, PSIG_3; and the grid indices GRID_0, GRID_1, GRID_2, GRID_3. Denote the 24 code elements as C_i (i = 1, 2, ..., 24) and construct a directed graph with the C_i as vertices and the correlations within and between frames as edges, as shown in Fig. 1, denoted D = <V, E>, where V is the set of vertices of D, with v_i[m] denoting the i-th code element of frame m, and E is the set of edges of D, each edge being a directed edge from vertex v_i[p] to vertex v_j[q]. An edge with p = q and i ≠ j is an intra-frame edge; an edge with q > p is an inter-frame edge. Since the strength of the inter-frame correlation decays with time (the larger the frame gap, the weaker the correlation), only the correlations between adjacent frames are considered for simplicity, i.e. only edges with q - p ∈ {0, 1} are analyzed. A frame of 24 code elements therefore yields 276 intra-frame edges and 576 inter-frame edges, 852 edges in total.
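For illustration only, the following minimal Python sketch enumerates the candidate edges of the CESN for a pair of adjacent frames; the list of code element names and the assumption that the code elements have already been parsed out of the G.723.1 bitstream are illustrative choices, not details taken from the patent.

```python
from itertools import combinations, product

# Hypothetical name list for the 24 code elements of one G.723.1 frame
# (extracting them from an actual bitstream is assumed to be done elsewhere).
CODE_ELEMENTS = (
    ["VQ1", "VQ2", "VQ3"]
    + [f"ACL{k}" for k in range(4)]
    + [f"GAIN{k}" for k in range(4)]
    + [f"POS{k}" for k in range(4)] + ["MSBPOS"]
    + [f"PSIG{k}" for k in range(4)]
    + [f"GRID{k}" for k in range(4)]
)
assert len(CODE_ELEMENTS) == 24

def candidate_edges():
    """Candidate CESN edges for a pair of adjacent frames.

    intra: pairs of different code elements inside the same frame (24 choose 2 = 276)
    inter: element in frame m paired with any element in frame m+1 (24 * 24 = 576)
    """
    intra = list(combinations(CODE_ELEMENTS, 2))
    inter = list(product(CODE_ELEMENTS, CODE_ELEMENTS))
    return intra, inter

intra, inter = candidate_edges()
print(len(intra), len(inter), len(intra) + len(inter))  # 276 576 852
```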
Step S1-2) the CESN contains too many edges to analyze directly. Since the strength of the correlation differs from edge to edge, the weakly correlated edges can be removed and only the strongly correlated edges retained to simplify the CESN. To quantify the strength of the correlation between code elements, the invention defines the relative index I_c(C_i, C_j) of code elements C_i and C_j, where r_i and r_j denote the maximum values of C_i and C_j, p(C_i = c_i) and p(C_j = c_j) denote the marginal probabilities of C_i = c_i and C_j = c_j, and p(C_i = c_i, C_j = c_j) denotes their joint probability. If C_i and C_j are mutually independent, then p(C_i = c_i)p(C_j = c_j) = p(C_i = c_i, C_j = c_j) holds for arbitrary c_i and c_j, and I_c(C_i, C_j) = 0; the stronger the correlation between C_i and C_j, the larger I_c.
Using 0.5 as the threshold on I_c, the 852 edges are classified and only the strong edges with I_c > 0.5 are retained. The simplified CESN contains 4 intra-frame strong edges and 7 inter-frame strong edges, 11 edges in total. The intra-frame strong edges are ACL_0-ACL_2, VQ_1-VQ_2, VQ_1-VQ_3 and VQ_2-VQ_3; the inter-frame strong edges are ACL_0-ACL_0, ACL_0-ACL_2, ACL_2-ACL_0, ACL_2-ACL_2, VQ_1-VQ_1, VQ_2-VQ_2 and VQ_3-VQ_3.
The statistics show that the correlations between code elements of different types are weak while those between code elements of the same type are strong: VQ_1, VQ_2 and VQ_3 have strong spatiotemporal correlations, as do ACL_0 and ACL_2. This is because the LPC vector quantization indices VQ_1, VQ_2, VQ_3 result from the short-term analysis of the speech signal, while the adaptive codebook delays ACL_0 and ACL_2 result from the long-term analysis, i.e. the pitch period, so their correlations are strong. The code elements ACL_1 and ACL_3 represent the differences from the previous subframe and are determined by ACL_0 and ACL_2 respectively, so their correlations are weaker. The remaining code elements represent the residual signal after short-term and long-term prediction and show no obvious spatiotemporal correlation.
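The defining formula of I_c is not reproduced in this text; as a rough stand-in, the sketch below uses the mutual information between two code element value sequences, normalized by log(min(r_i, r_j)) so that independent code elements give 0. The normalization and the sequence interface are assumptions, not the patent's exact definition.

```python
import numpy as np

def relative_index(xs, ys, r_x, r_y):
    """Mutual-information-style correlation index of two code element sequences.

    xs, ys -- equally long integer sequences with values in [0, r_x) and [0, r_y);
              for an inter-frame edge, ys should already be shifted by one frame.
    r_x, r_y -- numbers of possible values (the maximum values r_i, r_j above).
    This sketch divides by log(min(r_x, r_y)) so the index is 0 for independent
    code elements and at most 1; the patent's exact normalization may differ.
    """
    joint = np.zeros((r_x, r_y))
    for x, y in zip(xs, ys):
        joint[x, y] += 1
    joint /= joint.sum()
    px, py = joint.sum(axis=1), joint.sum(axis=0)
    mi = 0.0
    for a in range(r_x):
        for b in range(r_y):
            if joint[a, b] > 0:
                mi += joint[a, b] * np.log(joint[a, b] / (px[a] * py[b]))
    return mi / np.log(min(r_x, r_y))

def strong_edges(sequences, ranges, candidate_pairs, threshold=0.5):
    """Keep the candidate edges whose relative index exceeds the threshold (0.5 here)."""
    return [(a, b) for a, b in candidate_pairs
            if relative_index(sequences[a], sequences[b], ranges[a], ranges[b]) > threshold]
```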
Step S2) use the strongly correlated edges of the code element spatiotemporal correlation network CESN to further construct the code element Bayesian network (Code Element Bayesian Network, CEBN) for steganalysis;
Step S2) specifically comprises:
Step S2-1) take the speech frame class as the root node O; there are two classes, non-steganographic (denoted 0) and steganographic (denoted 1).
Step S2-2) take the 24 code element values as child nodes of the root node O. Since the grid indices GRID_0, GRID_1, GRID_2, GRID_3 are weakly correlated and each accounts for only a very small share of the bitstream (1 bit each), these 4 nodes are merged into a single node, denoted GRID, with a bit allocation of 4. Similarly, the code elements ACL_1 and ACL_3 are merged into a single node, denoted ACL, with a bit allocation of 4. After merging there are 20 child nodes, and directed edges from O to the 20 nodes are created. The values of each node are binned in the same way as in the correlation analysis: the values of a code element whose range exceeds T are divided into T intervals.
Step S2-3) from the intra-frame strong edges ACL_0-ACL_2, VQ_1-VQ_2, VQ_1-VQ_3, VQ_2-VQ_3, construct the four directed edges ACL_0 to ACL_2, VQ_1 to VQ_2, VQ_1 to VQ_3, VQ_2 to VQ_3. Since the relative index of VQ_2-VQ_3 is larger than that of VQ_1-VQ_3, i.e. the value of VQ_3 is influenced more by VQ_2 than by VQ_1, the directed edge VQ_1 to VQ_3 is removed.
Step S2-4) from the inter-frame strong edges ACL_0-ACL_0, ACL_0-ACL_2, ACL_2-ACL_0, ACL_2-ACL_2, VQ_1-VQ_1, VQ_2-VQ_2, VQ_3-VQ_3, take the values in the adjacent frame as child nodes ACL_0', ACL_2', VQ_1', VQ_2', VQ_3' of the nodes ACL_0, ACL_2, VQ_1, VQ_2, VQ_3 respectively, and construct the seven directed edges ACL_0 to ACL_0', ACL_0 to ACL_2', ACL_2 to ACL_2', ACL_2 to ACL_0', VQ_1 to VQ_1', VQ_2 to VQ_2', VQ_3 to VQ_3'. Since the inter-frame edge ACL_0-ACL_0 is more strongly correlated than ACL_2-ACL_0, and ACL_2-ACL_2 is more strongly correlated than ACL_0-ACL_2, the two edges ACL_0 to ACL_2' and ACL_2 to ACL_0' are removed. In addition, since the values of ACL_0', ACL_2', VQ_1', VQ_2', VQ_3' are also influenced by the speech frame class, the directed edges O to ACL_0', O to ACL_2', O to VQ_1', O to VQ_2', O to VQ_3' are added to the network.
The finally constructed code element Bayesian network, shown in Fig. 2, is a multi-layer network consisting of 26 nodes that contains all code element information; among them the 15 nodes GAIN_0, GAIN_1, GAIN_2, GAIN_3, POS_0, POS_1, POS_2, POS_3, MSBPOS, PSIG_0, PSIG_1, PSIG_2, PSIG_3, ACL and GRID are isolated nodes.
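Reading off Fig. 2 and steps S2-1) to S2-4), the CEBN can be written down as an explicit edge list; the sketch below is one such encoding, with node names chosen for illustration (nodes of the adjacent frame are written with a trailing "p").

```python
# Directed edges of the CEBN described in steps S2-1) to S2-4).
# O is the frame-class root node; names ending in "p" denote the same code
# element in the adjacent (next) frame.
CURRENT_FRAME_NODES = [
    "VQ1", "VQ2", "VQ3", "ACL0", "ACL2", "ACL",
    "GAIN0", "GAIN1", "GAIN2", "GAIN3",
    "POS0", "POS1", "POS2", "POS3", "MSBPOS",
    "PSIG0", "PSIG1", "PSIG2", "PSIG3", "GRID",
]
NEXT_FRAME_NODES = ["ACL0p", "ACL2p", "VQ1p", "VQ2p", "VQ3p"]

CEBN_EDGES = (
    [("O", n) for n in CURRENT_FRAME_NODES]               # root -> 20 merged child nodes
    + [("ACL0", "ACL2"), ("VQ1", "VQ2"), ("VQ2", "VQ3")]   # intra-frame strong edges kept (VQ1 -> VQ3 removed)
    + [("ACL0", "ACL0p"), ("ACL2", "ACL2p"),               # inter-frame strong edges kept
       ("VQ1", "VQ1p"), ("VQ2", "VQ2p"), ("VQ3", "VQ3p")]  # (ACL0 -> ACL2p and ACL2 -> ACL0p removed)
    + [("O", n) for n in NEXT_FRAME_NODES]                 # root -> next-frame nodes
)
# 26 nodes in total: O, the 20 current-frame nodes and the 5 next-frame nodes.
```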
Step S3) learn the network parameters of the code element Bayesian network CEBN from training samples;
Step S3) specifically comprises:
Step S3-1) for ease of description, the nodes of the CEBN are denoted by the random variables X_1, X_2, ..., X_26, where X_1 corresponds to the root node O and the other random variables correspond to the child nodes; the values of the random variables are denoted x_1, x_2, ..., x_26. The joint probability distribution of the network is then the product over all 26 nodes of the conditional probabilities P(X_i | Pa(X_i)), where Pa(X_i) denotes the parent nodes of the random variable X_i.
Step S3-2) let the random variable X_i have K_i possible values, and let θ_ijk denote the conditional probability that X_i takes its k-th value when Pa(X_i) takes its j-th value; θ_ijk may then be expressed as
θ_ijk = P(X_i = x_ik | Pa(X_i) = Pa(X_i)_j)
Step S3-3) learning the network parameters essentially means learning the value of each θ_ijk. Bayesian network parameter learning combines the prior distribution π(θ) over the parameter θ with the sample information χ to obtain the posterior distribution π(θ | χ).
Step S3-4) in Bayesian network parameter learning, the prior distribution is usually chosen to be conjugate, i.e. the prior π(θ) and the posterior π(θ | χ) belong to the same family of distributions. The invention uses the common Dirichlet distribution as the prior, with hyperparameters α_ijk, 1 ≤ k ≤ K_i, where Γ(·) is the gamma function and Dir(·) denotes the Dirichlet distribution function.
Step S3-5) let β_ijk be the number of samples in χ satisfying X_i = x_ik and Pa(X_i) = Pa(X_i)_j; since the posterior distribution π(θ | χ) also obeys a Dirichlet distribution, it is determined by the hyperparameters α_ijk and the counts β_ijk, and the maximum a posteriori (MAP) estimate of the network parameter θ is obtained from this posterior.
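The posterior and MAP formulas themselves are not reproduced in this text; under the Dirichlet-prior setting described above they take the standard conjugate form, which the following sketch implements for one row of a conditional probability table. The choice of hyperparameter value is an assumption.

```python
import numpy as np

def map_cpt_row(counts, alpha=2.0):
    """MAP estimate of one row of a CEBN conditional probability table.

    counts[k] -- beta_ijk: number of training frames with X_i = x_ik while Pa(X_i)
                 is fixed at its j-th value.
    alpha     -- Dirichlet hyperparameter alpha_ijk, assumed here to be identical
                 for all k and >= 1 (the concrete choice is an assumption).
    Standard conjugate result: the posterior is Dirichlet(alpha + beta), whose mode
    gives theta_ijk = (alpha + beta_k - 1) / sum over k' of (alpha + beta_k' - 1).
    """
    counts = np.asarray(counts, dtype=float)
    unnormalized = counts + alpha - 1.0
    return unnormalized / unnormalized.sum()

# Example: X_i with K_i = 3 values, observed 5, 0 and 15 times for this parent value.
print(map_cpt_row([5, 0, 15]))  # smoothed probabilities summing to 1
```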
Step S4) for a given speech segment of N frames, the invention defines the speech steganography index J to measure the overall degree of steganography of the segment, i.e. J = N_stego / N, where N_stego denotes the number of frames in the segment judged to be steganographic. The invention decides whether the speech is steganographic by setting a threshold J_thr: when J < J_thr, the segment is judged to be non-steganographic speech; when J ≥ J_thr, it is judged to be steganographic speech.
The threshold J_thr is obtained from the training samples so that the classification accuracy on the training samples is highest. Suppose the training samples contain M speech segments; for each segment, the steganography index in the non-steganographic case and in the steganographic case can be computed. Let the steganography indices of all training samples in the non-steganographic case form the set J_C = {J_c1, J_c2, ..., J_cM}, and those in the steganographic case form the set J_S = {J_s1, J_s2, ..., J_sM}. J_thr is then chosen to maximize CNT(J_C : J_cl < J_thr) + CNT(J_S : J_sl ≥ J_thr), where CNT(J_C : J_cl < J_thr) and CNT(J_S : J_sl ≥ J_thr) denote, respectively, the number of non-steganography indices in J_C satisfying J_cl < J_thr and the number of steganography indices in J_S satisfying J_sl ≥ J_thr. Once J_thr has been obtained, steganography detection can be performed on a given test sample.
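A minimal sketch of the threshold selection follows, assuming that J_thr is searched over the observed index values so that the number of correctly classified training segments is maximized; the candidate grid is an assumption.

```python
def choose_threshold(j_cover, j_stego):
    """Pick J_thr that classifies the most training segments correctly.

    j_cover -- indices J_c1..J_cM of the M non-steganographic training segments
    j_stego -- indices J_s1..J_sM of the M steganographic training segments
    A cover segment counts as correct when its index is below the threshold,
    a stego segment when its index is at or above it.
    """
    candidates = sorted(set(j_cover) | set(j_stego) | {0.0, 1.0})
    best_thr, best_correct = 0.5, -1
    for thr in candidates:
        correct = sum(j < thr for j in j_cover) + sum(j >= thr for j in j_stego)
        if correct > best_correct:
            best_thr, best_correct = thr, correct
    return best_thr
```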
For a given speech segment of N frames, the probability that each frame is a non-steganographic frame and the probability that it is a steganographic frame can be computed; denote by p_i^cover the probability that the i-th frame is non-steganographic and by p_i^stego the probability that it is steganographic. In principle, if p_i^cover < p_i^stego the frame is a steganographic frame, otherwise it is a non-steganographic frame. However, it is difficult to classify every frame of a segment correctly, so the invention does not decide whether the whole segment is steganographic from the classification result of any single frame, but from the overall degree of steganography of the segment.
Each frame is classified by bottom-up diagnostic reasoning, i.e. the probability of the parent node is computed from the known distributions of the child nodes: for x_1 = 0 and x_1 = 1, the resulting values are the posterior probabilities that the speech frame is a non-steganographic frame and a steganographic frame, respectively, given the observed values X_i = x_i (i = 2, 3, ..., 26). If the probability of x_1 = 1 is greater than that of x_1 = 0, the frame is considered steganographic.
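Putting the detection stage together, the sketch below classifies each frame by comparing the joint probabilities of the observed code element values under the two root-node hypotheses (which are proportional to the posterior of the root node), counts the frames judged steganographic, and thresholds J = N_stego / N. The CPT lookup interface and the equal root prior default are assumptions.

```python
import math

def frame_is_stego(frame_values, log_cpt, log_prior=(math.log(0.5), math.log(0.5))):
    """Bottom-up diagnostic inference for one speech frame.

    frame_values -- dict mapping each non-root CEBN node name to its observed (binned) value
    log_cpt(node, value, frame_class) -- hypothetical lookup returning
        log P(node = value | parents), with the root class fixed to frame_class
        (0 = non-steganographic, 1 = steganographic) and the other parent values
        taken from frame_values.
    log_prior -- (log P(O = 0), log P(O = 1)); equal priors are assumed here.
    The class with the larger joint log-probability also has the larger posterior.
    """
    score = {0: log_prior[0], 1: log_prior[1]}
    for cls in (0, 1):
        for node, value in frame_values.items():
            score[cls] += log_cpt(node, value, cls)
    return score[1] > score[0]

def detect_segment(frames, log_cpt, j_thr):
    """Decide whether a speech segment of N frames is steganographic."""
    n_stego = sum(frame_is_stego(f, log_cpt) for f in frames)
    j = n_stego / len(frames)
    return ("steganographic" if j >= j_thr else "non-steganographic"), j
```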
Finally, it should be noted that the above embodiments are only intended to illustrate, not to limit, the technical solution of the invention. Although the invention has been described in detail with reference to the embodiments, those skilled in the art should understand that modifications or equivalent substitutions of the technical solution of the invention that do not depart from its spirit and scope shall all be covered by the scope of the claims of the invention.

Claims (7)

1. A general information hiding detection method for multi-class low-rate compressed speech steganography, used to detect the coding of the G.723.1 low-rate speech encoder, the method comprising the following steps:
Step 1) based on the code elements in the low-rate compressed speech bitstream, constructing a code element spatiotemporal correlation network CESN describing the intra-frame and inter-frame correlations of all code elements;
Step 2) using the strongly correlated edges of the code element spatiotemporal correlation network CESN to further construct a code element Bayesian network CEBN for steganalysis;
Step 3) learning the network parameters of the code element Bayesian network CEBN from training samples and obtaining a threshold J_thr;
Step 4) for a given speech segment of N frames, computing the speech steganography index J from the network parameters of the code element Bayesian network CEBN; when J < J_thr, judging the segment to be a non-steganographic speech segment; when J ≥ J_thr, judging it to be a steganographic speech segment.
2. The general information hiding detection method for multi-class low-rate compressed speech steganography according to claim 1, characterized in that step 1) specifically comprises:
Step 1-1) each frame of the bitstream of the G.723.1 low-rate speech encoder consists of 24 code elements: 3 LPC vector quantization indices VQ_1, VQ_2 and VQ_3; 4 adaptive codebook delays ACL_0, ACL_1, ACL_2 and ACL_3; 4 combined gains GAIN_0, GAIN_1, GAIN_2 and GAIN_3; 5 pulse position indices POS_0, POS_1, POS_2, POS_3 and MSBPOS; 4 pulse sign indices PSIG_0, PSIG_1, PSIG_2 and PSIG_3; and 4 grid indices GRID_0, GRID_1, GRID_2 and GRID_3;
Step 1-2) denoting the 24 code elements as C_i, i = 1, 2, ..., 24, and constructing a directed graph with the C_i as vertices and the correlations within and between frames as edges, denoted D = <V, E>, where V is the set of vertices of D, with v_i[m] denoting the i-th code element of frame m, and E is the set of edges of D, each edge being a directed edge from vertex v_i[p] to vertex v_j[q]; an edge with p = q and i ≠ j is an intra-frame edge, and an edge with q > p is an inter-frame edge; only correlations with q - p ∈ {0, 1} are analyzed, so a frame of 24 code elements yields 276 intra-frame edges and 576 inter-frame edges, 852 edges in total;
Step 1-3) computing the relative index I_c(C_i, C_j) of code elements C_i and C_j, where r_i and r_j denote the maximum values of C_i and C_j, p(C_i = c_i) and p(C_j = c_j) denote the marginal probabilities of C_i = c_i and C_j = c_j, and p(C_i = c_i, C_j = c_j) denotes their joint probability; if C_i and C_j are mutually independent, then p(C_i = c_i)p(C_j = c_j) = p(C_i = c_i, C_j = c_j) holds for arbitrary c_i and c_j, and I_c(C_i, C_j) = 0;
Step 1-4) among the 852 edges, retaining the strong edges with I_c(C_i, C_j) > 0.5; the code element spatiotemporal correlation network CESN then contains 4 intra-frame strong edges and 7 inter-frame strong edges, 11 edges in total; the intra-frame strong edges are ACL_0-ACL_2, VQ_1-VQ_2, VQ_1-VQ_3 and VQ_2-VQ_3; the inter-frame strong edges are ACL_0-ACL_0, ACL_0-ACL_2, ACL_2-ACL_0, ACL_2-ACL_2, VQ_1-VQ_1, VQ_2-VQ_2 and VQ_3-VQ_3.
3. The general information hiding detection method for multi-class low-rate compressed speech steganography according to claim 2, characterized in that step 2) specifically comprises:
Step 2-1) taking the speech frame class as the root node O, the speech frame class being either non-steganographic, denoted 0, or steganographic, denoted 1;
Step 2-2) taking the 24 code element values as child nodes of the root node O; merging the four grid index nodes GRID_0, GRID_1, GRID_2 and GRID_3 into a single node, denoted GRID, with a bit allocation of 4; merging the code elements ACL_1 and ACL_3 into a single node, denoted ACL, with a bit allocation of 4; after merging there are 20 child nodes, and directed edges from O to the 20 nodes are created;
Step 2-3) from the intra-frame strong edges ACL_0-ACL_2, VQ_1-VQ_2, VQ_1-VQ_3 and VQ_2-VQ_3 obtained in step 1-4), deleting the directed edge VQ_1-VQ_3 from VQ_1 to VQ_3;
Step 2-4) taking the values in the adjacent frame as child nodes ACL_0', ACL_2', VQ_1', VQ_2', VQ_3' of the nodes ACL_0, ACL_2, VQ_1, VQ_2, VQ_3 respectively, creating the seven directed edges ACL_0 to ACL_0', ACL_0 to ACL_2', ACL_2 to ACL_2', ACL_2 to ACL_0', VQ_1 to VQ_1', VQ_2 to VQ_2', VQ_3 to VQ_3'; removing the two edges ACL_0 to ACL_2' and ACL_2 to ACL_0'; adding the directed edges O to ACL_0', O to ACL_2', O to VQ_1', O to VQ_2', O to VQ_3' to the network;
Step 2-5) the finally constructed code element Bayesian network CEBN is a multi-layer network consisting of 26 nodes that contains all code element information, among which the 15 nodes GAIN_0, GAIN_1, GAIN_2, GAIN_3, POS_0, POS_1, POS_2, POS_3, MSBPOS, PSIG_0, PSIG_1, PSIG_2, PSIG_3, ACL and GRID are isolated nodes.
4. The general information hiding detection method for multi-class low-rate compressed speech steganography according to claim 3, characterized in that step 3) specifically comprises:
Step 3-1) denoting the nodes of the code element Bayesian network CEBN by the random variables X_1, X_2, ..., X_26, where X_1 corresponds to the root node O and the other random variables correspond to the child nodes, and denoting the values of the random variables by x_1, x_2, ..., x_26; the joint probability distribution of the network is then the product over all 26 nodes of the conditional probabilities P(X_i | Pa(X_i)), where Pa(X_i) denotes the parent nodes of the random variable X_i;
Step 3-2) letting the random variable X_i have K_i possible values and letting θ_ijk denote the conditional probability that X_i takes its k-th value when Pa(X_i) takes its j-th value; θ_ijk is then expressed as:
θ_ijk = P(X_i = x_ik | Pa(X_i) = Pa(X_i)_j)
Step 3-3) learning the values of the parameters θ_ijk of the code element Bayesian network CEBN by Bayesian network parameter learning;
Step 3-4) computing the threshold J_thr from the training samples.
5. The general information hiding detection method for multi-class low-rate compressed speech steganography according to claim 4, characterized in that step 3-3) specifically comprises:
Step 3-3-1) Bayesian network parameter learning combines the prior distribution π(θ) over the parameter θ with the sample information χ to obtain the posterior distribution π(θ | χ); the prior distribution π(θ) is a Dirichlet distribution Dir(·) with hyperparameters α_ijk, 1 ≤ k ≤ K_i, whose density is written in terms of the gamma function Γ(·);
Step 3-3-2) letting β_ijk be the number of samples in χ satisfying X_i = x_ik and Pa(X_i) = Pa(X_i)_j; since the posterior distribution π(θ | χ) also obeys a Dirichlet distribution, it is determined by the hyperparameters α_ijk and the counts β_ijk;
Step 3-3-3) the maximum a posteriori estimate of the parameter θ_ijk of the network CEBN is obtained from this posterior.
6. The general information hiding detection method for multi-class low-rate compressed speech steganography according to claim 5, characterized in that step 3-4) specifically comprises:
Step 3-4-1) assuming the training samples contain M speech segments, computing for each segment the steganography index in the non-steganographic case and the steganography index in the steganographic case; the steganography indices of all training samples in the non-steganographic case form the set J_C = {J_c1, J_c2, ..., J_cM}, and those in the steganographic case form the set J_S = {J_s1, J_s2, ..., J_sM};
Step 3-4-2) the threshold J_thr is chosen to maximize CNT(J_C : J_cl < J_thr) + CNT(J_S : J_sl ≥ J_thr), where CNT(J_C : J_cl < J_thr) and CNT(J_S : J_sl ≥ J_thr) denote, respectively, the number of non-steganography indices in J_C satisfying J_cl < J_thr and the number of steganography indices in J_S satisfying J_sl ≥ J_thr, 1 ≤ l ≤ M.
7. The general information hiding detection method for multi-class low-rate compressed speech steganography according to claim 6, characterized in that step 4) specifically comprises:
Step 4-1) for a given speech segment of N frames, computing the speech steganography index measuring the overall degree of steganography of the segment, J = N_stego / N, where N_stego denotes the number of frames in the segment judged to be steganographic;
each speech frame is classified by bottom-up diagnostic reasoning, i.e. the probability of the parent node is computed from the known distributions of the child nodes: for x_1 = 0 and x_1 = 1, the resulting values are the posterior probabilities that the speech frame is a non-steganographic frame and a steganographic frame, respectively, given the observed values X_i = x_i, i = 2, 3, ..., 26; if the probability of x_1 = 1 is greater than that of x_1 = 0, the frame is considered steganographic, otherwise non-steganographic; N_stego is obtained in this way;
Step 4-2) when J < J_thr, the segment is judged to be non-steganographic speech; when J ≥ J_thr, it is judged to be steganographic speech.
CN201810884205.9A 2018-08-06 2018-08-06 Multi-class low-rate compressed voice steganography-oriented general information hiding detection method Active CN109192217B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201810884205.9A CN109192217B (en) 2018-08-06 2018-08-06 Multi-class low-rate compressed voice steganography-oriented general information hiding detection method

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201810884205.9A CN109192217B (en) 2018-08-06 2018-08-06 Multi-class low-rate compressed voice steganography-oriented general information hiding detection method

Publications (2)

Publication Number Publication Date
CN109192217A (en) 2019-01-11
CN109192217B CN109192217B (en) 2023-03-31

Family

ID=64920198

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201810884205.9A Active CN109192217B (en) 2018-08-06 2018-08-06 Multi-class low-rate compressed voice steganography-oriented general information hiding detection method

Country Status (1)

Country Link
CN (1) CN109192217B (en)


Patent Citations (9)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US5659659A (en) * 1993-07-26 1997-08-19 Alaris, Inc. Speech compressor using trellis encoding and linear prediction
US20050010401A1 (en) * 2003-07-07 2005-01-13 Sung Ho Sang Speech restoration system and method for concealing packet losses
CN101431578A (en) * 2008-10-30 2009-05-13 南京大学 Information concealing method based on G.723.1 silence detection technology
CN102227767A (en) * 2008-11-12 2011-10-26 Scti控股公司 System and method for automatic speach to text conversion
CN101640802A (en) * 2009-08-28 2010-02-03 北京工业大学 Video inter-frame compression coding method based on macroblock features and statistical properties
WO2015015058A1 (en) * 2013-07-31 2015-02-05 Nokia Corporation Method and apparatus for video coding and decoding
US20150302543A1 (en) * 2014-01-31 2015-10-22 Digimarc Corporation Methods for encoding, decoding and interpreting auxiliary data in media signals
CN107610711A (en) * 2017-08-29 2018-01-19 中国民航大学 G.723.1 voice messaging steganalysis method based on quantization index modulation QIM
CN107910009A (en) * 2017-11-02 2018-04-13 中国科学院声学研究所 A kind of symbol based on Bayesian inference rewrites Information Hiding & Detecting method and system

Non-Patent Citations (6)

* Cited by examiner, † Cited by third party
Title
JIE YANG ET AL.: "Steganalysis of joint codeword quantization index modulation steganography based on codeword Bayesian network", 《NEUROCOMPUTING》 *
WEI ZENG ET AL.: "An Algorithm of Echo Steganalysis based on Bayes Classifier", 《PROCEEDINGS OF THE 2008 IEEE》 *
李松斌等: "低速率语音码流中的码元替换信息隐藏检测", 《网络新媒体技术》 *
杨洁等: "基于贝叶斯网络的压缩语音信息隐藏检测", 《计算机应用》 *
汪云路等: "基于统计特征的语音回声隐藏分析", 《数据采集与处理》 *
涂山山等: "基于半监督学习的即时语音通信隐藏检测", 《清华大学学报(自然科学版)》 *

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110120228A (en) * 2019-04-28 2019-08-13 武汉大学 Audio general steganalysis method and system based on sonograph and depth residual error network
CN110689897A (en) * 2019-10-09 2020-01-14 中国科学院声学研究所南海研究站 Information hiding and hidden information extraction method based on linear prediction speech coding

Also Published As

Publication number Publication date
CN109192217B (en) 2023-03-31

Similar Documents

Publication Publication Date Title
US20220254350A1 (en) Method, apparatus and device for voiceprint recognition of original speech, and storage medium
CN101470897B (en) Sensitive film detection method based on audio/video amalgamation policy
CN104240256B (en) A kind of image significance detection method based on the sparse modeling of stratification
WO2016155047A1 (en) Method of recognizing sound event in auditory scene having low signal-to-noise ratio
CN109817233A (en) Speech stream steganalysis method and system based on hierarchical attention network model
CN110299142A (en) A kind of method for recognizing sound-groove and device based on the network integration
CN112735435A (en) Voiceprint open set identification method with unknown class internal division capability
Yang et al. Steganalysis of joint codeword quantization index modulation steganography based on codeword Bayesian network
CN107910009B (en) A method and system for the detection of code element rewriting information hiding based on Bayesian inference
Yang et al. A common method for detecting multiple steganographies in low-bit-rate compressed speech based on Bayesian inference
CN105869658A (en) Voice endpoint detection method employing nonlinear feature
CN109192217A (en) General information towards multiclass low rate compression voice steganography hides detection method
CN116721677A (en) Noise-containing voice emotion recognition method based on multitasking collaborative attention gating network
CN116884433A (en) Forgery speech detection method and system based on graph attention
Li et al. SANet: A compressed speech encoder and steganography algorithm independent steganalysis deep neural network
Büker et al. Deep convolutional neural networks for double compressed AMR audio detection
CN113743188B (en) Feature fusion-based internet video low-custom behavior detection method
CN117711421A (en) Two-stage voice separation method based on coordination simple attention mechanism
Sun et al. Steganalysis of adaptive multi-rate speech with unknown embedding rates using clustering and ensemble learning
Hanna et al. Audio Features for Noisy Sound Segmentation.
Li et al. Fdn: Finite difference network with hierarchical convolutional features for text-independent speaker verification
Hu et al. Speaker Recognition Algorithm Based on Fca-Res2Net.
CN114419731B (en) A lip reading recognition method and system based on phased cross-training
Abirami et al. Multimodal Cognitive Learning for Media Forgery Detection: A Comprehensive Framework Combining Random Forest and Deep Ensemble Architectures (Xception, ResNeXt) across Image, Video, and Audio Modalities
Geetha et al. An Optimized Hybrid Quantum Deep Neural Network Model For Quantum Audio Steganography and Steganalysis

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant