CN108566642B

CN108566642B - Two-dimensional joint feature authentication method based on machine learning

Info

Publication number: CN108566642B
Application number: CN201810239508.5A
Authority: CN
Inventors: 陈松林; 文红; 陈洁; 郑烜
Original assignee: Chengdu Alaifu Information Technology Co ltd
Current assignee: Chengdu Alaifu Information Technology Co ltd
Priority date: 2018-03-22
Filing date: 2018-03-22
Publication date: 2021-08-13
Anticipated expiration: 2038-03-22
Also published as: CN108566642A

Abstract

The invention discloses a two-dimensional joint feature authentication method based on machine learning. A base station B first collects channel information from a legal information sender A and from a simulated illegal information sender E, and then calculates the channel information difference between consecutive data frames from the channel information between the base station B and each information sender. On the basis of these differences, an amplitude-based test statistic is constructed and processed to obtain an amplitude-based normalized LRT statistic, and a test statistic based on joint amplitude and phase is constructed and processed to obtain a normalized LRT statistic based on joint amplitude and phase. A sample set of two-dimensional joint features is then constructed, an authentication model is established, and a classifier is generated by a machine learning method and trained on the sample set until its detection rate reaches the required standard; the trained classifier is used to judge the legitimacy of an information sender of unknown identity. Compared with channel authentication methods that use a single feature dimension, the method achieves higher accuracy.

Description

Two-dimensional joint feature authentication method based on machine learning

Technical Field

The invention relates to an authentication technology of channel information, in particular to a two-dimensional joint feature authentication method based on machine learning.

Background

The 5G mobile communication system imposes requirements of high speed, high efficiency and high security. When many mobile devices access the wireless network simultaneously, the burden of identity authentication in the network increases greatly. 5G involves interconnection and communication among a large number of machines and devices, so densely deployed networks require a lightweight authentication method, which is a prerequisite for the operation of the Internet of Things. Over the last decade, the development of physical layer security technology has brought new vitality to the field of wireless mobile communication. Physical layer authentication is a non-cryptographic form of authentication; because channel information is easy to acquire and channel characteristics are difficult to forge, physical layer authentication based on channel characteristics has received wide attention from researchers.

Generally, the channel information of a radio link can be used to verify the legitimacy of a sender's identity in a wireless network. However, an authentication method that uses a single feature is limited: because only one channel characteristic serves as the basis for the decision, its accuracy is not high.

Disclosure of Invention

The object of the invention is to overcome the shortcomings of the prior art by providing a two-dimensional joint feature authentication method based on machine learning. An improved normalized LRT statistic T_a is obtained from the amplitude-based test statistic T_A, an improved normalized LRT statistic T_b is obtained from the test statistic T_B based on joint amplitude and phase, and an authentication model based on the two-dimensional feature (T_a, T_b) is then established. Compared with channel authentication methods that use a single feature dimension, the method achieves higher accuracy in channel authentication.

The purpose of the invention is realized by the following technical scheme: a two-dimensional joint feature authentication method based on machine learning comprises the following steps:

s1, the base station B carries out channel information acquisition on a legal information sender A and a simulated illegal information sender E to obtain a channel information data set of the legal information sender A

And a channel information data set simulating an illegal information sender E

S2, calculating a channel information difference value between continuous data frames for channel information between the base station B and an information sender;

S3, on the basis of the channel information difference value, constructing an amplitude-based test statistic T_A from the amplitude difference of the subcarriers, and processing T_A to obtain an improved normalized LRT statistic T_a based on T_A, which is taken as the first-dimension feature:

wherein, the first dimension between the legal information sender A and the base station B is characterized in that

Simulating the first dimension characteristic between an illegal information sender E and a base station B as

S4, on the basis of the channel information difference value, constructing a test statistic T_B based on joint amplitude and phase, and processing T_B to obtain an improved normalized LRT statistic T_b based on T_B, which is taken as the second-dimension feature:

wherein, the second dimension between the legal information sender A and the base station B is characterized in that

Simulating a second dimension between the illegal message sender E and the base station B as

S5, utilizing the channel information data set

And

constructing two-dimensional joint features (T_a, T_b) as a sample set; for data frames at the same time instant, the two-dimensional joint feature (T_a, T_b) is used as the comprehensive basis for judgment, and an authentication model [(T_a, T_b), y] is constructed, wherein:

the sample set of the two-dimensional joint characteristics between the legal information sender A and the base station B is as follows:

to which the identifier y = +1 is added;

the sample set for simulating the two-dimensional joint characteristics between the illegal information sender E and the base station B is as follows:

to which the identifier y = -1 is added;

S6, constructing a classifier by a machine learning method, and training the classifier on the sample sets T^AB and T^EB until the detection rate of the classifier reaches the standard;

and S7, the base station judges the legality of the information sender with unknown identity by using the classifier with the detection rate reaching the standard, so that the channel authentication of the two-dimensional combined characteristics based on machine learning is realized.

The step S1 includes the following sub-steps:

s101, a legal information sender A sends a signal to a base station B, and the base station B collects channel information of the legal information sender A

Wherein, N represents the number of frames,

denotes the channel information estimated for the k-th OFDM symbol between the base station B and the legal information sender A, k = 1, 2, 3, ..., N;

S102, the simulated illegal information sender E sends a signal to the base station B, and the base station B collects the channel information of the simulated illegal information sender E

N represents the number of frames,

denotes the channel information estimated for the k-th OFDM symbol between the base station B and the simulated illegal information sender E, k = 1, 2, 3, ..., N.

Further, the channel information data frames of the legal information sender A and of the simulated illegal information sender E are sent continuously, and the interval between two adjacent collected frames lies within the coherence time, so that the channel information of adjacent frames is correlated.

In step S2, the channel information difference between consecutive data frames is calculated by the following formula:

in the formula,

denotes the channel information difference to be calculated, namely the difference between the channel information of the (k+1)-th frame and the channel information of the k-th frame.
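By way of illustration only, the frame-to-frame difference of step S2 may be sketched as follows. The sketch assumes that each frame's channel estimate has been flattened into a list of complex values; the names `Frame` and `frame_differences` and the sample values are hypothetical and do not appear in the original text.

```python
from typing import List

# Assumption: one frame = a flattened list of complex channel estimates.
Frame = List[complex]


def frame_differences(frames: List[Frame]) -> List[Frame]:
    """Return, for each k, the element-wise difference frame[k+1] - frame[k]."""
    diffs: List[Frame] = []
    for prev, nxt in zip(frames, frames[1:]):
        if len(prev) != len(nxt):
            raise ValueError("all frames must have the same number of entries")
        diffs.append([b - a for a, b in zip(prev, nxt)])
    return diffs


# Placeholder values, not measured channel data.
frames = [
    [1 + 1j, 2 + 0j],
    [1.5 + 1j, 2 - 0.5j],
    [1.5 + 2j, 3 - 0.5j],
]
diffs = frame_differences(frames)
```

Three frames yield two difference frames; the amplitude and phase of each entry can then be taken with `abs()` and `cmath.phase()` when the test statistics are formed.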

The step S3 includes the following sub-steps:

S301, on the basis of the channel information difference value, constructing an amplitude-based test statistic T_A from the amplitude difference of the subcarriers:

Wherein,

m and n denote the m-th row and the n-th column of the channel matrix, respectively; N denotes the total number of data frames; σ² denotes the noise power; N_s denotes the number of OFDM symbols contained in the channel information of a data frame, the frequency-domain channel matrix of each OFDM symbol being an N-dimensional square matrix; x is the summation index over the N_s OFDM symbols; k denotes the k-th data frame; the superscript XB denotes the information sender X and the base station B: when X is the legal information sender A, XB = AB and the formula gives the amplitude-based test statistic of the channel information between the legal information sender A and the base station B; when X is the simulated illegal information sender E, XB = EB and the formula gives the amplitude-based test statistic of the channel information between the simulated illegal information sender E and the base station B;

S302, processing the test statistic T_A to obtain an improved normalized LRT statistic T_a based on T_A, which is taken as the first-dimension feature:

The step S4 includes the following sub-steps:

S401, on the basis of the channel information difference value, constructing a test statistic T_B based on joint amplitude and phase:

The superscript XB denotes the information sender X and the base station B: when X is the legal information sender A, XB = AB and the formula gives the test statistic, based on joint amplitude and phase, of the channel information between the legal information sender A and the base station B; when X is the simulated illegal information sender E, XB = EB and the formula gives the test statistic, based on joint amplitude and phase, of the channel information between the simulated illegal information sender E and the base station B;

S402, processing the test statistic T_B to obtain an improved normalized LRT statistic T_b based on T_B, which is taken as the second-dimension feature:

The step S5 includes the following sub-steps:

S501, the channel information data set from the legal information sender A to the base station B

is processed as follows:

the channel information data set

is substituted into the calculation formula of

to obtain M frames of first-dimension features of the channel information between the legal information sender A and the base station B:

the channel information data set

is substituted into the calculation formula of

to obtain M frames of second-dimension features of the channel information between the legal information sender A and the base station B:

in the formulas, M = N - 2; that is, N frames of raw data yield N - 2 frames of feature data;

S502, constructing the two-dimensional joint feature sample set T^AB from the legal information sender A to the base station B:

The channel feature

and the channel feature

are combined into the two-dimensional joint feature T^AB, so that the k-th frame of the sample features becomes

S503, taking the sample set T^AB as the legal data source, and adding the identifier y = +1 to each frame of data in the sample set T^AB;

S504, the channel information data set from the simulated illegal information sender E to the base station B

is processed as follows:

the channel information data set

is substituted into the calculation formula of

to obtain M frames of first-dimension features of the channel information between the simulated illegal information sender E and the base station B:

the channel information data set

is substituted into the calculation formula of

to obtain M frames of second-dimension features of the channel information between the simulated illegal information sender E and the base station B:

S505, constructing the two-dimensional joint feature sample set T^EB from the simulated illegal information sender E to the base station B:

The channel feature

and the channel feature

are combined into the two-dimensional joint feature T^EB, so that the k-th frame of the sample features becomes

S506, taking the sample set T^EB as the illegal data source, and adding the identifier y = -1 to each frame of data in the sample set T^EB.
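By way of illustration only, the labelling of steps S502 to S506 may be sketched as follows. The sketch assumes the per-frame features T_a and T_b have already been computed; the helper `build_samples` and the numeric values are hypothetical placeholders rather than measured data.

```python
from typing import List, Tuple

# One labelled sample of the authentication model: ((T_a, T_b), y).
Sample = Tuple[Tuple[float, float], int]


def build_samples(t_a: List[float], t_b: List[float], label: int) -> List[Sample]:
    """Pair the k-th T_a with the k-th T_b and attach the identifier y."""
    if len(t_a) != len(t_b):
        raise ValueError("T_a and T_b must contain the same number of frames")
    if label not in (+1, -1):
        raise ValueError("the identifier y must be +1 or -1")
    return [((a, b), label) for a, b in zip(t_a, t_b)]


# Placeholder feature values standing in for T^AB and T^EB.
t_ab = build_samples([0.10, 0.12, 0.09], [0.20, 0.18, 0.22], +1)  # legal sender A
t_eb = build_samples([0.80, 0.75, 0.90], [0.70, 0.85, 0.78], -1)  # simulated sender E
dataset = t_ab + t_eb
```

Each entry keeps the two features of the same frame together, so a classifier sees the pair as one point in a two-dimensional feature space.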

Further, the machine learning method in step S6 is any machine learning algorithm capable of binary classification, which, on the basis of the two-dimensional joint feature (T_a, T_b), performs classification training on data frames from the legal information sender A or the simulated illegal information sender E to obtain a corresponding classifier;

specifically, the step S6 includes the following sub-steps:

S601, establishing a training set and a test set: a portion of the data frames is extracted from each of the sample set T^AB and the sample set T^EB and added to the training set, and the remaining data frames of the sample set T^AB and the sample set T^EB are added to the test set;

s602, constructing a classifier by adopting a machine learning method, and training the classifier by utilizing data in a training set;

s603, testing the classifier obtained by training by using the data in the test set, judging whether the detection rate of the classifier reaches a set threshold value, and if so, outputting the classifier with the detection rate reaching the standard; if not, the process returns to step S1, and the sample set acquisition and classifier training of steps S1 to S6 are repeated.
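By way of illustration only, steps S601 and S602 may be sketched as follows. The original text allows any binary classification algorithm; a nearest-centroid classifier is used here purely as a stand-in so that the sketch stays self-contained. The function names, the split size and the synthetic feature values are all assumptions.

```python
import random
from typing import Callable, List, Tuple

Feature = Tuple[float, float]   # (T_a, T_b)
Sample = Tuple[Feature, int]    # ((T_a, T_b), y)


def split(samples: List[Sample], train_count: int,
          rng: random.Random) -> Tuple[List[Sample], List[Sample]]:
    """S601: draw train_count frames for training, keep the rest for testing."""
    shuffled = samples[:]
    rng.shuffle(shuffled)
    return shuffled[:train_count], shuffled[train_count:]


def train_nearest_centroid(train: List[Sample]) -> Callable[[Feature], int]:
    """S602 stand-in: label a frame by the closer of the two class centroids."""
    centroids = {}
    for label in (+1, -1):
        points = [f for f, y in train if y == label]
        if not points:
            raise ValueError("training set must contain both classes")
        centroids[label] = (sum(p[0] for p in points) / len(points),
                            sum(p[1] for p in points) / len(points))

    def predict(feature: Feature) -> int:
        def dist2(c: Feature) -> float:
            return (feature[0] - c[0]) ** 2 + (feature[1] - c[1]) ** 2
        return +1 if dist2(centroids[+1]) <= dist2(centroids[-1]) else -1

    return predict


rng = random.Random(0)
# Synthetic, well-separated placeholder features (not measured data).
t_ab = [((0.10 + 0.01 * i, 0.20 + 0.01 * i), +1) for i in range(10)]
t_eb = [((0.80 + 0.01 * i, 0.70 + 0.01 * i), -1) for i in range(10)]
train_a, test_a = split(t_ab, 7, rng)
train_e, test_e = split(t_eb, 7, rng)
classifier = train_nearest_centroid(train_a + train_e)
test_set = test_a + test_e
```

Splitting each sample set separately keeps both classes represented in the training set and in the test set.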

The step S7 includes the following sub-steps:

s701, the base station B carries out channel information acquisition on the information sender C with unknown identity to obtain a channel information data set

Wherein, N represents the number of frames,

channel information representing the kth OFDM symbol estimate between the base station B and the information sender C of unknown identity, k being 1,2, 3.., N;

s702. for the channel information data set

On the basis of the channel information difference value, constructing an amplitude-based test statistic based on the amplitude difference of subcarriers, and processing the test statistic to obtain a normalized LRT statistic

As a first dimension characteristic of channel information between the base station B and the unknown identity information sender C:

the channel information data set

is substituted into the calculation formula of

to obtain M frames of first-dimension features of the channel information between the information sender C of unknown identity and the base station B:

M = N - 2;

s703. for the channel information data set

On the basis of the channel information difference value, constructing test statistic based on amplitude and phase combination, and processing the test statistic to obtain normalized LRT statistic

A second dimension characteristic as channel information between the base station B and the unknown identity information sender C:

the channel information data set

is substituted into the calculation formula of

to obtain M frames of second-dimension features of the channel information between the information sender C of unknown identity and the base station B:

M = N - 2;

S704, constructing the two-dimensional joint feature sample set T^CB of the channel information between the base station B and the information sender C of unknown identity:

S705, using the classifier whose detection rate reaches the standard to judge each frame of data in the sample set T^CB and obtain the corresponding y value: y = +1 indicates that the information sender C of unknown identity is legal; y = -1 indicates that the information sender C of unknown identity is illegal.
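By way of illustration only, the per-frame judgment of step S705 may be sketched as follows. The classifier shown is a hypothetical stand-in with an arbitrary decision rule, and the feature values are placeholders; the original text does not specify how per-frame decisions are to be combined, so none are combined here.

```python
from typing import Callable, List, Tuple

Feature = Tuple[float, float]  # (T_a, T_b) of one frame of T^CB


def judge_sender(classifier: Callable[[Feature], int],
                 features: List[Feature]) -> List[str]:
    """Map the y value of each frame to a verdict: +1 legal, -1 illegal."""
    verdicts: List[str] = []
    for feature in features:
        y = classifier(feature)
        if y not in (+1, -1):
            raise ValueError("classifier must return +1 or -1")
        verdicts.append("legal" if y == +1 else "illegal")
    return verdicts


def toy_classifier(feature: Feature) -> int:
    # Hypothetical decision rule, used only to make the sketch runnable.
    return +1 if feature[0] + feature[1] < 1.0 else -1


t_cb = [(0.1, 0.2), (0.9, 0.8)]  # placeholder features for sender C
verdicts = judge_sender(toy_classifier, t_cb)
```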

The beneficial effects of the invention are as follows: an improved normalized LRT statistic T_a is obtained from the amplitude-based test statistic T_A, an improved normalized LRT statistic T_b is obtained from the test statistic T_B based on joint amplitude and phase, and an authentication model based on the two-dimensional feature (T_a, T_b) is then established. Combined with a machine learning algorithm, the model realizes channel authentication under two-dimensional joint features, remedies the low accuracy of existing authentication techniques, and achieves higher accuracy than channel authentication methods that use a single feature dimension.

Drawings

FIG. 1 is a flow chart of the method of the present invention.

Detailed Description

The technical solutions of the present invention are further described in detail below with reference to the accompanying drawings, but the scope of the present invention is not limited to the following.

As shown in FIG. 1, a two-dimensional joint feature authentication method based on machine learning comprises steps S1 to S7 and their sub-steps, as set out in the Disclosure of Invention above.

In an embodiment of the present application, the step S603 specifically includes: sequentially inputting the data frames in the test set into a classifier obtained by training, judging each frame of data by the classifier to obtain a corresponding y value, counting and judging the correct number of data frames according to the marking information of the y value in the test set, dividing the correct number of data frames by the total number of the data frames in the test set to obtain the detection rate, if the detection rate is higher than a set threshold value, judging that the detection rate reaches the standard, outputting the classifier with the detection rate reaching the standard, if the detection rate is lower than the set threshold value, judging that the detection rate does not reach the standard, returning to the step S1, and repeatedly performing the sample set acquisition and the classifier training of the steps S1-S6.
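By way of illustration only, the detection-rate check described for step S603 may be sketched as follows. The classifier is a hypothetical stand-in and the test frames are placeholders; the treatment of a rate exactly equal to the threshold is an assumption, since the text only addresses rates above or below it.

```python
from typing import Callable, List, Tuple

Feature = Tuple[float, float]   # (T_a, T_b)
Sample = Tuple[Feature, int]    # ((T_a, T_b), y)


def detection_rate(classifier: Callable[[Feature], int],
                   test_set: List[Sample]) -> float:
    """Correctly judged frames divided by the total number of test frames."""
    if not test_set:
        raise ValueError("test set must not be empty")
    correct = sum(1 for feature, y in test_set if classifier(feature) == y)
    return correct / len(test_set)


def meets_standard(rate: float, threshold: float) -> bool:
    # Assumption: a rate equal to the threshold is treated as not reaching it.
    return rate > threshold


def toy_classifier(feature: Feature) -> int:
    # Hypothetical decision rule, used only to make the sketch runnable.
    return +1 if feature[0] + feature[1] < 1.0 else -1


test_set = [((0.1, 0.2), +1), ((0.2, 0.1), +1),
            ((0.9, 0.8), -1), ((0.4, 0.5), -1)]
rate = detection_rate(toy_classifier, test_set)
```

With these placeholder frames the last one is misjudged, so three of four frames are correct; if the rate falls short of the threshold, the procedure returns to step S1 as described above.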


S705, using the classifier whose detection rate reaches the standard to judge each frame of data in the sample set T^CB and obtain the corresponding y value: y = +1 indicates that the information sender C of unknown identity is legal; y = -1 indicates that the information sender C of unknown identity is illegal.

In conclusion, with two-dimensional joint channel features it is difficult to find the optimal decision threshold manually: as the dimension increases, legal and illegal identities can no longer be separated by a single boundary defined on a single criterion. The invention therefore considers the two-dimensional joint features together to make the final decision. Specifically, an improved normalized LRT statistic T_a is obtained from the amplitude-based test statistic T_A, an improved normalized LRT statistic T_b is obtained from the test statistic T_B based on joint amplitude and phase, and an authentication model based on the two-dimensional feature (T_a, T_b) is established. A machine learning algorithm is used to improve authentication performance: channel information data serve as training data, and each group of training data is given an explicit identifier; a prediction model is then chosen and a learning process established in which the predictions are compared with the actual labels of the training data and the model is adjusted continually until its predictions reach the expected accuracy. Finally, the channel information data under test are used as input, and the classifier generated by machine learning outputs a label, thereby achieving authentication.

Claims

1. A two-dimensional joint feature authentication method based on machine learning is characterized in that: the method comprises the following steps:

And a channel information data set simulating an illegal information sender E

the second-dimension feature between the simulated illegal information sender E and the base station B is

S5, utilizing the channel information data set

And

adding an identifier y to + 1;

adding an identifier y to-1;

where M = N - 2, and N denotes the number of frames in each of the two channel information data sets;

2. The machine learning-based two-dimensional joint feature authentication method according to claim 1, wherein: the step S1 includes the following sub-steps:

Wherein, N represents the number of frames,

N represents the number of frames,

3. The machine learning-based two-dimensional joint feature authentication method according to claim 1, wherein: the channel information data frames of the legal information sender A and of the simulated illegal information sender E are sent continuously, and the interval between two adjacent collected frames lies within the coherence time, so that the channel information of adjacent frames is correlated.

4. The two-dimensional joint feature authentication method based on machine learning according to claim 2, wherein: in step S2, the channel information difference between consecutive data frames is calculated by the following formula:

in the formula,

5. The machine learning-based two-dimensional joint feature authentication method according to claim 4, wherein: the step S3 includes the following sub-steps:

Wherein,

m and n denote the m-th row and the n-th column of the channel matrix, respectively; N denotes the total number of data frames; σ² denotes the noise power; N_s denotes the number of OFDM symbols contained in the channel information of a data frame, the frequency-domain channel matrix of each OFDM symbol being an N-dimensional square matrix; x is the summation index over the N_s OFDM symbols; k denotes the k-th data frame; the superscript XB denotes the information sender X and the base station B: when X is the legal information sender A, XB = AB and the formula gives the amplitude-based test statistic of the channel information between the legal information sender A and the base station B; when X is the simulated illegal information sender E, XB = EB and the formula gives the amplitude-based test statistic of the channel information between the simulated illegal information sender E and the base station B;

6. The machine learning-based two-dimensional joint feature authentication method according to claim 5, wherein: the step S4 includes the following sub-steps:

7. The machine learning-based two-dimensional joint feature authentication method according to claim 1, wherein: the step S5 includes the following sub-steps:

is processed as follows:

the channel information data set

is substituted into

the channel information data set

is substituted into

in the formulas, M = N - 2; that is, N frames of raw data yield N - 2 frames of feature data;

Due to channel characteristics

And channel characteristics

is processed as follows:

the channel information data set

is substituted into

the channel information data set

is substituted into

Due to channel characteristics

And channel characteristics

8. The machine learning-based two-dimensional joint feature authentication method according to claim 1, wherein: the step S6 includes the following sub-steps:

S601, establishing a training set and a test set: a portion of the data frames is extracted from each of the sample set T^AB and the sample set T^EB and added to the training set, and the remaining data frames of the sample set T^AB and the sample set T^EB are added to the test set;

9. The machine learning-based two-dimensional joint feature authentication method according to claim 6, wherein: the step S7 includes the following sub-steps:

Wherein, N represents the number of frames,

s702. for the channel information data set

as the first-dimension feature of the channel information between the base station B and the information sender C of unknown identity:

data set of channel information

Bringing in

s703. for the channel information data set

data set of channel information

Bringing in

to obtain M frames of second-dimension features of the channel information between the information sender C of unknown identity and the base station B: