CN117332048B

CN117332048B - Logistics information query method, device and system based on machine learning

Info

Publication number: CN117332048B
Application number: CN202311623491.0A
Authority: CN
Inventors: 刘利; 王鹏飞; 刘炯伟; 黄颖琴
Original assignee: Y2T Technology Co Ltd
Current assignee: Y2T Technology Co Ltd
Priority date: 2023-11-30
Filing date: 2023-11-30
Publication date: 2024-03-22
Anticipated expiration: 2043-11-30
Also published as: CN117332048A

Abstract

The invention provides a logistics information query method, device, and system based on machine learning. It addresses the problem that, when logistics information is queried in natural language, the query intent cannot be accurately matched to the data in the logistics information system, so complete and accurate logistics information cannot be obtained. An attention-based generative adversarial network learns the intrinsic features through adversarial training, improving the quality and efficiency of natural-language logistics information queries.

Description

Logistics information query method, device and system based on machine learning

Technical Field

The invention relates to the technical field of logistics, in particular to a logistics information query method, device and system based on machine learning.

Background

In the prior art, systems that query logistics information directly in natural language suffer from limited vocabulary coverage. Because the data structures and fields of a logistics information system are defined in advance, a gap exists between them and natural-language expressions. As a result, when natural language is used for a query, the query intent cannot be accurately matched to the data in the logistics information system, and complete and accurate logistics information cannot be obtained.

It is noted that a generative adversarial network (GAN) can learn the intrinsic mapping characteristics of the queried logistics information system through adversarial training, going from unlabeled data to the generation of realistic logistics information. The generator of the GAN learns to encode a latent representation of the true data distribution. The GAN can therefore model the patterns in the logistics information system directly, without depending on natural-language expressions, and generate the key data features required for an information query, thereby achieving accurate queries.

In addition, a GAN combined with an attention mechanism can focus during training on the key fields of the logistics information, that is, the most important features such as tracking number, location, and status. The attention mechanism makes the generated information more realistic and strengthens the discriminator's ability to judge correctly whether information is real or fake. The resulting query system captures information features more accurately and achieves accurate, unbiased logistics information queries.

Therefore, generative adversarial learning, in particular a GAN with an attention mechanism, can effectively solve the prior-art problems of limited vocabulary matching and inaccurate query results that arise when queries are made directly in natural language.

In view of this, a logistics information query method, device, and system based on machine learning are proposed.

Disclosure of Invention

In view of the prior-art problems in querying logistics information using natural language, the invention provides a logistics information query method based on machine learning, comprising the following steps:

Step 1: Design a generator network capable of generating fake logistics tracking numbers and logistics information. The generator network consists of an encoder and a decoder: the encoder takes a random noise vector as input, and the decoder outputs fake logistics information. To make the generated information more realistic, an attention mechanism is introduced in the decoder so that it can focus on different parts of the input vector when generating the information.

The method specifically comprises the following steps:

Step 1.1: The input to the generator network is a random noise vector $z$, each element of which follows a Gaussian or uniform distribution. The output of the generator network is fake logistics information comprising $m$ fields, each represented by a vector $x_i$; that is, the generator network outputs a vector sequence $(x_1, x_2, \ldots, x_m)$.

Step 1.2: Encoder structure. The encoder consists of multiple fully connected layers, each using a ReLU activation function. The transformation of the $l$-th encoder layer is $h^{l} = \mathrm{ReLU}(W^{l} h^{l-1} + b^{l})$, where $W^{l}$ and $b^{l}$ are the weight matrix and bias vector of the $l$-th layer, respectively. The output of the last encoder layer is the vector $h$ mapped into the hidden space.

Step 1.3: For step $i$ of the decoder, the attention weight is defined as $\alpha_{i,j} = \mathrm{softmax}(v^{T} \tanh(W_1 h_i + W_2 h_j))$, where $v$, $W_1$, and $W_2$ are learned parameters. The attention weight reflects how much the decoder attends to each part of the input vector $h$ at the current step.

Step 1.4: Based on the attention mechanism, the input of the decoder at step $i$ is $c_i = \sum_j \alpha_{i,j} h_j$. The decoder maps $c_i$ to the output vector $x_i$ through a multi-layer network.

Step 1.5: The optimization objective of the generator network is to minimize the probability that the generated logistics information is judged fake by the discriminator, i.e., to minimize the cross-entropy loss.
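The forward pass of steps 1.1 to 1.4 can be sketched in plain Python as follows. This is a minimal illustration under stated assumptions, not the patented implementation: the text does not say how the hidden vector $h$ is divided into parts $h_j$, so the sketch assumes $h$ is split into $m$ equal chunks, that $h_i$ and $h_j$ index the same chunks, and that the softmax is taken over $j$. All dimensions, the weight scale, and the two-layer decoder are hypothetical choices, and the weights are random rather than trained.

```python
import math
import random

def relu(v):
    return [max(0.0, x) for x in v]

def matvec(W, v):
    return [sum(w * x for w, x in zip(row, v)) for row in W]

def add(a, b):
    return [x + y for x, y in zip(a, b)]

def softmax(scores):
    top = max(scores)  # subtract the maximum for numerical stability
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def rand_matrix(rows, cols, rng, scale=0.5):
    return [[rng.gauss(0.0, scale) for _ in range(cols)] for _ in range(rows)]

def encoder(z, layers):
    """Apply h^l = ReLU(W^l h^(l-1) + b^l) for each (W, b) pair."""
    h = z
    for W, b in layers:
        h = relu(add(matvec(W, h), b))
    return h

def attention_context(parts, i, v, W1, W2):
    """Return (alpha_i, c_i) with c_i = sum_j alpha_ij * h_j."""
    scores = []
    for h_j in parts:
        t = [math.tanh(a) for a in add(matvec(W1, parts[i]), matvec(W2, h_j))]
        scores.append(sum(vk * tk for vk, tk in zip(v, t)))
    alpha = softmax(scores)  # assumption: normalized over j
    c_i = [sum(alpha[j] * parts[j][k] for j in range(len(parts)))
           for k in range(len(parts[0]))]
    return alpha, c_i

rng = random.Random(0)
# Hypothetical sizes: m fields, hidden vector split into m chunks.
noise_dim, hidden_dim, m, att_dim, field_dim = 8, 12, 3, 5, 4
part_dim = hidden_dim // m

layers = [(rand_matrix(16, noise_dim, rng), [0.0] * 16),
          (rand_matrix(hidden_dim, 16, rng), [0.0] * hidden_dim)]
W1 = rand_matrix(att_dim, part_dim, rng)
W2 = rand_matrix(att_dim, part_dim, rng)
v = [rng.gauss(0.0, 0.5) for _ in range(att_dim)]
W_dec1 = rand_matrix(6, part_dim, rng)      # decoder hidden layer
W_dec2 = rand_matrix(field_dim, 6, rng)     # decoder output layer

z = [rng.gauss(0.0, 1.0) for _ in range(noise_dim)]   # step 1.1
h = encoder(z, layers)                                # step 1.2
parts = [h[k * part_dim:(k + 1) * part_dim] for k in range(m)]

alphas, fields = [], []
for i in range(m):
    alpha, c_i = attention_context(parts, i, v, W1, W2)   # steps 1.3, 1.4
    x_i = matvec(W_dec2, relu(matvec(W_dec1, c_i)))
    alphas.append(alpha)
    fields.append(x_i)

print(len(fields), [round(a, 3) for a in alphas[0]])
```

Each row of attention weights sums to 1, so every context vector $c_i$ is a convex combination of the chunks of $h$.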

Step 2: Construct a discriminator network capable of judging whether input logistics information is real or fake. The network uses a convolutional neural network to extract a feature representation of the input information and then judges, on the basis of these features, whether they belong to the distribution of real logistics information. An attention mechanism is also used in the discriminator network so that it can focus on key fields in the logistics information (such as tracking number and location), improving discrimination accuracy.

The method specifically comprises the following steps:

Step 2.1: The input of the discriminator network is logistics information comprising $m$ fields, each of which is a vector; the input is denoted $(x_1, x_2, \ldots, x_m)$. The discriminator network outputs a probability $p \in [0, 1]$, the probability that the input is judged to be real logistics information.

Step 2.2: The discriminator network uses a convolutional neural network to extract features from each input field. For the $i$-th field $x_i$, features are extracted by a convolutional layer: $f_i = \mathrm{ReLU}(\mathrm{conv}(x_i))$, where $\mathrm{conv}$ denotes the convolution operation.

Step 2.3: The attention weight is defined as $\beta_i = \mathrm{softmax}(u^{T} \tanh(W f_i + b))$, where $u$, $W$, and $b$ are learned parameters. The attention weight reflects the importance of each field to the judgment. The fused feature is then $f = \sum_i \beta_i f_i$.

Step 2.4: From the fused feature $f$, the real/fake probability is obtained through a multi-layer fully connected network: $p = \mathrm{sigmoid}(W' f + b')$, where $W'$ and $b'$ are the fully connected layer parameters. The optimization objective of the real/fake judgment is to maximize the log-likelihood of a correct judgment.
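The attention fusion of step 2.3 and the output layer of step 2.4 can be illustrated with a small numeric sketch. It assumes the per-field features $f_i$ have already been extracted, assumes the softmax runs over the fields $i$, and collapses the multi-layer fully connected network into a single layer. All numbers and parameter values below are made up for illustration only.

```python
import math

def softmax(scores):
    top = max(scores)
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def sigmoid(t):
    return 1.0 / (1.0 + math.exp(-t))

def attention_fuse(features, u, W, b):
    """beta_i = softmax_i(u^T tanh(W f_i + b)); f = sum_i beta_i f_i."""
    scores = []
    for f_i in features:
        hidden = [math.tanh(sum(w * x for w, x in zip(row, f_i)) + b_k)
                  for row, b_k in zip(W, b)]
        scores.append(sum(u_k * h_k for u_k, h_k in zip(u, hidden)))
    beta = softmax(scores)
    fused = [sum(beta[i] * features[i][k] for i in range(len(features)))
             for k in range(len(features[0]))]
    return beta, fused

def real_probability(fused, w_out, b_out):
    """p = sigmoid(W' f + b') with a single output unit."""
    return sigmoid(sum(w * x for w, x in zip(w_out, fused)) + b_out)

# Hypothetical features for three fields (e.g. tracking number, location, status).
features = [[0.2, 0.0, 1.1],
            [0.9, 0.4, 0.0],
            [0.0, 0.3, 0.5]]
W = [[0.5, -0.2, 0.1],
     [0.3, 0.8, -0.4]]
b = [0.0, 0.1]
u = [1.0, -0.5]
w_out, b_out = [0.7, -0.3, 0.4], -0.1

beta, fused = attention_fuse(features, u, W, b)
p = real_probability(fused, w_out, b_out)
print([round(x, 3) for x in beta], round(p, 3))
```

Because the weights $\beta_i$ sum to 1, the fused feature stays on the same scale as the individual field features regardless of how many fields there are.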

Step 3: Pre-train the discriminator on a real logistics data set so that it becomes familiar with the feature distribution of real logistics information and can accurately judge the authenticity of input information.

The method specifically comprises the following steps:

Step 3.1: Collect a large number of real logistics tracking numbers and logistics information to construct the training data set of the discriminator. The data set includes input fields (tracking number, location, status, etc.) and labels (real/fake).

Step 3.2: Initialize all weight matrices and bias vectors of the convolutional layers, fully connected layers, and other layers in the discriminator network. Weight matrices are initialized to small random numbers and bias vectors to 0.

Step 3.3: The training objective of the discriminator is to maximize the probability of judging real logistics information as real and fake information as fake. The loss function is the binary cross-entropy loss $L = -[y \log p + (1 - y) \log(1 - p)]$, where $y$ is the true label and $p$ is the judgment result.

Step 3.4: Update the discriminator parameters using the RMSProp adaptive-learning-rate optimization algorithm, continuously reducing the loss function by gradient descent to improve discrimination performance.

Step 3.5: Set an early-stopping strategy. When a metric of the model on the validation set (such as accuracy) stops improving, training is stopped to avoid overfitting, and the best-performing model parameters are saved.
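The binary cross-entropy loss of step 3.3 can be checked numerically with a short sketch. The clipping constant `eps` is an added safeguard against `log(0)` and is not part of the formula in the text; the sample probabilities are arbitrary.

```python
import math

def bce(y, p, eps=1e-12):
    """L = -[y log p + (1 - y) log(1 - p)] for one sample."""
    p = min(max(p, eps), 1.0 - eps)  # assumption: clip to avoid log(0)
    return -(y * math.log(p) + (1.0 - y) * math.log(1.0 - p))

def mean_bce(labels, probs):
    return sum(bce(y, p) for y, p in zip(labels, probs)) / len(labels)

confident_right = bce(1.0, 0.9)   # real sample judged real with p = 0.9
confident_wrong = bce(1.0, 0.1)   # real sample judged real with p = 0.1
fake_right = bce(0.0, 0.1)        # fake sample judged real with p = 0.1

print(round(confident_right, 4), round(confident_wrong, 4), round(fake_right, 4))
print(round(mean_bce([1.0, 0.0, 1.0], [0.9, 0.2, 0.6]), 4))
```

A correct, confident judgment gives a small loss, while a confident mistake is penalized heavily, which is what drives the discriminator toward correct real/fake judgments.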

Step 4: With the discriminator parameters fixed, train the generator to improve its ability to generate realistic information, with the goal of maximizing the probability of fooling the discriminator.

The method specifically comprises the following steps:

Step 4.1: With the discriminator parameters (convolutional and fully connected layer weights) held fixed, optimize only the generator parameters.

Step 4.2: The goal of the generator network is to generate realistic fake information that fools the discriminator. Its loss function is therefore defined as $L_G = -\log(D(G(z)))$, where $G$ denotes the generator model, $D$ denotes the discriminator model, and $z$ is the random noise input of the generator. Minimizing this loss maximizes the probability that the discriminator judges the generated information to be real.

Step 4.3: Input the random noise vector into the generator to generate fake logistics information, then input that information into the discriminator to compute the generator loss $L_G$. Based on $L_G$, update the model parameters of the encoder and decoder in the generator by gradient descent, minimizing $L_G$ to improve the ability to generate realistic fake information.
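Step 4 can be illustrated on a one-dimensional toy problem. The sketch below is an assumption-laden stand-in, not the networks described above: the discriminator is a frozen logistic function $D(x) = \mathrm{sigmoid}(w x + c)$, the generator is the linear map $G(z) = a z + b$, and plain gradient descent with a hand-derived gradient replaces backpropagation through an encoder and decoder. All constants are hypothetical.

```python
import math
import random

def sigmoid(t):
    return 1.0 / (1.0 + math.exp(-t))

# Frozen toy discriminator: these two constants are never updated in step 4.
D_W, D_C = 1.5, -3.0

def discriminator(x):
    return sigmoid(D_W * x + D_C)

def generator(z, a, b):
    return a * z + b

def generator_loss(zs, a, b):
    """Mean of L_G = -log(D(G(z))) over a batch of noise samples."""
    return sum(-math.log(discriminator(generator(z, a, b))) for z in zs) / len(zs)

def generator_step(zs, a, b, lr=0.1):
    """One gradient-descent update of the generator parameters only."""
    grad_a = grad_b = 0.0
    for z in zs:
        p = discriminator(generator(z, a, b))
        dlogit = -(1.0 - p)           # d(-log sigmoid(t)) / dt
        grad_a += dlogit * D_W * z    # chain rule through G(z) = a*z + b
        grad_b += dlogit * D_W
    n = len(zs)
    return a - lr * grad_a / n, b - lr * grad_b / n

rng = random.Random(0)
zs = [rng.uniform(-1.0, 1.0) for _ in range(64)]
a, b = 0.5, 0.0

loss_before = generator_loss(zs, a, b)
for _ in range(50):
    a, b = generator_step(zs, a, b)
loss_after = generator_loss(zs, a, b)

print(round(loss_before, 4), round(loss_after, 4))
```

The loss falls because the generator shifts its outputs toward the region the frozen discriminator scores as real, while the discriminator itself stays unchanged.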

Step 5: With the generator parameters fixed after the update, train the discriminator again using the updated generator, so that the discriminator learns from the newly generated information and further improves its ability to judge the features of real information.

The method specifically comprises the following steps:

Step 5.1: With the generator model parameters fixed, optimize only the discriminator parameters.

Step 5.2: Using the generator with the parameters optimized in step 4, input random noise to generate new fake logistics data.

Step 5.3: Combine the new fake data generated by the generator with the real logistics data to construct a new discriminator training set for retraining.

Step 5.4: Using the new training set constructed in step 5.3, which contains both real and fake data, update and optimize the parameters of all network layers of the discriminator to maximize its ability to distinguish real from fake data.

Step 6: When adversarial training is finished, the capabilities of both the generator and the discriminator have improved. At query time, a real logistics tracking number is input and a large number of fake tracking numbers are generated at the same time. All tracking numbers are submitted to the logistics system together to obtain complete information, which is then input into the trained discriminator to identify the real logistics information.

The method specifically comprises the following steps:

Step 6.1: At query time, provide the tracking number of the logistics information actually to be queried, denoted id_real.

Step 6.2: Generate N fake logistics tracking numbers using a random function, denoted {id_fake1, id_fake2, ..., id_fakeN}.

Step 6.3: Combine the N+1 real and fake tracking numbers and query the logistics information system with all of them at once to obtain the complete logistics information corresponding to each tracking number.

Step 6.4: Input the queried set of logistics information, containing both real and fake data, into the pre-trained discriminator, which outputs the probability that each piece of information is real logistics information.

Step 6.5: Compare the N+1 probabilities output by the discriminator. The information with the highest probability corresponds to the real tracking number id_real, which completes the correct logistics information query.
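The query procedure of steps 6.1 to 6.5 can be sketched as follows. Everything except the overall flow is a hypothetical stand-in: `TOY_DATABASE` and `query_system` imitate the logistics information system, `toy_discriminator` replaces the trained discriminator with a hand-written rule, the tracking number is invented, and the assumption that unknown tracking numbers return an empty record is not stated in the text.

```python
import random

def make_fake_ids(real_id, n, rng):
    """Step 6.2: n distinct random digit strings, none equal to real_id."""
    fakes = set()
    while len(fakes) < n:
        candidate = "".join(rng.choice("0123456789") for _ in range(len(real_id)))
        if candidate != real_id:
            fakes.add(candidate)
    return sorted(fakes)

# Hypothetical stand-in for the logistics information system.
TOY_DATABASE = {"7300142258": {"location": "Shenzhen", "status": "in transit"}}

def query_system(tracking_id):
    # Assumption: unknown tracking numbers yield an empty record.
    return TOY_DATABASE.get(tracking_id, {"location": None, "status": None})

def toy_discriminator(record):
    # Stand-in for the trained discriminator: returns P(record is real).
    return 0.95 if record["location"] and record["status"] else 0.05

def find_real_record(id_real, n_fake, rng):
    ids = [id_real] + make_fake_ids(id_real, n_fake, rng)
    rng.shuffle(ids)                                    # mix real and fake ids
    records = {i: query_system(i) for i in ids}         # step 6.3
    scores = {i: toy_discriminator(r) for i, r in records.items()}  # step 6.4
    best_id = max(scores, key=scores.get)               # step 6.5
    return best_id, records[best_id], scores

best_id, record, scores = find_real_record("7300142258", 20, random.Random(42))
print(best_id, record)
```

With N = 20 fake tracking numbers, 21 records are scored and the one with the highest probability is returned as the query result.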

The invention also provides a logistics information query device based on machine learning, which is any electronic device that provides a query interface, the electronic device being configured to execute the logistics information query method.

The invention also provides a logistics information query system based on machine learning, comprising a front-end subsystem, a transmission subsystem, and a back-end subsystem. The front-end subsystem receives the logistics information query request, the transmission subsystem transmits the request to the back-end subsystem, and the back-end subsystem executes the logistics information query method and returns the query result to the front-end subsystem through the transmission subsystem.

Drawings

Fig. 1 is a flowchart of the machine-learning-based logistics information query method provided by the invention.

Detailed Description

As shown in Fig. 1, an embodiment of the invention provides a logistics information query method based on machine learning, comprising the following steps:

Step 1: Design a generator network capable of generating fake logistics tracking numbers and logistics information. The generator network consists of an encoder and a decoder: the encoder takes a random noise vector as input, and the decoder outputs fake logistics information. To make the generated information more realistic, an attention mechanism is introduced in the decoder so that it can focus on different parts of the input vector when generating the information.

Step 2: Construct a discriminator network capable of judging whether input logistics information is real or fake. The network uses a convolutional neural network to extract a feature representation of the input logistics information and then judges, on the basis of these features, whether they belong to the distribution of real logistics information. An attention mechanism is also used in the discriminator network so that it can focus on key fields in the logistics information (such as tracking number and location), improving discrimination accuracy.

Step 3: Pre-train the discriminator on a real logistics data set so that it becomes familiar with the feature distribution of real logistics information and can accurately judge the authenticity of input information.

Step 4: With the discriminator parameters fixed, train the generator to improve its ability to generate realistic information, with the goal of maximizing the probability of fooling the discriminator.

Step 5: With the generator parameters fixed after the update, train the discriminator again using the updated generator, so that the discriminator learns from the newly generated information and further improves its ability to judge the features of real information.

Step 6: When adversarial training is finished, the capabilities of both the generator and the discriminator have improved. A real logistics tracking number is input and a large number of fake tracking numbers are generated at the same time. All tracking numbers are submitted to the logistics system together to obtain complete information, which is then input into the trained discriminator to identify the real logistics information.

Specifically, the process is as follows:

Step 1.1: The input of the generator network is a random noise vector $z$, each element of which follows a Gaussian or uniform distribution. To increase randomness, different random seeds can be set to produce different noise inputs. The input noise of the generator network represents a latent encoding of the query intent. The output of the generator network is fake logistics information comprising $m$ fields, each represented by a vector $x_i$; that is, the generator network outputs a vector sequence $(x_1, x_2, \ldots, x_m)$. The fields of the logistics information may cover several dimensions, such as tracking number, location, status, and progress. Each field is vectorized so that it can be processed by a neural network; the vector dimension can be set according to the complexity of the field information, and the number of fields $m$ is a configurable hyperparameter.

step 1.2: encoder structural design the encoder consists of multiple fully connected layers, each layer using a ReLU activation function, the encoder aims to map random noise to potential spatial features, the transform formula of the first layer encoder is as follows: h is a ^l = ReLU(W ^l h ^(l-1) +b ^l ) Wherein W is ^l And b ^l The method is characterized in that the method comprises the steps of respectively obtaining a weight matrix and an offset vector of a first layer, wherein the weight matrix is used for training and learning an abstract mapping mode, the number of network layers and the number of nodes are adjustable parameters, the last layer of output of an encoder is a vector h mapped to a hidden space, the hidden space vector h reflects inherent structural information mapped to real logistics data distribution, and the correct learning of a mapping relation is a key for successful generation of a network;

Step 1.3: For step $i$ of the decoder, the attention weight is defined as $\alpha_{i,j} = \mathrm{softmax}(v^{T} \tanh(W_1 h_i + W_2 h_j))$, where $v$, $W_1$, and $W_2$ are learned parameters. The softmax function in the attention formula normalizes the weights of the different positions to the range 0 to 1. The parameters $v$, $W_1$, and $W_2$ are updated by gradient descent so that attention is automatically directed to important features. The attention weight reflects how much the decoder attends to each part of the input vector $h$ at the current step. By dynamically adjusting the attention weights, the decoder can adaptively combine the feature representations in different dimensions of the input vector and thus capture the key information of the logistics information. This ability to focus on important features is the core advantage of the attention mechanism.

Step 1.4: Based on the attention mechanism, the input of the decoder at step $i$ is $c_i = \sum_j \alpha_{i,j} h_j$, and the decoder maps $c_i$ to the output vector $x_i$ through a multi-layer network. Here $c_i$ is the input feature representation fused according to the attention weights; it contains the features of the input vector $h$ that matter most at the current step. The decoder may contain a multi-layer fully connected network that applies a complex nonlinear transformation to the input feature $c_i$. The parameters of each layer are updated by backpropagation to reduce the loss in reconstructing the logistics information fields, and the number of decoder layers is an adjustable parameter. The output vector $x_i$ is the $i$-th field of the generated fake logistics data, so the complete fake information is generated as a vector sequence.

Step 1.5: The optimization objective of the generator network is to minimize the probability that the generated logistics information is judged fake by the discriminator, i.e., to minimize the cross-entropy loss. By reducing that probability, the objective function forces the generator network to produce more realistic information; the cross-entropy serves as a measure of the difference between two probability distributions. During training, the generator network is continuously optimized so that the distribution of the generated information approaches that of the real data, narrowing the distance between the two distributions and thereby fooling the discriminator, while the discriminator in turn keeps improving its accuracy in telling real from fake information.

the method specifically comprises the following steps:

Step 2.1: The input of the discriminator network is logistics information comprising $m$ fields, each of which is a vector. The input may come from the real data set or be fake information produced by the generator network. Here $m$ is the number of fields of the logistics information, which may cover several information dimensions such as order number, location, status, and freight. Each field is first converted by vectorization into a fixed-dimension vector, and the input is denoted $(x_1, x_2, \ldots, x_m)$. Vectorization methods include one-hot encoding, word embedding, and the like. The discriminator network outputs a probability $p \in [0, 1]$, the probability that the input is judged to be real logistics information. The interval $[0, 1]$ covers the full range of possibilities, and normalizing the output to this range keeps subsequent calculations stable and protects them from outliers.

Step 2.2: The discriminator network uses a convolutional neural network to extract features from each field. For the $i$-th field $x_i$, features are extracted by a convolutional layer: $f_i = \mathrm{ReLU}(\mathrm{conv}(x_i))$, where $\mathrm{conv}$ denotes the convolution operation. A convolutional neural network can efficiently and automatically extract local feature patterns from the input data, which works well for image-like or text-like inputs. The convolutional layer slides a convolution kernel over the input data to compute a feature map, and the activation function adds nonlinearity, so hierarchical features of different granularities can be extracted; a pooling layer reduces the feature dimension. The whole convolutional network is trained end to end by backpropagation and automatically learns the filters required for feature extraction, without manual feature engineering. An independent convolutional network is applied to the $i$-th field to obtain its feature representation $f_i$, and parameters such as the size and number of convolution kernels can be tuned for the best result.

Step 2.3: The attention weight is defined as $\beta_i = \mathrm{softmax}(u^{T} \tanh(W f_i + b))$, where $u$, $W$, and $b$ are learned parameters. The tanh activation in the formula adds nonlinearity to the expression, which helps in learning more complex patterns. The attention mechanism evaluates the contribution of each input field to the final judgment and dynamically adjusts the weighting accordingly. The parameters $u$, $W$, and $b$ are trainable and converge to a better weighting scheme over many iterations. The attention weight reflects the importance of each field to the judgment. The fused feature is then $f = \sum_i \beta_i f_i$: the feature representations of the fields are combined in a weighted sum whose weights are the dynamic attention scores, so feature information from all dimensions is fused for the real/fake judgment. Focusing the analysis on the key parts of the information improves accuracy.

Step 2.4: From the fused feature $f$, the real/fake probability is obtained through a multi-layer fully connected network: $p = \mathrm{sigmoid}(W' f + b')$, where $W'$ and $b'$ are the fully connected layer parameters. The optimization objective of the real/fake judgment is to maximize the log-likelihood of a correct judgment.
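The vectorization mentioned in step 2.1 and the feature extraction $f_i = \mathrm{ReLU}(\mathrm{conv}(x_i))$ of step 2.2 can be made concrete with a tiny example. It is a sketch under assumptions: the status vocabulary and the kernel values are invented, a single hand-set kernel replaces learned filters, and, as is common in deep learning libraries, the "convolution" is computed as a sliding dot product without flipping the kernel.

```python
def one_hot(value, vocabulary):
    """Encode a categorical field value as a one-hot vector."""
    return [1.0 if value == item else 0.0 for item in vocabulary]

def conv1d_valid(x, kernel, bias=0.0):
    """Slide the kernel over x with no padding and stride 1."""
    k = len(kernel)
    return [sum(kernel[t] * x[s + t] for t in range(k)) + bias
            for s in range(len(x) - k + 1)]

def relu(v):
    return [max(0.0, a) for a in v]

# Hypothetical vocabulary for the "status" field.
STATUSES = ["collected", "in_transit", "out_for_delivery", "delivered"]

x_status = one_hot("in_transit", STATUSES)   # [0.0, 1.0, 0.0, 0.0]
kernel = [1.0, -1.0]                         # made-up filter, not a learned one
f_status = relu(conv1d_valid(x_status, kernel))

print(x_status)
print(f_status)   # [0.0, 1.0, 0.0]
```

The raw sliding dot products are -1, 1, and 0; ReLU zeroes the negative response, leaving a feature map that fires only where the kernel pattern matches the input.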

the method specifically comprises the following steps:

Step 3.1: Collect a large number of real logistics tracking numbers and the associated logistics information to construct the training data set of the discriminator. The data set should include multiple input fields (tracking number, location, status, etc.) and a label for each sample indicating whether the sample is real or fake. This data set serves as the training data of the discriminator network and helps the model learn to distinguish real from fake logistics information.

Step 3.2: Initialize all weight matrices and bias vectors of the convolutional layers, fully connected layers, and other layers in the discriminator network. Weight matrices are initialized to small random numbers, which gives the model some randomness at the start, and bias vectors are initialized to 0. This step puts the network in its initial state so that training can begin. Weight initialization methods such as Xavier initialization may also be used to better suit different network structures and activation functions.

Step 3.3: The training objective of the discriminator is to maximize the probability of judging real logistics information as real and fake information as fake. The loss function is the binary cross-entropy loss $L = -[y \log p + (1 - y) \log(1 - p)]$, where $y$ is the true label and $p$ is the judgment result, i.e., the probability that the logistics information is real. The goal is to maximize the probability that the discriminator correctly judges real information as real and fake information as fake, and the loss function is reduced by continuously adjusting the network parameters.

Step 3.4: Update the discriminator parameters using the RMSProp adaptive-learning-rate optimization algorithm, continuously reducing the loss function by gradient descent to improve discrimination performance. The algorithm repeatedly adjusts the weight matrices and bias vectors to minimize the loss function, so the model gradually becomes better at distinguishing real from fake information. An adaptive-learning-rate algorithm adjusts the learning rate of each parameter according to that parameter's gradient history, which adapts better to how different parameters change, speeds up training, limits the magnitude of learning-rate adjustments, and avoids a learning rate that is too large or too small.

Step 3.5: Set an early-stopping strategy. Training is stopped when a metric of the model on the validation set (such as accuracy) stops improving, which avoids overfitting, and the best-performing model parameters are saved. Concretely, the performance metric on the validation set is monitored, and training stops if it has not improved for N consecutive epochs (training rounds). This preserves the best model parameters, prevents performance from degrading after too many training rounds, and helps the model generalize. A suitable value of N must be chosen, usually according to the actual situation and the size of the validation set: a small N may stop training too early, while a large N may delay stopping, so N should be tuned and validated experimentally.
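The RMSProp update of step 3.4 and the patience-based early stopping of step 3.5 can each be written in a few lines. The sketch below uses the common form of RMSProp (a running average of squared gradients); the text does not give the decay rate, learning rate, or patience, so those values, the quadratic test function, and the validation accuracies are all illustrative assumptions.

```python
import math

def rmsprop_step(param, grad, cache, lr=0.05, decay=0.9, eps=1e-8):
    """One RMSProp update for a single scalar parameter."""
    cache = decay * cache + (1.0 - decay) * grad * grad
    param = param - lr * grad / (math.sqrt(cache) + eps)
    return param, cache

def early_stopping_epoch(val_scores, patience):
    """Return (best_epoch, stop_epoch) for a list of validation scores."""
    best_score, best_epoch, waited = float("-inf"), -1, 0
    for epoch, score in enumerate(val_scores):
        if score > best_score:
            best_score, best_epoch, waited = score, epoch, 0
        else:
            waited += 1
            if waited >= patience:      # N epochs without improvement
                return best_epoch, epoch
    return best_epoch, len(val_scores) - 1

# Toy objective f(w) = (w - 3)^2 with gradient 2 * (w - 3).
w, cache = 0.0, 0.0
for _ in range(500):
    w, cache = rmsprop_step(w, 2.0 * (w - 3.0), cache)

# Made-up validation accuracies per epoch.
val_acc = [0.70, 0.75, 0.78, 0.78, 0.77, 0.78, 0.76]
best_epoch, stop_epoch = early_stopping_epoch(val_acc, patience=3)

print(round(w, 2), best_epoch, stop_epoch)
```

With a patience of 3, the best accuracy is reached at epoch 2 and training stops at epoch 5; the parameters saved would be those from epoch 2.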

the method specifically comprises the following steps:

Step 4.1: With the discriminator parameters (convolutional and fully connected layer weights) held fixed, optimize only the generator parameters. Freezing the discriminator ensures that it does not become any better at separating real information from generated fake information during this phase, which forces the generator to improve its ability to produce realistic fake information. When optimizing the generator, the learning rate and the optimization algorithm usually need to be tuned so that the generator learns effectively. This is a key step in training generative adversarial networks (GANs), and the generator's performance improves through repeated iterative optimization.

Step 4.2: The goal of the generator network is to generate realistic fake information that fools the discriminator. Its loss function is therefore defined as $L_G = -\log(D(G(z)))$, where $G$ denotes the generator model, $D$ denotes the discriminator model, and $z$ is the random noise input of the generator. Minimizing this loss maximizes the probability that the discriminator judges the generated information to be real. The generator therefore aims to make $L_G$ as small as possible, which pushes it to produce more realistic fake information that fools the discriminator more easily. The training of the generator and the training of the discriminator oppose each other, and together they drive the model forward.

Step 4.3: Input random noise vectors into the generator to generate fake logistics information, then input that information into the discriminator to compute the generator loss $L_G$. Based on $L_G$, update the parameters of the encoder and decoder in the generator by gradient descent, minimizing $L_G$ to improve the ability to generate realistic fake information. In generator training, gradients are normally computed by backpropagation and the parameters are updated according to the gradient direction. The process is repeated until the generator produces fake information realistic enough that the discriminator has difficulty recognizing it as fake.

the method specifically comprises the following steps:

step 5.1: with the model parameters of the generator held fixed, only the parameters of the discriminator are optimized; the aim of this step is to further improve the discriminator so that it better distinguishes real data from generated false data;

when optimizing the parameters of the discriminator, the adaptive-learning-rate optimization algorithm used in the previous step is generally retained so that the network parameters are adjusted effectively; the performance of the discriminator directly affects the training process of the generator;

step 5.2: using the generator whose parameters were optimized in step 4, random noise is input to generate new false logistics data. In this step the generator is used only for inference: its parameters remain unchanged because only the discriminator is being trained. The goal is to produce diverse false data resembling real data, which serves both as training material for the discriminator and as a test of its performance;

step 5.3: the new false data generated by the generator is combined with real logistics data to construct a new training set, containing both real and generated samples, on which the discriminator is retrained;

step 5.4: the new training set of real and false data constructed in step 5.3 is used to update all network layer parameters of the discriminator, maximizing its ability to tell real data from false data. The cycle continues until the generator and the discriminator reach the required performance level, which achieves the training objective of the generative adversarial network: the generator produces realistic false data, and the discriminator accurately judges whether data is real or false. This is a complex but effective strategy for training a generative adversarial network;
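Steps 5.1 to 5.4 can be sketched in the same hedged way. In the Python example below, the fixed scalar generator, the logistic discriminator, the Gaussian stand-in for real logistics data, and all function names and numeric values are assumptions made for illustration; the convolutional discriminator of the invention is not reproduced.

```python
import math
import random


def sigmoid(t):
    # Numerically stable logistic function.
    if t >= 0:
        return 1.0 / (1.0 + math.exp(-t))
    e = math.exp(t)
    return e / (1.0 + e)


def build_training_set(real_samples, g_params, n_fake, rng):
    # Steps 5.2 and 5.3: generate false samples with the fixed generator
    # G(z) = a*z + b and mix them with real samples (label 1 = real, 0 = false).
    a, b = g_params
    fake_samples = [a * rng.gauss(0.0, 1.0) + b for _ in range(n_fake)]
    data = [(x, 1.0) for x in real_samples] + [(x, 0.0) for x in fake_samples]
    rng.shuffle(data)
    return data


def discriminator_step(data, d_params, lr=0.1):
    # Step 5.4: one gradient-descent update of D(x) = sigmoid(w*x + c)
    # on the binary cross-entropy loss; the generator is not touched.
    w, c = d_params
    grad_w = 0.0
    grad_c = 0.0
    for x, y in data:
        p = sigmoid(w * x + c)
        grad_w += (p - y) * x
        grad_c += (p - y)
    n = len(data)
    return (w - lr * grad_w / n, c - lr * grad_c / n)


def accuracy(data, d_params):
    # Fraction of samples whose real/false label is predicted correctly.
    w, c = d_params
    hits = sum((sigmoid(w * x + c) >= 0.5) == (y == 1.0) for x, y in data)
    return hits / len(data)


rng = random.Random(0)
real_samples = [rng.gauss(2.0, 0.5) for _ in range(200)]  # stand-in for real data
g_params = (0.5, -1.0)   # generator parameters, held fixed (step 5.1)
d_params = (0.0, 0.0)    # untrained discriminator

train_set = build_training_set(real_samples, g_params, n_fake=200, rng=rng)
acc_before = accuracy(train_set, d_params)
for _ in range(200):
    d_params = discriminator_step(train_set, d_params)
acc_after = accuracy(train_set, d_params)
```

In a full training loop this discriminator phase would alternate with the generator phase of step 4; here only the discriminator phase is shown.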

the method specifically comprises the following steps:

step 6.1: at query time, the waybill number of the logistics information actually to be queried is provided and recorded as id_real; this waybill number represents the real query scenario and is used to verify the accuracy and performance of the system;

step 6.2: N false logistics waybill numbers, denoted {id_fake1, id_fake2, ..., id_fakeN}, are generated using a random function; these false numbers are added to the query to simulate potential fraud and to test the robustness and security of the system;

step 6.3: the N+1 real and false waybill numbers are combined and submitted to the logistics information system together, and the complete logistics information corresponding to each waybill number is obtained, including fields such as location and status. This step follows the actual query flow and ensures that the system can process both real and false data and return the corresponding logistics information; the integrity and consistency of the returned information should be ensured to support subsequent processing and judgment;

step 6.4: the set of logistics information obtained by the query, containing both real and false data, is input to the pre-trained discriminator, which outputs for each piece of information the probability that it is real logistics information;

step 6.5: the N+1 probabilities output by the discriminator are compared, and the information with the highest probability corresponds to the real waybill number id_real, which completes the correct logistics information query. In practical applications, a threshold may be set and the probability output by the discriminator compared against it to decide whether to accept the query result: if the probability is higher than the threshold, the result is considered credible; otherwise it is treated as potentially forged information.
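The query flow of steps 6.1 to 6.5 can be sketched as follows. This is an illustrative Python example only: `MOCK_DB`, `mock_lookup`, `mock_score`, the waybill number format, and the threshold value are hypothetical stand-ins for the logistics information system and the pre-trained discriminator, neither of which is specified at this level of detail in the text.

```python
import random


def make_fake_ids(n, rng, exclude, length=12):
    # Step 6.2: generate n distinct random waybill numbers (digits only).
    fake_ids = []
    while len(fake_ids) < n:
        candidate = "".join(rng.choice("0123456789") for _ in range(length))
        if candidate not in exclude and candidate not in fake_ids:
            fake_ids.append(candidate)
    return fake_ids


def query_with_decoys(id_real, n_fake, lookup, score, rng, threshold=0.5):
    # Step 6.3: combine the real number with N false numbers and query them all.
    ids = [id_real] + make_fake_ids(n_fake, rng, exclude={id_real})
    rng.shuffle(ids)
    records = {waybill_id: lookup(waybill_id) for waybill_id in ids}
    # Step 6.4: score every returned record with the discriminator.
    probs = {waybill_id: score(record) for waybill_id, record in records.items()}
    # Step 6.5: keep the record with the highest probability, then apply the threshold.
    best_id = max(probs, key=probs.get)
    accepted = probs[best_id] > threshold
    return best_id, records[best_id], probs[best_id], accepted


# Hypothetical logistics information system holding a single real record.
MOCK_DB = {"SF1234567890": {"location": "Shenzhen", "status": "in transit"}}


def mock_lookup(waybill_id):
    # Unknown waybill numbers return a record with empty fields.
    return MOCK_DB.get(waybill_id, {"location": "", "status": ""})


def mock_score(record):
    # Stand-in for the pre-trained discriminator: fraction of non-empty fields.
    if not record:
        return 0.0
    return sum(1 for value in record.values() if value) / len(record)


rng = random.Random(0)
best_id, record, prob, accepted = query_with_decoys(
    "SF1234567890", n_fake=5, lookup=mock_lookup, score=mock_score, rng=rng
)
```

Passing `lookup` and `score` as arguments keeps the selection logic independent of the particular logistics information system and discriminator that are plugged in.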

The beneficial effects of the invention are as follows. By combining a generative adversarial network with an attention mechanism, logistics information can be queried more accurately. In addition, the generative adversarial network can learn the internal mapping characteristics of the logistics information system from unlabeled data, so the query system generalizes better to new queries and can identify and handle previously unseen cases. The generative adversarial network and the attention mechanism together help the system focus on key fields in the logistics information, so that it identifies and extracts critical information more quickly and query efficiency improves.

Claims

1. A logistics information query method based on machine learning, characterized by comprising the following steps:

step 1: designing a generator network capable of generating false logistics waybill numbers and logistics information, wherein the generator network consists of an encoder and a decoder, the encoder takes a random noise vector as input, and the decoder outputs the false logistics information; to make the generated information more realistic, an attention mechanism is introduced in the decoder, enabling it to focus on different parts of the input vector when generating the information;

step 2: constructing a discrimination network capable of judging whether input logistics information is real or false; the network uses a convolutional neural network to extract a feature representation of the input information and then judges, on the basis of these features, whether they belong to the distribution of real logistics information; an attention mechanism is also used in the discrimination network so that it can focus on key fields in the logistics information, improving discrimination accuracy;

step 3: pre-training the discriminator on a real logistics data set, so that the discriminator becomes familiar with the feature distribution of real logistics information and can accurately judge the authenticity of input information, specifically:

step 3.1: collecting a large number of real logistics waybill numbers and logistics information, and constructing a training data set for the discriminator; the data set comprises input fields and real/false labels;

step 3.2: initializing all weight matrices and bias vectors of the convolution layers and fully connected layers in the discrimination network, wherein the weight matrices are initialized to small random numbers and the bias vectors are initialized to 0;

step 3.5: setting an early-stopping strategy for training; when the metric of the model on the validation set has not improved for N consecutive epochs, training is stopped to avoid overfitting, and the best-performing model parameters are saved;

step 4: with the parameters of the discriminator fixed, training the generator to improve its ability to generate realistic information, the goal being to maximize the probability of fooling the discriminator, specifically:

step 4.1: under the condition that the parameters of the discriminator are unchanged, only the parameters of the generator are optimized;

step 4.3: inputting the random noise vector into a generator to generate false logistics information, and then inputting the false logistics information into a discriminator to calculate a generator loss function L_G; updating model parameters of an encoder and a decoder in a generator through gradient descent based on the L_G, minimizing the L_G, and optimizing the capability of generating realistic false information;

2. The method according to claim 1, wherein the step 1 is specifically:

step 1.1: the input to the generator network is a random noise vector z, each element of which follows a Gaussian distribution; the output of the generator network is false logistics information comprising m fields, each field represented by a vector $x_i$, i.e., the generator network outputs a sequence of vectors $(x_1, x_2, \ldots, x_m)$;

step 1.2: encoder structure design: the encoder consists of a plurality of fully connected layers, each using a ReLU activation function; the transformation of the l-th encoder layer is $h^{l} = \mathrm{ReLU}(W^{l} h^{l-1} + b^{l})$, where $W^{l}$ and $b^{l}$ are respectively the weight matrix and the bias vector of the l-th layer; the output of the last encoder layer is the vector h mapped into the hidden space;

step 1.3: for step i of the decoder, the attention weights are defined as $\alpha_{i,j} = \mathrm{softmax}(v^{T} \tanh(W_1 h_i + W_2 h_j))$, where $v$, $W_1$ and $W_2$ are learned parameters; the attention weights reflect how much attention the decoder pays to different parts of the input vector h at the current step;

3. The method according to claim 1, wherein the step 2 is specifically:

step 2.1: the input of the discrimination network is logistics information comprising m fields, each field being a vector; the input is denoted $(x_1, x_2, \ldots, x_m)$, and the discrimination network outputs a probability $p \in [0, 1]$ representing the probability that the input information is judged to be real logistics information;

step 2.2: the discrimination network uses a convolutional neural network to extract features from each field; for the i-th field $x_i$, features are extracted through a convolution layer: $f_i = \mathrm{ReLU}(\mathrm{conv}(x_i))$, where conv denotes a convolution operation;

4. The method according to claim 1, wherein the step 5 is specifically:

5. The method according to claim 1, wherein the step 6 is specifically:

step 6.3: combining the N+1 real and false waybill numbers and submitting them to the logistics information system together to obtain the complete logistics information corresponding to each waybill number, comprising a plurality of fields including location and status;

step 6.5: comparing the N+1 probabilities output by the discriminator, wherein the information with the highest probability corresponds to the real waybill number id_real, thereby completing the correct logistics information query.

6. A logistics information query device based on machine learning, characterized in that the device is any electronic equipment providing a query interface, said electronic equipment being adapted to perform the steps of the method according to any one of claims 1-5.

7. A logistics information query system based on machine learning, characterized in that the system comprises a front subsystem, a transmission subsystem and a rear subsystem, wherein the front subsystem is used for receiving logistics information query requests, the transmission subsystem transmits the logistics information query requests to the rear subsystem, and the rear subsystem is used for executing the steps of the method according to any one of claims 1-5 to obtain query results and transmitting the query results back to the front subsystem through the transmission subsystem.