Machine translation automatic evaluation method based on a deep cross network
Technical Field
The invention relates to the technical field of natural language processing, in particular to a machine translation automatic evaluation method based on a deep cross network.
Background
In the research and application of machine translation, automatic evaluation plays an important role. With the rapid development of pre-trained language models in recent years, neural automatic evaluation methods extract deep representations of the machine translation and the manual reference translation with a pre-trained language model, and build a neural network that compares these deep representations to predict the quality of the machine translation.
Neural automatic evaluation methods are generally built on a two-tower architecture: the deep representation of the machine translation is extracted independently to form a machine translation tower, and the deep representation of the manual reference translation is extracted independently to form a reference translation tower. The two towers then interact only in simple ways, such as concatenating the two feature vectors, or applying element-wise multiplication and absolute-value subtraction at the same positions of the two feature vectors. These interaction modes do not directly and explicitly model the relationship between elements at different positions of the two towers' features, that is, the feature cross operation. The low-order and high-order combination features produced by cross operations play an important role in the automatic evaluation of machine translation. The invention therefore provides a machine translation automatic evaluation method based on a deep cross network, which uses the deep cross network to perform deep interaction between the machine translation features and the manual reference translation features, capturing the semantic relationship between them and thereby improving the effect of automatic machine translation evaluation.
Disclosure of Invention
The invention provides a machine translation automatic evaluation method based on a deep cross network, which improves the correlation between automatic machine translation evaluation and human evaluation.
The technical scheme adopted by the invention is as follows: a machine translation automatic evaluation method based on a deep cross network comprises the following steps:
step S1, a training set is obtained, normalization processing is carried out on the training set, and the training set after normalization processing is obtained; the training set is composed of a plurality of different samples, and each sample comprises a machine translation, a manual reference translation and a human evaluation score of the machine translation;
S2, extracting the sentence-level machine translation quality feature vector in the independent characterization mode; the machine translation and the manual reference translation in each sample of the normalized training set are input separately into a cross-language pre-trained model, which outputs the machine translation sub-word level feature vector and the manual reference translation sub-word level feature vector in the independent characterization mode; an external attention mechanism makes these two sub-word level feature vectors interact, yielding the machine translation interaction feature vector and the manual reference translation interaction feature vector in the independent characterization mode; the two interaction feature vectors are connected, an average pooling operation is carried out, and the sentence-level machine translation quality feature vector in the independent characterization mode is output;
S3, extracting the sentence-level machine translation quality feature vector in the unified characterization mode; the machine translation and the manual reference translation in each sample of the normalized training set are connected as strings to obtain the translation joint string, which is input into the cross-language pre-trained model to output the sub-word level feature vector in the unified characterization mode; a self-attention mechanism makes this sub-word level feature vector interact with itself, yielding the interaction feature vector in the unified characterization mode; an average pooling operation is carried out on the interaction feature vector, and the sentence-level machine translation quality feature vector in the unified characterization mode is output;
S4, extracting a machine translation quality cross feature vector; splicing the sentence-level machine translation quality feature vector in the independent characterization mode in the step S2 and the sentence-level machine translation quality feature vector in the unified characterization mode in the step S3, inputting the spliced sentence-level machine translation quality feature vector into a deep cross network containing 4 stacked cross layers, and outputting the machine translation quality cross feature vector;
S5, predicting the quality of the machine translation; the machine translation quality cross feature vector of step S4 is input into a three-layer feed-forward neural network, which outputs the machine translation quality score;
S6, training the machine translation automatic evaluation model based on the deep cross network; according to the machine translation quality score output in step S5 and the human evaluation scores of the machine translations in the normalized training set of step S1, the parameters of the model are trained by minimizing the mean square error loss on the normalized training set, yielding the trained machine translation automatic evaluation model based on the deep cross network.
Further, in step S1, the training set is composed of a plurality of different samples, and each sample is specifically:
a training sample d = {(h, r), y}, where d denotes the training sample, h the machine translation, r the manual reference translation, and y the human evaluation score of the machine translation h, a real value between 0 and 1.
Further, in step S2, a sentence-level machine translation quality feature vector in an independent characterization mode is extracted, which specifically includes:
Step S21, the machine translation h and the manual reference translation r are input separately into the cross-language pre-trained model XLM-RoBERTa, which segments each of them into sub-words with the SentencePiece algorithm, yielding sub-word sequences of m and n sub-words respectively:
h = (h_1, h_2, …, h_m), r = (r_1, r_2, …, r_n);
wherein m and n denote the numbers of sub-words in the machine translation and the manual reference translation after SentencePiece segmentation; h_1, h_2, …, h_m denote the 1st, 2nd, …, m-th sub-words of the segmented machine translation; r_1, r_2, …, r_n denote the 1st, 2nd, …, n-th sub-words of the segmented manual reference translation;
Step S22, from the sub-words and their positions in the sentence, the cross-language pre-trained model XLM-RoBERTa outputs the machine translation sub-word level feature vector and the manual reference translation sub-word level feature vector in the independent characterization mode, as shown in formula (1) and formula (2):
v_h = XLM-RoBERTa(h)  (1);
v_r = XLM-RoBERTa(r)  (2);
wherein v_h and v_r denote the machine translation sub-word level feature vector and the manual reference translation sub-word level feature vector in the independent characterization mode, and XLM-RoBERTa(·) denotes the cross-language pre-trained model XLM-RoBERTa;
Step S23, an external attention mechanism makes the machine translation sub-word level feature vector and the manual reference translation sub-word level feature vector in the independent characterization mode interact, yielding the machine translation interaction feature vector and the manual reference translation interaction feature vector in the independent characterization mode, as shown in formula (3) and formula (4):
v'_h = MultiHead(v_h, v_r, v_r)  (3);
v'_r = MultiHead(v_r, v_h, v_h)  (4);
wherein v'_h is the machine translation interaction feature vector in the independent characterization mode and v'_r is the manual reference translation interaction feature vector in the independent characterization mode; MultiHead(·) denotes the multi-head attention function, which takes the three parameters query, key and value and becomes an external attention mechanism when the query differs from the key and value;
Step S24, the machine translation interaction feature vector and the manual reference translation interaction feature vector in the independent characterization mode are connected, an average pooling operation is carried out, and the sentence-level machine translation quality feature vector in the independent characterization mode is output, as shown in formula (5):
vs_hr = AvgPooling([v'_h ; v'_r])  (5);
wherein vs_hr denotes the sentence-level machine translation quality feature vector in the independent characterization mode, [· ; ·] denotes the connection operation, and AvgPooling(·) denotes the average pooling operation.
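To make the interaction of formulas (3)-(5) concrete, the following sketch implements a single-head scaled dot-product version of the external attention followed by connection and average pooling in NumPy. The single head, the toy 8-dimensional features, and the random inputs are simplifying assumptions for illustration; the method itself uses multi-head attention over 1024-dimensional XLM-RoBERTa features.

```python
import numpy as np

def softmax(x, axis=-1):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def attention(query, key, value):
    # Scaled dot-product attention; when query != key/value this is the
    # "external" (cross) attention used between the two translations.
    d = query.shape[-1]
    scores = softmax(query @ key.T / np.sqrt(d))
    return scores @ value

# Toy sub-word level feature vectors (m=3 machine-translation sub-words,
# n=4 reference sub-words, 8-dimensional features instead of 1024).
rng = np.random.default_rng(0)
v_h = rng.normal(size=(3, 8))   # machine translation features
v_r = rng.normal(size=(4, 8))   # manual reference translation features

v_h_i = attention(v_h, v_r, v_r)   # machine translation attends to reference
v_r_i = attention(v_r, v_h, v_h)   # reference attends to machine translation

# Connect along the sequence axis, then average-pool to a sentence vector.
vs_hr = np.concatenate([v_h_i, v_r_i], axis=0).mean(axis=0)
print(vs_hr.shape)  # (8,)
```

With the query taken from one translation and the key/value from the other, each sub-word of the machine translation is re-expressed as a mixture of reference sub-word features, which is the cross-sentence interaction the external attention is meant to capture.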
Further, in step S3, a sentence-level machine translation quality feature vector in a unified characterization mode is extracted, which specifically includes:
Step S31, the machine translation h and the manual reference translation r are concatenated as strings to obtain the translation joint string, as shown in formula (6):
hr = </s> h </s></s> r </s>  (6);
wherein hr is the translation joint string and "</s>" denotes the start and stop token of the string;
Step S32, the translation joint string is input into the cross-language pre-trained model XLM-RoBERTa, which segments it into sub-words with the SentencePiece algorithm, yielding a sub-word sequence of p sub-words, as shown in formula (7):
hr = (hr_1, hr_2, …, hr_p)  (7);
wherein p denotes the number of sub-words in the translation joint string after SentencePiece segmentation; hr_1, hr_2, …, hr_p denote the 1st, 2nd, …, p-th sub-words of the segmented translation joint string;
Step S33, from the sub-words and their positions in the sentence, the cross-language pre-trained model XLM-RoBERTa outputs the sub-word level feature vector in the unified characterization mode, as shown in formula (8):
v_hr = XLM-RoBERTa(hr)  (8);
wherein v_hr denotes the sub-word level feature vector in the unified characterization mode;
Step S34, a self-attention mechanism makes the sub-word level feature vector in the unified characterization mode interact with itself, yielding the interaction feature vector in the unified characterization mode, as shown in formula (9):
v'_hr = MultiHead(v_hr, v_hr, v_hr)  (9);
wherein v'_hr denotes the interaction feature vector in the unified characterization mode; MultiHead(·) denotes the multi-head attention function, which takes the three parameters query, key and value and becomes a self-attention mechanism when the query is identical to the key and value;
Step S35, an average pooling operation is carried out on the interaction feature vector in the unified characterization mode, and the sentence-level machine translation quality feature vector in the unified characterization mode is output, as shown in formula (10):
vu_hr = AvgPooling(v'_hr)  (10);
wherein vu_hr denotes the sentence-level machine translation quality feature vector in the unified characterization mode.
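A matching sketch for the unified characterization mode: when query, key, and value are all the joint-string features, the same attention function degenerates to self-attention (formula (9)), followed by average pooling (formula (10)). The single head, toy dimensions, and random inputs are again simplifying assumptions.

```python
import numpy as np

def softmax(x, axis=-1):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def self_attention(v):
    # query = key = value: the attention function becomes self-attention
    d = v.shape[-1]
    return softmax(v @ v.T / np.sqrt(d)) @ v

# Toy joint-string features: p=7 sub-words, 8-dimensional (1024 in the method).
rng = np.random.default_rng(1)
v_hr = rng.normal(size=(7, 8))

v_hr_i = self_attention(v_hr)   # interaction features, unified mode
vu_hr = v_hr_i.mean(axis=0)     # average pooling -> sentence-level vector
print(vu_hr.shape)  # (8,)
```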
Further, in step S4, a machine translation quality cross feature vector is extracted, which specifically includes:
Step S41, the sentence-level machine translation quality feature vector in the independent characterization mode of step S2 and the sentence-level machine translation quality feature vector in the unified characterization mode of step S3 are spliced, as shown in formula (11):
x_0 = [vs_hr ; vu_hr]  (11);
wherein x_0 denotes the machine translation spliced feature vector and [· ; ·] denotes the vector splicing operation;
Step S42, the machine translation spliced feature vector is input into a deep cross network containing 4 stacked cross layers, which outputs the machine translation quality cross feature vector, as shown in formula (12), formula (13), formula (14) and formula (15):
x_1 = x_0 ⊙ (W_1 x_0 + b_1) + x_0  (12);
x_2 = x_0 ⊙ (W_2 x_1 + b_2) + x_1  (13);
x_3 = x_0 ⊙ (W_3 x_2 + b_3) + x_2  (14);
x_4 = x_0 ⊙ (W_4 x_3 + b_4) + x_3  (15);
wherein x_1, x_2, x_3 and x_4 are the cross feature vectors output by the 1st, 2nd, 3rd and 4th stacked cross layers respectively, and x_4 is taken as the machine translation quality cross feature vector; the symbol ⊙ denotes the Hadamard product, and W_1, W_2, W_3, W_4 and b_1, b_2, b_3, b_4 are the linear weight parameters and biases of the 1st, 2nd, 3rd and 4th stacked cross layers respectively. It should be noted that the deep cross network may comprise any number of stacked cross layers; the invention uses 4 stacked cross layers, a setting determined through extensive experiments.
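The stacked cross layers described above can be sketched as follows. The layer update x_l = x_0 ⊙ (W_l·x_(l-1) + b_l) + x_(l-1) matches the Hadamard-product form with per-layer linear weights and biases; the toy 16-dimensional input and random parameters are assumptions for illustration only.

```python
import numpy as np

def cross_layer(x0, x, W, b):
    # One cross layer: x_l = x_0 ⊙ (W_l x_{l-1} + b_l) + x_{l-1}
    return x0 * (W @ x + b) + x

rng = np.random.default_rng(2)
dim = 16                      # toy size of the spliced vector [vs_hr ; vu_hr]
x0 = rng.normal(size=dim)     # machine translation spliced feature vector

x = x0
for layer in range(4):        # 4 stacked cross layers, as in the method
    W = rng.normal(size=(dim, dim)) * 0.1
    b = rng.normal(size=dim) * 0.1
    x = cross_layer(x0, x, W, b)

x4 = x                        # machine translation quality cross feature vector
print(x4.shape)  # (16,)
```

Each layer multiplies the running feature vector element-wise against a linear transform of itself, so stacking 4 layers explicitly builds feature crosses up to 5th order while the residual term keeps the lower-order features.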
Further, in step S5, the quality of the machine translation is predicted, specifically:
The machine translation quality cross feature vector of step S4 is input into a three-layer feed-forward neural network, which outputs the machine translation quality score, as shown in formula (16):
Score = Feed-Forward(x_4)  (16);
wherein Score is the machine translation quality score and Feed-Forward(·) is a three-layer feed-forward neural network;
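A minimal sketch of a three-layer feed-forward scoring head. The ReLU hidden activations and the final sigmoid (which keeps the score in the 0-1 range of the normalized human scores) are assumptions, since the text does not specify the activation functions, and the layer sizes are toy values.

```python
import numpy as np

def relu(x):
    return np.maximum(x, 0.0)

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def feed_forward(x, params):
    # Three affine layers; hidden ReLUs and output sigmoid are assumptions.
    (W1, b1), (W2, b2), (W3, b3) = params
    h1 = relu(W1 @ x + b1)
    h2 = relu(W2 @ h1 + b2)
    return sigmoid(W3 @ h2 + b3)[0]

rng = np.random.default_rng(3)
dims = [16, 8, 4, 1]  # toy layer sizes; input is the cross feature vector
params = [(rng.normal(size=(o, i)) * 0.1, np.zeros(o))
          for i, o in zip(dims[:-1], dims[1:])]

x4 = rng.normal(size=16)          # cross feature vector from step S4
score = feed_forward(x4, params)
print(0.0 < score < 1.0)  # True
```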
Further, in step S6, the mean square error loss is shown in formula (17):
MSE = (1/N) Σ_{i=1}^{N} (y^(i) − Score^(i))²  (17);
wherein MSE denotes the mean square error loss, N the number of samples in the training set, i the i-th sample in the training set, y^(i) the human evaluation score of the machine translation of the i-th sample, and Score^(i) the predicted machine translation quality score of the i-th sample.
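The mean square error loss of formula (17) in plain Python:

```python
def mse_loss(scores, human_scores):
    # Mean squared error between predicted scores and human scores.
    assert len(scores) == len(human_scores)
    n = len(scores)
    return sum((y - s) ** 2 for y, s in zip(human_scores, scores)) / n

# Two toy samples: predictions 0.5 and 0.8 against human scores 0.5 and 0.6.
loss = mse_loss([0.5, 0.8], [0.5, 0.6])
print(loss)  # ≈ 0.02
```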
Further, another technical scheme adopted by the invention is as follows: the machine translation automatic evaluation method based on the deep cross network further comprises the following steps:
S7, carrying out normalization processing on the machine translation to be evaluated and the manual reference translation of the machine translation to be evaluated;
S8, inputting the normalized machine translation to be evaluated and the manual reference translation of the machine translation to be evaluated in step S7 into the machine translation automatic evaluation model based on the deep cross network trained in step S6, and predicting the machine translation quality score;
S9, calculating the cosine similarity of the machine translation sentence vector representation and the artificial reference translation sentence vector representation based on the large language model vector library; respectively inputting the normalized machine translation to be evaluated and the artificial reference translation of the machine translation to be evaluated in the step S7 into a large language model vector library, directly outputting the machine translation sentence vector representation and the artificial reference translation sentence vector representation, and calculating the cosine similarity of the machine translation sentence vector representation and the artificial reference translation sentence vector representation;
Step S10, calculating a final prediction value of the quality of the machine translation; and linearly weighting the machine translation quality score predicted in the step S8 and the cosine similarity of the machine translation sentence vector representation and the artificial reference translation sentence vector representation based on the large language model vector library in the step S9 to obtain a final machine translation quality prediction score.
Further, in step S9, the cosine similarity of the machine translation sentence vector representation and the artificial reference translation sentence vector representation based on the large language model vector library is calculated, specifically:
The normalized machine translation to be evaluated and the manual reference translation of the machine translation to be evaluated in step S7 are input separately into the large language model vector database Chromadb, which directly outputs the machine translation sentence vector representation and the manual reference translation sentence vector representation, as shown in formula (18) and formula (19):
E_h = Chromadb_zephyr-7b(h)  (18);
E_r = Chromadb_zephyr-7b(r)  (19);
wherein E_h and E_r denote the machine translation sentence vector representation and the manual reference translation sentence vector representation respectively, and Chromadb_zephyr-7b(·) denotes the Chromadb vector database output function with the large language model zephyr-7b as the base model;
The cosine similarity of the machine translation sentence vector representation and the manual reference translation sentence vector representation is then calculated, as shown in formula (20):
CosSim(h, r) = E_h E_r^T / (‖E_h‖ · ‖E_r‖)  (20);
wherein CosSim(h, r) denotes the cosine similarity of the machine translation sentence vector representation and the manual reference translation sentence vector representation, and T denotes the transposition operation.
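The cosine similarity of formula (20) in NumPy, with toy vectors standing in for the Chromadb/zephyr-7b sentence embeddings:

```python
import numpy as np

def cos_sim(e_h, e_r):
    # Cosine similarity of the two sentence vector representations.
    return float(e_h @ e_r / (np.linalg.norm(e_h) * np.linalg.norm(e_r)))

# Toy sentence vectors; parallel vectors give similarity ≈ 1.
e_h = np.array([1.0, 2.0, 3.0])
e_r = np.array([2.0, 4.0, 6.0])
sim = cos_sim(e_h, e_r)
print(sim)  # ≈ 1.0
```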
Further, in step S10, a final prediction score of the machine translation quality is calculated, specifically:
The machine translation quality score predicted in step S8 and the cosine similarity of the machine translation sentence vector representation and the manual reference translation sentence vector representation based on the large language model vector library in step S9 are linearly weighted, as shown in formula (21):
Score_final = 0.8 × Score + (1 − 0.8) × CosSim(h, r)  (21);
wherein Score_final denotes the final machine translation quality prediction score; Score denotes the machine translation quality score predicted in step S8; and 0.8 is the linear interpolation weight, determined through experimental experience.
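The linear weighting as code; note that the 0.2 weight on the cosine term is an assumption that follows from reading 0.8 as a linear interpolation weight, since only the 0.8 value is stated explicitly.

```python
def final_score(score, cos_similarity, weight=0.8):
    # Linear interpolation of the network score and the embedding cosine
    # similarity; the complementary (1 - weight) share is an assumption.
    return weight * score + (1.0 - weight) * cos_similarity

# Toy values: network predicts 0.75, embedding cosine similarity is 0.95.
result = final_score(0.75, 0.95)
print(result)  # ≈ 0.79
```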
The beneficial effects of the invention are as follows: the invention decomposes the machine translation automatic evaluation method based on a deep cross network into extracting the sentence-level machine translation quality feature vector in the independent characterization mode using a pre-trained model, an external attention mechanism and average pooling; extracting the sentence-level machine translation quality feature vector in the unified characterization mode from the whole information of the machine translation and the manual reference translation using a pre-trained model, a self-attention mechanism and average pooling; and extracting the machine translation quality cross feature vector with a deep cross network containing 4 stacked cross layers, whose effectiveness is verified experimentally, before inputting it into a feed-forward neural network to automatically predict the machine translation quality. Meanwhile, a large language model vector database is used to directly represent the sentence vectors of the machine translation and the manual reference translation and to compute their cosine similarity, and the automatically predicted machine translation quality is linearly weighted with this cosine similarity to obtain the machine translation quality score, improving the automatic evaluation effect.
Drawings
FIG. 1 is a schematic flow chart of the machine translation automatic evaluation model training method based on the deep cross network;
FIG. 2 is a schematic flow chart of the machine translation automatic evaluation method based on the deep cross network;
FIG. 3 is a schematic diagram of the machine translation automatic evaluation model structure based on the deep cross network according to the invention;
Fig. 4 is a schematic diagram of the deep cross network structure of the present invention containing 4 stacked cross layers.
Detailed Description
The present invention will be described in detail below with reference to the accompanying drawings and embodiments.
As shown in fig. 1, this embodiment provides a machine translation automatic evaluation method based on a deep cross network, which includes the following steps:
step S1, a training set is obtained, normalization processing is carried out on the training set, and the training set after normalization processing is obtained; the training set is composed of a plurality of different samples, and each sample comprises a machine translation, a manual reference translation and a human evaluation score of the machine translation;
S2, extracting the sentence-level machine translation quality feature vector in the independent characterization mode; the machine translation and the manual reference translation in each sample of the normalized training set are input separately into a cross-language pre-trained model, which outputs the machine translation sub-word level feature vector and the manual reference translation sub-word level feature vector in the independent characterization mode; an external attention mechanism makes these two sub-word level feature vectors interact, yielding the machine translation interaction feature vector and the manual reference translation interaction feature vector in the independent characterization mode; the two interaction feature vectors are connected, an average pooling operation is carried out, and the sentence-level machine translation quality feature vector in the independent characterization mode is output;
S3, extracting the sentence-level machine translation quality feature vector in the unified characterization mode; the machine translation and the manual reference translation in each sample of the normalized training set are connected as strings to obtain the translation joint string, which is input into the cross-language pre-trained model to output the sub-word level feature vector in the unified characterization mode; a self-attention mechanism makes this sub-word level feature vector interact with itself, yielding the interaction feature vector in the unified characterization mode; an average pooling operation is carried out on the interaction feature vector, and the sentence-level machine translation quality feature vector in the unified characterization mode is output;
S4, extracting the machine translation quality cross feature vector; the sentence-level machine translation quality feature vector in the independent characterization mode of step S2 and the sentence-level machine translation quality feature vector in the unified characterization mode of step S3 are spliced and input into a deep cross network containing 4 stacked cross layers, which outputs the machine translation quality cross feature vector;
S5, predicting the quality of the machine translation; the machine translation quality cross feature vector of step S4 is input into a three-layer feed-forward neural network, which outputs the machine translation quality score;
S6, training the machine translation automatic evaluation model based on the deep cross network; according to the machine translation quality score output in step S5 and the human evaluation scores of the machine translations in the normalized training set of step S1, the parameters of the model are trained by minimizing the mean square error loss on the normalized training set, yielding the trained machine translation automatic evaluation model based on the deep cross network.
As shown in fig. 2, a machine translation automatic evaluation method based on a deep cross network further includes:
S7, carrying out normalization processing on the machine translation to be evaluated and the manual reference translation of the machine translation to be evaluated;
S8, inputting the normalized machine translation to be evaluated and the manual reference translation of the machine translation to be evaluated in step S7 into the machine translation automatic evaluation model based on the deep cross network trained in step S6, and predicting the machine translation quality score;
S9, calculating the cosine similarity of the machine translation sentence vector representation and the artificial reference translation sentence vector representation based on the large language model vector library; respectively inputting the normalized machine translation to be evaluated and the artificial reference translation of the machine translation to be evaluated in the step S7 into a large language model vector library, directly outputting the machine translation sentence vector representation and the artificial reference translation sentence vector representation, and calculating the cosine similarity of the machine translation sentence vector representation and the artificial reference translation sentence vector representation;
Step S10, calculating a final prediction value of the quality of the machine translation; and linearly weighting the machine translation quality score predicted in the step S8 and the cosine similarity of the machine translation sentence vector representation and the artificial reference translation sentence vector representation based on the large language model vector library in the step S9 to obtain a final machine translation quality prediction score.
FIG. 3 is a schematic diagram of the machine translation automatic evaluation model structure based on the deep cross network according to the invention;
Fig. 4 is a schematic diagram of the deep cross network structure of the present invention containing 4 stacked cross layers.
In step S1, a sample in the training set is specifically:
a training sample d = {(h, r), y}, where d denotes the training sample, h the machine translation, r the manual reference translation corresponding to the machine translation, and y the human evaluation score of the machine translation h, a real value between 0 and 1.
Table 1: one training sample example in a training set
Further, in step S2, a sentence-level machine translation quality feature vector in an independent characterization mode is extracted, which specifically includes:
The machine translation h and the manual reference translation r are input separately into the cross-language pre-trained model XLM-RoBERTa, which segments each of them into sub-words with the SentencePiece algorithm, yielding sub-word sequences of m and n sub-words respectively:
h = (h_1, h_2, …, h_m), r = (r_1, r_2, …, r_n);
wherein m and n denote the numbers of sub-words in the machine translation and the manual reference translation after SentencePiece segmentation; h_1, h_2, …, h_m denote the 1st, 2nd, …, m-th sub-words of the segmented machine translation; r_1, r_2, …, r_n denote the 1st, 2nd, …, n-th sub-words of the segmented manual reference translation;
From the sub-words and their positions in the sentence, the cross-language pre-trained model XLM-RoBERTa outputs the machine translation sub-word level feature vector and the manual reference translation sub-word level feature vector in the independent characterization mode:
v_h = XLM-RoBERTa(h)  (1);
v_r = XLM-RoBERTa(r)  (2);
wherein v_h and v_r denote the machine translation sub-word level feature vector and the manual reference translation sub-word level feature vector in the independent characterization mode, and XLM-RoBERTa(·) denotes the cross-language pre-trained model XLM-RoBERTa;
Optionally, the cross-language pre-trained model XLM-RoBERTa uses the base model "xlm-roberta-large", which has 24 Transformer encoder layers and 16 self-attention heads and outputs a 1024-dimensional vector for each sub-word.
An external attention mechanism then makes the machine translation sub-word level feature vector and the manual reference translation sub-word level feature vector in the independent characterization mode interact, yielding the machine translation interaction feature vector and the manual reference translation interaction feature vector in the independent characterization mode:
v'_h = MultiHead(v_h, v_r, v_r)  (3);
v'_r = MultiHead(v_r, v_h, v_h)  (4);
wherein v'_h is the machine translation interaction feature vector in the independent characterization mode and v'_r is the manual reference translation interaction feature vector in the independent characterization mode; MultiHead(·) denotes the multi-head attention function, which takes the three parameters query, key and value and becomes an external attention mechanism when the query differs from the key and value;
The machine translation interaction feature vector and the human reference translation interaction feature vector in the independent characterization mode are concatenated and average-pooled to output the sentence-level machine translation quality feature vector in the independent characterization mode:
vs_hr = AvgPooling([ṽ_h; ṽ_r]) (5);
Wherein vs_hr denotes the sentence-level machine translation quality feature vector in the independent characterization mode, and AvgPooling(·) denotes the average pooling operation.
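The external-attention interaction and pooling of formulas (3)-(5) can be sketched as follows. Random tensors stand in for the XLM-RoBERTa subword features, and the dimension is reduced from the model's 1024 for brevity; this is an illustrative sketch, not the patented implementation.

```python
import torch

# Stand-ins for the subword-level feature vectors v_h and v_r.
d_model, n_heads = 64, 4
v_h = torch.randn(1, 7, d_model)   # machine translation: m = 7 subwords
v_r = torch.randn(1, 9, d_model)   # human reference translation: n = 9 subwords

mha = torch.nn.MultiheadAttention(d_model, n_heads, batch_first=True)

# External attention: the query comes from one tower, key/value from the other.
vt_h, _ = mha(v_h, v_r, v_r)       # machine translation interaction features
vt_r, _ = mha(v_r, v_h, v_h)       # reference translation interaction features

# Concatenate along the subword axis, then average-pool to a sentence vector.
vs_hr = torch.cat([vt_h, vt_r], dim=1).mean(dim=1)   # shape: (1, d_model)
```

Note that the two attention calls share one set of parameters here; whether the two directions share weights is a design choice the document does not specify.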
Further, in step S3, a sentence-level machine translation quality feature vector in a unified characterization mode is extracted, which specifically includes:
The machine translation h and the human reference translation r are connected as character strings to obtain the translation joint string:
hr = </s> h </s></s> r </s> (6);
Wherein hr is the translation joint string, and "</s>" denotes the start and end markers of the string;
For the training sample example in the attached Table 1, the translation joint string is:
</s>This document is proposed by the Ministry of Industry and Information Technology.</s></s>The Ministry of Industry and Information Technology is responsible for the proposal and administration of this document.</s>;
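Building the translation joint string of formula (6) is a plain string concatenation; a minimal sketch (the helper name is illustrative, not from the document):

```python
# Join a machine translation and its reference with "</s>" boundary markers.
def join_translations(h: str, r: str) -> str:
    return f"</s>{h}</s></s>{r}</s>"

hr = join_translations(
    "This document is proposed by the Ministry of Industry and Information Technology.",
    "The Ministry of Industry and Information Technology is responsible for "
    "the proposal and administration of this document.",
)
```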
The translation joint string is input into the cross-language pre-training model XLM-RoBERTa, which segments it with the SentencePiece subword segmentation algorithm to obtain a subword sequence containing p subwords:
hr = hr_1, hr_2, …, hr_p (7);
Wherein p denotes the number of subwords contained in the translation joint string after segmentation with the SentencePiece subword segmentation algorithm; hr_1, hr_2, …, hr_p denote the 1st, 2nd, …, p-th subwords of the segmented translation joint string;
The cross-language pre-training model XLM-RoBERTa outputs, based on the subwords and their positions in the sentence, the subword-level feature vector in the unified characterization mode:
v_hr = XLM-RoBERTa(hr) (8);
Wherein v_hr denotes the subword-level feature vector in the unified characterization mode, and XLM-RoBERTa(·) denotes the cross-language pre-training model XLM-RoBERTa;
The subword-level feature vector in the unified characterization mode interacts with itself through a self-attention mechanism to obtain the interaction feature vector in the unified characterization mode:
ṽ_hr = MultiHead(v_hr, v_hr, v_hr) (9);
Wherein ṽ_hr denotes the interaction feature vector in the unified characterization mode; MultiHead(·) denotes the multi-head attention mechanism function, which takes three parameters (query, key and value) and becomes a self-attention mechanism when the query is identical to the key and value;
An average pooling operation is performed on the interaction feature vector in the unified characterization mode to output the sentence-level machine translation quality feature vector in the unified characterization mode:
vu_hr = AvgPooling(ṽ_hr) (10);
Where vu_hr denotes the sentence-level machine translation quality feature vector in the unified characterization mode.
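The unified-mode interaction of formulas (9)-(10) differs from the independent mode only in that query, key and value are the same tensor. A minimal sketch, again with random stand-in features and a reduced dimension:

```python
import torch

# Stand-in for the unified-mode subword features v_hr over the joint string.
d_model, n_heads = 64, 4
v_hr = torch.randn(1, 12, d_model)           # p = 12 subwords

mha = torch.nn.MultiheadAttention(d_model, n_heads, batch_first=True)
vt_hr, _ = mha(v_hr, v_hr, v_hr)             # self-attention: query = key = value
vu_hr = vt_hr.mean(dim=1)                    # average pooling -> sentence vector
```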
Further, in step S4, a machine translation quality cross feature vector is extracted, which specifically includes:
The sentence-level machine translation quality feature vector in the independent characterization mode from step S2 and the sentence-level machine translation quality feature vector in the unified characterization mode from step S3 are spliced:
x_0 = [vs_hr; vu_hr] (11);
Wherein x_0 denotes the machine translation spliced feature vector, and [·; ·] denotes the vector splicing (concatenation) operation;
The machine translation spliced feature vector is input into a deep cross network containing 4 stacked cross layers, which outputs the machine translation quality cross feature vector:
x_1 = x_0 ⊙ (W_1 x_0 + b_1) + x_0 (12);
x_2 = x_0 ⊙ (W_2 x_1 + b_2) + x_1 (13);
x_3 = x_0 ⊙ (W_3 x_2 + b_3) + x_2 (14);
x_4 = x_0 ⊙ (W_4 x_3 + b_4) + x_3 (15);
Wherein the symbol ⊙ denotes the Hadamard (element-wise) product; W_1, W_2, W_3, W_4 and b_1, b_2, b_3, b_4 are respectively the linear weight parameters and biases of the 1st, 2nd, 3rd and 4th stacked cross layers; x_1, x_2, x_3, x_4 are the cross feature vectors output by the 1st, 2nd, 3rd and 4th stacked cross layers, respectively, and x_4 is taken as the machine translation quality cross feature vector. It should be noted that the deep cross network may contain any number of stacked cross layers; the present patent uses 4 stacked cross layers, chosen according to extensive experimental experience.
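A cross layer combines a Hadamard product with the input x_0 and a residual connection, so each layer raises the order of the feature interactions by one. The four stacked layers can be sketched as follows; the dimension is an illustrative stand-in:

```python
import torch

class CrossLayer(torch.nn.Module):
    """One cross layer: x_next = x_0 * (W x + b) + x (Hadamard product + residual)."""
    def __init__(self, dim: int):
        super().__init__()
        self.linear = torch.nn.Linear(dim, dim)

    def forward(self, x0: torch.Tensor, x: torch.Tensor) -> torch.Tensor:
        return x0 * self.linear(x) + x

dim = 128                                 # illustrative stand-in dimension
layers = torch.nn.ModuleList(CrossLayer(dim) for _ in range(4))

x0 = torch.randn(1, dim)                  # spliced feature vector from formula (11)
x = x0
for layer in layers:                      # 4 stacked cross layers
    x = layer(x0, x)
x4 = x                                    # machine translation quality cross features
```

The residual term means a layer with zero weights reduces to the identity, so stacking more layers cannot remove lower-order features, only add higher-order ones.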
Further, in step S5, the quality of the machine translation is predicted, specifically:
The machine translation quality cross feature vector from step S4 is input into a three-layer feed-forward neural network, which outputs the machine translation quality score:
Score = Feed-Forward(x_4) (16);
Wherein Score is the machine translation quality score, and Feed-Forward(·) is a three-layer feed-forward neural network.
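A possible shape for the regression head of formula (16); the hidden sizes and activation are assumptions, as the document does not specify them:

```python
import torch

# Three-layer feed-forward head mapping the cross features to one score.
# Hidden sizes (64, 32) and Tanh are illustrative assumptions.
dim = 128
feed_forward = torch.nn.Sequential(
    torch.nn.Linear(dim, 64), torch.nn.Tanh(),
    torch.nn.Linear(64, 32), torch.nn.Tanh(),
    torch.nn.Linear(32, 1),
)
x4 = torch.randn(1, dim)                  # quality cross feature vector
score = feed_forward(x4).squeeze(-1)      # one quality score per sentence
```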
Further, in step S6, the mean square error loss is shown in formula (17):
MSE = (1/N) Σ_{i=1}^{N} (y^(i) − Score^(i))² (17);
Where MSE denotes the mean square error loss, N denotes the number of samples in the training set, i indexes the i-th training sample, y^(i) denotes the human evaluation score of the machine translation of the i-th sample, and Score^(i) denotes the predicted machine translation quality score of the i-th sample.
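A worked numeric example of formula (17) over a toy training set of N = 3 samples (the scores are made up for illustration):

```python
# Mean square error between human scores and predicted scores.
y = [0.9, 0.4, 0.7]        # human evaluation scores y^(i)
score = [0.8, 0.5, 0.7]    # predicted quality scores Score^(i)
mse = sum((yi - si) ** 2 for yi, si in zip(y, score)) / len(y)
# mse = (0.1^2 + 0.1^2 + 0^2) / 3 = 0.02 / 3
```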
Further, in step S9, the cosine similarity of the machine translation sentence vector representation and the artificial reference translation sentence vector representation based on the large language model vector library is calculated, specifically:
The normalized machine translation to be evaluated and its human reference translation from step S7 are respectively input into the large language model vector database Chromadb, which directly outputs the machine translation sentence vector representation and the human reference translation sentence vector representation:
E_h = Chromadb_zephyr-7b(h) (18);
E_r = Chromadb_zephyr-7b(r) (19);
Wherein Chromadb_zephyr-7b(·) denotes the output function of the Chromadb vector database with the large language model zephyr-7b as the base model; h and r respectively denote the machine translation to be evaluated and its human reference translation; E_h and E_r respectively denote the machine translation sentence vector representation and the human reference translation sentence vector representation;
The cosine similarity of the machine translation sentence vector representation and the human reference translation sentence vector representation is calculated:
CosSim(h, r) = (E_h · E_r) / (‖E_h‖ ‖E_r‖) (20);
Wherein CosSim(h, r) denotes the cosine similarity of the machine translation sentence vector representation and the human reference translation sentence vector representation.
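The cosine similarity of formula (20) in a minimal sketch; the short vectors are toy stand-ins for the zephyr-7b-based embeddings E_h and E_r:

```python
import math

# Cosine similarity: dot product divided by the product of the norms.
def cos_sim(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

E_h = [0.2, 0.7, 0.1]      # toy machine translation embedding
E_r = [0.3, 0.6, 0.2]      # toy reference translation embedding
similarity = cos_sim(E_h, E_r)
```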
Further, in step S10, a final prediction score of the machine translation quality is calculated, specifically:
The machine translation quality score predicted in step S8 and the cosine similarity from step S9, computed between the machine translation sentence vector representation and the human reference translation sentence vector representation based on the large language model vector library, are linearly weighted:
Score_final = 0.8 × Score + (1 − 0.8) × CosSim(h, r) (21);
Wherein Score_final denotes the final machine translation quality prediction; Score denotes the machine translation quality score predicted in step S8; the linear interpolation weight 0.8 is chosen according to extensive experimental experience.
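The final combination is one line of arithmetic. In this sketch the 0.8 weight comes from the document, while giving the similarity term the complementary weight 1 − 0.8 = 0.2 is an assumption based on the linear-interpolation reading:

```python
# Linearly interpolate the network score with the embedding similarity.
# Weight 0.8 is from the document; the 0.2 complement is an assumption.
def final_score(score: float, similarity: float, w: float = 0.8) -> float:
    return w * score + (1.0 - w) * similarity

pred = final_score(0.75, 0.90)    # e.g. network score 0.75, CosSim 0.90
```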
The deep cross network based machine translation automatic evaluation method DCN-MTE is tested on the news-domain dataset Newstest of the machine translation automatic evaluation task of the 7th Conference on Machine Translation, in the German-English, Chinese-English and English-German directions. The machine translation automatic evaluation methods BLEU, chrF, YiSi-1, BLEURT-20, UNITE, BERTScore, MS-COMET-22 and COMET-22 are used as comparison methods, among which COMET-22 is the best-performing method participating in the evaluation.
For performance measurement, following the official practice of the machine translation automatic evaluation task of the 7th Conference on Machine Translation, the Kendall correlation coefficient and the Pearson correlation coefficient are used to evaluate the correlation with human evaluation at the sentence level and at the system level, respectively; the larger the Kendall or Pearson correlation coefficient, the better the machine translation automatic evaluation effect.
Table 2: sentence-level and system-level correlations with human evaluation of different machine translation automatic evaluation methods in the German-English, Chinese-English and English-German directions of the machine translation automatic evaluation task of the 7th Conference on Machine Translation.
The sentence-level and system-level correlations with human evaluation in the German-English, Chinese-English and English-German directions of the machine translation automatic evaluation task of the 7th Conference on Machine Translation are shown in the attached Table 2. The data in Table 2 show that, in terms of the combined sentence-level and system-level correlations, the deep cross network based machine translation automatic evaluation method DCN-MTE outperforms the machine translation automatic evaluation methods BLEU, chrF, YiSi-1, BLEURT-20, UNITE, BERTScore, MS-COMET-22 and COMET-22. DCN-MTE is 0.002 higher than COMET-22, the best system participating in the evaluation, in sentence-level correlation, and 0.015 higher in system-level correlation.
The deep cross network based machine translation automatic evaluation method of the invention performs deep interaction between the machine translation features and the human reference translation features, captures the semantic relations between them, and thereby consistently improves the effect of machine translation automatic evaluation.
The method of the present disclosure has general applicability because it is not designed for two particular languages. Although the present disclosure has been experimentally verified in only three translation directions among German, Chinese and English, it is equally applicable to other language pairs such as Chinese-Japanese and Chinese-Vietnamese.
The protection of the present invention is not limited to the above embodiments. Variations and advantages that would occur to one skilled in the art are included in the invention without departing from the spirit and scope of the inventive concept, and the scope of the invention is defined by the appended claims.