
CN114444514B - Semantic matching model training method, semantic matching method and related device - Google Patents


Info

Publication number
CN114444514B
Authority
CN
China
Prior art keywords
training
semantic matching
training sample
matching model
biased
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202210117994.XA
Other languages
Chinese (zh)
Other versions
CN114444514A (en)
Inventor
颜璟
陈艳
刘璟
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Beijing Baidu Netcom Science and Technology Co Ltd
Original Assignee
Beijing Baidu Netcom Science and Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Beijing Baidu Netcom Science and Technology Co Ltd
Priority to CN202210117994.XA
Publication of CN114444514A
Application granted
Publication of CN114444514B

Classifications

    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06F: ELECTRIC DIGITAL DATA PROCESSING
    • G06F 40/00: Handling natural language data
    • G06F 40/30: Semantic analysis
    • G06F 16/00: Information retrieval; Database structures therefor; File system structures therefor
    • G06F 16/30: Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
    • G06F 16/35: Clustering; Classification
    • G06F 18/00: Pattern recognition
    • G06F 18/20: Analysing
    • G06F 18/21: Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
    • G06F 18/214: Generating training patterns; Bootstrap methods, e.g. bagging or boosting
    • G06F 18/22: Matching criteria, e.g. proximity measures

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Data Mining & Analysis (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • Artificial Intelligence (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Evolutionary Computation (AREA)
  • Evolutionary Biology (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Databases & Information Systems (AREA)
  • Health & Medical Sciences (AREA)
  • Audiology, Speech & Language Pathology (AREA)
  • Computational Linguistics (AREA)
  • General Health & Medical Sciences (AREA)
  • Machine Translation (AREA)

Abstract

The disclosure provides a semantic matching model training method, a semantic matching method, and a corresponding apparatus, electronic device, computer-readable storage medium, and computer program product, relating to artificial intelligence fields such as deep learning, natural language processing, and language recognition. In one embodiment of the method, after a training sample set is acquired, the biased training samples in the set are identified based on the sample common words contained in the training samples, and an initial semantic matching model is then trained continuously with training samples from the set, where during the first preset rounds of training the non-biased training samples (all samples in the set other than the biased ones) are extracted with higher probability than the biased training samples; once the training round reaches a preset target round, the semantic matching model is generated. A semantic matching model trained this way can produce semantic similarity matching results more accurately.

Description

Semantic matching model training method, semantic matching method and related device
Technical Field
The present disclosure relates to artificial intelligence fields such as deep learning, natural language processing, and language recognition, and in particular to a semantic matching model training method, a semantic matching method, and the corresponding apparatus, electronic device, computer-readable storage medium, and computer program product.
Background
The question matching task aims to judge whether two natural-language question sentences are semantically equivalent. It is an important research direction in the field of natural language processing, has high commercial value, and plays an important role in fields such as information retrieval and intelligent customer service.
In recent years, question matching based on deep learning has improved greatly. Its main principle is that a semantic matching model first learns knowledge from training samples and then, using the learned knowledge, judges whether two received questions match by judging whether their semantics are similar.
Disclosure of Invention
The embodiments of the present disclosure provide a semantic matching model training method, a semantic matching method, and a corresponding apparatus, electronic device, computer-readable storage medium, and computer program product.
In a first aspect, an embodiment of the present disclosure provides a semantic matching model training method, including: acquiring a training sample set; determining the training samples in the training sample set that include a target keyword as biased training samples, wherein the target keyword is determined based on sample common words contained in the training samples; acquiring an initial semantic matching model and continuously training it with training samples from the training sample set, wherein during the first preset rounds of training the probability of extracting non-biased training samples is higher than that of the biased training samples, the non-biased training samples being the training samples in the set other than the biased ones; and generating a semantic matching model in response to the training round reaching a preset target round.
In a second aspect, an embodiment of the present disclosure provides a semantic matching model training apparatus, including: a training sample set acquisition unit configured to acquire a training sample set, the training samples in the set being paired sentences labeled as semantically similar or dissimilar; a biased training sample determination unit configured to determine the training samples in the set that include a target keyword as biased training samples, the target keyword being determined based on sample common words contained in the training samples; a semantic matching model training unit configured to acquire an initial semantic matching model and continuously train it with training samples from the set, the non-biased training samples being extracted with higher probability than the biased ones during the first preset rounds of training, where the non-biased training samples are the training samples in the set other than the biased ones; and a semantic matching model generation unit configured to generate a semantic matching model in response to the training round reaching a preset target round.
In a third aspect, an embodiment of the present disclosure provides a semantic matching method, including: acquiring a first sentence to be matched and a second sentence to be matched; inputting the two sentences into a semantic matching model for processing, the semantic matching model being trained on a training sample set including biased training samples, the biased training samples being determined from the training samples in the set that include target keywords, and the target keywords being determined based on sample common words contained in the training samples; and generating the semantic matching result of the first sentence to be matched and the second sentence to be matched from the output of the semantic matching model, wherein the semantic matching model is obtained by the semantic matching model training method described in any implementation of the first aspect.
In a fourth aspect, an embodiment of the present disclosure provides a semantic matching apparatus, including: a to-be-matched sentence acquisition unit configured to acquire a first sentence to be matched and a second sentence to be matched; a semantic similarity matching unit configured to input the two sentences into a semantic matching model for processing, the semantic matching model being trained on a training sample set including biased training samples, the biased training samples being determined from the training samples in the set that include target keywords, and the target keywords being determined based on sample common words contained in the training samples; and a matching result output unit configured to generate the semantic matching result of the two sentences from the output of the semantic matching model, wherein the semantic matching model is obtained by the semantic matching model training apparatus described in any implementation of the second aspect.
In a fifth aspect, an embodiment of the present disclosure provides an electronic device, including: at least one processor; and a memory communicatively coupled to the at least one processor, the memory storing instructions executable by the at least one processor that, when executed, cause the at least one processor to perform the semantic matching model training method described in any implementation of the first aspect or the semantic matching method described in any implementation of the third aspect.
In a sixth aspect, an embodiment of the present disclosure provides a non-transitory computer-readable storage medium storing computer instructions that, when executed, enable a computer to implement the semantic matching model training method described in any implementation of the first aspect or the semantic matching method described in any implementation of the third aspect.
In a seventh aspect, an embodiment of the present disclosure provides a computer program product including a computer program that, when executed by a processor, implements the semantic matching model training method described in any implementation of the first aspect or the semantic matching method described in any implementation of the third aspect.
The semantic matching model training and semantic matching methods provided by the embodiments of the present disclosure classify training samples based on the sample common words they contain, divide the training sample set into biased and non-biased training samples of different learning difficulty, and concentrate on the harder non-biased training samples in the initial stage of training the semantic matching model. This alleviates the shortcut learning that arises when the training samples are too easy, improves the quality of the trained semantic matching model, and allows the model to subsequently generate semantic similarity matching results more accurately.
It should be understood that the statements in this section are not intended to identify key or critical features of the embodiments of the present disclosure, nor are they intended to limit the scope of the present disclosure. Other features of the present disclosure will become apparent from the following description.
Drawings
Other features, objects and advantages of the disclosure will become more apparent upon reading of the following detailed description of non-limiting embodiments thereof, made with reference to the accompanying drawings in which:
FIG. 1 is an exemplary system architecture to which the present disclosure may be applied;
FIG. 2 is a flowchart of a semantic matching model training method provided by an embodiment of the present disclosure;
FIG. 3 is a flow chart of one implementation of determining biased training samples provided by an embodiment of the present disclosure;
FIG. 4 is a flowchart of another semantic matching model training method provided by the embodiments of the present disclosure;
FIG. 5 is an effect diagram of a semantic matching model training method under an application scenario according to the embodiment of the present disclosure;
FIG. 6 is a block diagram of a semantic matching model training apparatus according to an embodiment of the present disclosure;
FIG. 7 is a block diagram of a semantic matching apparatus according to an embodiment of the present disclosure;
FIG. 8 is a schematic structural diagram of an electronic device adapted to execute the semantic matching model training method and/or the semantic matching method according to an embodiment of the present disclosure.
Detailed Description
Exemplary embodiments of the present disclosure are described below with reference to the accompanying drawings, in which various details of the embodiments of the disclosure are included to assist understanding, and which are to be considered as merely exemplary. Accordingly, those of ordinary skill in the art will recognize that various changes and modifications of the embodiments described herein can be made without departing from the scope and spirit of the present disclosure. Also, descriptions of well-known functions and constructions are omitted in the following description for clarity and conciseness. It should be noted that, in the present disclosure, the embodiments and features of the embodiments may be combined with each other without conflict.
In the technical solution of the present disclosure, the acquisition, storage, and application of the personal information of the users involved all comply with the relevant laws and regulations, necessary security measures are taken, and public order and good customs are not violated.
FIG. 1 illustrates an exemplary system architecture 100 to which embodiments of the semantic matching model training method, the semantic matching method, and the corresponding apparatuses, electronic devices, and computer-readable storage media of the present disclosure may be applied.
As shown in fig. 1, the system architecture 100 may include terminal devices 101, 102, 103, a network 104, and a server 105. The network 104 serves as a medium for providing communication links between the terminal devices 101, 102, 103 and the server 105. Network 104 may include various connection types, such as wired, wireless communication links, or fiber optic cables, among others.
The user may use the terminal devices 101, 102, 103 to interact with the server 105 via the network 104 to receive or send messages or the like. Various applications for realizing information communication between the terminal devices 101, 102, 103 and the server 105, such as a semantic matching query application, an information retrieval application, an online shopping application, etc., may be installed on the terminal devices 101, 102, 103 and the server.
The terminal apparatuses 101, 102, 103 and the server 105 may be hardware or software. When the terminal devices 101, 102, 103 are hardware, they may be various electronic devices with display screens, including but not limited to smart phones, tablet computers, laptop portable computers, desktop computers, and the like; when the terminal devices 101, 102, and 103 are software, they may be installed in the electronic devices listed above, and they may be implemented as multiple software or software modules, or may be implemented as a single software or software module, and are not limited in this respect. When the server 105 is hardware, it may be implemented as a distributed server cluster composed of multiple servers, or may be implemented as a single server; when the server is software, the server may be implemented as a plurality of software or software modules, or may be implemented as a single software or software module, which is not limited herein.
The server 105 can provide various services through built-in applications. Taking as an example a semantic matching application that judges whether two pieces of information are semantically similar, the following effects can be achieved when the server 105 runs that application: first, the server 105 receives a first sentence to be matched and a second sentence to be matched transmitted by a user through the terminal devices 101, 102, 103; then, the server 105 inputs the two sentences into a semantic matching model for processing, where the semantic matching model is trained on a training sample set including biased training samples, the biased training samples are determined from the training samples in the set that include target keywords, and the target keywords are determined based on the sample common words contained in the training samples; finally, the server 105 generates the semantic matching result of the first sentence to be matched and the second sentence to be matched from the output of the semantic matching model.
The semantic matching model can be trained by a semantic matching model training application built into the server 105 according to the following steps: first, acquire a training sample set; then, determine the training samples in the set that include a target keyword as biased training samples, the target keyword being determined based on the sample common words contained in the training samples; next, acquire an initial semantic matching model and continuously train it with training samples from the training sample set, extracting non-biased training samples with higher probability than biased training samples during the first preset rounds of training, where the non-biased training samples are the training samples in the set other than the biased ones; finally, generate the semantic matching model when the training round reaches a preset target round.
Since training the semantic matching model requires considerable computing resources and computing power, the semantic matching model training method provided in the following embodiments is generally executed by the server 105, which has stronger computing capability and more computing resources, and the semantic matching model training apparatus is correspondingly also disposed in the server 105. However, when the terminal devices 101, 102, 103 have sufficient computing capability and resources, they may also perform the above operations through the semantic matching model training application installed on them and output the same result as the server 105; correspondingly, the semantic matching model training apparatus may also be disposed in the terminal devices 101, 102, 103. In that case, the exemplary system architecture 100 may omit the server 105 and the network 104.
Of course, the server used to train the semantic matching model may differ from the server that calls the trained model. In particular, the model trained by the server 105 may also be distilled into a lightweight semantic matching model suitable for the terminal devices 101, 102, 103; depending on the recognition accuracy actually required, one can flexibly choose the lightweight model on the terminal devices or the more complex model on the server 105.
It should be understood that the number of terminal devices, networks, and servers in fig. 1 is merely illustrative. There may be any number of terminal devices, networks, and servers, as desired for implementation.
Referring to fig. 2, fig. 2 is a flowchart of a semantic matching model training method according to an embodiment of the disclosure, where the process 200 includes the following steps:
step 201, a training sample set is obtained.
In this embodiment, the execution body of the semantic matching model training method (for example, the server 105 shown in FIG. 1) acquires a training sample set; the training samples in the set are usually paired sentences labeled as semantically similar or dissimilar.
In some alternative embodiments, a training sample in the training sample set may also include more than two sentences; the present disclosure places no limit on this.
Further, the labels marking training samples as semantically similar or dissimilar may also take the form of a first identifier, a second identifier, and so on. In practice, whether the sentences in a training sample are semantically similar is usually judged from their degree of similarity, the number of features they share, and the like, and the training sample is labeled accordingly to record whether its sentences are semantically similar or dissimilar.
It should be noted that the training sample set may be obtained by the execution body directly from a local storage device or from a non-local storage device (for example, the terminal devices 101, 102, 103 shown in FIG. 1). The local storage device may be a data storage module within the execution body, such as a server hard disk, in which case the training sample set can be read quickly; the non-local storage device may be any other electronic device arranged to store data, such as a user terminal, in which case the execution body obtains the training sample set by sending an acquisition command to that device.
Step 202, determining the training samples including the target keywords in the training sample set as biased training samples.
In this embodiment, the specific content of the sentences in the training samples is examined, and sample common words, i.e. words that appear repeatedly across different sentences, are extracted from it. Based on the labels (semantically similar or dissimilar) of the training samples whose sentences contain a given sample common word, it is determined whether an association exists between the word and the labeled results; a sample common word with such an association is determined to be a target keyword, and every training sample whose sentences contain a target keyword is determined to be a biased training sample.
Whether such an association exists can be decided by checking whether the difference between the number of semantically-similar labels and the number of semantically-dissimilar labels among the training samples containing the sample common word meets a preset difference threshold; in practice, the difference threshold can be preset based on the number of training samples in the training sample set.
For example, suppose 10 training samples containing the sample common word are labeled semantically similar and 1 is labeled semantically dissimilar, so the difference between the two counts is 9, and suppose the difference threshold is set to 5 based on the 20 training samples in the set. Since the difference of 9 exceeds the threshold of 5, the sample common word is judged to be associated with the labeled results and is determined to be a target keyword.
Furthermore, the sample common words can be screened by the proportion of the training samples containing them within the whole training sample set; this further improves the reference value of the selected words and avoids deviations caused by content that merely recurs within a small portion of the samples.
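As a rough illustration of the difference-threshold check described above, the following Python sketch flags sample common words whose label counts diverge. All function and variable names are assumptions made for illustration rather than identifiers from this disclosure, and whitespace tokenisation stands in for a real word segmenter.

```python
from collections import Counter

def find_target_keywords(samples, diff_threshold):
    """samples: list of (sentence_pair_text, label) tuples, where label is
    1 for "semantically similar" and 0 for "semantically dissimilar"."""
    similar, dissimilar = Counter(), Counter()
    for text, label in samples:
        for word in set(text.split()):  # count each sample at most once per word
            (similar if label == 1 else dissimilar)[word] += 1
    keywords = set()
    for word in similar.keys() | dissimilar.keys():
        # A large gap between the two label counts (e.g. 10 vs. 1 against a
        # threshold of 5) marks the word as associated with the labels.
        if abs(similar[word] - dissimilar[word]) > diff_threshold:
            keywords.add(word)
    return keywords
```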
Step 203, obtaining an initial semantic matching model, and continuously using the training samples in the training sample set to continuously train the initial semantic matching model.
In this embodiment, an initial semantic matching model is acquired. It may generally be any model usable for semantic comparison and for producing a semantic similarity, for example a Deep Structured Semantic Model (DSSM), a Convolutional Latent Semantic Model (CLSM), or a Long Short-Term Memory network (LSTM). After the initial semantic matching model is acquired, training samples (biased or non-biased) are successively extracted from the training sample set as input, their semantically-similar or semantically-dissimilar labels are used as output, and the initial model is trained continuously, where during the first preset rounds of training the non-biased training samples are extracted with higher probability than the biased ones, the non-biased training samples being the training samples in the set other than the biased training samples.
The preset number of rounds can be determined adaptively based on the number of training samples and the type of the selected initial semantic matching model.
And step 204, generating a semantic matching model in response to the training round reaching a preset target round.
In this embodiment, once the number of rounds of training the initial semantic matching model with samples extracted from the training sample set reaches the preset target round, training is stopped and the semantic matching model is obtained.
The preset target round can be determined adaptively based on the type of the selected initial semantic matching model; in practice, the change in the model's loss function during training can be monitored to decide whether the current training round satisfies the target-round requirement.
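The outer training loop can then be sketched as follows; the sampling probability, the round counts, and the model.train_step interface are assumptions made for illustration, since the disclosure fixes none of these values.

```python
import random

def train_to_target_round(model, biased, non_biased, target_rounds,
                          preset_rounds, p_non_biased=0.9):
    """biased / non_biased: lists of (text, label) training samples."""
    for rnd in range(target_rounds):
        if rnd < preset_rounds:
            # First preset rounds: favour the harder, non-biased samples.
            pool = non_biased if random.random() < p_non_biased else biased
        else:
            # Later rounds (e.g. after a reset): draw with equal probability.
            pool = random.choice([non_biased, biased])
        text, label = random.choice(pool)
        model.train_step(text, label)  # hypothetical training interface
    return model  # training stops once the target round is reached
```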
The semantic matching model training method provided by this embodiment of the present disclosure classifies training samples by the sample common words they contain, divides the training sample set into biased and non-biased training samples of different learning difficulty, and concentrates on the harder non-biased training samples in the initial stage of training, thereby alleviating the shortcut learning caused by overly easy training samples.
In some optional implementations of this embodiment, the semantic matching model training method further includes: in response to the number of training samples in the training sample set being lower than the preset target round, resetting the training sample set after all of its training samples have been extracted.
Specifically, when the training sample set holds fewer samples than the target number of rounds, the set is reset once every sample in it has been extracted, so that the execution body of the semantic matching model training method can reuse the training samples to continue training an initial semantic matching model that has not yet reached the preset target round; this prevents the training from being blocked by an insufficient number of training samples.
In some optional implementations of this embodiment, after the training sample set is reset, the method further includes: continuously extracting training samples from the training sample set with equal probability as input, with their corresponding labels as output, to continue training the initial semantic matching model.
Specifically, after the reset, the extraction rule for the training samples can be adjusted so that samples are drawn with equal probability. Having favoured the non-biased training samples earlier, changing the strategy at this point avoids the shortcut learning that a single, unvarying training strategy could cause, and further improves the quality of the resulting semantic matching model.
Referring to FIG. 3, FIG. 3 is a flowchart of determining the training samples in the training sample set that include a target keyword as biased training samples, i.e. a specific implementation of step 202 in the process 200 shown in FIG. 2; the other steps of process 200 are left unchanged, and substituting this implementation for step 202 yields a new complete embodiment. Here each training sample is a pair of sentences labeled as semantically similar or dissimilar. The process 300 includes the following steps:
step 301, performing word segmentation processing on each training sample included in the training sample set, and generating a word segmentation result set after collecting word segmentation results of each training sample.
Specifically, after a training sample set is obtained, word segmentation processing is performed on each training sample included in the training sample set, word segmentation results of sentences in each training sample are generated and then collected, and a word segmentation result set is generated.
Step 302, in response to a sample common word whose number of occurrences exceeds a preset frequency threshold existing in the word segmentation result set, generating the number ratio between the first training samples and the second training samples that contain the sample common word.
Specifically, the content of the word segmentation result set is analysed; once a segmentation result whose occurrences exceed the preset frequency threshold is found, it is determined to be a sample common word. The training samples containing the sample common word are then collected, with those labeled semantically similar determined to be first training samples and those labeled semantically dissimilar determined to be second training samples, and the numbers of first and second training samples are counted to generate the number ratio between them.
Step 303, in response to the number ratio exceeding a preset ratio threshold, determining the sample common word as a target keyword and the training samples containing the target keyword as biased training samples.
Specifically, when the number ratio exceeds the preset ratio threshold, the sample common word is determined to be a target keyword and every training sample containing it is determined to be a biased training sample. The preset ratio threshold is usually a range threshold, i.e. it has both an upper bound and a lower bound: when the obtained ratio is below the lower bound or above the upper bound, the corresponding sample common word is determined to be a target keyword.
In this implementation, the imbalance between the first and second training samples is judged by a ratio threshold, which avoids the negative effects of an unreasonably chosen absolute difference threshold and further improves the quality of the generated target keywords and biased training samples.
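Steps 301 through 303 can be combined into a single pass over the data, as in the sketch below. The whitespace tokenizer, the helper names, and the exact handling of the ratio's range threshold are illustrative assumptions only.

```python
from collections import Counter

def determine_biased_samples(samples, freq_threshold, low_ratio, high_ratio):
    """samples: list of (text, label), with label 1 = similar, 0 = dissimilar."""
    # Step 301: segment every sample and pool the results.
    tokenised = [(set(text.split()), label) for text, label in samples]
    frequency = Counter(word for text, _ in samples for word in text.split())
    target_keywords = set()
    for word, count in frequency.items():
        if count <= freq_threshold:
            continue  # not a sample common word
        # Step 302: number ratio of first (similar) to second (dissimilar)
        # training samples containing the common word.
        n_first = sum(1 for tok, lab in tokenised if word in tok and lab == 1)
        n_second = sum(1 for tok, lab in tokenised if word in tok and lab == 0)
        ratio = n_first / max(n_second, 1)
        # Step 303: a range threshold - a ratio outside [low_ratio, high_ratio]
        # means the labels are skewed one way or the other.
        if ratio < low_ratio or ratio > high_ratio:
            target_keywords.add(word)
    return [(text, lab) for (text, lab), (tok, _) in zip(samples, tokenised)
            if tok & target_keywords]
```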
Further referring to fig. 4, fig. 4 is a flowchart of another semantic matching model training method provided in the embodiment of the present disclosure, where the process 400 includes the following steps:
step 401, a training sample set is obtained.
Step 402, determining the training samples including the target keywords in the training sample set as biased training samples.
Step 403, obtaining an initial semantic matching model.
Step 404, dividing the training sample set into a biased training sample set and a non-biased training sample set.
In this embodiment, a biased training sample set is generated based on training samples of the training sample set that are determined to be biased training samples, and a non-biased training sample set is generated based on training samples of the training sample set that are not determined to be biased training samples.
Step 405, generating a set selection sequence based on a preset set extraction function.
In this embodiment, a set selection sequence is generated by a preset set extraction function. Each ordinal position of the sequence records which set the training sample at that position is drawn from, the biased or the non-biased training sample set; within the first preset positions of the sequence, the non-biased training sample set is selected more often than the biased one.
In some optional implementations of this embodiment, the set extraction function may be:
k = 100 - (α × i)
where α is a hyperparameter determined by the selected initial semantic matching model, i is the ordinal position in the set selection sequence, and k is the selection parameter for that position.
After the selection parameter k has been determined in this way, an integer r is drawn uniformly at random from [1, 100]. If r ≥ k, the i-th position of the sequence selects the biased training sample set; if r < k, it selects the non-biased training sample set.
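A sketch of this schedule is given below; the function name and the way α and the sequence length are supplied are assumptions for illustration.

```python
import random

def build_set_selection_sequence(length, alpha):
    """For each ordinal i, record which sample set the i-th training step
    draws from, following k = 100 - (alpha * i)."""
    sequence = []
    for i in range(1, length + 1):
        k = 100 - alpha * i
        r = random.randint(1, 100)  # uniform integer from [1, 100]
        # r >= k selects the biased set; since k shrinks as i grows, biased
        # samples become increasingly likely at later positions.
        sequence.append("biased" if r >= k else "non-biased")
    return sequence

# e.g. build_set_selection_sequence(6, alpha=10) tends to place "non-biased"
# in the early positions and "biased" in the later ones.
```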
Step 406, continuously extracting training samples from the non-biased training sample set and the biased training sample set according to the set selection sequence as input, with the labels corresponding to the training samples as output, to continuously train the initial semantic matching model.
In this embodiment, training samples are continuously drawn from the non-biased and biased training sample sets as input according to the set selection sequence determined in step 405, with their corresponding labels as output, and the initial semantic matching model is trained continuously.
For example, when the set selection sequence reads (non-biased, non-biased, biased), a training sample is first extracted from the non-biased training sample set to train the initial model, another sample is then extracted from the non-biased set to train the model obtained from the previous step, and finally a sample is extracted from the biased set to train the model obtained after that.
Steps 401 to 403 above correspond to the embodiment shown in FIG. 2; for the identical content, refer to the corresponding parts of that embodiment, which are not repeated here. On that basis, this embodiment additionally completes the extraction and ordering of biased and non-biased training samples through the set extraction function, which improves extraction efficiency; at the same time, the set selection sequence records how training samples were selected during training, which helps optimize the training process and improve the training quality of the semantic matching model.
In some optional implementations of this embodiment, the semantic matching model training method further includes: in response to receiving a set selection sequence query request, feeding back the set selection sequence for that request.
Specifically, after the set selection sequence query request is received, the set selection sequence generated by the preset set extraction function is returned, so that the sender of the request can follow the training process of the semantic matching model from the sequence and adjust it according to actual requirements, improving the training quality of the semantic matching model.
To show, from an actual usage scenario, the effect of the trained semantic matching model, the present disclosure also provides a scheme that applies the trained model to a real problem. The semantic matching method includes the following steps: acquiring a first sentence to be matched and a second sentence to be matched; inputting the first sentence to be matched and the second sentence to be matched into a semantic matching model for processing, where the semantic matching model is trained on a training sample set including biased training samples, the biased training samples are determined from the training samples in the set that include target keywords, and the target keywords are determined based on the sample common words contained in the training samples; and generating the semantic matching result of the first sentence to be matched and the second sentence to be matched from the output of the semantic matching model.
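At inference time the method reduces to a single call. The model.predict interface and the 0.5 cut-off below are assumptions for illustration; the disclosure does not fix either.

```python
def semantic_match(model, first_sentence, second_sentence):
    """Return the semantic matching result for two sentences to be matched."""
    score = model.predict(first_sentence, second_sentence)  # assumed API
    return "semantically similar" if score >= 0.5 else "semantically dissimilar"
```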
To deepen understanding, the present disclosure further provides a specific implementation in combination with a concrete application scenario, as follows.
A training sample set is acquired, generated from paired sentences labeled as semantically similar or dissimilar. The training samples labeled "semantically similar" are: "where shall we go - where is it better to play", "where can we go on the weekend - where is it better to go to play on the weekend", and "what shall we eat tonight - where can we go to eat tonight"; the training samples labeled "semantically dissimilar" are: "where can we go on the weekend - going out to play in spring is not costly" and "what can we have for a good breakfast - where can we go".
Word segmentation is performed on each training sample in the set, and the segmentation results are collected into a word segmentation result set. A sample common word ("play") whose occurrences exceed the preset frequency threshold ("4 times") exists in the segmentation results, so the number ratio ("2") between the first training samples and the second training samples containing that word is generated.
Since the number ratio ("2") exceeds the preset ratio threshold ("1.5"), the sample common word ("play") is determined to be a target keyword, and the training samples containing it ("where shall we go - where is it better to play", "where can we go on the weekend - where is it better to go to play on the weekend", and "where can we go on the weekend - going out to play in spring is not costly") are determined to be biased training samples.
An initial semantic matching model is acquired and trained continuously, with training samples continuously extracted from the training sample set as input (in the first 3 rounds of training, the probability of extracting non-biased training samples is higher than that of biased training samples) and the labels corresponding to the training samples as output.
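Feeding the walkthrough's data into the determine_biased_samples sketch from earlier reproduces the same outcome. The English renderings of the sentence pairs and the threshold values here are illustrative assumptions.

```python
samples = [
    ("where shall we go - where is it better to play", 1),
    ("where can we go on the weekend - where is it better to go to play on the weekend", 1),
    ("what shall we eat tonight - where can we go to eat tonight", 1),
    ("where can we go on the weekend - going out to play in spring is not costly", 0),
    ("what can we have for a good breakfast - where can we go", 0),
]
# "play" occurs in two similar pairs and one dissimilar pair, so its number
# ratio is 2 / 1 = 2 > 1.5 and the three pairs containing it come back biased.
# (With naive whitespace tokens, ubiquitous function words can trip the same
# check; a real system would segment properly and filter stop words first.)
flagged = determine_biased_samples(samples, freq_threshold=2,
                                   low_ratio=1 / 1.5, high_ratio=1.5)
```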
Further, to better illustrate the technical effect of the semantic matching model training method provided by the present disclosure, FIG. 5 shows how the loss value of the semantic matching model changes with the training epoch when the initial model is trained with biased training samples first and non-biased training samples later (curve 1), with non-biased training samples first and biased training samples later (curve 2), and without distinguishing biased from non-biased training samples (curve 3).
With further reference to FIGS. 6 and 7, as implementations of the methods shown in the figures above, the present disclosure provides an embodiment of a semantic matching model training apparatus and an embodiment of a semantic matching apparatus. The former corresponds to the semantic matching model training method embodiment shown in FIG. 2, and the latter corresponds to the semantic matching method embodiment. The apparatuses can be applied to various electronic devices.
As shown in FIG. 6, the semantic matching model training apparatus 600 of this embodiment may include: a training sample set acquisition unit 601, a biased training sample determination unit 602, a semantic matching model training unit 603, and a semantic matching model generation unit 604. The training sample set acquisition unit 601 is configured to acquire a training sample set; the biased training sample determination unit 602 is configured to determine the training samples in the set that include a target keyword as biased training samples, the target keyword being determined based on sample common words contained in the training samples; the semantic matching model training unit 603 is configured to acquire an initial semantic matching model and continuously train it with training samples from the set, the non-biased training samples being extracted with higher probability than the biased ones during the first preset rounds of training, where the non-biased training samples are the training samples in the set other than the biased ones; and the semantic matching model generation unit 604 is configured to generate a semantic matching model in response to the training round reaching a preset target round.
In this embodiment, for the specific processing of the training sample set acquisition unit 601, the biased training sample determination unit 602, the semantic matching model training unit 603, and the semantic matching model generation unit 604 in the semantic matching model training apparatus 600, and for its technical effects, refer to the related descriptions of steps 201 to 204 in the embodiment corresponding to FIG. 2, which are not repeated here.
In some optional implementations of this embodiment, the training samples are paired sentences labeled as semantically similar or dissimilar, and the biased training sample determination unit 602 includes: a word segmentation result set generation subunit configured to perform word segmentation on each training sample in the training sample set and collect the segmentation results of all training samples into a word segmentation result set; a sample number ratio generation subunit configured to, in response to a sample common word whose occurrences exceed a preset frequency threshold existing in the word segmentation result set, generate the number ratio between the first training samples and the second training samples containing the sample common word, the first training samples being labeled semantically similar and the second training samples semantically dissimilar; and a biased training sample determination subunit configured to, in response to the number ratio exceeding a preset ratio threshold, determine the sample common word as a target keyword and the training samples containing it as biased training samples.
In some optional implementations of this embodiment, the semantic matching model training unit 603 includes: an initial semantic matching model acquisition subunit configured to acquire an initial semantic matching model; a sample set division subunit configured to divide the training sample set into a biased training sample set and a non-biased training sample set; a set selection sequence generation subunit configured to generate a set selection sequence based on a preset set extraction function, where within the first preset positions of the sequence the non-biased training sample set is selected more often than the biased one; and a semantic matching model training subunit configured to continuously extract training samples from the non-biased and biased training sample sets as input according to the set selection sequence, with the labels corresponding to the training samples as output, to continuously train the initial semantic matching model.
In some optional implementations of this embodiment, the semantic matching model training apparatus 600 further includes: a set selection sequence returning unit configured to, in response to receiving a set selection sequence query request, feed back the set selection sequence for that request.
In some optional implementations of this embodiment, the semantic matching model training apparatus 600 further includes: a training sample set resetting unit configured to, in response to the number of training samples in the training sample set being lower than the preset target round, reset the training sample set after all of its training samples have been extracted.
In some optional implementations of this embodiment, in response to the training sample set being reset, the semantic matching model training unit 603 is further configured to continuously extract training samples from the training sample set with equal probability as input, with their corresponding labels as output, and to continue training the initial semantic matching model.
As shown in FIG. 7, the semantic matching apparatus 700 of this embodiment may include: a to-be-matched sentence acquisition unit 701, a semantic similarity matching unit 702, and a matching result output unit 703. The to-be-matched sentence acquisition unit 701 is configured to acquire a first sentence to be matched and a second sentence to be matched; the semantic similarity matching unit 702 is configured to input the two sentences into a semantic matching model for processing, the semantic matching model being trained on a training sample set including biased training samples, the biased training samples being determined from the training samples in the set that include target keywords, and the target keywords being determined based on sample common words contained in the training samples; and the matching result output unit 703 is configured to generate the semantic matching result of the two sentences, semantically similar or semantically dissimilar, from the output of the semantic matching model.
In this embodiment, for the specific processing of the to-be-matched sentence acquisition unit 701, the semantic similarity matching unit 702, and the matching result output unit 703 in the semantic matching apparatus 700, and for its technical effects, refer to the related descriptions in the corresponding method embodiments, which are not repeated here.
The semantic matching model training apparatus and the semantic matching apparatus provided by this embodiment classify training samples based on the sample common words they contain, divide the training sample set into biased and non-biased training samples of different learning difficulty, and concentrate on the harder non-biased training samples in the initial stage of training the semantic matching model. This alleviates the shortcut learning caused by overly easy training samples, improves the quality of the trained semantic matching model, and allows the model to generate semantic similarity matching results more accurately.
According to an embodiment of the present disclosure, the present disclosure also provides an electronic device, including: at least one processor; and a memory communicatively coupled to the at least one processor, the memory storing instructions executable by the at least one processor that, when executed, enable the at least one processor to implement the semantic matching model training method and/or the semantic matching method described in any of the above embodiments.
According to an embodiment of the present disclosure, the present disclosure further provides a readable storage medium storing computer instructions that, when executed, enable a computer to implement the semantic matching model training method and/or the semantic matching method described in any of the above embodiments.
An embodiment of the present disclosure provides a computer program product which, when executed by a processor, implements the semantic matching model training method and/or the semantic matching method described in any of the above embodiments.
FIG. 8 shows a schematic block diagram of an example electronic device 800 that may be used to implement embodiments of the present disclosure. Electronic devices are intended to represent various forms of digital computers, such as laptops, desktops, workstations, personal digital assistants, servers, blade servers, mainframes, and other appropriate computers. Electronic devices may also represent various forms of mobile devices, such as personal digital processors, cellular telephones, smart phones, wearable devices, and other similar computing devices. The components shown herein, their connections and relationships, and their functions, are meant to be examples only, and are not meant to limit implementations of the disclosure described and/or claimed herein.
As shown in FIG. 8, the device 800 includes a computing unit 801 that can perform various appropriate actions and processes according to a computer program stored in a Read Only Memory (ROM) 802 or loaded from a storage unit 808 into a Random Access Memory (RAM) 803. The RAM 803 can also store the various programs and data needed for the operation of the device 800. The computing unit 801, the ROM 802, and the RAM 803 are connected to one another by a bus 804, to which an input/output (I/O) interface 805 is also connected.
A number of components in the device 800 are connected to the I/O interface 805, including: an input unit 806, such as a keyboard, a mouse, or the like; an output unit 807 such as various types of displays, speakers, and the like; a storage unit 808, such as a magnetic disk, optical disk, or the like; and a communication unit 809 such as a network card, modem, wireless communication transceiver, etc. The communication unit 809 allows the device 800 to exchange information/data with other devices via a computer network such as the internet and/or various telecommunication networks.
The computing unit 801 may be any of various general and/or special purpose processing components with processing and computing capabilities. Some examples of the computing unit 801 include, but are not limited to, a Central Processing Unit (CPU), a Graphics Processing Unit (GPU), various dedicated Artificial Intelligence (AI) computing chips, various computing units running machine learning model algorithms, a Digital Signal Processor (DSP), and any suitable processor, controller, or microcontroller. The computing unit 801 performs the methods and processes described above, such as the semantic matching model training method and/or the semantic matching method. For example, in some embodiments, the semantic matching model training method and/or the semantic matching method may be implemented as a computer software program tangibly embodied in a machine-readable medium, such as the storage unit 808. In some embodiments, part or all of the computer program may be loaded and/or installed onto the device 800 via the ROM 802 and/or the communication unit 809. When the computer program is loaded into the RAM 803 and executed by the computing unit 801, one or more steps of the semantic matching model training method and/or the semantic matching method described above can be performed. Alternatively, in other embodiments, the computing unit 801 may be configured to perform the semantic matching model training method and/or the semantic matching method in any other suitable manner (for example, by means of firmware).
Various implementations of the systems and techniques described above may be realized in digital electronic circuitry, integrated circuits, field-programmable gate arrays (FPGAs), application-specific integrated circuits (ASICs), application-specific standard products (ASSPs), systems on chip (SOCs), complex programmable logic devices (CPLDs), computer hardware, firmware, software, and/or combinations thereof. These various embodiments may include: being implemented in one or more computer programs that are executable and/or interpretable on a programmable system including at least one programmable processor, which may be special or general purpose, receiving data and instructions from, and transmitting data and instructions to, a storage system, at least one input device, and at least one output device.
Program code for implementing the methods of the present disclosure may be written in any combination of one or more programming languages. The program code may be provided to a processor or controller of a general-purpose computer, a special-purpose computer, or other programmable data processing apparatus, such that the program code, when executed by the processor or controller, causes the functions/acts specified in the flowcharts and/or block diagrams to be performed. The program code may execute entirely on the machine, partly on the machine, as a stand-alone software package partly on the machine and partly on a remote machine, or entirely on the remote machine or server.
In the context of this disclosure, a machine-readable medium may be a tangible medium that can contain or store a program for use by or in connection with an instruction execution system, apparatus, or device. The machine-readable medium may be a machine-readable signal medium or a machine-readable storage medium. A machine-readable medium may include, but is not limited to, an electronic, magnetic, optical, electromagnetic, infrared, or semiconductor system, apparatus, or device, or any suitable combination of the foregoing. More specific examples of a machine-readable storage medium would include an electrical connection based on one or more wires, a portable computer diskette, a hard disk, a Random Access Memory (RAM), a Read-Only Memory (ROM), an Erasable Programmable Read-Only Memory (EPROM or flash memory), an optical fiber, a Compact Disc Read-Only Memory (CD-ROM), an optical storage device, a magnetic storage device, or any suitable combination of the foregoing.
To provide for interaction with a user, the systems and techniques described here can be implemented on a computer having: a display device (e.g., a CRT (cathode ray tube) or LCD (liquid crystal display) monitor) for displaying information to a user; and a keyboard and a pointing device (e.g., a mouse or a trackball) by which a user can provide input to the computer. Other kinds of devices may also be used to provide for interaction with a user; for example, feedback provided to the user can be any form of sensory feedback (e.g., visual feedback, auditory feedback, or tactile feedback); and input from the user may be received in any form, including acoustic, speech, or tactile input.
The systems and techniques described here can be implemented in a computing system that includes a back-end component (e.g., as a data server), or that includes a middleware component (e.g., an application server), or that includes a front-end component (e.g., a user computer having a graphical user interface or a web browser through which a user can interact with an implementation of the systems and techniques described here), or any combination of such back-end, middleware, or front-end components. The components of the system can be interconnected by any form or medium of digital data communication (e.g., a communication network). Examples of communication networks include: local Area Networks (LANs), wide Area Networks (WANs), and the Internet.
The computer system may include clients and servers. A client and a server are generally remote from each other and typically interact through a communication network. The relationship of client and server arises by virtue of computer programs running on the respective computers and having a client-server relationship to each other. The server may be a cloud server (also known as a cloud computing server or cloud host), a host product in the cloud computing service system that addresses the drawbacks of difficult management and weak service scalability in conventional physical hosts and Virtual Private Server (VPS) services.
According to the technical scheme of the embodiments of the present disclosure, training samples are classified based on the sample common words they contain, dividing the training sample set into biased and non-biased training samples of different learning difficulties, and the harder non-biased training samples are preferentially used to train the semantic matching model in its initial training stage. This alleviates the shortcut learning that arises when training samples are too easy, improves the quality of the trained semantic matching model, and makes the semantic similarity matching results it subsequently generates more accurate.
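To make the sample-classification step concrete, the following Python sketch splits a labelled sentence-pair set into biased and non-biased subsets using the frequency and label-ratio tests described above. The whitespace tokenizer, the threshold values, and all identifiers are illustrative assumptions rather than details taken from the embodiments.

```python
from collections import Counter
from dataclasses import dataclass

@dataclass
class Sample:
    sentence_a: str
    sentence_b: str
    similar: bool  # True if the pair carries the "semantically similar" identifier

def tokens(sample):
    # Stand-in word segmentation: whitespace split; a production system
    # would use a proper Chinese word segmenter here.
    return set((sample.sentence_a + " " + sample.sentence_b).split())

def split_biased(samples, freq_threshold=50, ratio_threshold=3.0):
    # Step 1: collect the word segmentation result set over all samples.
    counts = Counter(tok for s in samples for tok in tokens(s))
    # Step 2: sample common words = words above the preset frequency threshold.
    common = {w for w, c in counts.items() if c > freq_threshold}
    # Step 3: a common word becomes a target keyword when the ratio of
    # similar-labelled to dissimilar-labelled samples containing it
    # exceeds the preset ratio threshold.
    keywords = set()
    for w in common:
        pos = sum(1 for s in samples if w in tokens(s) and s.similar)
        neg = sum(1 for s in samples if w in tokens(s) and not s.similar)
        if neg == 0 or pos / neg > ratio_threshold:
            keywords.add(w)
    # Step 4: samples containing any target keyword are biased; the rest
    # form the non-biased subset.
    biased = [s for s in samples if tokens(s) & keywords]
    unbiased = [s for s in samples if not (tokens(s) & keywords)]
    return biased, unbiased
```

In practice the label skew could be tested in both directions; the one-sided ratio above mirrors the wording of claim 1 below.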
It should be understood that the various flows shown above may be used with steps reordered, added, or deleted. For example, the steps described in the present disclosure may be executed in parallel, sequentially, or in different orders, so long as the desired results of the technical solutions disclosed herein can be achieved, and no limitation is imposed herein.
The above detailed description should not be construed as limiting the scope of the disclosure. It should be understood by those skilled in the art that various modifications, combinations, sub-combinations and substitutions may be made, depending on design requirements and other factors. Any modification, equivalent replacement, and improvement made within the spirit and principle of the present disclosure should be included in the scope of protection of the present disclosure.

Claims (14)

1. A semantic matching model training method, comprising:
acquiring a training sample set;
performing word segmentation on each training sample in the training sample set, and collecting the word segmentation results of all training samples into a word segmentation result set, wherein each training sample is a pair of sentences marked with a semantically similar or semantically dissimilar identifier;
in response to a sample common word whose occurrence frequency exceeds a preset frequency threshold existing in the word segmentation result set, generating a quantity ratio between first training samples and second training samples comprising the sample common word, wherein the first training samples are marked with the semantically similar identifier and the second training samples are marked with the semantically dissimilar identifier;
in response to the quantity ratio exceeding a preset ratio threshold, determining the sample common word as a target keyword, and determining training samples comprising the target keyword as biased training samples;
acquiring an initial semantic matching model, and continuously training the initial semantic matching model using the training samples in the training sample set, wherein, within the first preset number of training rounds, the probability of a non-biased training sample being extracted is higher than that of a biased training sample, the non-biased training samples being the training samples in the training sample set other than the biased training samples; and
generating the semantic matching model in response to the number of training rounds reaching a preset target round.
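As a rough illustration of the training step of claim 1, the sketch below draws a non-biased sample with higher probability during the first preset rounds and samples uniformly afterwards. The warm-up length, the probability value, and the model.step interface are hypothetical placeholders, not limitations of the claim.

```python
import random

def draw(biased, unbiased, round_idx, warmup_rounds=1000, unbiased_prob=0.9):
    # During the first `warmup_rounds` rounds a non-biased sample is drawn
    # with probability `unbiased_prob`; afterwards both subsets are pooled.
    if round_idx < warmup_rounds:
        pool = unbiased if random.random() < unbiased_prob else biased
    else:
        pool = biased + unbiased
    return random.choice(pool)

# One (input, target) pair per training round, stopping at the target round
# (model.step is a placeholder training interface):
# for r in range(target_round):
#     s = draw(biased, unbiased, r)
#     model.step(input=(s.sentence_a, s.sentence_b), target=s.similar)
```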
2. The method of claim 1, wherein acquiring the initial semantic matching model and continuously training the initial semantic matching model using the training samples in the training sample set comprises:
acquiring an initial semantic matching model;
dividing the training sample set into a biased training sample set and a non-biased training sample set;
generating a set selection sequence based on a preset set extraction function, wherein, within the first preset number of positions of the set selection sequence, the non-biased training sample set appears more often than the biased training sample set; and
continuously extracting training samples from the non-biased training sample set and the biased training sample set according to the set selection sequence as input, and continuously training the initial semantic matching model using the identifiers corresponding to the extracted training samples as output.
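One way to realize the preset set extraction function of claim 2 is to precompute the entire set selection sequence before training, for example with a selection probability that decays from strongly favouring the non-biased set to an even split. The linear decay schedule below is an assumption chosen for illustration; the claim only requires that the non-biased set dominate the early positions.

```python
import random

def selection_sequence(total_rounds, warmup=1000, start_prob=0.9):
    # The probability of choosing the non-biased set decays linearly from
    # `start_prob` to 0.5 across the warm-up window, so the non-biased set
    # dominates the first positions of the sequence.
    seq = []
    for r in range(total_rounds):
        p = start_prob - (start_prob - 0.5) * min(r / warmup, 1.0)
        seq.append("non-biased" if random.random() < p else "biased")
    return seq
```

Storing the sequence up front also makes it trivial to serve the query request of claim 3: the stored sequence is simply returned to the requester.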
3. The method of claim 2, further comprising:
in response to receiving a set selection sequence query request, feeding back the set selection sequence for the query request.
4. The method of claim 1, further comprising:
in response to the number of training samples in the training sample set being lower than the preset target round, resetting the training sample set after all training samples in the training sample set have been extracted.
5. The method of claim 4, further comprising, in response to the training sample set being reset:
continuously extracting training samples from the training sample set with equal probability as input, and continuously training the initial semantic matching model using the identifiers corresponding to the extracted training samples as output.
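Claims 4 and 5 together describe a small training sample set that is exhausted, reset, and thereafter sampled with equal probability. A minimal sketch of such a sampler follows; all names and the without-replacement first pass are illustrative assumptions.

```python
import random

class ResettingSampler:
    """Draws without replacement until the set is exhausted, then resets
    the set and switches to equal-probability draws, as in claims 4-5."""

    def __init__(self, samples):
        self.samples = list(samples)
        self.pool = list(samples)
        self.reset_done = False

    def draw(self):
        if not self.pool:                    # every sample has been extracted
            self.pool = list(self.samples)   # reset the training sample set
            self.reset_done = True
        if self.reset_done:
            return random.choice(self.pool)  # equal-probability extraction
        return self.pool.pop(random.randrange(len(self.pool)))
```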
6. A semantic matching method, comprising:
acquiring a first sentence to be matched and a second sentence to be matched;
inputting the first sentence to be matched and the second sentence to be matched into a semantic matching model for processing, wherein the semantic matching model is obtained by the semantic matching model training method of any one of claims 1 to 5; and
generating a semantic matching result for the first sentence to be matched and the second sentence to be matched according to the output of the semantic matching model.
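For intuition only, the snippet below lets an off-the-shelf sentence-embedding model from the sentence-transformers library stand in for the trained semantic matching model of claim 6; the model name, similarity threshold, and example sentences are placeholders rather than the patented model.

```python
from sentence_transformers import SentenceTransformer, util

model = SentenceTransformer("all-MiniLM-L6-v2")  # stand-in for the trained model
a = "How do I reset my password?"
b = "How can I recover my account password?"
emb = model.encode([a, b])
score = util.cos_sim(emb[0], emb[1]).item()
print("semantically similar" if score > 0.7 else "semantically dissimilar")
```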
7. A semantic matching model training apparatus, comprising:
a training sample set acquisition unit configured to acquire a training sample set;
a biased training sample determining unit configured to: perform word segmentation on each training sample in the training sample set and collect the word segmentation results of all training samples into a word segmentation result set, wherein each training sample is a pair of sentences marked with a semantically similar or semantically dissimilar identifier; in response to a sample common word whose occurrence frequency exceeds a preset frequency threshold existing in the word segmentation result set, generate a quantity ratio between first training samples and second training samples comprising the sample common word, wherein the first training samples are marked with the semantically similar identifier and the second training samples are marked with the semantically dissimilar identifier; and, in response to the quantity ratio exceeding a preset ratio threshold, determine the sample common word as a target keyword and determine training samples comprising the target keyword as biased training samples;
a semantic matching model training unit configured to acquire an initial semantic matching model and continuously train the initial semantic matching model using the training samples in the training sample set, wherein, within the first preset number of training rounds, the probability of a non-biased training sample being extracted is higher than that of a biased training sample, the non-biased training samples being the training samples in the training sample set other than the biased training samples; and
a semantic matching model generation unit configured to generate the semantic matching model in response to the number of training rounds reaching a preset target round.
8. The apparatus of claim 7, wherein the semantic matching model training unit comprises:
an initial semantic matching model obtaining subunit configured to obtain an initial semantic matching model;
a sample set classification subunit configured to classify the training sample set into a biased training sample set and a non-biased training sample set;
a set selection sequence generation subunit configured to generate a set selection sequence based on a preset set extraction function, wherein, within the first preset number of positions of the set selection sequence, the non-biased training sample set appears more often than the biased training sample set; and
a semantic matching model training subunit configured to continuously extract training samples from the non-biased training sample set and the biased training sample set according to the set selection sequence as input, and continuously train the initial semantic matching model using the identifiers corresponding to the extracted training samples as output.
9. The apparatus of claim 8, further comprising:
a set selection sequence returning unit configured to, in response to receiving a set selection sequence query request, feed back the set selection sequence for the query request.
10. The apparatus of claim 7, further comprising:
a training sample set resetting unit configured to, in response to the number of training samples in the training sample set being lower than the preset target round, reset the training sample set after all training samples in the training sample set have been extracted.
11. The apparatus of claim 10, wherein, in response to the training sample set being reset, the semantic matching model training unit is further configured to continuously extract training samples from the training sample set with equal probability as input and continuously train the initial semantic matching model using the identifiers corresponding to the extracted training samples as output.
12. A semantic matching apparatus comprising:
a sentence to be matched acquisition unit configured to acquire a first sentence to be matched and a second sentence to be matched;
a semantic similarity matching unit configured to input the first sentence to be matched and the second sentence to be matched into a semantic matching model for processing, wherein the semantic matching model is obtained by the semantic matching model training apparatus of any one of claims 7-11; and
a matching result output unit configured to generate a semantic matching result for the first sentence to be matched and the second sentence to be matched according to the output of the semantic matching model.
13. An electronic device, comprising:
at least one processor; and
a memory communicatively coupled to the at least one processor; wherein,
the memory stores instructions executable by the at least one processor to enable the at least one processor to perform the semantic matching model training method of any one of claims 1-5 and/or the semantic matching method of claim 6.
14. A non-transitory computer-readable storage medium storing computer instructions for causing a computer to perform the semantic matching model training method of any one of claims 1-5 and/or the semantic matching method of claim 6.
CN202210117994.XA 2022-02-08 2022-02-08 Semantic matching model training method, semantic matching method and related device Active CN114444514B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202210117994.XA CN114444514B (en) 2022-02-08 2022-02-08 Semantic matching model training method, semantic matching method and related device

Publications (2)

Publication Number Publication Date
CN114444514A (en) 2022-05-06
CN114444514B (en) 2023-01-24

Family

ID=81372027

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202210117994.XA Active CN114444514B (en) 2022-02-08 2022-02-08 Semantic matching model training method, semantic matching method and related device

Country Status (1)

Country Link
CN (1) CN114444514B (en)

Families Citing this family (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN116933896B (en) * 2023-09-15 2023-12-15 上海燧原智能科技有限公司 Super-parameter determination and semantic conversion method, device, equipment and medium

Citations (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN104820629A (en) * 2015-05-14 2015-08-05 中国电子科技集团公司第五十四研究所 Intelligent system and method for emergently processing public sentiment emergency
WO2018006629A1 (en) * 2016-07-06 2018-01-11 北京搜狗科技发展有限公司 Prescription matching method and device, and device for prescription matching
JP2020144852A (en) * 2019-03-04 2020-09-10 富士通株式会社 Device and method for mixed training meta learning network
CN112528677A (en) * 2020-12-22 2021-03-19 北京百度网讯科技有限公司 Training method and device of semantic vector extraction model and electronic equipment
CN112560425A (en) * 2020-12-24 2021-03-26 北京百度网讯科技有限公司 Template generation method and device, electronic equipment and storage medium
CN113033664A (en) * 2021-03-26 2021-06-25 网易(杭州)网络有限公司 Question-answering model training method, question-answering method, device, equipment and storage medium
CN113962293A (en) * 2021-09-29 2022-01-21 中国科学院计算机网络信息中心 A Name Disambiguation Method and System Based on LightGBM Classification and Representation Learning

Family Cites Families (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US9286892B2 (en) * 2014-04-01 2016-03-15 Google Inc. Language modeling in speech recognition
CN107122346B (en) * 2016-12-28 2018-02-27 平安科技(深圳)有限公司 The error correction method and device of a kind of read statement
CN111666416B (en) * 2019-03-08 2023-06-16 百度在线网络技术(北京)有限公司 Method and device for generating semantic matching model
CN111738010B (en) * 2019-03-20 2023-10-17 百度在线网络技术(北京)有限公司 Method and device for generating semantic matching model
CN110457708B (en) * 2019-08-16 2023-05-16 腾讯科技(深圳)有限公司 Vocabulary mining method and device based on artificial intelligence, server and storage medium

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
Research and Implementation of Unsupervised Domain Adaptation Techniques for Relation Extraction; Yan Bo; China Master's Theses Full-text Database (Information Science and Technology); 2021-04-15 (No. 4); I138-1047 *

Similar Documents

Publication Publication Date Title
CN114549874A (en) Training method of multi-target image-text matching model, image-text retrieval method and device
CN110909165A (en) Data processing method, device, medium and electronic equipment
US11741094B2 (en) Method and system for identifying core product terms
CN113360700B (en) Training of image-text retrieval model, image-text retrieval method, device, equipment and medium
US12118770B2 (en) Image recognition method and apparatus, electronic device and readable storage medium
CN114444619B (en) Sample generation method, training method, data processing method and electronic device
EP4057283A2 (en) Method for detecting voice, method for training, apparatuses and smart speaker
CN114037059A (en) Pre-training model, model generation method, data processing method and data processing device
CN112906368B (en) Industry text increment method, related device and computer program product
CN112926298A (en) News content identification method, related device and computer program product
CN112148841A (en) Object classification and classification model construction method and device
CN112699237B (en) Label determination method, device and storage medium
CN114692778A (en) Multi-modal sample set generation method, training method and device for intelligent inspection
CN115690443A (en) Feature extraction model training method, image classification method and related device
CN111782785B (en) Automatic question and answer method, device, equipment and storage medium
CN114444514B (en) Semantic matching model training method, semantic matching method and related device
CN111625636A (en) Man-machine conversation refusal identification method, device, equipment and medium
CN112818167A (en) Entity retrieval method, entity retrieval device, electronic equipment and computer-readable storage medium
CN116340831B (en) Information classification method and device, electronic equipment and storage medium
CN116204624A (en) Response method, response device, electronic equipment and storage medium
CN116166814A (en) Event detection method, device, equipment and storage medium
CN113806541A (en) Emotion classification method and emotion classification model training method and device
CN115033701B (en) Text vector generation model training method, text classification method and related device
CN116244413B (en) New intention determining method, apparatus and storage medium
CN115828915B (en) Entity disambiguation method, device, electronic equipment and storage medium

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant