
CN113919506B - A model training online method, device, storage medium and equipment - Google Patents

A model training online method, device, storage medium and equipment

Info

Publication number
CN113919506B
CN113919506B
Authority
CN
China
Prior art keywords
model
test
online
evaluation value
training
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202111175935.XA
Other languages
Chinese (zh)
Other versions
CN113919506A (en)
Inventor
王阳阳
吴良庆
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Jingdong Technology Information Technology Co Ltd
Original Assignee
Jingdong Technology Information Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Jingdong Technology Information Technology Co Ltd filed Critical Jingdong Technology Information Technology Co Ltd
Priority to CN202111175935.XA priority Critical patent/CN113919506B/en
Publication of CN113919506A publication Critical patent/CN113919506A/en
Application granted granted Critical
Publication of CN113919506B publication Critical patent/CN113919506B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical


Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N 20/00 Machine learning
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F 11/00 Error detection; Error correction; Monitoring
    • G06F 11/36 Prevention of errors by analysis, debugging or testing of software
    • G06F 11/3668 Testing of software
    • G06F 11/3672 Test management
    • G06F 11/3688 Test management for test execution, e.g. scheduling of test suites
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F 9/00 Arrangements for program control, e.g. control units
    • G06F 9/06 Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
    • G06F 9/46 Multiprogramming arrangements
    • G06F 9/50 Allocation of resources, e.g. of the central processing unit [CPU]
    • G06F 9/5005 Allocation of resources, e.g. of the central processing unit [CPU] to service a request
    • G06F 9/5011 Allocation of resources, e.g. of the central processing unit [CPU] to service a request the resources being hardware resources other than CPUs, Servers and Terminals
    • G06F 9/5016 Allocation of resources, e.g. of the central processing unit [CPU] to service a request the resources being hardware resources other than CPUs, Servers and Terminals the resource being the memory
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F 9/00 Arrangements for program control, e.g. control units
    • G06F 9/06 Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
    • G06F 9/46 Multiprogramming arrangements
    • G06F 9/50 Allocation of resources, e.g. of the central processing unit [CPU]
    • G06F 9/5005 Allocation of resources, e.g. of the central processing unit [CPU] to service a request
    • G06F 9/5011 Allocation of resources, e.g. of the central processing unit [CPU] to service a request the resources being hardware resources other than CPUs, Servers and Terminals
    • G06F 9/5022 Mechanisms to release resources

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Software Systems (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Artificial Intelligence (AREA)
  • Quality & Reliability (AREA)
  • Computer Hardware Design (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Data Mining & Analysis (AREA)
  • Evolutionary Computation (AREA)
  • Medical Informatics (AREA)
  • Computing Systems (AREA)
  • Mathematical Physics (AREA)
  • Management, Administration, Business Operations System, And Electronic Commerce (AREA)

Abstract

The application discloses a model training online method, a device, a storage medium and equipment. When a test version number input by a user is received, the model to be online corresponding to the test version number is identified as the test model, and the evaluation value of that model is identified as the test evaluation value. When the test evaluation value is greater than the target evaluation value and a test instruction from the user is received, a preset algorithm service is started to run an AB test between the test model and the current online model, yielding a test result for each. If the test result of the test model is higher than that of the current online model, the memory of the current online model is released and the state of the test model is updated to online. With this scheme, the training process and the online process of the model require neither an information synchronization mechanism nor manual participation, so the labor cost of bringing trained models online can be effectively reduced.

Description

Model training online method, device, storage medium and equipment
Technical Field
The application relates to the field of intelligent customer service dialogue systems, in particular to a model training online method, device, storage medium and equipment.
Background
In intelligent customer service dialogue systems, a variety of models are generally used; common types include text classification, semantic matching and entity recognition. With the development of natural language processing (Natural Language Processing, NLP) algorithms, these types of models have gradually evolved from early statistical learning into deep learning algorithms. Training such models places ever higher performance requirements on the machine, and an ordinary central processing unit (Central Processing Unit, CPU) machine cannot meet the demands of deep learning, so many large companies have begun to purchase GPU and TPU machines for training deep learning models.
If the production environment is to employ a model with a large structure, graphics processor (Graphics Processing Unit, GPU) and tensor processor (Tensor Processing Unit, TPU) resources are required for training. GPU machines are generally offline machines, and because many enterprises isolate the production environment, machines in the production environment cannot obtain data from offline services, and offline services cannot obtain data from the production environment. A method of interaction between the production environment and the offline GPU machines is therefore needed to bring trained models online. The common practice is to isolate model training from the online process, debug the model manually and repeatedly to obtain a final model, and then subject the final model and the model currently in use to an AB test to determine whether to bring the final model online.
Clearly, the training process and the online process of the final model then require an information synchronization mechanism. Most such mechanisms, however, are carried out manually: adopting them increases the complexity of the production environment, the AB test process consumes a large amount of computing resources, and the online process requires additional development for different types of models and businesses, which greatly increases labor cost.
Disclosure of Invention
The application provides a model training online method, a device, a storage medium and equipment, and aims to reduce labor cost of model training online.
In order to achieve the above object, the present application provides the following technical solutions:
a model training online method, comprising:
Under the condition that a test version number input by a user is received, marking a model to be online corresponding to the test version number as a test model, and marking an evaluation value of the test model as a test evaluation value, wherein the model to be online and the evaluation value of the model to be online are obtained by training a model to be trained;
Under the condition that the test evaluation value is larger than a target evaluation value and a test instruction of a user is received, starting a preset algorithm service to conduct AB test on the test model and the current online model to obtain a test result of the test model and a test result of the current online model;
And under the condition that the test result of the test model is higher than the test result of the current online model, releasing the memory of the current online model, and updating the state of the test model to be online.
Optionally, the model to be online and the evaluation value of the model to be online are all obtained by training the model to be trained, and the method includes:
creating a training task of the model to be trained based on the type and the parameters of the model to be trained, and storing the training corpus of the model to be trained to a cloud;
The training task is sent to an offline graphics processor, the offline graphics processor is triggered to acquire the training corpus from the cloud, and the model to be trained is trained according to the type, the parameters and the training corpus to obtain the model to be online and an evaluation value of the model to be online;
Setting a state label and a version number for the to-be-online model, and storing information of the to-be-online model into a preset model table.
Optionally, the creating the training task of the model to be trained based on the type and the parameters of the model to be trained includes:
Obtaining the type, parameters and training corpus of a model to be trained;
Setting a state label for the model to be trained, and storing training corpus of the model to be trained to a cloud;
Storing the type, the parameters and the state labels of the model to be trained into a database;
under the condition that an access request sent by an offline graphic processor is received, a training task of the model to be trained is created based on the type and the parameters of the model to be trained, and the state of the model to be trained is updated to be in training.
Optionally, setting a status tag and a version number for the to-be-online model, and storing information of the to-be-online model into a preset model table, including:
setting a version number for the to-be-online model under the condition that the to-be-online model and the evaluation value sent by the offline graphics processor are received;
storing the model to be online and the evaluation value into a database, and updating the state of the model to be trained to be training completion;
setting a state label for the to-be-online model, and storing information of the to-be-online model into a preset model table;
and storing the model to be online to a cloud end, and displaying the model to be online and the evaluation value to a user through a preset front-end interface.
Optionally, under the condition that the test version number input by the user is received, identifying the to-be-online model corresponding to the test version number as a test model, and identifying the evaluation value of the test model as a test evaluation value, including:
the offline graphics processor sends the models to be online and the evaluation values of the models to be online to a database in advance;
under the condition that a test version number input by a user is received, selecting a to-be-online model corresponding to the test version number from the database, marking the to-be-online model as a test model, and marking an evaluation value of the test model as a test evaluation value;
And updating the state of the test model to be tested.
Optionally, under the condition that the test evaluation value is greater than the target evaluation value and a test instruction of a user is received, starting a preset algorithm service to perform an AB test on the test model and the current online model to obtain a test result of the test model and a test result of the current online model, including:
comparing the test evaluation value with a target evaluation value;
Judging whether the current online model is the test model or not under the condition that the test evaluation value is larger than the target evaluation value;
If the current online model is the test model, prompting a user that the test model is in an online state, and updating the state of the test model to be online;
If the current online model is not the test model, under the condition that a test instruction of the user is received, starting a preset algorithm service to conduct AB test on the test model and the current online model, and obtaining test results of the test model and the current online model.
Optionally, the method further comprises:
And under the condition that the test performance of the test model is lower than that of the current online model, releasing the internal memory of the test model, and deleting the information of the test model from the model table.
A model training on-line device, comprising:
An identification unit, configured to identify, when a test version number input by a user is received, the model to be online corresponding to the test version number as a test model, and to identify the evaluation value of the test model as a test evaluation value;
The test unit is used for starting a preset algorithm service to conduct AB test on the test model and the current online model under the condition that the test evaluation value is larger than a target evaluation value and a test instruction of a user is received, so as to obtain a test result of the test model and a test result of the current online model;
And the online unit is used for releasing the memory of the current online model and updating the state of the test model into online under the condition that the test result of the test model is higher than the test result of the current online model.
A computer readable storage medium comprising a stored program, wherein the program performs the model training on-line method.
The model training online equipment comprises a processor, a memory and a bus, wherein the processor is connected with the memory through the bus;
The memory is used for storing a program, and the processor is used for running the program, wherein the model training online method is executed when the program runs.
According to the technical scheme provided by the application, when the test version number input by the user is received, the model to be online corresponding to the test version number is identified as the test model, and its evaluation value is identified as the test evaluation value. When the test evaluation value is greater than the target evaluation value and a test instruction from the user is received, a preset algorithm service is started to run an AB test between the test model and the current online model, yielding a test result for each. The target evaluation value is obtained by evaluating the current online model on a test set. If the test result of the test model is higher than that of the current online model, the memory of the current online model is released and the state of the test model is updated to online. With this scheme, the training process and the online process of the model require neither an information synchronization mechanism nor manual participation, so the labor cost of bringing trained models online is effectively reduced.
Drawings
In order to more clearly illustrate the embodiments of the application or the technical solutions in the prior art, the drawings that are required in the embodiments or the description of the prior art will be briefly described, it being obvious that the drawings in the following description are only some embodiments of the application, and that other drawings may be obtained according to these drawings without inventive effort for a person skilled in the art.
FIG. 1 is a schematic diagram of a model training online method according to an embodiment of the present application;
FIG. 2 is a schematic diagram of another model training online method according to an embodiment of the present application;
Fig. 3 is a schematic diagram of a model training online device according to an embodiment of the present application.
Detailed Description
The following description of the embodiments of the present application will be made clearly and completely with reference to the accompanying drawings, in which it is apparent that the embodiments described are only some embodiments of the present application, but not all embodiments. All other embodiments, which can be made by those skilled in the art based on the embodiments of the application without making any inventive effort, are intended to be within the scope of the application.
As shown in fig. 1, which is a schematic diagram of a model training online method according to an embodiment of the present application, the method is applied in a production environment and includes the following steps:
s101, obtaining the type, parameters and training corpus of the model to be trained.
Parameters include, but are not limited to, the training algorithm, the training task, the maximum length of the training corpus, whether to concatenate the above information, and so on.
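By way of illustration only, the following Python sketch shows one way such a training request could be represented before it is written to the database in S103; every field name and value here is an assumption made for the example, not something prescribed by the present application.

```python
# Hypothetical representation of a training request (all names are illustrative).
training_request = {
    "model_type": "text_classification",   # e.g. text classification, semantic matching, entity recognition
    "parameters": {
        "algorithm": "bert_finetune",       # training algorithm
        "task": "intent_detection",         # training task
        "max_corpus_length": 128,           # maximum length of the training corpus samples
        "concat_context": True,             # whether to concatenate the above information
    },
    "corpus_uri": "cloud://model-training/corpus/intent_v1.jsonl",  # corpus stored to the cloud (S102)
    "status": "pending",                    # state label set in S102/S103
}
```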
S102, setting a state label for the model to be trained, and storing training corpus of the model to be trained to a cloud end.
The training of the model to be trained is performed in an offline environment (i.e., on an offline graphics processor), and the offline graphics processor cannot directly access the database. For this reason, the training corpus of the model to be trained needs to be stored in the cloud so that the offline graphics processor can conveniently retrieve it.
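A minimal sketch of this step is given below, assuming a generic object-storage client; the ObjectStorageClient class, the bucket name and the object key are illustrative assumptions rather than part of the described system.

```python
# Hypothetical sketch of S102: upload the training corpus to cloud object storage
# so that the isolated offline graphics processor can fetch it later.
class ObjectStorageClient:
    def upload(self, bucket: str, key: str, local_path: str) -> str:
        # A provider-specific upload call would go here; the returned URI is illustrative.
        return f"cloud://{bucket}/{key}"

def store_corpus_to_cloud(local_path: str) -> str:
    client = ObjectStorageClient()
    return client.upload("model-training", "corpus/to_be_trained.jsonl", local_path)
```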
And S103, storing the type, the parameters and the state labels of the model to be trained into a database.
And S104, under the condition that an access request sent by the offline graphic processor is received, creating a training task of the model to be trained based on the type and the parameters of the model to be trained, and updating the state of the model to be trained into training.
The offline graphics processor can send access requests through a preset HTTP interface. In addition, by pre-deploying a timed access service on the offline graphics processor, the offline graphics processor can be made to send access requests at regular intervals.
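The timed access service could look roughly like the following Python sketch, assuming the third-party requests library, a plain HTTP polling endpoint and a 60-second interval; the URL and function names are assumptions for the example.

```python
# Hypothetical sketch of the timed access service pre-deployed on the offline
# graphics processor (S104): it periodically polls a production-side HTTP
# interface for pending training tasks.
import time
import requests

PRODUCTION_TASK_ENDPOINT = "http://production-env.example.com/api/training-tasks"  # assumed URL

def run_training_task(task: dict) -> None:
    """Placeholder for the training procedure described in S105."""

def poll_for_training_tasks() -> None:
    while True:
        resp = requests.get(PRODUCTION_TASK_ENDPOINT, timeout=10)  # access request via the HTTP interface
        if resp.ok:
            for task in resp.json().get("tasks", []):
                run_training_task(task)
        time.sleep(60)  # timed access: poll once per minute
```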
And S105, transmitting the training task to the offline graphic processor, triggering the offline graphic processor to acquire training corpus of the model to be trained from the cloud, and training the model to be trained according to the type, the parameters and the training corpus of the model to be trained to obtain a model file and a result file.
The model file contains the model to be online, which is obtained by training the model to be trained; the result file contains an evaluation value used to evaluate the training effect of the model to be online, where the evaluation value is obtained by evaluating the model to be online on a test set.
It should be noted that the offline graphics processor is generally pre-configured with a resource table that records the usage states of its own computing resources. Before executing the training task, the offline graphics processor queries the usage states of the computing resources in the resource table; if there is a computing resource whose usage state is idle, it identifies that resource as the target resource and uses the target resource to execute the training task.
Generally, the offline graphics processor carries out the training of the model to be trained by calling a pre-deployed training procedure: the type, parameters and training corpus of the model to be trained are configured in the code of the training procedure, and the procedure is run to generate a model file and a result file.
In addition, the offline graphics processor marks the usage state of the target resource as in use while it executes the training task with that resource. After the model file and the result file are obtained, i.e., after the training task has finished, the offline graphics processor marks the usage state of the target resource as idle again.
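The resource-table bookkeeping described above can be sketched as follows, assuming an in-memory table; the dictionary, the file names and the two placeholder functions are illustrative assumptions.

```python
# Hypothetical sketch of the resource table: query for an idle computing resource,
# mark it as in use while the training task runs, and mark it idle again once the
# model file and result file have been produced.
resource_table = {
    "gpu-0": "idle",
    "gpu-1": "in_use",
}

def run_training_procedure(task: dict, device: str):
    """Placeholder for the pre-deployed training procedure (S105)."""
    return "model.bin", "result.json"

def upload_results(model_file: str, result_file: str) -> None:
    """Placeholder for feeding the model file and result file back to production."""

def execute_training_task(task: dict) -> None:
    target = next((r for r, state in resource_table.items() if state == "idle"), None)
    if target is None:
        return  # no idle computing resource; try again on the next poll
    resource_table[target] = "in_use"
    try:
        model_file, result_file = run_training_procedure(task, device=target)
        upload_results(model_file, result_file)
    finally:
        resource_table[target] = "idle"  # the target resource becomes idle after the task ends
```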
And S106, setting a version number for the to-be-online model under the condition that the to-be-online model and the evaluation value sent by the offline graphics processor are received.
And S107, storing the model to be online and the evaluation value into a database, and updating the state of the model to be trained to be training completion.
S108, setting a state label for the to-be-online model, and storing information of the to-be-online model into a preset model table.
And S109, storing the model to be online to a cloud end, and displaying the model to be online and the evaluation value to a user through a preset front end interface.
The process shown in S101-S109 mainly describes how the production environment issues the training task for the model to be trained to the offline graphics processor and how, after the offline graphics processor completes the task, the model file and the result file obtained by training are fed back to the production environment, thereby achieving automatic training of the model to be trained. Compared with the prior art, the training of the model to be trained requires no manual participation, which effectively reduces the labor cost of bringing trained models online.
S110, under the condition that the test version number input by the user is received, selecting a model to be online corresponding to the test version number from the database, and marking the model as a test model.
And S111, identifying the evaluation value of the test model as a test evaluation value.
S112, updating the state of the test model to be tested.
And S113, comparing the test evaluation value with the target evaluation value.
The target evaluation value is obtained by evaluating the current online model on a test set. In the embodiment of the application, the target evaluation value is used to evaluate the training effect of the current online model.
And S114, judging whether the current online model is a test model or not under the condition that the test evaluation value is larger than the target evaluation value.
If the current online model is a test model, S115 is executed, otherwise S116 is executed.
To avoid redundant database operations caused by bringing the test model online repeatedly, it must first be judged whether the current online model is the test model; if it is, the test model is already in the online state and no online operation is needed.
S115, prompting the user that the test model is in the online state, and updating the state of the test model to be online.
And S116, under the condition that a test instruction of a user is received, starting a preset algorithm service to perform AB test on the test model and the current online model, so as to obtain the test result of the test model and the test result of the current online model.
After S116 is performed, S117 is continued.
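The decision flow of S113-S116 can be sketched in Python as follows; the dictionary fields, the notify_user helper and the returned labels are assumptions made for the example.

```python
# Hypothetical sketch of S113-S116: compare the test evaluation value with the
# target evaluation value, skip the AB test if the test model is already online,
# and otherwise wait for the user's test instruction before starting the AB test.
def notify_user(message: str) -> None:
    """Placeholder for the front-end prompt."""
    print(message)

def decide_next_step(test_model: dict, online_model: dict) -> str:
    if test_model["evaluation"] <= online_model["evaluation"]:  # target evaluation value
        return "keep_current_model"
    if test_model["version"] == online_model["version"]:
        notify_user("The test model is already in the online state.")  # S115
        test_model["status"] = "online"
        return "already_online"
    return "await_test_instruction"  # S116: run the AB test once the user instructs it
```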
The algorithm service is configured to perform the following logic (a minimal code sketch follows this list):
1. Periodically query the model table and judge whether it contains the information of the test model.
2. If the model table contains the information of the test model, load the test model obtained from the cloud, add first tracking-point (buried point) information to the current online model, and add second tracking-point information to the test model.
3. Split the traffic of the current service by message id into a first traffic share and a second traffic share; have the first share call the current online model and the second share call the test model.
4. Correlate the information recorded under different message ids through the first tracking-point information to obtain the service metric of the first traffic share, and take its value as the test result of the current online model.
5. Correlate the information recorded under different message ids through the second tracking-point information to obtain the service metric of the second traffic share, and take its value as the test result of the test model.
6. If the model table does not contain the information of the test model, send a prompt that the AB test failed to the user.
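As noted above, a minimal Python sketch of steps 3-5 follows, assuming that message ids are split by parity and that the service metric is a simple average of a per-request score; every name here is illustrative.

```python
# Hypothetical sketch of the traffic split and metric collection performed by
# the algorithm service: route each request by message id, record a tracking
# point per call, and aggregate a service metric per model.
from collections import defaultdict

metrics = defaultdict(list)  # model label -> recorded metric values (the tracking-point data)

def score(result: dict) -> float:
    """Placeholder for the business-specific service metric of one request."""
    return float(result.get("confidence", 0.0))

def route_request(message_id: int, request, online_model, test_model) -> dict:
    # Even message ids form the first traffic share, odd ids the second share.
    label = "online" if message_id % 2 == 0 else "test"
    model = online_model if label == "online" else test_model
    result = model.predict(request)
    metrics[label].append(score(result))  # correlate by tracking point
    return result

def ab_test_results() -> dict:
    # The aggregated metric of each traffic share is that model's AB test result.
    return {label: sum(vals) / len(vals) for label, vals in metrics.items() if vals}
```

In practice the split ratio and the service metric would be whatever the business requires; parity splitting and an averaged confidence score are used here only to keep the sketch short.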
S117, comparing the test result of the test model with the test result of the current online model.
And S118, under the condition that the test result of the test model is higher than that of the current online model, releasing the memory of the current online model, and updating the state of the test model to be online.
And S119, under the condition that the test result of the test model is lower than the test result of the current online model, releasing the internal memory of the test model, and deleting the information of the test model from the model table.
And S120, under the condition that the test model is detected to be a new model, loading the test model, and updating the state of the test model to be on line.
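The switch-over of S117-S119 can be sketched as follows; the release() method, the results dictionary (for example as produced by the sketch above) and the model_table structure are assumptions for the example.

```python
# Hypothetical sketch of S117-S119: compare the AB test results, release the
# memory of the losing model, and update the states and the model table.
def finish_ab_test(test_model, online_model, results: dict, model_table: dict) -> None:
    if results.get("test", 0.0) > results.get("online", 0.0):
        online_model.release()                      # release the memory of the current online model
        test_model.status = "online"                # S118: the test model goes online
    else:
        test_model.release()                        # release the memory of the test model
        model_table.pop(test_model.version, None)   # S119: delete its information from the model table
```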
In summary, with the scheme shown in this embodiment, the training process and the online process of the model require neither an information synchronization mechanism nor manual participation, so the labor cost of training models and bringing them online is effectively reduced.
It should be noted that S119 and S120 mentioned in the foregoing embodiment are optional implementations of the model training online method of the present application. The flow of the above embodiment can therefore be summarized as the method shown in fig. 2.
As shown in fig. 2, which is a schematic diagram of another model training online method according to an embodiment of the present application, the method includes the following steps:
And S201, under the condition that the test version number input by the user is received, identifying the model to be online corresponding to the test version number as a test model, and identifying the evaluation value of the test model as a test evaluation value.
The model to be online and the evaluation value of the model to be online are obtained by training the model to be trained.
And S202, under the condition that the test evaluation value is larger than the target evaluation value and a test instruction of a user is received, starting a preset algorithm service to perform AB test on the test model and the current online model to obtain the test result of the test model and the test result of the current online model.
The target evaluation value is obtained by evaluating the current online model on a test set.
And S203, under the condition that the test result of the test model is higher than that of the current online model, releasing the memory of the current online model, and updating the state of the test model to be online.
In summary, with the scheme shown in this embodiment, the training process and the online process of the model require neither an information synchronization mechanism nor manual participation, so the labor cost of training models and bringing them online is effectively reduced.
Corresponding to the model training online method provided by the embodiment of the application, the embodiment of the application also provides a model training online device.
As shown in fig. 3, an architecture diagram of a model training online device according to an embodiment of the present application includes:
the identifying unit 100 is configured to identify, when a test version number input by a user is received, a model to be online corresponding to the test version number as a test model, and an evaluation value of the test model as a test evaluation value.
The identification unit 100 is specifically configured to: create a training task for the model to be trained based on the type and parameters of the model to be trained; store the training corpus of the model to be trained to the cloud; send the training task to the offline graphics processor; trigger the offline graphics processor to acquire the training corpus from the cloud and train the model to be trained according to the type, parameters and training corpus, obtaining the model to be online and its evaluation value, where the evaluation value is obtained by evaluating the model to be online on a test set; set a state tag and a version number for the model to be online; and store the information of the model to be online into a preset model table.
The identification unit 100 is specifically configured to: obtain the type, parameters and training corpus of the model to be trained; set a state label for the model to be trained and store its training corpus in the cloud; store the type, parameters and state label of the model to be trained in a database; and, when an access request sent by the offline graphics processor is received, create a training task for the model to be trained based on its type and parameters and update the state of the model to be trained to in training.
The identification unit 100 is specifically configured to: set a version number for the model to be online when the model to be online and its evaluation value sent by the offline graphics processor are received; store the model to be online and the evaluation value in the database and update the state of the model to be trained to training completed; set a state label for the model to be online and store its information in a preset model table; and store the model to be online to the cloud and display the model to be online and the evaluation value to the user through a preset front-end interface.
The identification unit 100 is specifically configured to: store, in advance, the models to be online and their evaluation values sent by the offline graphics processor into the database; when the test version number input by a user is received, select the model to be online corresponding to the test version number from the database, identify it as the test model and identify its evaluation value as the test evaluation value; and update the state of the test model to to-be-tested.
And the test unit 200 is used for starting a preset algorithm service to perform an AB test on the test model and the current online model when the test evaluation value is greater than the target evaluation value and a test instruction of a user is received, so as to obtain the test result of the test model and the test result of the current online model; the target evaluation value is obtained by evaluating the current online model on the test set.
The test unit 200 is specifically configured to compare the test evaluation value with the target evaluation value, determine whether the current online model is a test model if the test evaluation value is greater than the target evaluation value, prompt the user that the test model is in an online state and update the state of the test model to be online if the current online model is the test model, and start a preset algorithm service to perform an AB test on the test model and the current online model to obtain a test result of the test model and a test result of the current online model if the current online model is not the test model, under the condition that a test instruction of the user is received.
And the online unit 300 is configured to perform memory release on the current online model and update the state of the test model to online when the test result of the test model is higher than the test result of the current online model.
And the deleting unit 400 is configured to release the memory of the test model and delete the information of the test model from the model table when the test result of the test model is lower than the test result of the current online model.
In summary, with the scheme shown in this embodiment, the training process and the online process of the model require neither an information synchronization mechanism nor manual participation, so the labor cost of training models and bringing them online is effectively reduced.
The application also provides a computer readable storage medium, wherein the computer readable storage medium comprises a stored program, and the program executes the model training online method provided by the application.
The application also provides model training online equipment which comprises a processor, a memory and a bus. The processor is connected with the memory through a bus, the memory is used for storing a program, and the processor is used for running the program, wherein the model training online method provided by the application is executed when the program runs, and the method comprises the following steps:
Under the condition that a test version number input by a user is received, marking a model to be online corresponding to the test version number as a test model, and marking an evaluation value of the test model as a test evaluation value, wherein the model to be online and the evaluation value of the model to be online are obtained by training a model to be trained;
Under the condition that the test evaluation value is larger than a target evaluation value and a test instruction of a user is received, starting a preset algorithm service to conduct AB test on the test model and the current online model to obtain a test result of the test model and a test result of the current online model;
And under the condition that the test result of the test model is higher than the test result of the current online model, releasing the memory of the current online model, and updating the state of the test model to be online.
Optionally, the model to be online and the evaluation value of the model to be online are all obtained by training the model to be trained, and the method includes:
creating a training task of the model to be trained based on the type and the parameters of the model to be trained, and storing the training corpus of the model to be trained to a cloud;
The training task is sent to an offline graphics processor, the offline graphics processor is triggered to acquire the training corpus from the cloud, and the model to be trained is trained according to the type, the parameters and the training corpus to obtain the model to be online and an evaluation value of the model to be online;
Setting a state label and a version number for the to-be-online model, and storing information of the to-be-online model into a preset model table.
Optionally, the creating the training task of the model to be trained based on the type and the parameters of the model to be trained includes:
Obtaining the type, parameters and training corpus of a model to be trained;
Setting a state label for the model to be trained, and storing training corpus of the model to be trained to a cloud;
Storing the type, the parameters and the state labels of the model to be trained into a database;
under the condition that an access request sent by an offline graphic processor is received, a training task of the model to be trained is created based on the type and the parameters of the model to be trained, and the state of the model to be trained is updated to be in training.
Optionally, setting a status tag and a version number for the to-be-online model, and storing information of the to-be-online model into a preset model table, including:
setting a version number for the to-be-online model under the condition that the to-be-online model and the evaluation value sent by the offline graphics processor are received;
storing the model to be online and the evaluation value into a database, and updating the state of the model to be trained to be training completion;
setting a state label for the to-be-online model, and storing information of the to-be-online model into a preset model table;
and storing the model to be online to a cloud end, and displaying the model to be online and the evaluation value to a user through a preset front-end interface.
Optionally, under the condition that the test version number input by the user is received, identifying the to-be-online model corresponding to the test version number as a test model, and identifying the evaluation value of the test model as a test evaluation value, including:
the offline graphics processor sends the models to be online and the evaluation values of the models to be online to a database in advance;
under the condition that a test version number input by a user is received, selecting a to-be-online model corresponding to the test version number from the database, marking the to-be-online model as a test model, and marking an evaluation value of the test model as a test evaluation value;
And updating the state of the test model to be tested.
Optionally, under the condition that the test evaluation value is greater than the target evaluation value and a test instruction of a user is received, starting a preset algorithm service to perform an AB test on the test model and the current online model to obtain a test result of the test model and a test result of the current online model, including:
comparing the test evaluation value with a target evaluation value;
Judging whether the current online model is the test model or not under the condition that the test evaluation value is larger than the target evaluation value;
If the current online model is the test model, prompting a user that the test model is in an online state, and updating the state of the test model to be online;
If the current online model is not the test model, under the condition that a test instruction of the user is received, starting a preset algorithm service to conduct AB test on the test model and the current online model, and obtaining test results of the test model and the current online model.
Optionally, the method further comprises:
And under the condition that the test performance of the test model is lower than that of the current online model, releasing the internal memory of the test model, and deleting the information of the test model from the model table.
The functions of the methods of embodiments of the present application, if implemented in the form of software functional units and sold or used as a stand-alone product, may be stored on a computing device readable storage medium. Based on such understanding, a part of the present application that contributes to the prior art or a part of the technical solution may be embodied in the form of a software product stored in a storage medium, comprising several instructions for causing a computing device (which may be a personal computer, a server, a mobile computing device or a network device, etc.) to execute all or part of the steps of the method described in the embodiments of the present application. The storage medium includes a U disk, a removable hard disk, a Read-Only Memory (ROM), a random access Memory (RAM, random Access Memory), a magnetic disk, an optical disk, or other various media capable of storing program codes.
In this specification, each embodiment is described in a progressive manner, and each embodiment is mainly described in a different point from other embodiments, so that the same or similar parts between the embodiments are referred to each other.
The previous description of the disclosed embodiments is provided to enable any person skilled in the art to make or use the present application. Various modifications to these embodiments will be readily apparent to those skilled in the art, and the generic principles defined herein may be applied to other embodiments without departing from the spirit or scope of the application. Thus, the present application is not intended to be limited to the embodiments shown herein but is to be accorded the widest scope consistent with the principles and novel features disclosed herein.

Claims (10)

1. A model training on-line method, comprising:
Under the condition that a test version number input by a user is received, marking a model to be online corresponding to the test version number as a test model, and marking an evaluation value of the test model as a test evaluation value, wherein the model to be online and the evaluation value of the model to be online are obtained by training a model to be trained;
Under the condition that the test evaluation value is larger than a target evaluation value and a test instruction of a user is received, starting a preset algorithm service to conduct AB test on the test model and the current online model to obtain a test result of the test model and a test result of the current online model;
And under the condition that the test result of the test model is higher than the test result of the current online model, releasing the memory of the current online model, and updating the state of the test model to be online.
2. The method of claim 1, wherein the model to be online and the evaluation value of the model to be online are obtained by training a model to be trained, comprising:
creating a training task of the model to be trained based on the type and the parameters of the model to be trained, and storing the training corpus of the model to be trained to a cloud;
The training task is sent to an offline graphics processor, the offline graphics processor is triggered to acquire the training corpus from the cloud, and the model to be trained is trained according to the type, the parameters and the training corpus to obtain the model to be online and an evaluation value of the model to be online;
Setting a state label and a version number for the to-be-online model, and storing information of the to-be-online model into a preset model table.
3. The method of claim 2, wherein creating the training task for the model to be trained based on the type and parameters of the model to be trained comprises:
Obtaining the type, parameters and training corpus of a model to be trained;
Setting a state label for the model to be trained, and storing training corpus of the model to be trained to a cloud;
Storing the type, the parameters and the state labels of the model to be trained into a database;
under the condition that an access request sent by an offline graphic processor is received, a training task of the model to be trained is created based on the type and the parameters of the model to be trained, and the state of the model to be trained is updated to be in training.
4. The method of claim 2, wherein the setting a status tag and a version number for the to-be-online model and storing information of the to-be-online model in a preset model table includes:
setting a version number for the to-be-online model under the condition that the to-be-online model and the evaluation value sent by the offline graphics processor are received;
storing the model to be online and the evaluation value into a database, and updating the state of the model to be trained to be training completion;
setting a state label for the to-be-online model, and storing information of the to-be-online model into a preset model table;
and storing the model to be online to a cloud end, and displaying the model to be online and the evaluation value to a user through a preset front-end interface.
5. The method according to claim 2, wherein in the case of receiving a test version number input by a user, identifying a model to be online corresponding to the test version number as a test model, and identifying an evaluation value of the test model as a test evaluation value, includes:
the offline graphics processor sends the models to be online and the evaluation values of the models to be online to a database in advance;
under the condition that a test version number input by a user is received, selecting a to-be-online model corresponding to the test version number from the database, marking the to-be-online model as a test model, and marking an evaluation value of the test model as a test evaluation value;
And updating the state of the test model to be tested.
6. The method according to claim 1, wherein, when the test evaluation value is greater than the target evaluation value and a test instruction of a user is received, starting a preset algorithm service to perform an AB test on the test model and the current online model, to obtain a test result of the test model and a test result of the current online model, including:
comparing the test evaluation value with a target evaluation value;
Judging whether the current online model is the test model or not under the condition that the test evaluation value is larger than the target evaluation value;
If the current online model is the test model, prompting a user that the test model is in an online state, and updating the state of the test model to be online;
If the current online model is not the test model, under the condition that a test instruction of the user is received, starting a preset algorithm service to conduct AB test on the test model and the current online model, and obtaining test results of the test model and the current online model.
7. The method as recited in claim 1, further comprising:
And under the condition that the test performance of the test model is lower than that of the current online model, releasing the internal memory of the test model, and deleting the information of the test model from the model table.
8. A model training on-line device, comprising:
An identification unit, configured to identify, when a test version number input by a user is received, the model to be online corresponding to the test version number as a test model, and to identify the evaluation value of the test model as a test evaluation value;
The test unit is used for starting a preset algorithm service to conduct AB test on the test model and the current online model under the condition that the test evaluation value is larger than a target evaluation value and a test instruction of a user is received, so as to obtain a test result of the test model and a test result of the current online model;
And the online unit is used for releasing the memory of the current online model and updating the state of the test model into online under the condition that the test result of the test model is higher than the test result of the current online model.
9. A computer readable storage medium, characterized in that the computer readable storage medium comprises a stored program, wherein the program performs the model training on-line method of any of claims 1-7.
10. The model training online equipment is characterized by comprising a processor, a memory and a bus, wherein the processor is connected with the memory through the bus;
The memory is used for storing a program, and the processor is used for running the program, wherein the program executes the model training online method according to any one of claims 1-7.
CN202111175935.XA 2021-10-09 2021-10-09 A model training online method, device, storage medium and equipment Active CN113919506B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202111175935.XA CN113919506B (en) 2021-10-09 2021-10-09 A model training online method, device, storage medium and equipment

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202111175935.XA CN113919506B (en) 2021-10-09 2021-10-09 A model training online method, device, storage medium and equipment

Publications (2)

Publication Number Publication Date
CN113919506A CN113919506A (en) 2022-01-11
CN113919506B true CN113919506B (en) 2024-12-24

Family

ID=79238720

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202111175935.XA Active CN113919506B (en) 2021-10-09 2021-10-09 A model training online method, device, storage medium and equipment

Country Status (1)

Country Link
CN (1) CN113919506B (en)

Families Citing this family (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113722394A (en) * 2021-08-17 2021-11-30 北京百悟科技有限公司 Data synchronization method, device and storage medium
CN116383026B (en) * 2023-06-05 2023-09-01 阿里巴巴(中国)有限公司 Data processing method and server based on large model

Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112288092A (en) * 2019-07-23 2021-01-29 百度时代网络技术(北京)有限公司 Model evaluation method, model evaluation device, electronic device and storage medium
CN113065843A (en) * 2021-03-15 2021-07-02 腾讯科技(深圳)有限公司 Model processing method and device, electronic equipment and storage medium

Family Cites Families (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US11176632B2 (en) * 2017-04-07 2021-11-16 Intel Corporation Advanced artificial intelligence agent for modeling physical interactions
CN111753843A (en) * 2020-06-28 2020-10-09 平安科技(深圳)有限公司 Segmentation effect evaluation method, device, equipment and medium based on deep learning

Patent Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112288092A (en) * 2019-07-23 2021-01-29 百度时代网络技术(北京)有限公司 Model evaluation method, model evaluation device, electronic device and storage medium
CN113065843A (en) * 2021-03-15 2021-07-02 腾讯科技(深圳)有限公司 Model processing method and device, electronic equipment and storage medium

Also Published As

Publication number Publication date
CN113919506A (en) 2022-01-11

Similar Documents

Publication Publication Date Title
CN113919506B (en) A model training online method, device, storage medium and equipment
US20220052976A1 (en) Answer text processing methods and apparatuses, and key text determination methods
CN110991871A (en) Risk monitoring method, device, equipment and computer readable storage medium
CN113742226B (en) Software performance test method and device, medium and electronic equipment
CN112784273A (en) A method, device and equipment for SQL risk identification
CN113051178A (en) Test data construction method, device, equipment and medium
CN112100239A (en) Portrait generation method and apparatus for vehicle detection device, server and readable storage medium
CN115759372A (en) Process engine optimization method and system based on business processing and process driving decoupling
CN110990285A (en) UI automation test method and device
CN110941486A (en) Task management method and device, electronic equipment and computer readable storage medium
CN103701671B (en) Method and device for detecting conflicts among businesses
CN110674275B (en) Knowledge question answering method and device
CN110874319A (en) Automated testing method, automated testing platform, automated testing equipment and computer-readable storage medium
CN118503396A (en) ERP system large model calling method, device and medium based on open prompt words
CN110209565A (en) A kind of metadata schema adjustment method and its device
CN117370356A (en) Method and related device for mapping metadata by data standard
CN112579833B (en) Service association relation acquisition method and device based on user operation data
CN113672497B (en) Method, device and equipment for generating non-buried point event and storage medium
CN115757186A (en) Performance test method, device, equipment and medium of software interface
CN111562982B (en) Method and device for processing request data, computer readable storage medium and electronic equipment
CN114186046A (en) Information processing method, information processing apparatus, server, and storage medium
CN115273858A (en) A text processing method, device, equipment and medium
CN114417150A (en) Recommended methods, devices, equipment and storage media for troubleshooting steps
CN111008140A (en) A cross-platform UI automation testing method and device
CN114692647B (en) Data processing method, device, equipment and medium

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant