Disclosure of Invention
Accordingly, the invention provides a product testing method and system based on an AI large model and machine learning, to solve the following problem in the prior art: when multiple devices cooperate to collect data, an unstable network causes part of the data to be lost, the trained model then predicts inaccurately, and the stability of product testing is reduced.
The invention provides a product testing method based on an AI large model and machine learning, comprising: collecting product test data, and sequentially cleaning, merging and converting the product test data to output optimized data; extracting features of the optimized data by using a machine learning algorithm, and training an original model with the features to output the AI large model; updating the AI large model with newly collected product test data, and predicting a test result of a product with the AI large model to output a prediction result; respectively obtaining the lost data amount of the product test data and the total data amount of the product test data in a plurality of collection periods; determining the stability of the product test based on the packet loss rate of the product test data; if the stability of the product test does not meet the requirement, adjusting the transmission rate of the product test data, or determining the training effectiveness of the AI large model based on the training time extension ratio of the AI large model; and if the training effectiveness does not meet the requirement, adjusting the learning rate of the AI large model, or adjusting the batch processing size of the product test data based on the average update delay time length of the AI large model.
Further, determining the stability of the product test includes:
comparing the packet loss rate of the product test data with a preset first packet loss rate;
if the packet loss rate of the product test data is greater than the preset first packet loss rate, determining that the stability of the product test does not meet the requirement.
Further, determining the training effectiveness of the AI large model includes:
comparing the packet loss rate of the product test data with the preset first packet loss rate and a preset second packet loss rate respectively;
if the packet loss rate of the product test data is greater than the preset first packet loss rate and less than or equal to the preset second packet loss rate, preliminarily determining that the training effectiveness of the AI large model does not meet the requirement, and determining whether the training effectiveness of the AI large model meets the requirement according to the training time extension ratio of the AI large model.
Further, adjusting the transmission rate of the product test data includes:
comparing the packet loss rate of the product test data with the preset second packet loss rate;
if the packet loss rate of the product test data is greater than the preset second packet loss rate, reducing the transmission rate of the product test data.
Further, the reduction amplitude of the transmission rate of the product test data is determined by the difference between the packet loss rate of the product test data and the preset second packet loss rate.
Further, adjusting the learning rate of the AI large model includes:
comparing the training time extension ratio of the AI large model with a preset first extension ratio and a preset second extension ratio respectively;
if the training time extension ratio of the AI large model is greater than the preset first extension ratio, determining that the training effectiveness of the AI large model does not meet the requirement;
if the training time extension ratio of the AI large model is greater than the preset first extension ratio and less than or equal to the preset second extension ratio, reducing the learning rate of the AI large model;
if the training time extension ratio of the AI large model is greater than the preset second extension ratio, preliminarily determining that the update instantaneity of the AI large model does not meet the requirement, and determining whether the update instantaneity of the AI large model meets the requirement according to the average update delay time length of the AI large model.
Further, the reduction amplitude of the learning rate of the AI large model is determined by the difference between the training time extension ratio of the AI large model and the preset first extension ratio.
Further, adjusting the batch processing size of the product test data includes:
comparing the average update delay time length of the AI large model with a preset delay time length;
if the average update delay time length of the AI large model is greater than the preset delay time length, determining that the update instantaneity of the AI large model does not meet the requirement, and reducing the batch processing size of the product test data.
Further, the reduction amplitude of the batch processing size of the product test data is determined by the difference between the average update delay time length of the AI large model and the preset delay time length.
The invention also provides a product testing system based on the AI large model and machine learning, comprising:
The data acquisition module is used for acquiring product test data;
The data processing module is connected with the data acquisition module and comprises a preprocessing unit used for preprocessing the product test data to output optimized data and a feature extraction unit connected with the preprocessing unit and used for extracting features of the optimized data by using a machine learning algorithm;
The model training module is connected with the data processing module and comprises a model training unit, a result prediction unit and a model updating unit, wherein the model training unit is connected with the characteristic extraction unit and used for training an original model according to the characteristics to output an AI large model, the result prediction unit is connected with the model training unit and used for predicting a test result of a product according to the AI large model to output a prediction result, and the model updating unit is connected with the model training unit and used for updating the AI large model according to newly acquired product test data;
The storage module is respectively connected with the data acquisition module, the data processing module and the model training module and used for respectively storing the product test data, the optimization data, the characteristics, the machine learning algorithm, the AI large model and the prediction result;
The control module is respectively connected with the data acquisition module, the data processing module, the model training module and the storage module, and is used for determining the transmission rate of the product test data based on the packet loss rate of the product test data, or determining the learning rate of the AI large model based on the training time extension ratio of the AI large model, and determining the batch processing size of the product test data based on the training time extension ratio of the AI large model and the average update delay time length of the AI large model.
Compared with the prior art, the invention has the following advantages. First, the transmission rate of the product test data is adjusted according to the packet loss rate of the product test data. When multiple devices cooperate to collect data, an unstable network causes part of the data to be lost, so that the trained model predicts inaccurately; reducing the transmission rate lets the data travel through the network more stably, lightens the network load, lowers the probability of data loss, and allows more data to reach the destination intact for model training. Second, the learning rate of the AI large model is adjusted according to the training time extension ratio of the AI large model. Because the collected data contains multiple types of data, some noise is not removed during cleaning and causes interference; the model then tries to fit this noise during training, the number of training rounds increases, and the model is over-trained. Reducing the learning rate gives the model more time and opportunity to distinguish effective information from noise, and avoids the over-training caused by fitting noise too quickly. Third, the batch processing size of the product test data is adjusted according to the average update delay time length of the AI large model. Because the collected data volume is too large, storing, reading and processing it takes a long time, so the model cannot promptly obtain the latest processed data for updating; reducing the batch processing size lets each batch finish sooner, so the model obtains partially processed data for updating more quickly. Together, these adjustments improve the stability of product testing.
Furthermore, by setting the preset first packet loss rate and the preset second packet loss rate, the method adjusts the transmission rate of the product test data. When multiple devices cooperate to collect data, an unstable network causes part of the data to be lost, so that the trained model predicts inaccurately. Reducing the transmission rate of the product test data lets the data travel through the network more stably, lightens the network load, lowers the probability of data loss, and allows more data to reach the destination intact for model training, thereby further improving the stability of the product test.
Furthermore, by setting the preset first extension ratio and the preset second extension ratio, the method adjusts the learning rate of the AI large model. Because the collected data contains multiple types of data, some noise is not removed during cleaning and causes interference; the model tries to fit this noise during training, the number of training rounds increases, and the model is over-trained. Reducing the learning rate gives the model more time and opportunity to distinguish effective information from noise, avoids the over-training caused by fitting noise too quickly, and further improves the stability of product testing.
Furthermore, by setting the preset delay time length, the method adjusts the batch processing size of the product test data. Because the collected data volume is too large, storing, reading and processing it takes a long time, so the model cannot promptly obtain the latest processed data for updating. Reducing the batch processing size of the product test data lets each batch finish sooner, so the model obtains partially processed data for updating more quickly, which further improves the stability of product testing.
Detailed Description
The invention will be further described with reference to examples for the purpose of making the objects and advantages of the invention more apparent, it being understood that the specific examples described herein are given by way of illustration only and are not intended to be limiting.
Preferred embodiments of the present invention are described below with reference to the accompanying drawings. It should be understood by those skilled in the art that these embodiments are merely for explaining the technical principles of the present invention, and are not intended to limit the scope of the present invention.
It should be noted that, in the description of the present invention, terms such as "upper," "lower," "left," "right," "inner," "outer," and the like indicate directions or positional relationships based on the directions or positional relationships shown in the drawings, which are merely for convenience of description, and do not indicate or imply that the apparatus or elements must have a specific orientation, be constructed and operated in a specific orientation, and thus should not be construed as limiting the present invention.
In addition, it should be noted that, in the description of the present invention, unless explicitly specified and limited otherwise, the terms "mounted," "coupled," and "connected" are to be construed broadly. A connection may be, for example, fixed, detachable, or integral; mechanical or electrical; direct, or indirect through an intermediate medium; or an internal communication between two elements. The specific meaning of the above terms in the present invention can be understood by those skilled in the art according to the specific circumstances.
FIG. 1, FIG. 2, FIG. 3, and FIG. 4 respectively show an overall flowchart, an overall structural block diagram, a logic flowchart, and a detailed flowchart of the process for adjusting the transmission rate of the product test data, according to an embodiment of the invention. As shown in FIG. 1, the product testing method based on the AI large model and machine learning comprises the following steps:
Step S1, collecting product test data, sequentially cleaning, combining and converting the product test data to output optimized data, extracting the characteristics of the optimized data by using a machine learning algorithm, and training an original model by using the characteristics to output an AI large model;
Step S2, updating the AI large model by using newly acquired product test data, and predicting a test result of a product by using the AI large model so as to output a prediction result;
Step S3, respectively acquiring the lost data quantity of the product test data and the total data quantity of the product test data in a plurality of acquisition periods;
Step S4, determining the stability of the product test based on the packet loss rate of the product test data;
Step S5, if the stability of the product test does not meet the requirement, adjusting the transmission rate of the product test data, or determining the training effectiveness of the AI large model based on the training time extension ratio of the AI large model;
Step S6, if the training effectiveness does not meet the requirement, adjusting the learning rate of the AI large model, or adjusting the batch processing size of the product test data based on the average update delay time length of the AI large model.
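By way of non-limiting illustration, the decision logic of steps S4 to S6 may be sketched as follows. The sketch assumes that the packet loss rate, the training time extension ratio and the average update delay time length have already been computed, and it uses the preferred threshold values given later in this description as defaults. It reads the alternatives in steps S5 and S6 as a cascade over the threshold intervals, which is one possible reading. The names `Thresholds` and `decide_adjustment` and the returned action labels are hypothetical and are not part of the original scheme.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Thresholds:
    """Preset thresholds; defaults are the preferred values of the description."""
    first_loss_rate: float = 0.05    # preset first packet loss rate
    second_loss_rate: float = 0.08   # preset second packet loss rate
    first_extension: float = 0.10    # preset first extension ratio
    second_extension: float = 0.15   # preset second extension ratio
    max_delay_s: float = 5.0         # preset delay time length, in seconds


def decide_adjustment(loss_rate, extension_ratio, avg_delay_s, t=Thresholds()):
    """Return a (hypothetical) label for the adjustment suggested by S4 to S6."""
    if loss_rate <= t.first_loss_rate:
        return "none"                       # stability meets the requirement
    if loss_rate > t.second_loss_rate:
        return "reduce_transmission_rate"   # unstable-network case
    # first < loss_rate <= second: examine training effectiveness
    if extension_ratio <= t.first_extension:
        return "none"                       # training effectiveness is acceptable
    if extension_ratio <= t.second_extension:
        return "reduce_learning_rate"       # noise / over-training case
    # extension_ratio > second: examine update instantaneity
    if avg_delay_s > t.max_delay_s:
        return "reduce_batch_size"          # data-volume case
    return "none"
```

For instance, a packet loss rate of 6%, an extension ratio of 12% and any delay would lead to a reduction of the learning rate under this reading.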
Specifically, the products include office software, image processing software, and electronic products.
Specifically, the product test data includes the number of user clicks, the software start-up duration, and the software memory occupation data.
Specifically, the optimized data includes software response times after missing-value removal, product names in a unified format, and merged user names.
Specifically, the machine learning algorithm includes decision trees, random forests, and support vector machines.
Specifically, the characteristics include an average response time length of the software, a byte amount of the product name, and a byte amount of the user name.
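As a hedged illustration only, the example features listed above could be computed from cleaned records along the following lines. The record layout (a dict with the keys `response_time_s`, `product_name` and `user_name`), the function name `clean_and_extract`, and the use of UTF-8 for byte counts are all assumptions made for this sketch; the sketch uses plain arithmetic and does not stand in for the machine learning algorithms named in the description.

```python
def clean_and_extract(records):
    """Drop records with a missing response time, then compute example features.

    Each record is assumed to be a dict with the hypothetical keys
    'response_time_s', 'product_name' and 'user_name'.
    """
    # Cleaning: remove records whose response time is missing.
    cleaned = [r for r in records if r.get("response_time_s") is not None]
    if not cleaned:
        return []
    # Average response time over all remaining records.
    avg_response = sum(r["response_time_s"] for r in cleaned) / len(cleaned)
    return [
        {
            "avg_response_time_s": avg_response,
            # Byte counts assume UTF-8 encoding (an assumption of this sketch).
            "product_name_bytes": len(r["product_name"].encode("utf-8")),
            "user_name_bytes": len(r["user_name"].encode("utf-8")),
        }
        for r in cleaned
    ]
```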
Specifically, the AI large model includes a Transformer model, a neural network model, and a GPT model.
Specifically, the newly acquired product test data is the product test data acquired after the AI large model training is completed.
Specifically, the prediction result includes the resource utilization rate of the product in operation, the fluctuation range of the sales quantity of the product, and the probability of error occurrence of the product function.
Specifically, the learning rate of the AI large model is the step size used by the AI large model in the training process each time the model parameters are updated.
Specifically, the batch size of product test data is the number of data samples processed simultaneously at a time during the product test.
In implementation, the method first adjusts the transmission rate of the product test data according to the packet loss rate of the product test data. When multiple devices cooperate to collect data, an unstable network causes part of the data to be lost, so that the trained model predicts inaccurately; reducing the transmission rate lets the data travel through the network more stably, lightens the network load, lowers the probability of data loss, and allows more data to reach the destination intact for model training. Second, the method adjusts the learning rate of the AI large model according to the training time extension ratio of the AI large model. Because the collected data contains multiple types of data, some noise is not removed during cleaning and causes interference; the model then tries to fit this noise during training, the number of training rounds increases, and the model is over-trained. Reducing the learning rate gives the model more time and opportunity to distinguish effective information from noise, and avoids the over-training caused by fitting noise too quickly. Third, the method adjusts the batch processing size of the product test data according to the average update delay time length of the AI large model. Because the collected data volume is too large, storing, reading and processing it takes a long time, so the model cannot promptly obtain the latest processed data for updating; reducing the batch processing size lets each batch finish sooner, so the model obtains partially processed data for updating more quickly. Together, these adjustments improve the stability of product testing.
Specifically, determining the stability of the product test includes:
respectively acquiring the lost data amount of the product test data and the total data amount of the product test data in a plurality of acquisition periods, and calculating the packet loss rate of the product test data;
comparing the packet loss rate of the product test data with a preset first packet loss rate;
if the packet loss rate of the product test data is greater than the preset first packet loss rate, determining that the stability of the product test does not meet the requirement.
Specifically, determining the training effectiveness of the AI large model includes:
comparing the packet loss rate of the product test data with the preset first packet loss rate and the preset second packet loss rate respectively;
if the packet loss rate of the product test data is greater than the preset first packet loss rate and less than or equal to the preset second packet loss rate, preliminarily determining that the training effectiveness of the AI large model does not meet the requirement, and determining whether the training effectiveness of the AI large model meets the requirement according to the training time extension ratio of the AI large model.
It can be understood that the preset first packet loss rate and the preset second packet loss rate define three intervals, which correspond to three cases respectively:
the first interval, in which the packet loss rate of the product test data is less than or equal to the preset first packet loss rate, corresponds to the case where the stability of the product test meets the requirement;
the second interval, in which the packet loss rate of the product test data is greater than the preset first packet loss rate and less than or equal to the preset second packet loss rate, corresponds to the case where the collected data contains multiple types of data, so that some noise is not removed during cleaning and causes interference, the model tries to fit the noise during training, the number of training rounds increases, and the model is over-trained;
the third interval, in which the packet loss rate of the product test data is greater than the preset second packet loss rate, corresponds to the case where the network is unstable when multiple devices cooperate to collect data, so that part of the data is lost and the trained model predicts inaccurately.
In practice, the preset first packet loss rate is generally selected within the range [4%, 6%], and the preset second packet loss rate is generally selected within the range [7%, 9%].
Preferably, the preset first packet loss rate is 5% and the preset second packet loss rate is 8%.
Specifically, the packet loss rate of the product test data is the ratio of the lost data amount of the product test data to the total data amount of the product test data in a plurality of acquisition periods.
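The calculation and the three intervals described above may be sketched as follows, by way of example only. The sketch assumes that the lost and total data amounts are pooled over all acquisition periods before the ratio is taken, and that rates are expressed as fractions (0.05 for 5%). The function names `packet_loss_rate` and `loss_interval` are hypothetical.

```python
def packet_loss_rate(periods):
    """Packet loss rate over several acquisition periods.

    `periods` is a list of (lost_amount, total_amount) pairs, one per period.
    Assumption: amounts are pooled across periods before dividing.
    """
    lost = sum(p[0] for p in periods)
    total = sum(p[1] for p in periods)
    if total == 0:
        raise ValueError("no data collected in the given periods")
    return lost / total


def loss_interval(rate, first=0.05, second=0.08):
    """Return 1, 2 or 3 for the interval the packet loss rate falls into."""
    if rate <= first:
        return 1   # stability of the product test meets the requirement
    if rate <= second:
        return 2   # training effectiveness is examined next
    return 3       # the transmission rate is reduced
```

With two periods of 100 data units each, losing 3 and 5 units respectively, the pooled rate is 8/200 = 4%, which falls in the first interval.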
In implementation, by setting the preset first packet loss rate and the preset second packet loss rate, the method determines the stability of the product test, which reduces the loss of product-testing accuracy caused by inaccurately determining that stability, and further improves the stability of the product test.
Specifically, the method for adjusting the transmission rate of the product test data comprises the following steps:
comparing the packet loss rate of the product test data with the preset second packet loss rate;
and if the packet loss rate of the product test data is larger than the preset second packet loss rate, reducing the transmission rate of the product test data.
Specifically, the reduction amplitude of the transmission rate of the product test data is determined by the difference value between the packet loss rate of the product test data and the preset second packet loss rate.
Specifically, when the difference between the packet loss rate of the product test data and the preset second packet loss rate is within 3%, the transmission rate of the product test data is reduced to 0.95 times the original transmission rate. When the difference exceeds 3%, the transmission rate is additionally reduced by 50 KB/s for every 2% by which the difference exceeds 3%. For example, if the difference between the packet loss rate of the product test data and the preset second packet loss rate is 7% and the current transmission rate is 600 KB/s, the difference exceeds 3% by 4%, that is, by two steps of 2%, and the reduced transmission rate is 600 × 0.95 − 50 × 2 = 470 KB/s.
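A minimal sketch of this reduction rule is given below for illustration. It assumes the difference is expressed in percentage points, that only whole 2% steps beyond the 3% band are counted (consistent with the worked example, but not stated explicitly), and that the rate is never reduced below zero. The function and parameter names are hypothetical.

```python
import math


def reduced_transmission_rate(rate_kb_s, diff_pct,
                              band_pct=3.0, factor=0.95,
                              step_pct=2.0, step_kb_s=50.0):
    """Reduced transmission rate in KB/s.

    diff_pct is (packet loss rate - preset second packet loss rate),
    expressed in percentage points.
    """
    if diff_pct <= 0:
        return rate_kb_s                # no reduction is called for
    new_rate = rate_kb_s * factor       # base reduction to 0.95 times
    if diff_pct > band_pct:
        # Assumption: only whole 2% steps beyond the 3% band are counted.
        steps = math.floor((diff_pct - band_pct) / step_pct)
        new_rate -= step_kb_s * steps
    return max(new_rate, 0.0)           # assumption: never go below zero
```

Calling `reduced_transmission_rate(600.0, 7.0)` reproduces the worked example, 600 × 0.95 − 50 × 2 = 470 KB/s.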
In implementation, by setting the preset first packet loss rate and the preset second packet loss rate, the method adjusts the transmission rate of the product test data. When multiple devices cooperate to collect data, an unstable network causes part of the data to be lost, so that the trained model predicts inaccurately. Reducing the transmission rate of the product test data lets the data travel through the network more stably, lightens the network load, lowers the probability of data loss, and allows more data to reach the destination intact for model training, thereby further improving the stability of the product test.
Specifically, adjusting the learning rate of the AI large model includes:
respectively acquiring the actual training time of the AI large model and the standard training time of the AI large model, and calculating the training time extension ratio of the AI large model;
comparing the training time extension ratio of the AI large model with a preset first extension ratio and a preset second extension ratio respectively;
if the training time extension ratio of the AI large model is greater than the preset first extension ratio, determining that the training effectiveness of the AI large model does not meet the requirement;
if the training time extension ratio of the AI large model is greater than the preset first extension ratio and less than or equal to the preset second extension ratio, reducing the learning rate of the AI large model;
if the training time extension ratio of the AI large model is greater than the preset second extension ratio, preliminarily determining that the update instantaneity of the AI large model does not meet the requirement, and determining whether the update instantaneity of the AI large model meets the requirement according to the average update delay time length of the AI large model.
It can be understood that the preset first extension ratio and the preset second extension ratio define three intervals, which correspond to three cases respectively:
the first interval, in which the training time extension ratio of the AI large model is less than or equal to the preset first extension ratio, corresponds to the case where the training effectiveness of the AI large model meets the requirement;
the second interval, in which the training time extension ratio of the AI large model is greater than the preset first extension ratio and less than or equal to the preset second extension ratio, corresponds to the case where the collected data contains multiple types of data, so that some noise is not removed during cleaning and causes interference, the model tries to fit the noise during training, the number of training rounds increases, and the model is over-trained;
the third interval, in which the training time extension ratio of the AI large model is greater than the preset second extension ratio, corresponds to the case where the collected data volume is too large, so that storing, reading and processing the large-scale data takes a long time and the model cannot promptly obtain the latest processed data for updating.
In practice, the preset first extension ratio is generally selected within the range [8%, 12%], and the preset second extension ratio is generally selected within the range [13%, 17%].
Preferably, the preset first extension ratio is 10% and the preset second extension ratio is 15%.
Specifically, the training time extension ratio of the AI large model is a ratio of a difference between an actual training time of the AI large model and a standard training time of the AI large model to the standard training time of the AI large model.
Specifically, the standard training time of the AI large model is the time required for AI large model training when not applied to the product testing method described in the present scheme.
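For illustration, the training time extension ratio and its three intervals may be sketched as below. Ratios are expressed as fractions (0.10 for 10%), training times are assumed to be given in the same unit, and the function names `training_time_extension_ratio` and `extension_interval` are hypothetical.

```python
def training_time_extension_ratio(actual_time, standard_time):
    """(actual - standard) / standard, both times in the same unit."""
    if standard_time <= 0:
        raise ValueError("standard training time must be positive")
    return (actual_time - standard_time) / standard_time


def extension_interval(ratio, first=0.10, second=0.15):
    """Return 1, 2 or 3 for the interval the extension ratio falls into."""
    if ratio <= first:
        return 1   # training effectiveness meets the requirement
    if ratio <= second:
        return 2   # the learning rate is reduced
    return 3       # update instantaneity is examined next
```

For example, an actual training time of 115 against a standard of 100 gives a ratio of 15%, which falls in the second interval.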
In implementation, by setting the preset first extension ratio and the preset second extension ratio, the method determines the training effectiveness of the AI large model, which reduces the loss of product-test stability caused by inaccurately determining that effectiveness, and further improves the stability of the product test.
Specifically, the reduction amplitude of the learning rate of the AI large model is determined by the difference between the training time extension ratio of the AI large model and the preset first extension ratio.
Specifically, when the difference between the training time extension ratio of the AI large model and the preset first extension ratio is within 5%, the learning rate of the AI large model is reduced to 0.92 times the original learning rate. When the difference exceeds 5%, the learning rate is additionally reduced by 0.003 for every 2% by which the difference exceeds 5%. For example, if the difference between the training time extension ratio of the AI large model and the preset first extension ratio is 9% and the current learning rate is 0.05, the difference exceeds 5% by 4%, that is, by two steps of 2%, and the reduced learning rate is 0.05 × 0.92 − 0.003 × 2 = 0.04.
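The learning rate rule may be sketched in the same hedged manner. The sketch assumes the difference is given in percentage points, counts only whole 2% steps beyond the 5% band (as the worked example suggests), and keeps the result above a small positive floor `min_lr`, which is an added assumption rather than part of the scheme. All names are hypothetical.

```python
import math


def reduced_learning_rate(lr, diff_pct,
                          band_pct=5.0, factor=0.92,
                          step_pct=2.0, step_lr=0.003,
                          min_lr=1e-6):
    """Reduced learning rate.

    diff_pct is (training time extension ratio - preset first extension
    ratio), expressed in percentage points.
    """
    if diff_pct <= 0:
        return lr                       # no reduction is called for
    new_lr = lr * factor                # base reduction to 0.92 times
    if diff_pct > band_pct:
        # Assumption: only whole 2% steps beyond the 5% band are counted.
        steps = math.floor((diff_pct - band_pct) / step_pct)
        new_lr -= step_lr * steps
    return max(new_lr, min_lr)          # assumption: keep the rate positive
```

Calling `reduced_learning_rate(0.05, 9.0)` reproduces the worked example, 0.05 × 0.92 − 0.003 × 2 = 0.04, up to floating-point rounding.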
In implementation, by setting the preset first extension ratio and the preset second extension ratio, the method adjusts the learning rate of the AI large model. Because the collected data contains multiple types of data, some noise is not removed during cleaning and causes interference; the model tries to fit this noise during training, the number of training rounds increases, and the model is over-trained. Reducing the learning rate gives the model more time and opportunity to distinguish effective information from noise, avoids the over-training caused by fitting noise too quickly, and further improves the stability of product testing.
Specifically, adjusting the batch processing size of the product test data includes:
acquiring the update delay time length of the AI large model in a plurality of update periods, and calculating the average update delay time length of the AI large model;
comparing the average update delay time length of the AI large model with a preset delay time length;
if the average update delay time length of the AI large model is greater than the preset delay time length, determining that the update instantaneity of the AI large model does not meet the requirement, and reducing the batch processing size of the product test data.
It can be understood that the preset delay time length defines two intervals, which correspond to two cases respectively:
the first interval, in which the average update delay time length of the AI large model is less than or equal to the preset delay time length, corresponds to the case where the update instantaneity of the AI large model meets the requirement;
the second interval, in which the average update delay time length of the AI large model is greater than the preset delay time length, corresponds to the case where the collected data volume is too large, so that storing, reading and processing the large-scale data takes a long time and the model cannot promptly obtain the latest processed data for updating.
In practice, the preset delay time length is typically selected within the range [3 s, 7 s].
Preferably, the preset delay time length is 5 s.
Specifically, the average update delay time of the AI large model is a ratio of a total update delay time of the AI large model to the number of update cycles in a plurality of update cycles.
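As an illustrative sketch, the average update delay time length and the comparison against the preset delay time length could be computed as follows. Delays are assumed to be given in seconds, one value per update cycle, and the function names `average_update_delay` and `update_is_timely` are hypothetical.

```python
def average_update_delay(delays_s):
    """Total update delay divided by the number of update cycles."""
    if not delays_s:
        raise ValueError("at least one update cycle is required")
    return sum(delays_s) / len(delays_s)


def update_is_timely(avg_delay_s, preset_delay_s=5.0):
    """True when the update instantaneity meets the requirement."""
    return avg_delay_s <= preset_delay_s
```

With delays of 4 s, 6 s and 8 s over three update cycles, the average is 6 s, which exceeds the preferred preset of 5 s, so the batch processing size would be reduced.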
In implementation, by setting the preset delay time length, the method determines the update instantaneity of the AI large model, which reduces the loss of product-test stability caused by inaccurately determining that update instantaneity, and further improves the stability of the product test.
Specifically, the reduction amplitude of the batch processing size of the product test data is determined by the difference value between the average update delay time length of the AI large model and the preset delay time length.
Specifically, when the difference between the average update delay time length of the AI large model and the preset delay time length is within 2 s, the batch processing size of the product test data is reduced to 0.9 times the original batch processing size. When the difference exceeds 2 s, the batch processing size is additionally reduced by 3 samples for every 2 s by which the difference exceeds 2 s. For example, if the difference between the average update delay time length of the AI large model and the preset delay time length is 6 s and the current batch processing size is 250 samples, the difference exceeds 2 s by 4 s, that is, by two steps of 2 s, and the reduced batch processing size is 250 × 0.9 − 3 × 2 = 219 samples.
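The batch processing size rule may likewise be sketched as follows, for illustration only. The sketch assumes that only whole 2 s steps beyond the 2 s band are counted (as the worked example suggests), that the result is rounded to the nearest integer, and that the batch processing size is never reduced below 1; the last two points are added assumptions. All names are hypothetical.

```python
import math


def reduced_batch_size(batch_size, diff_s,
                       band_s=2.0, factor=0.9,
                       step_s=2.0, step_samples=3):
    """Reduced batch processing size, in samples.

    diff_s is (average update delay - preset delay time length), in seconds.
    """
    if diff_s <= 0:
        return batch_size               # no reduction is called for
    new_size = batch_size * factor      # base reduction to 0.9 times
    if diff_s > band_s:
        # Assumption: only whole 2 s steps beyond the 2 s band are counted.
        steps = math.floor((diff_s - band_s) / step_s)
        new_size -= step_samples * steps
    # Assumptions: round to the nearest integer and keep at least one sample.
    return max(int(round(new_size)), 1)
```

Calling `reduced_batch_size(250, 6.0)` reproduces the worked example, 250 × 0.9 − 3 × 2 = 219 samples.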
In implementation, by setting the preset delay time length, the method adjusts the batch processing size of the product test data. Because the collected data volume is too large, storing, reading and processing it takes a long time, so the model cannot promptly obtain the latest processed data for updating. Reducing the batch processing size of the product test data lets each batch finish sooner, so the model obtains partially processed data for updating more quickly, which further improves the stability of product testing.
As shown in fig. 2, an embodiment of the present invention provides a product testing system based on AI large model and machine learning, including:
The data acquisition module is used for acquiring product test data;
The data processing module is connected with the data acquisition module and comprises a preprocessing unit used for preprocessing the product test data to output optimized data and a feature extraction unit connected with the preprocessing unit and used for extracting features of the optimized data by using a machine learning algorithm;
The model training module is connected with the data processing module and comprises a model training unit, a result prediction unit and a model updating unit, wherein the model training unit is connected with the characteristic extraction unit and used for training an original model according to the characteristics to output an AI large model, the result prediction unit is connected with the model training unit and used for predicting a test result of a product according to the AI large model to output a prediction result, and the model updating unit is connected with the model training unit and used for updating the AI large model according to newly acquired product test data;
The storage module is respectively connected with the data acquisition module, the data processing module and the model training module and used for respectively storing the product test data, the optimization data, the characteristics, the machine learning algorithm, the AI large model and the prediction result;
The control module is respectively connected with the data acquisition module, the data processing module, the model training module and the storage module, and is used for determining the transmission rate of the product test data based on the packet loss rate of the product test data, or determining the learning rate of the AI large model based on the training time extension ratio of the AI large model, and determining the batch processing size of the product test data based on the training time extension ratio of the AI large model and the average update delay time length of the AI large model.
Specifically, the preprocessing includes cleaning, merging and converting the product test data sequentially.
Thus far, the technical solution of the present invention has been described in connection with the preferred embodiments shown in the drawings, but it is easily understood by those skilled in the art that the scope of protection of the present invention is not limited to these specific embodiments. Equivalent modifications and substitutions for related technical features may be made by those skilled in the art without departing from the principles of the present invention, and such modifications and substitutions will be within the scope of the present invention.