CN114239666B

CN114239666B - Classification model training method, device, and computer-readable medium

Info

Publication number: CN114239666B
Application number: CN202010928518.7A
Authority: CN
Inventors: 何世明
Original assignee: ZTE Corp
Current assignee: ZTE Corp
Priority date: 2020-09-07
Filing date: 2020-09-07
Publication date: 2025-10-14
Anticipated expiration: 2040-09-07
Also published as: CN114239666A

Abstract

The embodiment of the disclosure provides a method for training a classification model, which comprises the steps of processing a training sample set through a particle swarm algorithm to obtain initial parameters of the classification model, training the initial classification model with the initial parameters by using the training sample set to obtain the classification model, wherein the training sample set comprises a plurality of training samples, each training sample comprises a plurality of communication parameters of a cell, each cell has a determined type, a plurality of cells corresponding to the plurality of training samples of the training sample set have at least two different types, and the classification model is a machine learning model and is used for determining the type of the cell through the plurality of communication parameters of the cell. The embodiment of the disclosure also provides a device and a computer readable medium.

Description

Method, apparatus, and computer readable medium for classification model training

Technical Field

The embodiment of the disclosure relates to the technical field of cell classification, in particular to a method, equipment and a computer readable medium for training a classification model.

Background

As a highly complex and integrated system, a communication system can have a serious impact on the operation of the entire system if a certain part (e.g., a cell) fails.

However, in the prior art, the situation of each cell cannot be identified rapidly and accurately, that is, effective classification of each cell cannot be achieved.

Disclosure of Invention

Embodiments of the present disclosure provide a method, apparatus, computer readable medium for classification model training.

In a first aspect, embodiments of the present disclosure provide a method of classification model training, comprising:

processing the training sample set through a particle swarm algorithm to obtain initial parameters of a classification model;

training an initial classification model with initial parameters by using a training sample set to obtain a classification model;

Wherein, the

The training sample set comprises a plurality of training samples, each training sample comprises a plurality of communication parameters of a cell, each cell has a determined type, a plurality of cells corresponding to the plurality of training samples of the training sample set have at least two different types, and the classification model is a machine learning model and is used for determining the type of the cell through the plurality of communication parameters of the cell.

In some embodiments, the processing the training sample set by the particle swarm algorithm to obtain initial parameters of the classification model includes:

determining the position and velocity of a plurality of particles of a population of particles;

updating the position and speed of each particle and the inertia weight of the particle group;

Updating the individual optimal solution of each particle and the global optimal solution of the particle group;

and returning to the step of updating the position and speed of each particle and the inertia weight of the particle group when the termination condition is not met.

In some embodiments, the updating the inertial weights of the particle swarm includes determining the inertial weights w according to the following formula:

Wherein w (k+1) represents the inertial weight determined in the k+1th update, iter _max represents the preset maximum update times, w _max represents the preset maximum inertial weight, w _min represents the preset minimum inertial weight, d ₁ represents the preset inertial weight modification rate, g _best represents the global optimal solution, f () represents the fitness function, and 0≤w _min<w_max≤1,0≤d₁≤1.

In some embodiments, the updating the inertial weights of the population of particles further comprises:

And if the global optimal solution is unchanged and the number of times of inertial weight reset does not reach n in the continuous m times of updating, performing inertial weight reset, wherein the inertial weight reset comprises the steps of enabling the number of times of updating to be equal to 1 and enabling the inertial weight to be equal to the maximum inertial weight, m is a preset integer which is greater than or equal to 2, and n is a preset integer which is greater than or equal to 1.

In some embodiments, the training samples are of the same type as their corresponding cells, and the training the initial classification model with initial parameters with the training sample set includes determining a type weight W _i for the training samples of type i in the initial classification model by:

Where q _i is a preset importance coefficient of training samples of type i, 0<q _i≤1;P_i is a ratio of the number of training samples of type i to the total number of training samples in the training sample set.

In some embodiments, the training the initial classification model with initial parameters with the training sample set to obtain the classification model includes:

Training an initial classification model with initial parameters by using a training sample set to obtain an intermediate model;

selecting input features with pre-positioned or pre-determined proportion before importance ranking from the input features of the intermediate model as reserved features;

And removing other input features except the reserved features from the input features of the intermediate model, and training the intermediate model by using a training sample set to obtain a classification model.

dividing the training sample set into a plurality of samples;

In the training process of each candidate classification model, one sample is used for verification, the other samples are used for training, and different samples are selected for verification in the training process of different candidate classification models;

A candidate classification model is determined as a classification model.

In some embodiments, the types of cells include a normal type and at least one failure type.

In some embodiments, the classification model includes at least one of:

random forest classification model and gradient lifting tree classification model.

In a second aspect, embodiments of the present disclosure provide an apparatus comprising one or more memories, one or more processors, the memories storing a computer program executable by the processors, the computer program when executed by the processors implementing the steps of:

Wherein, the

In a third aspect, the disclosed embodiments provide a computer readable medium having stored thereon a computer program which, when executed by a processor, performs the steps of:

Wherein, the

In the method of the embodiment of the disclosure, the preferred parameters (initial parameters) of the classification model are approximately determined through a Particle Swarm Optimization (PSO), and then the classification model is obtained by training according to the initial parameters, so that the parameters of the classification model can be quickly adjusted to a proper range, the training time is short, and the obtained classification model has high accuracy and is not easy to fall into a local optimal solution.

Drawings

In the drawings of the embodiments of the present disclosure:

FIG. 1 is a flow chart of a method of classification model training provided by an embodiment of the present disclosure;

FIG. 2 is a flow chart of another method of classification model training provided by an embodiment of the present disclosure;

Fig. 3 is a block diagram of an apparatus provided in an embodiment of the present disclosure.

Detailed Description

For a better understanding of the technical solutions of the embodiments of the present disclosure, the following describes in detail a method, apparatus, and computer readable medium for training a classification model provided by the embodiments of the present disclosure with reference to the accompanying drawings.

Embodiments of the present disclosure will be described more fully hereinafter with reference to the accompanying drawings, in which embodiments shown may be embodied in different forms and should not be construed as limited to the embodiments set forth herein. Rather, these embodiments are provided so that this disclosure will be thorough and complete, and will fully convey the scope of the disclosure to those skilled in the art.

The accompanying drawings, which are included to provide a further understanding of embodiments of the disclosure and are incorporated in and constitute a part of this specification, illustrate embodiments of the disclosure and together with the description serve to explain the disclosure, without limitation to the disclosure. The above and other features and advantages will become more readily apparent to those skilled in the art from the description of the detailed exemplary embodiments with reference to the accompanying drawings,

Embodiments of the present disclosure may be described with reference to plan and/or cross-sectional views with the aid of idealized schematic diagrams of the present disclosure. Accordingly, the example illustrations may be modified in accordance with manufacturing techniques and/or tolerances.

Embodiments of the disclosure and features of embodiments may be combined with each other without conflict.

The terminology used in the present disclosure is for the purpose of describing particular embodiments only and is not intended to be limiting of the disclosure. The term "and/or" as used in this disclosure includes any and all combinations of one or more of the associated listed items. As used in this disclosure, the singular forms "a", "an" and "the" are intended to include the plural forms as well, unless the context clearly indicates otherwise. The terms "comprising," "including," and "includes" as used in this disclosure "made by.,. The presence of said features, integers, steps, operations, elements, and/or components, but does not preclude the presence or addition of one or more other features, integers, steps, operations, elements, components and/or groups thereof.

Unless otherwise defined, all terms (including technical and scientific terms) used in this disclosure have the same meaning as commonly understood by one of ordinary skill in the art. It will be further understood that terms, such as those defined in commonly used dictionaries, should be interpreted as having a meaning that is consistent with their meaning in the context of the relevant art and the present disclosure, and will not be interpreted in an idealized or overly formal sense unless expressly so defined herein.

The embodiments of the present disclosure are not limited to the embodiments shown in the drawings, but include modifications of the configuration formed based on the manufacturing process. Thus, the regions illustrated in the figures have schematic properties and the shapes of the regions illustrated in the figures illustrate the particular shapes of the regions of the elements, but are not intended to be limiting.

In some related technologies, a fault cell can be determined by detecting communication equipment (such as communication equipment in a base station) in real time and combining with expert experience analysis, but the method requires a great deal of labor investment and has extremely high requirements on network operation and maintenance, and in addition, different communication equipment with the same function often have great differences in structure and performance, so that the method has no universality.

In addition, the communication parameters of each cell may be analyzed by an AI (artificial intelligence) classification model, so as to determine whether or not a cell has a fault, what has a fault, and the like.

However, as the number of cells with faults in reality is small, the number of fault samples for training the classification model is small, so that parameters are difficult to adjust in place in the training process of the classification model, the training time is too long, and the obtained classification model is low in accuracy and easy to fall into a local optimal solution.

In particular, the difference between communication parameters of different types of cells (such as cells with different faults) in the communication field is not obvious (i.e. the distance between classes is smaller), so that the corresponding classification difficulty is high, on the other hand, the requirement of the communication field on the classification accuracy is high, because once the problem is positioned incorrectly (such as determining the faulty cell with the error), the incorrect operation is caused, and the whole network is seriously affected.

In a first aspect, referring to fig. 1, an embodiment of the present disclosure provides a method of classification model training, comprising:

s101, processing a training sample set through a particle swarm algorithm to obtain initial parameters of a classification model.

S102, training an initial classification model with initial parameters by using a training sample set to obtain a classification model.

The training sample set comprises a plurality of training samples, each training sample comprises a plurality of communication parameters of a cell, each cell is of a determined type, a plurality of cells corresponding to the plurality of training samples of the training sample set are of at least two different types, and the classification model is a machine learning model and is used for determining the type of the cell through the plurality of communication parameters of the cell.

The particle swarm Optimization algorithm (PSO, particle Swarm Optimization) is also called as a particle swarm Optimization algorithm, and is an algorithm for searching an optimal solution of a problem, wherein a particle swarm consisting of a plurality of particles is simulated in the particle swarm algorithm, each particle has a certain position and speed, and each particle is enabled to track an individual optimal solution of the particle swarm and a global optimal solution of the particle swarm in an iterative mode, so that the position and the speed of the particle can be continuously updated until a better global optimal solution is obtained when a termination condition is met.

The training sample set includes a plurality of training samples, each training sample includes a plurality of communication parameters of one cell, and cells corresponding to different training samples have different types (such as a failure type and a normal type), that is, the training sample set includes a plurality of communication parameters of different types of cells.

The communication parameters are parameters which are generated by each cell in the communication process and can be determined and learned, and can represent the state (type) of the cell. For example, the communication parameters may include the maximum number of RRC (radio resource control) connection establishment users, the number of RRC establishment successes, the number of intra-eNB (e-base station) handover requests, RB values of RBs (resources) of the cell, and the like. The different communication parameters in each cell (training sample) may be parameters of different items (types), or may be values of the same item of parameters at different times.

Wherein it should be understood that the communication parameters (but not the specific values) included in the different training samples should be the same and that the communication parameters included in each cell should be the same during classification with the classification model.

Among them, the classification model is a machine learning model that can determine the type (state) of a cell, for example, determine whether the cell is a faulty cell or a normal cell, determine what a cell has a specific fault, and the like, by analyzing communication parameters of the cell.

In the method of the embodiment of the disclosure, a training sample set is firstly processed by a Particle Swarm Optimization (PSO) to obtain initial parameters of a classification model and an initial classification model with the initial parameters, and then the initial classification model is trained by the training sample set, and the structure, the parameters and the like of the initial classification model are further adjusted to obtain the classification model. The final classification model can be used to determine the cell type based on the cell communication parameters, i.e. for cell classification (e.g. failure cell identification).

The types of cells may include, for example, normal types (corresponding to normal cells) and failure types (corresponding to failed cells), wherein the failure types may be further divided into more different specific failure types (corresponding to cells of different specific failures). That is, the classification model trained by the method of the embodiments of the present disclosure may be used for faulty cell identification, because it is important in the field of communications to identify faulty cells to eliminate faults. Of course, the classification model trained by the method according to the embodiment of the present disclosure may also be classified in other manners, for example, the normal cells are also classified into multiple types (such as high-load cells, low-load cells, etc.).

In some embodiments, the classification model includes at least one of:

The classification model trained by embodiments of the present disclosure may be specifically a Random Forest (RF) classification model, a gradient-lifted tree (GBDT, gradient Boosting Decision Tree) classification model, or the like. The above two classification models are most suitable for training using the methods of the embodiments of the present disclosure.

Of course, the number and form of the initial parameters determined by the particle swarm algorithm are different for different classification models.

Of course, the methods of embodiments of the present disclosure may also be used to train other machine-learned classification models.

In some embodiments, referring to fig. 2, before processing the training sample set by the particle swarm algorithm to obtain initial parameters of the classification model (S101), further comprising:

s1001, preprocessing data of each communication parameter of the training sample.

Obviously, each training sample comprises a plurality of communication parameters (original input variables), and the physical meanings of different communication parameters are different, so that the units (dimensions), the magnitude and the like of the training samples are greatly different, and in order to eliminate the influence of the differences on data processing, all the communication parameters can be preprocessed, so that the normalization of the original input variables is realized.

Illustratively, the normalization of the communication parameter j in the training sample i may be performed by the following formula:

Wherein, x _ij is the original value of the communication parameter j in the training sample i, x _ij is the normalized value of x _ij, mean (x _j) is the average value of the communication parameter j in all the training samples, and std (x _j) is the standard deviation of the communication parameter j in all the training samples.

S1002, extracting key features from all communication parameters of a training sample.

In some embodiments, the correlation between the communication parameters (input variables) may be further removed to obtain key features (principal components) therein.

For example, the communication parameters (input variables) may be linearly transformed such that the components of the transformed variables are uncorrelated with each other while ensuring that their covariance matrix is a unitary matrix, i.e., principal component analysis (PCA, PRINCIPAL COMPONENT ANALYSIS) is performed.

The principal component analysis may be performed by the formula m=su, where S is an input variable, M is a principal component score matrix, and U is a load matrix, and correspondingly, performing data reconstruction to obtain the formula s=mu ^T, where T represents a transpose of the matrix.

Further, a contribution rate of less than 100% may be set so that the number of principal components finally selected is less than the number of input variables (i.e., the number of communication parameters contained in each training sample). For example, the principal component analysis can be performed by the formula s=mu ^T +e, where E is the residual matrix.

By the mode, the complexity of the classification model can be reduced under the condition that the accuracy of the classification model is not reduced basically, the training process is quickened, and the scale of the classification model is reduced.

Of course, it is also possible to perform the above steps of data preprocessing, key feature extraction, etc., or to perform the steps of data preprocessing, key feature extraction, etc., in other specific manners.

In some embodiments, referring to fig. 2, processing the training sample set by the particle swarm algorithm to obtain initial parameters of the classification model (S101) includes:

S1011, determining the positions and the speeds of a plurality of particles of the particle swarm.

I.e. initializing the population of particles, determining the number of particles therein, as well as the position, velocity, and parameters for each particle for use in subsequent operations.

Specifically, N particles are randomly generated in the solution space of the D-dimensional (D equals the number of communication parameters included in each training sample), where the position of any particle i is denoted as L _i＝(L_i1,L_i2,...L_iD, and the velocity is denoted as V _i＝(V_i1,V_i2,...V_iD).

And, the particle learning rate c ₁、c₂ is set, and the maximum update number item _max is set.

Further, an individual optimal solution P _best for each particle origin and a global optimal solution g _best for the particle group (population) can be determined.

S1012, updating the position and velocity of each particle and the inertia weight of the particle group.

The position, velocity, and inertial weight w of the particle swarm are iteratively updated.

Wherein the position L _i and velocity V _i of any particle i in each update can be calculated by the following formula:

L_i(k+1)＝L_i(k)+V_i(k+1);

V_i(k+1)＝w*V_i(k)+c₁*r₁*[P_best-L_i(k)]+c₂*r₂*[g_best-L_i(k)];

Wherein L _i (k) and V _i (k) are the position and velocity of the particle i after the kth update, L _i (k+1) and V _i (k+1) are the position and velocity of the particle i after the kth+1 update, respectively, and r ₁ and r ₂ are random numbers greater than 0 and less than 1 generated in each update.

In some embodiments, the inertial weight w is determined in this step (S1012) according to the following formula:

Generally, the larger the inertia weight w is, the stronger the global searching ability of the particle swarm algorithm is, and the weaker the local searching ability is, and the smaller the inertia weight w is, the weaker the global searching ability of the particle swarm algorithm is, and the stronger the local searching ability is. In the embodiment of the disclosure, the inertial weight w is calculated according to the above specific formula, so that the inertial weight w gradually converges from w _max to w _min according to the inverse proportion trend, and a portion with gradually expanding fitness according to the global optimal solution g _best is superimposed in the inertial weight w (of course, the portion does not exist if d ₁ =0). Therefore, when the adaptability of the global optimal solution g _best is larger, the inertia weight w is larger, the local optimal solution is not easy to fall into, the global searching capability of the particle swarm algorithm is increased, the more accurate finding of the optimal parameters of the classification model is facilitated, and the classification result of the classification model is more accurate.

S1013, updating the individual optimal solution of each particle and the global optimal solution of the particle group.

And updating the individual optimal solution of each particle and the global optimal solution of the particle group according to the updated position and speed of each particle.

Updating the individual optimal solution P _best for each particle may include comparing the current fitness value f (x) of the particle with the individual optimal solution P _best, using the current position of the particle as the individual optimal solution P _best if the fitness value f (x) is better than the individual optimal solution P _best, and otherwise, keeping the individual optimal solution P _best unchanged.

The fitness value f (x) can be calculated by the following formula;

wherein y _j represents a true value, The predicted value is represented, and t represents the number of samples.

The updating of the global optimal solution g _best of the particle swarm may include comparing fitness values of all particles with a current global optimal solution g _best, selecting a position of the particle with the optimal fitness value and superior to the global optimal solution g _best as a global optimal solution g _best, and if the optimal fitness value is not superior to the current global optimal solution g _best, keeping the global optimal solution g _best unchanged.

In some embodiments, further comprising:

S1014, if the global optimal solution is unchanged in the continuous m times of updating and the number of times of inertial weight resetting does not reach n, performing inertial weight resetting.

Wherein the inertial weight resetting includes making the number of updates equal to 1 and making the inertial weight equal to the maximum inertial weight, where m is a preset integer greater than or equal to 2 and n is a preset integer greater than or equal to 1.

When the global optimal solution g _best does not change in the continuous multiple updates, the inertial weight is reset, that is, the additional inertial weight w is equal to the preset maximum inertial weight w _max, and the number of updates (the number of updates that have been performed currently) k is reset to 1, which is equivalent to the process of restarting the iterative update from w _max.

Of course, the above number of inertial weight resets is limited, and when the number of inertial weight resets reaches the preset maximum number of inertial weight resets (i.e., n above), the inertial weight reset is not performed any more. Therefore, each time the inertial weight is reset, the number s of inertial weight resets needs to be increased by 1 (e.g., let s=s+1).

In the embodiment of the disclosure, the inertial weight w can be reset under the condition that the global optimal solution g _best is unchanged for a long time, and iterative updating is restarted, so that the local optimal solution can be better avoided, and the optimal parameters of the classification model can be more accurately found.

Of course, it is also possible if the above inertial weight resetting step is not performed.

S1015, taking the current global optimal solution as an initial parameter of the classification model when the termination condition is met, and returning to update the position and the speed of each particle and the inertia weight of the particle swarm when the termination condition is not met (S1012).

After updating the position, the speed, the inertia weight w, the individual optimal solution P _best and the global optimal solution g _best each time, judging whether the termination condition is met:

If yes, the iteration is considered to be terminated, so that the current global optimal solution g _best is output as an initial parameter of the classification model, and the initial classification model is continuously trained;

if not, the update count is increased by 1 (i.e., k=k+1), and the process returns to step S1012 to start updating the position, speed, inertial weight, and the like at the next time.

The termination condition may be set as necessary.

The termination condition may include, for example, the number of updates k reaching a preset maximum number of updates, or the accuracy being less than a certain preset value, etc.

Of course, the specific termination conditions may be set as other matters as required, and will not be described in detail herein.

In some embodiments, referring to FIG. 2, training samples of the same type as their corresponding cells, training an initial classification model with initial parameters with a training sample set (S102) includes:

s1021, determining the type weight of the training sample in the initial classification model.

In some embodiments, the type weight W _i of the training sample of type i in the initial classification model may be determined by the following formula:

In the field of communications, certain faults are very serious, i.e. cells in which such faults exist must be accurately detected, otherwise serious consequences may arise. However, serious faults often occur rarely, so that the number of fault samples is often small, and the training of the classification model is not facilitated.

In addition, the severity degree judgment (bias) of different faults by different users (such as different operators) can also be different, some users can consider that the faults of the type A are more serious, some users consider that the faults of the type B are more serious, and the personalized requirements are also reflected in the classification model.

For this purpose, according to the importance degree of each type of the cell, an importance coefficient q between 0 and 1 (the larger the value thereof is, the more important the type is) is manually set, and then, according to the importance coefficient q, the type weight W corresponding to the training sample of the type is calculated according to the above formula. Thus, in the training process of the classification model, each type of training sample can use the corresponding type weight W to reflect the importance degree of different types.

Through the mode, the user can set the importance degree of each type according to own bias, and a personalized classification model which meets own requirements is obtained.

In addition, the type weight W used in the classification model training is further calculated through an importance coefficient q set by a user, and in the calculation process, the smaller the number of training samples of a certain type is, the larger the relative importance coefficient q of the type weight W is, so that the problem of unbalanced number of the training samples of different types is solved.

Of course, it is also possible if the above type weight calculation step is not performed, or the type weights are calculated by other means (e.g., directly employing importance coefficients as type weights).

In some embodiments, referring to fig. 2, training an initial classification model with initial parameters with a training sample set to obtain a classification model (S102) includes:

s1022, training the initial classification model with initial parameters by using the training sample set to obtain an intermediate model.

S1023, selecting input features with importance degrees in a pre-determined position or a pre-determined proportion before ranking from the input features of the intermediate model as reserved features.

S1024, removing other input features except the reserved features from the input features of the intermediate model, and training the intermediate model by using a training sample set to obtain a classification model.

In the embodiment of the disclosure, an initial classification model with initial parameters may be trained to obtain an "intermediate model", and the importance degree (feature_ importance) of each input feature (corresponding to each communication parameter) of the intermediate model is naturally known. Therefore, each input feature can be ranked according to the importance degree, the input feature with higher importance degree (reserved feature) is reserved, the input feature with lower importance degree (other features) is removed, and then the intermediate model with reserved feature only is trained to obtain the classification model.

According to the method, feature reselection is performed based on the importance degree of the input features, and model retraining is performed after feature reselection, so that the accuracy of the obtained classification model can be further improved, and the classification model capable of meeting the high-accuracy requirement in the communication field can be obtained under the conditions that the number of training samples is small and the inter-class distance is small.

Of course, it is also possible to use the intermediate model directly as the final classification model without performing the above feature reselection retraining step.

In some embodiments, training an initial classification model with initial parameters with a training sample set to obtain a classification model (S102) includes equally dividing the training sample set into a plurality of samples, training the initial classification model with initial parameters with the training sample set to obtain a plurality of candidate classification models, using one sample for verification during the training of each candidate classification model, using other samples for training, and selecting different samples for verification during the training of different candidate classification models, and determining one candidate classification model as a classification model.

As one way of embodiments of the present disclosure, training may be performed using a cross-validation approach, namely:

The training sample set is divided into a plurality of samples (such as 10 samples) in a layering sampling mode, so that the number of training samples in each sample is the same, and the proportion of the various types of training samples in different samples is the same.

Then, starting from the initial classification model, training a plurality of (e.g. 10) candidate classification models, wherein one sample is selected for verification in the training process of each candidate classification model, the other samples (e.g. 9 samples) are used for training, and different samples are selected for verification in the training process of different candidate classification models (of course, the samples used for training in the training process of different candidate classification models are not identical);

and analyzing the obtained candidate classification models to obtain the advantages and disadvantages of the candidate classification models, and selecting the optimal candidate classification model from the candidate classification models as an output classification model, wherein the output classification model can be practically used in the process of classifying cells (such as fault cell identification).

Wherein the manner in which the classification model is determined from the plurality of candidate classification models is varied.

For example, a test sample set may be provided, the test sample set comprising a plurality of different types of test samples (each test sample comprising a plurality of communication parameters of one cell). And classifying the test samples in the test sample sets through a plurality of candidate classification models, and determining the optimal candidate classification model through analyzing the classification accuracy and recall rate of each test sample set.

Of course, it is also possible if the cross-validation approach is not used, but rather a classification model is directly trained with the training sample set.

In a second aspect, referring to FIG. 3, an embodiment of the present disclosure provides an apparatus comprising one or more memories, one or more processors, the memories storing a computer program executable by the processors, the computer program when executed by the processors implementing the following steps (i.e., a method of training any of the classification models above):

Wherein, the

The processor is a device with data processing capability, including but not limited to a Central Processing Unit (CPU), the memory is a device with data storage capability, including but not limited to a random access memory (RAM, more specifically SDRAM, DDR, etc.), a read-only memory (ROM), a charged erasable programmable read-only memory (EEPROM) and a FLASH memory (FLASH), and the I/O interface (read-write interface) is connected between the processor and the memory, so that the information interaction between the memory and the processor can be realized, including but not limited to a data Bus (Bus), etc.

In a third aspect, embodiments of the present disclosure provide a computer readable medium having stored thereon a computer program which, when executed by a processor, performs the following steps (i.e. a method of training any one of the classification models above):

Wherein, the

Example 1:

a random forest classification model was trained for classifying cells into the following 5 types, the latter 4 of which are different fault types.

The types of communication parameters used in the classification are as follows:

And selecting the data of the current moment and the previous 4 historical synchronous moments as communication parameters for each cell or training sample by each type of communication parameters. For example, when the current time is 18 points on monday, the 4 history contemporaneous times are 18 points on monday of the previous week, 18 points on monday of the previous two weeks, 18 points on monday of the previous three weeks, and 18 points on monday of the previous four weeks.

Thus, there are 15×5=75 communication parameters (75 dimensions) for each cell or training sample.

A corresponding training sample set is prepared.

S201, preprocessing data of each communication parameter of the training sample.

The preprocessing can be specifically normalized by the following formula:

Wherein x _ij is the original value of the communication parameter j in the training sample i, x is _ij is the normalized value of x _ij, mean (x _j) is the average value of the communication parameter j in all the training samples, and std (x _j) is the standard deviation of the communication parameter j in all the training samples.

S202, extracting key features from all communication parameters of the training sample.

And carrying out principal component analysis through a formula S=MU ^T +E, and reserving 85% of contribution degree, wherein S is an input variable, M is a principal component scoring matrix, U is a load matrix, T is a transpose of the matrix, and E is a residual matrix.

S203, processing the training sample set through a particle swarm algorithm to obtain initial parameters of the random forest classification model.

The initial parameters of the random forest classification model to be determined are 6, wherein the initial parameters comprise the number n_ estimators of the base classifiers, the maximum depth max_depth of the base classifiers, the maximum feature number max_features selected by the base classifiers, the minimum sample number min_samples_split of leaf nodes in a splitting process, the minimum sample number min_samples_leaf of nodes in a pruning process and an evaluation criterion function criterion.

The method specifically comprises the following steps:

S2031, initializing a population.

N particles are randomly generated in the solution space of the D-dimensional (75-dimensional) problem, where the position of any particle i is denoted as L _i＝(L_i1,L_i2,...L_iD) and the velocity is denoted as V _i＝(V_i1,V_i2,...V_iD.

The particle learning rate c ₁＝c₂ =2, the inertial weight maximum value w _max =0.8, the inertial weight minimum value w _min =0.2, the inertial weight modification rate d ₁ =0.1, the population number (particle number) n=20, the maximum update number item _max =100, and the maximum inertial weight reset number n=3 are set. Meanwhile, the current update number k=1 and the current inertia weight reset number s=1 are set.

S2032, updates the position and velocity of each particle, and the inertia weight of the particle group.

Wherein the position L _i and velocity V _i of particle i in each update can be calculated by the following formulas:

L_i(k+1)＝L_i(k)+V_i(k+1);

V_i(k+1)＝w*V_i(k)+c₁*r₁*[P_best-L_i(k)]+c₂*r₂*[g_best-L_i(k)];

And the inertial weight w can be calculated by the following formula:

Wherein L _i (k) and V _i (k) are the position and velocity of the particle i after the kth update, L _i (k+1) and V _i (k+1) are the position and velocity of the particle i after the kth+1 update, r ₁ and r ₂ are random numbers greater than 0 and less than 1, w (k+1) represents the inertial weight determined in the kth+1 update, g _best represents the global optimal solution, and f () represents the fitness function.

S2033, updating the individual optimal solution of each particle and the global optimal solution of the particle group.

Updating the individual optimal solution P _best for each particle may include comparing the current fitness value f (x) of the particle with the individual optimal solution P _best, using the current location of the particle as the individual optimal solution P _best if the fitness value f (x) is better than the individual optimal solution P _best, and otherwise, keeping the individual optimal solution P _best unchanged.

The fitness value f (x) can be calculated by the following formula;

The updating of the global optimal solution g _best of the particle swarm may include comparing fitness values of all particles with the global optimal solution g _best, selecting a position of the particle with the optimal fitness value and superior to the current global optimal solution g _best as the global optimal solution g _best, and if the optimal fitness value is not superior to the current global optimal solution g _best, keeping the global optimal solution g _best unchanged.

S2034, performing inertial weight reset if the global optimal solution g _best is unchanged and the number S of times of inertial weight reset performed has not reached the maximum number n=3 of inertial weight reset times in the continuous 5 updates.

The inertial weight reset includes making the number of updates k=1, w=w _max, and s=s+1.

S2035, judging the termination condition.

Judging whether the update times k reach the maximum update times iter _max =100 or whether the precision is smaller than 0.001, if one is yes, stopping the iterative update, and only entering the next step, if not, making k=k+1, and returning to the step S2032.

S204, training the initial classification model with the initial parameters by using the training sample set to obtain a random forest classification model.

Specifically, the step may include:

S2041, determining the type weight W _i of the training sample of the type i in the initial classification model through the following formula:

wherein q _i is a preset importance coefficient of training samples of type i, and P _i is a ratio of the number of training samples of type i to the total number of training samples in the training sample set.

In this embodiment, the cells with random access and RRC access or hand-in request are the most important, so the importance coefficient q of each type of cell is set as follows:

S2042, training by adopting a 10-layer cross validation method to obtain candidate classification models (each is a random forest classification model).

The training sample is divided into 10 parts, 9 parts are used as training, 1 part is used as verification, and when different candidate classification models are trained, different parts are used for verification, so that 10 candidate classification models are obtained in total.

S2043, performing feature reselection retraining.

Each random forest classification model obtained through the training has the importance degree of each input feature, the input features are ranked from big to small according to the importance degree of the input features, the first 80% of the input features are reserved, and the random forest classification model is trained again.

S2044, determining a classification model.

The test samples in the test sample set are classified by each candidate classification model (the ratio of the number of samples in the training sample set to the number of samples in the test sample set may be 3 to 1). Wherein, the classification result of each candidate classification model is as follows:

Where N _ab represents the number of test samples of type a classified as type b by the candidate classification model, it is apparent that the classification "correct" of the candidate classification model is indicated when b=a.

Thus, for any candidate classification model, its classification accuracy for type a is: and recall for the samples of type a is:

therefore, the classification capability score of the candidate classification model for the type a can be calculated according to the accuracy rate and the recall rate:

and the total classification ability score (total score) of the candidate classification model may be its average of all types of classification ability scores F.

Thus, the candidate classification model with the highest total score or the candidate classification model with the best combination of the total score and the classification ability score of each type can be selected as the output classification model (random forest classification model).

Example 2:

a gradient-lifting tree (GBDT) classification model was trained for classifying cells into the following 10 types, the latter 9 of which are different fault types.

And the communication parameters used in the classification are RB values of RB (resource) 0 to RB100 of each cell.

Thus, there are 101 communication parameters (101 dimensions) for each cell or training sample

A corresponding training sample set is prepared.

In this embodiment, since all communication parameters selected in the present embodiment are the same in kind, they have the same unit and similar orders of magnitude, and thus data preprocessing (normalization) is not required.

S301, extracting key features from all communication parameters of the training sample.

And carrying out principal component analysis through a formula S=MU ^T +E, and reserving 80% of contribution degree, wherein S is an input variable, M is a principal component scoring matrix, U is a load matrix, T is a transpose of the matrix, and E is a residual matrix.

S302, processing a training sample set through a particle swarm algorithm to obtain initial parameters of GBDT classification models.

The number of initial parameters of GBDT classification models to be determined is 4, including the number of base classifiers n_ estimators, the maximum depth max_depth of the base classifiers, the maximum feature number max_features selected by the base classifiers, and the learning depth learning_rate.

The method specifically comprises the following steps:

S3021, initializing a population.

N particles are randomly generated in the solution space of the D-dimensional (101-dimensional) problem, where the position of any particle i is denoted as L _i＝(L_i1,L_i2,...L_iD) and the velocity is denoted as V _i＝(V_i1,V_i2,...V_iD.

S3022, updating the position and velocity of each particle, and the inertia weight of the particle group.

L_i(k+1)＝L_i(k)+V_i(k+1);

V_i(k+1)＝w*V_i(k)+c₁*r₁*[P_best-L_i(k)]+c₂*r₂*[g_best-L_i(k)];

And the inertial weight w can be calculated by the following formula:

S3023, updating the individual optimal solution of each particle and the global optimal solution of the particle group.

The fitness value f (x) can be calculated by the following formula;

S3024, if the global optimal solution g _best is unchanged and the number S of times of inertial weight reset has not reached the maximum number n=3 of inertial weight reset in the continuous 5 updates, performing inertial weight reset.

S3025, judging termination conditions.

Judging whether the update times k reach the maximum update times iter _max =100 or whether the precision is smaller than 0.001, if one of the update times k is smaller than 0.001, ending the iterative update, and entering the next step, if the update times k are smaller than the maximum update times iter _max =100, returning to the step S3022.

S303, training the initial classification model with the initial parameters by using the training sample set to obtain GBDT classification models.

Specifically, the step may include:

s3031, determining the type weight W _i of the training sample of the type i in the initial classification model by the following formula:

In this embodiment, the importance coefficient q of each type of cell is set as follows:

s3032, training by adopting a 5-layer cross validation method to obtain candidate classification models (GBDT classification models in each case).

The training sample is divided into 5 parts, 4 parts are used as training and 1 part is used as verification, and when different candidate classification models are trained, different parts are used for verification, so that 5 candidate classification models are obtained in total.

S3033, performing feature reselection retraining.

Each GBDT classification model obtained by training has the importance degree of each input feature, the input features are ranked from big to small according to the importance degree of the input features, the first 85% of input features are reserved, and the GBDT classification model is trained again.

S3034, determining a classification model.

Thus, for this candidate classification model, its classification accuracy for type a is: and recall for the samples of type a is:

Thus, the candidate classification model with the highest total score or the candidate classification model with the best combination of the total score and the classification ability score of each type can be selected as the output classification model (GBDT classification model).

Those of ordinary skill in the art will appreciate that all or some of the steps, systems, functional modules/units in the apparatus disclosed above may be implemented as software, firmware, hardware, and suitable combinations thereof.

In a hardware implementation, the division between functional modules/units mentioned in the above description does not necessarily correspond to the division of physical components, for example, one physical component may have a plurality of functions, or one function or step may be cooperatively performed by several physical components.

Some or all of the physical components may be implemented as software executed by a processor, such as a Central Processing Unit (CPU), digital signal processor, or microprocessor, or as hardware, or as an integrated circuit, such as an application specific integrated circuit. Such software may be distributed on computer readable media, which may include computer storage media (or non-transitory media) and communication media (or transitory media). The term computer storage media includes both volatile and nonvolatile, removable and non-removable media implemented in any method or technology for storage of information such as computer readable instructions, data structures, program modules or other data, as known to those skilled in the art. Computer storage media includes, but is not limited to, random access memory (RAM, more specifically SDRAM, DDR, etc.), read-only memory (ROM), electrically charged erasable programmable read-only memory (EEPROM), FLASH memory (FLASH) or other magnetic disk storage, compact disk read-only memory (CD-ROM), digital Versatile Disk (DVD) or other optical disk storage, magnetic cassettes, magnetic tape, magnetic disk storage or other magnetic storage, and any other medium which can be used to store the desired information and which can be accessed by a computer. Furthermore, as is well known to those of ordinary skill in the art, communication media typically embodies computer readable instructions, data structures, program modules or other data in a modulated data signal such as a carrier wave or other transport mechanism and includes any information delivery media.

The present disclosure has disclosed example embodiments, and although specific terms are employed, they are used and should be interpreted in a generic and descriptive sense only and not for purpose of limitation. In some instances, it will be apparent to one skilled in the art that features, characteristics, and/or elements described in connection with a particular embodiment may be used alone or in combination with other embodiments unless explicitly stated otherwise. It will be understood by those skilled in the art that various changes in form and details may be made therein without departing from the scope of the disclosure as set forth in the appended claims.

Claims

1. A method of classification model training, comprising:

Wherein, the

The training sample set comprises a plurality of training samples, wherein each training sample comprises a plurality of communication parameters of a cell, each cell has a determined type, and a plurality of cells corresponding to the plurality of training samples of the training sample set have at least two different types;

Training an initial classification model with initial parameters by using a training sample set, and obtaining the classification model comprises the following steps:

Removing other input features except the reserved features from the input features of the intermediate model, and training the intermediate model by using a training sample set to obtain a classification model;

the step of processing the training sample set through the particle swarm algorithm to obtain initial parameters of the classification model comprises the following steps:

Returning to the step of updating the position and speed of each particle and the inertia weight of the particle group when the termination condition is not met;

the updating the inertial weight of the particle swarm includes determining the inertial weight w according to the following formula:

2. The method of claim 1, wherein the updating the inertial weights of the population of particles further comprises:

3. The method of claim 1, wherein the training samples are of the same type as their corresponding cells, wherein training an initial classification model with initial parameters with the training sample set, the deriving a classification model comprises:

the type weight W _i of the training sample of type i in the initial classification model is determined by the following formula:

4. The method of claim 1, wherein training an initial classification model with initial parameters with a training sample set to obtain a classification model comprises:

dividing the training sample set into a plurality of samples;

A candidate classification model is determined as a classification model.

5. The method of claim 1, wherein the classification model comprises at least one of:

6. An apparatus comprising one or more memories, one or more processors, the memories storing a computer program executable by the processors, the computer program when executed by the processors implementing the steps of:

Wherein, the

7. A computer readable medium having stored thereon a computer program which when executed by a processor performs the steps of:

Wherein, the