
CN113807374A - Information processing apparatus, information processing method, and computer-readable storage medium - Google Patents


Info

Publication number
CN113807374A
CN113807374A (application CN202010536138.9A)
Authority
CN
China
Prior art keywords
box model
trained
constraint
white
candidate
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202010536138.9A
Other languages
Chinese (zh)
Inventor
张姝
高玥
孙俊
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Fujitsu Ltd
Original Assignee
Fujitsu Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Fujitsu Ltd filed Critical Fujitsu Ltd
Priority to CN202010536138.9A
Priority to JP2021079504A
Publication of CN113807374A
Legal status: Pending

Classifications

    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06F: ELECTRIC DIGITAL DATA PROCESSING
    • G06F 18/00: Pattern recognition
    • G06F 18/20: Analysing
    • G06F 18/24: Classification techniques
    • G06F 18/243: Classification techniques relating to the number of classes
    • G06F 18/24317: Piecewise classification, i.e. whereby each classification requires several discriminant rules
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06N: COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N 20/00: Machine learning

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Data Mining & Analysis (AREA)
  • Evolutionary Computation (AREA)
  • General Engineering & Computer Science (AREA)
  • Software Systems (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • General Physics & Mathematics (AREA)
  • Artificial Intelligence (AREA)
  • Physics & Mathematics (AREA)
  • Evolutionary Biology (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Medical Informatics (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Computing Systems (AREA)
  • Mathematical Physics (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract


The present application discloses an information processing apparatus, an information processing method, and a computer-readable storage medium. The information processing apparatus includes: a constraint condition generation unit that generates a plurality of constraint conditions based on a sample set; a sample grouping unit that groups the sample set into a plurality of sample subsets in one-to-one correspondence with the plurality of constraint conditions; a candidate constraint condition selection unit that selects, as candidate constraint conditions, one or more constraint conditions to which a target object conforms; a model training unit that obtains, by training, a trained white-box model corresponding to each candidate constraint condition; a white-box model score calculation unit that calculates a score of each trained white-box model based on the confidence and/or support of the constraint condition corresponding to the trained white-box model and on the classification performance of the trained white-box model; and an analysis result output unit that outputs the candidate constraint conditions and the scores of the trained white-box models corresponding to the candidate constraint conditions as the analysis result of the target object.


Description

Information processing apparatus, information processing method, and computer-readable storage medium
Technical Field
The present disclosure relates to the field of information processing, and in particular, to an information processing apparatus, an information processing method, and a computer-readable storage medium.
Background
In recent years, machine learning has been used in various fields such as object recognition. In machine learning, a classification model configured by a nonlinear model such as a neural network is generally used to classify samples, thereby obtaining a classification result of the samples. However, the classification model configured by the nonlinear model is a black box having unknown internal behavior, and thus it is difficult to analyze a specific cause causing a corresponding classification result.
Disclosure of Invention
The following presents a simplified summary of the disclosure in order to provide a basic understanding of some aspects of the disclosure. However, it should be understood that this summary is not an exhaustive overview of the disclosure. It is not intended to identify key or critical elements of the disclosure or to delineate the scope of the disclosure. Its sole purpose is to present some concepts of the disclosure in a simplified form as a prelude to the more detailed description that is presented later.
It is an object of the present disclosure to provide an improved information processing apparatus, information processing method, and computer-readable storage medium that at least enable better analysis of specific causes that cause classification results.
According to an aspect of the present disclosure, there is provided an information processing apparatus including: a constraint condition generating unit configured to generate a plurality of constraint conditions based on the sample set; a sample grouping unit configured to group the sample set into a plurality of sample subsets in one-to-one correspondence with the plurality of constraints based on the plurality of constraints; a candidate constraint condition selection unit configured to select one or more constraint conditions to which a target object conforms as candidate constraint conditions from the plurality of constraint conditions; a model training unit configured to train, for each of the candidate constraints, a white-box model with a sample subset corresponding to the candidate constraint and a classification result based on a pre-trained black-box model of the sample subset corresponding to the candidate constraint to obtain a trained white-box model corresponding to the candidate constraint; a white-box model score calculation unit configured to calculate a score of the trained white-box model based on at least one of a confidence and a support of a constraint condition corresponding to the trained white-box model and a classification performance of the trained white-box model with respect to the pre-trained black-box model; and an analysis result output unit configured to output the candidate constraint condition and a score of the trained white-box model corresponding to the candidate constraint condition as an analysis result of the target object.
According to another aspect of the present disclosure, there is provided an information processing method including: a constraint condition generation step of generating a plurality of constraint conditions based on a sample set; a sample grouping step of grouping the sample set, based on the plurality of constraint conditions, into a plurality of sample subsets in one-to-one correspondence with the plurality of constraint conditions; a candidate constraint condition selection step of selecting, from the plurality of constraint conditions, one or more constraint conditions that the target object meets as candidate constraint conditions; a model training step of training, for each of the candidate constraint conditions, a white-box model using the sample subset corresponding to the candidate constraint condition and the classification result, based on a pre-trained black-box model, of the sample subset corresponding to the candidate constraint condition, to obtain a trained white-box model corresponding to the candidate constraint condition; a white-box model score calculation step of calculating a score of the trained white-box model based on at least one of the confidence and the support of the constraint condition corresponding to the trained white-box model and the classification performance of the trained white-box model with respect to the pre-trained black-box model; and an analysis result output step of outputting the candidate constraint condition and the score of the trained white-box model corresponding to the candidate constraint condition as the analysis result of the target object.
According to other aspects of the present disclosure, there are also provided computer program code and a computer program product for implementing the above-described method according to the present disclosure, and a computer readable storage medium having recorded thereon the computer program code for implementing the above-described method according to the present disclosure.
Additional aspects of the disclosed embodiments are set forth in the description section that follows, wherein the detailed description is presented to fully disclose the preferred embodiments of the disclosed embodiments without imposing limitations thereon.
Drawings
The disclosure may be better understood by reference to the following detailed description taken in conjunction with the accompanying drawings, in which like or similar reference numerals are used throughout the figures to designate like or similar components. The accompanying drawings, which are incorporated in and form a part of the specification, further illustrate preferred embodiments of the present disclosure and explain the principles and advantages of the present disclosure. In the drawings:
fig. 1 is a block diagram showing a functional configuration example of an information processing apparatus according to an embodiment of the present disclosure;
fig. 2 is a block diagram showing an example of an architecture of an information processing apparatus according to an embodiment of the present disclosure;
fig. 3 shows an example of a sample set used in an example case of classifying iris flowers;
FIG. 4 illustrates an example of constraints derived based on the sample set shown in FIG. 3;
FIG. 5 illustrates an example of a subset of samples derived based on the sample set shown in FIG. 3 and the constraints shown in FIG. 4;
fig. 6A and 6B show examples of constraint boundaries corresponding to a model for a certain target sample obtained using a related art and an information processing apparatus according to an embodiment of the present disclosure, respectively;
FIG. 7 is a flow chart illustrating an example of a flow of an information processing method 400 according to an embodiment of the present disclosure; and
fig. 8 is a block diagram showing an example structure of a personal computer employable in the embodiments of the present disclosure.
Detailed Description
Exemplary embodiments of the present disclosure will be described hereinafter with reference to the accompanying drawings. In the interest of clarity and conciseness, not all features of an actual implementation are described in the specification. It will of course be appreciated that in the development of any such actual embodiment, numerous implementation-specific decisions must be made to achieve the developers' specific goals, such as compliance with system-related and business-related constraints, which will vary from one implementation to another. Moreover, it will be appreciated that such a development effort might be complex and time-consuming, but would nevertheless be a routine undertaking for those of ordinary skill in the art having the benefit of this disclosure.
Here, it should be further noted that, in order to avoid obscuring the present disclosure with unnecessary details, only the device structures and/or processing steps closely related to the scheme according to the present disclosure are shown in the drawings, and other details not so relevant to the present disclosure are omitted.
Embodiments according to the present disclosure are described in detail below with reference to the accompanying drawings.
First, implementation examples of an information processing apparatus according to an embodiment of the present disclosure will be described with reference to fig. 1 to 5. Fig. 1 is a block diagram showing a functional configuration example of an information processing apparatus according to an embodiment of the present disclosure. Fig. 2 is a block diagram showing an architecture example of one specific implementation of an information processing apparatus according to an embodiment of the present disclosure. Fig. 3 is a diagram showing an example of a sample set used in an example case of classifying iris flowers, fig. 4 shows an example of a constraint derived based on the corresponding sample set, and fig. 5 shows an example of a sample subset derived based on the corresponding sample set and the constraint. As shown in fig. 1, the information processing apparatus 100 according to the embodiment of the present disclosure may include a constraint condition generation unit 102, a sample grouping unit 104, a candidate constraint condition selection unit 106, a model training unit 108, a white-box model score calculation unit 110, and an analysis result output unit 112.
The functions of the respective units of the information processing apparatus 100 according to the embodiment of the present disclosure will be described below with reference to a specific example of classifying iris flowers. Note that, although the information processing apparatus 100 according to the embodiment of the present disclosure is described below in conjunction with a specific example of classifying iris flowers, it is to be understood that the information processing apparatus 100 of the present disclosure may be used in a case of classifying various objects.
Iris flowers comprise a wide variety of species; generally, iris flowers are classified by calyx length, calyx width, petal length, and petal width. For example, considered herein is a classification example of determining whether an iris flower is a discolored iris (Iris versicolor) based on calyx length, calyx width, petal length, and petal width.
The constraint condition generating unit 102 may be configured to generate a plurality of constraint conditions based on a sample set (global data). For example, as shown in FIG. 2, a plurality of constraint conditions R1, R2, …, Rs may be generated based on the sample set.
For example, in the case of classifying iris flowers, the constraint condition generating unit 102 may generate, based on the sample set shown in fig. 3, the constraint conditions R1, R2, R3, R4, … shown in fig. 4, where R1 is petal width < 0.8 cm and petal length < 5.0 cm; R2 is petal length < 5.0 cm and calyx length > 6.0 cm; R3 is petal length > 6.0 cm; and R4 is calyx width > 2.8 cm and petal length > 6.0 cm.
For example, the constraint condition generating unit 102 may generate the plurality of constraint conditions based on the sample set using decision trees, decision rules, rule fitting (RuleFit), or the like.
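By way of illustration and not limitation, the following Python sketch shows one way such constraint conditions could be read off a fitted decision tree, namely by treating each root-to-leaf path as a conjunction of threshold tests; it assumes scikit-learn, and the helper names are illustrative rather than part of the present disclosure.

```python
# Illustrative sketch: constraint generation from a shallow decision tree.
from sklearn.tree import DecisionTreeClassifier

def extract_constraints(X, y, feature_names, max_depth=3):
    tree = DecisionTreeClassifier(max_depth=max_depth).fit(X, y).tree_
    constraints = []

    def walk(node, conds):
        if tree.children_left[node] == -1:   # leaf: the path is one constraint
            constraints.append(conds)
            return
        feat = feature_names[tree.feature[node]]
        thr = tree.threshold[node]
        walk(tree.children_left[node], conds + [(feat, "<=", thr)])
        walk(tree.children_right[node], conds + [(feat, ">", thr)])

    walk(0, [])
    return constraints   # e.g. [[("petal length", ">", 6.0)], ...]
```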
By way of illustration and not limitation, the sample set may include training samples for which the classification result is known and/or randomly generated samples (samples for which the classification result is unknown).
The sample grouping unit 104 may be configured to group the sample set, based on the plurality of constraint conditions, into a plurality of sample subsets in one-to-one correspondence with the plurality of constraint conditions. For example, as shown in FIG. 2, the sample set may be grouped, based on the constraint conditions R1, R2, …, Rs, into sample subsets D1, D2, …, Ds corresponding to the constraint conditions R1, R2, …, Rs, respectively.
FIG. 5 shows an example of the sample subsets derived from the sample set shown in fig. 3 and the constraint conditions shown in fig. 4. As shown in fig. 5, sample subset D1 includes sample 1, sample 2, sample 7, …; sample subset D2 includes sample 3, sample 4, sample 5, sample 6, …; sample subset D3 includes sample 8, sample 9, …; and sample subset D4 includes sample 9, ….
The candidate constraint condition selecting unit 106 may be configured to select, from the plurality of constraint conditions, one or more constraint conditions to which the target object conforms as candidate constraint conditions. For example, as shown in fig. 2, for a sample to be interpreted (i.e., a target object) x, one candidate constraint condition R1 to which the sample x conforms may be selected.
For example, the target object may be a sample in a sample set, or the target object may not be a sample in a sample set.
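As an illustrative sketch only, the grouping and selection just described could be implemented as follows, reusing the (feature, operator, threshold) constraint form of the previous sketch and representing samples as feature-name-to-value mappings; all helper names are assumptions.

```python
# Illustrative sketch: sample grouping unit 104 and candidate
# constraint condition selection unit 106.
_OPS = {"<=": lambda a, b: a <= b, "<": lambda a, b: a < b,
        ">": lambda a, b: a > b, ">=": lambda a, b: a >= b}

def satisfies(sample, constraint):
    # A constraint is a conjunction: every threshold test must hold.
    return all(_OPS[op](sample[feat], thr) for feat, op, thr in constraint)

def group_samples(sample_set, constraints):
    # One subset D_i per constraint R_i; subsets may overlap, since a
    # sample can conform to several constraints at once.
    return [[s for s in sample_set if satisfies(s, r)] for r in constraints]

def select_candidates(target_object, constraints):
    # Keep only the constraints that the target object conforms to.
    return [r for r in constraints if satisfies(target_object, r)]
```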
The model training unit 108 may be configured to, for each of the candidate constraint conditions, train the white-box model using the sample subset corresponding to the candidate constraint condition and the classification result, based on a pre-trained black-box model, of the sample subset corresponding to the candidate constraint condition, to obtain a trained white-box model corresponding to the candidate constraint condition. For example, as shown in FIG. 2, for the sample to be interpreted (i.e., target object) x, the model training unit 108 may train the white-box model using the sample subset D1 corresponding to the candidate constraint condition R1 and the classification result P1 of the sample subset D1 based on the pre-trained black-box model, to obtain the trained white-box model g1 corresponding to the candidate constraint condition R1.
In addition, in FIG. 2, P2 and Ps respectively represent the classification results, based on the pre-trained black-box model, of the sample subsets D2 and Ds corresponding to the constraint conditions R2 and Rs, and g2 and gs respectively represent the trained white-box models corresponding to the constraint conditions R2 and Rs.
By way of example, the white-box model may be a linear model, such as a general linear model, a logistic regression model, a Least Absolute Shrinkage and Selection Operator (Lasso) regression model, a Poisson regression model, or the like. Those skilled in the art can select a suitable white-box model according to actual needs.
By way of illustration and not limitation, the pre-trained black-box model may be trained using the sample set described above. Note that the sample set used for training the black box model is not limited to the above sample set, and for example, the black box model may be trained using a sample set different from the above sample set to obtain the above pre-trained black box model.
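For illustration only, the training loop of the model training unit 108 could look as follows, under the assumptions of a logistic-regression white-box model, numeric feature vectors, and a pre-trained black box exposing a scikit-learn-style predict(); the essential point is that the black box's predictions on each subset Di, rather than ground-truth labels, serve as the training targets for gi.

```python
# Illustrative sketch: model training unit 108.
import numpy as np
from sklearn.linear_model import LogisticRegression

def train_white_boxes(candidate_constraints, subsets, black_box):
    trained = []
    for r, subset in zip(candidate_constraints, subsets):
        X = np.asarray(subset, dtype=float)
        y_black = black_box.predict(X)       # P_i in fig. 2
        g = LogisticRegression().fit(X, y_black)
        trained.append((r, g))               # (R_i, g_i) pairs
    return trained
```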
The white-box model score calculation unit 110 may be configured to calculate a score of the trained white-box model based on at least one of a confidence and a support of the constraint condition corresponding to the trained white-box model and a classification performance of the trained white-box model with respect to the pre-trained black-box model. The confidence of a constraint may be calculated as the ratio of the number of samples in the subset of samples corresponding to the constraint that are classified as positive samples based on a pre-trained black-box model to the total number of samples in the subset of samples. The support of a constraint may be a ratio of the number of samples in the subset of samples to which the constraint corresponds to the total number of samples in the set of samples.
For example, the classification performance of the trained white-box model relative to the pre-trained black-box model may be a measure of the ability of the trained white-box model to fit the pre-trained black-box model described above.
For example, the classification performance of the trained white-box model relative to the pre-trained black-box model may be characterized by precision. For example, the precision may be calculated according to the following equation (1):

Precision = TP / (TP + FP)    (1)
In equation (1), TP represents the number of samples in a particular sample set (e.g., a subset of samples corresponding to a constraint corresponding to a trained white-box model) that are each classified as positive samples by the trained white-box model and a pre-trained black-box model, and FP represents the number of samples in the particular sample set that are classified as positive samples by the trained white-box model and as negative samples by the pre-trained black-box model.
In addition, the classification performance of the trained white-box model relative to the pre-trained black-box model may be characterized by accuracy, which may be calculated, for example, as the accuracy of the classification results based on the trained white-box model with respect to the classification results based on the pre-trained black-box model over a particular sample set (e.g., the sample subset corresponding to the constraint condition corresponding to the trained white-box model).
Furthermore, the classification performance of the trained white-box model relative to the pre-trained black-box model can be characterized, for example, by the Area Under the Receiver Operating Characteristic (ROC) Curve (AUC).
For example, the score of the trained white-box model may represent the degree to which the constraint corresponding to the trained white-box model positively affects the classification result. For example, the higher the score of the trained white-box model, the greater the degree of positive influence of the constraint corresponding to the trained white-box model on the classification result.
For example, the white-box model score calculation unit 110 may calculate, as a score of the trained white-box model, for each trained white-box model, a product of the accuracy of the trained white-box model and one of: the confidence of the constraint condition corresponding to the trained white-box model; the support degree of the constraint condition; and the sum of the confidence and the support of the constraint conditions. In addition, those skilled in the art may also calculate the score of the trained white-box model based on at least one of the confidence and the support of the constraint corresponding to the trained white-box model and the classification performance of the trained white-box model relative to the pre-trained black-box model in other ways. For example, a weight may be set on at least one of the confidence and the support of the constraint condition corresponding to the trained white-box model, and the product of the confidence, the support and/or the sum of the confidence and the support after the weight is set and a parameter characterizing the classification performance of the trained white-box model with respect to the pre-trained black-box model may be used as the score of the trained white-box model.
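As a minimal sketch of one of the combinations named above (the precision of the white-box model against the black-box model multiplied by the sum of the constraint's confidence and support), the score could be computed as follows; the function and parameter names are illustrative assumptions.

```python
# Illustrative sketch: white-box model score calculation unit 110.
import numpy as np

def white_box_score(white_box, black_box, subset, total_samples):
    X = np.asarray(subset, dtype=float)
    y_white = white_box.predict(X)
    y_black = black_box.predict(X)
    tp = int(np.sum((y_white == 1) & (y_black == 1)))
    fp = int(np.sum((y_white == 1) & (y_black == 0)))
    precision = tp / (tp + fp) if (tp + fp) > 0 else 0.0  # equation (1)
    confidence = float(np.mean(y_black == 1))  # black-box positives in D
    support = len(subset) / total_samples      # |D| relative to sample set
    return precision * (confidence + support)
```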
The analysis result output unit 112 may be configured to output the candidate constraint condition and the score of the trained white-box model corresponding to the candidate constraint condition as the analysis result of the target object.
In machine learning, a classification model configured by a nonlinear model such as a neural network is generally used to classify samples, thereby obtaining a classification result of the samples. However, the classification model configured by the nonlinear model is a black box having unknown internal behavior, so it is difficult to know the specific cause of a given classification result, which makes that cause hard to analyze. For this reason, many methods (for example, the LIME method) have been proposed to fit the black-box model and thereby interpret its classification results, i.e., to analyze the specific causes of the classification results. In the LIME method, new samples are randomly generated around the sample to be interpreted (i.e., the target object), classification results for these samples are obtained from the black-box model, and a linear model is trained to fit the black-box model using the randomly generated samples and the corresponding classification results, so as to achieve local interpretability. However, the LIME method has the following problem: for each classification result to be interpreted, samples must be randomly generated to train a corresponding model, which is time-consuming and labor-intensive and makes the method unsuitable for large-scale scenarios or scenarios with response-speed requirements.
The information processing apparatus according to the embodiment of the present disclosure may not need to randomly generate a sample to train a corresponding white-box model for each target object, so that the training process may be more efficient.
In addition, the information processing apparatus according to the embodiment of the present disclosure trains the white-box model mainly using samples in the sample set, which are real samples and are more meaningful than randomly generated samples.
Further, the information processing apparatus according to the embodiment of the present disclosure may calculate the score of the trained white-box model based on at least one of the confidence and the support of the constraint condition corresponding to the trained white-box model and the classification performance of the trained white-box model with respect to the pre-trained black-box model, and output the candidate constraint condition and the score of the trained white-box model corresponding to the candidate constraint condition as the analysis result of the target object, so that the analysis result is easier to understand. In order to better understand the process of the information processing apparatus according to the embodiment of the present disclosure and the technical effects thereof, the following will be described with reference to an example of classifying iris flowers.
For example, for a target object 1 having a petal width of 1.5 cm, a petal length of 6.5 cm, a calyx width of 1.5 cm, and a calyx length of 6.5 cm, the constraint condition R3 that target object 1 satisfies is selected as a candidate constraint condition, the white-box model is trained for the candidate constraint condition R3 to obtain a trained white-box model g3 (not shown), and the score of the trained white-box model g3 is further calculated to be 0.1. In this case, for example, the analysis result output unit 112 may output as the analysis result: petal length > 6.0 cm (i.e., candidate constraint condition R3), 0.1. This analysis result can be interpreted as: a petal length > 6.0 cm gives a probability of 0.1 that target object 1 is classified as a discolored iris. Such an analysis result allows a better understanding of the reason why target object 1 was (or was not) classified as a discolored iris by the pre-trained black-box model.
Further, what is obtained by the LIME method of the related art is the importance of each feature quantity, and the correlation and mutual influence between feature quantities are not considered.
As an example, at least one candidate constraint condition defines values or value ranges for two or more features. In this case, the trained white-box model for the at least one candidate constraint condition may capture the association and interplay between the two or more features, so that the fitting accuracy of the trained white-box model with respect to the black-box model may be further improved. For example, in the above example of classifying iris flowers, the candidate constraint condition R1 defines value ranges for two features (i.e., petal width and petal length), and for this candidate constraint condition the trained white-box model g1 can capture the correlation and interplay between petal width and petal length.
Fig. 6A shows the constraint boundaries corresponding to the model for the sample y to be interpreted, obtained using the LIME method. As shown in fig. 6A, the linear model generated using the LIME method corresponds to only a simple linear decision boundary (constraint boundary), but the linear model does not capture the non-linear dependency.
According to an embodiment of the present disclosure, the candidate constraint condition selecting unit 106 may be further configured to select, as candidate constraint conditions, two or more constraint conditions to which the target object conforms from the plurality of constraint conditions, where at least two of the plurality of sample subsets overlap; in other words, the at least two sample subsets contain one or more identical samples. For example, for a certain sample to be interpreted (i.e., target object) y, the candidate constraint condition selecting unit 106 may select, from the plurality of constraint conditions, two constraint conditions R1 and R2 to which the target object y conforms as candidate constraint conditions. Further, the model training unit 108 may train the white-box model using the sample subsets D1 and D2 corresponding to the candidate constraint conditions R1 and R2, respectively, and the classification results P1 and P2 of the sample subsets D1 and D2 based on the pre-trained black-box model, to obtain trained white-box models g1 and g2 corresponding to the candidate constraint conditions R1 and R2. For example, FIG. 6B shows the decision boundaries (constraint boundaries) corresponding to the trained white-box models g1 and g2 in the case where g1 and g2 are linear models.
The information processing apparatus according to the embodiment of the present disclosure may generate a plurality of constraints based on the sample set, select a plurality of candidate constraints to which the target object conforms, and acquire a plurality of trained white-box models for the plurality of candidate constraints, so that the non-linear dependency may be captured, and thus the black-box model may be fitted with the plurality of trained white-box models (e.g., linear models) more accurately.
As described above, the LIME method in the prior art requires a random generation of samples for training the corresponding training model for each classification result to be interpreted, which is time-consuming and labor-consuming. According to an embodiment of the present disclosure, the information processing apparatus 100 may further include a model fusion unit 114 (shown in fig. 1 with a dashed box), and the model fusion unit 114 may be configured to fuse two or more trained white-box models obtained via the model training unit 108 to obtain a final model of the target object. For example, in a case where a final model for a specific target object is obtained, the final model may be applied to an object whose feature quantity is similar to that of the specific target object (for example, a difference between the respective feature quantities is within a predetermined range), so that the overall model training efficiency may be further improved.
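The present disclosure does not fix a particular fusion rule, so the sketch below assumes one simple possibility, averaging the predicted positive-class probabilities of the trained white-box models; this is an assumption for illustration only, not the prescribed implementation of the model fusion unit 114.

```python
# Illustrative sketch: fusing trained white-box models g_1, ..., g_k
# by averaging their predicted positive-class probabilities.
import numpy as np

class FusedModel:
    def __init__(self, white_boxes):
        self.white_boxes = white_boxes       # trained g_i models

    def predict_proba_positive(self, X):
        # Mean positive-class probability over all fused white boxes.
        return np.mean([g.predict_proba(X)[:, 1] for g in self.white_boxes],
                       axis=0)

    def predict(self, X, threshold=0.5):
        return (self.predict_proba_positive(X) >= threshold).astype(int)
```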
According to an embodiment of the present disclosure, the information processing apparatus 100 may further include a constraint score calculation unit 116 (shown by a dashed box in fig. 1), and the constraint score calculation unit 116 may be configured to calculate, for each of two or more constraints to which the target object conforms, a score of the constraint based on: confidence of the constraint; the support degree of the constraint condition and the classification result of the sample subset corresponding to the constraint condition based on the pre-trained black box model; or the confidence and support of the constraint and the classification result of the sample subset corresponding to the constraint. In this case, the white-box model score calculation unit 110 may be further configured to calculate the score of the trained white-box model based on the classification performance of the trained white-box model with respect to the pre-trained black-box model and the scores of the constraints corresponding to the trained white-box model.
For example, the constraint condition score calculation unit 116 may calculate the score Rule_score of each constraint condition according to the following equation (2):

Rule_score = w * rule_confidence + v * (num+ / num-) * ratio    (2)

In equation (2), rule_confidence represents the confidence of the constraint condition, num+ / num- represents the classification result of the sample subset corresponding to the constraint condition based on the pre-trained black-box model, and ratio represents the ratio of the number of samples in the sample subset corresponding to the constraint condition to the total number of samples (i.e., the support). Here, num+ and num- respectively represent the number of positive samples and the number of negative samples in the sample subset corresponding to the constraint condition, obtained based on the pre-trained black-box model. w and v are the weights of rule_confidence and of (num+ / num-) * ratio, respectively, with w ≥ 0 and v ≥ 0. The larger w is, the higher the importance given to the constraint condition itself (i.e., its confidence) when calculating its score; the larger v is, the higher the importance given to the distribution of the sample subset corresponding to the constraint condition (i.e., the ratio of positive to negative samples and the support).

In equation (2), w = 0 means that the confidence of the constraint condition is not considered when calculating its score, and v = 0 means that the support of the constraint condition and the classification result of the corresponding sample subset based on the pre-trained black-box model are not considered. The specific values of w and v can be set by those skilled in the art according to actual needs.
Note that the way of calculating the score of the constraint condition is not limited to the above equation (2), and a person skilled in the art may calculate the score of the constraint condition based on at least one of the confidence and the support of the constraint condition and/or the classification result of the sample subset corresponding to the constraint condition based on the pre-trained black box model according to actual needs in other ways, which will not be described herein again.
For example, the white-box model score calculation unit 110 may calculate the score Model_score of the trained white-box model according to the following equation (3):

Model_score = Rule_score * Wm    (3)
In equation (3), Wm is a value characterizing the classification performance of the trained white-box model relative to the pre-trained black-box model, such as accuracy, precision, area under the ROC curve, and the like.
Note that the way of calculating the score of the trained white-box model is not limited to the above equation (3).
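For concreteness, equations (2) and (3) transcribe directly into code as follows, where num_pos, num_neg, and ratio come from the black-box classification of the corresponding sample subset, wm characterizes the white-box model's classification performance, and num_neg is assumed here to be non-zero; the function names are illustrative.

```python
# Illustrative transcription of equations (2) and (3).
def rule_score(rule_confidence, num_pos, num_neg, ratio, w=1.0, v=1.0):
    # w, v >= 0 weight the confidence term and the distribution term.
    return w * rule_confidence + v * (num_pos / num_neg) * ratio  # eq. (2)

def model_score(rule_score_value, wm):
    # wm: accuracy, precision, AUC, or another performance measure.
    return rule_score_value * wm                                  # eq. (3)
```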
According to an embodiment of the present disclosure, the candidate constraint condition selecting unit 106 may be further configured to select, as candidate constraint conditions, the top n constraint conditions with the largest scores from the two or more constraint conditions to which the target object conforms, to select, as candidate constraint conditions, the constraint conditions whose scores are larger than a predetermined threshold from the two or more constraint conditions, or to select, as candidate constraint conditions, non-repetitive constraint conditions from the two or more constraint conditions, where n is a natural number larger than 0. Here, non-repetitive constraint conditions are those for which there is no overlap in the values or value ranges of the respective features they define. For example, in the above example of classifying iris flowers, the constraint condition R1 defines the features "petal width" and "petal length", the constraint condition R2 defines the features "calyx length" and "petal length", and for both R1 and R2 the value range of the feature "petal length" is less than 5.0 cm. That is, for the constraint conditions R1 and R2, the values of the feature "petal length" overlap, so R1 and R2 can be regarded as repetitive constraint conditions. In contrast, the constraint condition R3 defines the feature "petal length" with the value range "> 6.0 cm". Thus, R1 and R3 can be regarded as non-repetitive constraint conditions, and R2 and R3 can likewise be regarded as non-repetitive constraint conditions.
For example, the value of n can be set by those skilled in the art according to actual needs. Preferably, n is a natural number greater than 1.
By further selecting candidate constraints from the two or more constraints to which the target object meets in this way, the number of white-box models that need to be trained can be reduced, and thus training efficiency can be further improved.
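As one illustrative sketch of the three selection strategies named above, the following helpers could be used; the range-overlap test for repetitive constraints follows the iris example (R1 and R2 overlap on petal length, whereas R3 does not), and all names are assumptions.

```python
# Illustrative sketch: top-n, threshold, and non-repetitive selection.
import math

def _intervals(constraint):
    # Per-feature (low, high) interval implied by the threshold tests.
    iv = {}
    for feat, op, thr in constraint:
        lo, hi = iv.get(feat, (-math.inf, math.inf))
        if op in ("<=", "<"):
            hi = min(hi, thr)
        else:
            lo = max(lo, thr)
        iv[feat] = (lo, hi)
    return iv

def repetitive(r1, r2):
    # Two constraints repeat if their ranges overlap on a shared feature.
    iv1, iv2 = _intervals(r1), _intervals(r2)
    return any(max(iv1[f][0], iv2[f][0]) < min(iv1[f][1], iv2[f][1])
               for f in iv1.keys() & iv2.keys())

def top_n(scored, n):
    # scored: list of (constraint, score); keep the n highest scores.
    return sorted(scored, key=lambda rs: rs[1], reverse=True)[:n]

def above_threshold(scored, threshold):
    return [(r, s) for r, s in scored if s > threshold]

def non_repetitive(scored):
    # Greedily keep a best-scoring set of mutually non-repetitive rules.
    kept = []
    for r, s in sorted(scored, key=lambda rs: rs[1], reverse=True):
        if all(not repetitive(r, k) for k, _ in kept):
            kept.append((r, s))
    return kept
```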
According to an embodiment of the present disclosure, the model training unit 108 may be further configured to: in the case where the sample subset corresponding to a candidate constraint condition includes fewer samples than a predetermined number, randomly generate a plurality of samples based on the target object, and train the white-box model using the sample subset corresponding to the candidate constraint condition, the randomly generated samples, and the classification results, based on the pre-trained black-box model, of the sample subset corresponding to the candidate constraint condition and of the randomly generated samples, so as to obtain a trained white-box model corresponding to the candidate constraint condition (see the sketch below). For example, the model training unit 108 may randomly generate the plurality of samples based on the target object using the LIME method. Further, for example, the number of randomly generated samples may be equal to or greater than the difference between the predetermined number and the number of samples included in the sample subset corresponding to the candidate constraint condition.
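By way of illustration and not limitation, this fallback could be sketched as follows: when the subset is below a minimum size, the target object is perturbed with Gaussian noise to form a LIME-style neighborhood, everything is labeled by the black box, and the white-box model is trained on the union; min_samples and scale are illustrative parameters, not values from the present disclosure.

```python
# Illustrative sketch: augmented training for undersized subsets.
import numpy as np
from sklearn.linear_model import LogisticRegression

def train_with_augmentation(subset, target, black_box,
                            min_samples=50, scale=0.1, seed=0):
    rng = np.random.default_rng(seed)
    X = np.asarray(subset, dtype=float)
    target = np.asarray(target, dtype=float)
    if len(X) < min_samples:
        noise = rng.normal(0.0, scale,
                           size=(min_samples - len(X), len(target)))
        X = np.vstack([X, target + noise])  # samples around the target
    y_black = black_box.predict(X)          # labels for real + synthetic
    return LogisticRegression().fit(X, y_black)
```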
By randomly generating a plurality of samples based on the target object in the case where the sample subset corresponding to a candidate constraint condition includes fewer samples than the predetermined number, and training the white-box model on the sample subset together with the randomly generated samples as described above, the classification performance of the trained white-box model with respect to the black-box model can be improved.

Note that although the information processing apparatus according to the present disclosure is mainly described above in conjunction with binary classification (i.e., positive-negative sample classification), it is to be understood that the information processing apparatus of the present disclosure may be used for other classifications, for example, classifying a target object into one of two or more classes.
Further, it is to be noted that partial units (i.e., the model fusion unit 114 and the constraint condition score calculation unit 116) are illustrated with a dashed-line box in fig. 1, which means that the information processing apparatus 100 according to some embodiments of the present disclosure may include the model fusion unit 114 and/or the constraint condition score calculation unit 116, while the information processing apparatus 100 according to other embodiments of the present disclosure may not include the model fusion unit 114 and/or the constraint condition score calculation unit 116.
Having described the information processing apparatus according to the embodiment of the present disclosure above with reference to fig. 1 to 5 and fig. 6A and 6B, the present disclosure also provides an embodiment of the following information processing method, corresponding to the embodiment of the information processing apparatus described above.
Fig. 7 is a flowchart illustrating an example of a flow of an information processing method 400 according to an embodiment of the present disclosure. As shown in fig. 7, an information processing method 400 according to an embodiment of the present disclosure may start at a start step S401 and end at an end step S416. The information processing method 400 may include a constraint condition generation step S402, a sample grouping step S404, a candidate constraint condition selection step S406, a model training step S408, a white-box model score calculation step S410, and an analysis result output step S412.
In the constraint condition generation step S402, a plurality of constraint conditions may be generated based on the sample set (global data). For example, the constraint condition generating step S402 may be implemented by the constraint condition generating unit 102 described above, and specific details are not described herein again.
By way of illustration and not limitation, the sample set may include training samples for which the classification result is known and/or randomly generated samples (samples for which the classification result is unknown).
In the sample grouping step S404, the sample set may be grouped into a plurality of sample subsets in one-to-one correspondence with a plurality of constraints based on the plurality of constraints. For example, the sample grouping step S404 can be implemented by the sample grouping unit 104 described above, and specific details are not described herein again.
In the candidate constraint condition selection step S406, one or more constraint conditions to which the target object meets may be selected from the plurality of constraint conditions as candidate constraint conditions. For example, the candidate constraint condition selecting step S406 may be implemented by the candidate constraint condition selecting unit 106 described above, and specific details are not described herein again.
For example, the target object may be a sample in a sample set, or the target object may not be a sample in a sample set.
In the model training step S408, for each of the candidate constraints, the white-box model may be trained using the sample subset corresponding to the candidate constraint and the classification result based on the pre-trained black-box model of the sample subset corresponding to the candidate constraint to obtain a trained white-box model corresponding to the candidate constraint. For example, the model training step S408 can be implemented by the model training unit 108 described above, and specific details are not described herein again.
By way of example, the white-box model may be a linear model, such as a general linear model, a logistic regression model, a Lasso regression model, and a poisson regression model, among others. The person skilled in the art can select a suitable white-box model according to the actual needs.
By way of illustration and not limitation, the pre-trained black-box model may be trained using the sample set described above. Note that the sample set used for training the black box model is not limited to the above sample set, and for example, the black box model may be trained using a sample set different from the above sample set to obtain the above pre-trained black box model.
In the white-box model score calculating step S410, a score of the trained white-box model may be calculated based on at least one of a confidence and a support of the constraint condition corresponding to the trained white-box model and a classification performance of the trained white-box model with respect to the pre-trained black-box model. The confidence of a constraint may be calculated as the ratio of the number of samples in the subset of samples corresponding to the constraint that are classified as positive samples based on a pre-trained black-box model to the total number of samples in the subset of samples. The support of a constraint may be a ratio of the number of samples in the subset of samples to which the constraint corresponds to the total number of samples in the set of samples.
For example, the classification performance of the trained white-box model relative to the pre-trained black-box model may be a measure of the ability of the trained white-box model to fit the pre-trained black-box model described above.
For example, the classification performance of a trained white-box model relative to a pre-trained black-box model can be characterized by accuracy, precision, or area under the Receiver Operating Characteristics (ROC) curve (AUC).
For example, the score of the trained white-box model may represent the degree to which the constraint corresponding to the trained white-box model positively affects the classification result. For example, the higher the score of the trained white-box model, the greater the degree of positive influence of the constraint corresponding to the trained white-box model on the classification result.
In the analysis result output step S412, the candidate constraint condition and the score of the trained white-box model corresponding to the candidate constraint condition may be output as the analysis result of the target object.
As described above, in order to analyze the specific cause of a classification result obtained by a classification model configured by a nonlinear model such as a neural network, many methods (for example, the LIME method) have been proposed to fit the black-box model and thereby interpret its classification results, i.e., to analyze the specific causes of the classification results. In the LIME method, new samples are randomly generated around the sample to be interpreted (i.e., the target object), classification results for these samples are obtained from the black-box model, and a linear model is trained to fit the black-box model using the randomly generated samples and the corresponding classification results, so as to achieve local interpretability. However, the LIME method has the following problem: for each classification result to be interpreted, samples must be randomly generated to train a corresponding model, which is time-consuming and labor-intensive and makes the method unsuitable for large-scale scenarios or scenarios with response-speed requirements.
Similar to the information processing apparatus according to the embodiment of the present disclosure, the information processing method according to the embodiment of the present disclosure may not require randomly generating samples to train a corresponding white-box model for each target object, so that the training process may be more efficient.
In addition, similar to the information processing apparatus according to the embodiment of the present disclosure, the information processing method according to the embodiment of the present disclosure trains the white-box model mainly using samples in the sample set, which are real samples and are more meaningful than randomly generated samples.
Further, similarly to the information processing apparatus according to the embodiment of the present disclosure, the information processing method according to the embodiment of the present disclosure may calculate the score of the trained white-box model based on at least one of the confidence and the support of the constraint condition corresponding to the trained white-box model and the classification performance of the trained white-box model with respect to the pre-trained black-box model, and output the candidate constraint condition and the score of the trained white-box model corresponding to the candidate constraint condition as the analysis result of the target object, so that the analysis result is easier to understand.
In addition, what is obtained by the LIME method is the importance of each feature quantity, and the correlation and mutual influence between feature quantities are not considered.
As an example, the at least one candidate constraint defines a value or range of values for two or more features. In this case, the trained white-box model for the at least one candidate constraint may capture the association and interplay between the two or more features, such that the fitting accuracy of the trained white-box model for the black-box model may be further improved.
According to an embodiment of the present disclosure, in the candidate constraint condition selection step S406, two or more constraint conditions that the target object meets may be selected from the plurality of constraint conditions as candidate constraint conditions, where at least two of the plurality of sample subsets overlap; in other words, the at least two sample subsets contain one or more identical samples.
The information processing method according to this embodiment of the present disclosure may generate a plurality of constraints based on a sample set, select a plurality of candidate constraints to which a target object conforms, and acquire a plurality of trained white-box models for the plurality of candidate constraints, so that a non-linear dependency may be captured, and thus a black-box model may be fitted with the plurality of trained white-box models (e.g., linear models) more accurately.
As described above, the LIME method in the prior art requires randomly generating samples to train a corresponding model for each classification result to be interpreted, which is time-consuming and labor-intensive. According to an embodiment of the present disclosure, the information processing method 400 may further include a model fusion step S414 (shown by a dashed box in fig. 7). In the model fusion step S414, the two or more trained white-box models obtained via the model training step S408 may be fused to obtain a final model of the target object. For example, the model fusion step S414 may be implemented by the model fusion unit 114 described above, and specific details are not described herein again.
According to an embodiment of the present disclosure, the information processing method 400 may further include a constraint condition score calculation step S405 (shown by a dashed box in fig. 7). In the constraint condition score calculation step S405, a score of each of two or more constraint conditions that the target object meets may be calculated based on: the confidence of the constraint condition; the support of the constraint condition and the classification result of the sample subset corresponding to the constraint condition based on the pre-trained black-box model; or the confidence and support of the constraint condition and the classification result of the sample subset corresponding to the constraint condition. In this case, in the white-box model score calculation step S410, the score of the trained white-box model may be calculated based on the classification performance of the trained white-box model with respect to the pre-trained black-box model and the score of the constraint condition corresponding to the trained white-box model. For example, the constraint condition score calculation step S405 may be implemented by the constraint condition score calculation unit 116 described above, and specific details are not described herein again.
According to an embodiment of the present disclosure, in the candidate constraint selecting step S406, the first n constraints with the largest score may be selected from two or more constraints to which the target object conforms as candidate constraints, a constraint with a score larger than a predetermined threshold may be selected from the two or more constraints as candidate constraints, or a non-repetitive constraint may be selected from the two or more constraints as candidate constraints, where n is a natural number larger than 0. Wherein a non-repeating constraint means that there is no overlap in the values or value ranges of the respective features defined by the respective constraints.
For example, the value of n can be set by those skilled in the art according to actual needs. Preferably, n is a natural number greater than 1.
By further selecting candidate constraints from the two or more constraints to which the target object meets in this way, the number of white-box models that need to be trained can be reduced, and thus training efficiency can be further improved.
According to an embodiment of the present disclosure, in the model training step S408, in the case where the sample subset corresponding to a candidate constraint condition includes fewer samples than a predetermined number, a plurality of samples may be randomly generated based on the target object, and the white-box model may be trained using the sample subset corresponding to the candidate constraint condition, the randomly generated samples, and the classification results, based on the pre-trained black-box model, of the sample subset corresponding to the candidate constraint condition and of the randomly generated samples, to obtain a trained white-box model corresponding to the candidate constraint condition.
By generating a plurality of samples at random based on the target object in the case that the sample subset corresponding to one candidate constraint condition includes a number of samples smaller than the predetermined number, and further training the white-box model based on the sample subset corresponding to the candidate constraint condition and the randomly generated plurality of samples to obtain a corresponding white-box model, as described above, the classification performance of the trained white-box model with respect to the black-box model can be improved.
Note that, in fig. 7, partial steps (i.e., the constraint condition score calculation step S405 and the model fusion step S414) are illustrated with a dashed box, which means that the information processing method 400 according to some embodiments of the present disclosure may include the constraint condition score calculation step S405 and/or the model fusion step S414, while the information processing method 400 according to another embodiment of the present disclosure may not include the constraint condition score calculation step S405 and/or the model fusion step S414.
In addition, note that although the information processing method according to the present disclosure is mainly described above in conjunction with binary classification (i.e., positive-negative sample classification), it is to be understood that the information processing method of the present disclosure may be used in the case of other classifications, for example, in the case of classifying a target object into one of two or more classes.
It should be noted that although the functional configurations and operations of the information processing apparatus and the information processing method according to the embodiments of the present disclosure are described above, this is merely an example and not a limitation, and a person skilled in the art may modify the above embodiments according to the principles of the present disclosure, for example, functional modules and operations in the respective embodiments may be added, deleted, or combined, and such modifications fall within the scope of the present disclosure.
In addition, it should be further noted that the method embodiments herein correspond to the apparatus embodiments described above, and therefore, the contents that are not described in detail in the method embodiments may refer to the descriptions of the corresponding parts in the apparatus embodiments, and the description is not repeated here.
In addition, the present disclosure also provides a storage medium and a program product. It should be understood that the machine-executable instructions in the storage medium and the program product according to the embodiments of the present disclosure may also be configured to perform the above-described information processing method, and thus, the contents not described in detail herein may refer to the description of the corresponding parts previously, and the description will not be repeated herein.
Accordingly, a storage medium for carrying the above-described program product comprising machine-executable instructions is also included in the present disclosure. Such storage media include, but are not limited to, floppy disks, optical disks, magneto-optical disks, memory cards, memory sticks, and the like.
Further, it should be noted that the above series of processes and apparatuses may also be implemented by software and/or firmware. In the case of implementation by software and/or firmware, a program constituting the software is installed from a storage medium or a network onto a computer having a dedicated hardware structure, such as the general-purpose personal computer 500 shown in fig. 8, which is capable of executing various functions when various programs are installed.
In fig. 8, a Central Processing Unit (CPU) 501 executes various processes in accordance with a program stored in a Read Only Memory (ROM) 502 or a program loaded from a storage section 508 into a Random Access Memory (RAM) 503. Data necessary for the CPU 501 to execute the various processes is also stored in the RAM 503 as needed.
The CPU 501, the ROM 502, and the RAM 503 are connected to one another via a bus 504. An input/output interface 505 is also connected to the bus 504.
The following components are connected to the input/output interface 505: an input section 506 including a keyboard, a mouse, and the like; an output section 507 including a display such as a Cathode Ray Tube (CRT) or a Liquid Crystal Display (LCD), a speaker, and the like; a storage section 508 including a hard disk and the like; and a communication section 509 including a network interface card such as a LAN card, a modem, and the like. The communication section 509 performs communication processing via a network such as the internet.
A drive 510 is also connected to the input/output interface 505 as needed. A removable medium 511 such as a magnetic disk, an optical disk, a magneto-optical disk, or a semiconductor memory is mounted on the drive 510 as needed, so that a computer program read therefrom is installed into the storage section 508 as needed.
In the case where the above-described series of processes is realized by software, a program constituting the software is installed from a network such as the internet or a storage medium such as the removable medium 511.
It should be understood by those skilled in the art that such a storage medium is not limited to the removable medium 511 shown in fig. 8, which stores the program and is distributed separately from the apparatus to provide the program to the user. Examples of the removable medium 511 include a magnetic disk (including a floppy disk (registered trademark)), an optical disk (including a Compact Disc Read Only Memory (CD-ROM) and a Digital Versatile Disc (DVD)), a magneto-optical disk (including a Mini Disk (MD) (registered trademark)), and a semiconductor memory. Alternatively, the storage medium may be the ROM 502, a hard disk included in the storage section 508, or the like, in which the program is stored and which is distributed to the user together with the apparatus containing it.
The preferred embodiments of the present disclosure have been described above with reference to the drawings, but the present disclosure is of course not limited to the above examples. Those skilled in the art may make various changes and modifications within the scope of the appended claims, and it should be understood that such changes and modifications naturally fall within the technical scope of the present disclosure.
For example, in the above embodiments, a plurality of functions included in one unit may be implemented by separate devices. Alternatively, a plurality of functions implemented by a plurality of units in the above embodiments may be implemented by separate devices, respectively. In addition, one of the above functions may be implemented by a plurality of units. Needless to say, such configurations are included in the technical scope of the present disclosure.
In this specification, the steps described in the flowcharts include not only processing performed in time series in the described order but also processing performed in parallel or individually, which need not necessarily be performed in time series. Furthermore, even for the steps processed in time series, the order may of course be changed as appropriate.
In addition, the technique according to the present disclosure can also be configured as set out in the following supplementary notes; an illustrative code sketch of the overall flow is given after the notes.
Supplementary note 1. An information processing apparatus, comprising:
a constraint condition generating unit configured to generate a plurality of constraint conditions based on the sample set;
a sample grouping unit configured to group the sample set into a plurality of sample subsets in one-to-one correspondence with the plurality of constraints based on the plurality of constraints;
a candidate constraint condition selection unit configured to select one or more constraint conditions to which a target object conforms as candidate constraint conditions from the plurality of constraint conditions;
a model training unit configured to train, for each of the candidate constraint conditions, a white-box model using the sample subset corresponding to the candidate constraint condition and the classification results produced by a pre-trained black-box model for that sample subset, to obtain a trained white-box model corresponding to the candidate constraint condition;
a white-box model score calculation unit configured to calculate a score of the trained white-box model based on at least one of the confidence and the support of the constraint condition corresponding to the trained white-box model, and on the classification performance of the trained white-box model with respect to the pre-trained black-box model; and
an analysis result output unit configured to output the candidate constraint condition and a score of the trained white-box model corresponding to the candidate constraint condition as an analysis result of the target object.
Supplementary note 2. The information processing apparatus according to Supplementary note 1,
wherein the candidate constraint condition selection unit is further configured to select, as candidate constraint conditions, two or more constraint conditions to which the target object conforms from among the plurality of constraint conditions; and
wherein at least two of the plurality of sample subsets overlap.
Supplementary note 3. The information processing apparatus according to Supplementary note 2, further comprising: a model fusion unit configured to fuse the two or more trained white-box models obtained via the model training unit to obtain a final model of the target object.
Supplementary note 4. The information processing apparatus according to Supplementary note 2 or 3, further comprising: a constraint condition score calculating unit configured to calculate, for each of the two or more constraint conditions to which the target object conforms, a score of the constraint condition based on:
the confidence of the constraint condition;
the support of the constraint condition and the classification results produced by the pre-trained black-box model for the sample subset corresponding to the constraint condition; or
the confidence and the support of the constraint condition and the classification results produced by the pre-trained black-box model for the sample subset corresponding to the constraint condition,
wherein the white-box model score calculation unit is further configured to calculate the score of the trained white-box model based on the classification performance of the trained white-box model with respect to the pre-trained black-box model and the score of the constraint condition corresponding to the trained white-box model.
Supplementary note 5. The information processing apparatus according to Supplementary note 4,
wherein the candidate constraint condition selection unit is further configured to select, as the candidate constraint conditions, the top n constraint conditions with the largest scores from among the two or more constraint conditions to which the target object conforms, constraint conditions whose scores are larger than a predetermined threshold from among the two or more constraint conditions, or non-repetitive constraint conditions from among the two or more constraint conditions, where n is a natural number larger than 0,
wherein constraint conditions are non-repetitive if the values or value ranges of the respective features they define do not overlap.
Supplementary note 6. The information processing apparatus according to any one of Supplementary notes 1 to 3, wherein the model training unit is further configured to: in a case where the sample subset corresponding to a candidate constraint condition includes fewer than a predetermined number of samples, randomly generate a plurality of samples based on the target object, and train the white-box model using the sample subset corresponding to the candidate constraint condition, the randomly generated samples, and the classification results produced by the pre-trained black-box model for both the sample subset and the randomly generated samples, to obtain a trained white-box model corresponding to the candidate constraint condition.
Supplementary note 7. The information processing apparatus according to any one of Supplementary notes 1 to 3, wherein the pre-trained black-box model is obtained by training using the sample set.
Supplementary note 8. The information processing apparatus according to any one of Supplementary notes 1 to 3, wherein the white-box model is a linear model.
Supplementary note 9. The information processing apparatus according to any one of Supplementary notes 1 to 3, wherein the sample set includes training samples whose classification results are known and/or randomly generated samples.
Supplementary note 10. The information processing apparatus according to any one of Supplementary notes 1 to 3, wherein at least one of the candidate constraint conditions defines values or value ranges of two or more features.
Supplementary note 11. An information processing method, comprising:
a constraint condition generating step of generating a plurality of constraint conditions based on the sample set;
a sample grouping step of grouping the sample set into a plurality of sample subsets in one-to-one correspondence with the plurality of constraints based on the plurality of constraints;
a candidate constraint condition selection step of selecting, from the plurality of constraint conditions, one or more constraint conditions to which the target object conforms as candidate constraint conditions;
a model training step of training, for each of the candidate constraint conditions, a white-box model using the sample subset corresponding to the candidate constraint condition and the classification results produced by a pre-trained black-box model for that sample subset, to obtain a trained white-box model corresponding to the candidate constraint condition;
a white-box model score calculating step of calculating a score of the trained white-box model based on at least one of the confidence and the support of the constraint condition corresponding to the trained white-box model, and on the classification performance of the trained white-box model with respect to the pre-trained black-box model; and
an analysis result output step of outputting the candidate constraint condition and the score of the trained white-box model corresponding to the candidate constraint condition as an analysis result of the target object.
Supplementary note 12. The information processing method according to Supplementary note 11,
wherein, in the candidate constraint condition selection step, two or more constraint conditions to which the target object conforms are selected from the plurality of constraint conditions as candidate constraint conditions; and
wherein at least two of the plurality of sample subsets overlap.
Supplementary note 13. The information processing method according to Supplementary note 12, further comprising: a model fusion step of fusing the two or more trained white-box models obtained via the model training step to obtain a final model of the target object.
Supplementary note 14. The information processing method according to Supplementary note 12 or 13, further comprising: a constraint condition score calculating step of calculating, for each of the two or more constraint conditions to which the target object conforms, a score of the constraint condition based on:
the confidence of the constraint condition;
the support of the constraint condition and the classification results produced by the pre-trained black-box model for the sample subset corresponding to the constraint condition; or
the confidence and the support of the constraint condition and the classification results produced by the pre-trained black-box model for the sample subset corresponding to the constraint condition,
wherein, in the white-box model score calculating step, the score of the trained white-box model is calculated based on the classification performance of the trained white-box model with respect to the pre-trained black-box model and the score of the constraint condition corresponding to the trained white-box model.
Supplementary note 15. The information processing method according to Supplementary note 14,
wherein, in the candidate constraint condition selection step, the top n constraint conditions with the largest scores are selected from among the two or more constraint conditions to which the target object conforms as the candidate constraint conditions, constraint conditions whose scores are larger than a predetermined threshold are selected from among the two or more constraint conditions as the candidate constraint conditions, or non-repetitive constraint conditions are selected from among the two or more constraint conditions as the candidate constraint conditions, where n is a natural number larger than 0,
wherein constraint conditions are non-repetitive if the values or value ranges of the respective features they define do not overlap.
Supplementary note 16. The information processing method according to any one of Supplementary notes 11 to 13, wherein, in the model training step, in a case where the sample subset corresponding to one candidate constraint condition includes fewer than a predetermined number of samples, a plurality of samples are randomly generated based on the target object, and the white-box model is trained using the sample subset corresponding to the candidate constraint condition, the randomly generated samples, and the classification results produced by the pre-trained black-box model for both the sample subset and the randomly generated samples, to obtain a trained white-box model corresponding to the candidate constraint condition.
Supplementary note 17. The information processing method according to any one of Supplementary notes 11 to 13, wherein the pre-trained black-box model is obtained by training using the sample set.
Supplementary note 18. The information processing method according to any one of Supplementary notes 11 to 13, wherein the white-box model is a linear model.
Supplementary note 19. The information processing method according to any one of Supplementary notes 11 to 13, wherein the sample set includes training samples whose classification results are known and/or randomly generated samples.
Supplementary note 20. A computer-readable storage medium storing program instructions that, when executed by a computer, perform the method according to any one of Supplementary notes 11 to 19.
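As a non-limiting illustration, the following Python sketch strings together the flow of Supplementary notes 1, 5, and 11: grouping samples by constraint condition, training one white-box model per candidate constraint condition on black-box classification results, scoring each model using confidence, support, and classification performance, and returning the top-scoring candidates. The dict-based constraint representation, the particular score combination, the reading of confidence used here, and all function names are assumptions of this sketch; the scikit-learn models merely stand in for the black-box and white-box models.

import numpy as np
from sklearn.ensemble import RandomForestClassifier  # stand-in black-box model
from sklearn.linear_model import LogisticRegression  # stand-in white-box (linear) model

def satisfies(constraint, x):
    # A constraint condition defines values or value ranges of one or more
    # features (Supplementary note 10), e.g. {0: (0.2, 0.8), 3: (1.0, 1.0)}.
    return all(lo <= x[f] <= hi for f, (lo, hi) in constraint.items())

def analyze(target_x, X, y, constraints, top_n=3):
    X = np.asarray(X, dtype=float)
    black_box = RandomForestClassifier().fit(X, y)   # pre-trained black-box model
    target_label = black_box.predict([target_x])[0]
    results = []
    # Candidate constraint conditions: those the target object conforms to.
    for c in (c for c in constraints if satisfies(c, target_x)):
        mask = np.array([satisfies(c, x) for x in X])  # sample grouping
        if not mask.any():
            continue
        Xs = X[mask]
        ys = black_box.predict(Xs)     # black-box classification results
        if np.unique(ys).size < 2:
            continue                   # a linear model needs both classes present
        support = mask.mean()          # fraction of all samples matching c
        confidence = (ys == target_label).mean()  # one plausible reading
        white_box = LogisticRegression().fit(Xs, ys)
        fidelity = white_box.score(Xs, ys)  # classification performance vs. black box
        results.append((c, white_box, fidelity * confidence * support))
    results.sort(key=lambda r: -r[2])
    return results[:top_n]  # (candidate constraint, white-box model, score)

A model fusion step (Supplementary notes 3 and 13) could then, for example, average the predictions of the returned white-box models to form a final model of the target object; that choice is likewise an assumption of this sketch rather than the disclosed implementation.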

Claims (10)

1. An information processing apparatus comprising:
a constraint condition generating unit configured to generate a plurality of constraint conditions based on the sample set;
a sample grouping unit configured to group the sample set into a plurality of sample subsets in one-to-one correspondence with the plurality of constraints based on the plurality of constraints;
a candidate constraint condition selection unit configured to select one or more constraint conditions to which a target object conforms as candidate constraint conditions from the plurality of constraint conditions;
a model training unit configured to train, for each of the candidate constraint conditions, a white-box model using the sample subset corresponding to the candidate constraint condition and the classification results produced by a pre-trained black-box model for that sample subset, to obtain a trained white-box model corresponding to the candidate constraint condition;
a white-box model score calculation unit configured to calculate a score of the trained white-box model based on at least one of the confidence and the support of the constraint condition corresponding to the trained white-box model, and on the classification performance of the trained white-box model with respect to the pre-trained black-box model; and
an analysis result output unit configured to output the candidate constraint condition and a score of the trained white-box model corresponding to the candidate constraint condition as an analysis result of the target object.
2. The information processing apparatus according to claim 1,
wherein the candidate constraint condition selection unit is further configured to select, as candidate constraint conditions, two or more constraint conditions to which the target object conforms from among the plurality of constraint conditions; and
wherein at least two of the plurality of sample subsets overlap.
3. The information processing apparatus according to claim 2, further comprising: a model fusion unit configured to fuse the two or more trained white-box models obtained via the model training unit to obtain a final model of the target object.
4. The information processing apparatus according to claim 2 or 3, further comprising: a constraint condition score calculating unit configured to calculate, for each of the two or more constraint conditions to which the target object conforms, a score of the constraint condition based on:
the confidence of the constraint condition;
the support of the constraint condition and the classification results produced by the pre-trained black-box model for the sample subset corresponding to the constraint condition; or
the confidence and the support of the constraint condition and the classification results produced by the pre-trained black-box model for the sample subset corresponding to the constraint condition,
wherein the white-box model score calculation unit is further configured to calculate the score of the trained white-box model based on the classification performance of the trained white-box model with respect to the pre-trained black-box model and the score of the constraint condition corresponding to the trained white-box model.
5. The information processing apparatus according to claim 4,
wherein the candidate constraint condition selection unit is further configured to select, as the candidate constraint conditions, the top n constraint conditions with the largest scores from among the two or more constraint conditions to which the target object conforms, constraint conditions whose scores are larger than a predetermined threshold from among the two or more constraint conditions, or non-repetitive constraint conditions from among the two or more constraint conditions, where n is a natural number larger than 0,
wherein constraint conditions are non-repetitive if the values or value ranges of the respective features they define do not overlap.
6. The information processing apparatus according to any one of claims 1 to 3, wherein the model training unit is further configured to: in a case where the sample subset corresponding to a candidate constraint condition includes fewer than a predetermined number of samples, randomly generate a plurality of samples based on the target object, and train the white-box model using the sample subset corresponding to the candidate constraint condition, the randomly generated samples, and the classification results produced by the pre-trained black-box model for both the sample subset and the randomly generated samples, to obtain a trained white-box model corresponding to the candidate constraint condition.
7. The information processing apparatus according to any one of claims 1 to 3, wherein the pre-trained black-box model is obtained by training using the sample set.
8. The information processing apparatus according to any one of claims 1 to 3, wherein the white-box model is a linear model.
9. An information processing method comprising:
a constraint condition generating step of generating a plurality of constraint conditions based on the sample set;
a sample grouping step of grouping the sample set into a plurality of sample subsets in one-to-one correspondence with the plurality of constraints based on the plurality of constraints;
a candidate constraint condition selection step of selecting, from the plurality of constraint conditions, one or more constraint conditions to which the target object conforms as candidate constraint conditions;
a model training step of training, for each of the candidate constraint conditions, a white-box model using the sample subset corresponding to the candidate constraint condition and the classification results produced by a pre-trained black-box model for that sample subset, to obtain a trained white-box model corresponding to the candidate constraint condition;
a white-box model score calculating step of calculating a score of the trained white-box model based on at least one of the confidence and the support of the constraint condition corresponding to the trained white-box model, and on the classification performance of the trained white-box model with respect to the pre-trained black-box model; and
an analysis result output step of outputting the candidate constraint condition and the score of the trained white-box model corresponding to the candidate constraint condition as an analysis result of the target object.
10. A computer-readable storage medium storing program instructions that, when executed by a computer, perform the method according to claim 9.
CN202010536138.9A (filed 2020-06-12): Information processing apparatus, information processing method, and computer-readable storage medium (legal status: Pending)

Priority Applications (2)

CN202010536138.9A (priority date 2020-06-12, filed 2020-06-12): Information processing apparatus, information processing method, and computer-readable storage medium
JP2021079504A (filed 2021-05-10): Information processing device, information processing method, and computer readable storage media

Applications Claiming Priority (1)

CN202010536138.9A (priority date 2020-06-12, filed 2020-06-12): Information processing apparatus, information processing method, and computer-readable storage medium

Publications (1)

CN113807374A, published 2021-12-17

Family

ID=78944050

Family Applications (1)

CN202010536138.9A (priority date 2020-06-12, filed 2020-06-12, status Pending): Information processing apparatus, information processing method, and computer-readable storage medium

Country Status (2)

JP: JP2021197164A
CN: CN113807374A

Families Citing this family (1)

JP7145307B1 (priority date 2021-08-05, published 2022-09-30, assignee マレリ株式会社): Exhaust gas treatment device

Also Published As

JP2021197164A, published 2021-12-27

Legal Events

PB01: Publication
SE01: Entry into force of request for substantive examination