Disclosure of Invention
To address the shortcomings of the prior art, the invention discloses an expert-rule-based method, system, and device for tracing the root cause of artificial intelligence model prediction results.
To solve the above problems, the technical solution of the invention is as follows:
A first aspect of the embodiments of the invention provides an expert-rule-based method for tracing the root cause of artificial intelligence model prediction results, which comprises the following steps:
Step S1: collecting vehicle insurance case history data and constructing a vehicle insurance history case database;
Step S2: constructing a graph network from the graph structure data obtained in step S1, training a network representation learning model on the graph network to obtain expert factor vectors, forming expert rule vectors by concatenating or averaging the expert factor vectors, and storing the expert factor vectors and the expert rule vectors in a rule vector database;
Step S3: acquiring real-time vehicle insurance data for a case judged by the artificial intelligence model to be a risk case, extracting risk factors and expert factors to obtain the set of triggered expert factors, and obtaining quasi-trigger rule vectors by comparing against the expert rules of step S2 and applying default filling;
Step S4: calculating the similarity between the expert rule vectors and the quasi-trigger rule vectors;
Step S5: providing a tracing result for the real-time vehicle insurance case according to the similarity result.
Further, step S1 specifically comprises: first collecting vehicle insurance case history data to construct the vehicle insurance history case database; then extracting the fields associated with the expert rules from that database to form a risk factor data set; then extracting expert factors, i.e., applying the judgment required by each factor of the expert rules to the fields of the risk factor data set to generate new fields that form an expert factor data set; and finally converting the expert factor data set into graph structure data.
Further, the graph structure data is a triplet, an edge table, or an adjacency matrix.
Further, the rule vectors in step S2 may also be computed by other methods that convert discrete variables into continuous vector representations, such as an autoencoder or an embedding.
Further, the default filling in step S3 means filling a default value, usually a zero vector, for each expert factor in a quasi-trigger rule that has not been triggered.
Further, step S4 calculates the similarity using cosine similarity.
A second aspect of the embodiments of the present invention provides an artificial intelligence model result tracing system based on expert rules, the system comprising:
a vehicle insurance history case database, used for storing vehicle insurance case history data;
a risk factor extraction unit, used for forming a risk factor data set by finding the intrinsic/extrinsic relations between the expert rules and the historical data in the vehicle insurance history case database;
an expert factor extraction unit, which applies judgment processing to the risk factor data set according to the expert factors in the expert rules to obtain new expert-factor-oriented fields that form the expert factor data set;
a graph structure data generating unit, used for converting the expert factor data set into a data format suitable as input to network representation learning;
a network representation learning training unit, which trains on the data using an unsupervised or self-supervised graph representation learning method to obtain expert factor vectors;
a rule vector database, used for storing the expert factor vectors, concatenating the expert factor vectors to obtain expert rule vectors, and obtaining quasi-trigger rule vectors by default filling;
a rule vector similarity calculation unit, used for calculating the similarity between the expert rule vectors and the quasi-trigger rule vectors;
a decision unit, which provides tracing for the case according to the similarity between the expert rule vectors and the quasi-trigger rule vectors.
A third aspect of the embodiments of the present invention provides an artificial intelligence model result tracing device based on expert rules, which includes one or more processors configured to implement the above-mentioned artificial intelligence model result tracing method based on expert rules.
A fourth aspect of an embodiment of the present invention provides a computer readable storage medium having a program stored thereon, which when executed by a processor, is configured to implement the above-described expert rule-based artificial intelligence model result tracing method.
The beneficial effects of the invention are as follows. The method assists the artificial intelligence model by additionally indicating possible causes of fraud, giving vehicle insurance experts and investigators a line of inquiry so that investigators can examine cases in a targeted manner, which improves investigation efficiency. The invention traces the risk cases flagged by the artificial intelligence model automatically and efficiently, and because the tracing is grounded in expert rules, tracing quality is maintained while tracing time is reduced.
Detailed Description
Reference will now be made in detail to exemplary embodiments, examples of which are illustrated in the accompanying drawings. When the following description refers to the accompanying drawings, the same numbers in different drawings refer to the same or similar elements, unless otherwise indicated. The implementations described in the following exemplary examples do not represent all implementations consistent with the invention. Rather, they are merely examples of apparatus and methods consistent with aspects of the invention as detailed in the accompanying claims.
The terminology used herein is for the purpose of describing particular embodiments only and is not intended to be limiting of the invention. As used in this specification and the appended claims, the singular forms "a," "an," and "the" are intended to include the plural forms as well, unless the context clearly indicates otherwise. It should also be understood that the term "and/or" as used herein refers to and encompasses any or all possible combinations of one or more of the associated listed items.
It should be understood that although the terms first, second, third, etc. may be used herein to describe various information, this information should not be limited by these terms. These terms are only used to distinguish one type of information from another. For example, first information may also be referred to as second information, and similarly, second information may also be referred to as first information, without departing from the scope of the invention. The term "if" as used herein may be interpreted as "when", "upon", or "in response to a determination", depending on the context.
The following description of embodiments of the present invention will be made clearly and fully with reference to the accompanying drawings, in which it is evident that the embodiments described are only some, but not all embodiments of the invention. All other embodiments, which can be made by those skilled in the art based on the embodiments of the invention without making any inventive effort, are intended to be within the scope of the invention.
The invention provides an expert-rule-based method for tracing artificial intelligence model results (shown in FIG. 1), which comprises the following steps:
Step S1: vehicle insurance case history data are collected from an insurance company, and a vehicle insurance history case database is constructed. The vehicle insurance case history data are converted into graph structure data according to expert rules, with the following specific steps:
Step S101: a vehicle insurance history case database is built from the vehicle insurance case history data, which cover the stages of underwriting, accident occurrence, claim reporting, investigation, damage assessment, price verification, loss verification, and so on. To ensure reliable tracing, the data used are those that triggered expert rules. The database must support incremental updating, i.e., new data continuously enter it over time.
Step S102: field data associated with the expert rules are extracted to form a risk factor data set by finding the intrinsic/extrinsic relations between the expert rules and the original fields in the vehicle insurance history case database. The expert rules are derived from vehicle insurance experts' empirical summaries of vehicle insurance fraud cases. An expert rule usually consists of one or more expert factors and is triggered only when all of its expert factors are triggered. The intrinsic/extrinsic relations between the expert rules and the original fields are established through the condition factors of the expert rules: if computing a condition factor requires a complex judgment over one or more original fields (for example, calculating the distance between the accident location and the repair shop), the relation is called intrinsic; if it requires only an ordinary judgment over one or more original fields (for example, whether the accident is a single-vehicle accident), the relation is called extrinsic. The risk factor data set is usually the set of fields in the vehicle insurance history case database that are associated with the rules, i.e., a subset of the vehicle insurance case data.
Step S103: a corresponding judgment operation is performed on each field in the risk factor data set according to the expert factors that make up the expert rules, yielding new expert-factor-oriented fields that form the expert factor data set. The expert factors are the components of an expert rule; each expert factor corresponds to one judgment condition, and the rule is triggered when all of its expert factors are triggered. The expert factor data set is the new data set, with expert factors as its fields, obtained by processing each field of the risk factor data set according to the judgment conditions of the expert factors.
Step S104, converting the expert factor data set into graph structure data suitable for network representation learning. The graph structure data needs to be in a data format conforming to network representation learning, including but not limited to a triplet, an edge table, an adjacency matrix and the like.
Step S2: a graph network is constructed from the graph structure data obtained in step S1, and a network representation learning model is trained on the graph network to obtain expert factor vectors. The expert factor vectors are concatenated (or averaged) to obtain expert rule vectors, and both the expert factor vectors and the expert rule vectors are stored in a rule vector database. Here "concatenation" is used in a broad sense: any method that combines one or more vectors into a single fixed-dimension vector is applicable, including but not limited to vector concatenation and vector averaging. To keep the rule vector database valid, the expert factor vectors and expert rule vectors are retrained and updated periodically as the vehicle insurance history case database is updated.
Step S3: real-time vehicle insurance data for cases judged by the artificial intelligence model to be risk cases are collected from the insurance company to obtain the set of triggered expert factors; then, using the rule vector database built in step S2, the expert rules are compared and quasi-trigger rule vectors are obtained by default filling. The specific steps are as follows:
Step S301: the real-time vehicle insurance data judged by the artificial intelligence model to be a risk case are parsed, and the set of triggered expert factors is computed based on steps S102 and S103. Parsing means converting the real-time data into the format of the vehicle insurance case history data; if the two formats are already the same, this step is unnecessary.
Step S302: based on the expert factor vectors obtained by pre-training the network representation learning model in step S2, quasi-trigger rule vectors are obtained by default filling. Default filling means filling a default value, usually a zero vector, for each expert factor in a quasi-trigger rule that has not been triggered. Quasi-trigger rules are all expert rules that have not been triggered, that is, rules for which not every expert factor has been triggered.
Step S4: the similarity between the expert rule vectors and the quasi-trigger rule vectors obtained in step S3 is calculated. The similarity measure includes but is not limited to vector similarity measures such as cosine similarity.
Step S5: a tracing approach is provided to the vehicle insurance expert according to the similarity between the expert rule vectors and the quasi-trigger rule vectors, including but not limited to returning the quasi-trigger rule with the highest similarity or returning the quasi-trigger rules selected by threshold screening, thereby providing a basis for assessing the real-time vehicle insurance data.
The invention also provides an expert-rule-based artificial intelligence model result tracing system (shown in FIG. 2), which comprises a vehicle insurance history case database, a risk factor extraction unit, an expert factor extraction unit, a graph structure data generating unit, a network representation learning training unit, a rule vector database, a rule vector similarity calculation unit, and a decision unit.
The vehicle insurance history case database is used for storing vehicle insurance case history data and supporting incremental update of the data, namely, new data are continuously entered into the vehicle insurance history case database along with the time.
The risk factor extraction unit extracts the field data associated with the expert rules by finding the intrinsic/extrinsic relations between the expert rules and the original fields of the historical data in the vehicle insurance history case database, and forms a risk factor data set.
The expert factor extraction unit carries out corresponding judgment processing on the risk factor data set extracted by the risk factor extraction unit according to the expert factors in the expert rules to obtain new fields which take the expert factors as guidance, and forms an expert factor data set.
The graph structure data generating unit is used for converting the expert factor data set into data formats of triples, edge tables, adjacency matrixes and the like which are suitable for input required by the network representation learning training unit.
The network representation learning training unit adopts an unsupervised or self-supervised graph representation learning method to train data, and an expert factor vector is obtained.
The rule vector database is used for storing the training results of the network representation learning training unit, namely the expert factor vectors; it also concatenates the expert factor vectors to form expert rule vectors and obtains quasi-trigger rule vectors by default filling. To keep the data valid, the rule vector database is updated periodically in step with the incremental updates of the vehicle insurance history case database; in addition, it supports incremental updating of its own data to accommodate changes in the expert rules.
The rule vector similarity calculation unit is used for calculating the similarity between the expert rule vector and the quasi-trigger rule vector output by the rule vector database.
The decision unit provides tracing for the case according to the similarity condition of the expert rule vector and the quasi-trigger rule vector.
Example 1
The expert-rule-based artificial intelligence model result tracing method of the invention comprises the following 5 steps:
(1) Collecting vehicle insurance case history data and converting them into graph structure data according to expert rules, with the following specific steps:
(1.1) Risk factor dataset extraction.
The vehicle insurance case history data cover every stage from underwriting to claim settlement, whereas expert rules, being summaries of professionals' experience, usually focus only on certain anomalies in the data, i.e., the values of certain specific fields. To facilitate subsequent work, the fields associated with expert factors must be extracted from the vehicle insurance history data according to the existing expert rules.
In this embodiment, 10000 vehicle insurance case history records are used as data samples. To describe the implementation in detail, one expert rule, two historical vehicle insurance cases, and one real-time vehicle insurance case are selected as concrete examples:
First rule: single-vehicle accident + insured vehicle + rollover + estimated loss amount [30000, +∞) + insured vehicle age in years [7, +∞) + no personal injury → false rollover
First data:
{
Case number: first case number
Insured vehicle: yes
Accident type: single-vehicle accident
Accident cause: rollover
Estimated amount: 83800
Vehicle age: 8
Fee name: ternary parts
Fee name: outer taillight
...
}
Second data:
{
Case number: second case number
Insured vehicle: yes
Accident type: two-vehicle accident
Accident cause: rollover
Estimated amount: 45000
Vehicle age: 10
Fee name: headlight
Fee name: sheet metal paint
...
}
Taking the first rule as an example, a comparison with the first data and the second data shows that both contain 5 fields involved in ordinary judgments of expert factors (accident type: associated with the expert factor single-vehicle accident; insured vehicle: associated with the expert factor insured vehicle; accident cause: associated with the expert factor rollover; estimated amount: associated with the expert factor exceeding estimated amount_30000; vehicle age: associated with the expert factor exceeding vehicle age_7) and 1 field involved in a complex judgment of an expert factor (fee name: associated with the expert factor no personal injury, where personal injury is determined by whether the fee names include any medical expense item). These fields make up the risk factor data set.
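The field selection described above can be sketched as a simple projection. This is a minimal illustration, not the patented implementation: the dictionary keys are hypothetical English field names, the repeated "Fee name" entries are modelled as a list, and `report_channel` is an invented field that stands in for data unrelated to the rule.

```python
# Hypothetical field names for the six fields the first rule reads.
RULE_FIELDS = [
    "insured_vehicle", "accident_type", "accident_cause",
    "estimated_amount", "vehicle_age", "fee_names",
]

def extract_risk_factors(case: dict) -> dict:
    """Keep the case number plus only the fields read by some expert factor."""
    kept = {"case_number": case["case_number"]}
    for field in RULE_FIELDS:
        if field in case:
            kept[field] = case[field]
    return kept

# The second data from the text, plus one invented unrelated field.
second_case = {
    "case_number": "second case number",
    "insured_vehicle": True,
    "accident_type": "two-vehicle accident",
    "accident_cause": "rollover",
    "estimated_amount": 45000,
    "vehicle_age": 10,
    "fee_names": ["headlight", "sheet metal paint"],
    "report_channel": "phone",  # hypothetical, not used by any expert factor
}

risk_factors = extract_risk_factors(second_case)
```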
(1.2) Expert factor dataset extraction.
After the risk factor data set is obtained, each field in the risk factor data set needs to be processed according to the judging condition of the expert factor, so that a new data set taking the expert factor as the field is obtained. Taking the second data as an example, the new data formed is as follows:
{
Case number: second case number
Insured vehicle: yes
Single-vehicle accident: no
Rollover: yes
Exceeding estimated amount_30000: yes
Exceeding vehicle age_7: yes
No personal injury: yes
}
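The judgment step can be sketched as one boolean test per expert factor. The sketch below makes several assumptions that the text does not state: the thresholds are treated as inclusive (amount of at least 30000, age of at least 7 years), personal injury is detected by a keyword match on fee names, and all key names are hypothetical.

```python
# Assumption: a fee name containing one of these keywords is a medical expense item.
MEDICAL_KEYWORDS = ("medical", "hospital")

def extract_expert_factors(case: dict) -> dict:
    """Turn risk factor fields into one boolean field per expert factor."""
    has_injury = any(
        keyword in fee_name
        for fee_name in case["fee_names"]
        for keyword in MEDICAL_KEYWORDS
    )
    return {
        "insured_vehicle": case["insured_vehicle"],
        "single_vehicle_accident": case["accident_type"] == "single-vehicle accident",
        "rollover": case["accident_cause"] == "rollover",
        "exceeding_estimated_amount_30000": case["estimated_amount"] >= 30000,
        "exceeding_vehicle_age_7": case["vehicle_age"] >= 7,
        "no_personal_injury": not has_injury,  # the "complex" judgment
    }

second_case = {
    "case_number": "second case number",
    "insured_vehicle": True,
    "accident_type": "two-vehicle accident",
    "accident_cause": "rollover",
    "estimated_amount": 45000,
    "vehicle_age": 10,
    "fee_names": ["headlight", "sheet metal paint"],
}

factors = extract_expert_factors(second_case)
```

For the second data, every factor evaluates to true except `single_vehicle_accident`, matching the record shown above.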
(1.3) Graph structure data conversion.
In this example, the graph structure data are stored as an edge list (edgelist), with case numbers and expert factors as nodes. Taking the second data as an example, each expert factor whose field value is true is stored as an entry in the edge list, which should therefore contain the following entries:
(second case number, insured vehicle)
(second case number, rollover)
(second case number, exceeding estimated amount_30000)
(second case number, exceeding vehicle age_7)
(second case number, no personal injury)
The value of the single-vehicle accident field is no, i.e., the second data does not trigger this factor, so it is not included in the edge entries.
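The conversion to an edge list can be sketched in a few lines. The function name and the snake_case factor names are hypothetical; the only behaviour taken from the text is that a (case number, expert factor) edge is emitted for each factor whose value is true.

```python
def to_edge_list(case_number: str, factors: dict) -> list:
    """Emit one (case, factor) edge for every expert factor that is triggered."""
    return [(case_number, name) for name, triggered in factors.items() if triggered]

# Expert factor record of the second data, as listed in the text.
second_factors = {
    "insured_vehicle": True,
    "single_vehicle_accident": False,
    "rollover": True,
    "exceeding_estimated_amount_30000": True,
    "exceeding_vehicle_age_7": True,
    "no_personal_injury": True,
}

edges = to_edge_list("second case number", second_factors)
for edge in edges:
    print(edge)
```

Because both case numbers and expert factors become nodes, the resulting graph is bipartite: cases connect only to factors, and two factors are related through the cases that trigger both.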
(2) Constructing a network from the graph structure data, training a network representation learning model on it to obtain expert factor vectors, and then deriving the expert rule vectors. In this embodiment, 10000 graph data records are used to construct the network. Taking the first data, the second data, and the first rule as examples, the structure of the graph network is shown in FIG. 3.
As shown in FIG. 3, the nodes above the dotted line are case number nodes, and the nodes below the dotted line are expert factor nodes. In this embodiment, DeepWalk is trained on the graph structure data to obtain the expert factor vectors, and vector averaging is used to compute the expert rule vectors. Taking the first rule as an example, its expert factor vectors are listed below; the expert rule vector is obtained by averaging them:
| Expert factor | Vector value |
| --- | --- |
| Single-vehicle accident | (0.12, 0.05, ..., 0.77) |
| Insured vehicle | (0.24, 0.35, ..., 0.99) |
| Rollover | (0.35, 0.45, ..., 0.16) |
| Exceeding estimated amount_30000 | (0.72, 0.32, ..., 0.27) |
| Exceeding vehicle age_7 | (0.93, 0.48, ..., 0.47) |
| No personal injury | (0.64, 0.35, ..., 0.67) |
The expert factor vector and expert rule vector will be stored in the rule vector database for subsequent calculation.
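The averaging step can be illustrated with the values from the table. As a simplification, the sketch keeps only the three components that the table prints for each factor and omits the elided middle components, so these are truncated stand-ins rather than the real embeddings; the names are hypothetical.

```python
def average_vectors(vectors: list) -> list:
    """Component-wise mean of equal-length vectors."""
    dim = len(vectors[0])
    return [sum(v[i] for v in vectors) / len(vectors) for i in range(dim)]

# Truncated factor vectors: first, second, and last printed components only.
factor_vectors = {
    "single_vehicle_accident": [0.12, 0.05, 0.77],
    "insured_vehicle": [0.24, 0.35, 0.99],
    "rollover": [0.35, 0.45, 0.16],
    "exceeding_estimated_amount_30000": [0.72, 0.32, 0.27],
    "exceeding_vehicle_age_7": [0.93, 0.48, 0.47],
    "no_personal_injury": [0.64, 0.35, 0.67],
}

# The first rule is made up of all six factors.
first_rule_factors = list(factor_vectors)
rule_vector = average_vectors([factor_vectors[f] for f in first_rule_factors])
print([round(x, 3) for x in rule_vector])
```

On these three components the average comes to approximately (0.50, 0.33, 0.56), which agrees with the expert rule 1 vector quoted in step (4).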
(3) Collecting real-time vehicle insurance data for a case judged by the artificial intelligence model to be a risk case, and calculating the quasi-trigger rule vector, with the following specific steps:
(3.1) Calculating the set of expert factors triggered by the real-time vehicle insurance data.
Consider the third data, a real-time vehicle insurance case:
{
Case number: third case number
Insured vehicle: yes
Accident type: single-vehicle accident
Accident cause: collision
Estimated amount: 83800
Vehicle age: 12
Fee name: back wall
Fee name: trunk lid
...
}
Against the first rule, the triggered expert factor set (single-vehicle accident, insured vehicle, exceeding vehicle age_7, exceeding estimated amount_30000, no personal injury) is obtained by applying step (1).
(3.2) Querying the rule vector database and obtaining the quasi-trigger rule vector by default filling.
After the triggered expert factor set is obtained, the corresponding vector values are retrieved by querying the rule vector database, and the quasi-trigger rule vector is obtained by vector concatenation or averaging. In this example, vector averaging is used to compute the quasi-trigger rule vector. Because the expert factor rollover is not triggered, it is filled with the default value according to the default filling rule, i.e., a zero vector.
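Default filling followed by averaging can be sketched as below. The sketch assumes that the zero vector takes the untriggered factor's place and that the mean is still taken over all six factors of the rule; as before, the vectors keep only the three components printed in the table, and all names are hypothetical.

```python
def quasi_trigger_rule_vector(rule_factors, triggered, factor_vectors):
    """Average the rule's factor vectors, zero-filling factors not triggered."""
    dim = len(next(iter(factor_vectors.values())))
    zero = [0.0] * dim  # default value for an untriggered expert factor
    filled = [factor_vectors[f] if f in triggered else zero for f in rule_factors]
    return [sum(v[i] for v in filled) / len(filled) for i in range(dim)]

# Truncated factor vectors (printed components of the table only).
factor_vectors = {
    "single_vehicle_accident": [0.12, 0.05, 0.77],
    "insured_vehicle": [0.24, 0.35, 0.99],
    "rollover": [0.35, 0.45, 0.16],
    "exceeding_estimated_amount_30000": [0.72, 0.32, 0.27],
    "exceeding_vehicle_age_7": [0.93, 0.48, 0.47],
    "no_personal_injury": [0.64, 0.35, 0.67],
}
first_rule_factors = list(factor_vectors)

# The third data triggers every factor of the first rule except rollover.
triggered = set(first_rule_factors) - {"rollover"}

quasi_vector = quasi_trigger_rule_vector(first_rule_factors, triggered, factor_vectors)
print([round(x, 3) for x in quasi_vector])
```

Under these assumptions the result is approximately (0.44, 0.26, 0.53), consistent with the quasi-trigger rule 1 vector quoted in step (4). Dividing by the number of triggered factors instead of all six would give a different vector.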
(4) Calculating the similarity between the expert rule vector $(v_1, v_2, \ldots, v_n)$ and the quasi-trigger rule vector $(v_1', v_2', \ldots, v_n')$.
In this embodiment, cosine similarity is used to calculate the similarity of the vectors, and the calculation formula is as follows:
$$\cos\theta = \frac{\sum_{i=1}^{n} v_i \, v_i'}{\sqrt{\sum_{i=1}^{n} v_i^{2}} \; \sqrt{\sum_{i=1}^{n} (v_i')^{2}}}$$
where $v_1, v_2, \ldots, v_n$ denote the values of the 1st, 2nd, ..., n-th dimensions of the expert rule vector, and $v_1', v_2', \ldots, v_n'$ denote the values of the 1st, 2nd, ..., n-th dimensions of the quasi-trigger rule vector. The similarity of the first expert rule and the first quasi-trigger rule is calculated as follows:
Given expert rule 1 = (0.50, 0.33, ..., 0.56) and quasi-trigger rule 1 = (0.44, 0.26, ..., 0.53), the similarity of the two is obtained by substituting these vectors into the above formula.
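Cosine similarity can be computed with the standard library alone. The sketch below uses only the three components printed for each vector, so the value it produces illustrates the calculation and is not the similarity of the full n-dimensional vectors, whose middle components are elided in the text.

```python
import math

def cosine_similarity(a: list, b: list) -> float:
    """Dot product divided by the product of the Euclidean norms."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0  # convention chosen here for an all-zero vector
    return dot / (norm_a * norm_b)

# Printed components only; the elided middle components are omitted.
expert_rule_1 = [0.50, 0.33, 0.56]
quasi_trigger_rule_1 = [0.44, 0.26, 0.53]

similarity = cosine_similarity(expert_rule_1, quasi_trigger_rule_1)
print(round(similarity, 3))
```

Returning 0.0 for a zero vector is a design choice of this sketch: a quasi-trigger rule with no triggered factors is entirely zero-filled, and the formula would otherwise divide by zero.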
(5) Providing a tracing result for the vehicle insurance expert, either by finding the quasi-trigger rule with the highest similarity or by returning the quasi-trigger rules with high similarity based on threshold screening. In this example, threshold screening is adopted with a threshold of 0.8, and the first rule is returned as the root cause tracing result.
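Both decision strategies can be sketched as follows. The similarity scores are invented for illustration and the function names are hypothetical; only the threshold of 0.8 comes from the example.

```python
def screen_by_threshold(similarities: dict, threshold: float = 0.8) -> list:
    """Return (rule, score) pairs at or above the threshold, best first."""
    hits = [(rule, s) for rule, s in similarities.items() if s >= threshold]
    return sorted(hits, key=lambda pair: pair[1], reverse=True)

def most_similar_rule(similarities: dict) -> str:
    """Return the single quasi-trigger rule with the highest similarity."""
    return max(similarities, key=similarities.get)

# Hypothetical similarity scores for three quasi-trigger rules.
similarities = {"first rule": 0.93, "second rule": 0.41, "third rule": 0.78}

traced = screen_by_threshold(similarities, threshold=0.8)
best = most_similar_rule(similarities)
print(traced, best)
```

Threshold screening may return several rules or none, whereas the highest-similarity strategy always returns exactly one; which fits better depends on how many candidate causes investigators want to review.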
Corresponding to the above embodiment of the expert-rule-based method for tracing the root cause of artificial intelligence model prediction results, the invention also provides an embodiment of a corresponding root cause tracing device.
Referring to FIG. 4, the expert-rule-based device for tracing the root cause of artificial intelligence model prediction results provided by the embodiment of the invention comprises one or more processors configured to implement the root cause tracing method of the above embodiment.
The device embodiment can be applied to any equipment with data processing capability, such as a computer. The device embodiment may be implemented in software, in hardware, or in a combination of the two. Taking a software implementation as an example, the device in a logical sense is formed when the processor of the equipment on which it resides reads the corresponding computer program instructions from nonvolatile memory into memory and runs them. At the hardware level, FIG. 4 shows the hardware structure of the equipment with data processing capability on which the tracing device resides; in addition to the processor, memory, network interface, and nonvolatile memory shown in FIG. 4, the equipment may include other hardware according to its actual function, which is not described here.
The implementation process of the functions and roles of each unit in the above device is specifically shown in the implementation process of the corresponding steps in the above method, and will not be described herein again.
Since the device embodiments essentially correspond to the method embodiments, reference is made to the description of the method embodiments for the relevant points. The device embodiments described above are merely illustrative: units described as separate components may or may not be physically separate, and components shown as units may or may not be physical units; they may be located in one place or distributed over a plurality of network units. Some or all of the modules may be selected according to actual needs to achieve the purposes of the invention. Those of ordinary skill in the art can understand and implement the invention without creative effort.
The embodiment of the invention also provides a computer readable storage medium, wherein a program is stored on the computer readable storage medium, and when the program is executed by a processor, the method for tracing the root cause of the prediction result of the artificial intelligent model based on expert rules in the embodiment is realized.
The computer readable storage medium may be an internal storage unit, such as a hard disk or a memory, of any device with data processing capability described in any of the previous embodiments. The computer readable storage medium may also be an external storage device of such a device, such as a plug-in hard disk, a Smart Media Card (SMC), an SD card, or a flash card provided on the device. Further, the computer readable storage medium may include both an internal storage unit and an external storage device of the device. The computer readable storage medium is used for storing the computer program and other programs and data required by the device, and may also be used for temporarily storing data that has been output or is to be output.
The above examples use only graph network representation learning as an implementation; other methods suitable for converting expert rules and expert factors into vector representations, such as an autoencoder (AutoEncoder) or an embedding (Embedding), are also within the scope of the invention. Therefore, all equivalent changes or modifications made according to the principles and design ideas of the present invention are within the scope of the present invention.