WO2023042301A1 - Information processing device, information processing method, and recording medium - Google Patents
- Publication number
- WO2023042301A1 (PCT/JP2021/033932)
- Authority
- WO
- WIPO (PCT)
- Prior art keywords
- prediction
- data
- input data
- abnormal values
- absence
- Prior art date
Classifications
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N20/00—Machine learning
Definitions
- This disclosure relates to monitoring machine learning models.
- Patent Literature 1 discloses a method of generating an inspector model for detecting deterioration in accuracy of an operational model and using this to detect changes in output results of the operational model due to temporal changes in data trends.
- One object of the present disclosure is to provide an information processing device capable of creating and presenting diagnostic information regarding a plurality of factors affecting prediction accuracy for a model in operation.
- An information processing device includes: input data acquisition means for acquiring a plurality of input data; input data evaluation means for detecting outliers and abnormal values in the input data; prediction data acquisition means for acquiring prediction data generated from the input data using a trained model; prediction data evaluation means for detecting abnormal values in the prediction data; prediction accuracy acquisition means for acquiring the prediction accuracy of the trained model; and display means for displaying diagnostic information including the presence or absence of outliers and abnormal values in the input data, the presence or absence of abnormal values in the prediction data, and the prediction accuracy.
- An information processing method includes: acquiring a plurality of input data; detecting outliers and abnormal values in the input data; acquiring prediction data generated from the input data using a trained model; detecting abnormal values in the prediction data; acquiring the prediction accuracy of the trained model; and displaying diagnostic information including the presence or absence of outliers and abnormal values in the input data, the presence or absence of abnormal values in the prediction data, and the prediction accuracy.
- A recording medium records a program for causing a computer to execute a process of: acquiring a plurality of input data; detecting outliers and abnormal values in the input data; acquiring prediction data generated from the input data using a trained model; detecting abnormal values in the prediction data; acquiring the prediction accuracy of the trained model; and displaying diagnostic information including the presence or absence of outliers and abnormal values in the input data, the presence or absence of abnormal values in the prediction data, and the prediction accuracy.
- FIG. 1 is a block diagram showing the overall configuration of a monitoring system 1 according to the first embodiment.
- the monitoring system 1 is a system that monitors the state of a machine learning model that has been learned in advance and is in operation.
- the monitoring system 1 includes a prediction device 2 and a monitoring device 100 .
- the prediction device 2 is a device that makes predictions using a prediction model.
- a prediction model is an example of a machine learning model to be monitored by the monitoring system 1, and is a model that has already been trained using learning data.
- the prediction device 2 makes a prediction based on the input data D1, generates prediction data D2 as a prediction result, and outputs the prediction data D2 to the monitoring device 100.
- The monitoring device 100 evaluates whether the input data D1 is normal and whether the prediction data D2 generated by the prediction device 2 is normal. Performance data D3 is also input to the monitoring device 100. The performance data D3 is data that corresponds to the input data D1 and was actually obtained in the real world. The monitoring device 100 evaluates the prediction accuracy of the prediction model by calculating the error rate between the prediction data D2 generated by the prediction device 2 and the performance data D3. The monitoring device 100 then generates display data including the input data D1, the prediction data D2, and the prediction accuracy evaluation result. By looking at the display data, the user can learn the operation status of the prediction model and consider whether the prediction model needs to be re-trained.
- FIG. 2 is a block diagram showing the hardware configuration of the monitoring device 100. As illustrated, the monitoring device 100 includes an interface (I/F) 11, a processor 12, a memory 13, a recording medium 14, a database (DB) 15, a display device 16, and an input device 17.
- The interface 11 performs data input/output with an external device. Specifically, the input data D1, the prediction data D2, and the performance data D3 are input to the monitoring device 100 through the interface 11.
- the processor 12 is a computer such as a CPU (Central Processing Unit), and controls the entire monitoring device 100 by executing a program prepared in advance.
- the processor 12 may be a GPU (Graphics Processing Unit) or an FPGA (Field-Programmable Gate Array).
- the processor 12 executes monitoring processing, which will be described later.
- The memory 13 is composed of ROM (Read Only Memory), RAM (Random Access Memory), and the like. The memory 13 is also used as a working memory while the processor 12 executes various processes.
- The recording medium 14 is a non-volatile, non-transitory recording medium such as a disk-shaped recording medium or a semiconductor memory, and is configured to be detachable from the monitoring device 100.
- the recording medium 14 records various programs executed by the processor 12 .
- DB15 memorize
- the display device 16 is, for example, a liquid crystal display device, etc., and displays the monitoring results generated by the monitoring device 100 .
- the input device 17 is, for example, a mouse, a keyboard, etc., and is used by the user to make necessary instructions and inputs in the monitoring process.
- FIG. 3 is a block diagram showing the functional configuration of the monitoring device 100 of the first embodiment.
- the monitoring device 100 functionally includes an input data evaluation unit 21 , a prediction data evaluation unit 22 , a prediction accuracy evaluation unit 23 , and a display data generation unit 24 .
- Input data D1 is input to the input data evaluation unit 21 .
- the input data evaluation unit 21 detects outliers and abnormal values in the input data D1.
- An "outlier" is a value that falls within a predetermined range defined as impossible for the input data D1.
- An "abnormal value" is a value in a predetermined range that deviates from the normal values, where the normal values are the values of input data for which the prediction model can make an appropriate prediction.
- the input data evaluation unit 21 adds information indicating an outlier or an abnormal value to the input data D1 and outputs the data to the display data generation unit 24 .
- Prediction data D2 which is the result of prediction by the prediction device 2, is input to the prediction data evaluation unit 22.
- the predicted data evaluation unit 22 detects abnormal values in the predicted data D2.
- Here, an "abnormal value" is a value in a predetermined range outside the normal values, where the normal values are the values of the prediction data output when the prediction model makes a proper prediction.
- the prediction data evaluation unit 22 adds information indicating an abnormal value to the prediction data D2 and outputs the data to the display data generation unit 24 .
- the prediction accuracy evaluation unit 23 receives the prediction data D2 and the performance data D3.
- the prediction accuracy evaluation unit 23 calculates an error rate, which is an index indicating prediction accuracy, based on the prediction data D2 and the performance data D3.
- the error rate is a value that indicates the deviation of the prediction result from the actual value, and is used as an index that indicates the prediction accuracy of the prediction model.
- the prediction accuracy evaluation section 23 outputs the calculated error rate to the display data generation section 24 .
- Although the prediction accuracy evaluation unit 23 uses the error rate as the index indicating the prediction accuracy here, the prediction accuracy may be evaluated using another index.
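- The text does not give a formula for the error rate. As a minimal sketch, assuming the error rate is the absolute percentage error between a value of the prediction data D2 and the corresponding value of the performance data D3, it could be computed as follows. The function names `error_rate` and `mean_error_rate` are hypothetical.

```python
def error_rate(predicted: float, actual: float) -> float:
    """Absolute percentage error of one prediction (assumed definition)."""
    if actual == 0:
        raise ValueError("actual value must be non-zero")
    return abs(predicted - actual) / abs(actual) * 100.0


def mean_error_rate(predictions, actuals) -> float:
    """Average error rate over paired prediction data and performance data."""
    predictions, actuals = list(predictions), list(actuals)
    if len(predictions) != len(actuals) or not predictions:
        raise ValueError("need the same non-zero number of predictions and actuals")
    rates = [error_rate(p, a) for p, a in zip(predictions, actuals)]
    return sum(rates) / len(rates)
```

- For a predicted rent of 90,000 against an actual rent of 100,000, this definition yields an error rate of 10%.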
- The display data generation unit 24 generates the display data using the input data D1 input from the input data evaluation unit 21, the prediction data D2 input from the prediction data evaluation unit 22, and the error rate input from the prediction accuracy evaluation unit 23.
- the display data is data for displaying to the user the presence/absence of outliers and abnormal values in the input data D1, the presence/absence of abnormal values in the prediction data D2, and the prediction accuracy, and is output to the display device 16.
- the display device 16 displays the input display data on the screen. This allows the user to easily obtain information on the performance and status of the prediction model currently in operation.
- FIG. 4 is a flow chart of processing by the monitoring device. This processing is realized by executing a program prepared in advance by the processor 12 shown in FIG. 2 and operating as each element shown in FIG.
- the input data evaluation unit 21 acquires the input data D1, the prediction data evaluation unit 22 acquires the prediction data D2, and the prediction accuracy evaluation unit 23 acquires the prediction data D2 and the performance data D3 (step S11).
- the input data evaluation unit 21 evaluates the input data D1, adds information indicating an outlier or an abnormal value to the input data D1, and outputs the evaluation result to the display data generation unit 24 (step S12).
- the prediction data evaluation unit 22 evaluates the prediction data, adds information indicating an abnormal value to the prediction data D2, and outputs the evaluation result to the display data generation unit 24 (step S13).
- the prediction accuracy evaluation unit 23 calculates prediction accuracy (error rate) based on the prediction data D2 and the performance data D3, and outputs the evaluation result to the display data generation unit 24 (step S14).
- the display data generation unit 24 generates display data using the evaluation results of the input data evaluation unit 21, the prediction data evaluation unit 22, and the prediction accuracy evaluation unit 23, and outputs the display data to the display device 16 (step S15).
- the display device 16 displays the display data (step S16). Then the process ends.
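- Steps S12 to S15 above can be sketched as a single function. This is an illustration under stated assumptions, not the disclosed implementation: the name `run_monitoring`, the dictionary layout of the display data, the three predicate parameters, and the use of absolute percentage error as the error rate are all hypothetical.

```python
def run_monitoring(inputs, predictions, actuals, *,
                   is_outlier, is_abnormal_input, is_abnormal_prediction):
    """Evaluate D1, D2 and D3 and return display data (hypothetical layout)."""
    # Step S12: label each input value as outlier, abnormal, or normal.
    input_flags = []
    for value in inputs:
        if is_outlier(value):
            input_flags.append("outlier")
        elif is_abnormal_input(value):
            input_flags.append("abnormal")
        else:
            input_flags.append("normal")

    # Step S13: label each predicted value.
    prediction_flags = [
        "abnormal" if is_abnormal_prediction(p) else "normal" for p in predictions
    ]

    # Step S14: error rate per prediction (assumed: absolute percentage error).
    error_rates = [abs(p - a) / abs(a) * 100.0 for p, a in zip(predictions, actuals)]

    # Step S15: bundle the evaluation results as display data.
    return {
        "input": list(zip(inputs, input_flags)),
        "prediction": list(zip(predictions, prediction_flags)),
        "error_rate": error_rates,
    }


display_data = run_monitoring(
    [5, -1, 40],
    [80000, 90000, 300000],
    [100000, 100000, 100000],
    is_outlier=lambda minutes: minutes < 0,
    is_abnormal_input=lambda minutes: minutes > 30,
    is_abnormal_prediction=lambda rent: rent >= 260000,
)
```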
- the display data includes diagnostic reports and, if necessary, analysis result screens. That is, as the monitoring result regarding the prediction model, only the above diagnostic report may be presented to the user, or the following analysis result screen may be presented in addition to the diagnostic report.
- For example, the person in charge at Company A may display both the diagnostic report and the analysis result screen to continue detailed monitoring of the prediction model delivered to Company B, while presenting only the diagnostic report to Company B as a report on the state of the prediction model.
- FIG. 5 shows an example of a diagnosis report included in display data.
- The diagnostic report is created based on the evaluation results of the input data evaluation unit 21, the prediction data evaluation unit 22, and the prediction accuracy evaluation unit 23, and on other data obtained by monitoring the prediction model. It is a report that briefly describes the current state of the prediction model.
- the diagnostic report is a report on 10 diagnostic items.
- the diagnostic report includes an ID, description, evaluation result, countermeasures, and priority of countermeasures for each diagnostic item.
- "ID" is the identification information of each diagnostic item reported by the diagnostic report.
- "Description" is the description of the diagnostic item.
- The "evaluation result" is the diagnosis result for each diagnostic item, and the "countermeasure" is the action proposed to improve the condition when the evaluation result of the diagnostic item is not good.
- "Response priority" indicates the order of priority of each countermeasure when a plurality of countermeasures are proposed.
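- One way to hold the rows of such a report is a small record type, sketched below. The class `DiagnosticItem`, its field names, and the helper `pending_countermeasures` are hypothetical; the sample rows use the evaluation results that the text gives for the FIG. 5 example.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class DiagnosticItem:
    item_id: str
    description: str
    evaluation_result: str
    countermeasure: Optional[str] = None  # None when no action is proposed
    priority: Optional[int] = None        # 1 is the highest priority


def pending_countermeasures(report):
    """Return items that propose a countermeasure, highest priority first."""
    flagged = [item for item in report if item.countermeasure is not None]
    return sorted(
        flagged,
        key=lambda item: item.priority if item.priority is not None else float("inf"),
    )


report = [
    DiagnosticItem("001", "existence of outlier in input data", "none"),
    DiagnosticItem("007", "cause of fluctuation in prediction accuracy",
                   "XX has occurred and the tendency has changed",
                   "add YY to the explanatory variables", 2),
    DiagnosticItem("004", "prediction accuracy", "40% lower than the initial",
                   "re-learning", 1),
]
```

- Calling `pending_countermeasures(report)` lists item 004 (re-learning) before item 007, matching the priorities in the example.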
- The diagnostic item 001 "existence of outlier in input data" indicates whether or not the input data contains an outlier.
- An "outlier" is a value that cannot exist in reality. For example, the real estate rent prediction model described later uses "required time from the station" as input data, and negative values are set as outliers for "required time from the station". In the example of FIG. 5, since the evaluation result of the diagnostic item 001 is "none", the diagnostic report indicates that there are no outliers in the input data D1.
- The diagnostic item 002 "presence or absence of abnormal value in input data" indicates whether or not the input data contains an abnormal value.
- The "abnormal value" here is a value indicating that the tendency of the input data has changed; specifically, it is a value that does not belong to the range of the input data used when training the prediction model. If an input data value corresponds to an abnormal value, it is determined that the input data supplied to the prediction model during actual prediction has changed from the data used during learning. In the example of FIG. 5, since the evaluation result of the diagnostic item 002 is "none", the diagnostic report indicates that there were no abnormal values in the input data.
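- The two checks on input data can be sketched as follows for the "required time from the station" example. This assumes that "the range of the input data used during learning" means the interval from the minimum to the maximum training value; that interpretation, the function names, and the sample training values are assumptions made for illustration.

```python
def training_range(training_values):
    """Range of values seen during learning (assumed: min to max)."""
    return min(training_values), max(training_values)


def classify_input(minutes_from_station, value_range):
    """Label one input value as 'outlier', 'abnormal', or 'normal'."""
    if minutes_from_station < 0:
        # Negative travel time cannot exist in reality: outlier.
        return "outlier"
    low, high = value_range
    if not (low <= minutes_from_station <= high):
        # Outside the range seen during learning: abnormal value.
        return "abnormal"
    return "normal"


# Hypothetical training data, in minutes.
learned_range = training_range([1, 3, 8, 15, 20])
```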
- The diagnostic item 003 "presence or absence of abnormal value in prediction data" indicates whether or not the prediction data generated by the prediction model contains an abnormal value.
- The "abnormal value" here refers to a value that does not belong to the range of the actually obtained performance data or of the correct data used when training the prediction model. If a prediction data value corresponds to an abnormal value, it is suspected that the input data contains an outlier or an abnormal value, or that the prediction accuracy of the prediction model is low. In the example of FIG. 5, since the evaluation result of the diagnostic item 003 is "none", the diagnostic report indicates that there is no abnormal value in the prediction data.
- The "prediction accuracy" of the diagnostic item 004 indicates the accuracy of prediction by the prediction model, that is, the reliability of the model, and is indicated, for example, by the above-mentioned error rate.
- The evaluation result of the diagnostic item 004 is "40% lower than the initial", so the diagnostic report shows that the accuracy of the prediction model currently in use has decreased by 40% compared to when it was first used. For this reason, the diagnostic report proposes "re-learning" as a countermeasure, and the priority of this countermeasure is set to "1", which is the highest.
- the diagnostic item 005 "whether exceptions occurred in data processing" indicates whether or not any problem occurred in the processing when the input data was processed before being input to the prediction model.
- Data processing here means calculating derived values. For example, when the average of a predetermined number of values or the moving average of the last 7 days is input to the prediction model instead of the raw input data, data processing is the calculation of those values.
- Such a problem includes the case where the value obtained by the data processing corresponds to an outlier of the input data.
- the evaluation result of the diagnostic item 005 is "none", so the diagnostic report indicates that no exception occurred during data processing.
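- As a rough sketch of diagnostic item 005, the following computes a moving average over the last 7 values and reports an exception when the processed value is itself an outlier. The function names, the one-value-per-day assumption, and the outlier predicate passed in by the caller are hypothetical.

```python
def moving_average(values, window=7):
    """Average of the most recent `window` values (assumed: one value per day)."""
    if len(values) < window:
        raise ValueError("not enough values for the requested window")
    recent = values[-window:]
    return sum(recent) / window


def process_with_check(values, is_outlier, window=7):
    """Return (processed_value, exception_occurred) for diagnostic item 005."""
    processed = moving_average(values, window)
    return processed, is_outlier(processed)
```

- With eight daily values 1 to 8, the processed value is 5.0 (the mean of 2 to 8), and no exception is reported if negative values are the only outliers.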
- Model creation time in diagnostic item 006 indicates the time required to create a prediction model, that is, to learn a prediction model using learning data.
- the creation of the model here includes updating (relearning) of the model in addition to the creation of the initial model.
- the time required to create a prediction model can be predicted to some extent according to the amount of learning data to be used, conditions for terminating the learning process, and the like. Therefore, if the prediction model creation time is much shorter or longer than usual, it is doubtful whether the learning process was performed correctly. Therefore, it is diagnosed whether or not the model creation time was appropriate.
- the evaluation result of the diagnostic item 006 is "within the regulation", so the diagnostic report indicates that the model creation time was appropriate.
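- The text does not state how the permitted range for the model creation time is defined. A minimal sketch, assuming the range is an expected duration plus or minus a tolerance ratio (both hypothetical parameters), might look like this:

```python
def creation_time_status(elapsed_s, expected_s, tolerance=0.5):
    """Judge whether a (re)training run took a plausible amount of time.

    `expected_s` and `tolerance` are hypothetical parameters: the run is
    accepted when it falls within expected_s * (1 +/- tolerance).
    """
    lower = expected_s * (1.0 - tolerance)
    upper = expected_s * (1.0 + tolerance)
    if lower <= elapsed_s <= upper:
        return "within the regulation"
    return "out of regulation"
```

- A run that finishes far too quickly is flagged as well as one that runs too long, since either case casts doubt on whether learning was performed correctly.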
- The diagnostic item 007 "cause of fluctuation in prediction accuracy" indicates an item presumed to be the cause when the diagnostic item 004 diagnoses that the prediction accuracy has deteriorated. For example, in the case of the prediction model for real estate rents mentioned above, factors such as "stagnation of economic activity due to the epidemic of infectious diseases" and "construction of luxury condominiums and shopping malls nearby" can be considered as causes of fluctuations in prediction accuracy. Note that the cause of the fluctuation may be estimated using an algorithm for evaluating a prediction model or an evaluation model, or may be estimated by humans. In the example of FIG. 5, the evaluation result of the diagnostic item 007 is the presumption "because XX has occurred and the tendency has changed". In response, the diagnostic report proposes the countermeasure of adding YY to the explanatory variables, and the priority of this countermeasure is set to "2".
- the diagnostic item 008 "whether or not appropriate system output is implemented" indicates whether or not the prediction data is correctly output from the prediction device using the prediction model. Prediction results using prediction models are usually input to other related systems and used. Therefore, it is diagnosed whether or not the prediction device is correctly outputting prediction data to another system. In the example of FIG. 5, the evaluation result of the diagnostic item 008 is "implemented", so the diagnostic report indicates that the prediction data is correctly output from the prediction device.
- The diagnostic item 009 "response time to system" indicates whether or not the prediction data is correctly exchanged between the prediction device using the prediction model and another system that receives and uses the prediction data. Specifically, if the prediction device is designed to transmit prediction data to another system, the response time is the time from when the prediction device transmits the prediction data to the other system until it receives a response from the other system acknowledging receipt. If, instead, the other system requests prediction data from the prediction device, the response time is the time from when the prediction device receives the request from the other system until it transmits the prediction data to the other system. It is diagnosed whether or not these response times are within the specified time. In the example of FIG. 5, the evaluation result of the diagnostic item 009 is "within specification", so the diagnostic report indicates that the prediction data is correctly transmitted and received.
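- The two definitions of response time can be sketched as below. The mode names `"push"` and `"pull"`, the event keys, and the limit parameter are hypothetical; timestamps are assumed to be in seconds.

```python
def response_time(mode, events):
    """Response time in seconds for diagnostic item 009.

    mode "push": the prediction device sends data and waits for an acknowledgement.
    mode "pull": another system requests data and the device replies.
    """
    if mode == "push":
        return events["ack_received"] - events["prediction_sent"]
    if mode == "pull":
        return events["prediction_sent"] - events["request_received"]
    raise ValueError(f"unknown mode: {mode}")


def within_specified_time(elapsed_s, limit_s):
    """True when the measured response time does not exceed the limit."""
    return 0 <= elapsed_s <= limit_s
```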
- The diagnostic item 010 "display of caution information" indicates whether or not various alerts are correctly displayed on the display device 16. These include the alerts related to input data, prediction data, and prediction accuracy, which will be described later, and the alert output when prediction data is not output from the prediction device. In the example of FIG. 5, the evaluation result of the diagnostic item 010 is "displayed", so the diagnostic report indicates that the various alerts are correctly displayed.
- Diagnostic items 001 to 004 are diagnostic items mainly related to the state of the prediction model.
- Diagnostic items 005 to 010 are items related to the overall operation status of the prediction device that performs prediction using the prediction model.
- the diagnostic report displays a list of diagnostic results for predetermined diagnostic items, so the overall status of the prediction model can be easily grasped.
- A diagnostic item whose diagnosis result is determined to be abnormal may be highlighted, for example by changing the color of the characters or the background, to alert the user to the abnormality.
- the analysis result screen displays the diagnostic results by graphically showing the numerical grounds for the diagnostic items 001 to 004, which are particularly related to the prediction model, among the diagnostic items described in the diagnostic report.
- In the following examples, the prediction model is assumed to predict real estate rents. Specifically, the prediction model predicts the monthly rent of a rental property based on input data such as the size of the property, the floor plan, and the time required from the station. Note that the prediction model is a trained model that has been trained using input data such as the size, floor plan, and time required from the station of actual rental properties, with the actual rent of each rental property as correct data.
- Fig. 6 shows an example of the analysis result screen.
- The analysis result screen 200 includes input data, prediction data, and an error rate indicating prediction accuracy as display items. As described above, there are multiple pieces of input data, such as the size of the property, the floor plan, and the time required from the station. Note that the example of FIG. 6 shows analysis results in a steady operation state, that is, in a state in which the input data, the prediction data, and the error rate are all normal.
- the analysis result screen 200 includes graphs 201 and 204 regarding input data, graphs 202 and 205 regarding predicted data, and graphs 203 and 206 regarding error rates.
- a graph 201 is a box plot showing transition of input data.
- the horizontal axis indicates the date when the input data was input, and the vertical axis indicates the required time from the station, which is one of the input data.
- For each day on which input data is entered, the graph 201 uses a box plot to show the minimum, maximum, median, and average of the required time from the station, together with the interquartile range (the distribution of values belonging to the 25% to 75% range of the total).
- the graph 202 is a boxplot of forecast data, with the horizontal axis showing the date when the forecast data was obtained and the vertical axis showing the rent as the forecast result.
- a graph 202 shows the minimum value, maximum value, median value, average value, interquartile range, etc. of the rent for each day on which forecast data is obtained by means of a box plot.
- the graph 203 is a boxplot of the error rate, the horizontal axis indicates the date when the forecast data was obtained, and the vertical axis indicates the error rate of the forecast data.
- a graph 203 shows the minimum value, maximum value, median value, average value, interquartile range, etc. of the error rate that indicates the prediction accuracy for each day on which the prediction is made, using a box plot.
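- The per-day statistics behind box plots such as graphs 201 to 203 can be computed with the Python standard library, as sketched below. The record layout of `(day, value)` pairs and the function names are hypothetical, and the "inclusive" quartile method is one possible choice since the text does not specify one.

```python
import statistics
from collections import defaultdict


def boxplot_stats(values):
    """Minimum, quartiles, maximum and mean of one day's values (needs >= 2 values)."""
    q1, median, q3 = statistics.quantiles(values, n=4, method="inclusive")
    return {
        "min": min(values),
        "q1": q1,
        "median": median,
        "q3": q3,
        "max": max(values),
        "mean": statistics.fmean(values),
    }


def daily_boxplot_stats(records):
    """Group (day, value) records by day and summarize each day."""
    by_day = defaultdict(list)
    for day, value in records:
        by_day[day].append(value)
    return {day: boxplot_stats(vals) for day, vals in sorted(by_day.items())}
```

- The interquartile range drawn as the box is `q3 - q1`; the same function applies to input data, prediction data, and error rates.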
- a graph 204 is a histogram of the required time from the station, which is one of the input data, and shows the distribution of the required time from the station over a certain period of time, for example, the period shown in the graph 201 .
- the horizontal axis indicates the required time from the station, and the vertical axis indicates the frequency.
- diagonally hatched bins indicate the distribution of values used as input data when training the prediction model
- gray bins indicate the distribution of values input as input data when making predictions using the prediction model.
- the value "8.03" in the figure is the average value of input data during learning, and the value "5.91" is the average value of input data during prediction.
- Graph 205 is a histogram of forecast data, and shows the distribution of rent over a certain period of time, for example, the period shown in graph 202.
- the horizontal axis indicates rent, and the vertical axis indicates frequency.
- obliquely hatched bins indicate the distribution of rents used as correct data when learning the prediction model
- gray bins indicate the distribution of rents obtained as prediction results when making predictions using the prediction model. Similar to the graph 204, the average value of correct data during learning is "90,364" and the average value of prediction data during prediction is "70,036".
- a graph 206 is a histogram of error rates, showing the distribution of error rates over a certain period of time, for example, the period shown in graph 203 .
- the horizontal axis indicates the error rate, and the vertical axis indicates the frequency.
- obliquely hatched bins indicate the distribution of error rates calculated during prediction model learning
- gray bins indicate the distribution of error rates calculated during prediction using the prediction model. Similar to the graph 204, the average value of the error rate during learning is “24.977” and the average value of the error rate during prediction is “13.15”.
- the analysis result screen shows the analysis results of input data, prediction data, and error rate (prediction accuracy) for a certain period of time in graphs such as box plots and histograms.
- the state of the error rate can be grasped based on a concrete numerical value.
- In graphs 204 to 206, the values at the time of learning and the values at the time of prediction are displayed in the same graph for the input data, the prediction data, and the error rate, which makes it possible to visualize the extent and trend of changes in each kind of data.
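- To overlay learning-time and prediction-time values in one histogram, both data sets need to share the same bin edges. The sketch below shows one way to do this in plain Python; the function names, the equal-width binning, and the default of 5 bins are assumptions made for illustration.

```python
def histogram(values, low, high, bins):
    """Count values into `bins` equal-width bins spanning [low, high]."""
    width = (high - low) / bins
    counts = [0] * bins
    for v in values:
        if low <= v <= high:
            # The maximum value falls into the last bin.
            index = min(int((v - low) / width), bins - 1)
            counts[index] += 1
    return counts


def compare_distributions(training, prediction, bins=5):
    """Histogram both data sets on shared bin edges and report their means."""
    low = min(min(training), min(prediction))
    high = max(max(training), max(prediction))
    if high <= low:
        raise ValueError("values must span a non-empty range")
    width = (high - low) / bins
    return {
        "edges": [low + i * width for i in range(bins + 1)],
        "training": histogram(training, low, high, bins),
        "prediction": histogram(prediction, low, high, bins),
        "training_mean": sum(training) / len(training),
        "prediction_mean": sum(prediction) / len(prediction),
    }
```

- The two mean values correspond to the averages annotated in the figures, such as the learning-time and prediction-time averages of the required time from the station.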
- Fig. 7 shows an example of the analysis result screen on another day.
- the analysis result screen 210 includes boxplots 211-213 and histograms 214-216 for input data, predicted data and error rate, as in the example of FIG.
- the view of each graph is the same as in the example of FIG.
- FIG. 7 shows analysis results when input data includes outliers.
- "a value less than 0" is set as an outlier for the input data "required time from the station”.
- the minimum value of the required time from the station on August 13 is "-1", which corresponds to an outlier. Therefore, in the graph 211, a circle AL1 is displayed as an alert.
- the minimum value of the required time from the station is "-1".
- an alert indicating that fact is displayed.
- Fig. 8 shows an example of the analysis result screen on yet another day.
- the analysis result screen 220 includes boxplots 221-223 and histograms 224-226 for input data, prediction data and error rate, as in the example of FIG.
- the view of each graph is the same as in the example of FIG.
- the input data is an abnormal value.
- The case where "the difference between the average value at the time of learning and the average value at the time of prediction is 3 or more" is set as an abnormal value.
- The difference between the average value "8.03" during learning of the input data "required time from the station" and the average value "14.13" during prediction is 3 or more.
- The input data are therefore judged to be abnormal values, and an arrow AL2 is displayed in the graph 224 as an alert to that effect.
- the predicted data is an abnormal value.
- the prediction data is set as an abnormal value when it is "260,000 or more" or "the difference between the average value at the time of learning and the average value at the time of prediction is 30,000 or more".
- the maximum value of forecast data for September 12 exceeds 260,000, and a circle AL3 is displayed as an alert to that effect.
- the error rate is an abnormal value.
- an error rate of "40% or more” is set as an abnormal value.
- In FIG. 8, as shown in graph 223, there are days when the error rate exceeds 40%, and a rectangle AL4 is displayed as an alert to that effect.
- The analysis result screen displays alerts for outliers and abnormal values in the input data, abnormal values in the prediction data, and abnormal values in the error rate (prediction accuracy), so that abnormal states can be easily recognized.
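- The alert conditions described for FIGS. 7 and 8 can be collected into one check, sketched below. The thresholds (a value less than 0, a mean difference of 3 or more, a rent of 260,000 or more or a mean difference of 30,000 or more, and an error rate of 40% or more) come from the examples in the text; the function name, argument layout, and alert labels are hypothetical.

```python
from statistics import fmean


def find_alerts(train_inputs, pred_inputs, train_targets, predictions, error_rates):
    """Return the list of alerts raised by the example thresholds."""
    alerts = []
    # Input outlier: required time from the station below 0.
    if min(pred_inputs) < 0:
        alerts.append("input outlier")
    # Input abnormal value: learning/prediction means differ by 3 or more.
    if abs(fmean(train_inputs) - fmean(pred_inputs)) >= 3:
        alerts.append("input abnormal")
    # Prediction abnormal value: 260,000 or more, or means differ by 30,000 or more.
    if (max(predictions) >= 260_000
            or abs(fmean(train_targets) - fmean(predictions)) >= 30_000):
        alerts.append("prediction abnormal")
    # Error rate abnormal value: 40% or more.
    if max(error_rates) >= 40:
        alerts.append("error rate abnormal")
    return alerts
```

- Each returned label would correspond to one marker on the screen, such as a circle, arrow, or rectangle.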
- The alerts related to the prediction data and the error rate (circle AL3, rectangle AL4) are displayed in the boxplot graphs 222 and 223, but these alerts may instead be displayed in the histogram graphs 225 and 226. That is, alerts regarding input data, prediction data, and error rates may be displayed on one or more of the multiple graphs.
- In FIGS. 6 to 8, the input data, prediction data, and error rate are shown by box plots and histograms, but line graphs may be used instead of box plots. FIGS. 9 to 11 show examples of displaying input data, prediction data, and error rates using line graphs and histograms.
- line graphs 201a to 203a are displayed instead of the boxplots 201 to 203 shown in FIG.
- line graphs 211a to 213a are displayed instead of the boxplots 211 to 213 shown in FIG.
- a circle AL5 is displayed as an alert indicating an outlier in the input data.
- line graphs 221a to 223a are displayed instead of the boxplots 221 to 223 shown in FIG.
- an arrow AL6 is displayed as an alert in the graph 224 as in FIG.
- a circle AL7 is displayed as an alert indicating an abnormal value of predicted data.
- a rectangle AL8 is displayed as an alert indicating an abnormal value of the error rate.
- the input data, prediction data and error rate for one prediction model are displayed on one analysis result screen, but if there are multiple prediction models, the input data, prediction data and error rates for the plurality of prediction models may be displayed simultaneously on one analysis result screen.
- graphs of input data, prediction data, and error rates as shown in FIG. 6 may be prepared for each model and displayed side by side on one analysis result screen.
- the values of the two models may be displayed simultaneously in different colors and superimposed on each graph of the input data, prediction data, and error rate. In this case, it may be determined arbitrarily whether to use a box plot, a line graph, or a histogram.
- FIG. 12 is a block diagram showing the functional configuration of the information processing apparatus according to the second embodiment.
- the information processing device 70 includes input data acquisition means 71, input data evaluation means 72, prediction data acquisition means 73, prediction data evaluation means 74, prediction accuracy acquisition means 75, and display means 76.
- FIG. 13 is a flowchart of processing by the information processing apparatus of the second embodiment.
- the input data obtaining means 71 obtains a plurality of input data (step S71).
- the input data evaluation means 72 detects outliers and abnormal values in the input data (step S72).
- the prediction data acquisition means 73 acquires prediction data generated from the input data using the trained model (step S73).
- the prediction data evaluation means 74 detects abnormal values in the prediction data (step S74).
- the prediction accuracy acquisition means 75 acquires the prediction accuracy of the trained model (step S75). Note that steps S71 to S72, steps S73 to S74, and step S75 may be performed in a different order from the above, or may be performed in parallel.
- the display means 76 displays diagnostic information including the presence or absence of outliers and abnormal values in the input data, the presence or absence of abnormal values in the prediction data, and the prediction accuracy (step S76).
- with the information processing device 70 of the second embodiment, it is possible to create and present diagnostic information regarding multiple factors that affect the prediction accuracy for the model in operation.
- the diagnostic information includes: a diagnostic report describing the presence or absence of outliers and abnormal values in the input data, the presence or absence of abnormal values in the prediction data, and the prediction accuracy; an analysis result screen displaying the input data, the prediction data, and the prediction accuracy on a graph, respectively;
- the information processing device comprising:
- (Appendix 3) The information processing device according to appendix 2, wherein the analysis result screen displays the presence or absence of outliers and abnormal values in the input data, the presence or absence of abnormal values in the prediction data, and the prediction accuracy, each on a time-series graph.
- (Appendix 4) The information processing device according to appendix 2 or 3, wherein the analysis result screen displays the presence or absence of outliers and abnormal values in the input data, the presence or absence of abnormal values in the prediction data, and the prediction accuracy, each on a histogram.
- the information processing device according to any one of appendices 2 to 5, wherein the analysis result screen displays side by side a plurality of graphs showing the presence or absence of outliers and abnormal values in the input data for each of a plurality of trained models, a plurality of graphs showing the presence or absence of abnormal values in the prediction data for each of the plurality of trained models, and a plurality of graphs showing the prediction accuracy for each of the plurality of trained models.
- the information processing device according to any one of appendices 2 to 5, wherein the analysis result screen includes one graph that simultaneously displays the presence or absence of outliers and abnormal values in the input data for a plurality of trained models, one graph that simultaneously displays the presence or absence of abnormal values in the prediction data for each of the plurality of trained models, and one graph that simultaneously displays the prediction accuracy for each of the plurality of trained models.
- the information processing device according to any one of appendices 2 to 7, wherein the diagnostic report includes descriptions of countermeasures and of the priority of those countermeasures.
- a recording medium recording a program for causing a computer to execute processing for displaying diagnostic information including the presence or absence of outliers and abnormal values in the input data, the presence or absence of abnormal values in the predicted data, and the prediction accuracy.
- prediction device; 12 processor; 16 display device; 21 input data evaluation unit; 22 prediction data evaluation unit; 23 prediction accuracy evaluation unit; 24 display data generation unit; 100 monitoring device; 200, 200a, 210, 210a, 220, 220a analysis result screen
Landscapes
- Engineering & Computer Science (AREA)
- Theoretical Computer Science (AREA)
- Software Systems (AREA)
- Data Mining & Analysis (AREA)
- Evolutionary Computation (AREA)
- Medical Informatics (AREA)
- Computer Vision & Pattern Recognition (AREA)
- Physics & Mathematics (AREA)
- Computing Systems (AREA)
- General Engineering & Computer Science (AREA)
- General Physics & Mathematics (AREA)
- Mathematical Physics (AREA)
- Artificial Intelligence (AREA)
- Management, Administration, Business Operations System, And Electronic Commerce (AREA)
Abstract
An input data acquiring means that acquires a plurality of pieces of input data, and an input data evaluating means that detects outliers and abnormal values in the input data. A prediction data acquiring means that acquires prediction data generated from the input data by using a trained model, and a prediction data evaluating means that detects abnormal values in the prediction data. A prediction accuracy acquiring means that acquires a prediction accuracy from a trained model. A display means that displays diagnostic information including whether the input data has outliers or abnormal values, whether the prediction data has abnormal values, and the prediction accuracy.
Description
The present disclosure relates to monitoring machine learning models.
In recent years, prediction models obtained by machine learning have been used for business operations in a variety of fields. As time passes after a model is first created and the trend of the data used for prediction changes, the prediction accuracy declines and the model needs to be retrained. Patent Literature 1 discloses a method of generating an inspector model that detects accuracy degradation of an operational model, and of using it to detect changes in the output of the operational model caused by temporal changes in data trends.
To determine the retraining timing appropriately for a model in operation, it is preferable to create and present diagnostic information on the multiple factors that affect the prediction accuracy of the model.
One object of the present disclosure is to provide an information processing device capable of creating and presenting diagnostic information on multiple factors that affect prediction accuracy for a model in operation.
In one aspect of the present disclosure, an information processing device includes:
input data acquisition means for acquiring a plurality of input data;
input data evaluation means for detecting outliers and abnormal values in the input data;
prediction data acquisition means for acquiring prediction data generated from the input data using a trained model;
prediction data evaluation means for detecting abnormal values in the prediction data;
prediction accuracy acquisition means for acquiring the prediction accuracy of the trained model; and
display means for displaying diagnostic information including the presence or absence of outliers and abnormal values in the input data, the presence or absence of abnormal values in the prediction data, and the prediction accuracy.
In another aspect of the present disclosure, an information processing method includes:
acquiring a plurality of input data;
detecting outliers and abnormal values in the input data;
acquiring prediction data generated from the input data using a trained model;
detecting abnormal values in the prediction data;
acquiring the prediction accuracy of the trained model; and
displaying diagnostic information including the presence or absence of outliers and abnormal values in the input data, the presence or absence of abnormal values in the prediction data, and the prediction accuracy.
In yet another aspect of the present disclosure, a recording medium records a program that causes a computer to execute processing of:
acquiring a plurality of input data;
detecting outliers and abnormal values in the input data;
acquiring prediction data generated from the input data using a trained model;
detecting abnormal values in the prediction data;
acquiring the prediction accuracy of the trained model; and
displaying diagnostic information including the presence or absence of outliers and abnormal values in the input data, the presence or absence of abnormal values in the prediction data, and the prediction accuracy.
According to the present disclosure, it is possible to create and present diagnostic information on multiple factors that affect prediction accuracy for a model in operation.
Preferred embodiments of the present disclosure will be described below with reference to the drawings.
<First embodiment>
[Overall configuration]
FIG. 1 is a block diagram showing the overall configuration of a monitoring system 1 according to the first embodiment. The monitoring system 1 is a system that monitors the state of a machine learning model that has been trained in advance and is in operation. In this embodiment, the monitoring system 1 includes a prediction device 2 and a monitoring device 100.
The prediction device 2 is a device that makes predictions using a prediction model. The prediction model is an example of a machine learning model monitored by the monitoring system 1, and has already been trained using training data. The prediction device 2 makes a prediction based on input data D1, generates prediction data D2 as the prediction result, and outputs the prediction data D2 to the monitoring device 100.
The monitoring device 100 evaluates the input data D1, for example whether it is normal, and also evaluates whether the prediction data D2 generated by the prediction device 2 is normal. Performance data D3 is also input to the monitoring device 100. The performance data D3 corresponds to the input data D1 and is data actually obtained in the real world. The monitoring device 100 evaluates the prediction accuracy of the prediction model by calculating the error rate between the prediction data D2 generated by the prediction device 2 and the performance data D3. The monitoring device 100 then generates display data including the evaluation results for the input data D1, the prediction data D2, and the prediction accuracy. By viewing the display data, the user can learn the operating status of the prediction model and consider, for example, whether the prediction model needs to be retrained.
[Monitoring device]
(Hardware configuration)
FIG. 2 is a block diagram showing the hardware configuration of the monitoring device 100. As illustrated, the monitoring device 100 includes an interface (I/F) 11, a processor 12, a memory 13, a recording medium 14, a database (DB) 15, a display device 16, and an input device 17.
The interface 11 inputs and outputs data to and from external devices. Specifically, the input data D1, the prediction data D2, and the performance data D3 are input to the monitoring device 100 through the interface 11.
The processor 12 is a computer such as a CPU (Central Processing Unit), and controls the entire monitoring device 100 by executing a program prepared in advance. The processor 12 may instead be a GPU (Graphics Processing Unit) or an FPGA (Field-Programmable Gate Array). The processor 12 executes the monitoring processing described later.
The memory 13 is composed of a ROM (Read Only Memory), a RAM (Random Access Memory), and the like. The memory 13 is also used as a working memory while the processor 12 executes various processes.
The recording medium 14 is a non-volatile, non-transitory recording medium such as a disk-shaped recording medium or a semiconductor memory, and is configured to be detachable from the monitoring device 100. The recording medium 14 records the various programs executed by the processor 12. When the monitoring device 100 executes various processes, a program recorded on the recording medium 14 is loaded into the memory 13 and executed by the processor 12. The DB 15 stores, as needed, the input data D1, the prediction data D2, and the performance data D3 input through the I/F 11.
The display device 16 is, for example, a liquid crystal display device, and displays the monitoring results generated by the monitoring device 100. The input device 17 is, for example, a mouse or a keyboard, and is used by the user to give the instructions and inputs needed in the monitoring processing.
(Functional configuration)
FIG. 3 is a block diagram showing the functional configuration of the monitoring device 100 of the first embodiment. Functionally, the monitoring device 100 includes an input data evaluation unit 21, a prediction data evaluation unit 22, a prediction accuracy evaluation unit 23, and a display data generation unit 24.
The input data D1 is input to the input data evaluation unit 21. The input data evaluation unit 21 detects outliers and abnormal values in the input data D1. Here, an "outlier" is a value in a predetermined range defined as values that the input data D1 cannot inherently take. An "abnormal value," on the other hand, is a value in a predetermined range that deviates from the normal values, where the normal values are the input data values for which the prediction model can make appropriate predictions. The input data evaluation unit 21 adds, to the input data D1, information indicating that a value is an outlier or an abnormal value, and outputs the result to the display data generation unit 24.
The prediction data D2, which is the result of prediction by the prediction device 2, is input to the prediction data evaluation unit 22. The prediction data evaluation unit 22 detects abnormal values in the prediction data D2. Here, an "abnormal value" is a value in a predetermined range that deviates from the normal values, where the normal values are the prediction data values output when the prediction model makes appropriate predictions. The prediction data evaluation unit 22 adds, to the prediction data D2, information indicating that a value is an abnormal value, and outputs the result to the display data generation unit 24.
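The two evaluation units described above can be sketched roughly as follows. This is a minimal illustration, not the implementation in the disclosure: it assumes each feature has one inclusive range of possible values (anything outside is an outlier) and one inclusive range of normal values (anything outside is an abnormal value). The function names, the `Range` type, and the numeric ranges are hypothetical.

```python
from typing import Optional, Tuple

Range = Tuple[float, float]  # inclusive (low, high) bounds; an assumed representation


def flag_input_value(value: float, possible: Range, normal: Range) -> Optional[str]:
    """Return "outlier", "abnormal", or None for one input value."""
    low, high = possible
    if not (low <= value <= high):
        return "outlier"   # a value the input data cannot inherently take
    low, high = normal
    if not (low <= value <= high):
        return "abnormal"  # possible, but outside the range the model handles well
    return None


def flag_prediction_value(value: float, normal: Range) -> Optional[str]:
    """Return "abnormal" or None for one predicted value."""
    low, high = normal
    return None if low <= value <= high else "abnormal"


# Hypothetical ranges for one numeric feature.
POSSIBLE = (0.0, float("inf"))
NORMAL = (1.0, 30.0)

flags = [(v, flag_input_value(v, POSSIBLE, NORMAL)) for v in (-5.0, 8.0, 45.0)]
print(flags)
```

Checking the outlier condition first means a value that is both impossible and outside the normal range is reported only as an outlier; whether both flags should be attached in that case is a design choice the text leaves open.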
The prediction accuracy evaluation unit 23 receives the prediction data D2 and the performance data D3. Based on the prediction data D2 and the performance data D3, the prediction accuracy evaluation unit 23 calculates an error rate, which is an index of prediction accuracy. The error rate is calculated from the absolute value of the difference between the prediction data and the performance data by the following formula.
(error rate) = |(prediction data value) - (performance data value)| / (performance data value)
Since the performance data D3 is the actual value corresponding to the input data D1, the error rate indicates how far the prediction result deviates from the actual value, and is used as an index of the prediction accuracy of the prediction model. The prediction accuracy evaluation unit 23 outputs the calculated error rate to the display data generation unit 24. In this embodiment, the prediction accuracy evaluation unit 23 uses the error rate as the index of prediction accuracy, but it may evaluate the prediction accuracy of the prediction model using an index other than the error rate.
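As a small worked sketch of the error rate formula above, assuming positive actual values (for example, rents) so that the denominator is never zero or negative. The function name and the sample figures are illustrative only.

```python
def error_rate(predicted: float, actual: float) -> float:
    """|predicted - actual| / actual, following the formula in the text."""
    if actual <= 0:
        # The formula divides by the actual value; this sketch only
        # handles positive actual values.
        raise ValueError("actual value must be positive")
    return abs(predicted - actual) / actual


# Hypothetical (predicted, actual) pairs.
pairs = [(120000.0, 100000.0), (90000.0, 100000.0)]
rates = [error_rate(p, a) for p, a in pairs]
print(rates)
```

Because the absolute difference is used, over-prediction and under-prediction of the same size give the same error rate.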
The display data generation unit 24 generates display data using the input data D1 input from the input data evaluation unit 21, the prediction data D2 input from the prediction data evaluation unit 22, and the error rate input from the prediction accuracy evaluation unit 23. The display data is data for showing the user the presence or absence of outliers and abnormal values in the input data D1, the presence or absence of abnormal values in the prediction data D2, and the prediction accuracy, and is output to the display device 16. The display device 16 displays the received display data on its screen. This allows the user to easily obtain information on the performance and state of the prediction model currently in operation.
(Monitoring process)
FIG. 4 is a flowchart of the processing performed by the monitoring device. This processing is realized by the processor 12 shown in FIG. 2 executing a program prepared in advance and operating as each element shown in FIG. 3.
First, the input data evaluation unit 21 acquires the input data D1, the prediction data evaluation unit 22 acquires the prediction data D2, and the prediction accuracy evaluation unit 23 acquires the prediction data D2 and the performance data D3 (step S11). Next, the input data evaluation unit 21 evaluates the input data D1, adds to it information indicating outliers or abnormal values, and outputs the result to the display data generation unit 24 as an evaluation result (step S12). Next, the prediction data evaluation unit 22 evaluates the prediction data D2, adds to it information indicating abnormal values, and outputs the result to the display data generation unit 24 as an evaluation result (step S13). Next, the prediction accuracy evaluation unit 23 calculates the prediction accuracy (error rate) based on the prediction data D2 and the performance data D3, and outputs it to the display data generation unit 24 as an evaluation result (step S14).
Next, the display data generation unit 24 generates display data using the evaluation results from the input data evaluation unit 21, the prediction data evaluation unit 22, and the prediction accuracy evaluation unit 23, and outputs the display data to the display device 16 (step S15). The display device 16 displays the display data (step S16). The processing then ends.
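Steps S11 to S16 can be sketched as a single function. This is only an illustration under stated assumptions: the evaluation rules are passed in as callables, the `DisplayData` container and the text-based `render` function are hypothetical stand-ins for the display data and the display device 16, and the sample values are invented.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional, Sequence

Flag = Optional[str]  # "outlier", "abnormal", or None


@dataclass
class DisplayData:
    input_flags: List[Flag]
    prediction_flags: List[Flag]
    error_rates: List[float]


def run_monitoring(
    inputs: Sequence[float],       # input data D1 (acquired in step S11)
    predictions: Sequence[float],  # prediction data D2 (acquired in step S11)
    actuals: Sequence[float],      # performance data D3 (acquired in step S11)
    evaluate_input: Callable[[float], Flag],
    evaluate_prediction: Callable[[float], Flag],
) -> DisplayData:
    input_flags = [evaluate_input(x) for x in inputs]                     # step S12
    prediction_flags = [evaluate_prediction(p) for p in predictions]      # step S13
    error_rates = [abs(p - a) / a for p, a in zip(predictions, actuals)]  # step S14
    return DisplayData(input_flags, prediction_flags, error_rates)        # step S15


def render(display: DisplayData) -> str:
    """Plain-text stand-in for showing the display data (step S16)."""
    return "\n".join([
        f"input flags: {display.input_flags}",
        f"prediction flags: {display.prediction_flags}",
        f"error rates: {display.error_rates}",
    ])


display = run_monitoring(
    inputs=[8.0, -1.0],
    predictions=[110.0, 90.0],
    actuals=[100.0, 100.0],
    evaluate_input=lambda x: "outlier" if x < 0 else None,
    evaluate_prediction=lambda p: "abnormal" if p >= 200.0 else None,
)
print(render(display))
```

The three evaluations do not depend on each other's results, so in this sketch they could equally be run in a different order.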
(Display data)
Next, the display data presented to the user will be described in detail. The display data includes a diagnostic report and, as needed, an analysis result screen. That is, as the monitoring result for the prediction model, only the diagnostic report may be presented to the user, or the analysis result screen described below may be presented in addition to the diagnostic report. For example, assume that a company A generates a prediction model and delivers it to a company B, and that company B uses the prediction model in its business operations. In this case, the person in charge at company A may display both the diagnostic report and the analysis result screen to continue detailed monitoring of the prediction model delivered to company B, while presenting only the diagnostic report to company B to report on the state of the prediction model.
(1) Diagnostic report
FIG. 5 shows an example of a diagnostic report included in the display data. The diagnostic report is created based on the evaluation results from the input data evaluation unit 21, the prediction data evaluation unit 22, and the prediction accuracy evaluation unit 23, as well as various other data obtained by monitoring the prediction model, and briefly describes the current state of the prediction model.
In the example of FIG. 5, the diagnostic report covers 10 diagnostic items. For each diagnostic item, the diagnostic report includes an ID, a description, an evaluation result, a countermeasure, and a response priority. "ID" is the identification information of each diagnostic item reported in the diagnostic report, and "Description" is the description of the diagnostic item. "Evaluation result" is the diagnosis result for each diagnostic item, and "Countermeasure" is the action proposed to improve the condition when the evaluation result of the diagnostic item is poor. "Response priority" indicates the priority order of each countermeasure when multiple countermeasures are proposed.
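One way to represent a row of such a report is sketched below. The field names mirror the five columns described above, but the class, the convention that a smaller priority number is handled first, and the sample rows with IDs "X01" and "X02" are all hypothetical and are not taken from FIG. 5.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class DiagnosticItem:
    item_id: str
    description: str
    evaluation_result: str
    countermeasure: Optional[str] = None  # None when no action is proposed
    priority: Optional[int] = None        # assumed convention: 1 is handled first


def countermeasures_by_priority(items: List[DiagnosticItem]) -> List[DiagnosticItem]:
    """Return only the items that propose a countermeasure, most urgent first."""
    actionable = [
        i for i in items if i.countermeasure is not None and i.priority is not None
    ]
    return sorted(actionable, key=lambda i: i.priority)


report = [
    DiagnosticItem("001", "Presence of outliers in input data", "none"),
    # The two rows below are invented for illustration.
    DiagnosticItem("X01", "Hypothetical item A", "poor", "Review the input data", 2),
    DiagnosticItem("X02", "Hypothetical item B", "poor", "Retrain the model", 1),
]

for item in countermeasures_by_priority(report):
    print(item.priority, item.item_id, item.countermeasure)
```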
Each diagnostic item is described in detail below. Diagnostic item 001, "Presence of outliers in input data", indicates whether the input data contains an outlier. An "outlier" is a value that cannot occur in reality. For example, the real estate rent prediction model described later uses "time required from the station" as input data. Since this time can never be negative in reality, negative values are set as outliers for the input data "time required from the station". In the example of FIG. 5, the evaluation result of diagnostic item 001 is "None", so the diagnostic report indicates that the input data D1 contained no outliers.
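The outlier rule for "time required from the station" can be sketched as follows. This is a minimal sketch, not the implementation of the input data evaluation unit; the function name `find_outliers` and the `lower_bound` parameter are assumptions introduced for illustration.

```python
def find_outliers(values, lower_bound=0.0):
    """Return the values that cannot occur in reality.

    For "time required from the station", anything below zero is
    treated as an outlier (the bound is a configurable assumption).
    """
    return [v for v in values if v < lower_bound]


def outlier_result(values, lower_bound=0.0):
    """Evaluation result string in the style of diagnostic item 001."""
    return "Present" if find_outliers(values, lower_bound) else "None"
```

For example, `outlier_result([5, 12, 8])` yields "None", while a batch containing `-1` yields "Present".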
Diagnostic item 002, "Presence of abnormal values in input data", indicates whether the input data contains an abnormal value. Here, an "abnormal value" is a value indicating that the tendency of the input data has changed; specifically, it is a value outside the range of the input data used when training the prediction model. When an input data value is an abnormal value, it is determined that the values of the input data fed to the prediction model during actual prediction have changed. In the example of FIG. 5, the evaluation result of diagnostic item 002 is "None", so the diagnostic report indicates that the input data contained no abnormal values.
Diagnostic item 003, "Presence of abnormal values in prediction data", indicates whether the prediction data generated by the prediction model contains an abnormal value. Here, an "abnormal value" is a value outside the range of the actually obtained performance data or of the correct answer data used when training the prediction model. When a prediction data value is an abnormal value, it is suspected that the input data contains an outlier or abnormal value, or that the prediction accuracy of the prediction model is low. In the example of FIG. 5, the evaluation result of diagnostic item 003 is "None", so the diagnostic report indicates that the prediction data contained no abnormal values.
Diagnostic item 004, "Prediction accuracy", indicates the accuracy of prediction by the prediction model, that is, the reliability of the model, and is expressed, for example, by the error rate described above. In the example of FIG. 5, the evaluation result of diagnostic item 004 is "40% lower than at the start", so the diagnostic report indicates that the accuracy of the prediction model currently in use has dropped by 40% compared with when its use began. The diagnostic report therefore proposes "Perform retraining" as a countermeasure, and the priority of this countermeasure is set to "1", the highest.
Diagnostic item 005, "Occurrence of exceptions in data processing", indicates whether any problem occurred when the input data was processed before being input to the prediction model. "Data processing" refers to calculating derived values in cases where, for example, a given piece of input data is not input to the prediction model as is, but the average of a predetermined number of values or the moving average over the most recent 7 days is calculated and input instead. A problem includes not only the case where the data processing cannot be performed for some reason, but also the case where the value obtained by the data processing corresponds to an outlier of the input data. In the example of FIG. 5, the evaluation result of diagnostic item 005 is "None", so the diagnostic report indicates that no exception occurred in data processing.
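The 7-day moving average mentioned above, together with the two kinds of problem it can raise, might be sketched like this. The function name, the use of `None` for a missing daily value, and the non-negative bound are assumptions for illustration only.

```python
def moving_average_7d(daily_values, lower_bound=0.0):
    """Compute the moving average of the most recent 7 days.

    Returns (value, exception_occurred). An exception is reported when
    (a) the processing cannot be performed, e.g. fewer than 7 values or
        a missing value (represented here by None), or
    (b) the processed value is itself an outlier (below lower_bound).
    """
    window = daily_values[-7:]
    if len(window) < 7 or any(v is None for v in window):
        return None, True
    average = sum(window) / 7
    if average < lower_bound:
        return average, True
    return average, False
```

A caller would map the returned flag to the evaluation result of diagnostic item 005 ("None" when the flag is false).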
Diagnostic item 006, "Model creation time", indicates the time required to create the prediction model, that is, to train the prediction model using training data. Model creation here includes updating (retraining) the model in addition to creating the initial model. The time required to create a prediction model can normally be estimated to some extent from the amount of training data used, the termination conditions of the training process, and so on. Therefore, if the creation time is much shorter or longer than usual, it is doubtful whether the training process was performed correctly. For this reason, whether the model creation time was appropriate is diagnosed. In the example of FIG. 5, the evaluation result of diagnostic item 006 is "Within specification", so the diagnostic report indicates that the model creation time was appropriate.
Diagnostic item 007, "Cause of prediction accuracy fluctuation", indicates the presumed cause when diagnostic item 004 diagnoses that the prediction accuracy has dropped. For example, in the case of the real estate rent prediction model mentioned above, possible causes include "economic activity stagnated due to an epidemic of an infectious disease" and "a luxury condominium or shopping mall was built nearby". The cause of the prediction accuracy fluctuation may be estimated using an algorithm or an evaluation model that evaluates the prediction model, or may be estimated by a human. In the example of FIG. 5, the evaluation result of diagnostic item 007 is the presumed cause "XX occurred and the tendency changed". For this, the diagnostic report proposes the countermeasure "Add YY to the explanatory variables", and the priority of that countermeasure is set to "2".
Diagnostic item 008, "Appropriate system output performed", indicates whether the prediction data is correctly output from the prediction device that uses the prediction model. Prediction results obtained with a prediction model are normally input to and used by other related systems. Therefore, whether the prediction device is correctly outputting the prediction data to other systems is diagnosed. In the example of FIG. 5, the evaluation result of diagnostic item 008 is "Performed", so the diagnostic report indicates that the prediction data is correctly output from the prediction device.
Diagnostic item 009, "Response time to system", indicates whether the prediction data is correctly exchanged between the prediction device that uses the prediction model and another system that receives and uses the prediction data. Specifically, when the prediction device is designed to send prediction data to the other system, the response time is the time from when the prediction device sends the prediction data to the other system until it receives a reply from the other system acknowledging receipt. When the other system is designed to request prediction data from the prediction device, the response time is the time from when the prediction device receives the request from the other system until it sends the prediction data to the other system. Whether these response times are within the specified time is diagnosed. In the example of FIG. 5, the evaluation result of diagnostic item 009 is "Within specification", so the diagnostic report indicates that the prediction data is correctly sent and received.
Diagnostic item 010, "Display of caution information", indicates whether various alerts are correctly displayed on the display device 16, such as the alerts regarding input data, prediction data, and prediction accuracy described later, and the alert output when no prediction data is output from the prediction device. In the example of FIG. 5, the evaluation result of diagnostic item 010 is "Displayed", so the diagnostic report indicates that the various alerts are correctly displayed.
Of the diagnostic items 001 to 010 above, diagnostic items 001 to 004 mainly concern the state of the prediction model, and diagnostic items 005 to 010 concern the overall operating status of the prediction device that performs prediction using the prediction model.
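Two of the device-level items above, model creation time (006) and response time to the system (009), both reduce to checking whether a measured duration falls within a specified range. The following is a rough sketch under that reading; the function names and the idea of passing explicit lower and upper limits in seconds are assumptions, since the text does not state how the specified range is defined.

```python
import time


def within_spec(elapsed_s, lower_s, upper_s):
    """True when a measured duration lies within the specified range.

    For model creation time, both a much shorter and a much longer
    duration than usual are suspicious, so both limits are checked.
    For response time, lower_s can simply be 0.
    """
    return lower_s <= elapsed_s <= upper_s


def timed(fn, *args):
    """Run fn(*args) and return (result, elapsed seconds)."""
    start = time.perf_counter()
    result = fn(*args)
    return result, time.perf_counter() - start
```

For instance, a training call wrapped in `timed` could be followed by `within_spec(elapsed, 60, 180)` to produce the "Within specification" result.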
As described above, the diagnostic report lists the diagnosis results for predetermined diagnostic items, so the overall status of the prediction model can be grasped easily. In the example of FIG. 5, a diagnostic item whose diagnosis result is determined to be abnormal may be highlighted, for example by changing the color of the text or background, to alert the user to the abnormality.
(2) Analysis Result Screen
Next, the analysis result screen is described in detail. Of the diagnostic items listed in the diagnostic report, the analysis result screen displays the diagnosis results for diagnostic items 001 to 004, which relate specifically to the prediction model, together with graphs showing their numerical basis.
In the following description of the analysis result screen, the prediction model is assumed to predict real estate rents. Specifically, the prediction model predicts the monthly rent of a rental property based on input data such as the size of the property, the floor plan, and the time required from the station. The prediction model is a trained model that has been trained using the size, floor plan, time required from the station, and so on of actual rental properties as input data, and the actual rent of those properties as correct answer data.
FIG. 6 shows an example of the analysis result screen. The analysis result screen 200 includes, as display items, the input data, the prediction data, and the error rate indicating the prediction accuracy. As noted above, there are multiple pieces of input data, such as the size of the property, the floor plan, and the time required from the station; the user can select one of them for display by operating the pull-down menu 81. The example of FIG. 6 shows the analysis results in a steady operating state, that is, a state in which the input data, the prediction data, and the error rate are all normal.
The analysis result screen 200 includes graphs 201 and 204 for the input data, graphs 202 and 205 for the prediction data, and graphs 203 and 206 for the error rate. Graph 201 is a box plot showing the transition of the input data. In graph 201, the horizontal axis indicates the date on which the input data was input, and the vertical axis indicates the time required from the station, which is one of the input data. For each date on which input data was input, the box plot in graph 201 shows the minimum, maximum, median, mean, and interquartile range (the distribution of values falling between 25% and 75% of the whole) of the time required from the station.
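The per-day statistics behind a box plot such as graph 201 could be computed as in the sketch below. It uses only the Python standard library; the function names and the `(day, value)` record layout are assumptions for illustration, and the "inclusive" quartile method is one of several conventions, chosen here arbitrarily.

```python
from statistics import mean, median, quantiles


def box_stats(values):
    """Summary statistics for one box: needs at least two values."""
    q1, _, q3 = quantiles(values, n=4, method="inclusive")
    return {
        "min": min(values),
        "q1": q1,                  # 25th percentile
        "median": median(values),
        "mean": mean(values),
        "q3": q3,                  # 75th percentile
        "max": max(values),
    }


def daily_box_stats(records):
    """Group (day, value) records by day and summarize each day."""
    by_day = {}
    for day, value in records:
        by_day.setdefault(day, []).append(value)
    return {day: box_stats(vals) for day, vals in sorted(by_day.items())}
```

The same helper applies unchanged to the prediction data and the error rate, since those graphs show the same statistics.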
Graph 202 is a box plot of the prediction data. The horizontal axis indicates the date on which the prediction data was obtained, and the vertical axis indicates the rent, which is the prediction result. For each date on which prediction data was obtained, the box plot in graph 202 shows the minimum, maximum, median, mean, and interquartile range of the rent.
Graph 203 is a box plot of the error rate. The horizontal axis indicates the date on which the prediction data was obtained, and the vertical axis indicates the error rate of the prediction data. For each date on which prediction was performed, the box plot in graph 203 shows the minimum, maximum, median, mean, and interquartile range of the error rate, which indicates the prediction accuracy.
Graph 204 is a histogram of the time required from the station, which is one of the input data, and shows the distribution of that time over a certain period, for example the period shown in graph 201. The horizontal axis indicates the time required from the station, and the vertical axis indicates the frequency. In graph 204, the diagonally hatched bins show the distribution of the values used as input data when training the prediction model, and the gray bins show the distribution of the values input as input data when predicting with the prediction model. The value "8.03" in the figure is the mean of the input data at training time, and the value "5.91" is the mean of the input data at prediction time.
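The overlay in graph 204 amounts to binning two data sets, training-time and prediction-time values, against the same bin edges. A minimal sketch follows; the function name `histogram`, the half-open binning rule, and the choice to ignore values outside the edges are assumptions, not details taken from the original.

```python
def histogram(values, bin_edges):
    """Count values per bin for ascending bin_edges.

    Bins are half-open [edge_i, edge_i+1), except the last bin, which
    also includes its right edge. Values outside the edges are ignored.
    """
    counts = [0] * (len(bin_edges) - 1)
    last = len(counts) - 1
    for v in values:
        for i in range(len(counts)):
            in_bin = bin_edges[i] <= v < bin_edges[i + 1]
            if in_bin or (i == last and v == bin_edges[-1]):
                counts[i] += 1
                break
    return counts


def compare_distributions(train_values, predict_values, bin_edges):
    """Return the two sets of counts to draw on the same axes."""
    return {
        "training": histogram(train_values, bin_edges),
        "prediction": histogram(predict_values, bin_edges),
    }
```

Using shared bin edges is what makes the hatched and gray bars directly comparable.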
Graph 205 is a histogram of the prediction data and shows the distribution of rent over a certain period, for example the period shown in graph 202. The horizontal axis indicates the rent, and the vertical axis indicates the frequency. In graph 205, the diagonally hatched bins show the distribution of the rents used as correct answer data when training the prediction model, and the gray bins show the distribution of the rents obtained as prediction results when predicting with the prediction model. As in graph 204, the mean of the correct answer data at training time, "90,364", and the mean of the prediction data at prediction time, "70,036", are shown.
Graph 206 is a histogram of the error rate and shows the distribution of the error rate over a certain period, for example the period shown in graph 203. The horizontal axis indicates the error rate, and the vertical axis indicates the frequency. In graph 206, the diagonally hatched bins show the distribution of the error rates calculated when training the prediction model, and the gray bins show the distribution of the error rates calculated when predicting with the prediction model. As in graph 204, the mean error rate at training time, "24.977", and the mean error rate at prediction time, "13.15", are shown.
In this way, the analysis result screen shows the analysis results of the input data, the prediction data, and the error rate (prediction accuracy) over a certain period in graphs such as box plots and histograms, so the user can grasp the state of the input data, the prediction data, and the error rate based on concrete numerical values. In addition, by displaying the training-time values and the prediction-time values of the input data, the prediction data, and the error rate in the same graph, as in graphs 204 to 206, the degree and tendency of change in each kind of data since the model went into operation can be visualized.
FIG. 7 shows an example of the analysis result screen on another day. As in the example of FIG. 6, the analysis result screen 210 includes box plots 211 to 213 and histograms 214 to 216 for the input data, the prediction data, and the error rate. Each graph is read in the same way as in the example of FIG. 6. FIG. 7 shows the analysis results when the input data contains an outlier. In the example of FIG. 7, "a value less than 0" is assumed to be set as an outlier for the input data "time required from the station". In FIG. 7, the minimum value of the time required from the station on August 13 is "-1", which is an outlier. A circle AL1 is therefore displayed in graph 211 as an alert. In graph 214 as well, the minimum value of the time required from the station is "-1". Thus, when the input data contains an outlier, the analysis result screen displays an alert to that effect.
FIG. 8 shows an example of the analysis result screen on yet another day. As in the example of FIG. 6, the analysis result screen 220 includes box plots 221 to 223 and histograms 224 to 226 for the input data, the prediction data, and the error rate. Each graph is read in the same way as in the example of FIG. 6. In the example of FIG. 8, first, the input data is abnormal. Here, for the input data "time required from the station", the case where "the difference between the mean at training time and the mean at prediction time is 3 or more" is assumed to be set as abnormal. As shown in graph 224, the difference between the training-time mean "8.03" and the prediction-time mean "14.13" of the input data "time required from the station" is 3 or more, so the input data is determined to be abnormal. An arrow AL2 is therefore displayed in graph 224 as an alert to that effect.
In the example of FIG. 8, the prediction data is also abnormal. Here, the prediction data is assumed to be set as abnormal when it is "260,000 or more" or when "the difference between the mean at training time and the mean at prediction time is 30,000 or more". In the example of FIG. 8, as shown in graph 222, the maximum value of the prediction data on September 12 exceeds 260,000, and a circle AL3 is displayed as an alert to that effect.
Furthermore, in the example of FIG. 8, the error rate is abnormal. Here, an error rate of "40% or more" is assumed to be set as abnormal. In the example of FIG. 8, as shown in graph 223, there are days on which the error rate exceeds 40%, and a rectangle AL4 is displayed as an alert to that effect.
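The three alert rules assumed for FIG. 8 can be written down compactly. The sketch below uses the thresholds stated in the text (3, 260,000, 30,000, and 40%) as default parameter values; the function names are hypothetical, and how the real device stores or configures these thresholds is not specified.

```python
def mean_shift_alert(train_mean, current_mean, threshold):
    """True when the training and prediction means differ by threshold or more."""
    return abs(train_mean - current_mean) >= threshold


def input_alert(train_mean, current_mean, shift_limit=3.0):
    """Input data rule: mean shift of 3 or more."""
    return mean_shift_alert(train_mean, current_mean, shift_limit)


def prediction_alert(predictions, train_mean, current_mean,
                     value_limit=260_000, shift_limit=30_000):
    """Prediction data rule: any value at or above the limit, or a large mean shift."""
    too_large = any(p >= value_limit for p in predictions)
    return too_large or mean_shift_alert(train_mean, current_mean, shift_limit)


def error_rate_alert(error_rates, limit=40.0):
    """Error rate rule: any error rate of 40% or more."""
    return any(e >= limit for e in error_rates)
```

With the means shown in the figures, `input_alert(8.03, 14.13)` fires (FIG. 8), while `input_alert(8.03, 5.91)` does not (FIG. 6).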
In this way, the analysis result screen displays alerts for outliers and abnormal values in the input data, abnormal values in the prediction data, and abnormal values in the error rate (prediction accuracy), so the user can easily recognize abnormal states of the input data, the prediction data, and the error rate.
In the example of FIG. 8, the alerts regarding the prediction data and the error rate (circle AL3 and rectangle AL4) are displayed in the box plot graphs 222 and 223, but they may instead be displayed in the histogram graphs 225 and 226. That is, alerts regarding the input data, the prediction data, and the error rate need only be displayed in one or more of the multiple graphs.
In FIGS. 6 to 8, the input data, the prediction data, and the error rate are shown with box plots and histograms, but line graphs may be used instead of box plots. FIGS. 9 to 11 show examples in which the input data, the prediction data, and the error rate are displayed using line graphs and histograms.
Specifically, on the analysis result screen 200a shown in FIG. 9, line graphs 201a to 203a are displayed instead of the box plots 201 to 203 shown in FIG. 6. On the analysis result screen 210a shown in FIG. 10, line graphs 211a to 213a are displayed instead of the box plots 211 to 213 shown in FIG. 7. In graph 211a, a circle AL5 is displayed as an alert indicating an outlier in the input data.
On the analysis result screen 220a shown in FIG. 11, line graphs 221a to 223a are displayed instead of the box plots 221 to 223 shown in FIG. 8. On the analysis result screen 220a, as in FIG. 8, an arrow AL6 is displayed as an alert in graph 224. In addition, a circle AL7 is displayed in graph 222a as an alert indicating an abnormal value in the prediction data, and a rectangle AL8 is displayed in graph 223a as an alert indicating an abnormal value in the error rate.
In the above examples, box plots or line graphs are used on the analysis result screen, but other types of graphs may be used as long as they show the input data, the prediction data, and the error rate in time series based on an ordering such as date and time.
(Modification)
In the above examples, as shown in FIGS. 6 to 11, the input data, the prediction data, and the error rate for one prediction model are displayed on one analysis result screen. When there are multiple prediction models, however, the input data, the prediction data, and the error rate for the multiple prediction models may be displayed simultaneously on one analysis result screen. For example, when there are two target models, graphs of the input data, the prediction data, and the error rate as shown in FIG. 6 may be prepared for each model and displayed side by side on one analysis result screen. Alternatively, on one analysis result screen, the values of the two models may be superimposed in each graph of the input data, the prediction data, and the error rate, distinguished for example by different colors. In this case, whether to use box plots, line graphs, or histograms may be decided freely.
<Second embodiment>
FIG. 12 is a block diagram showing the functional configuration of the information processing device according to the second embodiment. The information processing device 70 includes input data acquisition means 71, input data evaluation means 72, prediction data acquisition means 73, prediction data evaluation means 74, prediction accuracy acquisition means 75, and display means 76.
FIG. 13 is a flowchart of the processing performed by the information processing device according to the second embodiment. First, the input data acquisition means 71 acquires a plurality of input data (step S71). Next, the input data evaluation means 72 detects outliers and abnormal values in the input data (step S72). Next, the prediction data acquisition means 73 acquires the prediction data generated from the input data using the trained model (step S73). Next, the prediction data evaluation means 74 detects abnormal values in the prediction data (step S74). Next, the prediction accuracy acquisition means 75 acquires the prediction accuracy of the trained model (step S75). Steps S71 to S72, steps S73 to S74, and step S75 may be performed in an order different from the above, or may be performed in parallel. Then, the display means 76 displays diagnostic information including the presence or absence of outliers and abnormal values in the input data, the presence or absence of abnormal values in the prediction data, and the prediction accuracy (step S76).
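The flow of steps S71 to S76 can be sketched end to end for a single numeric input feature. Everything below is an illustrative assumption rather than the device's actual implementation: the function `diagnose`, the range-based abnormal value checks, the use of mean absolute percentage error as the "error rate", and the `toy_model` stand-in for a trained model. The sketch also calls the model itself, whereas the device only acquires prediction data that was already generated.

```python
def diagnose(inputs, model, actuals, input_range, prediction_range):
    """Sketch of steps S71 to S76; returns the diagnostic information."""
    # S71: acquire the input data (passed in directly here).
    # S72: detect outliers (negative values) and abnormal values
    #      (values outside the range seen at training time).
    lo, hi = input_range
    input_outlier = any(x < 0 for x in inputs)
    input_abnormal = any(not (lo <= x <= hi) for x in inputs)

    # S73: acquire the prediction data produced by the trained model.
    predictions = [model(x) for x in inputs]

    # S74: detect abnormal values in the prediction data.
    p_lo, p_hi = prediction_range
    prediction_abnormal = any(not (p_lo <= p <= p_hi) for p in predictions)

    # S75: acquire the prediction accuracy (assumed here to be the
    #      mean absolute percentage error against the actual values).
    error_rate = 100.0 * sum(
        abs(p - a) / a for p, a in zip(predictions, actuals)
    ) / len(predictions)

    # S76: the diagnostic information that would be displayed.
    return {
        "input_outlier": input_outlier,
        "input_abnormal": input_abnormal,
        "prediction_abnormal": prediction_abnormal,
        "error_rate": error_rate,
    }


def toy_model(minutes_from_station):
    """Hypothetical stand-in for a trained rent model."""
    return 100_000 - 2_000 * minutes_from_station


info = diagnose([5, 10], toy_model, [100_000, 80_000], (1, 30), (30_000, 260_000))
```

Because steps S71 to S72, S73 to S74, and S75 are independent of one another, the three groups in the function body could be reordered or run in parallel without changing the result.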
According to the information processing device 70 of the second embodiment, diagnostic information regarding multiple factors that affect prediction accuracy can be created and presented for a model in operation.
Some or all of the above embodiments may also be described as in the following appendices, but are not limited to them.
(Appendix 1)
An information processing device comprising:
input data acquisition means for acquiring a plurality of input data;
input data evaluation means for detecting outliers and abnormal values in the input data;
prediction data acquisition means for acquiring prediction data generated from the input data using a trained model;
prediction data evaluation means for detecting abnormal values in the prediction data;
prediction accuracy acquisition means for acquiring the prediction accuracy of the trained model; and
display means for displaying diagnostic information including the presence or absence of outliers and abnormal values in the input data, the presence or absence of abnormal values in the prediction data, and the prediction accuracy.
(Appendix 2)
The information processing device according to Appendix 1, wherein the diagnostic information includes:
a diagnostic report describing the presence or absence of outliers and abnormal values in the input data, the presence or absence of abnormal values in the prediction data, and the prediction accuracy; and
an analysis result screen displaying the input data, the prediction data, and the prediction accuracy on respective graphs.
(Appendix 3)
The information processing device according to Appendix 2, wherein the analysis result screen displays the presence or absence of outliers and abnormal values in the input data, the presence or absence of abnormal values in the prediction data, and the prediction accuracy on respective time-series graphs.
(Appendix 4)
The information processing device according to Appendix 2 or 3, wherein the analysis result screen displays the presence or absence of outliers and abnormal values in the input data, the presence or absence of abnormal values in the prediction data, and the prediction accuracy on respective histograms.
(Appendix 5)
The information processing device according to Appendix 4, wherein, for at least one of the input data, the prediction data, and the prediction accuracy, the analysis result screen displays a value at the time of training of the trained model and a value at the time of prediction using the trained model on the same graph.
(Appendix 6)
The information processing device according to any one of Appendices 2 to 5, wherein the analysis result screen displays, side by side, a plurality of graphs showing the presence or absence of outliers and abnormal values in the input data for each of a plurality of trained models, a plurality of graphs showing the presence or absence of abnormal values in the prediction data for each of the plurality of trained models, and a plurality of graphs showing the prediction accuracy for each of the plurality of trained models.
(Appendix 7)
The information processing device according to any one of Appendices 2 to 5, wherein the analysis result screen includes one graph that simultaneously displays the presence or absence of outliers and abnormal values in the input data for a plurality of trained models, one graph that simultaneously displays the presence or absence of abnormal values in the prediction data for each of the plurality of trained models, and one graph that simultaneously displays the prediction accuracy for each of the plurality of trained models.
(Appendix 8)
The information processing device according to any one of Appendices 2 to 7, wherein the diagnostic report includes a description of a countermeasure and a priority of the countermeasure when the input data contains an outlier or an abnormal value, when the prediction data contains an abnormal value, or when the prediction accuracy is less than or equal to a predetermined value.
(Appendix 9)
An information processing method comprising:
acquiring a plurality of input data;
detecting outliers and abnormal values in the input data;
acquiring prediction data generated from the input data using a trained model;
detecting abnormal values in the prediction data;
acquiring prediction accuracy of the trained model; and
displaying diagnostic information including the presence or absence of outliers and abnormal values in the input data, the presence or absence of abnormal values in the prediction data, and the prediction accuracy.
(Appendix 10)
A recording medium recording a program that causes a computer to execute processing comprising:
acquiring a plurality of input data;
detecting outliers and abnormal values in the input data;
acquiring prediction data generated from the input data using a trained model;
detecting abnormal values in the prediction data;
acquiring prediction accuracy of the trained model; and
displaying diagnostic information including the presence or absence of outliers and abnormal values in the input data, the presence or absence of abnormal values in the prediction data, and the prediction accuracy.
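Appendices 4 and 5 describe histograms in which values from training time and values from prediction time appear on the same graph. One way to make the two distributions comparable is to bin both with shared bin edges. The sketch below assumes equal-width bins; the function name `shared_histogram` and its parameters are hypothetical and are not taken from the disclosure.

```python
from typing import Sequence, Tuple


def shared_histogram(
    training: Sequence[float],
    prediction: Sequence[float],
    bins: int = 5,
) -> Tuple[list[float], list[int], list[int]]:
    """Bin training-time and prediction-time values with the same edges.

    Returns (edges, training_counts, prediction_counts) so that both
    series can be drawn on one histogram.
    """
    low = min(min(training), min(prediction))
    high = max(max(training), max(prediction))
    # Fall back to a width of 1.0 when every value is identical.
    width = (high - low) / bins or 1.0

    def counts(values: Sequence[float]) -> list[int]:
        out = [0] * bins
        for v in values:
            # Clamp so that the maximum value falls into the last bin.
            index = min(int((v - low) / width), bins - 1)
            out[index] += 1
        return out

    edges = [low + i * width for i in range(bins + 1)]
    return edges, counts(training), counts(prediction)
```

A shift between the two count lists, for example prediction-time values piling up in bins that were empty at training time, is the kind of difference such a combined graph is meant to expose.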
Although the present disclosure has been described above with reference to the embodiments and examples, the present disclosure is not limited to the above embodiments and examples. Various changes that can be understood by those skilled in the art can be made to the configuration and details of the present disclosure within the scope of the present disclosure.
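The diagnostic report of Appendix 8 lists countermeasures with priorities when an input anomaly, a prediction anomaly, or low prediction accuracy is found. The sketch below shows one possible shape for that logic. The countermeasure texts, the priority numbers (1 is handled first), and the names `Countermeasure` and `build_report` are all hypothetical; the disclosure does not fix them in this passage.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Countermeasure:
    priority: int  # assumed convention: 1 is handled first
    trigger: str   # which check raised the item
    action: str    # hypothetical countermeasure text


def build_report(
    has_input_anomaly: bool,
    has_prediction_anomaly: bool,
    accuracy: float,
    accuracy_threshold: float,
) -> list[Countermeasure]:
    """Collect countermeasures for each triggered condition, by priority."""
    items: list[Countermeasure] = []
    if accuracy <= accuracy_threshold:
        items.append(Countermeasure(
            3, "prediction accuracy", "Consider retraining the model."))
    if has_prediction_anomaly:
        items.append(Countermeasure(
            2, "prediction data", "Review the model output for the flagged period."))
    if has_input_anomaly:
        items.append(Countermeasure(
            1, "input data", "Check the data source and preprocessing."))
    return sorted(items, key=lambda item: item.priority)
```

An empty list corresponds to a report with no countermeasures, which is the case when none of the three conditions holds.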
2 Prediction device
12 Processor
16 Display device
21 Input data evaluation unit
22 Prediction data evaluation unit
23 Prediction accuracy evaluation unit
24 Display data generation unit
100 Monitoring device
200, 200a, 210, 210a, 220, 220a Analysis result screen
Claims (10)
- An information processing device comprising:
input data acquisition means for acquiring a plurality of input data;
input data evaluation means for detecting outliers and abnormal values in the input data;
prediction data acquisition means for acquiring prediction data generated from the input data using a trained model;
prediction data evaluation means for detecting abnormal values in the prediction data;
prediction accuracy acquisition means for acquiring prediction accuracy of the trained model; and
display means for displaying diagnostic information including the presence or absence of outliers and abnormal values in the input data, the presence or absence of abnormal values in the prediction data, and the prediction accuracy.
- The information processing device according to claim 1, wherein the diagnostic information includes:
a diagnostic report describing the presence or absence of outliers and abnormal values in the input data, the presence or absence of abnormal values in the prediction data, and the prediction accuracy; and
an analysis result screen displaying the input data, the prediction data, and the prediction accuracy on respective graphs.
- The information processing device according to claim 2, wherein the analysis result screen displays the presence or absence of outliers and abnormal values in the input data, the presence or absence of abnormal values in the prediction data, and the prediction accuracy on respective time-series graphs.
- The information processing device according to claim 2 or 3, wherein the analysis result screen displays the presence or absence of outliers and abnormal values in the input data, the presence or absence of abnormal values in the prediction data, and the prediction accuracy on respective histograms.
- The information processing device according to claim 4, wherein, for at least one of the input data, the prediction data, and the prediction accuracy, the analysis result screen displays a value at the time of training of the trained model and a value at the time of prediction using the trained model on the same graph.
- The information processing device according to any one of claims 2 to 5, wherein the analysis result screen displays, side by side, a plurality of graphs showing the presence or absence of outliers and abnormal values in the input data for each of a plurality of trained models, a plurality of graphs showing the presence or absence of abnormal values in the prediction data for each of the plurality of trained models, and a plurality of graphs showing the prediction accuracy for each of the plurality of trained models.
- The information processing device according to any one of claims 2 to 5, wherein the analysis result screen includes one graph that simultaneously displays the presence or absence of outliers and abnormal values in the input data for a plurality of trained models, one graph that simultaneously displays the presence or absence of abnormal values in the prediction data for each of the plurality of trained models, and one graph that simultaneously displays the prediction accuracy for each of the plurality of trained models.
- The information processing device according to any one of claims 2 to 7, wherein the diagnostic report includes a description of a countermeasure and a priority of the countermeasure when the input data contains an outlier or an abnormal value, when the prediction data contains an abnormal value, or when the prediction accuracy is less than or equal to a predetermined value.
- An information processing method comprising:
acquiring a plurality of input data;
detecting outliers and abnormal values in the input data;
acquiring prediction data generated from the input data using a trained model;
detecting abnormal values in the prediction data;
acquiring prediction accuracy of the trained model; and
displaying diagnostic information including the presence or absence of outliers and abnormal values in the input data, the presence or absence of abnormal values in the prediction data, and the prediction accuracy.
- A recording medium recording a program that causes a computer to execute processing comprising:
acquiring a plurality of input data;
detecting outliers and abnormal values in the input data;
acquiring prediction data generated from the input data using a trained model;
detecting abnormal values in the prediction data;
acquiring prediction accuracy of the trained model; and
displaying diagnostic information including the presence or absence of outliers and abnormal values in the input data, the presence or absence of abnormal values in the prediction data, and the prediction accuracy.
Priority Applications (2)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
JP2023548001A JPWO2023042301A5 (en) | 2021-09-15 | Information processing device, information processing method, and program | |
PCT/JP2021/033932 WO2023042301A1 (en) | 2021-09-15 | 2021-09-15 | Information processing device, information processing method, and recording medium |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
PCT/JP2021/033932 WO2023042301A1 (en) | 2021-09-15 | 2021-09-15 | Information processing device, information processing method, and recording medium |
Publications (1)
Publication Number | Publication Date |
---|---|
WO2023042301A1 (en) | 2023-03-23 |
Family ID=85602005
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
PCT/JP2021/033932 WO2023042301A1 (en) | 2021-09-15 | 2021-09-15 | Information processing device, information processing method, and recording medium |
Country Status (1)
Country | Link |
---|---|
WO (1) | WO2023042301A1 (en) |
Cited By (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN118283268A (en) * | 2024-05-29 | 2024-07-02 | 深圳市莫尼迪科技有限责任公司 | Image compression quality prediction method and system |
Citations (3)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
WO2017217050A1 (en) * | 2016-06-16 | 2017-12-21 | ソニー株式会社 | Information processing device, information processing method and storage medium |
JP2020014799A (en) * | 2018-07-27 | 2020-01-30 | コニカミノルタ株式会社 | X-ray image object recognition system |
US20210097433A1 (en) * | 2019-09-30 | 2021-04-01 | Amazon Technologies, Inc. | Automated problem detection for machine learning models |
- 2021-09-15 WO PCT/JP2021/033932 patent/WO2023042301A1/en active Application Filing
Also Published As
Publication number | Publication date |
---|---|
JPWO2023042301A1 (en) | 2023-03-23 |
Legal Events
| Date | Code | Title | Description |
|---|---|---|---|
| | 121 | Ep: the epo has been informed by wipo that ep was designated in this application | Ref document number: 21957483; Country of ref document: EP; Kind code of ref document: A1 |
| | WWE | Wipo information: entry into national phase | Ref document number: 2023548001; Country of ref document: JP |
| | NENP | Non-entry into the national phase | Ref country code: DE |
| | 122 | Ep: pct application non-entry in european phase | Ref document number: 21957483; Country of ref document: EP; Kind code of ref document: A1 |