
TWI860923B - Model reconstruction method and system - Google Patents


Info

Publication number
TWI860923B
Authority
TW
Taiwan
Prior art keywords
training
model
demand
evaluated
result
Prior art date
Application number
TW112151066A
Other languages
Chinese (zh)
Inventor
王俊權
黃逸琴
周芳儀
戴君翰
Original Assignee
中國信託商業銀行股份有限公司
Priority date
Filing date
Publication date
Application filed by 中國信託商業銀行股份有限公司
Priority to TW112151066A
Application granted
Publication of TWI860923B

Landscapes

  • Management, Administration, Business Operations System, And Electronic Commerce (AREA)

Abstract

A model reconstruction system includes a processing unit that performs the following operations: obtaining a target demand forecasting model and a plurality of corresponding to-be-updated demand forecast results based on a plurality of training data sets, a plurality of test data sets, and a plurality of training combinations; obtaining a first tier-arrangement result based on a first tier-arrangement scheme and the to-be-updated demand forecast results; obtaining a second tier-arrangement result, using a to-be-compared demand forecasting model, based on the test data sets and a second tier-arrangement scheme; obtaining a first indicator value and a second indicator value from the first tier-arrangement result and the second tier-arrangement result, respectively; and determining, based on the first indicator value and the second indicator value, whether to replace the to-be-compared demand forecasting model with the target demand forecasting model.

Description

Model reconstruction method and system

The present invention relates to a reconstruction method and system, and more particularly to a model reconstruction method and system.

When a machine learning model that has been deployed online loses validity noticeably because real-world conditions have changed, the deployed model needs to be optimized. However, if a new model is rebuilt by changing the training data or by choosing a different machine learning algorithm, its predictions may differ considerably from those of the old model and cannot be measured against a unified standard, so the validity of the two models cannot be compared.

How to improve existing model optimization approaches so that new and old models can be measured against a unified standard has therefore become one of the issues the related technical field seeks to solve.

An object of the present invention is therefore to provide a model reconstruction method and system that can overcome at least one shortcoming of the prior art.

Accordingly, the model reconstruction method provided by the present invention is suitable for optimizing a demand forecasting model. The demand forecasting model obtains a demand forecast result from to-be-analyzed interaction record data relating to the interactions between a customer to be analyzed and a financial institution. The demand forecast result includes a probability value indicating how likely the customer to be analyzed is to purchase a financial product. The method is executed by a computer system that stores a plurality of customer behavior data sets corresponding to a plurality of customers, and a plurality of training combinations. Each customer behavior data set includes a plurality of interaction record data relating to the corresponding customer's interactions with the financial institution in a plurality of different time periods, and a plurality of purchase results indicating whether the corresponding customer purchased the financial product in those time periods. Each training combination includes a corresponding machine learning algorithm and a model parameter set. The model reconstruction method includes the following steps:
(A) selecting a plurality of training data sets from the customer behavior data sets according to a first selection rule, and selecting a plurality of test data sets from the customer behavior data sets according to a second selection rule, wherein each training data set includes at least one interaction training data and a purchase training result selected from the interaction record data and the purchase results of the corresponding customer, and each test data set includes at least one interaction test data and a purchase test result selected from the interaction record data and the purchase results of the corresponding customer;
(B) obtaining a target demand forecasting model based on the training data sets, the test data sets, and the training combinations;
(C) for each test data set, using the target demand forecasting model to obtain, from that test data set, a to-be-updated demand forecast result that includes a probability value of the corresponding customer purchasing the financial product;
(D) grouping all the to-be-updated demand forecast results obtained in step (C) according to a first tier-arrangement scheme that corresponds to a plurality of first grouping tier thresholds, to obtain a first tier-arrangement result;
(E) for each test data set, using a to-be-compared demand forecasting model to obtain, from that test data set, a current demand forecast result that includes a probability value of the corresponding customer purchasing the financial product;
(F) grouping all the current demand forecast results obtained in step (E) according to a second tier-arrangement scheme that corresponds to a plurality of second grouping tier thresholds, to obtain a second tier-arrangement result;
(G) obtaining, from the first tier-arrangement result and the second tier-arrangement result respectively, a first indicator value corresponding to the target demand forecasting model and a second indicator value corresponding to the to-be-compared demand forecasting model;
(H) determining whether the first indicator value is greater than the second indicator value; and
(I) when the first indicator value is determined to be greater than the second indicator value, replacing the to-be-compared demand forecasting model with the target demand forecasting model.

Accordingly, the model reconstruction system provided by the present invention is suitable for optimizing a demand forecasting model. The demand forecasting model obtains a demand forecast result from to-be-analyzed interaction record data relating to the interactions between a customer to be analyzed and a financial institution. The demand forecast result includes a probability value indicating how likely the customer to be analyzed is to purchase a financial product. The model reconstruction system includes a storage unit and a processing unit.

The storage unit stores a plurality of customer behavior data sets corresponding to a plurality of customers, and a plurality of training combinations. Each customer behavior data set includes a plurality of interaction record data relating to the corresponding customer's interactions with the financial institution in a plurality of different time periods, and a plurality of purchase results indicating whether the corresponding customer purchased the financial product in those time periods. Each training combination includes a corresponding machine learning algorithm and a model parameter set.

The processing unit is in signal connection with the storage unit. It selects a plurality of training data sets from the customer behavior data sets according to a first selection rule, and selects a plurality of test data sets from the customer behavior data sets according to a second selection rule. Each training data set includes at least one interaction training data and a purchase training result selected from the interaction record data and the purchase results of the corresponding customer, and each test data set includes at least one interaction test data and a purchase test result selected from the interaction record data and the purchase results of the corresponding customer. The processing unit obtains a target demand forecasting model based on the training data sets, the test data sets, and the training combinations. For each test data set, it uses the target demand forecasting model to obtain a to-be-updated demand forecast result that includes a probability value of the corresponding customer purchasing the financial product. It groups all the to-be-updated demand forecast results according to a first tier-arrangement scheme that corresponds to a plurality of first grouping tier thresholds, to obtain a first tier-arrangement result. For each test data set, it uses a to-be-compared demand forecasting model to obtain a current demand forecast result that includes a probability value of the corresponding customer purchasing the financial product. It groups all the current demand forecast results according to a second tier-arrangement scheme that corresponds to a plurality of second grouping tier thresholds, to obtain a second tier-arrangement result. From the first tier-arrangement result and the second tier-arrangement result it obtains, respectively, a first indicator value corresponding to the target demand forecasting model and a second indicator value corresponding to the to-be-compared demand forecasting model. It determines whether the first indicator value is greater than the second indicator value and, when it is, replaces the to-be-compared demand forecasting model with the target demand forecasting model.

The effect of the present invention is as follows. All the to-be-updated demand forecast results of the target demand forecasting model are grouped to obtain the first tier-arrangement result, and all the current demand forecast results of the to-be-compared demand forecasting model are grouped to obtain the second tier-arrangement result, using a grouping approach with the same tier standard. The probability values in both arrangement results are therefore divided and ranked by the same tier standard. Even if the distribution range of the probability values produced by different demand forecasting models rises or falls sharply because of external factors, the models can still be judged against a common evaluation standard.

Before the present invention is described in detail, it should be noted that similar elements are denoted by the same reference numerals in the following description.

Referring to FIG. 1, a model reconstruction system 1 according to an embodiment of the present invention is suitable for optimizing a demand forecasting model. The demand forecasting model obtains a demand forecast result from to-be-analyzed interaction record data relating to the interactions between a customer to be analyzed and a financial institution. The demand forecast result includes a probability value indicating how likely the customer to be analyzed is to purchase a financial product (for example, but not limited to, a fund, an insurance policy, or foreign currency). In this embodiment, the demand forecasting model predicts the probability that each customer to be analyzed will purchase the financial product after a specified time interval N. Each to-be-analyzed interaction record data includes, for example but not limited to, the number and frequency of the corresponding customer's transactions with the financial institution and the amount of each transaction, as well as how often the customer browses the financial institution's web pages and how long each page view lasts. The model reconstruction system 1 includes a storage unit 11 and a processing unit 12 in signal connection with the storage unit 11.

The storage unit 11 stores a plurality of customer behavior data sets corresponding to a plurality of customers, and a plurality of training combinations. Each customer behavior data set includes a plurality of interaction record data relating to the corresponding customer's interactions with the financial institution in a plurality of different time periods (for example, each month), and a plurality of purchase results indicating whether the corresponding customer purchased the financial product in those time periods. Each training combination includes a corresponding machine learning algorithm (for example, but not limited to, XGBoost, LightGBM, or CatBoost) and a model parameter set.
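One possible in-memory shape for the stored records is sketched below. The class names, field names, and the use of month numbers as period keys are assumptions made for illustration; the source only states what each record contains, not how it is laid out.

```python
from dataclasses import dataclass, field
from typing import Any


@dataclass
class CustomerBehaviorDataSet:
    """Hypothetical container for one customer's stored behavior data."""
    customer_id: str
    # Interaction record data keyed by time period (here: a month number).
    interactions: dict[int, dict[str, float]] = field(default_factory=dict)
    # Purchase result per time period: True if the product was purchased.
    purchases: dict[int, bool] = field(default_factory=dict)


@dataclass
class TrainingCombination:
    """Hypothetical pairing of an algorithm name with a model parameter set."""
    algorithm: str
    params: dict[str, Any] = field(default_factory=dict)


# Example records (values are made up for illustration only).
customer = CustomerBehaviorDataSet("customer-1")
customer.interactions[10] = {"transaction_count": 3.0, "page_views": 12.0}
customer.purchases[10] = True
combo = TrainingCombination("XGBoost", {"max_depth": 4})
```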

Referring to FIG. 1 and FIG. 2, the following illustrates by way of example how the processing unit 12 of the embodiment executes a model reconstruction procedure. The model reconstruction procedure includes steps 201 to 210.

First, in step 201, the processing unit 12 determines whether the current time T has reached a predetermined update time. The update time is, for example, the end of each month, so that the processing unit 12 executes the model reconstruction procedure once a month and the model is rebuilt automatically on a regular schedule. When the processing unit 12 determines that the current time T has reached the update time, the flow proceeds to step 202; otherwise, the flow returns to step 201.
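The month-end check of step 201 could look like the following minimal sketch. The function name `is_update_time` and the choice of the last calendar day as "the end of each month" are assumptions; the source gives month-end only as an example of an update time.

```python
import calendar
from datetime import date


def is_update_time(today: date) -> bool:
    """Return True when `today` is the last calendar day of its month."""
    # monthrange returns (weekday of first day, number of days in month).
    last_day = calendar.monthrange(today.year, today.month)[1]
    return today.day == last_day
```

A scheduler would call this once a day and start the reconstruction procedure only when it returns True.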

In step 202, the processing unit 12 selects a plurality of training data sets from the customer behavior data sets according to a first selection rule, and selects a plurality of test data sets from the customer behavior data sets according to a second selection rule. Each training data set includes at least one interaction training data and a purchase training result selected from the interaction record data and the purchase results of the corresponding customer; each test data set includes at least one interaction test data and a purchase test result selected from the interaction record data and the purchase results of the corresponding customer.

For example, suppose the demand forecasting model predicts the probability that each customer to be analyzed will purchase the financial product one month later (i.e., N = 1), and a predetermined data selection window size K is set to 3 months (i.e., K = 3). The first selection rule is set, for example, to take the customer behavior data within a first time interval [T-N-K, T-N) as the training data sets, and the second selection rule is set, for example, to take the customer behavior data within a second time interval [T-1, T) as the test data sets. Assuming the current time T is the end of November (i.e., T = 11), each training data set uses all interaction record data from July up to but not including October as the at least one interaction training data and the October purchase result as the purchase training result; each test data set uses the October interaction record data as the at least one interaction test data and the November purchase result as the purchase test result.
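The window arithmetic in this example can be sketched as follows. Months are represented as plain integers, so year boundaries are ignored, and the label months (T-N for training, T for testing) are inferred from the single worked example above; both points are assumptions, as is the function name.

```python
def data_windows(T: int, N: int, K: int) -> dict:
    """Return half-open month intervals [start, end) for training and testing."""
    return {
        # First selection rule: interactions in [T-N-K, T-N).
        "train_interactions": (T - N - K, T - N),
        # Purchase result used as the training label (assumed: month T-N).
        "train_label_month": T - N,
        # Second selection rule: interactions in [T-1, T).
        "test_interactions": (T - 1, T),
        # Purchase result used as the test label (assumed: month T).
        "test_label_month": T,
    }


def months_in(interval: tuple) -> list:
    """Expand a half-open interval into the list of months it covers."""
    start, end = interval
    return list(range(start, end))


windows = data_windows(T=11, N=1, K=3)
train_months = months_in(windows["train_interactions"])  # July to September
test_months = months_in(windows["test_interactions"])    # October
```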

In step 203, the processing unit 12 obtains a target demand forecasting model based on the training data sets, the test data sets, and the training combinations. More specifically, step 203 includes the following sub-steps 301 to 307 (see FIG. 3).

In sub-step 301, the processing unit 12 treats the training combinations as a plurality of training combinations to be evaluated, sets a training count value to one, and sets a total number of training rounds to M (M > 1). More specifically, the total number of training rounds means that M rounds of training are performed over the training combinations, and the training count value indicates which round the current training is.

In sub-step 302, for each training combination to be evaluated, the processing unit 12 selects i to-be-trained data sets from the training data sets and trains on them using that training combination, to obtain a to-be-evaluated demand forecasting model corresponding to that training combination. Here i is the ratio of the training count value to the total number of training rounds, multiplied by the total number of training data sets. For example, assume there are 100 training data sets and the total number of training rounds is 5 (i.e., M = 5). In the first round (training count value = 1), i = 20 (i.e., $i = \frac{1}{5} \times 100$), so the processing unit 12 selects 20 to-be-trained data sets from the training data sets; in the second round (training count value = 2), i = 40 (i.e., $i = \frac{2}{5} \times 100$), so the processing unit 12 selects 40 to-be-trained data sets; and so on.
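The per-round subset size i can be computed as in the sketch below. The function name is hypothetical, and rounding down when the product is not evenly divisible by M is an assumption, since the source only shows an example that divides evenly.

```python
def training_subset_size(round_no: int, total_rounds: int, n_training_sets: int) -> int:
    """i = (round_no / total_rounds) * n_training_sets, rounded down."""
    if not 1 <= round_no <= total_rounds:
        raise ValueError("round_no must be between 1 and total_rounds")
    # Integer arithmetic avoids floating-point rounding surprises.
    return round_no * n_training_sets // total_rounds


# Sizes for the example above: 100 training data sets, M = 5 rounds.
sizes = [training_subset_size(r, 5, 100) for r in range(1, 6)]
```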

In sub-step 303, for each to-be-evaluated demand forecasting model, the processing unit 12 tests the model on the test data sets to obtain a to-be-evaluated indicator value. More specifically, for each test data set, the processing unit 12 uses the to-be-evaluated demand forecasting model to obtain a to-be-evaluated demand forecast result that includes a probability value of the corresponding customer purchasing the financial product. The processing unit 12 then sorts all the to-be-evaluated demand forecast results by probability value from high to low to obtain a sorted overall forecast result. Finally, the processing unit 12 takes, as the to-be-evaluated indicator value, the proportion of customers within a target percentage at the top of the sorted overall forecast result who actually purchased the financial product. For example, assume the sorted overall forecast result contains 100 to-be-evaluated demand forecast results for 100 customers and the target percentage is 30%. The to-be-evaluated indicator value is then the proportion of actual purchasers among the 30 customers whose probability values are in the top 30%. If 27 of those 30 customers actually purchased the financial product, the to-be-evaluated indicator value is 0.9.
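The indicator computed in sub-step 303 is essentially a hit rate among the top-ranked customers. A minimal sketch follows; the function name, the integer percentage argument, and the rule of always keeping at least one customer are assumptions for illustration.

```python
def top_percent_hit_rate(probabilities, purchased, target_percent: int) -> float:
    """Share of actual purchasers among the top `target_percent`% by probability."""
    if len(probabilities) != len(purchased):
        raise ValueError("probabilities and purchased must have the same length")
    # Sort forecast results by probability value, highest first.
    ranked = sorted(zip(probabilities, purchased), key=lambda pair: pair[0], reverse=True)
    # Number of customers inside the target percentage (at least one, assumed).
    top_n = max(1, len(ranked) * target_percent // 100)
    hits = sum(1 for _, bought in ranked[:top_n] if bought)
    return hits / top_n


# Synthetic data mirroring the example: 100 customers, 27 of the top 30 bought.
example_probs = [1 - k / 1000 for k in range(100)]
example_bought = [k < 27 for k in range(100)]
example_rate = top_percent_hit_rate(example_probs, example_bought, 30)
```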

In sub-step 304, the processing unit 12 selects j candidate demand forecasting models from the to-be-evaluated demand forecasting models according to the to-be-evaluated indicator values. Here j is 1/M of the total number of to-be-evaluated demand forecasting models. For example, assume the total number of training rounds is 5 (i.e., M = 5). If 50 to-be-evaluated demand forecasting models are obtained in this round, the processing unit 12 selects 10 of them (i.e., $j = 50 \times \frac{1}{5} = 10$) as candidate demand forecasting models. One way to select them is to take the to-be-evaluated demand forecasting models with the 10 highest to-be-evaluated indicator values.

It is worth noting that, because each round selects only 1/M of that round's to-be-evaluated demand forecasting models as candidates, each round yields fewer candidate demand forecasting models than the one before. In addition, model training does not use all the training data sets at once; each round uses only part of them as the to-be-trained data sets, and that number grows from round to round. Running this way of selecting to-be-trained data sets in parallel with the candidate selection described above helps reduce model screening time below that of the usual approach of training every candidate model on the complete training data.
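To make the claimed saving concrete, the sketch below counts training work as (number of models trained) × (number of training data sets used) summed over rounds. This cost unit, the function name, the floor rounding, and the rule of keeping at least one candidate per round are all assumptions; the source does not quantify the saving.

```python
def staged_training_cost(n_combos: int, n_training_sets: int, total_rounds: int) -> int:
    """Total (models x data sets) work for the staged selection scheme."""
    cost = 0
    candidates = n_combos
    for round_no in range(1, total_rounds + 1):
        # Data sets used in this round grow with the round number.
        subset = round_no * n_training_sets // total_rounds
        cost += candidates * subset
        # Only 1/M of this round's models survive (at least one, assumed).
        candidates = max(1, candidates // total_rounds)
    return cost


staged = staged_training_cost(50, 100, 5)
full = 50 * 100  # every combination trained once on all training data sets
```

Under these assumptions, 50 training combinations, 100 training data sets, and M = 5 give a staged cost of 1700 units against 5000 units for training every combination on the full data.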

In sub-step 305, the processing unit 12 determines whether the training count value equals M. If the training count value does not equal M, the flow proceeds to sub-step 306; if the training count value equals M, the flow proceeds to sub-step 307.

In sub-step 306, the processing unit 12 treats the training combinations corresponding to the j candidate demand forecasting models as j training combinations to be evaluated, increments the training count value by one, and returns the flow to sub-step 302 for the next round of training.

In sub-step 307, the processing unit 12 selects, from the j candidate demand forecasting models, the candidate with the highest to-be-evaluated indicator value as the target demand forecasting model.
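Sub-steps 301 to 307 together resemble a staged elimination loop, which might be sketched as below. The callbacks `train` and `evaluate` are hypothetical placeholders for the actual training and testing routines; taking the first i training data sets, rounding down, and keeping at least one candidate per round are assumptions the source does not specify.

```python
def select_target_model(combos, training_sets, total_rounds, train, evaluate):
    """Return (combo, model, score) of the best candidate after M rounds."""
    if not combos:
        raise ValueError("at least one training combination is required")
    candidates = list(combos)                      # sub-step 301
    for round_no in range(1, total_rounds + 1):
        # Sub-step 302: train each candidate on the first i data sets.
        i = round_no * len(training_sets) // total_rounds
        subset = training_sets[:i]
        models = [train(combo, subset) for combo in candidates]
        # Sub-step 303: one indicator value per trained model.
        scores = [evaluate(model) for model in models]
        # Sub-step 304: keep the j best models, j = 1/M of this round's count.
        j = max(1, len(candidates) // total_rounds)
        ranked = sorted(range(len(candidates)), key=lambda k: scores[k], reverse=True)
        keep = ranked[:j]
        if round_no == total_rounds:               # sub-steps 305 and 307
            best = keep[0]
            return candidates[best], models[best], scores[best]
        candidates = [candidates[k] for k in keep]  # sub-step 306


# Toy usage: a "model" is just (combo, subset size); its score is their product.
result = select_target_model(
    combos=list(range(50)),
    training_sets=list(range(100)),
    total_rounds=5,
    train=lambda combo, subset: (combo, len(subset)),
    evaluate=lambda model: model[0] * model[1],
)
```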

Next, in step 204, for each test data set, the processing unit 12 uses the target demand forecasting model to obtain, from that test data set, a to-be-updated demand forecast result that includes a probability value of the corresponding customer purchasing the financial product.

In step 205, the processing unit 12 groups all the to-be-updated demand forecast results according to a first tier-arrangement scheme that corresponds to a plurality of first grouping tier thresholds, to obtain a first tier-arrangement result. More specifically, step 205 includes the following sub-steps 401 to 404 (see FIG. 4).

In sub-step 401, the processing unit 12 sorts all the to-be-updated demand forecast results by probability value from high to low to obtain a sorted overall forecast result.

In sub-step 402, the processing unit 12 divides the sorted overall forecast result into a plurality of groups according to the purchase results of the test data sets and a plurality of desired grouping accuracies. In each group, the proportion of to-be-updated demand forecast results that match the corresponding purchase results equals one of the desired grouping accuracies. For example, assume the sorted overall forecast result contains 10 to-be-updated demand forecast results for 10 customers (customers #1 to #10, as shown in Table 1 below) and three desired grouping accuracies are predetermined: 80%, 50%, and 30%. Customers #1 to #5 are placed in the group for the desired grouping accuracy of 80%, customers #6 and #7 in the group for 50%, and customers #8 to #10 in the group for 30%.

Table 1

| Customer | To-be-updated demand forecast result | Purchase result | Group |
| --- | --- | --- | --- |
| Customer #1 | 98% | Bought | Desired grouping accuracy 80% |
| Customer #2 | 95% | Bought | Desired grouping accuracy 80% |
| Customer #3 | 88% | Bought | Desired grouping accuracy 80% |
| Customer #4 | 87% | Did not buy | Desired grouping accuracy 80% |
| Customer #5 | 85% | Bought | Desired grouping accuracy 80% |
| Customer #6 | 80% | Did not buy | Desired grouping accuracy 50% |
| Customer #7 | 75% | Bought | Desired grouping accuracy 50% |
| Customer #8 | 74% | Did not buy | Desired grouping accuracy 30% |
| Customer #9 | 70% | Did not buy | Desired grouping accuracy 30% |
| Customer #10 | 64% | Bought | Desired grouping accuracy 30% |

In sub-step 403, for each group, the processing unit 12 takes the probability value of the lowest-ranked to-be-updated demand forecast result in that group as the first grouping tier threshold of that group. Continuing the example of Table 1, 0.85 is the first grouping tier threshold of the group for the desired grouping accuracy of 80%, 0.75 is that of the group for 50%, and 0.64 is that of the group for 30%.
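Given groups that have already been formed in sub-step 402, the threshold extraction of sub-step 403 reduces to taking the smallest probability in each group, as in this sketch. The function name and the dictionary layout keyed by group label are assumptions.

```python
def group_thresholds(grouped: dict) -> dict:
    """Map each group label to the lowest probability value in that group."""
    # In a list sorted from high to low, the last-ranked result is the minimum.
    return {label: min(probs) for label, probs in grouped.items()}


# Groups from Table 1, keyed by desired grouping accuracy.
table1_groups = {
    "80%": [0.98, 0.95, 0.88, 0.87, 0.85],
    "50%": [0.80, 0.75],
    "30%": [0.74, 0.70, 0.64],
}
thresholds = group_thresholds(table1_groups)
```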

In sub-step 404, the processing unit 12 groups all the to-be-updated demand forecast results according to the first grouping tier thresholds to obtain the first tier-arrangement result, which contains the groups. More specifically, for each first grouping tier threshold, the processing unit 12 adds to the corresponding group every to-be-updated demand forecast result that is greater than or equal to that threshold and does not already belong to another group. Note the difference between the regrouping in sub-step 404 and the grouping in sub-step 402: sub-step 404 groups all the unsorted to-be-updated demand forecast results, whereas sub-step 402 groups the sorted overall forecast result. Continuing the example of Table 1, the first tier-arrangement result is, for example, as shown in Table 2 below.

Table 2

| Customer | To-be-updated demand forecast result | Purchase result | Group |
| --- | --- | --- | --- |
| Customer #5 | 85% | Bought | Desired grouping accuracy 80% |
| Customer #3 | 88% | Bought | Desired grouping accuracy 80% |
| Customer #2 | 95% | Bought | Desired grouping accuracy 80% |
| Customer #4 | 87% | Did not buy | Desired grouping accuracy 80% |
| Customer #1 | 98% | Bought | Desired grouping accuracy 80% |
| Customer #6 | 80% | Did not buy | Desired grouping accuracy 50% |
| Customer #7 | 75% | Bought | Desired grouping accuracy 50% |
| Customer #8 | 74% | Did not buy | Desired grouping accuracy 30% |
| Customer #10 | 64% | Bought | Desired grouping accuracy 30% |
| Customer #9 | 70% | Did not buy | Desired grouping accuracy 30% |
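The threshold-based regrouping of sub-step 404 can be sketched as follows. The function name and return shape are hypothetical, and collecting results that fall below every threshold in a separate `unassigned` list is an assumption, since the source does not say how such results are handled.

```python
def assign_groups(probabilities, thresholds):
    """Assign each (unsorted) probability to the first group whose threshold it meets.

    `thresholds` is a list of (label, threshold) pairs.
    Returns (groups, unassigned), both holding indices into `probabilities`.
    """
    # Check higher thresholds first so a result joins only one group.
    ordered = sorted(thresholds, key=lambda pair: pair[1], reverse=True)
    groups = {label: [] for label, _ in ordered}
    unassigned = []
    for index, probability in enumerate(probabilities):
        for label, threshold in ordered:
            if probability >= threshold:
                groups[label].append(index)
                break
        else:
            unassigned.append(index)  # below every threshold (assumed handling)
    return groups, unassigned


# Unsorted results in the order of Table 2: customers #5, #3, #2, #4, #1, #6, #7, #8, #10, #9.
table2_probs = [0.85, 0.88, 0.95, 0.87, 0.98, 0.80, 0.75, 0.74, 0.64, 0.70]
table2_groups, leftover = assign_groups(
    table2_probs, [("80%", 0.85), ("50%", 0.75), ("30%", 0.64)]
)
```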

Then, in step 206, for each test data set, the processing unit 12 uses a demand forecast model to be compared to obtain, from that test data set, a current demand forecast result that includes a probability value indicating the likelihood that the corresponding customer will purchase the financial product. The demand forecast model to be compared is the demand forecast model that is currently deployed online and is to be optimized.

In step 207, the processing unit 12 groups all of the obtained current demand forecast results according to a second level arrangement method that has a plurality of second grouping level thresholds, thereby obtaining a second level arrangement result. The second grouping level thresholds are the first grouping level thresholds that were obtained when the demand forecast model to be compared was itself previously selected as the target demand forecast model. The details of obtaining the second level arrangement result are substantially the same as those of sub-step 404 and are not repeated here.

In step 208, the processing unit 12 obtains, from the first level arrangement result and the second level arrangement result respectively, a first index value corresponding to the target demand forecast model and a second index value corresponding to the demand forecast model to be compared. In this embodiment, the first index value is the proportion of customers who actually purchased the financial product among the customers ranked within the target percentage of the first level arrangement result, and the second index value is the proportion of customers who actually purchased the financial product among the customers ranked within the target percentage of the second level arrangement result. The first and second index values are calculated in substantially the same way as the to-be-evaluated index value in sub-step 303. The difference lies in the ordering: sub-step 303 ranks demand forecast results with higher probability values first, whereas step 208 ranks demand forecast results belonging to groups with higher target grouping accuracy first. For example, if the target percentage is 30%, then in the example of Table 2 the top three customers (the top 30%) are customer #5, customer #3, and customer #2. All three actually purchased the financial product, so the first index value is 1.
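The index value calculation of step 208 can be sketched in Python as below. This is a hedged illustration rather than the patented implementation: the function name `index_value`, the row layout, and the integer-percent parameter are assumptions, and rows inside the same group keep their existing order because the text does not specify a tie-breaking rule.

```python
def index_value(rows, target_percent):
    """Share of actual buyers among the customers ranked in the top target percentage.

    `rows` is a list of (customer, group_accuracy_percent, bought) tuples.
    Rows are ranked by group accuracy, highest first. Python's sort is stable,
    so rows inside one group keep their existing order (an assumption here).
    """
    ranked = sorted(rows, key=lambda row: row[1], reverse=True)
    top_k = max(1, len(ranked) * target_percent // 100)
    buyers = sum(1 for _, _, bought in ranked[:top_k] if bought)
    return buyers / top_k


# Rows of Table 2: (customer, target grouping accuracy in percent, purchased?).
rows = [
    ("#5", 80, True), ("#3", 80, True), ("#2", 80, True), ("#4", 80, False),
    ("#1", 80, True), ("#6", 50, False), ("#7", 50, True), ("#8", 30, False),
    ("#10", 30, True), ("#9", 30, False),
]
first_index_value = index_value(rows, 30)
print(first_index_value)
```

For the ten rows of Table 2 and a target percentage of 30%, the top three rows are customers #5, #3, and #2, all of whom bought, so the sketch yields 1.0, in line with the worked example.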

In step 209, the processing unit 12 determines whether the first index value is greater than the second index value. If it is, the flow proceeds to step 210; otherwise, the flow returns to step 201.

In step 210, the processing unit 12 replaces the demand forecast model to be compared with the target demand forecast model.
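Steps 209 and 210 amount to a strict comparison followed by a conditional swap. The short Python sketch below illustrates this; the function name `choose_deployed_model` and the string placeholders standing in for models are hypothetical.

```python
def choose_deployed_model(target_model, compared_model, first_index, second_index):
    """Return the model that should be online after steps 209 and 210.

    The target model replaces the compared model only when its index value is
    strictly greater; on a tie the currently deployed model stays in place.
    """
    if first_index > second_index:
        return target_model
    return compared_model


# Placeholder strings stand in for real model objects.
deployed_after_win = choose_deployed_model("target", "current", 1.0, 0.67)
deployed_after_tie = choose_deployed_model("target", "current", 0.8, 0.8)
print(deployed_after_win, deployed_after_tie)
```

Using a strict inequality means a newly trained model that merely matches the deployed one does not trigger a redeployment.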

In summary, all of the to-be-updated demand forecast results of the target demand forecast model are grouped to obtain the first level arrangement result, and all of the current demand forecast results of the demand forecast model to be compared are grouped to obtain the second level arrangement result, using grouping methods that share the same level standard. The probability values in both arrangement results are therefore divided and ranked by the same standard. Even if the distribution range of the probability values produced by different demand forecast models rises or falls sharply because of external factors, the models can still be compared against a common evaluation standard. Furthermore, the model training method used here reduces the number of candidate models and increases the amount of training data in each successive training round. It saves more time than previous model training methods and therefore screens the training combinations more efficiently to obtain the target demand forecast model. In addition, rebuilding and redeploying the model at regular intervals keeps the deployed model at its best validity and reduces the impact of changing real-world conditions on that validity. The object of the present invention is thus achieved.
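The round-by-round training schedule summarized above, which the claims spell out as sub-steps (B-1) to (B-7), can be sketched in Python as follows. This is a minimal sketch under stated assumptions: `select_target_combination` and `train_and_score` are hypothetical names, the scoring callback stands in for actual training and testing, and integer division is used where the text leaves rounding unspecified.

```python
def select_target_combination(combinations, num_training_sets, total_rounds, train_and_score):
    """Shrink the candidate pool each round while growing the training sample.

    In round t (1..M), every remaining combination is trained on
    i = (t / M) * N training data sets, and the best 1/M of the resulting
    models are kept. After round M the best remaining candidate is returned.
    `train_and_score(combination, sample_size)` is a hypothetical callback
    that trains a model and returns its to-be-evaluated index value.
    """
    candidates = list(combinations)
    for round_no in range(1, total_rounds + 1):
        sample_size = num_training_sets * round_no // total_rounds  # i
        scored = [(train_and_score(c, sample_size), c) for c in candidates]
        scored.sort(key=lambda pair: pair[0], reverse=True)
        keep = max(1, len(scored) // total_rounds)  # j
        candidates = [c for _, c in scored[:keep]]
    return candidates[0]  # highest index value among the final candidates


calls = []

def toy_score(combination, sample_size):
    """Toy stand-in for training: records the sample size, scores by id."""
    calls.append(sample_size)
    return combination

best = select_target_combination(range(27), 300, 3, toy_score)
print(best, calls.count(100), calls.count(200), calls.count(300))
```

With 27 combinations, 300 training data sets, and M = 3, the toy run trains 27 models on 100 sets, 9 models on 200 sets, and 3 models on all 300 sets, so most of the training cost is spent on the few combinations that survive the early rounds.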

The foregoing is merely an embodiment of the present invention and does not limit the scope in which the invention may be practiced. All simple equivalent changes and modifications made in accordance with the claims and the specification of the present invention remain within the scope covered by this patent.

1: Model reconstruction system; 11: Storage unit; 12: Processing unit; 201~210: Steps; 301~307: Sub-steps; 401~404: Sub-steps

Other features and effects of the present invention will become apparent from the embodiment described with reference to the drawings, in which: FIG. 1 is a block diagram illustrating a model reconstruction system according to an embodiment of the present invention; FIG. 2 is a flow chart illustrating how a processing unit of the embodiment executes a model reconstruction procedure; FIG. 3 is a flow chart illustrating how the processing unit of the embodiment obtains a target demand forecast model; and FIG. 4 is a flow chart illustrating how the processing unit of the embodiment obtains a first level arrangement result.


Claims (12)

1. A model reconstruction method for optimizing a demand forecast model, the demand forecast model being used to obtain a demand forecast result from to-be-analyzed interaction record data relating to interactions between a to-be-analyzed customer and a financial institution, the demand forecast result including a probability value indicating the likelihood that the to-be-analyzed customer will purchase a financial product, the model reconstruction method being executed by a computer system that stores a plurality of customer behavior data sets corresponding to a plurality of customers and a plurality of training combinations, each customer behavior data set including a plurality of interaction record data relating to the corresponding customer's interactions with the financial institution in a plurality of different time periods and a plurality of purchase results indicating whether the corresponding customer purchased the financial product in those time periods, each training combination including a corresponding machine learning algorithm and a model parameter set, the model reconstruction method comprising the following steps:
(A) selecting a plurality of training data sets from the customer behavior data sets according to a first access rule, and selecting a plurality of test data sets from the customer behavior data sets according to a second access rule, wherein each training data set includes at least one interaction training data and a purchase training result selected from the interaction record data and the purchase results of the corresponding customer, and each test data set includes at least one interaction test data and a purchase test result selected from the interaction record data and the purchase results of the corresponding customer;
(B) obtaining a target demand forecast model based on the training data sets, the test data sets, and the training combinations;
(C) for each test data set, using the target demand forecast model to obtain, from the test data set, a to-be-updated demand forecast result including a probability value indicating the likelihood that the corresponding customer will purchase the financial product;
(D) grouping all of the to-be-updated demand forecast results obtained in step (C) according to a first level arrangement method having a plurality of first grouping level thresholds, to obtain a first level arrangement result;
(E) for each test data set, using a demand forecast model to be compared to obtain, from the test data set, a current demand forecast result including a probability value indicating the likelihood that the corresponding customer will purchase the financial product;
(F) grouping all of the current demand forecast results obtained in step (E) according to a second level arrangement method having a plurality of second grouping level thresholds, to obtain a second level arrangement result;
(G) obtaining, from the first level arrangement result and the second level arrangement result respectively, a first index value corresponding to the target demand forecast model and a second index value corresponding to the demand forecast model to be compared;
(H) determining whether the first index value is greater than the second index value; and
(I) when it is determined that the first index value is greater than the second index value, replacing the demand forecast model to be compared with the target demand forecast model.

2. The model reconstruction method of claim 1, wherein step (B) comprises the following sub-steps:
(B-1) taking the training combinations as a plurality of to-be-evaluated training combinations, setting a training count value to one, and setting a total training count to M (M>1);
(B-2) for each to-be-evaluated training combination, selecting i to-be-trained data sets from the training data sets and training with the to-be-evaluated training combination on the i to-be-trained data sets to obtain a to-be-evaluated demand forecast model corresponding to the to-be-evaluated training combination, i being the ratio of the training count value to the total training count multiplied by the total number of the training data sets;
(B-3) for each to-be-evaluated demand forecast model, testing the to-be-evaluated demand forecast model with the test data sets to obtain a to-be-evaluated index value;
(B-4) selecting j candidate demand forecast models from the to-be-evaluated demand forecast models according to the to-be-evaluated index values, j being 1/M of the total number of the to-be-evaluated demand forecast models;
(B-5) determining whether the training count value is equal to M;
(B-6) when it is determined that the training count value is not equal to M, taking the training combinations corresponding to the j candidate demand forecast models as j to-be-evaluated training combinations, incrementing the training count value by one, and returning to sub-step (B-2); and
(B-7) when it is determined that the training count value is equal to M, selecting, from the j candidate demand forecast models, the candidate demand forecast model having the highest to-be-evaluated index value as the target demand forecast model.

3. The model reconstruction method of claim 2, wherein, for each to-be-evaluated demand forecast model, sub-step (B-3) comprises the following sub-steps:
(B-3-1) for each test data set, using the to-be-evaluated demand forecast model to obtain, from the test data set, a to-be-evaluated demand forecast result including a probability value indicating the likelihood that the corresponding customer will purchase the financial product;
(B-3-2) sorting the probability values of all of the to-be-evaluated demand forecast results obtained in sub-step (B-3-1) to obtain a to-be-evaluated sorted total forecast result; and
(B-3-3) obtaining, as the to-be-evaluated index value, the proportion of customers who actually purchased the financial product among the customers ranked within a target percentage of the to-be-evaluated sorted total forecast result.
4. The model reconstruction method of claim 3, wherein step (D) comprises the following sub-steps:
(D-1) sorting the probability values of all of the to-be-updated demand forecast results obtained in step (C) to obtain a to-be-updated sorted total forecast result;
(D-2) dividing the to-be-updated sorted total forecast result into a plurality of groups according to the purchase results of the test data sets and a plurality of target grouping accuracies, wherein in each group the proportion of to-be-updated demand forecast results that match the corresponding purchase results is equal to one of the target grouping accuracies;
(D-3) for each group, taking the last-ranked to-be-updated demand forecast result in the group as the first grouping level threshold of the group; and
(D-4) grouping all of the to-be-updated demand forecast results obtained in step (C) according to the first grouping level thresholds to obtain the first level arrangement result, the first level arrangement result including the groups.

5. The model reconstruction method of claim 4, wherein, in step (G):
the first index value is the proportion of customers who actually purchased the financial product among the customers ranked within the target percentage of the first level arrangement result; and
the second index value is the proportion of customers who actually purchased the financial product among the customers ranked within the target percentage of the second level arrangement result.
6. The model reconstruction method of claim 5, further comprising, before step (A), the following step:
(J) determining whether the current time has reached a predetermined update time;
wherein step (A) is executed when it is determined that the current time has reached the update time.

7. A model reconstruction system for optimizing a demand forecast model, the demand forecast model being used to obtain a demand forecast result from to-be-analyzed interaction record data relating to interactions between a to-be-analyzed customer and a financial institution, the demand forecast result including a probability value indicating the likelihood that the to-be-analyzed customer will purchase a financial product, the model reconstruction system comprising:
a storage unit storing a plurality of customer behavior data sets corresponding to a plurality of customers and a plurality of training combinations, each customer behavior data set including a plurality of interaction record data relating to the corresponding customer's interactions with the financial institution in a plurality of different time periods and a plurality of purchase results indicating whether the corresponding customer purchased the financial product in those time periods, each training combination including a corresponding machine learning algorithm and a model parameter set; and
a processing unit in signal connection with the storage unit and configured to:
select a plurality of training data sets from the customer behavior data sets according to a first access rule, and select a plurality of test data sets from the customer behavior data sets according to a second access rule, wherein each training data set includes at least one interaction training data and a purchase training result selected from the interaction record data and the purchase results of the corresponding customer, and each test data set includes at least one interaction test data and a purchase test result selected from the interaction record data and the purchase results of the corresponding customer,
obtain a target demand forecast model based on the training data sets, the test data sets, and the training combinations,
for each test data set, use the target demand forecast model to obtain, from the test data set, a to-be-updated demand forecast result including a probability value indicating the likelihood that the corresponding customer will purchase the financial product,
group all of the obtained to-be-updated demand forecast results according to a first level arrangement method having a plurality of first grouping level thresholds, to obtain a first level arrangement result,
for each test data set, use a demand forecast model to be compared to obtain, from the test data set, a current demand forecast result including a probability value indicating the likelihood that the corresponding customer will purchase the financial product,
group all of the obtained current demand forecast results according to a second level arrangement method having a plurality of second grouping level thresholds, to obtain a second level arrangement result,
obtain, from the first level arrangement result and the second level arrangement result respectively, a first index value corresponding to the target demand forecast model and a second index value corresponding to the demand forecast model to be compared,
determine whether the first index value is greater than the second index value, and
when it is determined that the first index value is greater than the second index value, replace the demand forecast model to be compared with the target demand forecast model.

8. The model reconstruction system of claim 7, wherein:
the processing unit takes the training combinations as a plurality of to-be-evaluated training combinations, sets a training count value to one, and sets a total training count to M (M>1),
for each to-be-evaluated training combination, the processing unit selects i to-be-trained data sets from the training data sets and trains with the to-be-evaluated training combination on the i to-be-trained data sets to obtain a to-be-evaluated demand forecast model corresponding to the to-be-evaluated training combination, i being the ratio of the training count value to the total training count multiplied by the total number of the training data sets,
for each to-be-evaluated demand forecast model, the processing unit tests the to-be-evaluated demand forecast model with the test data sets to obtain a to-be-evaluated index value,
the processing unit selects j candidate demand forecast models from the to-be-evaluated demand forecast models according to the to-be-evaluated index values, j being 1/M of the total number of the to-be-evaluated demand forecast models,
the processing unit determines whether the training count value is equal to M,
when it is determined that the training count value is not equal to M, the processing unit takes the training combinations corresponding to the j candidate demand forecast models as j to-be-evaluated training combinations, increments the training count value by one, and repeats obtaining the to-be-evaluated demand forecast model and its to-be-evaluated index value for each to-be-evaluated training combination and selecting j candidate demand forecast models therefrom, until the training count value is equal to M, and
when it is determined that the training count value is equal to M, the processing unit selects, from the j candidate demand forecast models, the candidate demand forecast model having the highest to-be-evaluated index value as the target demand forecast model.
9. The model reconstruction system of claim 8, wherein, for each to-be-evaluated demand forecast model:
for each test data set, the processing unit uses the to-be-evaluated demand forecast model to obtain, from the test data set, a to-be-evaluated demand forecast result including a probability value indicating the likelihood that the corresponding customer will purchase the financial product,
the processing unit sorts the probability values of all of the obtained to-be-evaluated demand forecast results to obtain a to-be-evaluated sorted total forecast result, and
the processing unit obtains, as the to-be-evaluated index value, the proportion of customers who actually purchased the financial product among the customers ranked within a target percentage of the to-be-evaluated sorted total forecast result.
10. The model reconstruction system of claim 9, wherein:
the processing unit sorts the probability values of all of the obtained to-be-updated demand forecast results to obtain a to-be-updated sorted total forecast result,
the processing unit divides the to-be-updated sorted total forecast result into a plurality of groups according to the purchase results of the test data sets and a plurality of target grouping accuracies, wherein in each group the proportion of to-be-updated demand forecast results that match the corresponding purchase results is equal to one of the target grouping accuracies,
for each group, the processing unit takes the last-ranked to-be-updated demand forecast result in the group as the first grouping level threshold of the group, and
the processing unit groups all of the obtained to-be-updated demand forecast results according to the first grouping level thresholds to obtain the first level arrangement result, the first level arrangement result including the groups.

11. The model reconstruction system of claim 10, wherein:
the first index value is the proportion of customers who actually purchased the financial product among the customers ranked within the target percentage of the first level arrangement result; and
the second index value is the proportion of customers who actually purchased the financial product among the customers ranked within the target percentage of the second level arrangement result.
12. The model reconstruction system of claim 11, wherein:
the processing unit determines whether the current time has reached a predetermined update time, and
the processing unit selects the training data sets and the test data sets only when it determines that the current time has reached the update time.
TW112151066A 2023-12-27 2023-12-27 Model reconstruction method and system TWI860923B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
TW112151066A TWI860923B (en) 2023-12-27 2023-12-27 Model reconstruction method and system

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
TW112151066A TWI860923B (en) 2023-12-27 2023-12-27 Model reconstruction method and system

Publications (1)

Publication Number Publication Date
TWI860923B true TWI860923B (en) 2024-11-01

Family

ID=94379759

Family Applications (1)

Application Number Title Priority Date Filing Date
TW112151066A TWI860923B (en) 2023-12-27 2023-12-27 Model reconstruction method and system

Country Status (1)

Country Link
TW (1) TWI860923B (en)

Citations (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN109716346A (en) * 2016-07-18 2019-05-03 河谷生物组学有限责任公司 Distributed machines learning system, device and method
CN111095308A (en) * 2017-05-14 2020-05-01 数字推理系统有限公司 System and method for quickly building, managing and sharing machine learning models
CN114463691A (en) * 2020-11-10 2022-05-10 阿里巴巴集团控股有限公司 Model training method, device and system and pedestrian re-identification method
CN115841342A (en) * 2022-11-22 2023-03-24 蔚来汽车科技(安徽)有限公司 Demand forecasting method and system, and storage medium
TW202333080A (en) * 2022-02-09 2023-08-16 美商應用材料股份有限公司 Machine learning model generation and updating for manufacturing equipment
CN117033995A (en) * 2022-08-26 2023-11-10 腾讯科技(深圳)有限公司 Training method, device, equipment, medium and product of prediction model
TWM654142U (en) * 2023-12-27 2024-04-11 中國信託商業銀行股份有限公司 Model Reconstruction System

Similar Documents

Publication Publication Date Title
CN112053234B (en) Enterprise credit rating method based on macroscopic region economic index and microscopic factor
CN108256802B (en) Crowd search algorithm-based multi-supplier order distribution cloud processing method
CN112308623A (en) High-quality client loss prediction method and device based on supervised learning and storage medium
CN111144941A (en) Merchant score generation method, device, equipment and readable storage medium
WO2020024456A1 (en) Quantitative transaction prediction method, device and equipment
CN112700324A (en) User loan default prediction method based on combination of Catboost and restricted Boltzmann machine
CN113450004A (en) Power credit report generation method and device, electronic equipment and readable storage medium
TWM654142U (en) Model Reconstruction System
CN113065969A (en) Enterprise scoring model construction method, enterprise scoring method, medium and electronic device
CN115391561A (en) Method and device for processing graph network data set, electronic equipment, program and medium
CN114912739A (en) Construction and application method of environment and transformer substation operation and maintenance cost correlation model
CN107742131A (en) Financial asset sorting technique and device
CN116862078A (en) Method, system, device and medium for predicting overdue of battery-change package user
CN111507824A (en) Wind control model mold-entering variable minimum entropy box separation method
CN111626855A (en) Bond credit interest difference prediction method and system
CN113379200B (en) Method and device for determining geological disaster susceptibility evaluation
TWI860923B (en) Model reconstruction method and system
CN118134652A (en) Asset configuration scheme generation method and device, electronic equipment and medium
CN117273919A (en) Internet credit assessment method for individual households and small micro-enterprise e-commerce merchants
CN118735315A (en) Innovation capability evaluation method and innovation capability scoring model generation method
CN115984007A (en) Enterprise credit risk monitoring method and device, storage medium and electronic equipment
CN116645014A (en) Provider supply data model construction method based on artificial intelligence
CN109685555A (en) Merchant screening method and device, electronic equipment and storage medium
CN115481694A (en) Data enhancement method, device, equipment and storage medium for training sample set
CN114862092A (en) Evaluation method and device based on neural network