CN108052979A - Method, apparatus and device for fusing model prediction values - Google Patents
- Publication number: CN108052979A
- Application number: CN201711353984.1A
- Authority: CN (China)
- Keywords: predicted value, interval, value, prediction model, model
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Classifications
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F18/00—Pattern recognition
- G06F18/20—Analysing
- G06F18/25—Fusion techniques
- G06F18/251—Fusion techniques of input or preprocessed data
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F18/00—Pattern recognition
- G06F18/20—Analysing
- G06F18/21—Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
- G06F18/214—Generating training patterns; Bootstrap methods, e.g. bagging or boosting
Landscapes
- Engineering & Computer Science (AREA)
- Data Mining & Analysis (AREA)
- Theoretical Computer Science (AREA)
- Computer Vision & Pattern Recognition (AREA)
- Bioinformatics & Cheminformatics (AREA)
- Bioinformatics & Computational Biology (AREA)
- Artificial Intelligence (AREA)
- Evolutionary Biology (AREA)
- Evolutionary Computation (AREA)
- Physics & Mathematics (AREA)
- General Engineering & Computer Science (AREA)
- General Physics & Mathematics (AREA)
- Life Sciences & Earth Sciences (AREA)
- Management, Administration, Business Operations System, And Electronic Commerce (AREA)
Abstract
Disclosed are a method, an apparatus, and a device for fusing model prediction values. The method includes: based on a number of given samples, binning the predicted values of an online prediction model and the predicted values of an offline prediction model separately, according to a set binning method; according to the binning result, converting the first predicted value of each sample into a first interval feature corresponding to the interval in which that first predicted value falls, and converting the second predicted value of each sample into a second interval feature corresponding to the interval in which that second predicted value falls; and forming converted sample data from the first interval feature, the second interval feature, and the label of each sample, and training a model with the converted sample data, the trained model being used to fuse the predicted value of the online prediction model with the predicted value of the offline prediction model to obtain a final predicted value.
Description
Technical field
This specification relates to the technical field of machine learning, and in particular to a method, an apparatus, and a device for fusing model prediction values.
Background
Machine learning algorithms are a class of algorithms that automatically derive regularities from data and use those regularities to make predictions on unseen data. They are widely used in many fields.
In practice, prediction models include online prediction models and offline prediction models. An offline prediction model is usually run as a scheduled task. Its advantage is that it can take in higher-dimensional features and use more complex algorithms, and so predict more accurately; however, with many features and a complex algorithm, prediction is usually time-consuming. By comparison, an online prediction model can use lower-dimensional features and simpler algorithms to predict more efficiently; its drawback is that the features are less rich and the accuracy is lower. Each kind of model thus has its own strengths, and how to fuse the two sensibly is a problem the industry urgently needs to solve.
Summary of the invention
In view of the above technical problem, the embodiments of this specification provide a method, an apparatus, and a device for fusing model prediction values. The technical solutions are as follows:
In one aspect, a method for fusing model prediction values is provided, including:
based on a number of given samples, binning the predicted values of an online prediction model and the predicted values of an offline prediction model separately, according to a set binning method, wherein each of the samples includes a first predicted value, a second predicted value, and a label of the sample, the first predicted value being predicted by the online prediction model and the second predicted value being predicted by the offline prediction model;
according to the binning result, converting the first predicted value of each sample into a first interval feature corresponding to the interval in which that first predicted value falls, and converting the second predicted value of each sample into a second interval feature corresponding to the interval in which that second predicted value falls;
forming converted sample data from the first interval feature, the second interval feature, and the label of each sample, and training a model with the converted sample data, the trained model being used to fuse the predicted value of the online prediction model with the predicted value of the offline prediction model to obtain a final predicted value.
In one aspect, a method for fusing model prediction values is provided, including:
obtaining business data generated by a target user within a first time period, determining input features from the business data, inputting them into an online prediction model, and outputting a first predicted value;
obtaining a second predicted value corresponding to the target user that was produced by an offline prediction model, wherein the input features of the offline prediction model are determined from business features generated by the target user within a second time period;
obtaining the result of binning the first predicted values of the online prediction model and the second predicted values of the offline prediction model, and determining a first interval in which the first predicted value falls and a second interval in which the second predicted value falls;
according to the first interval and the second interval, fusing the first predicted value and the second predicted value with a pre-trained model to obtain a final fused predicted value, the fused predicted value being used to determine the label of the target user.
In one aspect, an apparatus for fusing model prediction values is provided, including:
a binning unit, which, based on a number of given samples, bins the predicted values of an online prediction model and the predicted values of an offline prediction model separately, according to a set binning method, wherein each of the samples includes a first predicted value, a second predicted value, and a label of the sample, the first predicted value being predicted by the online prediction model and the second predicted value being predicted by the offline prediction model;
a feature conversion unit, which, according to the binning result, converts the first predicted value of each sample into a first interval feature corresponding to the interval in which that first predicted value falls, and converts the second predicted value of each sample into a second interval feature corresponding to the interval in which that second predicted value falls;
a training unit, which forms converted sample data from the first interval feature, the second interval feature, and the label of each sample, and trains a model with the converted sample data, the trained model being used to fuse the predicted value of the online prediction model with the predicted value of the offline prediction model to obtain a final predicted value.
In one aspect, an apparatus for fusing model prediction values is provided, including:
an online score prediction unit, which obtains business data generated by a target user within a first time period before a trigger moment, determines input features from the business data, inputs them into an online prediction model, and outputs a first predicted value, the online prediction model being used to predict a user's label;
an offline score obtaining unit, which obtains a second predicted value corresponding to the target user that was produced by an offline prediction model, wherein the input features of the offline prediction model are determined from business features generated by the target user within a past second time period, the offline prediction model being used to predict a user's label;
an interval determination unit, which, according to the result of binning the predicted values of the online prediction model and the predicted values of the offline prediction model in advance, determines a first interval in which the first predicted value falls and a second interval in which the second predicted value falls;
a score fusion unit, which, according to the first interval and the second interval, fuses the first predicted value and the second predicted value with a pre-trained model to obtain a final fused predicted value, the fused predicted value being used to determine the label of the target user.
In one aspect, a computer device is provided, including:
a processor;
a memory for storing processor-executable instructions;
wherein the processor is configured to:
based on a number of given samples, bin the predicted values of an online prediction model and the predicted values of an offline prediction model separately, according to a set binning method, wherein each of the samples includes a first predicted value, a second predicted value, and a label of the sample, the first predicted value being predicted by the online prediction model and the second predicted value being predicted by the offline prediction model;
according to the binning result, convert the first predicted value of each sample into a first interval feature corresponding to the interval in which that first predicted value falls, and convert the second predicted value of each sample into a second interval feature corresponding to the interval in which that second predicted value falls;
form converted sample data from the first interval feature, the second interval feature, and the label of each sample, and train a model with the converted sample data, the trained model being used to fuse the predicted value of the online prediction model with the predicted value of the offline prediction model to obtain a final predicted value.
In one aspect, a computer device is provided, including:
a processor;
a memory for storing processor-executable instructions;
wherein the processor is configured as:
an online score prediction unit, which obtains business data generated by a target user within a first time period before a trigger moment, determines input features from the business data, inputs them into an online prediction model, and outputs a first predicted value, the online prediction model being used to predict a user's label;
an offline score obtaining unit, which obtains a second predicted value corresponding to the target user that was produced by an offline prediction model, wherein the input features of the offline prediction model are determined from business features generated by the target user within a past second time period, the offline prediction model being used to predict a user's label;
an interval determination unit, which, according to the result of binning the predicted values of the online prediction model and the predicted values of the offline prediction model in advance, determines a first interval in which the first predicted value falls and a second interval in which the second predicted value falls;
a score fusion unit, which, according to the first interval and the second interval, fuses the first predicted value and the second predicted value with a pre-trained model to obtain a final fused predicted value, the fused predicted value being used to determine the label of the target user.
The effects produced by the technical solutions provided in the embodiments of this specification include:
A model obtained through machine learning fuses the predicted value of the online prediction model with the predicted value of the offline prediction model, and the fused score is then used to predict the user's label. This improves the accuracy of predicting the user's label while still meeting the business requirement for low latency.
It should be understood that the general description above and the detailed description below are only exemplary and explanatory, and do not limit the embodiments of this specification.
In addition, no single embodiment of this specification needs to achieve all of the effects described above.
Brief description of the drawings
To explain the technical solutions in the embodiments of this specification or in the prior art more clearly, the drawings needed to describe the embodiments or the prior art are briefly introduced below. The drawings described below are only some of the embodiments recorded in this specification, and a person of ordinary skill in the art can derive other drawings from them.
Fig. 1 is a schematic flowchart of a method for fusing model prediction values provided by an embodiment of this specification;
Fig. 2 shows a process for determining fusion weights provided by an embodiment of this specification;
Fig. 3 is a schematic structural diagram of an apparatus for fusing model prediction values (weight training stage) provided by an embodiment of this specification;
Fig. 4 is a schematic structural diagram of an apparatus for fusing model prediction values (score fusion stage) provided by an embodiment of this specification;
Fig. 5 is a schematic structural diagram of a device for configuring the apparatus of the embodiments of this specification.
Detailed description
To help those skilled in the art better understand the technical solutions in the embodiments of this specification, those solutions are described in detail below with reference to the drawings. The described embodiments are only some of the embodiments of this specification, not all of them. All other embodiments that a person of ordinary skill in the art obtains on the basis of the embodiments in this specification shall fall within the scope of protection.
Referring to Fig. 1, in one embodiment of this specification, a method for fusing model prediction values is used to fuse the score obtained by an online prediction model with the score obtained by an offline prediction model. The method may include the following steps 101 to 104:
Step 101: Obtain business data generated by the target user within a first time period, determine input features from the business data, input them into the online prediction model, and output a first predicted value.
Step 102: Obtain a second predicted value corresponding to the target user that was produced by the offline prediction model, wherein the input features of the offline prediction model are determined from business features generated by the target user within a second time period.
Here, the online prediction model and the offline prediction model are both models built with machine learning algorithms to predict a user's label. The user labels that the two models predict may depend on the specific business. For an online payment service, for example, the labels may be "high-risk user", "medium-risk user", "low-risk user", and so on. For an information recommendation service, the labels may be "sports", "education", "finance", and so on. Both models are trained on a certain number of training samples, and each training sample may include one or more kinds of behavior data generated by a sample user while taking part in a specific business (such as an online payment service), together with the label determined for that sample user. The two models may be trained on the same batch of samples or on two different batches; this is not limited here.
In the embodiments of this specification, the offline prediction model may be run as a scheduled task, for example one offline score prediction per day at a specified time or within a specified time period, and this prediction may cover all users. The online prediction model may be triggered by the action of a particular user; for example, a user clicking on a web page can trigger one score calculation by the online prediction model.
Compared with the online prediction model, the offline prediction model usually uses higher-dimensional feature data, its feature data can span a longer time, and it can use a more complex algorithm. As shown in Fig. 1, to take a specific example, on day T the offline prediction model can obtain the business data (feature A) that each user generated on day T-1 while taking part in a specific business. This business data (feature A) is processed to produce input features, which are fed into the offline prediction model to obtain each user's offline prediction score (the second predicted value in this text), and the score is written to database X. For the online prediction model, the user's online feature data (feature B) can be collected continuously and written to database Y. The online feature data may be near-real-time business data generated by the user while taking part in a specific business; for example, if online prediction is triggered at time t1, the online feature data may be the business data generated during the period t0 to t1 (for example, 3 minutes).
Thus, once a user request that starts the prediction process arrives, the scheduler has two tasks. The first is to read from database X the second predicted value for the target user most recently computed by the offline prediction model. The second is to read the target user's online feature data from database Y so that the online prediction model can then compute its score.
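The scheduler's two tasks can be sketched as follows. This is a minimal illustration, not the patent's implementation: the dictionaries standing in for database X and database Y, the function names, and the toy scoring rule are all assumptions made for the example.

```python
from typing import Callable, Dict, List, Tuple

# Hypothetical in-memory stand-ins for database X (offline scores) and
# database Y (near-real-time online features). Names and values are made up.
offline_score_db: Dict[str, float] = {"user_42": 0.25}
online_feature_db: Dict[str, List[float]] = {"user_42": [3.0, 1.0]}


def toy_online_model(features: List[float]) -> float:
    """Placeholder for the online prediction model; returns a score in [0, 1]."""
    return min(1.0, sum(features) / 10.0)


def handle_request(
    user_id: str, online_model: Callable[[List[float]], float]
) -> Tuple[float, float]:
    """Return (first_predicted_value, second_predicted_value) for one user."""
    # Task 1: read the most recent offline score from database X.
    second_value = offline_score_db[user_id]
    # Task 2: read the user's online features from database Y and score them.
    first_value = online_model(online_feature_db[user_id])
    return first_value, second_value


first, second = handle_request("user_42", toy_online_model)
```

Only the cheap online model runs on the request path; the offline score is a lookup, which is how the scheme keeps latency low.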
At this point, for any target user, one prediction score is available from the online prediction model and one from the offline prediction model.
Step 103: According to the result of binning the predicted values of the online prediction model and the predicted values of the offline prediction model in advance, determine the first interval in which the first predicted value falls and the second interval in which the second predicted value falls.
Step 104: According to the first interval and the second interval, fuse the first predicted value and the second predicted value with a pre-trained model to obtain a final fused predicted value, the fused predicted value being used to determine the label of the target user.
In an optional embodiment, step 104 may specifically include:
Step 1041: Based on predetermined weights corresponding to the intervals obtained by binning, obtain a first weight corresponding to the first interval and a second weight corresponding to the second interval. Here, the parameters to be trained of the model include the weights corresponding to the intervals obtained by binning.
Step 1042: Use the first weight and the second weight to determine the fused predicted value, which is used to determine the label of the target user.
Because steps 103 and 104 rely on the binning result and on the weights corresponding to the intervals obtained by binning, a method for determining the fusion weights is introduced first, before steps 103 and 104 are described in detail. As shown in Fig. 2, in one embodiment the method includes steps 201 to 203:
Step 201: Based on a number of given samples, bin the predicted values of the online prediction model and the predicted values of the offline prediction model separately, according to a set binning method, wherein each of the samples includes a first predicted value, a second predicted value, and a label of the sample, the first predicted value being predicted by the online prediction model and the second predicted value being predicted by the offline prediction model.
The samples mentioned in step 201 may be the same as the samples used to train the offline prediction model and/or the online prediction model, or they may be different samples; this is not limited.
In one embodiment, the set binning method may be entropy-based binning. Entropy-based binning takes the value of the dependent variable into account when binning, so that minimum entropy is reached after binning. Its advantage is that it can discriminate well in the high-score region. The set binning method may also be Gini-based binning, equal-frequency binning, and so on.
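To make the idea of entropy-based binning concrete, the sketch below finds the single cut point that minimizes the label entropy of the two resulting bins, weighted by bin size. The text does not specify the exact procedure, so this is an assumption-laden illustration: the function names and toy data are invented, and a real binning would typically apply such a split recursively with a stopping rule to obtain several intervals.

```python
import math
from typing import Sequence, Tuple


def entropy(labels: Sequence[int]) -> float:
    """Shannon entropy (in bits) of a sequence of class labels."""
    n = len(labels)
    if n == 0:
        return 0.0
    result = 0.0
    for cls in set(labels):
        p = labels.count(cls) / n
        result -= p * math.log2(p)
    return result


def best_entropy_split(
    scores: Sequence[float], labels: Sequence[int]
) -> Tuple[float, float]:
    """Return (cut_point, weighted_entropy) of the best single split."""
    pairs = sorted(zip(scores, labels))
    n = len(pairs)
    best_cut, best_entropy = pairs[0][0], float("inf")
    for i in range(1, n):
        if pairs[i - 1][0] == pairs[i][0]:
            continue  # no boundary can be placed between identical scores
        left = [lab for _, lab in pairs[:i]]
        right = [lab for _, lab in pairs[i:]]
        weighted = (len(left) * entropy(left) + len(right) * entropy(right)) / n
        if weighted < best_entropy:
            # Place the cut halfway between the two neighbouring scores.
            best_cut = (pairs[i - 1][0] + pairs[i][0]) / 2
            best_entropy = weighted
    return best_cut, best_entropy


# Made-up toy data: low scores are labelled 0, high scores are labelled 1.
scores = [0.05, 0.10, 0.20, 0.60, 0.70, 0.90]
labels = [0, 0, 0, 1, 1, 1]
cut, h = best_entropy_split(scores, labels)
```

On this toy data the chosen cut separates the two classes perfectly, so the weighted entropy after the split is zero.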
Step 202: According to the binning result, convert the first predicted value of each sample into a first interval feature corresponding to the interval in which that first predicted value falls, and convert the second predicted value of each sample into a second interval feature corresponding to the interval in which that second predicted value falls.
In one example, assume that the first and second predicted values both lie between 0 and 1. After binning the predicted values of the online prediction model, the cut points are 0, 0.1, 0.13, 0.15, 0.2, 0.3, 0.5, 1; after binning the predicted values of the offline prediction model, the cut points are 0, 0.03, 0.05, 0.08, 0.09, 0.11, 0.13, 1. That is, the outputs of the online prediction model and of the offline prediction model are each divided into 7 intervals.
In one embodiment, the feature conversion of step 202 can be carried out with a one-hot rule. Suppose a sample has a first predicted value of 0.17 and a second predicted value of 0.12. Since 0.17 lies in the 4th online interval (0.15, 0.2) and 0.12 lies in the 6th offline interval (0.11, 0.13), the one-hot rule converts the first predicted value 0.17 into the first interval feature on-bin-0001000 ("on-bin" identifies the online prediction model) and the second predicted value 0.12 into the second interval feature off-bin-0000010 ("off-bin" identifies the offline prediction model). The first and second predicted values of the other samples are converted one by one in the same way.
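The one-hot conversion in this example can be reproduced with a short sketch. The cut points are the ones given in the text; the function names are illustrative, and treating each interval as (low, high] is an assumption, because the text writes open intervals and does not say how a value exactly on a cut point is handled.

```python
from bisect import bisect_left
from typing import List

# Cut points taken from the example in the text.
ONLINE_CUTS: List[float] = [0, 0.1, 0.13, 0.15, 0.2, 0.3, 0.5, 1]
OFFLINE_CUTS: List[float] = [0, 0.03, 0.05, 0.08, 0.09, 0.11, 0.13, 1]


def bin_index(value: float, cuts: List[float]) -> int:
    """Return the 1-based index of the interval that contains value.

    Assumption: intervals are (low, high]; values at or beyond the outer
    cut points are clamped into the first or last interval.
    """
    idx = bisect_left(cuts, value)
    return min(max(idx, 1), len(cuts) - 1)


def one_hot(value: float, cuts: List[float]) -> str:
    """Encode value as a one-hot string with one position per interval."""
    n_bins = len(cuts) - 1
    idx = bin_index(value, cuts)
    return "".join("1" if i == idx else "0" for i in range(1, n_bins + 1))


first_feature = "on-bin-" + one_hot(0.17, ONLINE_CUTS)
second_feature = "off-bin-" + one_hot(0.12, OFFLINE_CUTS)
```

With the values from the text, `first_feature` is `on-bin-0001000` and `second_feature` is `off-bin-0000010`.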
Step 203: Form converted sample data from the first interval feature, the second interval feature, and the label of each sample, and train a model with the converted sample data. The trained model is used to fuse the predicted value of the online prediction model with the predicted value of the offline prediction model to obtain a final predicted value.
The converted sample data may include other data in addition to the first interval feature, the second interval feature, and the label of the sample. That is, "form" is not meant to be closed-ended.
In the example above, before feature conversion, a piece of sample data is, for example:
{0.17, 0.12, "medium-risk user"};
After feature conversion, the new piece of sample data is, for example:
{0001000, 0000010, "medium-risk user"}
The model to be trained may be linear or nonlinear. In an embodiment that uses a linear model, the parameters to be trained may include a weight for each interval obtained by binning; these weights can be used to fuse the predicted value of the online prediction model with the predicted value of the offline prediction model into the final predicted value. The model to be trained may be a logistic regression (LR) model: each interval obtained by binning is assigned a weight, the weights are trained as parameters of the LR model, and the value of each weight is finally solved for. Each weight can be regarded as a score for its interval, one that reflects a global trade-off of importance learned not only across the two models' features (online and offline) but also across the individual score intervals.
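As a rough sketch of how per-interval weights could be learned, the code below fits a logistic regression by batch gradient descent on concatenated one-hot interval features. It rests on several assumptions not stated in the text: the label is binary (the risk example has three classes), there is no bias term or regularization, and the tiny data set with two online and two offline intervals is invented purely for illustration.

```python
import math
from typing import List, Sequence


def sigmoid(z: float) -> float:
    """Logistic function."""
    return 1.0 / (1.0 + math.exp(-z))


def train_lr(
    features: Sequence[Sequence[int]],
    labels: Sequence[int],
    lr: float = 0.5,
    epochs: int = 500,
) -> List[float]:
    """Fit one weight per interval by batch gradient descent on log loss."""
    n_features = len(features[0])
    weights = [0.0] * n_features
    for _ in range(epochs):
        grad = [0.0] * n_features
        for x, y in zip(features, labels):
            pred = sigmoid(sum(w * xi for w, xi in zip(weights, x)))
            for j, xi in enumerate(x):
                grad[j] += (pred - y) * xi
        weights = [w - lr * g / len(features) for w, g in zip(weights, grad)]
    return weights


# Made-up toy data: columns are [on-bin-1, on-bin-2, off-bin-1, off-bin-2].
samples = [
    ([1, 0, 1, 0], 0),
    ([1, 0, 0, 1], 0),
    ([0, 1, 1, 0], 1),
    ([0, 1, 0, 1], 1),
]
X = [x for x, _ in samples]
y = [lab for _, lab in samples]
w = train_lr(X, y)  # w[j] is the learned weight of interval j
```

In this toy set the label follows the online interval, so the second online interval ends up with a larger weight than the first. In practice a library implementation of logistic regression would normally replace the hand-written loop.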
Continuing the example above, the following weights may finally be obtained:
Weight of interval (0, 0.1): on-bin-1 = 1.054,
...
Weight of interval (0.5, 1): on-bin-7 = 4.439;
Weight of interval (0, 0.03): off-bin-1 = 0.604,
...
Weight of interval (0.13, 1): off-bin-7 = 3.237.
Next, steps 103 to 104 above are described with reference to the same example. Assume that, for a certain target user, the first predicted value obtained from the online prediction model is 0.66 and the second predicted value obtained from the offline prediction model is 0.25. Then, in step 103, the first interval containing the first predicted value 0.66 is determined to be (0.5, 1), and the second interval containing the second predicted value 0.25 is determined to be (0.13, 1). Subsequently, in step 1041, based on the predetermined weights corresponding to the intervals obtained by binning, the first weight corresponding to the first interval (0.5, 1) is found to be 4.439, and the second weight corresponding to the second interval (0.13, 1) is found to be 3.237.
Finally, in step 1042, the final fused predicted value may be determined from the first weight and the second weight. In an optional embodiment, the first weight and the second weight are summed and the sum is taken as the fused predicted value, that is, fused predicted value = 4.439 + 3.237 = 7.676. The fusion is of course not limited to summation; averaging, for example, may also be used. How the fused predicted value is then applied can be decided according to the specific business.
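The worked example of steps 103, 1041, and 1042 can be reproduced with a small lookup table. The sketch below uses only the four weights stated in the example; the weights of the intermediate intervals are not given and are therefore left out. The names `ONLINE_WEIGHTS`, `OFFLINE_WEIGHTS`, `lookup_weight`, and `fuse`, and the choice to treat each interval as open on the left and closed on the right, are assumptions.

```python
# Interval -> weight, restricted to the intervals whose weights the example states.
ONLINE_WEIGHTS = {(0.0, 0.1): 1.054, (0.5, 1.0): 4.439}
OFFLINE_WEIGHTS = {(0.0, 0.03): 0.604, (0.13, 1.0): 3.237}


def lookup_weight(value, weights):
    """Steps 103 and 1041: find the interval holding value and return its weight."""
    for (low, high), weight in weights.items():
        if low < value <= high:  # boundary handling is an assumption
            return weight
    raise KeyError(f"no known interval contains {value}")


def fuse(first_predicted, second_predicted):
    """Step 1042: sum the two interval weights to get the fused predicted value."""
    first_weight = lookup_weight(first_predicted, ONLINE_WEIGHTS)
    second_weight = lookup_weight(second_predicted, OFFLINE_WEIGHTS)
    return first_weight + second_weight


fused = fuse(0.66, 0.25)
print(round(fused, 3))
```

Replacing the sum in `fuse` with a mean would give the averaging variant mentioned above.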
The effects produced by the technical solutions provided in the embodiments of this specification include:
The predicted value of the online prediction model and the predicted value of the offline prediction model are fused using weights obtained through machine learning, and the fused score is then used to predict the user's label. This improves the accuracy of label prediction while still meeting the business requirement for low latency. In addition, entropy-based binning and a logistic regression model effectively integrate the online and offline model scores, so that the comparability between online and offline scores is adaptively adjusted during machine learning.
Corresponding to the foregoing method embodiments, the embodiments of this specification further provide an apparatus for fusing model predicted values.
Referring to FIG. 3, in an embodiment, for the training stage of the fusion weights, an apparatus 300 for determining fusion weights may include:
A binning unit 301, configured to bin the predicted values of the online prediction model and the predicted values of the offline prediction model separately, according to a set binning method and based on a number of given samples, where each of the samples includes a first predicted value, a second predicted value, and a label of the sample, the first predicted value being predicted by the online prediction model and the second predicted value being predicted by the offline prediction model;
A feature conversion unit 302, configured to convert, according to the binning result, the first predicted value of each sample into a first interval feature corresponding to the interval in which the first predicted value lies, and the second predicted value of each sample into a second interval feature corresponding to the interval in which the second predicted value lies;
A training unit 303, configured to form converted sample data from the first interval feature, the second interval feature, and the label corresponding to each sample, and to train a model on the converted sample data, the trained model being used to fuse the predicted value of the online prediction model with the predicted value of the offline prediction model to obtain the final predicted value.
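The binning unit is described only as applying a "set binning method", and the specification elsewhere mentions entropy-based binning. As a hedged illustration of that idea, the sketch below performs a single greedy step: it picks the cut point on a model's predicted values that minimises the size-weighted label entropy of the two resulting bins. The recursion and stopping rule needed to produce a full set of bins are not specified here, and the names `entropy` and `best_split` as well as the sample scores are assumptions.

```python
import math
from collections import Counter


def entropy(labels):
    """Shannon entropy (in bits) of a list of labels."""
    n = len(labels)
    if n == 0:
        return 0.0
    return -sum((c / n) * math.log2(c / n) for c in Counter(labels).values())


def best_split(scores, labels):
    """Return the cut point minimising weighted label entropy, or None if no cut exists."""
    pairs = sorted(zip(scores, labels))
    n = len(pairs)
    best_cost, best_cut = float("inf"), None
    for i in range(1, n):
        if pairs[i - 1][0] == pairs[i][0]:
            continue  # cannot cut between identical scores
        left = [lab for _, lab in pairs[:i]]
        right = [lab for _, lab in pairs[i:]]
        cost = (len(left) * entropy(left) + len(right) * entropy(right)) / n
        if cost < best_cost:
            best_cost = cost
            best_cut = (pairs[i - 1][0] + pairs[i][0]) / 2
    return best_cut


# Hypothetical predicted values with binary labels, for illustration only.
scores = [0.05, 0.10, 0.20, 0.60, 0.70, 0.90]
labels = [0, 0, 0, 1, 1, 1]
cut = best_split(scores, labels)
print(cut)
```

Applying the same step recursively to each side of the cut, with some stopping criterion, would yield several intervals per model; that extension is left out because the text does not describe it.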
Referring to FIG. 4, in an embodiment, for the score fusion stage, an apparatus 400 for fusing model predicted values may include:
An online score prediction unit 401, configured to acquire business data generated by a target user within a first time period before a trigger moment, determine input features from the business data, input them into the online prediction model, and output a first predicted value, the online prediction model being used to predict the user's label;
An offline score obtaining unit 402, configured to obtain a second predicted value corresponding to the target user that was produced by an offline prediction model, where the input features of the offline prediction model are determined from business features generated by the target user within a second time period in the past, the offline prediction model being used to predict the user's label;
An interval determination unit 403, configured to determine, according to the result of binning the predicted values of the online prediction model and the predicted values of the offline prediction model in advance, the first interval in which the first predicted value lies and the second interval in which the second predicted value lies;
A score fusion unit 404, configured to fuse the first predicted value and the second predicted value using the pre-trained model, according to the first interval and the second interval, to obtain a final fused predicted value, the fused predicted value being used to determine the label of the target user.
In an optional embodiment, the score fusion unit 404 may include:
A weight determining subunit, which obtains, based on the predetermined weights corresponding to the intervals obtained by binning, a first weight corresponding to the first interval and a second weight corresponding to the second interval;
A fusion subunit, which determines a fused predicted value using the first weight and the second weight, the fused predicted value being used to determine the label of the target user.
In an embodiment, the fusion subunit may be configured to:
Sum the first weight and the second weight, and take the sum as the fused predicted value.
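One way to picture how units 401 to 404 fit together is a small class whose collaborators are injected. The sketch below is an assumption-laden illustration rather than the claimed apparatus: the class name `FusionDevice`, its field names, the dictionary standing in for precomputed offline scores, and the pluggable `combine` function are all hypothetical. It highlights that only the online model runs at request time, while the offline score is merely looked up.

```python
from dataclasses import dataclass
from typing import Callable, Dict, Sequence, Tuple

Interval = Tuple[float, float]


@dataclass
class FusionDevice:
    online_model: Callable[[Sequence[float]], float]  # unit 401: scored at request time
    offline_scores: Dict[str, float]                  # unit 402: precomputed per user
    online_weights: Dict[Interval, float]             # learned weight per online interval
    offline_weights: Dict[Interval, float]            # learned weight per offline interval
    combine: Callable[[float, float], float] = lambda a, b: a + b  # fusion subunit

    @staticmethod
    def interval_of(value: float, weights: Dict[Interval, float]) -> Interval:
        """Unit 403: find the interval that contains value (left-open, right-closed)."""
        for low, high in weights:
            if low < value <= high:
                return (low, high)
        raise ValueError(f"no interval contains {value}")

    def fuse(self, user_id: str, recent_features: Sequence[float]) -> float:
        """Unit 404: map both predicted values to interval weights and combine them."""
        first = self.online_model(recent_features)
        second = self.offline_scores[user_id]
        first_weight = self.online_weights[self.interval_of(first, self.online_weights)]
        second_weight = self.offline_weights[self.interval_of(second, self.offline_weights)]
        return self.combine(first_weight, second_weight)


# Stub collaborators using the values from the worked example in this description.
device = FusionDevice(
    online_model=lambda features: 0.66,
    offline_scores={"user-1": 0.25},
    online_weights={(0.0, 0.1): 1.054, (0.5, 1.0): 4.439},
    offline_weights={(0.0, 0.03): 0.604, (0.13, 1.0): 3.237},
)
print(round(device.fuse("user-1", []), 3))
```

Passing `combine=lambda a, b: (a + b) / 2` would switch the same device to averaging without touching the other units.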
For the implementation of the functions and roles of each module in the above apparatus, refer to the implementation of the corresponding steps in the above method; details are not repeated here.
The embodiments of this specification further provide a computer device (such as a server) that includes at least a memory, a processor, and a computer program stored in the memory and executable on the processor, where the processor implements the foregoing method when executing the program.
FIG. 5 shows a more specific schematic diagram of the hardware structure of a computing device provided by the embodiments of this specification. The device may include a processor 1010, a memory 1020, an input/output interface 1030, a communication interface 1040, and a bus 1050. The processor 1010, the memory 1020, the input/output interface 1030, and the communication interface 1040 are communicatively connected to one another inside the device through the bus 1050.
The processor 1010 may be implemented as a general-purpose CPU (Central Processing Unit), a microprocessor, an application-specific integrated circuit (ASIC), or one or more integrated circuits, and is used to execute the relevant programs to implement the technical solutions provided by the embodiments of this specification.
The memory 1020 may be implemented as ROM (Read Only Memory), RAM (Random Access Memory), a static storage device, a dynamic storage device, or the like. The memory 1020 may store an operating system and other application programs. When the technical solutions provided by the embodiments of this specification are implemented in software or firmware, the relevant program code is stored in the memory 1020 and is called and executed by the processor 1010.
The input/output interface 1030 is used to connect an input/output module to realize information input and output. The input/output module may be configured in the device as a component (not shown in the figure) or may be externally connected to the device to provide the corresponding functions. Input devices may include a keyboard, a mouse, a touch screen, a microphone, and various sensors; output devices may include a display, a speaker, a vibrator, and indicator lights.
The communication interface 1040 is used to connect a communication module (not shown in the figure) to realize communication between this device and other devices. The communication module may communicate by wired means (such as USB or a network cable) or by wireless means (such as a mobile network, WIFI, or Bluetooth).
The bus 1050 includes a path that carries information between the components of the device (for example, the processor 1010, the memory 1020, the input/output interface 1030, and the communication interface 1040).
It should be noted that although only the processor 1010, the memory 1020, the input/output interface 1030, the communication interface 1040, and the bus 1050 are shown for the above device, in a specific implementation the device may also include other components necessary for normal operation. In addition, those skilled in the art will understand that the above device may include only the components necessary to implement the solutions of the embodiments of this specification, and need not include all the components shown in the figure.
From the above description of the implementations, those skilled in the art can clearly understand that the embodiments of this specification can be implemented by software plus a necessary general-purpose hardware platform. Based on this understanding, the technical solutions of the embodiments of this specification, in essence or in the part that contributes to the prior art, can be embodied in the form of a software product. The computer software product can be stored in a storage medium, such as ROM/RAM, a magnetic disk, or an optical disc, and includes several instructions that cause a computer device (which may be a personal computer, a server, a network device, or the like) to execute the methods described in the embodiments of this specification or in certain parts of the embodiments.
The systems, apparatuses, modules, or units described in the above embodiments may be implemented by a computer chip or an entity, or by a product having a certain function. A typical implementing device is a computer, which may take the form of a personal computer, a laptop computer, a cellular phone, a camera phone, a smartphone, a personal digital assistant, a media player, a navigation device, an e-mail device, a game console, a tablet computer, a wearable device, or a combination of any of these devices.
The embodiments in this specification are described in a progressive manner. For parts that are the same or similar across embodiments, the embodiments may be referred to one another, and each embodiment focuses on its differences from the others. In particular, the apparatus embodiments are described relatively briefly because they are basically similar to the method embodiments; for the relevant parts, refer to the description of the method embodiments. The apparatus embodiments described above are merely illustrative. The modules described as separate components may or may not be physically separate, and when the solutions of the embodiments of this specification are implemented, the functions of the modules may be implemented in one or more pieces of software and/or hardware. Some or all of the modules may also be selected according to actual needs to achieve the purpose of the solution of an embodiment. Those of ordinary skill in the art can understand and implement this without creative effort.
The above are only specific implementations of the embodiments of this specification. It should be noted that those of ordinary skill in the art can make several improvements and refinements without departing from the principles of the embodiments of this specification, and these improvements and refinements should also be regarded as falling within the scope of protection of the embodiments of this specification.
Claims (14)
Priority Applications (3)
| Application Number | Priority Date | Filing Date | Title |
|---|---|---|---|
| CN201711353984.1A CN108052979A (en) | 2017-12-15 | 2017-12-15 | The method, apparatus and equipment merged to model predication value |
| TW107135970A TWI718422B (en) | 2017-12-15 | 2018-10-12 | Method, device and equipment for fusing model prediction values |
| PCT/CN2018/111824 WO2019114423A1 (en) | 2017-12-15 | 2018-10-25 | Method and apparatus for merging model prediction values, and device |
Applications Claiming Priority (1)
| Application Number | Priority Date | Filing Date | Title |
|---|---|---|---|
| CN201711353984.1A CN108052979A (en) | 2017-12-15 | 2017-12-15 | The method, apparatus and equipment merged to model predication value |
Publications (1)
| Publication Number | Publication Date |
|---|---|
| CN108052979A true CN108052979A (en) | 2018-05-18 |
Family ID: 62132684
Family Applications (1)
| Application Number | Priority Date | Filing Date | Title |
|---|---|---|---|
| CN201711353984.1A Pending CN108052979A (en) | 2017-12-15 | 2017-12-15 | The method, apparatus and equipment merged to model predication value |
Country Status (3)
| Country | Link |
|---|---|
| CN (1) | CN108052979A (en) |
| TW (1) | TWI718422B (en) |
| WO (1) | WO2019114423A1 (en) |
Cited By (8)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| CN108985489A (en) * | 2018-06-08 | 2018-12-11 | 阿里巴巴集团控股有限公司 | A kind of Risk Forecast Method, risk profile device and terminal device |
| CN109063886A (en) * | 2018-06-12 | 2018-12-21 | 阿里巴巴集团控股有限公司 | A kind of method for detecting abnormality, device and equipment |
| CN109635990A (en) * | 2018-10-12 | 2019-04-16 | 阿里巴巴集团控股有限公司 | A kind of training method, prediction technique, device and electronic equipment |
| WO2019114423A1 (en) * | 2017-12-15 | 2019-06-20 | 阿里巴巴集团控股有限公司 | Method and apparatus for merging model prediction values, and device |
| CN111242244A (en) * | 2020-04-24 | 2020-06-05 | 支付宝(杭州)信息技术有限公司 | Characteristic value sorting method, system and device |
| CN111582565A (en) * | 2020-04-26 | 2020-08-25 | 支付宝(杭州)信息技术有限公司 | Data fusion method and device and electronic equipment |
| CN112801358A (en) * | 2021-01-21 | 2021-05-14 | 上海东普信息科技有限公司 | Component prediction method, device, equipment and storage medium based on model fusion |
| CN116402241A (en) * | 2023-06-08 | 2023-07-07 | 浙江大学 | A multi-model-based supply chain data forecasting method and device |
Families Citing this family (8)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| CN112418258B (en) * | 2019-08-22 | 2024-08-16 | 北京京东振世信息技术有限公司 | Feature discretization method and device |
| CN111767982A (en) * | 2020-05-20 | 2020-10-13 | 北京大米科技有限公司 | Training method, device, storage medium and electronic device for user conversion prediction model |
| CN112288457B (en) * | 2020-06-23 | 2024-12-03 | 北京沃东天骏信息技术有限公司 | Data processing method, device, equipment and medium based on multi-model computing fusion |
| CN112711765B (en) * | 2020-12-30 | 2024-06-14 | 深圳前海微众银行股份有限公司 | Information value determining method, terminal, device and storage medium for sample characteristics |
| KR102344383B1 (en) * | 2021-02-01 | 2021-12-29 | 테이블매니저 주식회사 | Method for analyzing loyalty of the customer to store based on artificial intelligence and system thereof |
| CN113312512B (en) * | 2021-06-10 | 2023-10-31 | 北京百度网讯科技有限公司 | Training method, recommending device, electronic equipment and storage medium |
| CN113988527A (en) * | 2021-09-28 | 2022-01-28 | 东方微银科技股份有限公司 | User attribute data grouping method |
| CN113920166B (en) * | 2021-10-29 | 2024-05-28 | 广州文远知行科技有限公司 | Method, device, vehicle and storage medium for selecting object motion model |
Citations (3)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| CN105679021A (en) * | 2016-02-02 | 2016-06-15 | 重庆云途交通科技有限公司 | Travel time fusion prediction and query method based on traffic big data |
| CN106873571A (en) * | 2017-02-10 | 2017-06-20 | 泉州装备制造研究所 | A kind of method for early warning based on data and Model Fusion |
| CN107025153A (en) * | 2016-01-29 | 2017-08-08 | 阿里巴巴集团控股有限公司 | The failure prediction method and device of disk |
Family Cites Families (2)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| US9563855B2 (en) * | 2014-06-27 | 2017-02-07 | Intel Corporation | Using a generic classifier to train a personalized classifier for wearable devices |
| CN108052979A (en) * | 2017-12-15 | 2018-05-18 | 阿里巴巴集团控股有限公司 | The method, apparatus and equipment merged to model predication value |
2017
- 2017-12-15: CN application CN201711353984.1A, published as CN108052979A, status: active, Pending
2018
- 2018-10-12: TW application TW107135970A, published as TWI718422B, status: not active, IP Right Cessation
- 2018-10-25: WO application PCT/CN2018/111824, published as WO2019114423A1, status: not active, Ceased
Cited By (13)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| WO2019114423A1 (en) * | 2017-12-15 | 2019-06-20 | 阿里巴巴集团控股有限公司 | Method and apparatus for merging model prediction values, and device |
| CN108985489A (en) * | 2018-06-08 | 2018-12-11 | 阿里巴巴集团控股有限公司 | A kind of Risk Forecast Method, risk profile device and terminal device |
| CN108985489B (en) * | 2018-06-08 | 2021-12-31 | 创新先进技术有限公司 | Risk prediction method, risk prediction device and terminal equipment |
| CN109063886B (en) * | 2018-06-12 | 2022-05-31 | 创新先进技术有限公司 | Anomaly detection method, device and equipment |
| CN109063886A (en) * | 2018-06-12 | 2018-12-21 | 阿里巴巴集团控股有限公司 | A kind of method for detecting abnormality, device and equipment |
| CN109635990A (en) * | 2018-10-12 | 2019-04-16 | 阿里巴巴集团控股有限公司 | A kind of training method, prediction technique, device and electronic equipment |
| CN109635990B (en) * | 2018-10-12 | 2022-09-16 | 创新先进技术有限公司 | Training method, prediction method, device, electronic equipment and storage medium |
| CN111242244A (en) * | 2020-04-24 | 2020-06-05 | 支付宝(杭州)信息技术有限公司 | Characteristic value sorting method, system and device |
| CN111242244B (en) * | 2020-04-24 | 2020-09-18 | 支付宝(杭州)信息技术有限公司 | Characteristic value sorting method, system and device |
| CN111582565A (en) * | 2020-04-26 | 2020-08-25 | 支付宝(杭州)信息技术有限公司 | Data fusion method and device and electronic equipment |
| CN112801358A (en) * | 2021-01-21 | 2021-05-14 | 上海东普信息科技有限公司 | Component prediction method, device, equipment and storage medium based on model fusion |
| CN116402241A (en) * | 2023-06-08 | 2023-07-07 | 浙江大学 | A multi-model-based supply chain data forecasting method and device |
| CN116402241B (en) * | 2023-06-08 | 2023-08-18 | 浙江大学 | A multi-model-based supply chain data forecasting method and device |
Also Published As
| Publication number | Publication date |
|---|---|
| WO2019114423A1 (en) | 2019-06-20 |
| TW201928709A (en) | 2019-07-16 |
| TWI718422B (en) | 2021-02-11 |
Similar Documents
| Publication | Publication Date | Title |
|---|---|---|
| CN108052979A | | The method, apparatus and equipment merged to model predication value |
| CN108829808B | | Page personalized sorting method and device and electronic equipment |
| CN113536139B | | Interest-based content recommendation method, apparatus, computer equipment and storage medium |
| CN108280462A | | A kind of model training method and device, electronic equipment |
| CN110008973B | | A model training method, a method and device for determining a target user based on a model |
| CN108021931A | | A kind of data sample label processing method and device |
| CN112579909A | | Object recommendation method and device, computer equipment and medium |
| CN109902708A | | A kind of recommended models training method and relevant apparatus |
| CN111582651A | | User risk analysis model training method, device and electronic equipment |
| CN114121180B | | Drug screening method, device, electronic device and storage medium |
| CN110390408A | | Trading object prediction technique and device |
| CN111783810A | | Method and apparatus for determining attribute information of a user |
| CN111026973B | | Commodity interest degree prediction method and device and electronic equipment |
| CN112905885A | | Method, apparatus, device, medium, and program product for recommending resources to a user |
| CN110020112A | | Object Push method and its system |
| CN112182118B | | Target object prediction method based on multiple data sources and related equipment thereof |
| CN113781079B | | Method and apparatus for training a model |
| CN110766513A | | Information sorting method, device, electronic device and readable storage medium |
| CN114266601A | | Marketing strategy determination method, device, terminal device and storage medium |
| CN115796984A | | Training method of item recommendation model, storage medium and related equipment |
| CN107633421A | | A kind of processing method and processing device of market prediction data |
| CN114610996A | | Information pushing method and device |
| CN112819555B | | Item recommendation method and device |
| CN113792952A | | Method and apparatus for generating a model |
| CN117112890B | | Data processing method, contribution value acquisition method and related equipment |
Legal Events
| Date | Code | Title | Description |
|---|---|---|---|
| | PB01 | Publication | |
| | SE01 | Entry into force of request for substantive examination | |
| | REG | Reference to a national code | Ref country code: HK; Ref legal event code: DE; Ref document number: 1254023; Country of ref document: HK |
| | TA01 | Transfer of patent application right | Effective date of registration: 20200925; Address after: Cayman Enterprise Centre, 27 Hospital Road, George Town, Grand Cayman Islands; Applicant after: Innovative advanced technology Co.,Ltd.; Address before: Cayman Enterprise Centre, 27 Hospital Road, George Town, Grand Cayman Islands; Applicant before: Advanced innovation technology Co.,Ltd. |
| | TA01 | Transfer of patent application right | Effective date of registration: 20200925; Address after: Cayman Enterprise Centre, 27 Hospital Road, George Town, Grand Cayman Islands; Applicant after: Advanced innovation technology Co.,Ltd.; Address before: A four-storey 847 mailbox in Grand Cayman Capital Building, British Cayman Islands; Applicant before: Alibaba Group Holding Ltd. |
| | RJ01 | Rejection of invention patent application after publication | Application publication date: 20180518 |
| | REG | Reference to a national code | Ref country code: HK; Ref legal event code: WD; Ref document number: 1254023; Country of ref document: HK |