CN111782937A

CN111782937A - Information sorting method, apparatus, electronic device and computer readable medium

Info

Publication number: CN111782937A
Application number: CN202010415457.4A
Authority: CN
Inventors: 石晓巍
Original assignee: Beijing Sankuai Online Technology Co Ltd
Current assignee: Beijing Sankuai Online Technology Co Ltd
Priority date: 2020-05-15
Filing date: 2020-05-15
Publication date: 2020-10-16

Abstract

The embodiments of the present application disclose an information sorting method, an apparatus, an electronic device, and a computer-readable medium. An embodiment of the method includes: acquiring a candidate information set for a target user; extracting feature information corresponding to each piece of candidate information in the candidate information set; inputting each piece of feature information into a pre-trained sorting model to obtain a score for each piece of candidate information, where the sorting model includes a multi-task network and a fusion network, the multi-task network includes multiple subtask networks with dependencies, different subtask networks are used to predict the probabilities of different user behaviors, and the fusion network is connected to each subtask network and predicts the score of the candidate information based on the probabilities predicted by the subtask networks; and sorting the candidate information based on the scores. This implementation improves the accuracy of the information sorting result.

Description

Information sorting method, apparatus, electronic device and computer readable medium

Technical Field

The embodiments of the present application relate to the field of computer technology, and in particular to an information sorting method, an apparatus, an electronic device, and a computer-readable medium.

Background

With the development of computer technology, information needs to be sent to users in more and more scenarios. For example, when a user searches with a certain search term, information related to that search term can be sent to the user; while a user browses an e-commerce page, promoted product information can be pushed among the product information being browsed. Before the information is sent, the recalled candidate information usually needs to be sorted, so that the candidate information more relevant to the user's needs is sent to the user first.

In the prior art, a sorting model usually learns a user's interest in a piece of information from user behavior data, and the model is then used to sort the candidate information. However, most existing sorting models take a single target as the prediction target, such as the probability that the user clicks or the probability that the user places an order, and sort the candidate information according to that prediction. A model with a single prediction target uses only a narrow slice of information during learning and ignores other information that can characterize user interest. For example, when the probability of placing an order is the prediction target, only order information is used, while information such as payment, browsing duration, and rating is ignored. As a result, existing sorting models generally make poor use of the available information during learning, which leads to low sorting accuracy.

Summary of the Invention

The embodiments of the present application propose an information sorting method, an apparatus, an electronic device, and a computer-readable medium, to solve the technical problem in the prior art of inaccurate sorting caused by the sorting model's low utilization of information.

In a first aspect, an embodiment of the present application provides an information sorting method. The method includes: acquiring a candidate information set for a target user; extracting feature information corresponding to each piece of candidate information in the candidate information set; inputting each piece of feature information into a pre-trained sorting model to obtain a score for each piece of candidate information, where the sorting model includes a multi-task network and a fusion network, the multi-task network includes multiple subtask networks with dependencies, different subtask networks are used to predict the probabilities of different user behaviors, and the fusion network is connected to each subtask network and is used to predict the score of the candidate information based on the probabilities predicted by the subtask networks; and sorting the candidate information based on the scores.

In a second aspect, an embodiment of the present application provides an information sorting apparatus. The apparatus includes: an acquisition unit configured to acquire a candidate information set for a target user; an extraction unit configured to obtain the feature information corresponding to each piece of candidate information in the candidate information set; an input unit configured to input each piece of feature information into a pre-trained sorting model to obtain a score for each piece of candidate information, where the sorting model includes a multi-task network and a fusion network, the multi-task network includes multiple subtask networks with dependencies, different subtask networks are used to predict the probabilities of different user behaviors, and the fusion network is connected to each subtask network and is used to predict the score of the candidate information based on the probabilities predicted by the subtask networks; and a sorting unit configured to sort the candidate information based on the scores.

In a third aspect, an embodiment of the present application provides an electronic device, including: one or more processors; and a storage device storing one or more programs which, when executed by the one or more processors, cause the one or more processors to implement the method described in the first aspect.

In a fourth aspect, an embodiment of the present application provides a computer-readable medium storing a computer program which, when executed by a processor, implements the method described in the first aspect.

The information sorting method, apparatus, electronic device, and computer-readable medium provided by the embodiments of the present application extract the feature information corresponding to each piece of candidate information in a candidate information set for a target user, input each piece of feature information into a pre-trained sorting model that contains a multi-task network and a fusion network to obtain a score for each piece of candidate information, and finally sort the candidate information based on the scores. Because the multi-task network in the sorting model contains multiple subtask networks with dependencies, and the fusion network fuses the outputs of the subtask networks into the final output of the sorting model, the final output takes the prediction targets of multiple subtask networks into account. This not only improves information utilization during learning, but also lets the sorting model account for the dependencies between pieces of information in real application scenarios, which improves both the predictive ability of the sorting model and the accuracy of the sorting result.

Description of the Drawings

Other features, objects, and advantages of the present application will become more apparent from the following detailed description of non-limiting embodiments, made with reference to the accompanying drawings:

FIG. 1 is a flowchart of an embodiment of the information sorting method according to the present application;

FIG. 2 is a schematic diagram of the directed acyclic graph of a Bayesian multi-task network according to the present application;

FIG. 3 is a schematic structural diagram of a Bayesian multi-task network according to the present application;

FIG. 4 is a schematic structural diagram of a sorting model according to the present application;

FIG. 5 is a schematic structural diagram of an embodiment of the information sorting apparatus according to the present application;

FIG. 6 is a schematic structural diagram of a computer system suitable for implementing the electronic device of the embodiments of the present application.

Detailed Description

The present application is described in further detail below with reference to the accompanying drawings and embodiments. It should be understood that the specific embodiments described here serve only to explain the related invention and do not limit it. It should also be noted that, for ease of description, the drawings show only the parts relevant to the invention.

It should be noted that the embodiments of the present application and the features of the embodiments may be combined with one another where no conflict arises. The present application is described in detail below with reference to the drawings and in conjunction with the embodiments.

Refer to FIG. 1, which shows the flow of an embodiment of the information sorting method according to the present application. The method may be executed by an electronic device such as a server. The server may be hardware or software. When the server is hardware, it may be implemented as a distributed cluster of multiple devices or as a single device. When the server is software, it may be implemented as multiple pieces of software or software modules, or as a single piece of software or software module. No specific limitation is made here. The information sorting method includes the following steps:

Step 101: acquire a candidate information set for the target user.

In this embodiment, the executing entity of the information sorting method may acquire a candidate information set for the target user. The target user is the user to whom information is to be pushed or returned. The candidate information set is the set of information recalled for the target user and may include multiple pieces of candidate information. In practice, the candidate information set may be recalled by another server or by calling another service, or it may be recalled by the executing entity itself; this embodiment does not limit this.

In this embodiment, candidate information may be recalled in a variety of scenarios. The embodiments of the present application do not limit the recall scenario.

As an example, in one scenario the candidate information set is recalled in a user search scenario. In this case the candidate information set is the set of search results recalled based on the target user's search request. For example, after a user searches with a keyword in a client application, information related to that search term can be recalled, and part of it is then selected and returned to the user in response to the search request. Taking a food-ordering client as an example, when a user searches for "Sichuan cuisine" in the client and the search request sent by the client is received, the store information of the stores on the ordering platform that serve Sichuan cuisine can be recalled. Each recalled piece of store information is one piece of candidate information, and the set of all recalled store information is the candidate information set.

In another scenario, the candidate information set is recalled in a push scenario. In this case the candidate information set is the set of candidate push information recalled when the target user satisfies a preset push condition. When the target user meets the preset push condition, candidate push information that could be pushed to the target user is recalled automatically. Continuing with the food-ordering client, when the user opens the client, some dish information or store information can be recalled automatically, such as information on dishes the user has previously browsed or ordered. Each recalled piece of information is treated as one piece of candidate information, which yields the candidate information set.

Step 102: extract the feature information corresponding to each piece of candidate information in the candidate information set.

In this embodiment, the executing entity may extract the feature information corresponding to each piece of candidate information in the candidate information set. The feature information corresponding to a piece of candidate information may include features extracted from that candidate information (candidate information features), features of the target user (user features), and other features. Taking the food-ordering scenario as an example, the candidate information may be store information. The candidate information features may then include, but are not limited to, the type of store and the category, taste, style, target audience, location, and average price of the dishes the store serves. The user features may include, but are not limited to, the user's attribute information (such as age, region, and gender) and behavioral features. The other features may include, but are not limited to, time and geographic location information.

In some optional implementations of this embodiment, the executing entity may actively push information to the target user when a preset push condition is satisfied. In this scenario, the executing entity may extract the feature information corresponding to each piece of candidate information as follows. First, it extracts the user features of the target user and extracts the candidate information features from each piece of candidate information in the candidate information set. Then, for each piece of candidate information, it aggregates the user features and that candidate's candidate information features to obtain the feature information corresponding to that candidate information.

In some optional implementations of this embodiment, the executing entity may return information in response to a user's search request in a user search scenario. In this scenario, the candidate information in the candidate information set consists of the search results corresponding to the target user's search request, and the feature information corresponding to each piece of candidate information may additionally include features extracted from the search request (request features). The executing entity may then extract the feature information as follows. First, it extracts the user features of the target user, extracts the request features from the search request, and extracts the candidate information features from each piece of candidate information in the candidate information set. The request features may include the request time, the request location, the search term, and so on. Then, for each piece of candidate information, it aggregates the user features, the request features, and that candidate's candidate information features to obtain the feature information corresponding to that candidate information.
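The aggregation described in the two implementations above can be sketched as follows. This is only an illustration under assumptions: the function names, the dictionary representation, the key prefixes, and all feature names and values are hypothetical and are not specified by the application.

```python
from typing import Any, Dict, List, Optional


def build_feature_info(
    user_features: Dict[str, Any],
    candidate_features: Dict[str, Any],
    request_features: Optional[Dict[str, Any]] = None,
) -> Dict[str, Any]:
    """Aggregate the feature groups for one candidate into a single record."""
    feature_info: Dict[str, Any] = {}
    # Prefix each key with its group so names from different groups cannot collide.
    for key, value in user_features.items():
        feature_info[f"user.{key}"] = value
    # Request features exist only in the search scenario; the push scenario omits them.
    for key, value in (request_features or {}).items():
        feature_info[f"request.{key}"] = value
    for key, value in candidate_features.items():
        feature_info[f"candidate.{key}"] = value
    return feature_info


def extract_all(
    user_features: Dict[str, Any],
    candidates: List[Dict[str, Any]],
    request_features: Optional[Dict[str, Any]] = None,
) -> List[Dict[str, Any]]:
    """Build one feature record per candidate in the candidate information set."""
    return [build_feature_info(user_features, c, request_features) for c in candidates]


# Hypothetical example data for a food-ordering client.
user = {"age_group": "25-34", "region": "Beijing"}
request = {"search_term": "Sichuan cuisine", "hour": 19}
stores = [
    {"store_type": "restaurant", "avg_price": 68.0},
    {"store_type": "snack bar", "avg_price": 25.0},
]

search_rows = extract_all(user, stores, request)  # search scenario
push_rows = extract_all(user, stores)             # push scenario, no request features
```

The user features (and, in the search scenario, the request features) are shared by every candidate, so only the candidate information features differ between records.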

Step 103: input each piece of feature information into the pre-trained sorting model to obtain the score of each piece of candidate information.

In this embodiment, the executing entity may input each piece of feature information into a pre-trained sorting model to obtain a score for each piece of candidate information. The sorting model may include a multi-task network and a fusion network.

In this embodiment, the multi-task network may include multiple subtask networks with dependencies. For example, the multi-task network may be a Bayesian multi-task network. It may also be another multi-task network built from multiple subtask networks with dependencies; the embodiments of the present application do not limit the structure of the multi-task network.

In this embodiment, different subtask networks are used to predict the probabilities of different user behaviors. As an example, in an application scenario where the sorting model targets payment, the user behaviors may include, but are not limited to, at least two of the following: exposure behavior, click behavior, browsing behavior (such as a dwell time longer than 5 seconds), ordering behavior, payment behavior, refund behavior, and rating behavior (such as a 5-point rating). Exposure behavior means that the information is exposed to the user so that the user sees it.

Taking the case where the user behaviors include all of the behaviors listed above, the multi-task network may include: a subtask network for predicting the exposure probability; a subtask network for predicting the probability that the user performs a click; a subtask network for predicting the probability that the user performs a browsing behavior that satisfies a first preset condition; a subtask network for predicting the probability that the user places an order; a subtask network for predicting the probability that the user pays; a subtask network for predicting the probability that the user requests a refund; and a subtask network for predicting the probability that the user performs a rating behavior that satisfies a second preset condition. The exposure probability is the probability that the user sees the candidate information.

Because user behaviors usually depend on one another, the subtask networks in the multi-task network also depend on one another. Continuing the example above, the candidate information must be exposed to the user before the user can click it, so the click behavior depends on the exposure behavior. Accordingly, the subtask network that predicts the probability of a click depends on the output of the subtask network that predicts the exposure probability.

Similarly, the candidate information must be clicked by the user before the user can browse it or place an order, so the browsing behavior and the ordering behavior depend on the click behavior. Accordingly, the subtask network that predicts the probability of browsing and the subtask network that predicts the probability of placing an order both depend on the output of the subtask network that predicts the probability of a click.

Similarly, the product involved in the candidate information must be ordered by the user before it can be paid for, so the payment behavior depends on the ordering behavior. Accordingly, the subtask network that predicts the probability of payment depends on the output of the subtask network that predicts the probability of placing an order.

Similarly, an order must be paid for before it can be refunded or rated, so the refund behavior and the rating behavior both depend on the payment behavior. Accordingly, the subtask network that predicts the probability of a refund and the subtask network that predicts the probability of a rating both depend on the output of the subtask network that predicts the probability of payment.

In this embodiment, the fusion network may be connected to each subtask network and is used to predict the score of the candidate information based on the probabilities predicted by the subtask networks. The fusion network may be built from a formula, a function, or a network structure such as a neural network.
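A minimal sketch of a formula-based fusion follows, assuming a weighted sum of the subtask probabilities. The application does not specify the fusion formula; the weights, the behavior names, and the example probabilities below are hypothetical and chosen only for illustration.

```python
from typing import Dict, List, Tuple

# Hypothetical fusion weights; the negative weight penalises a predicted refund.
FUSION_WEIGHTS: Dict[str, float] = {
    "click": 0.1,
    "dwell_over_5s": 0.1,
    "order": 0.2,
    "payment": 0.5,
    "refund": -0.3,
    "rating_5": 0.2,
}


def fuse(probabilities: Dict[str, float], weights: Dict[str, float] = FUSION_WEIGHTS) -> float:
    """Combine the per-behavior probabilities into one score with a weighted sum."""
    return sum(weights.get(name, 0.0) * p for name, p in probabilities.items())


def sort_candidates(candidates: Dict[str, Dict[str, float]]) -> List[Tuple[str, float]]:
    """Score every candidate and return (candidate id, score) pairs, highest score first."""
    scored = [(cid, fuse(probs)) for cid, probs in candidates.items()]
    return sorted(scored, key=lambda item: item[1], reverse=True)


# Hypothetical subtask outputs for two stores.
predictions = {
    "store_a": {"click": 0.30, "order": 0.10, "payment": 0.08, "refund": 0.01},
    "store_b": {"click": 0.50, "order": 0.05, "payment": 0.03, "refund": 0.02},
}
ranking = sort_candidates(predictions)
```

In this made-up example store_b is clicked more often, but store_a scores higher because its predicted order and payment probabilities are larger. A learned fusion network, such as a small neural network, could replace `fuse` without changing the sorting step.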

The fusion network fuses the outputs of the subtask networks, so that the final output of the sorting model takes the prediction targets of multiple subtask networks into account. This not only improves information utilization during learning, but also lets the sorting model account for the dependencies between pieces of information in real application scenarios. The predictive ability of the sorting model and the accuracy of the sorting result are improved as a result, and the information sent to the user is more relevant to the user's needs.

In some optional implementations of this embodiment, the multi-task network may be a Bayesian multi-task network. The Bayesian multi-task network may be built in advance according to the following sub-steps S11 to S13:

Sub-step S11: build a directed acyclic graph based on the dependencies between user behaviors.

Take as an example the case where the user behaviors are display (that is, exposure), click, dwell time longer than 5 seconds (which can be regarded as performing a browsing behavior whose duration exceeds 5 seconds), order, payment, refund, and a rating of 5 points. The directed acyclic graph in this case may be as shown in FIG. 2.

The circles in FIG. 2 represent variables (or influencing factors). Each variable corresponds to one user behavior. The variables are therefore "display", "click", "dwell time longer than 5 seconds", "order", "payment", "refund", and "rating of 5 points".

A connecting line between two variables indicates that the two variables have a dependency. For example, a piece of information can be clicked by the user only after it has been displayed, so the click behavior depends on the display behavior. The directed acyclic graph therefore contains an edge pointing from "display" to "click". As another example, a piece of information must be clicked by the user before the next step of placing an order can take place, so the ordering behavior depends on the click behavior. The directed acyclic graph therefore contains an edge pointing from "click" to "order".
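The graph of sub-step S11 can be written down as a mapping from each behavior to the behaviors it depends on. The sketch below uses the dependencies stated in this embodiment; the identifier names are assumptions, and the use of Python's standard-library `graphlib` (Python 3.9 or later) to check acyclicity and obtain an evaluation order is an illustrative choice, not something the application prescribes.

```python
from graphlib import TopologicalSorter  # standard library, Python 3.9+

# Each key is a behavior (a node of the graph); its value is the set of
# behaviors it depends on (the nodes with an edge pointing to it).
PARENTS = {
    "display": set(),
    "click": {"display"},
    "dwell_over_5s": {"click"},
    "order": {"click"},
    "payment": {"order"},
    "refund": {"payment"},
    "rating_5": {"payment"},
}

# static_order() yields every node after all of its predecessors and raises
# graphlib.CycleError if the graph contains a cycle.
evaluation_order = list(TopologicalSorter(PARENTS).static_order())
```

Evaluating the subtask networks in this order guarantees that the probability a network depends on is available before that network runs.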

Sub-step S12: determine the conditional probability between every two user behaviors that have a dependency.

Continuing the example above, for a piece of candidate information x, denote the probability that x is seen by the user (the exposure probability) as P(exposure) and the probability that x is clicked by the user as P(click). Then P(click) = P(click | exposure) P(exposure), where P(click | exposure) is the conditional probability that x is clicked given that it has been exposed (seen by the user).

Similarly, denote the probability that the user stays on x for more than 5 seconds as P(dwell time > 5 s). Then P(dwell time > 5 s) = P(dwell time > 5 s | click) P(click), where P(dwell time > 5 s | click) is the conditional probability that the dwell time exceeds 5 seconds after the user clicks.

Similarly, denote the probability that the user places an order for x as P(order). Then P(order) = P(order | click) P(click), where P(order | click) is the conditional probability that the user places an order after clicking x.

Similarly, denote the probability that the user pays for x as P(payment). Then P(payment) = P(payment | order) P(order), where P(payment | order) is the conditional probability that the user pays after placing an order for x.

Similarly, denote the probability that the user requests a refund for x as P(refund). Then P(refund) = P(refund | payment) P(payment), where P(refund | payment) is the conditional probability of a refund after x has been paid for.

Similarly, denote the probability that the user rates x 5 points as P(rating of 5). Then P(rating of 5) = P(rating of 5 | payment) P(payment), where P(rating of 5 | payment) is the conditional probability of a 5-point rating after x has been paid for.
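The chain of formulas above can be checked numerically with a short sketch. Every probability value below is a made-up number for illustration, and the identifier names are assumptions. Each formula takes the product form because a behavior can occur only after the behavior it depends on has occurred.

```python
# The behavior each behavior depends on, as stated in the formulas above.
PARENT = {
    "click": "exposure",
    "dwell_over_5s": "click",
    "order": "click",
    "payment": "order",
    "refund": "payment",
    "rating_5": "payment",
}

# Hypothetical values, for illustration only.
P_EXPOSURE = 0.8
CONDITIONAL = {
    "click": 0.25,         # P(click | exposure)
    "dwell_over_5s": 0.6,  # P(dwell time > 5 s | click)
    "order": 0.2,          # P(order | click)
    "payment": 0.9,        # P(payment | order)
    "refund": 0.05,        # P(refund | payment)
    "rating_5": 0.3,       # P(rating of 5 | payment)
}


def marginal(behavior: str) -> float:
    """P(behavior) = P(behavior | parent) * P(parent), applied recursively up to exposure."""
    if behavior == "exposure":
        return P_EXPOSURE
    return CONDITIONAL[behavior] * marginal(PARENT[behavior])


p_click = marginal("click")      # 0.25 * 0.8
p_payment = marginal("payment")  # 0.9 * 0.2 * 0.25 * 0.8
p_refund = marginal("refund")    # 0.05 * P(payment)
```

With these numbers P(click) is about 0.2 and P(payment) about 0.036. A probability further down the chain can never exceed the one it depends on.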

Sub-step S13: based on the directed acyclic graph and the conditional probabilities, build a Bayesian multi-task network that contains multiple subtask networks with dependencies.

Here, because the Bayesian multi-task network to be built is a Bayesian network, it can be built with the methods used to construct a Bayesian network (BN). A Bayesian network, also known as a belief network or a directed acyclic graphical model, is a probabilistic graphical model. It can be constructed from a directed acyclic graph (DAG) and a conditional probability table (CPT). Because the directed acyclic graph and the conditional probabilities of the Bayesian multi-task network are known, the network can be built from them.

Specifically, each variable in the directed acyclic graph corresponds to one prediction target. First, multiple subtask networks are built according to the different prediction targets, so that each subtask network corresponds to one prediction target. A subtask network may use a neural network structure, such as a deep neural network. Because the directed edges in the graph represent the dependencies between prediction targets, once the subtask networks have been built, every two subtask networks with a dependency are connected according to their conditional probability formula. For example, suppose subtask network B outputs the probability P(payment) = P(payment | order) P(order), where P(order) is output by another subtask network A. The output layer of subtask network A is then connected to the conditional probability prediction layer of subtask network B. After all subtask networks with dependencies have been connected in this way, the Bayesian multi-task network is obtained.
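The wiring described above can be sketched as follows. This is a toy illustration under stated assumptions: each subtask network is reduced to a single logistic unit with random, untrained weights standing in for a deep neural network, and the class and attribute names are hypothetical. Only the connection pattern (the parent's output multiplied into the child's conditional probability) follows the description.

```python
import math
import random
from typing import Optional, Sequence


def sigmoid(z: float) -> float:
    return 1.0 / (1.0 + math.exp(-z))


class SubtaskNetwork:
    """Stand-in for one subtask network: a single logistic unit, not a deep network."""

    def __init__(self, name: str, n_features: int,
                 parent: Optional["SubtaskNetwork"] = None, seed: int = 0) -> None:
        rng = random.Random(seed)
        self.name = name
        self.parent = parent  # the subtask network this one depends on, if any
        # Random placeholder weights; a real model would learn these from behavior data.
        self.weights = [rng.uniform(-0.5, 0.5) for _ in range(n_features)]
        self.bias = 0.0

    def conditional(self, features: Sequence[float]) -> float:
        """Conditional probability prediction layer: P(behavior | parent behavior)."""
        z = self.bias + sum(w * x for w, x in zip(self.weights, features))
        return sigmoid(z)

    def forward(self, features: Sequence[float]) -> float:
        """Output P(behavior) = P(behavior | parent) * P(parent)."""
        p = self.conditional(features)
        if self.parent is not None:
            # The parent's output feeds this network's conditional probability layer.
            p *= self.parent.forward(features)
        return p


# Connect the networks along one path of the graph: exposure -> click -> order -> payment.
exposure = SubtaskNetwork("exposure", 3, seed=1)
click = SubtaskNetwork("click", 3, parent=exposure, seed=2)
order = SubtaskNetwork("order", 3, parent=click, seed=3)
payment = SubtaskNetwork("payment", 3, parent=order, seed=4)

features = [1.0, 0.5, -0.3]  # hypothetical feature vector for one candidate
p_exposure = exposure.forward(features)
p_click = click.forward(features)
p_order = order.forward(features)
p_payment = payment.forward(features)
```

For brevity each call to `forward` recomputes the parent's output. An implementation that evaluates the networks once in dependency order would avoid the repeated work.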

As an example, FIG. 3 shows a schematic structural diagram of a Bayesian multi-task network. As shown in FIG. 3, the Bayesian multi-task network includes seven sub-task networks: a sub-task network for predicting the exposure probability P(exposure); a sub-task network for predicting the probability P(click) that the user performs a click; a sub-task network for predicting the probability P(dwell time greater than 5 seconds) that the user stays for more than 5 seconds; a sub-task network for predicting the probability P(order) that the user places an order; a sub-task network for predicting the probability P(payment) that the user pays; a sub-task network for predicting the probability P(refund) that the user requests a refund; and a sub-task network for predicting the probability P(rating of 5) that the user gives a rating of 5 points.

Here, the probability P(click) that the user performs a click depends on the exposure probability P(exposure). The probability P(dwell time greater than 5 seconds) and the probability P(order) that the user places an order both depend on P(click). The probability P(payment) that the user pays depends on P(order). The probability P(refund) that the user requests a refund and the probability P(rating of 5) that the user gives a rating of 5 points both depend on P(payment).
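The dependency structure described above can be written down as a parent map and resolved from the root outward. This is a sketch under stated assumptions: the English behavior labels and the `marginals` helper are hypothetical, each behavior is assumed to have exactly one parent, and the conditional probabilities are supplied as plain numbers rather than produced by sub-task networks.

```python
# Parent of each behavior in the dependency graph (None marks the root).
# The labels are illustrative English names for the seven prediction targets.
PARENT = {
    "exposure": None,
    "click": "exposure",
    "dwell_gt_5s": "click",
    "order": "click",
    "payment": "order",
    "refund": "payment",
    "rating_5": "payment",
}


def marginals(conditional):
    """Compute P(behavior) for every node from P(behavior | parent).

    `conditional` maps each behavior to its conditional probability given
    its parent (for the root, the unconditional probability).
    """
    resolved = {}

    def resolve(node):
        if node not in resolved:
            parent = PARENT[node]
            p_parent = 1.0 if parent is None else resolve(parent)
            resolved[node] = conditional[node] * p_parent
        return resolved[node]

    for node in PARENT:
        resolve(node)
    return resolved


# Hypothetical conditional probabilities, all 0.5 for easy checking.
example = marginals({name: 0.5 for name in PARENT})
```

Because every marginal is the product of the conditionals along the path to the root, a behavior deeper in the graph can never be more probable than the behavior it depends on.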

Since the multi-task network includes multiple sub-task networks with dependencies, the sorting model can not only use more information during learning but also take into account the dependencies between pieces of information in the actual application scenario. This helps improve the prediction capability of the sorting model and the accuracy of the sorting result.

In some optional implementations of this embodiment, the sub-task networks in the multi-task network may share the same input layer and at least one hidden layer, and the shared hidden layer may be located in the shallow part of each sub-task network. Here, the shallow hidden layer shared by the sub-task networks is referred to as the shared shallow layer. Taking the structure of the Bayesian multi-task network shown in FIG. 3 as an example, the seven sub-task networks may use the same input layer and the same shared shallow layer.

Since different sub-task networks have different supervision signals, the patterns they learn are usually different. If a single sub-task is trained alone, it may easily fit some random noise. By sharing the input layer and the shallow hidden layer, multiple different sub-task networks are trained together, so the probability that they all fit the same noise is greatly reduced. The multi-task model therefore reduces the risk of overfitting and improves the generalization performance of the model.

In some optional implementations of this embodiment, in addition to being connected to each sub-task network (specifically, to the output layer of each sub-task network), the fusion network may also be connected to the hidden layer shared by the sub-task networks (i.e., the shared shallow layer). As an example, building on the schematic structural diagram of the Bayesian multi-task network shown in FIG. 3, FIG. 4 shows a schematic structural diagram of the sorting model. As shown in FIG. 4, the fusion network in the sorting model is connected to each sub-task network in the multi-task network and to the shared shallow layer of the sub-task networks, so that it takes as input both the shallow features of the shared shallow layer and the probabilities output by the sub-task networks.

As a result, the fusion network can simultaneously use the shallow features of the shared shallow layer and the high-level features output by the sub-task networks (i.e., the probabilities output by the sub-task networks), so that multiple kinds of features jointly determine the final output of the sorting model. This further improves information utilization, and thereby further improves the prediction capability of the sorting model and the accuracy of the sorting result.
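The way the fusion network combines its two kinds of input can be sketched as a concatenation followed by a scoring layer. The application does not specify the internal structure of the fusion network, so the single logistic layer below is an assumption chosen for brevity, and the name `fusion_score` and its parameters are hypothetical.

```python
import math


def fusion_score(shared_features, task_probs, weights, bias=0.0):
    """Score one candidate from shallow features plus sub-task probabilities.

    Assumption: the fusion network is reduced to a single logistic layer
    over the concatenated input; a real fusion network may be deeper.
    """
    fusion_input = list(shared_features) + list(task_probs)
    if len(weights) != len(fusion_input):
        raise ValueError("one weight is required per fusion input value")
    z = bias + sum(w * x for w, x in zip(weights, fusion_input))
    return 1.0 / (1.0 + math.exp(-z))


# Hypothetical inputs: two shallow features and three sub-task probabilities.
score = fusion_score(
    shared_features=[0.3, -1.2],
    task_probs=[0.9, 0.4, 0.1],
    weights=[0.5, 0.1, 1.0, 2.0, 3.0],
)
```

Keeping both input groups in one vector is what lets the shallow features and the predicted probabilities jointly determine the final score.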

In some optional implementations of this embodiment, the sorting model may be pre-trained through the following sub-steps S21 and S22:

Sub-step S21: acquire a sample set.

A sample in the sample set may include feature information corresponding to a piece of historical candidate information and multiple items of user behavior annotation information. Different items of user behavior annotation information indicate whether the user performed different user behaviors on the historical candidate information. Here, different items of user behavior annotation information can be used to train different sub-task networks.

Taking product information as an example of historical candidate information, the user behavior annotation information may include, but is not limited to: an exposure annotation indicating whether the historical candidate information was exposed to the user; a click behavior annotation indicating whether the historical candidate information was clicked by the user; a browsing behavior annotation indicating whether the historical candidate information was browsed by the user for more than 5 seconds; an order behavior annotation indicating whether the user placed an order for the historical candidate information; a payment behavior annotation indicating whether the user paid for the historical candidate information; a refund behavior annotation indicating whether the user requested a refund for the historical candidate information; and a review behavior annotation indicating whether the user gave the historical candidate information a rating of 5 points.

In addition, based on the actual business goal, one of the items of user behavior annotation information can be selected as the target user behavior annotation information, which is used to train the fusion network. As an example, when the actual business goal is to predict the probability that a user clicks, the user behavior annotation information indicating whether the user performed a click may be used as the target user behavior annotation information. As another example, when the actual business goal is to predict the probability that a user places an order, the user behavior annotation information indicating whether the user placed an order may be used as the target user behavior annotation information. The more kinds of annotation information there are, the more information is utilized and the better the prediction performance of the model.

Optionally, the sample set in sub-step S21 may be generated in advance through the following steps:

In the first step, a user behavior log and a feature log corresponding to the user behavior log are obtained. The user behavior log includes a historical candidate information set from a historical scenario (such as a historical push scenario or a historical search scenario) and the user's behavior data for each piece of historical candidate information in the set. The feature log may include the feature information corresponding to each piece of historical candidate information in the historical candidate information set.

In the second step, for each piece of historical candidate information, multiple items of user behavior annotation information corresponding to that historical candidate information are generated based on the user's behavior data for it.

In the third step, the feature information and the user behavior annotation information corresponding to each piece of historical candidate information are taken as one sample, and the samples are aggregated into the sample set. It should be noted that the training process and the sample set generation process can be performed offline to save online network resources.
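The three steps above can be sketched as a join between the two logs. The log formats are not specified in the application, so the dictionary layouts, the behavior labels, and the name `build_sample_set` below are all hypothetical; skipping candidates without a feature record is likewise an assumption.

```python
# Illustrative English labels for the annotated user behaviors.
BEHAVIORS = (
    "exposure", "click", "dwell_gt_5s", "order",
    "payment", "refund", "rating_5",
)


def build_sample_set(behavior_log, feature_log):
    """Join a behavior log and a feature log into labeled samples.

    behavior_log: maps a candidate id to the set of behaviors observed for it.
    feature_log:  maps a candidate id to its feature information.
    """
    samples = []
    for candidate_id, observed in behavior_log.items():
        features = feature_log.get(candidate_id)
        if features is None:
            continue  # assumption: drop candidates with no feature record
        # One 0/1 annotation per behavior, later used by one sub-task network each.
        labels = {name: int(name in observed) for name in BEHAVIORS}
        samples.append({"features": features, "labels": labels})
    return samples


# Hypothetical logs for two historical candidates.
sample_set = build_sample_set(
    behavior_log={"item-1": {"exposure", "click", "order"}, "item-2": {"exposure"}},
    feature_log={"item-1": [0.2, 1.0], "item-2": [0.7, 0.0]},
)
```

Each resulting sample carries one label per behavior, which matches the requirement that different annotations train different sub-task networks.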

Sub-step S22: based on the feature information and the items of user behavior annotation information in the sample set, train the multi-task network and the fusion network using a machine learning method to obtain the sorting model.

Optionally, the multi-task network and the fusion network can be trained at the same time. Specifically, the feature information in the sample set can be used as the input of the multi-task network, each item of user behavior annotation information corresponding to the input feature information can be used as the output of the corresponding sub-task network, and the target behavior annotation information among the user behavior annotation information corresponding to the input feature information can be used as the output of the fusion network. The multi-task network and the fusion network are then trained simultaneously using a machine learning method to obtain the sorting model.

Training the multi-task network and the fusion network jointly reduces training time and improves training efficiency.

Optionally, the multi-task network can be trained first, and the fusion network can then be trained with the parameters of the multi-task network fixed. Specifically, the feature information in the sample set is first used as the input of the multi-task network, and each item of user behavior annotation information corresponding to the input feature information is used as the output of the corresponding sub-task network in the multi-task network, and the multi-task network is trained using a machine learning method. Then, with the parameters of the multi-task network fixed, the feature information in the sample set is used as the input of the multi-task network, and the target behavior annotation information among the user behavior annotation information corresponding to the input feature information is used as the output of the fusion network, and the fusion network is trained using a machine learning method. This yields a sorting model that includes the trained multi-task network and the trained fusion network.

During joint training, a sub-task network receives both the gradient signal from its own output node and the gradient signal fed back from the fusion network, so the model parameter updates may fluctuate considerably and the model may be hard to converge. By training the multi-task network and the fusion network separately in sequence, each parameter update is based on a single gradient signal, which gives the model better convergence.
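The sequential scheme can be illustrated with a deliberately small stand-in model. In the sketch below each sub-task network and the fusion network are reduced to a logistic regression trained by stochastic gradient descent; this simplification, the function names, and the hyperparameters are assumptions and do not reflect the networks described in the application. The point it shows is only the ordering: sub-task parameters are learned first and are never updated while the fusion parameters are trained.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def predict(weights, inputs):
    return sigmoid(sum(w * x for w, x in zip(weights, inputs)))


def sgd_step(weights, inputs, label, lr):
    """One logistic-regression gradient step; returns the updated weights."""
    error = predict(weights, inputs) - label
    return [w - lr * error * x for w, x in zip(weights, inputs)]


def fusion_inputs(task_weights, tasks, features):
    """Sub-task probabilities plus a constant 1.0 acting as a bias input."""
    return [predict(task_weights[t], features) for t in tasks] + [1.0]


def train_two_stage(samples, tasks, target, epochs=200, lr=0.5):
    n_features = len(samples[0]["features"])
    task_weights = {t: [0.0] * n_features for t in tasks}

    # Stage 1: each sub-task model sees only its own annotation.
    for _ in range(epochs):
        for s in samples:
            for t in tasks:
                task_weights[t] = sgd_step(
                    task_weights[t], s["features"], s["labels"][t], lr)

    # Stage 2: task_weights are frozen; only fusion_weights are updated,
    # using the target behavior annotation as the label.
    fusion_weights = [0.0] * (len(tasks) + 1)
    for _ in range(epochs):
        for s in samples:
            inputs = fusion_inputs(task_weights, tasks, s["features"])
            fusion_weights = sgd_step(
                fusion_weights, inputs, s["labels"][target], lr)
    return task_weights, fusion_weights


def score(task_weights, fusion_weights, tasks, features):
    return predict(fusion_weights, fusion_inputs(task_weights, tasks, features))


# Hypothetical toy data: the first feature is a constant bias term.
toy_samples = [
    {"features": [1.0, 1.0], "labels": {"click": 1, "order": 1}},
    {"features": [1.0, -1.0], "labels": {"click": 0, "order": 0}},
]
toy_tasks = ["click", "order"]
tw, fw = train_two_stage(toy_samples, toy_tasks, target="order")
```

In joint training, by contrast, the update in stage 2 would also flow back into `task_weights`, which is the second gradient signal the paragraph above refers to.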

Step 104: sort the pieces of candidate information based on the scores.

In this embodiment, the above-mentioned execution body may sort the pieces of candidate information based on their scores. For example, the pieces of candidate information may be sorted in descending order of score.
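Sorting in descending order of score can be sketched as follows. The function name `sort_candidates` and the example identifiers are hypothetical; the scores would come from the sorting model rather than being written by hand.

```python
def sort_candidates(candidates, scores):
    """Return the candidates ordered from highest score to lowest.

    Python's sort is stable, so candidates with equal scores keep
    their original relative order.
    """
    if len(candidates) != len(scores):
        raise ValueError("each candidate needs exactly one score")
    order = sorted(range(len(candidates)), key=lambda i: scores[i], reverse=True)
    return [candidates[i] for i in order]


# Hypothetical candidates and model scores.
ranked = sort_candidates(["a", "b", "c", "d"], [0.2, 0.9, 0.2, 0.5])
```

Taking a prefix of the returned list, for example `ranked[:2]`, gives a preset number of top-scoring candidates.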

In some optional implementations of this embodiment, the candidate information set may be a candidate push information set recalled when the target user satisfies a preset push condition. In this case, after sorting the pieces of candidate information, the execution body may further push candidate information from the candidate information set to the target user in the order indicated by the sorting result. For example, part of the candidate information in the set (such as a preset number of items) may be selected as target candidate information and pushed to the target user. In this way, the candidate information that the user is more interested in can be recommended to the user, so as to attract the user to click, browse, and so on. Because the sorting result is more accurate, the pushed information is more relevant to the user's needs.

In some optional implementations of this embodiment, the candidate information set may be a search result set recalled based on the target user's search request. In this case, after sorting the pieces of candidate information, the execution body may further return candidate information from the candidate information set to the target user in the order indicated by the sorting result. In this way, the candidate information that the user is more interested in can be displayed in a more prominent position, making it easier for the user to click, browse, and so on. Because the sorting result is more accurate, the returned search results are more relevant to the user's needs.

The method provided by the above embodiments of the present application extracts the feature information corresponding to each piece of candidate information in the candidate information set for the target user, inputs each item of feature information into a pre-trained sorting model that includes a multi-task network and a fusion network to obtain a score for each piece of candidate information, and finally sorts the pieces of candidate information based on the scores. Since the multi-task network in the sorting model includes multiple sub-task networks with dependencies, and the fusion network fuses the outputs of the sub-task networks into the final output of the sorting model, the final output takes into account the prediction targets of multiple sub-task networks. This not only improves information utilization during learning, but also enables the sorting model to take into account, during learning, the dependencies between pieces of information in the actual application scenario, thereby improving the prediction capability of the sorting model and the accuracy of the sorting result.

With further reference to FIG. 5, as an implementation of the methods shown in the above figures, the present application provides an embodiment of an information sorting apparatus. This apparatus embodiment corresponds to the method embodiment shown in FIG. 1, and the apparatus can be applied to various electronic devices.

As shown in FIG. 5, the information sorting apparatus of this embodiment includes: an acquisition unit 501, configured to acquire a candidate information set for a target user; an extraction unit 502, configured to obtain the feature information corresponding to each piece of candidate information in the candidate information set; an input unit 503, configured to input each item of feature information into a pre-trained sorting model to obtain a score for each piece of candidate information, where the sorting model includes a multi-task network and a fusion network, the multi-task network includes multiple sub-task networks with dependencies, different sub-task networks are used to predict the probabilities of different user behaviors, and the fusion network is connected to each sub-task network and is used to predict the score of the candidate information based on the probabilities predicted by the sub-task networks; and a sorting unit 504, configured to sort the pieces of candidate information based on the scores.

In some optional implementations of this embodiment, the sub-task networks in the multi-task network share the same input layer and at least one hidden layer, the shared hidden layer is located in the shallow part of each sub-task network, and the fusion network is also connected to the hidden layer shared by the sub-task networks.

In some optional implementations of this embodiment, the extraction unit 502 is further configured to: extract the user features of the target user; extract candidate information features from each piece of candidate information in the candidate information set; and, for each piece of candidate information, aggregate the user features and the candidate information features of that candidate information to obtain the feature information corresponding to that candidate information.

In some optional implementations of this embodiment, the candidate information in the candidate information set consists of search results corresponding to the target user's search request, and the extraction unit 502 is further configured to: extract the user features of the target user; extract request features from the search request; extract candidate information features from each piece of candidate information in the candidate information set; and, for each piece of candidate information, aggregate the user features, the request features, and the candidate information features of that candidate information to obtain the feature information corresponding to that candidate information.
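The aggregation performed by the extraction unit in the two implementations above can be sketched as merging feature groups per candidate. The application does not say how features are represented or merged, so the dictionary representation, the key prefixes, and the name `build_feature_info` are assumptions made for illustration.

```python
def build_feature_info(user_features, candidate_features, request_features=None):
    """Aggregate the feature groups for one candidate into one mapping.

    `request_features` is optional: it is present in the search scenario
    and absent in the push scenario. Keys are prefixed by group so that
    equally named features from different groups do not collide.
    """
    groups = (
        ("user", user_features),
        ("request", request_features or {}),
        ("candidate", candidate_features),
    )
    merged = {}
    for prefix, group in groups:
        for name, value in group.items():
            merged[f"{prefix}.{name}"] = value
    return merged


# Hypothetical features for one user, one search request, and two candidates.
user = {"age_bucket": 3}
request = {"query_length": 2}
candidates = [{"price": 25.0}, {"price": 40.0}]
feature_infos = [build_feature_info(user, c, request) for c in candidates]
```

The user and request features are extracted once and reused for every candidate, while the candidate features differ per item.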

In some optional implementations of this embodiment, the multi-task network is a Bayesian multi-task network, which is constructed through the following steps: constructing a directed acyclic graph based on the dependencies between user behaviors; determining the conditional probability between every two user behaviors that have a dependency; and constructing, based on the directed acyclic graph and the conditional probabilities, a Bayesian multi-task network that includes multiple sub-task networks with dependencies.

In some optional implementations of this embodiment, the sorting model is pre-trained through the following steps: acquiring a sample set, where a sample in the sample set includes feature information corresponding to a piece of historical candidate information and multiple items of user behavior annotation information, and different items of user behavior annotation information indicate whether the user performed different user behaviors on the historical candidate information; and training the multi-task network and the fusion network using a machine learning method, based on the feature information and the items of user behavior annotation information in the sample set, to obtain the sorting model.

In some optional implementations of this embodiment, the sample set is generated in advance through the following steps: obtaining a user behavior log and a feature log corresponding to the user behavior log, where the user behavior log includes a historical candidate information set from a historical push scenario and the user's behavior data for each piece of historical candidate information in the set, and the feature log includes the feature information corresponding to each piece of historical candidate information in the set; for each piece of historical candidate information, generating multiple items of user behavior annotation information corresponding to that historical candidate information based on the user's behavior data for it; and taking the feature information and the user behavior annotation information corresponding to each piece of historical candidate information as one sample, and aggregating the samples into the sample set.

In some optional implementations of this embodiment, training the multi-task network and the fusion network using a machine learning method, based on the feature information and the items of user behavior annotation information in the sample set, to obtain the sorting model includes: using the feature information in the sample set as the input of the multi-task network and each item of user behavior annotation information corresponding to the input feature information as the output of the corresponding sub-task network in the multi-task network, and training the multi-task network using a machine learning method; and, with the parameters of the multi-task network fixed, using the feature information in the sample set as the input of the multi-task network and the target behavior annotation information among the user behavior annotation information corresponding to the input feature information as the output of the fusion network, and training the fusion network using a machine learning method, to obtain a sorting model that includes the trained multi-task network and the trained fusion network.

In some optional implementations of this embodiment, training the multi-task network and the fusion network using a machine learning method, based on the feature information and the items of user behavior annotation information in the sample set, to obtain the sorting model includes: using the feature information in the sample set as the input of the multi-task network, using each item of user behavior annotation information corresponding to the input feature information as the output of the corresponding sub-task network, and using the target behavior annotation information among the user behavior annotation information corresponding to the input feature information as the output of the fusion network, and training the multi-task network and the fusion network simultaneously using a machine learning method to obtain the sorting model.

In some optional implementations of this embodiment, the candidate information set is a candidate push information set recalled when the target user satisfies a preset push condition, and the apparatus further includes a push unit configured to push candidate information from the candidate information set to the target user in the order indicated by the sorting result.

In some optional implementations of this embodiment, the candidate information set is a search result set recalled based on the target user's search request, and the apparatus further includes a return unit configured to return candidate information from the candidate information set to the target user in the order indicated by the sorting result.

The apparatus provided by the above embodiments of the present application extracts the feature information corresponding to each piece of candidate information in the candidate information set for the target user, inputs each item of feature information into a pre-trained sorting model that includes a multi-task network and a fusion network to obtain a score for each piece of candidate information, and finally sorts the pieces of candidate information based on the scores. Since the multi-task network in the sorting model includes multiple sub-task networks with dependencies, and the fusion network fuses the outputs of the sub-task networks into the final output of the sorting model, the final output takes into account the prediction targets of multiple sub-task networks. This not only improves information utilization during learning, but also enables the sorting model to take into account, during learning, the dependencies between pieces of information in the actual application scenario, thereby improving the prediction capability of the sorting model and the accuracy of the sorting result.

Reference is now made to FIG. 6, which shows a schematic structural diagram of a computer system suitable for implementing the electronic device of the embodiments of the present application. The electronic device shown in FIG. 6 is only an example and should not impose any limitation on the functions or scope of use of the embodiments of the present application.

As shown in FIG. 6, the computer system 600 includes a central processing unit (CPU) 601, which can perform various appropriate actions and processes according to a program stored in a read-only memory (ROM) 602 or a program loaded from a storage section 608 into a random access memory (RAM) 603. The RAM 603 also stores various programs and data required for the operation of the system 600. The CPU 601, the ROM 602, and the RAM 603 are connected to one another through a bus 604. An input/output (I/O) interface 605 is also connected to the bus 604.

The following components are connected to the I/O interface 605: an input section 606 including a keyboard, a mouse, and the like; an output section 607 including a liquid crystal display (LCD), a speaker, and the like; a storage section 608 including a hard disk and the like; and a communication section 609 including a network interface card such as a LAN card or a modem. The communication section 609 performs communication processing via a network such as the Internet. A drive 610 is also connected to the I/O interface 605 as needed. A removable medium 611, such as a magnetic disk, an optical disc, a magneto-optical disc, or a semiconductor memory, is mounted on the drive 610 as needed, so that a computer program read from it can be installed into the storage section 608 as needed.

In particular, according to embodiments of the present disclosure, the processes described above with reference to the flowcharts may be implemented as computer software programs. For example, embodiments of the present disclosure include a computer program product comprising a computer program carried on a computer-readable medium, the computer program containing program code for performing the method illustrated in the flowchart. In such an embodiment, the computer program may be downloaded and installed from a network via the communication section 609, and/or installed from the removable medium 611. When the computer program is executed by the central processing unit (CPU) 601, the above-described functions defined in the method of the present application are performed. It should be noted that the computer-readable medium described in the present application may be a computer-readable signal medium, a computer-readable storage medium, or any combination of the two. The computer-readable storage medium may be, for example, but is not limited to, an electrical, magnetic, optical, electromagnetic, infrared, or semiconductor system, apparatus, or device, or any combination of the above. More specific examples of computer-readable storage media may include, but are not limited to: an electrical connection having one or more wires, a portable computer disk, a hard disk, a random access memory (RAM), a read-only memory (ROM), an erasable programmable read-only memory (EPROM or flash memory), an optical fiber, a portable compact disc read-only memory (CD-ROM), an optical storage device, a magnetic storage device, or any suitable combination of the above. In the present application, a computer-readable storage medium may be any tangible medium that contains or stores a program that can be used by or in conjunction with an instruction execution system, apparatus, or device. A computer-readable signal medium, by contrast, may include a data signal propagated in baseband or as part of a carrier wave, carrying computer-readable program code. Such a propagated data signal may take a variety of forms, including but not limited to an electromagnetic signal, an optical signal, or any suitable combination of the above. A computer-readable signal medium may also be any computer-readable medium other than a computer-readable storage medium that can send, propagate, or transmit a program for use by or in conjunction with an instruction execution system, apparatus, or device. The program code contained on a computer-readable medium may be transmitted using any appropriate medium, including but not limited to: wireless, wire, optical cable, RF, and the like, or any suitable combination of the above.

The flowcharts and block diagrams in the accompanying drawings illustrate the architecture, functionality, and operation of possible implementations of systems, methods, and computer program products according to various embodiments of the present application. In this regard, each block in a flowchart or block diagram may represent a module, a program segment, or a portion of code that contains one or more executable instructions for implementing the specified logical functions. It should also be noted that, in some alternative implementations, the functions noted in the blocks may occur in an order different from that noted in the drawings. For example, two blocks shown in succession may in fact be executed substantially in parallel, or they may sometimes be executed in the reverse order, depending on the functionality involved. It should also be noted that each block of the block diagrams and/or flowcharts, and any combination of such blocks, may be implemented by a dedicated hardware-based system that performs the specified functions or operations, or by a combination of dedicated hardware and computer instructions.

The units described in the embodiments of the present application may be implemented in software or in hardware. The described units may also be provided in a processor, and in some cases the names of these units do not constitute a limitation on the units themselves.

As another aspect, the present application also provides a computer-readable medium, which may be included in the apparatus described in the above embodiments, or may exist separately without being assembled into the apparatus. The computer-readable medium carries one or more programs that, when executed by the apparatus, cause the apparatus to: acquire a candidate information set for a target user; extract feature information corresponding to each candidate information in the candidate information set; input each piece of feature information into a pre-trained ranking model to obtain a score for each candidate information, where the ranking model includes a multi-task network and a fusion network, the multi-task network includes a plurality of sub-task networks with dependencies, different sub-task networks are used to predict the probabilities of occurrence of different user behaviors, and the fusion network is connected to each sub-task network and is used to predict the score of the candidate information based on the probabilities predicted by the sub-task networks; and sort the candidate information based on the scores.
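The four operations listed above (acquire, extract, score, sort) can be sketched as follows. This is a minimal illustration only, not the implementation of the embodiments: the feature names `relevance` and `price_fit`, the two behaviors (click and order), the fixed fusion weights, and the function names are all hypothetical, and feature extraction is assumed to have already produced a feature dictionary per candidate.

```python
import math
from typing import Callable, Dict, List, Tuple

Features = Dict[str, float]


def sigmoid(x: float) -> float:
    return 1.0 / (1.0 + math.exp(-x))


def toy_ranking_model(features: Features) -> float:
    """Hypothetical stand-in for the pre-trained ranking model."""
    # Sub-task 1: probability that the user clicks the candidate.
    p_click = sigmoid(features["relevance"])
    # Sub-task 2 depends on sub-task 1: an order is assumed to follow a click.
    p_order = p_click * sigmoid(features["price_fit"])
    # Fusion step: combine the sub-task probabilities into a single score.
    # The weights 0.3 and 0.7 are arbitrary illustrative values.
    return 0.3 * p_click + 0.7 * p_order


def rank_candidates(
    candidates: Dict[str, Features],
    model: Callable[[Features], float],
) -> List[Tuple[str, float]]:
    """Score every candidate and return (id, score) pairs, best first."""
    scored = [(cid, model(feats)) for cid, feats in candidates.items()]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


# Candidate set for one target user, with features already extracted.
candidates = {
    "a": {"relevance": 2.0, "price_fit": 1.0},
    "b": {"relevance": -1.0, "price_fit": 0.0},
    "c": {"relevance": 0.5, "price_fit": 2.0},
}
ranking = rank_candidates(candidates, toy_ranking_model)
```

In a trained model, the fusion step would be a learned network rather than a fixed weighted sum; the fixed weights here only make the data flow visible.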

The above description is merely a preferred embodiment of the present application and an explanation of the technical principles applied. Those skilled in the art should understand that the scope of the invention involved in this application is not limited to technical solutions formed by the specific combination of the above technical features, and should also cover other technical solutions formed by any combination of the above technical features or their equivalents without departing from the inventive concept. One example is a technical solution formed by replacing the above features with technical features having similar functions disclosed in (but not limited to) this application.

Claims

1. An information sorting method, characterized in that the method comprises:

acquiring a candidate information set for a target user;

extracting feature information corresponding to each candidate information in the candidate information set;

inputting each piece of feature information into a pre-trained ranking model to obtain a score for each candidate information, wherein the ranking model comprises a multi-task network and a fusion network, the multi-task network comprises a plurality of sub-task networks with dependencies, different sub-task networks are used to predict the probabilities of occurrence of different user behaviors, and the fusion network is connected to each sub-task network and is used to predict the score of the candidate information based on the probabilities predicted by the sub-task networks; and

sorting the candidate information based on the scores.

2. The method according to claim 1, wherein the sub-task networks in the multi-task network share the same input layer and at least one hidden layer, the shared hidden layer is located in the shallow layers of the sub-task networks, and the fusion network is further connected to the hidden layer shared by the sub-task networks.
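As a rough illustration of the sharing described in claim 2, the forward pass below routes one input through a shared hidden layer, then through two sub-task heads, and feeds the fusion layer with both the predicted probabilities and the shared hidden activations. The layer sizes, the weight values, and the two behaviors (click and order) are hypothetical, and the dependencies between sub-task networks are left out to keep the sketch short.

```python
import math
from typing import Callable, Dict, List

Vector = List[float]
Matrix = List[List[float]]


def relu(v: float) -> float:
    return max(0.0, v)


def sigmoid(v: float) -> float:
    return 1.0 / (1.0 + math.exp(-v))


def dense(x: Vector, w: Matrix, b: Vector, act: Callable[[float], float]) -> Vector:
    """Fully connected layer: one output per weight row."""
    return [act(sum(wi * xi for wi, xi in zip(row, x)) + bi) for row, bi in zip(w, b)]


# Illustrative fixed weights (a trained model would learn these).
W_SHARED, B_SHARED = [[0.5, -0.2, 0.1], [0.3, 0.8, -0.5]], [0.0, 0.1]
W_CLICK, B_CLICK = [[1.0, -1.0]], [0.0]
W_ORDER, B_ORDER = [[0.5, 0.5]], [-0.2]
# Fusion input = 2 sub-task probabilities + 2 shared hidden activations.
W_FUSION, B_FUSION = [[1.0, 2.0, 0.1, 0.1]], [-1.0]


def forward(x: Vector) -> Dict[str, object]:
    # Shallow hidden layer shared by every sub-task network.
    shared = dense(x, W_SHARED, B_SHARED, relu)
    # Task-specific heads on top of the shared layer.
    p_click = dense(shared, W_CLICK, B_CLICK, sigmoid)[0]
    p_order = dense(shared, W_ORDER, B_ORDER, sigmoid)[0]
    # The fusion network sees the probabilities and the shared layer.
    score = dense([p_click, p_order] + shared, W_FUSION, B_FUSION, sigmoid)[0]
    return {"shared": shared, "p_click": p_click, "p_order": p_order, "score": score}


out = forward([1.0, 0.5, 2.0])
```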

3. The method according to claim 1, wherein the extracting feature information corresponding to each candidate information in the candidate information set comprises:

extracting user features of the target user;

extracting candidate information features from each candidate information in the candidate information set; and

for each candidate information, aggregating the user features and the candidate information features of the candidate information to obtain the feature information corresponding to the candidate information.

4. The method according to claim 1, wherein the candidate information in the candidate information set is a search result corresponding to a search request of the target user; and

the extracting feature information corresponding to each candidate information in the candidate information set comprises:

extracting user features of the target user;

extracting request features from the search request; and

for each candidate information, aggregating the user features, the request features, and the candidate information features of the candidate information to obtain the feature information corresponding to the candidate information.
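The aggregation in claims 3 and 4 can be pictured as follows. The claims do not fix how the features are aggregated; this sketch assumes plain concatenation of numeric feature vectors, and the function names are hypothetical. The request features are optional, so the same helper covers both the push case (claim 3) and the search case (claim 4).

```python
from typing import Dict, List, Optional, Sequence


def aggregate_features(
    user: Sequence[float],
    candidate: Sequence[float],
    request: Optional[Sequence[float]] = None,
) -> List[float]:
    """Concatenate user, optional request, and candidate features.

    Concatenation is an assumption; other aggregations are possible.
    """
    parts = [user] + ([request] if request is not None else []) + [candidate]
    return [value for part in parts for value in part]


def build_feature_info(
    user: Sequence[float],
    candidates: Dict[str, Sequence[float]],
    request: Optional[Sequence[float]] = None,
) -> Dict[str, List[float]]:
    """Produce one aggregated feature vector per candidate."""
    return {
        cid: aggregate_features(user, feats, request)
        for cid, feats in candidates.items()
    }


user_features = [0.2, 0.7]
candidate_features = {"item_1": [1.0, 0.0], "item_2": [0.0, 1.0]}
push_info = build_feature_info(user_features, candidate_features)
search_info = build_feature_info(user_features, candidate_features, request=[0.9])
```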

5. The method according to claim 1, wherein the multi-task network is a Bayesian multi-task network, and the Bayesian multi-task network is constructed by the following steps:

constructing a directed acyclic graph based on the dependencies between user behaviors;

determining the conditional probability between every two user behaviors that have a dependency; and

constructing, based on the directed acyclic graph and the conditional probabilities, a Bayesian multi-task network comprising a plurality of sub-task networks with dependencies.
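To make the role of the directed acyclic graph and the conditional probabilities concrete, the sketch below chains conditional probabilities along a behavior graph to obtain the probability of each behavior. It is a simplified illustration under stated assumptions: each behavior has at most one parent, a behavior can only occur if its parent occurred, and the behaviors (click, order, pay) and probability values are hypothetical. In the claimed network these conditional probabilities would be predicted by sub-task networks rather than given as constants.

```python
from graphlib import TopologicalSorter
from typing import Dict, Optional


def behavior_probabilities(
    parent: Dict[str, Optional[str]],
    conditional: Dict[str, float],
) -> Dict[str, float]:
    """Compute P(behavior) for every node of a behavior dependency graph.

    parent[b] is the behavior that b depends on (None for a root).
    conditional[b] is P(b | parent[b]), or P(b) for a root.
    Assumes b cannot occur unless its parent occurred, so
    P(b) = P(b | parent) * P(parent).
    """
    graph = {b: ([p] if p is not None else []) for b, p in parent.items()}
    probabilities: Dict[str, float] = {}
    # static_order() yields parents before children and raises
    # graphlib.CycleError if the graph is not acyclic.
    for behavior in TopologicalSorter(graph).static_order():
        p = parent[behavior]
        upstream = probabilities[p] if p is not None else 1.0
        probabilities[behavior] = conditional[behavior] * upstream
    return probabilities


# Hypothetical dependency chain: click -> order -> pay.
parent = {"pay": "order", "order": "click", "click": None}
conditional = {"click": 0.2, "order": 0.5, "pay": 0.9}
probs = behavior_probabilities(parent, conditional)
```

Using a topological order guarantees that the probability of a parent behavior is available before any behavior that depends on it is evaluated.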

6. The method according to claim 1, wherein the ranking model is pre-trained by the following steps:

obtaining a sample set, wherein each sample in the sample set comprises feature information corresponding to historical candidate information and a plurality of pieces of user behavior annotation information, and different user behavior annotation information indicates whether the user performed different user behaviors on the historical candidate information; and

training the multi-task network and the fusion network by using a machine learning method, based on the feature information and the user behavior annotation information in the sample set, to obtain the ranking model.

7. The method according to claim 6, wherein the sample set is pre-generated by the following steps:

obtaining a user behavior log and a feature log corresponding to the user behavior log, wherein the user behavior log comprises a historical candidate information set in a historical scene and the user's behavior data for each historical candidate information in the historical candidate information set, and the feature log comprises the feature information corresponding to each historical candidate information in the historical candidate information set;

for each historical candidate information, generating, based on the user's behavior data for the historical candidate information, a plurality of pieces of user behavior annotation information corresponding to the historical candidate information; and

taking the feature information and the user behavior annotation information corresponding to each historical candidate information as one sample, and aggregating the samples into the sample set.
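The sample generation steps of claim 7 amount to joining the two logs and turning behavior data into one binary label per behavior. The sketch below assumes a hypothetical log layout in which both logs are keyed by a (request id, candidate id) pair and the behavior log stores the set of behaviors the user performed; the behavior names are also illustrative.

```python
from typing import Dict, List, Sequence, Set, Tuple

Key = Tuple[str, str]  # (request id, candidate id); hypothetical log key

BEHAVIORS = ("click", "order")  # illustrative user behaviors


def build_samples(
    behavior_log: Dict[Key, Set[str]],
    feature_log: Dict[Key, Sequence[float]],
) -> List[dict]:
    """Join the behavior log with the feature log into labeled samples."""
    samples = []
    for key, performed in behavior_log.items():
        features = feature_log.get(key)
        if features is None:
            # No feature record for this candidate, so it cannot form a sample.
            continue
        # One binary annotation per behavior: 1 if the user performed it.
        labels = {name: int(name in performed) for name in BEHAVIORS}
        samples.append({"features": list(features), "labels": labels})
    return samples


behavior_log = {
    ("req1", "itemA"): {"click", "order"},
    ("req1", "itemB"): {"click"},
    ("req1", "itemC"): set(),
    ("req2", "itemD"): {"click"},  # has no feature record below
}
feature_log = {
    ("req1", "itemA"): [0.9, 0.1],
    ("req1", "itemB"): [0.4, 0.3],
    ("req1", "itemC"): [0.1, 0.8],
}
samples = build_samples(behavior_log, feature_log)
```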

8. The method according to claim 6, wherein the training the multi-task network and the fusion network by using a machine learning method, based on the feature information and the user behavior annotation information in the sample set, to obtain the ranking model comprises:

training the multi-task network by using a machine learning method, with the feature information in the sample set as the input of the multi-task network and the user behavior annotation information corresponding to the input feature information as the output of the corresponding sub-task network in the multi-task network; and

fixing the parameters of the multi-task network, and training the fusion network by using a machine learning method, with the feature information in the sample set as the input of the multi-task network and the target behavior annotation information in the user behavior annotation information corresponding to the input feature information as the output of the fusion network, to obtain a ranking model comprising the trained multi-task network and the trained fusion network.
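The two-stage procedure of claim 8 can be illustrated with a deliberately small stand-in. In this sketch each sub-task network and the fusion network are replaced by a logistic regression trained with plain stochastic gradient descent; the shared layers and the dependencies between sub-tasks are omitted, and the data, behavior names, learning rate, and epoch count are hypothetical. What the sketch does preserve is the ordering: the sub-task models are trained first, then held fixed while the fusion model is trained on their predicted probabilities against the target behavior label.

```python
import math
from typing import Dict, List, Sequence, Tuple

Params = Tuple[List[float], float]  # (weights, bias)


def sigmoid(z: float) -> float:
    return 1.0 / (1.0 + math.exp(-z))


def predict(params: Params, x: Sequence[float]) -> float:
    w, b = params
    return sigmoid(b + sum(wi * xi for wi, xi in zip(w, x)))


def train_logreg(xs, ys, epochs: int = 200, lr: float = 0.5) -> Params:
    """Logistic regression trained with per-sample gradient steps."""
    w, b = [0.0] * len(xs[0]), 0.0
    for _ in range(epochs):
        for x, y in zip(xs, ys):
            err = predict((w, b), x) - y  # gradient of log loss w.r.t. logit
            w = [wi - lr * err * xi for wi, xi in zip(w, x)]
            b -= lr * err
    return w, b


def train_two_stage(xs, labels: Dict[str, List[int]], target: str):
    # Stage 1: train one sub-task model per user behavior.
    heads = {name: train_logreg(xs, ys) for name, ys in labels.items()}
    # Stage 2: heads stay fixed; only the fusion model is fitted, using the
    # heads' predicted probabilities as input and the target label as output.
    fusion_inputs = [[predict(heads[n], x) for n in labels] for x in xs]
    fusion = train_logreg(fusion_inputs, labels[target])
    return heads, fusion


def score(heads: Dict[str, Params], fusion: Params, x: Sequence[float]) -> float:
    return predict(fusion, [predict(p, x) for p in heads.values()])


xs = [[1.0, 1.0], [1.0, 0.0], [0.0, 1.0], [0.0, 0.0]]
labels = {"click": [1, 1, 0, 0], "order": [1, 0, 0, 0]}
heads, fusion = train_two_stage(xs, labels, target="order")
```

Freezing the sub-task models in the second stage means the fusion model cannot distort the behavior probabilities it consumes; the alternative of training both parts at once trades that isolation for a single optimization pass.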

9. The method according to claim 6, wherein the training the multi-task network and the fusion network by using a machine learning method, based on the feature information and the user behavior annotation information in the sample set, to obtain the ranking model comprises:

training the multi-task network and the fusion network simultaneously by using a machine learning method, with the feature information in the sample set as the input of the multi-task network, the target behavior annotation information corresponding to the input feature information as the output of the corresponding sub-task network, and the target behavior annotation information in the user behavior annotation information corresponding to the input feature information as the output of the fusion network, to obtain the ranking model.

10. The method according to claim 1, wherein the candidate information set is a candidate push information set recalled when the target user satisfies a preset push condition; and

after the sorting of the candidate information, the method further comprises:

pushing the candidate information in the candidate information set to the target user in the order indicated by the sorting result.

11. The method according to claim 1, wherein the candidate information set is a search result set recalled based on a search request of the target user; and

after the sorting of the candidate information, the method further comprises:

returning the candidate information in the candidate information set to the target user in the order indicated by the sorting result.

12. An information sorting apparatus, characterized in that the apparatus comprises:

an obtaining unit, configured to obtain a candidate information set for a target user;

an extraction unit, configured to extract feature information corresponding to each candidate information in the candidate information set;

an input unit, configured to input each piece of feature information into a pre-trained ranking model to obtain a score for each candidate information, wherein the ranking model comprises a multi-task network and a fusion network, the multi-task network comprises a plurality of sub-task networks with dependencies, different sub-task networks are used to predict the probabilities of occurrence of different user behaviors, and the fusion network is connected to each sub-task network and is used to predict the score of the candidate information based on the probabilities predicted by the sub-task networks; and

a sorting unit, configured to sort the candidate information based on the scores.

13. An electronic device, characterized by comprising:

one or more processors; and

a storage device on which one or more programs are stored,

wherein the one or more programs, when executed by the one or more processors, cause the one or more processors to implement the method according to any one of claims 1-11.

14. A computer-readable medium on which a computer program is stored, characterized in that the program, when executed by a processor, implements the method according to any one of claims 1-11.